Sequence files and related information from the Potato Genome Sequencing Consortium (PGSC). The PGSC has sequenced two potato genotypes: the heterozygous diploid S. tuberosum Group Tuberosum cultivar RH89-039-16 (RH) and the doubled monoploid S. tuberosum Group Phureja clone DM1-3 (DM).
For publication using the v4.04 pseudomolecules, please cite the following article:
Michael Alan Hardigan, Emily Crisovan, John P Hamilton, Jeongwoon Kim, Parker Laimbeer, Courtney P Leisner, Norma C Manrique-Carpintero, Linsey Newton, Gina M Pham, Brieanne Vaillancourt, Xueming Yang, Zixian Zeng, David Douches, Jiming Jiang, Richard E Veilleux, and C. Robin Buell. 2016, Genome reduction uncovers a large dispensable genome and adaptive role for copy number variation in asexually propagated Solanum tuberosum. Plant Cell, doi:10.1105/tpc.15.00538
For publication using the PGSC v4.03 pseudomolecules, please cite the following two articles:
Potato Genome Sequencing Consortium 2011, Genome sequence and analysis of the tuber crop potato. Nature 475: 189–195.
Sharma, S. K., Bolser, D., de Boer, J., Sønderkær, M., Amoros, W., Carboni, M. F., D’Ambrosio, J. M., de la Cruz, G., Di Genova, A., Douches, D. S., Eguiluz, M., Guo, X., Guzman, F., Hackett, C. A., Hamilton, J. P., Li, G., Li, Y., Lozano, R., Maass, A., Marshall, D., Martinez, D., McLean, K., Mejía, N., Milne, L., Munive, S., Nagy, I., Ponce, O., Ramirez, M., Simon, R., Thomson, S. J., Torres, Y., Waugh, R., Zhang, Z., Huang, S., Visser, R. G. F., Bachem, C. W. B., Sagredo, B., Feingold, S. E., Orjeda, G., Veilleux, R. E., Bonierbale, M., Jacobs, J. M. E., Milbourne, D., Martin, D. M. A. & Bryan, G. J. 2013, Construction of Reference Chromosome-Scale Pseudomolecules for Potato: Integrating the Potato Genome with Genetic and Physical Maps. G3: Genes|Genomes|Genetics 3: 2031-2047.
The Buell lab at Michigan State University has built a new pseudomolecule (chrUn) from assembled DM reads that did not map to v4.03, and has released it together with the v4.03 chr00-chr12 pseudomolecules as v4.04. The chr00-chr12 pseudomolecules are unchanged from v4.03. The v4.04 FASTA file can be downloaded below or searched on the SpudDB BLAST server. More details about the construction of chrUn can be found in Hardigan et al. (2016).
DM_v4.04_pseudomolecules.fasta.zip - v4.04 pseudomolecules (chr00-chr12 and chrUn) in FASTA format
PGSC_DM_V403_genes.gff.zip - Gene annotation for the v4.03 pseudomolecules in GFF3 format
PGSC_DM_V403_representative_genes.gff.zip - Representative gene annotation for the v4.03 pseudomolecules in GFF3 format. For each gene, only the transcript that produces the longest peptide sequence among all alternative isoforms is included.
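As an illustration of the "longest peptide per gene" rule for the representative set, here is a minimal Python sketch. The input mapping, the identifiers, and the function name are hypothetical. How ties were broken in the actual release is not stated, so the sketch simply keeps the first isoform it encounters.

```python
def pick_representatives(peptides_by_gene):
    """Return {gene_id: transcript_id}, choosing the longest peptide per gene.

    peptides_by_gene: {gene_id: {transcript_id: peptide_sequence}}
    Ties keep the first isoform seen (an assumption, not a documented PGSC rule).
    """
    representatives = {}
    for gene_id, isoforms in peptides_by_gene.items():
        best_id, best_len = None, -1
        for transcript_id, peptide in isoforms.items():
            if len(peptide) > best_len:
                best_id, best_len = transcript_id, len(peptide)
        representatives[gene_id] = best_id
    return representatives


# Hypothetical identifiers and sequences, for illustration only.
example_peptides = {
    "geneA": {"geneA.t1": "MKT", "geneA.t2": "MKTAYIAK"},
    "geneB": {"geneB.t1": "MSL"},
}
print(pick_representatives(example_peptides))
```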
The format of the FPKM expression files:
1st column: gene ID
2nd column: library 1
3rd column: library 2
...
last column: functional annotation of the gene
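The column layout above could be read with a short Python sketch such as the following. It assumes the file is tab-delimited and begins with a header row naming the libraries; neither detail is stated here, so check a real file first. The function name and the sample rows are hypothetical.

```python
import csv
import io


def read_fpkm_table(handle):
    """Parse rows of: gene ID, one value per library, functional annotation.

    Assumes a tab-delimited file whose first row is a header (an assumption).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    libraries = header[1:-1]  # every column between the ID and the annotation
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_id, values, annotation = row[0], row[1:-1], row[-1]
        table[gene_id] = {
            "fpkm": dict(zip(libraries, (float(v) for v in values))),
            "annotation": annotation,
        }
    return table


# Hypothetical content, for illustration only.
EXAMPLE_FPKM = (
    "gene_id\tlib1\tlib2\tannotation\n"
    "geneA\t1.5\t0\tHypothetical kinase\n"
    "geneB\t0.25\t12.0\tHypothetical transporter\n"
)
fpkm_table = read_fpkm_table(io.StringIO(EXAMPLE_FPKM))
print(fpkm_table["geneA"]["fpkm"])
```

For a real file, pass an open file handle (for example `open(path, newline="")`) instead of the `io.StringIO` object.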
The reads were mapped to the S. tuberosum Group Phureja DM1-3 superscaffolds using Tophat (v1.4.1), which made use of Bowtie (v1.0.0). The FPKM values were calculated by Cufflinks (v1.3.0) using only the v3.4 representative gene model set.
Tophat was run with the parameters "-i 10 -I 15000", which set a minimum intron size of 10 bp (-i 10) and a maximum intron size of 15,000 bp (-I 15000). These values are the minimum and maximum intron feature lengths present in the v3.4 GFF. Paired-end libraries were aligned in single-end mode.
Cufflinks was run with the same maximum intron size of 15,000 bp (-I 15000).
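For readers who want to reproduce the settings, the Python sketch below assembles command lines carrying the intron-size options quoted above. Only "-i 10" and "-I 15000" come from this page. The index name, read file, output directories, and the use of "-o" are placeholders and assumptions, and any option used to supply the v3.4 gene models to Cufflinks is deliberately left out because it is not documented here.

```python
import shlex


def tophat_command(bowtie_index, reads, output_dir):
    """Build a Tophat command with the documented intron-size limits."""
    return ["tophat", "-i", "10", "-I", "15000",
            "-o", output_dir, bowtie_index, reads]


def cufflinks_command(alignments, output_dir):
    """Build a Cufflinks command with the documented maximum intron size."""
    return ["cufflinks", "-I", "15000", "-o", output_dir, alignments]


# Placeholder paths (hypothetical); a single reads argument means
# the library is treated as single-end.
tophat_cmd = tophat_command("DM_superscaffolds", "library1.fastq", "tophat_out")
cufflinks_cmd = cufflinks_command("tophat_out/accepted_hits.bam", "cufflinks_out")
print(shlex.join(tophat_cmd))
print(shlex.join(cufflinks_cmd))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once the placeholder paths are replaced with real ones.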
Functional annotation was based on the best BLASTX hits of the CDS sequences against UniRef100. The text was assigned using a first-informative-best-hit strategy, which considers the best BLASTX hits with E <= 1e-5 but excludes hits with non-informative functional text (e.g., "Whole genome shotgun sequence of line..."). The text was also programmatically cleaned to remove some misleading and low-information strings. For gene-level annotation, the transcript-level functional text was concatenated, so there is some redundancy where different isoforms of a gene were assigned different annotation strings.
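The first-informative-best-hit idea can be sketched in Python as follows. This is not the pipeline's actual code: the function name, the hit representation, and the list of non-informative patterns are assumptions (only the "whole genome shotgun sequence" example comes from the text above).

```python
# Lower-case substrings treated as non-informative. Only the first entry is
# taken from this page; the other is a hypothetical addition.
NON_INFORMATIVE = (
    "whole genome shotgun sequence",
    "uncharacterized protein",
)


def first_informative_hit(hits, max_evalue=1e-5, patterns=NON_INFORMATIVE):
    """Return the description of the first informative hit, or None.

    hits: list of (evalue, description) tuples ordered best hit first.
    """
    for evalue, description in hits:
        if evalue > max_evalue:
            continue  # fails the E <= 1e-5 cutoff
        text = description.lower()
        if any(pattern in text for pattern in patterns):
            continue  # functional text carries no useful information
        return description
    return None


# Hypothetical BLASTX hits, for illustration only.
example_hits = [
    (1e-80, "Whole genome shotgun sequence of line X"),
    (1e-60, "Sucrose synthase"),
    (1e-3, "Some weak hit"),
]
print(first_informative_hit(example_hits))
```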
This tab-delimited file has the following columns:
Cluster_ID
Number_of_peptides_in_this_cluster
Number_of_species_in_this_cluster
Species (separated by space)
Peptides (separated by space)
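A minimal Python sketch for reading the five columns listed above is shown below. Whether the file has a header row is not stated, so the sketch skips any row whose second column is not a whole number. The function name and the sample rows are hypothetical.

```python
import csv
import io


def read_clusters(handle):
    """Parse the tab-delimited cluster file into {cluster_id: details}."""
    clusters = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 5 or not row[1].isdigit():
            continue  # skip blank, short, or header rows
        cluster_id, n_peptides, n_species, species, peptides = row[:5]
        clusters[cluster_id] = {
            "n_peptides": int(n_peptides),
            "n_species": int(n_species),
            "species": species.split(),    # space-separated in the file
            "peptides": peptides.split(),  # space-separated in the file
        }
    return clusters


# Hypothetical content, for illustration only.
EXAMPLE_CLUSTERS = (
    "Cluster_ID\tNumber_of_peptides\tNumber_of_species\tSpecies\tPeptides\n"
    "C1\t3\t2\tpotato tomato\tpepA pepB pepC\n"
    "C2\t1\t1\tpotato\tpepD\n"
)
clusters = read_clusters(io.StringIO(EXAMPLE_CLUSTERS))
print(clusters["C1"]["species"])
```

Comparing `n_peptides` with `len(peptides)` for each cluster is a quick way to check that a file was parsed as intended.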
The potato DArT array contains 7,680 probes obtained from genomic representations of a potato diversity panel, including selected probes from tomato (234) and Capsicum (54). The DArT probes were sequenced with financial support from The James Hutton Institute, UK, under its Potato Genome Sequencing Grant* and are made available by Diversity Arrays Technology Pty Ltd, Yarralumla ACT 2600, Australia. This work is part of the Potato Mapping Group, a subgroup of the Potato Genome Sequencing Consortium (PGSC).
*Scottish Government Rural and Environmental Science and Analytical Services Division (RESAS), Department for Environment, Food and Rural Affairs (DEFRA), Agriculture and Horticulture Development Board (AHDB) - Potato Council.