SRA EXPERIMENT
SRA Experiment Id ERX3728630
Experiment Title HiSeq X Ten paired end sequencing; Raw reads: WGS_5
SRA Study
SRA Study Id ERP118387
Study Title Improved reference genome assembly for Pacific oyster (Crassostrea gigas)
Study Abstract Crassostrea gigas sequence assembly cgigas_uk_v1 release notes: The genome of a single female Pacific oyster (Crassostrea gigas) from a commercial oyster hatchery (Guernsey Sea Farms, United Kingdom) was sequenced and assembled. High molecular weight DNA was extracted from frozen gill tissue. Genome sequencing was performed on (i) a PacBio Sequel system at ~80x coverage and (ii) an Illumina HiSeq X platform to a mean coverage of ~60x, based on an estimated genome size of ~600 Mb. The estimated genome size was taken as approximately the midpoint between estimates obtained from flow cytometry (~640 Mb) and a k-mer based analysis (~555 Mb). The oyster genome was assembled from PacBio reads using Canu v1.8 and then error corrected using Arrow v2.3.2 and Pilon v1.2.3. The initial Canu assembly was substantially larger than expected (~1.2 Gb), likely due to the high levels of heterozygosity in the genome of C. gigas, which led to the separate assembly of both alleles in a given chromosome region. Highly divergent haplotypes were identified among the contigs and reassigned with a combination of the purge_haplotigs pipeline and an all-versus-all contig mapping. The haploid version of the assembly was scaffolded using Hi-C sequence reads and integrated with a previously published high-density linkage map (~20K SNPs), resulting in a chromosome-level assembly comprising ten large scaffolds (2n = 20 in C. gigas). These chromosome-level scaffolds were anchored to a cytogenetic map where possible. Scaffolds with a high fraction of regions (>30%) showing abnormal coverage (i.e. 2 SD above or below the mean) were removed from the assembly. All unplaced scaffolds and contigs which contain unique transcripts were retained. The final cgigas_uk_v1 genome assembly (647 Mb) contains the ten expected chromosomes and 226 unplaced scaffolds, with scaffold and contig N50 values of 58.4 Mb and 1.8 Mb, respectively.
For questions regarding the cgigas_uk_v1 genome assembly, please contact Dr Ross Houston, The Roslin Institute, University of Edinburgh, UK. Email: [email protected].
Credits
Funding: Natural Environment Research Council NE/P010695/1 (PI: Ross Houston); British Biotechnology and Biological Sciences Research Council Strategic Programme Grants BB/P013732/1, BB/P013740/1, BB/P013759/1.
Source and isolation: Animal from Guernsey Sea Farms, Guernsey, UK; animal rearing at Centre for Environment, Fisheries and Aquaculture Science (CEFAS), Weymouth, UK; DNA and RNA extraction at The Roslin Institute, University of Edinburgh, UK.
DNA and RNA sequencing: DNA at Edinburgh Genomics, University of Edinburgh, UK; RNA at Dresden Genome Center, Pfotenhauerstraße 108, 01307 Dresden, Germany.
Assembly / Assembly Map Integration: Lel Eory, Carolina Peñaloza, Alejandro Gutierrez, Alan Archibald, Ross Houston, Tim Bean, The Roslin Institute, University of Edinburgh, UK.
Hi-C library preparation: Alejandro Gutierrez, The Roslin Institute, University of Edinburgh, UK.
Scaffolding: Dovetail Genomics, 100 Enterprise Way, Scotts Valley, CA 95066, USA.
Linkage mapping: Alejandro Gutierrez, The Roslin Institute, University of Edinburgh, UK.
Cytogenetic mapping: Shan Wang, Ximing Guo, Haskin Shellfish Research Laboratory, Department of Marine and Coastal Sciences, Rutgers University, 6959 Miller Avenue, Port Norris, NJ 08349, USA.
Flow cytometry: Carolina Peñaloza, The Roslin Institute, University of Edinburgh, UK.
Assembly submission: Carolina Peñaloza, The Roslin Institute, University of Edinburgh, UK.
Data use
The Roslin Institute, University of Edinburgh, UK, has made the sequencing data freely available to the community. The assembly can be downloaded, analysed, or incorporated into databases on the condition that users properly acknowledge the data providers.
Alias ena-STUDY-THE ROSLIN INSTITUTE-13-11-2019-11:43:19:757-312
External Id PRJEB35351
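Two figures in the study abstract above rest on simple calculations: the ~600 Mb genome size (the midpoint of the flow cytometry and k-mer estimates) and the N50 values. The Python sketch below illustrates both. The function names are illustrative and not part of the submitters' pipeline, and the N50 example runs on made-up lengths because the real scaffold lengths are not included in this record.

```python
def midpoint(a, b):
    """Arithmetic midpoint of two estimates."""
    return (a + b) / 2


def n50(lengths):
    """Standard N50: the length L of the sequence at which the running total,
    taken over sequences sorted from longest to shortest, first reaches
    half of the total assembly length."""
    if not lengths:
        raise ValueError("n50 needs at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Estimates quoted in the abstract, in Mb.
flow_cytometry_mb = 640
kmer_mb = 555
print(midpoint(flow_cytometry_mb, kmer_mb))  # 597.5, rounded to ~600 Mb in the abstract

# Hypothetical lengths, for illustration only (not the real assembly).
toy_lengths = [50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 40
```

For the toy lengths, the total is 150, and the two longest sequences (50 + 40 = 90) are the first to cover at least half of it, so the N50 is 40.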
SRA Sample
SRA Sample Id ERS4058887
Title Pacific oyster sequenced
SRA Run
SRA Run Id ERR3727450
Spots 22918337
Bases 6875501100
Size 3175569724
Published Date 2019-12-20
Exp Library Strategy WGS
Library Source GENOMIC
Library Selection RANDOM
Library Name
Library Layout PAIRED
Library Instrument HI_SEQ_X_TEN
Exp. Description
Spot Length
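The Spot Length field is empty in this record, but it can be derived from the Spots and Bases values of the run. The Python sketch below does that and also estimates the depth this single run contributes, using the ~600 Mb genome size from the study abstract. The function name `run_stats` is illustrative, and the per-mate read length assumes both mates of a pair have the same length, which the record does not state.

```python
def run_stats(spots, bases, genome_size_bp, mates=2):
    """Derive simple per-run statistics from SRA run metadata."""
    spot_length = bases / spots           # bases per spot (all mates combined)
    read_length = spot_length / mates     # assumes equal-length mates
    coverage = bases / genome_size_bp     # depth contributed by this run alone
    return {
        "spot_length": spot_length,
        "read_length": read_length,
        "coverage": coverage,
    }


# Values from the SRA Run block of this record (ERR3727450).
stats = run_stats(
    spots=22_918_337,
    bases=6_875_501_100,
    genome_size_bp=600_000_000,  # ~600 Mb estimate from the study abstract
)
print(stats["spot_length"])         # 300.0
print(stats["read_length"])         # 150.0
print(round(stats["coverage"], 1))  # 11.5
```

A spot length of 300 is consistent with 2 x 150 bp paired reads. The ~11.5x figure covers this run only; the ~60x mean Illumina coverage quoted in the abstract presumably refers to the combined sequencing for the study, which this record does not break down.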