De novo genome assembly and annotation of Australia’s largest freshwater fish, the Murray cod (Maccullochella peelii), from Illumina and Nanopore sequencing read

Christopher M Austin, Tan Mun Hua, Katherine A Harrisson, Yin Peng Lee, Laurence J Croft, Paul Sunnucks, Alexandra Pavlova, Han Ming Gan

Research output: Contribution to journal › Article › Research › peer-review

Abstract

One of the most iconic Australian fish is the Murray cod, Maccullochella peelii (Mitchell 1838), a freshwater species that can grow to ∼1.8 metres in length and live to age ≥48 years. The Murray cod is of conservation concern as a result of strong population contractions, but it is also popular for recreational fishing and is of growing aquaculture interest. In this study, we report the whole genome sequence of the Murray cod to support ongoing population genetics, conservation, and management research, as well as to better understand the evolutionary ecology and history of the species. A draft Murray cod genome of 633 Mbp (N50 = 109 974 bp; BUSCO and CEGMA completeness of 94.2% and 91.9%, respectively) with an estimated 148 Mbp of putative repetitive sequences was assembled from the combined sequencing data of 2 fish individuals with an identical maternal lineage; 47.2 Gb of Illumina HiSeq data and 804 Mb of Nanopore data were generated from the first individual, while 23.2 Gb of Illumina MiSeq data were generated from the second individual. The inclusion of Nanopore reads for scaffolding, followed by gap-closing using Illumina data, led to a 29% reduction in the number of scaffolds and a 55% and 54% increase in the scaffold and contig N50, respectively. We also report the first transcriptome of Murray cod, which was subsequently used to annotate the Murray cod genome, leading to the identification of 26 539 protein-coding genes. We present the whole genome of the Murray cod and anticipate this will be a catalyst for a range of genetic, genomic, and phylogenetic studies of the Murray cod and, more generally, other fish species of the Percichthyidae family.
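
As a brief illustration of the N50 statistic quoted in the abstract, the following minimal sketch computes N50 from a list of sequence lengths: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are assumptions made for illustration only; they are not taken from the study, and the abstract does not say which tool the authors used to compute assembly statistics.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (in bp)."""
    # Sort from longest to shortest so the longest sequences are counted first.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length

    # Reached only when no lengths were supplied.
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, not data from the Murray cod assembly).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # → 8
```

Because N50 depends only on the length distribution, merging sequences during scaffolding raises it without adding any new sequence, which is why it is usually reported together with completeness measures such as BUSCO.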
Original language: English
Article number: gix063
Number of pages: 6
Journal: GigaScience
Volume: 6
Issue number: 8
DOI: 10.1093/gigascience/gix063
Publication status: Published - 1 Aug 2017

Keywords

  • Genome
  • Hybrid assembly
  • Long reads
  • Murray cod
  • Transcriptome

Cite this

Austin, Christopher M ; Hua, Tan Mun ; Harrisson, Katherine A ; Lee, Yin Peng ; Croft, Laurence J ; Sunnucks, Paul ; Pavlova, Alexandra ; Gan, Han Ming. / De novo genome assembly and annotation of Australia’s largest freshwater fish, the Murray cod (Maccullochella peelii), from Illumina and Nanopore sequencing read. In: GigaScience. 2017 ; Vol. 6, No. 8.
@article{5311959a11b940ca9f8695d252f25bbb,
title = "De novo genome assembly and annotation of Australia’s largest freshwater fish, the Murray cod (Maccullochella peelii), from Illumina and Nanopore sequencing read",
abstract = "One of the most iconic Australian fish is the Murray cod, Maccullochella peelii (Mitchell 1838), a freshwater species that can grow to ∼1.8 metres in length and live to age ≥48 years. The Murray cod is of conservation concern as a result of strong population contractions, but it is also popular for recreational fishing and is of growing aquaculture interest. In this study, we report the whole genome sequence of the Murray cod to support ongoing population genetics, conservation, and management research, as well as to better understand the evolutionary ecology and history of the species. A draft Murray cod genome of 633 Mbp (N50 = 109 974 bp; BUSCO and CEGMA completeness of 94.2{\%} and 91.9{\%}, respectively) with an estimated 148 Mbp of putative repetitive sequences was assembled from the combined sequencing data of 2 fish individuals with an identical maternal lineage; 47.2 Gb of Illumina HiSeq data and 804 Mb of Nanopore data were generated from the first individual, while 23.2 Gb of Illumina MiSeq data were generated from the second individual. The inclusion of Nanopore reads for scaffolding, followed by gap-closing using Illumina data, led to a 29{\%} reduction in the number of scaffolds and a 55{\%} and 54{\%} increase in the scaffold and contig N50, respectively. We also report the first transcriptome of Murray cod, which was subsequently used to annotate the Murray cod genome, leading to the identification of 26 539 protein-coding genes. We present the whole genome of the Murray cod and anticipate this will be a catalyst for a range of genetic, genomic, and phylogenetic studies of the Murray cod and, more generally, other fish species of the Percichthyidae family.",
keywords = "Genome, Hybrid assembly, Long reads, Murray cod, Transcriptome",
author = "Austin, {Christopher M} and Hua, {Tan Mun} and Harrisson, {Katherine A} and Lee, {Yin Peng} and Croft, {Laurence J} and Sunnucks, Paul and Pavlova, Alexandra and Gan, {Han Ming}",
year = "2017",
month = "8",
day = "1",
doi = "10.1093/gigascience/gix063",
language = "English",
volume = "6",
journal = "GigaScience",
issn = "2047-217X",
publisher = "Oxford University Press",
number = "8",

}

