Genomic and transcriptomic data for the frog Platyplectrum ornatum
dataset
posted on 2022-06-10, 02:45 authored by Scott Edwards, Sangeet Lamichhaney, Renee Catullo, Scott Keogh, Simon Clulow, Tariq Ezaz
The diversity of genome sizes across the tree of life is of key interest in evolutionary biology. Various correlates of variation in genome size, such as accumulation of transposable elements or rate of DNA gain and loss, are well known, but the underlying molecular mechanisms that drive or constrain genome size are poorly understood. Here we study one of the smallest genomes among frogs characterized thus far, that of the ornate burrowing frog (Platyplectrum ornatum) from Australia, and compare it to other published frog and vertebrate genomes to examine the forces driving reduction in genome size. At ~1.06 Gb, the P. ornatum genome is like that of birds, revealing four major mechanisms underlying TE dynamics: reduced abundance of all major classes of transposable elements (TEs); increased net deletion bias in TEs; drastic reduction in the lengths of introns; and expansion via gene duplication of the repertoire of TE-suppressing Piwi genes, accompanied by increased expression of piRNA-based TE-silencing pathway genes in germline cells. Transcriptome data from multiple tissues in both sexes corroborate these results and provide insight into sex-differentiation pathways in Platyplectrum. Genome skimming of two closely related frog species (Lechriodus fletcheri and Limnodynastes fletcheri) confirms a reduction in TEs as a major driver of genome reduction in Platyplectrum and supports a macroevolutionary scenario of small genome size in frogs driven by convergence in life history, especially rapid tadpole development and tadpole diet. The P. ornatum genome offers a model for future comparative studies on mechanisms of genome size reduction in amphibians and in vertebrates generally.
Methods
Genomic sequence data for an ornate burrowing frog (Platyplectrum ornatum) were generated from DNA isolated from the muscle of an adult female. Based on the expected genome size of 0.96 Gb, the P. ornatum genome was sequenced to ~140X coverage (Supplementary Table 2) on an Illumina HiSeq 2500 platform using two sequencing libraries: (a) a fragment library with an average insert size of 220 bp and (b) a jumping library with an average insert size of 6 kb. Together these generated ~1.2 billion paired-end reads of 125 bp each. We also collected brain, heart, muscle and gonad tissues from one male and one adult female P. ornatum and generated ~540 million transcriptome reads for use in genome annotation. We first generated a female transcriptome assembly with Trinity, combining RNAseq data from brain, gonad, heart and muscle. In addition, we mapped RNAseq data from the male individual to the reference P. ornatum genome assembly using TopHat and extracted exon/intron junction information for downstream use in genome annotation.
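As a rough check on the sequencing depth quoted in the Methods, fold coverage can be estimated as total sequenced bases divided by genome size. The sketch below is illustrative only: it assumes that "~1.2 billion" counts individual reads rather than read pairs, takes the rounded figures at face value, and uses the ~1.06 Gb genome size from the abstract as a second reference point. The function name is hypothetical and not part of the deposited files.

```python
def fold_coverage(n_reads: float, read_length_bp: int, genome_size_bp: float) -> float:
    """Expected sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


# Rounded values taken from the text; treating 1.2e9 as the number of
# individual 125 bp reads is an assumption.
N_READS = 1.2e9
READ_LENGTH_BP = 125

for label, genome_size_bp in (
    ("expected genome size, 0.96 Gb", 0.96e9),
    ("genome size from the abstract, ~1.06 Gb", 1.06e9),
):
    depth = fold_coverage(N_READS, READ_LENGTH_BP, genome_size_bp)
    print(f"{label}: {depth:.0f}X")
```

Under these assumptions the estimate is roughly 156X against 0.96 Gb and roughly 142X against 1.06 Gb, so the quoted ~140X is of the expected order; Supplementary Table 2 remains the authoritative source for the actual depth.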
Usage Notes
We provide detailed results of our RepeatMasker analysis, which can be parsed and used in meta-analyses of repeat landscapes in vertebrates. We also provide files and results for the OrthoFinder analysis, which is based on a rooted species tree and yields counts of orthologs in each species; these counts are provided in a table ("Gene_count_per_orthogroup.txt"). The gene trees for each ortholog group are also provided. Finally, we provide fasta files and the gene tree in newick format used in the analysis of Piwi genes. The full results of the aBSREL analysis are also provided.
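One way to work with the per-orthogroup gene counts is sketched below. The layout of "Gene_count_per_orthogroup.txt" is not described here, so the sketch assumes a tab-delimited table with a header row, an orthogroup identifier in the first column, one column per species, and an optional "Total" column (the layout OrthoFinder typically uses for its gene count output). The function names, species names and counts in the example are made up for illustration; check the real file before relying on any of this.

```python
import csv
import io


def read_gene_counts(handle):
    """Return {orthogroup_id: {species: gene_count}} from a tab-delimited table.

    Assumes the first column holds the orthogroup ID and every other column
    (except an optional "Total" column) holds an integer count for one species.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    id_column = reader.fieldnames[0]
    counts = {}
    for row in reader:
        counts[row[id_column]] = {
            species: int(n)
            for species, n in row.items()
            if species not in (id_column, "Total")
        }
    return counts


def single_copy_orthogroups(counts):
    """List orthogroups with exactly one gene in every species."""
    return [
        orthogroup
        for orthogroup, per_species in counts.items()
        if all(n == 1 for n in per_species.values())
    ]


# Made-up example data standing in for the real table.
example = (
    "Orthogroup\tspeciesA\tspeciesB\tTotal\n"
    "OG0000000\t3\t1\t4\n"
    "OG0000001\t1\t1\t2\n"
)
counts = read_gene_counts(io.StringIO(example))
print(single_copy_orthogroups(counts))
```

To use the deposited table, replace the `io.StringIO(example)` handle with `open("Gene_count_per_orthogroup.txt", newline="")`, adjusting the delimiter or column handling if the file differs from the assumed layout.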