README - ENLIGHTEN DEPOSIT - RESEARCH DATA
###################################################################################################################
MANUSCRIPT: Parallelism in eco-morphology and gene expression despite variable evolutionary and genomic backgrounds in a Holarctic fish
AUTHORS: Arne Jacobs, Madeleine Carruthers, Andrey Yurchenko, Natalia V. Gordeeva, Sergei S. Alekseyev, Oliver Hooker, Jong S. Leong, David R. Minkley, Eric B. Rondeau, Ben F. Koop, Colin E. Adams & Kathryn R. Elmer
Journal: PLOS Genetics
Preprint: Convergence in form and function overcomes non-parallel evolutionary histories in a Holarctic fish (2019) bioRxiv; doi: https://doi.org/10.1101/265272
###################################################################################################################
Linear phenotypic measurements, SNP data and gene expression data for polymorphic Arctic charr populations from multiple lakes and evolutionary lineages. The data are described in the manuscript named above.

DATASETS:
1. Linear_measurements_raw.csv: CSV table containing raw linear measurements for all individuals used in the study. The legend is included in the file.
2. Combined_filtered_global_dataset.vcf: VCF file containing filtered ddRADseq-derived SNPs called jointly for individuals from Scotland and Siberia. This dataset was used for genetic analyses across lineages. Raw ddRADseq data are available in the NCBI SRA (see manuscript for accessions). See manuscript for filtering parameters.
3. Atlantic_filtered_SNPdataset.vcf: VCF file containing filtered ddRADseq-derived SNPs called only for individuals from Scotland and filtered separately for this dataset. See manuscript for filtering parameters.
4. Siberia_filtered_SNPdataset.vcf: VCF file containing filtered ddRADseq-derived SNPs called only for individuals from Siberia and filtered separately for this dataset. See manuscript for filtering parameters.
5. Popmap_SNPdata.csv: CSV file containing sample information (Name, Ecotype, Lake) for each individual in the combined VCF file.
6. HTSEQ_CHARR_GENE_COUNT_RAW.txt: Text file containing raw read counts for each annotated transcript, produced with HTSeq from genome-aligned RNAseq data. Sample information is given in RNAseq_samples_info.txt, and the draft annotation of the Salvelinus alpinus draft genome is given in Salvelinus_alpinus_draftgenome_annotation.gff.
7. RNAseq_samples_info.txt: Text file containing sample information (Name, Lake, Lineage, Ecotype) for each individual in the HTSEQ_CHARR_GENE_COUNT_RAW.txt file.
8. Salvelinus_alpinus_draftgenome_annotation.gff: GFF file containing the draft annotation of the Salvelinus alpinus draft genome used in the manuscript. See manuscript for the annotation pipeline.
9. Salvelinus_alpinus_draftgenome_annotation_proteins.fa: FASTA sequences for each annotated protein in the draft annotation of the Salvelinus alpinus draft genome used in the manuscript. See manuscript for the annotation pipeline.
###################################################################################################################
Notes:
The Arctic charr (Salvelinus alpinus) genome used in this study is described in: Christensen, Kris A., et al. "The Arctic charr (Salvelinus alpinus) genome and transcriptome assembly." PLoS ONE 13.9 (2018).
All raw sequence data used in this study can be found in the NCBI Sequence Read Archive (SRA).
###################################################################################################################
The data in this repository can be freely used by anyone, but please cite the relevant manuscript: either the final manuscript (no DOI yet; it can be found under the title given above) or the preprint (see details above).
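
EXAMPLE (illustrative only):
The popmap (dataset 5) describes the individuals in the combined VCF (dataset 2), so a quick consistency check is to confirm that every sample in the VCF has a popmap row. The sketch below is not part of the published analysis. It assumes that Popmap_SNPdata.csv has a header row with a column called "Name" whose values match the VCF sample IDs exactly; the function names and the demo sample names are made up for illustration. The only format fact relied on is that a VCF "#CHROM" header line has nine fixed columns followed by one column per sample.

```python
import csv
import io


def vcf_sample_names(lines):
    """Return the sample names from the '#CHROM' header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # The first nine columns (CHROM ... FORMAT) are fixed by the VCF
            # format; every column after them is one sample.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def read_popmap(handle, name_column="Name"):
    """Map each sample name to its popmap row.

    Assumption: the CSV has a header row containing `name_column`.
    """
    return {row[name_column]: row for row in csv.DictReader(handle)}


def missing_from_popmap(vcf_lines, popmap_handle):
    """List VCF samples that have no entry in the popmap."""
    popmap = read_popmap(popmap_handle)
    return [name for name in vcf_sample_names(vcf_lines) if name not in popmap]


# Tiny in-memory stand-ins with made-up sample names, so the sketch runs
# without the deposited files.
DEMO_VCF = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample_A\tsample_B\n",
    "contig1\t101\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n",
]
DEMO_POPMAP = "Name,Ecotype,Lake\nsample_A,ecotype_1,lake_1\n"

print(missing_from_popmap(DEMO_VCF, io.StringIO(DEMO_POPMAP)))  # ['sample_B']
```

To run the same check on the deposited files, open Combined_filtered_global_dataset.vcf as a text file and Popmap_SNPdata.csv with newline="" and pass both handles to missing_from_popmap. If the popmap uses a different header for the sample column, pass that name to read_popmap instead.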