This data collection is derived from two sources: 1) submissions of DNA sequences of Saccharomyces cerevisiae (yeast), Sus scrofa (pig) and Homo sapiens (human) to the European Nucleotide Archive, and 2) the first descriptions of these sequences in the scientific literature. The records span 1980-2000 (yeast), 1985-2005 (human) and 1990-2015 (pig). Each species has two associated datasets: 1) a .csv file documenting the PubMed ID of each article describing new sequences, all authors of the article, all institutional affiliations of each author, the country of each institution, the year of first submission to the European Nucleotide Archive, and the year of article publication, and 2) a .csv file documenting all institutions submitting to the European Nucleotide Archive, the number of nucleotides sequenced, the number of submissions per institution, and the year of submission to the database. Across the three species, the collection contains approximately 30,000 publications and over 2 million sequence submissions.
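As a rough illustration of how either .csv file might be summarised by year, the sketch below counts rows per year using only the Python standard library. The column names (`publication_year`, `submission_year`) and the file name in the usage comment are hypothetical placeholders; the actual headers and file names in the collection may differ and should be checked before use.

```python
import csv
from collections import Counter
from typing import IO

# Hypothetical column names: check the real header row of each .csv file.
PUBLICATION_YEAR_COLUMN = "publication_year"
SUBMISSION_YEAR_COLUMN = "submission_year"


def count_rows_by_year(handle: IO[str], year_column: str) -> Counter:
    """Count rows per year in an open CSV file that has a header row.

    Rows with an empty or missing year are skipped. Years are assumed
    to be written as plain integers (e.g. 1996).
    """
    counts: Counter = Counter()
    for row in csv.DictReader(handle):
        year = (row.get(year_column) or "").strip()
        if year:
            counts[int(year)] += 1
    return counts


# Example usage (the file name is an assumption, not part of the collection):
# with open("yeast_publications.csv", newline="", encoding="utf-8") as f:
#     per_year = count_rows_by_year(f, PUBLICATION_YEAR_COLUMN)
#     for year in sorted(per_year):
#         print(year, per_year[year])
```

Passing the column name as an argument keeps the function usable for both the publication file and the submission file, since only the header differs.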