Data upload associated with:

Citation [1]:
Harvey WT, Benton DJ, Gregory V, Hall JPJ, Daniels RS, Bedford T, Haydon DT, Hay AJ, McCauley JW and Reeve R (2016) Identification of low- and high-impact hemagglutinin amino acid substitutions that drive antigenic drift of influenza A(H1N1) viruses. PLOS Pathogens 12(4): e1005526. doi:10.1371/journal.ppat.1005526

This dataset has been extended to include all available former seasonal Influenza A(H1N1) (1977-2009) haemagglutination inhibition (HI) data generated by the WHO Collaborating Centre for Reference and Research on Influenza, London, UK.

Additional citation for full dataset [2]:
Gregory V, Harvey WT, Daniels RS, Reeve R, Whittaker L, Halai C, Douglas A, Gonsalves R, Skehel JJ, Hay AJ and McCauley JW (2016) Human former seasonal Influenza A(H1N1) haemagglutination inhibition data 1977 - 2009 from the WHO Collaborating Centre for Reference and Research on Influenza, London, UK. The Crick Worldwide Influenza Centre, The Francis Crick Institute, Mill Hill Laboratory, The Ridgeway, Mill Hill, London NW7 1AA, UK. doi:10.5525/gla.reasearchdata.289

1. H1N1.trees [1]
- Estimated phylogeny for the HA1 sequences of the 506 viruses used in the statistical analyses in the associated manuscript, plus A/Puerto Rico/8/34, the virus used for the described structural analysis.
- The phylogeny is the maximum clade credibility tree identified from a sample of posterior trees generated by BEAST v1.7.4.
- BEAST is a program for the Bayesian analysis of molecular sequences using MCMC. It is orientated towards the estimation of rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models.
- BEAST URL: http://beast.bio.ed.ac.uk

2. H1N1_HI.csv [1, 2]
- HI data covering 48,707 assays performed on dates from 1989 to 2010.
- The dataset includes 4,436 viruses and antisera raised against 92 reference viruses.
- For each entry, the file includes the following columns:
    - virus: test virus in the HI assay.
    - reference: reference virus used to raise the antiserum used in the HI assay.
    - titre: HI titre. "<" indicates that haemagglutination was not inhibited by the antiserum at a 1:40 dilution (the lower threshold of the assay).
    - dateOfTest: date of the HI test. In earlier years, the day and month may be unknown. In these cases, dates take the form "FRANCE-A-95", where France is the origin of the test viruses included in the HI table and 95 is the year. A, B etc. are used to indicate different (but unknown) dates.
    - in.analysis: "*" indicates that the entry was used in the analyses described in the associated manuscript [1].

3. H1N1_isolate_data.csv [1]
- Allows virus names used in the described analyses to be matched to names and isolate IDs in the GISAID EpiFlu database (http://platform.gisaid.org).
- For the 506 viruses analysed in the paper, the file includes:
    - Isolate: naming scheme consistent with H1N1.trees and H1N1_HI.csv
    - GISAID.Isolate.Name
    - GISAID.Isolate_Id
    - GISAID.HA.Segment_Id
    - Passage_History
    - Collection_Date

Also available:
The script used to process BEAST maximum clade credibility tree files for use in R, tree2R.rb, is available at https://github.com/richardreeve/tree2R.git
- Requires a phylogenetic tree (see file 1 above) and HI data for the associated viruses.
- The script generates the phylogenetic terms used in the statistical analyses described in the associated manuscript, creating a unique identifier ("Control.*") for each branch in the phylogenetic tree.
- For each observation in the associated HI data, each branch of the phylogenetic tree is categorised as either separating the reference virus and test virus in a bifurcating tree ("Control.*" = 1) or not ("Control.*" = 0).
- The phylogenetic terms ("Control.*") can then be investigated as correlates of HI titres in R using the described methods.
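
The branch categorisation described above can be illustrated with a minimal Python sketch. It is not the tree2R.rb implementation: the toy tree, the node labels, the function names, and the convention of naming each branch after its child node are all assumptions made for illustration, and the real script may number branches and treat the two branches at the root differently. The idea shown is only that a branch separates two viruses when it lies on the path between them in the rooted tree.

```python
def branches_to_root(tip, parent):
    """Return the branches (each named by its child node) between a tip and the root."""
    branches = set()
    node = tip
    while node in parent:  # the root has no parent entry, so the walk stops there
        branches.add(node)
        node = parent[node]
    return branches


def control_terms(reference, virus, parent):
    """Return a 0/1 indicator per branch: 1 if it separates reference and test virus."""
    # Branches shared by both root paths lie above the common ancestor and do not
    # separate the pair, so the symmetric difference leaves the connecting path.
    separating = branches_to_root(reference, parent) ^ branches_to_root(virus, parent)
    return {f"Control.{branch}": int(branch in separating) for branch in parent}


# Hypothetical toy tree (not from H1N1.trees): ((A, B)n1, C)n2 and D descend from root.
PARENT = {"A": "n1", "B": "n1", "n1": "n2", "C": "n2", "n2": "root", "D": "root"}

if __name__ == "__main__":
    for ref, virus in [("A", "B"), ("A", "C"), ("A", "D")]:
        print(ref, virus, control_terms(ref, virus, PARENT))
```

Applied to each row of the HI data, with the reference and test virus names as the two tips, this would yield one 0/1 column per branch, which is the form in which the "Control.*" terms are described as entering the analysis in R.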