SIPHER Synthetic Population for Individuals in Great Britain, 2019-2021

Lomax, N., Hoehn, A., Heppenstall, A., Purshouse, R., Wu, G., Zia, K. and Meier, P. (2024) SIPHER Synthetic Population for Individuals in Great Britain, 2019-2021. [Data Collection]

Datacite DOI: 10.5255/UKDA-SN-9277-1

Collection description

Abstract copyright UK Data Service and data collection copyright owner.

The SIPHER Synthetic Population allows for the creation of a survey-based full-scale synthetic population for all of Great Britain, through a linkage with the UK Household Longitudinal Study (UKDS SN 6614, Understanding Society, wave k). By drawing on data reflecting 'real' survey respondents, the dataset represents over 50 million synthetic (i.e. 'not real') individuals. As a digital twin of the adult population in Great Britain, the SIPHER Synthetic Population provides a novel source of microdata for understanding the 'status quo' and modelling 'what if' scenarios (e.g., via static or dynamic microsimulation models), as well as for other exploratory analyses where a granular geographical resolution is required.

The lack of a centralised and comprehensive register-based system in Great Britain limits opportunities for studying the interaction of aspects such as health, employment, benefit payments, or housing quality at the level of individuals and households. At the same time, the data that exist are typically strictly controlled and only available in safe haven environments under a 'create-and-destroy' model. In particular, when testing policy options via simulation models where results are required swiftly, these limitations can present major hurdles to coproduction and collaborative work connecting researchers, policymakers, and key stakeholders. In some cases, survey data can provide a suitable alternative where administrative data are not readily available. However, survey data typically do not allow for a small-area perspective. Although Special Licence area-level linkages of survey data can offer more detailed spatial information, the data coverage and statistical power might be too low for meaningful analysis.

As the SIPHER Synthetic Population is the outcome of a statistical creation process, all results obtained from this dataset should always be treated as 'model output', including basic descriptive statistics. Accordingly, the SIPHER Synthetic Population should not replace the underlying Understanding Society survey data for standard statistical analyses (e.g., standard regression analysis, or longitudinal multi-wave analysis). Please see the User Guide provided for this dataset for further information on creation and validation.

This research was conducted as part of the Systems Science in Public Health and Health Economics Research (SIPHER) Consortium and we thank the whole team for valuable input and discussions which have informed this work.

MAIN TOPICS:
LSOA/Data Zone modelling data. The SIPHER Synthetic Population is a digital twin of the adult population aged 16 years and older in Great Britain. It reflects more than 50 million synthetic individuals, all of whom are represented through 'real' individuals covered in the Understanding Society survey. The dataset is a large-scale, two-variable file including the variables 'pidp' and 'synthetic_zone'. The dataset shared is intended for linkage with Understanding Society survey data files such as 'k_indresp' and 'k_hhresp' using the survey's person identifier variable ('pidp'). Please see the User Guide provided for this dataset for further information on linkages and intended applications.
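The linkage described above can be sketched in Python with pandas. This is a minimal illustration under stated assumptions, not the procedure from the User Guide: the two data frames are tiny in-memory stand-ins rather than the real files, the zone codes are made up, and 'example_value' is a hypothetical placeholder for whichever survey variable is of interest. Only the column names 'pidp' and 'synthetic_zone' come from the dataset description.

```python
import pandas as pd

# Toy stand-in for the synthetic population file: one row per synthetic
# individual. A real respondent ('pidp') may appear many times.
# The zone codes "Z1".."Z3" are invented for this example.
synthetic_population = pd.DataFrame({
    "pidp": [101, 101, 102, 103, 103, 103],
    "synthetic_zone": ["Z1", "Z2", "Z1", "Z2", "Z2", "Z3"],
})

# Toy stand-in for a survey file such as 'k_indresp': one row per real
# respondent. 'example_value' is a hypothetical survey variable.
k_indresp = pd.DataFrame({
    "pidp": [101, 102, 103],
    "example_value": [10.0, 20.0, 30.0],
})

# Attach survey attributes to every synthetic individual via 'pidp'.
# validate="many_to_one" raises an error if 'pidp' is not unique in the
# survey file, which would silently duplicate synthetic individuals.
linked = synthetic_population.merge(
    k_indresp, on="pidp", how="left", validate="many_to_one"
)

# Small-area summary: number of synthetic individuals with a value,
# and the mean of that value, per zone.
zone_summary = linked.groupby("synthetic_zone")["example_value"].agg(
    n_synthetic="count",
    mean_example_value="mean",
)
print(zone_summary)
```

A left join keeps every synthetic individual, so the linked table has the same number of rows as the synthetic population file. Any summary produced this way remains 'model output' in the sense described in the collection description.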

Funding:
Keywords: Administrative areas, census data, modelling, research and development, simulation models, surveys.
College / School: College of Medical Veterinary and Life Sciences > School of Health and Wellbeing > MRC/CSO Unit
College of Social Sciences > School of Social and Political Sciences > Urban Studies
Date Deposited: 17 Jul 2024 14:52
URI: https://researchdata.gla.ac.uk/id/eprint/1678

Available Files

There are no files for this dataset available to download.


Lomax, N., Hoehn, A., Heppenstall, A., Purshouse, R., Wu, G., Zia, K. and Meier, P. (2024); SIPHER Synthetic Population for Individuals in Great Britain, 2019-2021

UK Data Service

DOI: 10.5255/UKDA-SN-9277-1

Retrieved: 2025-01-20