Resources in swpc_synthea
The resources in swpc_synthea are structured as shown below:
├── src/main/resources
| ├── export
| ├── geography
| | ├── demographics.csv
| | ├── foreign_birthplace.json
| | ├── postcodes.csv
| | ├── sdoh.csv
| | ├── timezones.csv
| ├── keep_modules
| ├── modules
| ├── physiology
| ├── providers
| | ├── hospitals.csv
| | ├── primary_care_facilities.csv
| | ├── urgent_care_facilities.csv
| ├── templates
| biometrics.yml
| birthweights.csv
| bmi_correlations.json
| cdc_growth_charts.json
| cdc_wtleninf.csv
| gdb_disability_weights.csv
| growth_data_error_rates.json
| htn_drugs.csv
| immunization_schedule.json
| language_lookup.json
| names.yml
| nhanes_two_year_olds_bmi.csv
| race_ethnicity_codes.json
| shr_mapping.csv
| telemedicine_config.json
| us_core_mapping.csv
We'll now highlight a few key files used as simulation inputs.
geography/demographics.csv
This file sets the general demographics for the population created during this simulation, such as age distributions and town populations. The meanings of the different columns in this file can be found in the table below.
The demographic breakdowns for the UK version were done for the full region rather than per town, as town-level information was not available.
Column Name | Contains | Data Sources for UK Version |
---|---|---|
ID | Town ID number | Monotonically increasing from 1 |
COUNTY | County code where the town is located | TOWN_211CODE column from ONS data table |
NAME | Town name | TOWN_211NAME column from the ONS table above |
STNAME | Region name | REGION/COUNTRY from the ONS table above |
POPESTIMATE2015 | Population of the town | TOTAL population rows for 2019 from the ONS data above |
CTYNAME | Name of the county where the town is located | A range of Wikipedia pages for the towns |
TOT_POP | Total county population | ONS Region data |
Ethnicity (includes ASIAN, BLACK, MIXED, WHITE, OTHER columns) | Percentage of the population of each ethnicity [1] | NHS Health Survey England |
Ages (includes all age breakdown columns) | Percentage of the population in each age group | Another NHS Health Survey England [2] |
Income (includes all income breakdown columns) | Percentage of the population in each income bracket | ONS Employment data [3] |
LESS_THAN_HS | Fraction of people with no qualifications, or level 1 or 2 of education (as classified by ONS) | ONS Education data |
HS_DEGREE | Fraction of people with level 3 education | Same as above |
SOME_COLLEGE | Fraction of people with apprenticeships | Same as above |
BS_DEGREE | Fraction of people with level 4 education | Same as above |
Many of the above sources are complemented by data from the 2021 census.
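As a quick sanity check on an edited demographics.csv, the column layout described in the table can be verified before running a simulation. The following is a minimal sketch, not part of swpc_synthea: the helper name `missing_columns` is hypothetical, and only the columns named explicitly in the table are checked, since the exact names of the age and income breakdown columns are not listed here.

```python
import csv
import io

# Column names taken from the table above. The age and income breakdown
# columns are left out because their exact names are not listed there.
EXPECTED_COLUMNS = [
    "ID", "COUNTY", "NAME", "STNAME", "POPESTIMATE2015", "CTYNAME", "TOT_POP",
    "ASIAN", "BLACK", "MIXED", "WHITE", "OTHER",
    "LESS_THAN_HS", "HS_DEGREE", "SOME_COLLEGE", "BS_DEGREE",
]


def missing_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header.

    Hypothetical helper for illustration; csv_file is any open text file.
    """
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


# In-memory stand-in for demographics.csv (illustrative header only).
sample = io.StringIO("ID,COUNTY,NAME,STNAME,POPESTIMATE2015,CTYNAME,TOT_POP\n")
print(missing_columns(sample))
```

To check the real file, open src/main/resources/geography/demographics.csv with `open(path, newline="")` and pass the file object to the same function.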
geography/sdoh.csv
This file contains information on social determinants of health for the different regions.
Column Name | Contains | Data Sources for UK Version |
---|---|---|
FOOD_INSECURITY | Percentage of people with food insecurity | Sheffield University study |
SEVERE_HOUSING_COST_BURDEN | Percentage of people with severe housing cost burden [4] | Government English Housing Survey |
UNEMPLOYMENT | Percentage of people who are unemployed | ONS local labour market data |
NO_VEHICLE_ACCESS | Percentage of the population with no access to a vehicle | ONS census data |
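The footnote on SEVERE_HOUSING_COST_BURDEN describes the estimate as the number of mortgagors plus renters who found their housing costs very or fairly difficult to afford, divided by the total number of people surveyed. A minimal sketch of that arithmetic follows. The function name and the example counts are hypothetical and are not survey figures; the function returns a fraction, so multiply by 100 if a percentage is needed.

```python
def severe_housing_cost_burden(mortgagors_struggling, renters_struggling, total_surveyed):
    """Share of respondents who found housing costs very or fairly difficult to afford.

    Hypothetical helper: (mortgagors + renters in difficulty) / total surveyed.
    """
    if total_surveyed <= 0:
        raise ValueError("total_surveyed must be positive")
    return (mortgagors_struggling + renters_struggling) / total_surveyed


# Made-up counts purely for illustration; these are not survey figures.
burden = severe_housing_cost_burden(120, 380, 5000)
print(f"{burden:.1%}")
```

Because the underlying data was only available for the whole country, the same value would apply to every region.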
geography/postcodes.csv
Originally called zipcodes.csv, but renamed to use the British English term. Postcode data was found here.
modules/
Stores the clinical modules, saved as JSON files. Most of these are currently based on the original US Synthea™ version. See index.md for a list of changed modules.
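To get an overview of which modules are present, the JSON files can be inspected directly. The sketch below assumes each module is a JSON object with a top-level `name` string and a `states` object, as in upstream Synthea modules; the helper name `summarize_modules` is hypothetical, and only files directly inside the given directory are read, not subdirectories.

```python
import json
from pathlib import Path


def summarize_modules(modules_dir):
    """Map each module file name to its declared name and number of states.

    Assumes each file holds a JSON object with optional "name" and "states"
    keys; files in subdirectories are not visited.
    """
    summary = {}
    for path in sorted(Path(modules_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            module = json.load(handle)
        summary[path.name] = {
            # Fall back to the file name if the module has no "name" key.
            "name": module.get("name", path.stem),
            "state_count": len(module.get("states", {})),
        }
    return summary
```

For example, `summarize_modules("src/main/resources/modules")` returns one entry per module file, which can be compared against the list of changed modules in index.md.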
providers/
These files set different medical facilities for patients to attend in the simulation.
GP practices in the South West were taken from the NHS Digital GP Practice Data, and postcodes were converted to latitude and longitude using the Grid Reference Finder.
1. Ethnicity categories were changed from the American version to align better with UK ethnicity breakdowns.
2. Under-18 age groups are set to 0, as we are currently only interested in an adult population.
3. Income brackets don't match exactly, so the breakdowns within the brackets used in Synthea had to be estimated.
4. We used the number of mortgagors who found affording their mortgage very or fairly difficult (table AT2_8) plus the number of renters who found affording their rent very or fairly difficult, divided by the total number of people surveyed in that study. This data was only available for the whole country.