Important: Disclaimer
This is not the official site but a set of brief descriptions of our recent work, published to support transparency and collaboration. For more information about NHS England, please visit our official website.
Using Model Class Reliance to Understand the Impact of Commercial Data on Predictions
“How to assess the value that commercial sales data on over-the-counter medication adds to respiratory death predictions”
Figure 1: Schematic of the difference between other variable importance tools and the Model Class Reliance approach to explaining the value of a single input variable in a prediction
The primary aim of the project was to apply a novel variable importance technique, model class reliance (MCR), to machine learning models that predict registered respiratory deaths in the UK. The objective was to assess the value of commercial health data in healthcare predictions compared with other available datasets.
Results
To apply MCR, a set of optimal models first had to be created that could successfully make the required predictions. The project achieved this with the machine learning model PADRUS (Predicting the amount of deaths by respiratory disease using sales). PADRUS is a random forest regressor that makes accurate weekly predictions of respiratory deaths in 314 local authorities across England, 17 days in advance. The model's features are created from the following dataset types:
- week number,
- commercial sales,
- weather,
- indices of multiple deprivation,
- age and population,
- demographics,
- housing, and
- land use.
MCR was applied to PADRUS to show the highest and lowest impact each variable could have on predictions across all instances of the model. Grouped MCR was also used so that variables could be evaluated together, as the collection of features created from a single dataset type.
The MCR results implied that model instances of PADRUS were using variables in different ways to achieve the same predictive results, and suggested where variables could be interchangeable or critical to predictions.
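The idea behind MCR and grouped MCR can be illustrated with a toy sketch. It is not PADRUS and not the MCR procedure used in the project for random forests. The data is synthetic, and the model family (`make_model`), the tolerance `EPSILON` and the variable names `x1`, `x2`, `x3` are all assumptions made for illustration. Reliance is measured here as the ratio of error after permuting a variable to the original error, and the range of that ratio across all near-optimal models gives a lower bound (MCR-) and an upper bound (MCR+).

```python
import random

random.seed(0)

# Synthetic data, made up for illustration only.
# x1 and x2 carry almost the same signal; x3 carries independent signal.
N = 400
X, y = [], []
for _ in range(N):
    x1 = random.gauss(0, 1)
    x2 = x1 + random.gauss(0, 0.02)
    x3 = random.gauss(0, 1)
    X.append((x1, x2, x3))
    y.append(x1 + x3 + random.gauss(0, 0.1))


def make_model(w):
    """Hypothetical model that splits the shared signal between x1 and x2."""
    return lambda x: w * x[0] + (1 - w) * x[1] + x[2]


def mse(model, X, y):
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def permuted(X, cols, rng):
    """Shuffle the given columns jointly across rows (a group moves together)."""
    order = list(range(len(X)))
    rng.shuffle(order)
    out = []
    for i, x in enumerate(X):
        x_new = list(x)
        for c in cols:
            x_new[c] = X[order[i]][c]
        out.append(tuple(x_new))
    return out


def model_reliance(model, X, y, cols, n_repeats=5, seed=1):
    """Permuted error divided by original error; about 1 means no reliance."""
    rng = random.Random(seed)
    base = mse(model, X, y)
    perm = sum(mse(model, permuted(X, cols, rng), y) for _ in range(n_repeats))
    return (perm / n_repeats) / base


# Candidate models, and the subset whose error is within EPSILON of the best.
candidates = {w: make_model(w) for w in (0.0, 0.25, 0.5, 0.75, 1.0)}
losses = {w: mse(m, X, y) for w, m in candidates.items()}
best = min(losses.values())
EPSILON = 0.25  # assumed tolerance defining "equally good" models
rashomon = [m for w, m in candidates.items() if losses[w] <= (1 + EPSILON) * best]


def mcr(cols):
    """Lowest and highest reliance on `cols` across the near-optimal models."""
    values = [model_reliance(m, X, y, cols) for m in rashomon]
    return min(values), max(values)


results = {
    "x1": mcr([0]),
    "x2": mcr([1]),
    "x3": mcr([2]),
    "group(x1, x2)": mcr([0, 1]),  # grouped MCR: permute both columns together
}
for name, (lo, hi) in results.items():
    print(f"{name:14s} MCR- = {lo:8.2f}   MCR+ = {hi:8.2f}")
```

In this toy setting, `x1` and `x2` each have a lower bound near 1 and a large upper bound, because some equally good models ignore one of them and lean on the other. This is the pattern that suggests interchangeable variables. `x3` and the group `(x1, x2)` keep a high lower bound, meaning no near-optimal model can do without them, which is the pattern that suggests a critical variable or dataset.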
The addition of commercial data showed a significant increase in predictive power. Further results are being withheld while a publication is under review.
Output | Link |
---|---|
Open Source Code & Documentation | GitHub |
Case Study | Awaiting Sign-Off |
Technical report | Here |