HSMA Course Project: Waiting List Data Science Proofs of Concept
Sean and Amaia have spent the last 15 months on the Health Service Modelling Associates (HSMA) Programme, which is soon coming to an end. The programme consists of 6 months of weekly lectures on a range of data science, operational research, coding, and web development topics, followed by a 9-month project. It is run by the University of Exeter PenChord team and is available for free to anyone working in public healthcare or policing.
During the course, we opted for our project to be a set of proofs of concept using the Waiting List Minimum Data Set (WLMDS) to identify data science techniques that may provide novel insights into this dataset. Read more about the project itself here. This blog post is a summary of our project, as well as a reflection on the past 15 months of the course.
Why did you decide to attend the HSMA programme?
Amaia: Having joined the data science team at the end of my NHS Digital graduate scheme, I got stuck into my projects straight away and focused on the skills those specific projects needed: the Probabilistic Data Linkage Model and Developing Artificial Primary Care Records. I came to realise that my wider data science knowledge and formal training were quite lacking, and I wanted to get insight into a wider range of techniques.
Sean: In my role within the data science team, I’ve mostly focused on data pipeline development and data linkage, particularly on the CVD Pathways project. I wanted to improve my ability to generate actionable insights from data, not just process and transform it. The HSMA course also looked like a great way to gain formal training in a broader range of data science and analytical techniques, and to explore topics I don’t encounter much in my day-to-day work, like geographical modelling and discrete event simulation.
How did you find the course?
Sean: Overall, I found the course really valuable. The first section focused on Python coding skills, which was a helpful refresher. Once the course moved into more advanced areas, such as machine learning and event simulation, I found it really engaging and it provided a great overview of what these tools are and how they can be used. Having this formal background and structured walkthrough has definitely improved both my knowledge and my confidence in applying the right data science solutions to different problems. The project phase was particularly interesting! I appreciated and enjoyed the opportunity to delve into a problem that was outside of my normal project work, and apply techniques we had learnt from the taught part of the course. I also want to give a big shout out to Dan and Sammi who organise and run the course - their help has been invaluable, especially when it came to asking questions and trying to understand the technical concepts.
Amaia: Whilst the course started quite slowly, as it's designed to teach clinicians and other non-technical staff how to code in Python, it quickly ramped up and I found myself learning about a massive range of techniques every single week. I now feel like I have a comprehensive set of notes I can refer to if I'm ever in doubt about which technique is most appropriate, or how to improve a particular model. I found the project phase really challenging, particularly at the beginning, because I struggled to come up with a project that would be impactful, reusable, and allow me to practise as many of my new skills as possible. However, once we found our feet, I really enjoyed being able to apply my skills, and getting that committed time each week to work on it. Dan and Sammi, who ran the course, are also great, and always willing to help if you got stuck!
Who would you recommend the course to?
Amaia: Whilst I feel like I gained a lot from being on the course, from observing my course mates I think those who found the most value in it were technically savvy clinicians who hadn't necessarily coded before but had some technical knowledge. I think those are the people who stand to gain the most from this opportunity!
Sean: I’d recommend this course to anyone without a formal data science background, or anyone coming from a different field, such as clinical work. There’s a lot of content, but the course builds a solid foundation and is a great starting point for using data science more effectively in your work.
What was your favourite part of the course?
Amaia: I absolutely loved going into the nitty-gritty of Discrete Event Simulations, as it's not something I had much knowledge of at all, and I feel like I learnt a huge amount. I also really enjoyed the web development (Streamlit) parts of the course, as I'd never properly used it, and I can see it being applicable to so many of my projects.
Sean: I really enjoyed learning about geographical modelling and deploying web apps with Streamlit. Understanding how to use geographic data to explore things like patient journey times or the closest clinic locations gave me lots of ideas to apply in my own projects. Learning Streamlit was great too, and I can see its potential for sharing insights quickly and making information more accessible to others.
What's something that you've learned that will feed into your day job or projects?
Sean: Two things stand out. First, I’ve learned a lot about how to choose analytical or machine learning approaches for a specific question, which I plan to use in future work on the CVD Pathways project (especially for understanding and predicting patient outcomes). Second, I’ve learned a lot about scoping and defining analytical problems, which is a valuable skill for my current role and for projects down the line. It’s really helped me understand the process of turning a stakeholder question of “we want to see this” into an actionable data science plan.
Amaia: This is actually something that's already happened! I have used Streamlit to showcase some of the work on Developing Artificial Primary Care Records to colleagues and stakeholders, with great success. I am also hoping that the work Sean and I did on WLMDS can be used by the Elective Recovery team to feed into future work, as they have said in the past that they wanted to implement data science techniques but never had the capacity to. Our codebase should offer a reusable solution for a range of modelling options, which should speed their process up. Through working with Sean, who despite being in my team I had never worked on a project with, I learnt a lot about object-oriented programming, which I will carry forward. This is an opportunity I would not have had if I had not done HSMA.
Check out the HSMA website for more information, and look out for the announcement of next year's course (which they think will start in about April 2026).