Considerations for Building a Language Corpus with a Focus on the NHS
What tools are available to build an NHS-focussed collection of texts that could help developers build better NLP tools for the healthcare system?
We aimed to explore how to build an Open, Representative, Extensible and Useful set of tools to curate, enrich and share sources of healthcare text data in an appropriate manner.
Results
Although we developed a tool stack that achieved many of our objectives, the key learning points concerned the knowledge gaps that need to be addressed, at both the data and tooling level, before bringing these data together becomes achievable.
Output | Link |
---|---|
Open Source Code & Documentation | Github |
Case Study | n/a |
Blog | Here |