Comms and Marketing Spotlight – LLM as a Judge
“There is no joy in possession without sharing.” — Erasmus
A spotlight on how the project 'Using LLM as a Judge to evaluate Gen AI Search' was marketed effectively by Will Poulett.
We have been exploring the use of an LLM to evaluate LLM summaries. This approach draws on the speed and language understanding of LLMs to score summaries, but how much trust can we put in an 'LLM-as-a-Judge'?
We have built a proof-of-concept tool to help assurers, data scientists and clinicians evaluate AI classifiers. We call this the RISE tool. It uses LLMs, AI image generators and an interactive plot to let users easily evaluate image classifiers. We carried out careful experimentation to ensure its effectiveness, and plan to continue this research in the future.