What is LLM-as-a-Judge and how can we use it?
We have been exploring using an LLM to evaluate LLM-generated summaries. This approach utilises the speed and language understanding of LLMs to score summaries, but how much trust can we place in an 'LLM-as-a-Judge'?
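
To make the idea concrete, here is a minimal sketch of what a judging step could look like. It is an illustration only, not the setup we used: the prompt wording, the 1 to 5 scale, the "Score: <number>" reply format, and the names `build_judge_prompt`, `parse_score`, `judge_summary` and `fake_llm` are all assumptions. The model call is passed in as a plain function, so no particular LLM provider or API is implied.

```python
import re
from typing import Callable, Optional

# Hypothetical judge prompt: the wording and the 1-5 scale are assumptions.
JUDGE_PROMPT = """You are grading a summary of a source text.
Rate how faithfully and completely the summary reflects the source
on a scale from 1 (poor) to 5 (excellent).
Reply in the form "Score: <number>" followed by a one-sentence reason.

Source:
{source}

Summary:
{summary}
"""


def build_judge_prompt(source: str, summary: str) -> str:
    """Fill the judge prompt template with the text pair to be graded."""
    return JUDGE_PROMPT.format(source=source, summary=summary)


def parse_score(reply: str, low: int = 1, high: int = 5) -> Optional[int]:
    """Extract the integer score from the judge's reply.

    Returns None when no score is found or it falls outside the scale,
    so that malformed replies are not silently counted as valid.
    """
    match = re.search(r"Score:\s*(\d+)", reply)
    if match is None:
        return None
    score = int(match.group(1))
    return score if low <= score <= high else None


def judge_summary(
    source: str, summary: str, call_llm: Callable[[str], str]
) -> Optional[int]:
    """Ask the judge model (any prompt -> reply function) to score a summary."""
    return parse_score(call_llm(build_judge_prompt(source, summary)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to make the sketch runnable."""
    return "Score: 4. The summary covers the main point but omits a detail."


if __name__ == "__main__":
    print(judge_summary("Long source text...", "Short summary.", fake_llm))
```

In practice `fake_llm` would be replaced by a call to a real model. Treating unparseable or out-of-range replies as `None` keeps judge failures visible, which matters when deciding how far the scores can be trusted.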