More AI News - Random Benchmarks Off Reddit
Here are some LLM benchmark results that turned up on Reddit.
"This project benchmarks 41 open-source large language models across 19 evaluation tasks using the lm-evaluation-harness library. Benchmarks are grouped into three categories, with the corresponding tasks and metrics listed below."
https://github.com/jayminban/41-llms-evaluated-on-19-benchmarks
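For anyone curious how a run like this is set up, here is a minimal sketch using lm-evaluation-harness's Python entry point. This is not the repo's actual script: the checkpoint name and task list are placeholders, and argument names can differ slightly between harness versions.

```python
# Minimal sketch of an lm-evaluation-harness run (API details may vary by version).
# The checkpoint and task list below are illustrative placeholders, not the exact
# 41-model / 19-task setup from the linked repo.
import json

import lm_eval  # pip install lm-eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-1.4b",  # placeholder checkpoint
    tasks=["hellaswag", "arc_easy", "winogrande"],   # placeholder task subset
    num_fewshot=0,
    batch_size=8,
)

# Per-task metrics (accuracy, normalized accuracy, etc.) live under "results".
print(json.dumps(results["results"], indent=2, default=str))
```

Scaling this sketch to 41 models is mostly a loop over checkpoint names, which is presumably what the repo automates.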