Hi Jannis,
You might want to try asking this question at ai4lam.org <https://sites.google.com/view/ai4lam> too:
Google Group: https://groups.google.com/forum/#!forum/ai4lam
Slack: https://join.slack.com/t/ai4lam/shared_invite/zt-1omthldn8-9vrGySjIRdija1nKQm0ltA
Data for training and testing models is a frequent topic there.
- Tom
From: Code for Libraries <[log in to unmask]> on behalf of Ohms, Jannis <[log in to unmask]>
Date: Friday, December 6, 2024 at 4:59 AM
To: [log in to unmask] <[log in to unmask]>
Subject: [CODE4LIB] Are there datasets to evaluate the quality of a document embedding
Dear all,
I am currently developing a RAG (https://en.wikipedia.org/wiki/Retrieval-augmented_generation) application.
Are there datasets to evaluate or test the retrieval quality of my embedding model?
Jannis Ohms
Technische Universität Braunschweig
Universitätsbibliothek | University Library
Abt.: IT und Forschungsnahe Services | Dep.: IT and Research Support Services
Universitätsplatz 1, R212
38106 Braunschweig
Phone: +49 531 391 5027
[log in to unmask]