On Apr 27, 2024, at 4:35 AM, Géraldine Anne Geoffroy wrote:
> Maybe you can take a look at OpenAlex <https://openalex.org/>, which has a very broad, open, multidisciplinary knowledge base of bibliographic metadata for scholarly outputs and a robust, well-documented API. The entity-relationship model <https://help.openalex.org/how-it-works> behind the metadata catalog contains concept-type entities aligned with Wikidata concepts, which can help you build your corpus of metadata.
>
> Depending on what you want to train an LLM for, and if you need the full text, you can then use the DOI to scrape the full text online.
>
> --
> Géraldine Geoffroy
> Bibliothèque de l'EPFL
Géraldine, thank you. OpenAlex++, as well as a ++ for the whole OurResearch family of services. See:
https://ourresearch.org/
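
For anyone following along, here is a rough sketch of the workflow Géraldine describes: list OpenAlex works tagged with a concept, then collect their DOIs for later full-text retrieval. It is only an illustration under stated assumptions. The helper names and the concept ID are made up, and the query parameters (filter=concepts.id:..., per-page, cursor, mailto) reflect my reading of the OpenAlex API documentation, so please check them against the current docs before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Works endpoint of the OpenAlex API.
API_ROOT = "https://api.openalex.org/works"


def build_works_url(concept_id, per_page=25, cursor="*", mailto=None):
    """Build a works query URL filtered by one concept (hypothetical helper).

    Assumption: the API accepts filter=concepts.id:<ID>, per-page,
    cursor, and an optional mailto parameter.
    """
    params = {
        "filter": f"concepts.id:{concept_id}",
        "per-page": per_page,
        "cursor": cursor,
    }
    if mailto:
        params["mailto"] = mailto
    return f"{API_ROOT}?{urlencode(params)}"


def extract_dois(page):
    """Return the non-empty DOIs from one decoded page of results.

    Assumption: the response is a JSON object with a "results" list
    whose items may carry a "doi" field.
    """
    return [work["doi"] for work in page.get("results", []) if work.get("doi")]


def fetch_page(url):
    """Fetch one page of results and decode the JSON body (network call)."""
    with urlopen(url) as response:
        return json.load(response)


# "C123456789" is a placeholder, not a real concept ID.
url = build_works_url("C123456789", mailto="you@example.org")
```

Calling extract_dois(fetch_page(url)) would give the DOIs for one page of results. Whether a given DOI leads to freely retrievable full text depends on the publisher and the license, so that step is deliberately left out here.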
--
Eric Morgan
University of Notre Dame