Hi,
You could try training an AI classifier. There is a dataset (called GIANT)
with a billion reference strings, each labeled with a type identifier (book,
article, dissertation, etc.). You could also try using Flair NLP for this task.
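
To make the idea concrete, here is a minimal sketch of that kind of classifier.
It is not Flair and not the real dataset: it is a tiny naive Bayes model in
plain Python, and the six training strings, the labels, and the function names
(tokenize, train, predict) are all made up for illustration. A real run would
swap in labeled reference strings and a proper library.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training data (invented for this sketch):
# pairs of (reference string, publication type).
TRAIN = [
    ("Smith, J. (2020). Deep learning for libraries. Journal of Library Science, 12(3), 45-67.", "article"),
    ("Lee, K. (2018). Citation patterns in physics. Physics Review Letters, 5(1), 1-9.", "article"),
    ("Brown, A. (2015). Metadata basics. New York: Academic Press.", "book"),
    ("Garcia, M. (2011). Cataloging handbook. Chicago: University Press.", "book"),
    ("Chen, L. (2019). Topic models for archives (Doctoral dissertation). State University.", "dissertation"),
    ("Patel, R. (2017). Open access adoption (Master's thesis). City University.", "dissertation"),
]


def tokenize(text):
    """Lowercase the string, split into words and numbers, mask every number."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    return ["<num>" if tok.isdigit() else tok for tok in tokens]


def train(examples):
    """Count labels and per-label tokens; return them with the vocabulary."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, token_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    label_counts, token_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are skipped
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "Nguyen, T. (2021). Library usage (Doctoral dissertation). Coastal University."))
```

Masking numbers lets the model pick up on the volume/issue/page pattern that
articles tend to have, without memorizing specific years or page ranges.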
On Sun, Sep 29, 2024, 5:57 PM, Joe Hourclé <[log in to unmask]>
wrote:
> > On Sep 29, 2024, at 6:26 PM, Park, Sarah <[log in to unmask]> wrote:
> >
> > Hi,
> >
> > I am looking for a tool or method that can help us identify publication
> types from citations/references using scripts or AI-based tools. My
> colleague and I are interested in citation analysis to determine the types
> of sources used in a discipline, for example, journal articles, review
> articles, magazine articles, book chapters, books, websites, government
> documents (Gov Docs), and NGO documents.
> >
> > One possible method I got so far was using article database APIs, like
> Scopus, to identify document types, but Scopus seems to track some types
> but not all. I also heard that a model can be trained using ChatGPT or
> other generative AI, but I haven't heard how effective it can be.
> >
> > Any thoughts or suggestions that could lead to a possible solution would
> be greatly appreciated!
>
> I know that many style guides have slightly different serializations (how
> to build the string used in the citation section) depending on what type of
> item you're citing.
>
> I've never looked too closely to see how many collisions there are (when two
> types of items result in strings that would look the same), but that might
> be a way to do a quick pass on things.
>
> ... and I would also look to groups like CrossRef to see if you can get
> some of this information from them. I know that there are now more
> commercial groups trying to sell people this sort of information,
> but CrossRef always seemed like one of the better organizations when I went to
> DataCite meetings.
>
> (I have very little experience dealing with exactly what you're trying to
> do... I was mostly involved in trying to build the concept of data
> citation, and helped to maintain the bibliographies of who was using our
> data)
>
> -Joe
> (no current affiliation)
>