Talpa is a fascinating project--thanks for working on it! I've been trying
to spend more time with various LLMs lately in an attempt to be able to
speak less foolishly about them (eh), but my eight-year-old child
unwittingly provided the most interesting use case I've attempted so far:
"In Which Big Nate book does Paige call Teddy 'Teddy Bear?'" I appreciated
the opportunity to play public librarian, but the various chatbots I asked
all confidently provided different wrong answers. As my child's frustration
with me neared a boiling point, I was able to answer the question with
accurate clues from the answers. One was able to identify the original
comic strip print date but mixed up the publication dates and print
coverage dates of the published collections. Palda seems to be making a
similar mistake.
Anyway, I'm trying to learn from the more expansive technological hopes of
younger generations. Spoiler alert: the answer is "Big Nate: Say Good-Bye
To Dork City."
Jason
—
Jason Casden | he/him/his
Head, Software Development
The University of North Carolina at Chapel Hill University Libraries
On Mon, Feb 26, 2024 at 5:04 PM Tim Spalding <[log in to unmask]> wrote:
> I and other LibraryThing developers have done a lot of this work in the
> process of making Talpa.ai, so here's my quick take:
>
> 1. Fine-tuning is a poor way to add knowledge to an LLM, especially at
> scale. It's mostly useful for controlling how LLM "thinking" is
> presented—for example ensuring clean, standardized output. It can also be
> helpful at reducing how many input tokens you need to use, and speed up the
> results. This is our experience; yours may be different. But it's at least
> a common view. (See
>
> https://www.reddit.com/r/LocalLLaMA/comments/16q13lm/this_research_may_explain_why_finetuning_doesnt/
> .)
>
> 2. RAG is more liable to get your results. It's good at validation and when
> the model has no clue about some facts. So, for example, if you want to use
> proprietary content to answer a query, you can use a vectorized search to
> find content, then feed them to an LLM (which is all RAG is) and see what
> happens. You can fine-tune the model you use for RAG to ensure the output
> is clean and standard. RAG can be cheap, but it tends to involve making
> very long prompts, so if you're using a commercial service, you'll want to
> think about the cost of input tokens. Although cheaper than output tokens,
> they add up fast!
>
> Anyway, RAG is probably what you want, but the way people throw around RAG
> now you'd think it was some fantastic new idea that transcends the
> limitations of LLMs. It's really not. RAG is just giving LLMs some of
> what you want them to think about, and hoping they think through it well.
> You still need to feed it the right data, and just because you give it
> something to think about doesn't mean it will think through it well. If
> LLMs are "unlimited, free stupid people" they are in effect "unlimited,
> free stupid people in possession of the text I found."
>
> You can find a deeper critique of RAG by Gary Marcus here:
> https://garymarcus.substack.com/p/no-rag-is-probably-not-going-to-rescue
>
> I'm eager to hear how things go!
>
> I would, of course, be grateful for any feedback on Talpa (
> https://www.talpa.ai), which is in active development with a new version
> due any day now. It also uses a third technique, which probably has a name.
> That technique is using LLMs not for their knowledge or for RAG, but to
> parse user queries in such a way that they can be answered by library data
> systems, not LLMs. LLMs can parse language incorrectly, but language is
> their greatest strength and, unlike facts and interpretations, seldom
> involves hallucinations. Then we use real, authoritative library and book
> data, which has no hallucination problem.
>
> Best,
> Tim
>
> On Mon, Feb 26, 2024 at 4:07 PM Eric Lease Morgan <
> [log in to unmask]> wrote:
>
> > Who out here in Code4Lib Land is practicing with either one or both of
> the
> > following things: 1) fine-tuning large-language models, or 2)
> > retrieval-augmented generation (RAG). If there is somebody out there,
> then
> > I'd love to chat.
> >
> > When it comes to generative AI -- things like ChatGPT -- one of the first
> > things us librarians say is, "I don't know how I can trust those results
> > because I don't know from whence the content originated." Thus, if we
> were
> > create our own model, then we can trust the results. Right? Well, almost.
> > The things of ChatGPT are "large language models" and the creation of
> such
> > things are very expensive. They require more content than we have, more
> > computing horsepower than we are willing to buy, and more computing
> > expertise than we are willing to hire. On the other hand there is a
> process
> > called "fine-tuning", where one's own content is used to supplement an
> > existing large-language model, and in the end the model knows about one's
> > own content. I plan to experiment with this process; I plan to fine-tune
> an
> > existing large-language model and experiment with it use.
> >
> > Another approach to generative AI is called RAG -- retrieval-augmented
> > generation. In this scenerio, one's content is first indexed using any
> > number of different techniques. Next, given a query, the index is
> searched
> > for matching documents. Third, the matching documents are given as input
> to
> > the large-language model, and the model uses the documents to structure
> the
> > result -- a simple sentence, a paragraph, a few paragraphs, an outline,
> or
> > some sort of structured data (CSV, JSON, etc.). In any case, only the
> > content given to the model is used for analysis, and the model's primary
> > purpose is to structure the result. Compared to fine-tuning, RAG is
> > computationally dirt cheap. Like fine-tuning, I plan to experiment with
> RAG.
> >
> > To the best of my recollection, I have not seen very much discussion on
> > this list about the technological aspects of fine-tuning nor RAG. If you
> > are working these technologies, then I'd love to hear from you. Let's
> share
> > war stories.
> >
> > --
> > Eric Morgan <[log in to unmask]>
> > Navari Family Center for Digital Scholarship
> > University of Notre Dame
> >
>
>
> --
> Check out my library at https://www.librarything.com/profile/timspalding
>
|