
A note about RAG. A basic RAG pipeline can look like this:

   - Take a query;
   - Embed the query and run a vector search over your data;
   - Take the (likely proprietary) results;
   - Ask an LLM to answer the original query using those results.
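
The four steps above can be sketched in a few lines of Python. This is
an illustration only, not code from Talpa or any particular library:
embed() is a toy bag-of-words stand-in for a real embedding model, DOCS
is made-up sample data, and llm is any callable you supply that takes a
prompt string and returns text.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Step 2: embed the query and return the k most similar documents."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Step 3: put the retrieved passages in front of the original query."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def rag_answer(query, documents, llm, k=2):
    """Steps 1-4: query in, retrieved context plus query to the LLM, answer out."""
    return llm(build_prompt(query, retrieve(query, documents, k)))

# Hypothetical sample data, for illustration only.
DOCS = [
    "The library opens at 9 am on weekdays.",
    "Interlibrary loan requests take about five days.",
    "The reading room closes at 8 pm.",
]
```

Calling rag_answer("When does the library open?", DOCS, llm=my_model)
would pass the best-matching passages plus the question to my_model,
where my_model is whatever function wraps the LLM you use.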

However, a fully formed RAG system can include many other steps before,
between, or after these. For example, you can use a technique like HyDE
[1] after the query comes in to make the vector search more accurate. Or
you might rerank your results after the vector search has been run. Or you
might improve retrieval by changing how you chunk your data and load it
into your vector database [2].
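
As a rough sketch of where those three extra steps plug in (assumptions
throughout: generate, search, and score are hypothetical callables you
would supply, wrapping your LLM, your vector database, and a reranking
model respectively; the window sizes are arbitrary):

```python
def chunk_words(text, size=50, overlap=10):
    """Chunking: split text into overlapping word windows before indexing."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop once the remaining words are already covered by the last window.
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]

def hyde_search(query, generate, search, k=5):
    """HyDE: search with a hypothetical answer instead of the raw query.

    generate(prompt) -> str and search(text, k) -> list are assumed callables.
    """
    hypothetical = generate(f"Write a short passage that answers: {query}")
    return search(hypothetical, k)

def rerank(query, candidates, score, top_n=3):
    """Reranking: re-order vector-search hits with a second scorer.

    score(query, candidate) -> float is an assumed callable.
    """
    ranked = sorted(candidates, key=lambda c: score(query, c), reverse=True)
    return ranked[:top_n]
```

Chunking happens at load time, HyDE sits between the query and the
vector search, and reranking sits between the vector search and the
LLM call.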

RAG is not just prompt engineering and counting the tokens you have left;
there's a wide space of considerations when you're building out the
user-facing application. What Tim shows with Talpa is the type of project
where you put these pieces together to productionize a RAG-like system.

[1]
https://wfhbrian.com/revolutionizing-search-how-hypothetical-document-embeddings-hyde-can-save-time-and-increase-productivity/
[2] https://vectara.com/blog/grounded-generation-done-right-chunking/

Disclaimer: I currently work at pointable, a startup working on customized
RAG and RAG-like systems.

On Mon, Feb 26, 2024 at 9:39 PM Patricia Farnan <[log in to unmask]>
wrote:

> Well said, Alex – couldn’t agree more.
>
> If content for these tools has been paid for and not stolen (or there’s an
> agreement for open access material to be relied on), then I have no problem
> with them.
>
> Patricia Farnan (she/her) | Application Administrator, Discovery Services
> University Library  | St Teresa’s Library (ND17)
>
> +61 8 9433 0707 | [log in to unmask]<mailto:
> [log in to unmask]>
>
> I acknowledge the Whadjuk Noongar people, the traditional owners and
> custodians of the land on which I live and work (Walyalup), and pay my
> respects to their elders past and present.
>
> From: Code for Libraries <[log in to unmask]> On Behalf Of Alex Dunn
> Sent: Tuesday, February 27, 2024 6:03 AM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] genrative ai; fine-tuning and rag
>
> I think it's important to ask first what your aims are with an LLM.
> Personally I have never seen a valid use-case for ChatGPT or any of its
> varieties in libraries. These models are, ultimately, little more than
> glorified text compressors[1] that perform pattern-matching and which do
> not, and cannot, produce accurate or reliable information.
>
> (And this is before we get into the ethical issues of the large commercial
> models' theft of the work of writers, programmers, and artists at large to
> feed a tool that businesses and organizations are looking to use to replace
> them.)
>
> [1] https://aclanthology.org/2023.findings-acl.426/
>
> On Mon, Feb 26, 2024 at 1:49 PM Peter Murray <
> [log in to unmask]<mailto:
> [log in to unmask]>> wrote:
>
> > I took note of something recently from the Library Innovation Lab at
> > Harvard Law School: WARC-GPT: An Open-Source Tool for Exploring Web
> > Archives Using AI. It takes the contents of WARC files and feeds them
> into
> > a Retrieval Augmented Generation tool. Been meaning to play with it as a
> > way to enhance FOLIO's documentation search, but haven't gotten around to
> it.
> >
> > Not a war story, perhaps, but a WARC story. ;-)
> >
> > Peter
> > On Feb 26, 2024 at 4:07 PM -0500, Eric Lease Morgan <
> > [log in to unmask]<mailto:
> [log in to unmask]>>, wrote:
> > > Who out here in Code4Lib Land is practicing with either one or both of
> > the following things: 1) fine-tuning large-language models, or 2)
> > retrieval-augmented generation (RAG). If there is somebody out there,
> then
> > I'd love to chat.
> > >
> > > When it comes to generative AI -- things like ChatGPT -- one of the
> > first things us librarians say is, "I don't know how I can trust those
> > results because I don't know from whence the content originated." Thus,
> if
> > we were to create our own model, then we can trust the results. Right? Well,
> > almost. The things of ChatGPT are "large language models" and the
> creation
> > of such things are very expensive. They require more content than we
> have,
> > more computing horsepower than we are willing to buy, and more computing
> > expertise than we are willing to hire. On the other hand there is a
> process
> > called "fine-tuning", where one's own content is used to supplement an
> > existing large-language model, and in the end the model knows about one's
> > own content. I plan to experiment with this process; I plan to fine-tune
> an
> > existing large-language model and experiment with its use.
> > >
> > > Another approach to generative AI is called RAG -- retrieval-augmented
> > generation. In this scenario, one's content is first indexed using any
> > number of different techniques. Next, given a query, the index is
> searched
> > for matching documents. Third, the matching documents are given as input
> to
> > the large-language model, and the model uses the documents to structure
> the
> > result -- a simple sentence, a paragraph, a few paragraphs, an outline,
> or
> > some sort of structured data (CSV, JSON, etc.). In any case, only the
> > content given to the model is used for analysis, and the model's primary
> > purpose is to structure the result. Compared to fine-tuning, RAG is
> > computationally dirt cheap. Like fine-tuning, I plan to experiment with
> RAG.
> > >
> > > To the best of my recollection, I have not seen very much discussion on
> > this list about the technological aspects of fine-tuning nor RAG. If you
> > are working with these technologies, then I'd love to hear from you. Let's
> share
> > war stories.
> > >
> > > --
> > > Eric Morgan <[log in to unmask]<mailto:[log in to unmask]>>
> > > Navari Family Center for Digital Scholarship
> > > University of Notre Dame
> >
>


-- 
Brian Wu
Email: [log in to unmask]