Difference between a Gen AI RAG bot built on an LLM and a fine-tuned LLM
Vanilla Usage:
ChatGPT from OpenAI and Bard from Google are freely available to users across the world. Millions use them for their day-to-day Q&A queries.
Create your own bot to extract knowledge from large datasets/documents:
Typical small-scale chatbot use cases, such as document-based knowledge retrieval and Q&A, can be fulfilled by uploading the documents to a Retrieval-Augmented Generation (RAG) bot built with the LangChain framework.
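Before upload, documents are usually split into smaller chunks so that each embedding covers a focused passage. A minimal sketch of fixed-size chunking with overlap (the chunk size and overlap values here are illustrative defaults, not taken from any specific framework):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks, ready for embedding."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks

doc = "RAG bots answer questions over private documents. " * 20
pieces = chunk_text(doc)
print(len(pieces), len(pieces[0]))
```

The overlap keeps a sentence that straddles a chunk boundary recoverable from at least one chunk.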
RAG bots can be used for conversational use cases in any domain: sales, marketing, finance, legal, or IT.
First, the documents/spreadsheets are uploaded to the LLM via an API using a Python framework. The uploaded documents are embedded using suitable foundation models from OpenAI (davinci, gpt-3.5-turbo) or from Google (Gecko, Otter, Bison). The embeddings of the uploaded documents are stored in a vector database.
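In production, the embedding step is an API call to a hosted model; the storage side can be sketched with a toy hash-based embedding and an in-memory list standing in for the vector database (`toy_embed` and `vector_db` are stand-ins of my own, not a real model or database client):

```python
import hashlib
import math

DIM = 64

def toy_embed(text: str) -> list[float]:
    """Stand-in for an embedding API call: hash each word into a fixed-size
    vector, then normalize to unit length."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# "Vector database": a list of (embedding, original text) pairs.
vector_db: list[tuple[list[float], str]] = []

def index_chunks(chunks: list[str]) -> None:
    """Embed each chunk and store it alongside its source text."""
    for chunk in chunks:
        vector_db.append((toy_embed(chunk), chunk))

index_chunks(["Invoices are due in 30 days.", "Refunds take 5 business days."])
print(len(vector_db))
```

A real deployment would swap `toy_embed` for an embedding-model call and the list for a vector store such as FAISS, Pinecone, or Chroma.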
The embeddings of user questions are computed in real time and stored in the vector database.
Cosine similarity between the embedding vectors of the documents and of the question is used to retrieve the matching sentences/paragraphs from the documents in natural-language form.
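Retrieval then reduces to a cosine-similarity search: embed the question, score it against every stored chunk vector, and return the top matches. A minimal self-contained sketch, using a toy bag-of-words embedding in place of a real model (the sample chunks are invented for illustration):

```python
import math

# Sample chunks standing in for an indexed document collection.
CHUNKS = [
    "Invoices are due within 30 days of delivery.",
    "Refunds are processed in 5 business days.",
    "The office is closed on public holidays.",
]

def tokenize(text: str) -> list[str]:
    return [w.strip(".,?").lower() for w in text.split()]

VOCAB = sorted({w for c in CHUNKS for w in tokenize(c)})

def embed(text: str) -> list[float]:
    """Toy bag-of-words embedding; a stand-in for a learned embedding model."""
    words = tokenize(text)
    return [float(words.count(v)) for v in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

DB = [(embed(c), c) for c in CHUNKS]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the question."""
    q = embed(question)
    ranked = sorted(DB, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

print(retrieve("When are invoices due?"))
```

Real vector databases do the same ranking with approximate nearest-neighbor indexes so that the search stays fast over millions of chunks.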
Proper prompt engineering is needed when asking questions of such a RAG bot.
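In a RAG bot, prompt engineering largely means wrapping the retrieved chunks and the user's question in a template that instructs the model to answer only from the supplied context. A minimal sketch (the template wording is illustrative, not a standard):

```python
PROMPT_TEMPLATE = """Answer the question using ONLY the context below.
If the answer is not in the context, say "I don't know."

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(retrieved_chunks: list[str], question: str) -> str:
    """Stuff the retrieved chunks and the user question into the RAG template."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    ["Invoices are due within 30 days of delivery."],
    "When are invoices due?",
)
print(prompt)
```

The "only from the context" instruction is what keeps the model grounded in the retrieved documents instead of answering from its general training data.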
With RAG bots we are not really fine-tuning any foundation model; we are just creating vector databases with embeddings of the documents/spreadsheets that users would like to run Q&A over.