How to build Copilot using GPT4
January 7, 2025
Introduction:
LLMs have taken the world by storm and redefined how we interact with machines. ChatGPT became the most used general-purpose copilot. The biggest complaint about ChatGPT was the knowledge base. It is trained on public datasets until Sep 2021. For many serious users, the quality of the knowledge sources is extremely important. And second, it can never have access to certain verticalized knowledge which is inside the organizations. This is where verticalization presents a significant opportunity to bridge these gaps. Vertical-specific or Topic-specific Copilots are the next big thing in knowledge-based assistants. The recent update from OpenAI, offering plugins on ChatGPT, allows such specialized copilots to be deployable within ChatGPT. But like Microsoft’s strategy to build Copilots (product-specific, topic-specific), every other organization will aim to own and mark their Copilot as the industry's best to assist users.
This article offers a starting approach to building such professional copilots, where we show how to build a copilot using GPT-4. The bot uses OpenAI's GPT-4 to answer developer and natural language questions.
We divide this process into three stages:
- Data Preparation
- Preparing Embeddings
- Designing the Copilot module
Stage 1: Data Preparation:
To begin with, let's prepare the data. You may have the knowledge base stored in different locations. So, let’s collate them and store them in one place.
Langchain’s ‘Document Loaders’ or Llama-Index (earlier GPT-index) can be used to load the data. They provide multiple off-the-shelf options for loading data.
In our case, we have our knowledge base in websites, documentation, whitepaper and videos.
From Websites:
From the sitemap, we created a list of URLs that need to be scraped for the knowledge base. This list was then stored as ‘urls’.
This stores the parsed data along with metadata like source URL.
Langchain’s URL loader was used to scrape the data from these URLs. You can add any URL to the list and prepare the knowledge base.
From PDFs:
Our whitepapers and research papers were stored as PDFs, and Langchain supports PyPDF. We stored all our whitepapers in a directory and loaded them in Langchain ‘pdf’ loader.
From text/transcription files:
The transcriptions from our webinars, tutorials etc. were stored in a directory, which was then loaded using Langchain ‘Directory Loader’.
Merging the lists.:
Next, we merged all this knowledgebase into one list.
Split the data:
To ensure the context is precise and minimize the token length, the data must be split before storing the embeddings.. Langchain has multiple text splitter options - more info here: Text Splitters.
Change the chunk sizes if you face a 'MaxToken' error.
Embedding the data:
You can use OpenAI embeddings to convert your textual data into high-dimensional vectors that can later be used to create conversations, summaries, searches, etc.
How to store the embeddings:
There are many databases which can be used to store the embeddings. Langchain has support for both open-source and paid options. Read more about various options here.
ChromDB and FAISS are the most used options when building a demo application. AtlasDB, Pinecode and Qdrant are paid options that are scalable.
Note that this is running the embeddings whenever you are running the app. Instead, you can run the embeddings, store them in a DB, JSON or Pickle file and load them while loading the application. This is where DBs can help.
We stored the embeddings in pinecone and updated them frequently to add additional knowledge.
This automatically creates an index in pinecone along with embeddings.
Creating Copilot:
Once the embeddings are created, you are ready to create the Copilot for your apps/ data.
The requirements of your ‘Copilot’ can be very specific to your use case. It can simply be answering a question or engaging in a conversation using your knowledge source. These are primarily differentiated by two key components - ‘Prompting’ and ‘Memory’. They need to be specifically optimized for various use cases.
Langchain has multiple options for building Copilots:
- load_qa_with_sources_chain
- VectorDBQA.from_chain_type
- VectorDBQAWithSourcesChain.from_chain_type
- ConversationChain
And OpenAI has two API types that can be used for building Copilot.
- openai.Completion.create
- openai.ChatCompletion.create
Langchain provides ad hoc functions to use these directly.
Copilot: To provide answers to questions as a QnA with sources
If the use case is a simple QnA module, you need to get the snippets with high similarity with the question, combine these snippets and summarize them along with the prompt.
For simple QnA, you can use any of the above options for your QA module.
Now let's define a prompt for this:
It is now ready to take queries:
This is run the query on ‘qa’ and gives the result.
Let's see the result:
AryaXAI can help you in banking by providing explainable AI solutions that can translate complex deep-learning models into understandable insights. This can assist in areas such as credit risk assessment, fraud detection, and customer segmentation by offering transparent and interpretable results, which can improve decision-making and regulatory compliance in the banking industry.
You can also find the source documents from where it is fetching the answer.
The same can be done using ‘load_qa_with_sources_chain’ from Langchain. This example is provided in the colab.
Copilot: A Conversational Engine using Langchain
A memory function needs to be added for converting this QnA engine into a conversation engine. You can use the ‘Memory function’ from Langchain directly, which does provide various options like ConversationalBufferMemory, ConversationalSummaryMemory etc. In our case, we wanted to customize memory per our requirements or even use a different LLM to build the memory.
Things to remember when building ‘Memory’:
- It will use tokens to summarize and also for prompts. The bigger the memory, you may reach the token limit, which will also be expensive to answer.
- It can also be built to remember only the last ‘n’ conversations if you are confident that the token is within the limit.
To build memory, we simply captured and summarized the conversation history before answering a new question.
‘Summarization chain’ uses the ‘llm’ defined and summarises the previous conversation.
Next, we built conversational interaction with our copilot.
Copilot: Designing the conversational copilot with customizable doc search and few-shot learning
Customizing Document Search
Another component that can be customized is the ‘document search’. Instead of using the default document search in Langchain, your own document search can be used and passed on to the conversational engine for responding to the query.
In this example, we defined the document search separately and passed these results as ‘Contexts’ to the conversational engine.
Why do you need a custom document search?
If needed, the search algorithm can be modified, and you can define cut-off cosine distance or/and ‘n’ of search results etc., to create custom ‘contexts’ which will be passed on the conversational engine along with the prompt. You can also use document ranking, which along with the cosine score can be used to pick the more preferred 'Context'. For example - if shortlisted, you may want to use certain information sources more in the answers than others.
Advance Prompting:
Recent updates from OpenAI allow users to pass on more information within the prompts using ‘Roles’. Few-shot learning can be done by passing sample prompts and responses within the prompt and jump-starting the learning.
In ‘messages’, your own prompts can be added for ‘System’, ‘User’ and ‘assistant’. For few-shot learning, add the examples within the prompt and under these roles in multiple lines. This will act like in-context feedback to the engine before answering the question.
Final Notes:
Copilots are helpful AI Assistants that provide answers or direct to the right sources for answers. Copilots built on vertical-specific or topic-specific knowledge can differentiate in the quality of the answer compared to general purpose Copilots. We have created detailed documentation on how to build Copilots, and various ways to build them, primarily using Langchain and OpenAI. You can use the same components with other LLMs too. Within OpenAI, the next addition to the Copilot is using ‘agents’. We will cover this in our future articles.
Access the Colab notebook here: https://colab.research.google.com/drive/1URBRSkQOWwB7Y9oQpg6HSA_yWiCRmx52?usp=sharing
You can access the demo of this Copilot here:
References:
- Langchain: https://python.langchain.com/en/latest/index.html
- OpenAI : https://platform.openai.com/docs/introduction/overview
- Pinecone Examples: https://github.com/pinecone-io/examples/blob/master/generation/generative-qa/openai/gen-qa-openai/gen-qa-openai.ipynb
- OpenAI Cookbook : https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md
SHARE THIS
Discover More Tutorials
Delve into a variety of expert-led tutorials designed to deepen your understanding of AI, MLOps, reinforcement learning, and more. Gain practical insights, step-by-step guidance, and actionable skills to stay ahead in the rapidly evolving tech landscape.
Is Explainability critical for your 'AI' solutions?
Schedule a demo with our team to understand how AryaXAI can make your mission-critical 'AI' acceptable and aligned with all your stakeholders.