Monday, 14 October 2024

Working with OpenAI Assistants: Using file search

This is the second post in the series where we explore the OpenAI Assistants API. In this post, we will be looking at the file search capabilities which allows us to upload files to the Assistants API and chat with them. See the following posts for the entire series:

Working with OpenAI Assistants: Create a simple assistant

Working with OpenAI Assistants: Using file search (this post)

Working with OpenAI Assistants: Chat with Excel files using code interpreter

Working with OpenAI Assistants: Using code interpreter to generate charts

The file search API uses the Retrieval Augmented Generation (RAG) pattern which has been made popular recently. The added advantage of using the Assistants API for this is that the API manages document chunking, vectorizing and indexing for us. Whereas without the Assistants API we would have to use a separate service like Azure AI Search and manage the document indexing ourselves. 

To upload and chat with documents using the Assistants API, we have to use the following moving pieces: 

  • First, we need to create a Vector Store in the Assistants API.
  • Then, we need to upload files using the Open AI File client and add them to the vector store.
  • Finally, we need to connect the vector store to either an assistant or a thread which would enable to assistant to answer questions based on the document.

For the demo code, we will be using the Azure OpenAI service for working with the OpenAI gpt-4o model and since we will be using .NET code, we will need the Azure OpenAI .NET SDK as well as Azure.AI.OpenAI.Assistants nuget packages.

Limitations


As per OpenAI docs, there are some limitations for the file search tool:

  • Each vector store can hold up to 10,000 files.
  • The maximum file size of a file which can be uploaded is 512 MB. Each file should contain no more than 5,000,000 tokens per file (computed automatically when you attach a file).

When querying for the documents in the vector store, we have to be aware of the following things which are not possible right now. However, the OpenAI team are working on this and some of these features will be available soon:

  • Support for deterministic pre-search filtering using custom metadata.
  • Support for parsing images within documents (including images of charts, graphs, tables etc.)
  • Support for retrievals over structured file formats (like csv or jsonl).
  • Better support for summarization — the tool today is optimized for search queries.

Current supported files types can be found in the OpenAI docs

Hope this helps!

Monday, 7 October 2024

Working with OpenAI Assistants: Create a simple assistant

With OpenAI's recently released Assistants API, building AI bots becomes a lot easier. Using the API, an assistant can leverage custom instructions, files and tools (previously called functions) and answer user questions based on them.

Before the Assistants API, building such assistants was possible but for a lot of things, we had to use our own services e.g. vector storage for file search, database for maintaining chat history etc.

The Assistants API gives us a handy wrapper on top of all these disparate services and a single endpoint to work with. So in this series of posts, let's have a look at what the Assistants API can do.

Working with OpenAI Assistants: Create a simple assistant (this post)

Working with OpenAI Assistants: Using file search

Working with OpenAI Assistants: Chat with Excel files using code interpreter

Working with OpenAI Assistants: Using code interpreter to generate charts

The first thing we are going to do is build a simple assistant which has a "SharePoint Tutor" personality. It will be used to answer questions for users who are learning to use SharePoint. Before deep diving into the code, lets understand the different moving pieces of the Assistants API: 

An assistant is a container in which all operations between the AI and the user are managed.

A thread is a list of messages which were exchanged between the user and AI. The thread is also responsible for maintaining the conversation history.

A run is a single invocation of an assistant based on the history in the thread as well as the tools available to the assistant. After a run is executed, new messages are generated and added to the thread.

For the demo code, we will be using the Azure OpenAI service for working with the OpenAI gpt-4o model and since we will be using .NET code, we will need the Azure OpenAI .NET SDK as well as Azure.AI.OpenAI.Assistants nuget packages.

This was a simple assistant creation just to get us familiar with the Assitants API. In the next posts, we will dive deeper into the API and explore the more advanced concepts. Stay tuned!