This is the second post in the series where we explore the OpenAI Assistants API. In this post, we will look at the file search capability, which allows us to upload files to the Assistants API and chat with them. See the following posts for the entire series:
Working with OpenAI Assistants: Create a simple assistant
Working with OpenAI Assistants: Using file search (this post)
Working with OpenAI Assistants: Chat with Excel files using code interpreter
Working with OpenAI Assistants: Using code interpreter to generate charts
The file search API uses the Retrieval Augmented Generation (RAG) pattern, which has become popular recently. The added advantage of using the Assistants API for this is that the API manages document chunking, vectorizing and indexing for us. Without the Assistants API, we would have to use a separate service like Azure AI Search and manage the document indexing ourselves.
To upload and chat with documents using the Assistants API, we need to put together the following moving pieces:
- First, we need to create a Vector Store in the Assistants API.
- Then, we need to upload files using the OpenAI Files API and add them to the vector store.
- Finally, we need to connect the vector store to either an assistant or a thread, which enables the assistant to answer questions based on the documents. A code sketch putting these pieces together follows this list.
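Here is a minimal sketch of these three steps using the OpenAI Python SDK and its `client.beta.*` Assistants namespace. The file name, assistant name, model and question below are placeholders for illustration; adjust them to your own setup.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Create a vector store to hold the uploaded documents.
vector_store = client.beta.vector_stores.create(name="My Documents")

# 2. Upload a file and add it to the vector store in one step;
#    this helper polls until chunking and indexing are complete.
with open("benefits-handbook.pdf", "rb") as f:  # placeholder file
    client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id,
        files=[f],
    )

# 3. Connect the vector store to an assistant via the file_search tool.
assistant = client.beta.assistants.create(
    name="Document Assistant",
    model="gpt-4o",  # placeholder model
    instructions="Answer questions using the uploaded documents.",
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

# Ask a question on a thread and run it to completion.
thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "What does the handbook say about paid leave?"}]
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

# Print the assistant's reply for this run (newest message first).
messages = client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id)
print(messages.data[0].content[0].text.value)
```

Alternatively, instead of attaching the vector store to the assistant, it can be attached to an individual thread by passing the same `tool_resources` structure when creating the thread, which scopes the documents to that conversation only.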
Limitations
- Each vector store can hold up to 10,000 files.
- The maximum size of an uploaded file is 512 MB, and each file can contain at most 5,000,000 tokens (computed automatically when you attach a file).
- No support yet for deterministic pre-search filtering using custom metadata.
- No support yet for parsing images within documents (including images of charts, graphs, tables etc.).
- No support yet for retrievals over structured file formats (like csv or jsonl).
- Limited support for summarization; the tool today is optimized for search queries.