How to use the OpenAI Assistants API: a step-by-step guide.
If you know how to code and want to try using the new Assistant API, here's a quick start guide to save you a few hours of figuring out why things don't work.
OpenAI announced their new AI assistant API yesterday, and developers like me swarmed to try it out. The Assistants API itself is not entirely new, in that it does not do anything OpenAI could not do before. Instead, it's a new interface to OpenAI, and it does make it faster to build applications. An assistant integrates natively with the chat API and can return JSON data in a predictable way, which your application can then instantly consume.
If you haven’t seen the full presentation yet, here’s the link.
However, this newsletter update is specifically for developers who would like to try out the new assistant feature, and perhaps are struggling to make it work. I don't consider myself to be a very good developer, and it took me a few hours to figure out exactly what to do to make the assistant feature work. I hope I can help others jump-start the process.
If you are not a developer and you are not into APIs, you might want to stop reading now and come back for my next newsletter with less code. I won’t be offended.
I am also assuming that you are not using the official Node or Python libraries, but instead calling the API via GET/POST requests. I am using Ruby on Rails for this, but I'll keep the discussion language-agnostic.
1. Initiate an assistant
The easiest way to do this is to log in to your OpenAI account and create a new assistant in the web interface at https://platform.openai.com/assistants
The most important part is your instructions. This is where you specify what the assistant is going to be doing. For example, if I were to create a chatbot for Smartynames.com to generate domain names for businesses, I could say this:
“You are a domain salesman. You help your clients find great names for their businesses. These names are short, easy to pronounce, and they fit well with the description of the business that your clients want to create. People should want to visit a business because of their great name. When you return results, be polite but concise. Say something kind, then return some names that end in ".com" extension, as well as some alternative extensions (TLDs), which you think would fit. Do not say anything else, but ask at the end if they would like to adjust their request to see more names.”
Here you can also specify which model the assistant is going to use (gpt-3.5-turbo vs. gpt-4, etc.), and whether the assistant needs access to other tools, like functions or the ability to retrieve data from an uploaded file.
Once you've saved those settings, they become the defaults for this particular assistant, but you can still request additional tools or a different model on a per-request basis later in the process. You might want to use GPT-3.5 Turbo most of the time, for example, but engage GPT-4 for complex language-processing tasks.
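To preview what that per-request override looks like: when you later ask the assistant to run on a thread (step 4 below), the run request body can carry a model alongside the assistant ID. A minimal sketch, with placeholder IDs:

```ruby
require 'json'

# Sketch of a run-creation body with a per-run model override.
# The assistant_id is a placeholder; "model" overrides the assistant's default.
run_body = {
  assistant_id: 'asst_XXXXXXXXXXXX',
  model: 'gpt-4'
}.to_json
```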
This point in the setup is also where you can set up functions.
What OpenAI calls functions, though, is not exactly what you might imagine. They don't evaluate anything or do anything for you. Instead, functions are more like data definitions. You can ask the AI not only to respond to you in words, but also to translate its response into data, which is returned as a separate message. That message (the data) can be consumed asynchronously by your application.
For example, continuing on the example where I search for domain names, I would ask the bot for an English response to the following: “Return a list of domain names that you think are fitting for this business.”
Meanwhile, in the function definition I would ask for something like this:
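To make that concrete, here is a hypothetical sketch of such a definition, written as a Ruby hash. The type/function/parameters wrapper follows OpenAI's function-tool format (the parameters block is a JSON Schema); the function name and fields are made up for this example:

```ruby
# A made-up function definition for the domain-name example. The wrapper
# shape (type/function/parameters) is OpenAI's; the name and fields are mine.
domain_tool = {
  type: 'function',
  function: {
    name: 'suggest_domains',
    description: 'Return suggested domain names for a business',
    parameters: {
      type: 'object',
      properties: {
        domains: {
          type: 'array',
          items: { type: 'string' },
          description: 'Suggested domain names, including the TLD'
        }
      },
      required: ['domains']
    }
  }
}
```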
Let's deal with functions in depth another time, though; we don't need them today.
Once you've created an assistant, grab the assistant ID from the interface and save it in your code.
It looks like: “asst_jdsljcveo8j3ow2u0ceiuajmcds”
2. Initiate a thread
In your application, create a POST request to the following endpoint:
https://api.openai.com/v1/threads
Make sure to include your Bearer token, and a new required header that OpenAI needs to identify you as a user of a beta feature. You'll want to keep these headers on every request, by the way. In Ruby, using 'net/http', it looks like this. You can always use ChatGPT to translate it into any other programming language.
request.content_type = "application/json"
request["Authorization"] = "Bearer #{my_token}"
request["OpenAI-Beta"] = "assistants=v1"
The request will return a bunch of JSON, but all you need is the ID of the newly created thread.
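Putting the pieces together, a minimal sketch of the thread-creation call in Ruby might look like this (the helper names are mine; adapt the token handling to your own setup):

```ruby
require 'net/http'
require 'json'
require 'uri'

# Builds an authenticated POST request with the beta header described above.
def openai_post(url, body, token)
  request = Net::HTTP::Post.new(URI(url))
  request.content_type = 'application/json'
  request['Authorization'] = "Bearer #{token}"
  request['OpenAI-Beta'] = 'assistants=v1'
  request.body = body.to_json
  request
end

# Creates a thread and returns its ID (it looks like "thread_...").
def create_thread(token)
  uri = URI('https://api.openai.com/v1/threads')
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    http.request(openai_post(uri.to_s, {}, token))
  end
  JSON.parse(response.body)['id']
end
```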
3. Post a request to your thread
In your application, create a POST request to the following endpoint:
https://api.openai.com/v1/threads/THREAD_ID/messages
The only thing you need to include is the message.
request.body = { role: 'user', content: message }.to_json
The message is whatever it is that you want the assistant to process. Following along with the Smartynames example, my message would say: “I am looking to start a series of dog shelters in rural Colorado. Can you help me generate some great domains for it?”
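Here is that request as a self-contained sketch (the helper name is mine, and the thread ID is a placeholder for whatever step 2 returned):

```ruby
require 'net/http'
require 'json'
require 'uri'

# Builds the message-creation request for an existing thread.
# The thread ID comes from step 2; here it is a placeholder.
def build_message_request(thread_id, message, token)
  uri = URI("https://api.openai.com/v1/threads/#{thread_id}/messages")
  request = Net::HTTP::Post.new(uri)
  request.content_type = 'application/json'
  request['Authorization'] = "Bearer #{token}"
  request['OpenAI-Beta'] = 'assistants=v1'
  request.body = { role: 'user', content: message }.to_json
  request
end
```

Send the returned request with Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }.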
At this point, even after you send the request, nothing happens. To get the AI to respond, and to have your assistant engage with the request, you need to go a step further.
4. Request the assistant to engage on the thread
In your application, create a POST request to the following endpoint:
https://api.openai.com/v1/threads/THREAD_ID/runs
Your request only needs to contain one thing: the assistant ID.
request.body = { assistant_id: 'YOUR_ASSISTANT_ID' }.to_json
At last, your thread is now aware of your assistant and will process the message above.
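As a sketch, again with placeholder IDs (the helper name is mine):

```ruby
require 'net/http'
require 'json'
require 'uri'

# Builds the run-creation request that asks the assistant to process
# the thread. Both IDs are placeholders.
def build_run_request(thread_id, assistant_id, token)
  uri = URI("https://api.openai.com/v1/threads/#{thread_id}/runs")
  request = Net::HTTP::Post.new(uri)
  request.content_type = 'application/json'
  request['Authorization'] = "Bearer #{token}"
  request['OpenAI-Beta'] = 'assistants=v1'
  request.body = { assistant_id: assistant_id }.to_json
  request
end
```

One thing to be aware of: the run is processed asynchronously. The JSON that comes back includes a status field, and you can poll GET https://api.openai.com/v1/threads/THREAD_ID/runs/RUN_ID until the status is "completed" before fetching the messages.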
From what I have tested so far, every time you send a message to the thread, you have to explicitly ask the assistant to engage in order to get a response. Otherwise, nothing happens. Perhaps there is a way to bind an assistant to a thread so that it engages on every request, but I haven't found one.
5. Get the messages from the thread
The final, and the most rewarding, step. This is where you get all the data and show it to your users. In your application, create a GET request to the following endpoint:
https://api.openai.com/v1/threads/THREAD_ID/messages
Depending on the language you're using to create your chat application, you can watch for new messages in the thread, or parse them and display only the new ones.
Look for the ["data"] array in the response; all the messages will be there.
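For example, once you've parsed the JSON, pulling out just the readable text might look like this. In the beta, each message's content is an array of blocks, with the text under content → text → value; treat this as a sketch of that response shape, not a definitive implementation:

```ruby
require 'json'

# Extracts [role, text] pairs from a parsed /messages response.
# Each message's content is an array of blocks; text blocks carry the
# actual words under text -> value.
def extract_texts(response)
  response['data'].map do |message|
    text = message['content']
           .select { |block| block['type'] == 'text' }
           .map { |block| block['text']['value'] }
           .join("\n")
    [message['role'], text]
  end
end
```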
If you use the OpenAI Playground to test your assistant, look at the logs and you will notice that there are quite a few requests going back and forth. Especially when you start using functions and other tools, there are going to be more messages on the thread, and you don't necessarily want to display all of them. A response from a function, for example, will also come in as a message, but that one has to be consumed by your application.
In my case, I will be displaying only the messages from the user and assistant roles.
Conclusion.
Well, that was easy.