ChatGPT has taken the internet by storm recently, and for good reason: It’s a crazy powerful natural language processing model. OpenAI released the Chat Completion API that it relies on as a public beta, so I wanted to see what I could make it do.
A simple idea, however unoriginal, was to build a basic AI assistant just like ChatGPT. Doing this simple project first will also give me a solid foundation to work off of as I go about expanding the model’s functionality later on.
I wanted something that could easily work on any device, but I didn’t want to deal with writing a frontend. As someone who is chronically online, Discord direct messages were an obvious solution. Using the discord.py package, which I was already very familiar with, I knocked out a basic bot for this in no time at all.
The first step was to log in to the Discord Developer Portal and add a new application. I entered an application name, “Charles A.,” agreed to the terms and conditions, and clicked Create. Then, I could head down to the Bot tab on the left and click Add Bot, making sure to copy the bot token and store it somewhere safe because Discord will not show it again once that tab closes. Then I clicked on the OAuth2 tab and scrolled down to invite the bot to my private test server so that it could send me DMs.
Then, I got to work on my interface code. I knew, at the most basic level, I would need to listen for DM messages, so I started there. Because discord.py doesn’t separate these events by itself, I needed to listen for all messages and check each time if the message is a DM. Since the OpenAI API’s billing is usage-based, I also added a check here to make sure only I could use it to prevent other people from burning through tokens on my API key by using my bot.
I also wanted to have the option to keep things tidy, so I added a ?clear command which makes the bot remove all its previously sent messages. I’ll get into message history a bit later, but I also added a function to handle reaction events (with similar checks as the DM handler) to reset the conversation history when the user reacts with a thumbs up emoji.
The AI Engine
Now that I have an interface, I need to make it do something. You may have noticed the call to the
ai_engine.ask() function within the interface code. That’s the public interface for my AI engine’s code, wrapping the prompt generation and official OpenAI library.
The GPT-3.5 model technically does “chat completion”, meaning it can’t actually have a conversation. It just generates a single response based on the chat history that you explicitly give to it in the form of a list of message objects, which are dictionaries containing “role” — which can be “user”, “assistant”, or “system” — and “content”, which is the message itself.
Now we get to do some prompt engineering. I opted to begin each exchange with two system messages. The first explains who the bot is and the second, provided by the user, explains who the user is.
Then I had to figure out chat history. I opted to provide up to 10 message objects, or 5 back-and-forth interactions, to the model, plus the two system messages and most recent user input. As I noted when building the interface, the user can also manually reset the conversation history by reacting to the bot with a thumbs up, which empties the list containing the past interactions.
I tend to use environment variables for configuration on projects like this, but I wanted to add some basic config validation to make sure that all the required fields are populated at runtime. I also wanted to have the option of using a .env file instead of directly accessing environment variables to make things easier for development, so I added that here too using the python-dotenv package.
Putting Everything Together
Now that I had my main two modules, I could put it together and get it working. A simple main.py file loads the config, sets up the AI engine, and then starts the Discord bot’s event loop using the three modules I just wrote.
Now, I’m looking at ways to expand on this project to add more functionality. My next step is using prompt engineering to build a plugin system that gives the model limited access to the internet. If you have any suggestions, feel free to post them in the comments. I’d love to hear them!
Editor’s Note: I’m trying a new “learn with me” style post, which I hope to write more of in the future. I also hope to expand this article into a series as I continue work on this project. The full repository from this post is available on GitHub, and a container image is available on DockerHub for you to try out.