If you work in AI, then probably none of this is new to you, but if you’re curious about the near future of this technology, I hope you find this interesting!
Large Language Models (LLMs) have shown impressive results in the past few years, but I’ve noticed some uncertainty among my friends about how far they’ll be able to go. A lot of the criticism of LLMs centers on how they can’t pursue their own goals, and I want to argue that this won’t be a limitation for much longer.
What would it look like for an LLM to pursue a goal? Here are some examples of how that might go:
- Goal: Maintain tone or topic in a conversation. E.g. to keep a human involved in a long and happy discussion about their life
- Goal: Persuade a human operator to take some action, such as buy a product
- Goal: Solve a problem through reasoning. In this case, the reward for the model would come from a sense of resolution, or being told by the human operator that their problem has been solved
- Goal: Accomplish something on a website, such as find and buy concert tickets on an unfamiliar website
You can probably imagine other cases in which an AI might use language to pursue a goal, whether in one-on-one conversation, on social media, or elsewhere online.
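To make that a bit more concrete, here’s a rough sketch of what goal pursuit in conversation could look like if you squint at it as a sequential decision problem (which is exactly the framing the next section is about). To be clear, every name in it (`generate_candidates`, `estimate_value`, `respond`) is a made-up placeholder, not any real API; the point is just to show where a goal plugs in.

```python
# A hypothetical sketch of "goal pursuit in conversation" framed as a
# sequential decision problem. None of these methods exist anywhere;
# they're placeholders for the idea.

def converse(llm, user, max_turns=10):
    history = []  # the "state" is the dialogue so far
    for _ in range(max_turns):
        # The "actions" are candidate replies the model could send.
        candidates = llm.generate_candidates(history, n=4)
        # A goal-directed agent picks the reply its value estimate scores
        # highest, e.g. the one predicted to keep the user engaged.
        reply = max(candidates, key=lambda c: llm.estimate_value(history, c))
        history.append(("assistant", reply))
        user_msg = user.respond(history)
        if user_msg is None:  # the user walked away
            break
        history.append(("user", user_msg))
    # The "reward" could be turns survived, a thumbs-up, or a purchase.
    return len(history)
```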
Reinforcement Learning
There’s a whole branch of Machine Learning called Reinforcement Learning (RL), and it’s all about how to pursue goals. Modern RL has some impressive results: agents have been able to play Atari games for years, and recent ones learn those games in roughly the same number of trials a human needs. More recently, Dreamer v3 managed to mine diamonds in Minecraft, which I’m told is not easy for a beginner.
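If you haven’t seen RL before, the whole field boils down to a surprisingly small loop: act, observe a reward, adjust. Here’s a minimal, self-contained example of that loop, tabular Q-learning on a toy corridor. This is a textbook illustration, not how the Atari or Minecraft results were achieved; those swap the lookup table for deep networks (and, in Dreamer’s case, a learned world model).

```python
import random

# Tabular Q-learning on a 5-state corridor: the agent starts at state 0
# and earns a reward of 1 for reaching state 4.

N_STATES, ACTIONS = 5, (-1, +1)        # states 0..4; move left or right
alpha, gamma, eps = 0.5, 0.9, 0.1      # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    s = 0
    while s < N_STATES - 1:
        # Epsilon-greedy: usually take the best-known action, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: nudge the estimate toward the reward plus the
        # discounted value of the best action from the next state.
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2
```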
Large language models can already be connected to RL, and this is being actively worked on. Reinforcement Learning from Human Feedback (RLHF) is in production right now: it’s part of how OpenAI trains ChatGPT to avoid talking about sensitive topics.
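For those curious about the mechanics, the recipe looks something like the sketch below. I’m hedging hard here: `policy`, `reward_model`, and `optimizer` are hypothetical stand-ins, and real pipelines typically use PPO with a KL penalty against the base model rather than the bare REINFORCE update shown. But the sample-score-reinforce shape is the key idea.

```python
# A heavily simplified sketch of the RLHF idea. `policy`, `reward_model`,
# and `optimizer` are hypothetical placeholders, not any lab's actual code.

def rlhf_step(policy, reward_model, prompts, optimizer):
    for prompt in prompts:
        # 1. The LLM (the "policy") generates a response, along with the
        #    log-probability it assigned to that response.
        response, logprob = policy.sample(prompt)
        # 2. A reward model, trained on human preference comparisons,
        #    scores the response.
        reward = reward_model.score(prompt, response)
        # 3. REINFORCE-style update: make high-reward responses more likely.
        loss = -reward * logprob
        optimizer.step(loss)
```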
RL is famously used in content recommendation systems, where optimizing for engagement can make feeds addictive. I suspect the TikTok algorithm works this way, for example. Will we see the same problem in LLMs?
Predictions
I think the common wisdom among ML engineers is that this is an obvious integration, and it’s probably already happening. I’ve heard that OpenAI is doing it internally on GPT-4.
I expect that in 2023 or 2024, we’ll start to see RL integrated with LLMs in a serious way. This probably won’t immediately look like LLMs that are scary good at persuading humans to buy things; the business incentives point somewhere quieter. Instead, I think it’ll make LLMs subtly more engaging, because they’ll be trained to keep humans talking.
They won’t necessarily be optimizing for the raw number of interactions. They might instead be trained to help humans, and it may turn out that they help more when they get the human to open up. Either way, expect these models to have their own agendas soon.
Next Time
In the next post, I’ll talk about contextual memory in LLMs, including user-specific memory.