Language model roundup, April 2023

Posted by @joehoover, @mattt, and @zeke

One month ago, we blogged about innovation around LLaMA, an open-source language model from Meta AI. We heard from users that they wanted to see more posts like that one.

So here we are, a month later, with another roundup of recent developments in the world of open-source language models.

llama personal assistant, children's illustration
Image generated by ai-forever/kandinsky-2

Models

Large language models are hot. Here’s what came out this week:

  • StableLM – A new set of language models from Stability AI, the folks behind the Stable Diffusion image generation model. These models are trained on a new dataset that’s 3x the size of The Pile.
  • Vicuna – An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
  • GPT4All – Demo, data, and code to train an open-source assistant-style large language model based on GPT-J and LLaMA.

These new models join existing language models on Replicate like FLAN-T5, GPT-J, and LLaMA. As we publish more of these language models to Replicate, we’ll keep adding them to our collection of language models.

Fine-tuning

You can fine-tune language models to make them better at a particular task.

Our training API is currently in beta. If you want to use it, email us at team@replicate.com with a bit about yourself and what you want to use it for.

Playgrounds

People are building tools to compare these language models:

  • OpenPlayground – An LLM playground from GitHub’s former CEO that you can run on your laptop.
  • AI Playground – Compare and tune AI language models side-by-side, share your results, and auto-generate code snippets for Next.js.
  • ShareGPT – A lot of experimentation with language models happens in ChatGPT. ShareGPT makes it easy to share your wildest conversations with GPT-4 with a single click.

Autonomous agents

Is the singularity near? Hard to say. But when we let these large language models talk to themselves and interact with external systems, they sure do start to resemble something like AGI. Here are a few projects that have emerged in recent weeks:

  • Auto-GPT: An experiment in making GPT-4 autonomous. Just a month in, and this project already has 100K stars, thousands of commits, and hundreds of contributors — both a testament to the explosion of interest in this space, and a reminder of how quickly things can escalate in this new age.
  • BabyAGI: An AI-powered task management system using OpenAI models as well as LLaMA.
  • Teenage-AGI: Another autonomous system like Auto-GPT and BabyAGI that takes inspiration from the paper “Generative Agents: Interactive Simulacra of Human Behavior”.
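To make "talking to themselves" a little more concrete, here is a minimal sketch of the plan-then-execute loop that agents like these are loosely built around. It is not the code of any project above: `stub_model` is a hypothetical stand-in for a real language model call, and the `PLAN:` / `DO:` prompt prefixes are assumptions made up for this illustration.

```python
from collections import deque
from typing import Callable, Deque, List


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a language model call; returns canned text."""
    if prompt.startswith("PLAN:"):
        # Pretend the model broke the objective into two tasks, one per line.
        return "research topic\nwrite summary"
    # Pretend the model carried out the task it was handed.
    return "done: " + prompt[len("DO:"):].strip()


def run_agent(objective: str, model: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Ask the model for a task list, then feed each task back to the model."""
    plan = model(f"PLAN: {objective}")
    tasks: Deque[str] = deque(line.strip() for line in plan.splitlines() if line.strip())
    results: List[str] = []
    steps = 0
    # The step cap keeps the loop from running forever if tasks keep arriving.
    while tasks and steps < max_steps:
        task = tasks.popleft()
        results.append(model(f"DO: {task}"))
        steps += 1
    return results


results = run_agent("learn about llamas", stub_model)
print(results)
```

Swapping `stub_model` for a function that calls a real model, and letting each result add new tasks to the queue, gets you closer to what these projects do. The step cap matters more the more autonomy you hand over.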

Keeping up

Lately we’ve been at a loss for words to describe the feeling of this moment. So it feels appropriately ironic to ask GPT-4 to come up with a few words for us, in the form of a haiku.

Swift thoughts intertwine,
Silicon minds now awake,
Boundless growth ignites.

Follow us on Twitter to keep up.