Replicate Intelligence #8

Posted July 26, 2024 by

Welcome to Replicate's weekly bulletin! Each week, we'll bring you updates on the latest open-source AI models, tools, and research. People are making cool stuff and we want to share it with you. Without further ado, here's our hacker-in-residence deepfates with an unfiltered take on the week in AI.

Editor's note

The big event this week was the release of Llama 3.1, Meta's new generation of language models, including the 405 billion parameter model. This model is a peer to GPT-4, Claude 3, and Gemini 1.5, the big proprietary models from other labs.

But unlike those labs, Meta doesn't claim to be building superintelligence, or even AGI. They think of AI as a system, and language models as one component. Mark Zuckerberg, in his letter accompanying the release, repeatedly uses the phrase "AI systems". More than most people, he understands that software doesn't exist in a vacuum. An "app" like Facebook or Instagram is actually a giant, interconnected set of social and technical systems. An "AI" will be like this too: not one giant end-to-end omnimodal intelligence, but a bunch of components working together.

Human intelligence is already a component in that system. Each one of us is a squishy cog in the vast machine of society. The systems look to us for guidance: Do you like this video? Would you buy this product? Does this picture contain a bus? They also inform us: Meeting in 10 minutes. Turn right at the next intersection. New message from Mom. We co-evolve with the systems we use.

Deep learning models are a new type of component. They have some of the aspects of human employees: they can perceive the world, they can make judgment calls, they can plan. But that doesn't mean we need to package them up into a humanoid robot with a sense of self. They can be, instead, a form of distributed intelligence. We can put a little judgment here, some pattern recognition there. We can keep humans in the loop, automating away tasks instead of jobs. We can augment our own human intelligence, bit by bit.

In fact, this is what humans have always done. We compose intelligences into systems to augment ourselves. Agriculture, domestication, engineering: we are already a modular intelligence. This is a different vision than "general intelligence", and it requires a different type of thinking.

Instead of building an employee, we must build an ecosystem.

--- deepfates


A giant open-source-ish language model

The Llama 3.1 generation includes a massive 405 billion parameter model as well as updated versions of the 8B and 70B models released earlier this year.

  • 128,000 token context
  • Multilingual support
  • Can use tools and functions

This release narrows the gap between open and closed-source models. The 405B model rivals state-of-the-art closed models in many benchmarks. It particularly excels in coding and mathematical reasoning tasks.

The updated version of the Llama license allows synthetic data creation for training other AI models, with some restrictions.
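If you want to kick the tires, here's a minimal sketch of calling the 405B model through Replicate's Python client. The model slug matches Replicate's naming for the instruct variant; the exact input schema (parameter names like `system_prompt` and `max_tokens`) is an assumption — check the model page for the real one.

```python
import os

def llama31_input(prompt, system="You are a helpful assistant.", max_tokens=512):
    """Build an input payload for a Llama 3.1 chat model on Replicate.
    Parameter names here are illustrative; verify against the model's schema."""
    return {
        "prompt": prompt,
        "system_prompt": system,
        "max_tokens": max_tokens,
    }

# Only hits the network if you have a token set in your environment.
if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    import replicate  # pip install replicate
    output = replicate.run(
        "meta/meta-llama-3.1-405b-instruct",
        input=llama31_input("Write a haiku about open weights."),
    )
    print("".join(output))
```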

try on replicate

Smaller model, big performance

Mistral AI unveils Mistral Large 2, a 123 billion parameter model under a Research License:

  • Matches Llama 3.1 405B in some tasks
  • Excels at coding and math
  • 128,000 token context

This release demonstrates that smaller, more efficient models can compete with larger ones. Its strong performance in coding and math tasks makes it particularly interesting for developers working on technical applications.

However, the restrictive research license may limit its adoption and impact in the open-source community.

post


Cool tools

Meta's framework for building AI agents

Meta open-sources a toolkit for creating AI agents with Llama 3.1.

  • Breaks down complex tasks
  • Uses built-in and custom tools
  • Configurable safety with Llama Guard

This framework allows developers to create AI agents that can tackle multi-step problems and interact with external tools. The inclusion of Llama Guard for safety provides a starting point for responsible AI development.
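Setting Meta's reference implementation aside, the core pattern is a loop: the model either answers or requests a tool, and tool results get fed back into the conversation. Here's a toy sketch of that loop — the tool registry and message shapes are illustrative, not Meta's actual API.

```python
def calculator(expression: str) -> str:
    # A deliberately tiny "tool": evaluate simple arithmetic safely-ish.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def run_agent(model_step, task: str, max_turns: int = 5) -> str:
    """model_step(messages) returns either {'answer': ...} or
    {'tool': name, 'input': ...}. In real use, model_step wraps Llama 3.1."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        step = model_step(messages)
        if "answer" in step:
            return step["answer"]
        # Dispatch the requested tool and feed its result back to the model.
        result = TOOLS[step["tool"]](step["input"])
        messages.append({"role": "tool", "content": result})
    return "gave up"
```

A safety layer like Llama Guard would slot in as a filter on `messages` before each `model_step` call.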

github


Research radar

Scaling secrets of Llama 3.1

Meta's research on Llama 3.1 reveals:

  • Extensive use of synthetic data
  • Novel fine-tuning approaches for specialized tasks
  • Techniques for handling long contexts
  • Built-in tool use abilities

These advancements provide valuable insights for developers working on large language models, especially for domain-specific applications and complex task handling.

post | paper

Lightweight defense against LLM exploits

Along with Llama 3.1, Meta released PromptGuard, a small classification model to detect malicious prompts:

  • Based on mDeBERTa-v3-base with multilingual capabilities
  • Classifies inputs as BENIGN, INJECTION, or JAILBREAK
  • Helps prevent prompt injection and jailbreak exploits

Ben at Taylor AI demonstrates how to integrate PromptGuard into existing workflows. Notably, the INJECTION label can fire on benign user prompts: it's designed to screen retrieved third-party contexts as well as direct user inputs, so you'll want a policy layer on top rather than blocking on it blindly.
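In practice that means two pieces: the classifier itself, and a small policy function that decides what to do with its labels. A sketch, assuming the model ships under Meta's Llama org on Hugging Face (it's gated, so loading it requires access approval and a weights download):

```python
def is_allowed(label: str, score: float, threshold: float = 0.5) -> bool:
    """Policy layer: block confident JAILBREAK hits; treat INJECTION as a
    soft signal, since it can fire on benign retrieved context."""
    if label == "JAILBREAK" and score >= threshold:
        return False
    return True

def screen(text: str) -> dict:
    """Classify one input with PromptGuard. Requires transformers + torch
    and gated-model access; model id assumed from the Llama 3.1 release."""
    from transformers import pipeline  # pip install transformers torch
    clf = pipeline("text-classification", model="meta-llama/Prompt-Guard-86M")
    return clf(text)[0]  # e.g. {"label": "JAILBREAK", "score": 0.99}
```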

blog post


Changelog

Search all the public models on Replicate

We've added a new API endpoint for searching public models on Replicate:

  • Use a simple QUERY HTTP request
  • Search by plaintext query
  • Get paginated JSON responses with model details

This new endpoint makes it easier to discover and integrate models into your projects. You can now programmatically search for models based on specific criteria, streamlining your development workflow.
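QUERY is a non-standard HTTP method, so most clients need it passed as a raw string. Here's a sketch with `requests` that builds (but doesn't send) a search request — the auth header style and plaintext body are our reading of the changelog, so double-check against the API docs.

```python
import requests

API = "https://api.replicate.com/v1/models"

def build_search(query: str, token: str) -> requests.PreparedRequest:
    """Build a QUERY request for Replicate's model search endpoint.
    requests happily sends non-standard method strings like QUERY."""
    req = requests.Request(
        "QUERY",
        API,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "text/plain",
        },
        data=query,  # the search query goes in the body as plaintext
    )
    return req.prepare()

# To actually search: requests.Session().send(build_search("llama", token))
# The JSON response should include a results list plus a `next` URL for paging.
```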

changelog


Bye for now

In other news, we have a subscribe form now! You can find it, and all the back issues of this letter, at replicate.com/newsletter.

Thanks for reading. Make sure to forward this letter to seven more people, or you'll have seven weeks of cold boots.

--- deepfates