Replicate Intelligence #10

Welcome to Replicate’s weekly bulletin! Each week, we’ll bring you updates on the latest open-source AI models, tools, and research. People are making cool stuff and we want to share it with you. Without further ado, here’s our hacker-in-residence deepfates with an unfiltered take on the week in AI.

Editor’s note

The news in open source this week was all FLUX.1. People have been amazed by the open image models, running nearly 5 million predictions on FLUX.1 [schnell] in the first week!

Fine-tuning scripts are starting to come out. Expect to see some interesting new downstream models next week. For now we have image-to-image generation, and a ton of cool images people are creating. Check out our X feed and blog posts for some great examples.

— deepfates

Image-to-image generation with FLUX.1

FLUX.1 [dev] now supports image-to-image transformations on Replicate. Bring a starter image, write a prompt, and play with the prompt strength input to balance influence between the starting image and the prompt.

This works by starting generation from your starter image instead of from pure random noise. It does well for style transfer and composition control, but has its weaknesses. For example, it’s hard to get black-and-white line art from a color image. Imagine how far the pixels would have to wiggle to get there.
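
Here is a toy sketch of that idea, to make the prompt strength trade-off concrete. It is not Replicate's or FLUX.1's actual implementation. It assumes the starter image is blended linearly with Gaussian noise and that the strength also decides how many denoising steps are left to run. The names `noised_start` and `steps_to_run`, the four-value "image", and the 28-step count are all made up for illustration.

```python
import random


def noised_start(image, strength, rng):
    """Blend a starter image with Gaussian noise (illustrative assumption).

    strength=0.0 keeps the starter image as is; strength=1.0 is pure noise,
    which is the same as ordinary text-to-image generation.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    noise = [rng.gauss(0.0, 1.0) for _ in image]
    return [(1.0 - strength) * px + strength * n for px, n in zip(image, noise)]


def steps_to_run(total_steps, strength):
    """Denoising steps remaining when generation starts partway through."""
    return min(total_steps, int(total_steps * strength))


if __name__ == "__main__":
    rng = random.Random(0)
    starter = [0.2, 0.5, 0.9, 0.1]  # hypothetical four-pixel "image"
    for strength in (0.0, 0.5, 1.0):
        start = noised_start(starter, strength, rng)
        print(strength, [round(v, 2) for v in start], steps_to_run(28, strength))
```

At low strength the model starts close to your image and has few steps to move away from it, which is why big jumps like color photo to line art are hard. At high strength the starter image is mostly drowned out by noise and the prompt takes over.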

Experiment and let us know what you find!

try it on Replicate

Cool tools

A video interview featuring Zeke

Streamlit has launched a new video series, and in the inaugural episode, our very own Zeke joins to demonstrate how to build AI-powered apps using Replicate. This tutorial covers everything from getting started with Streamlit to integrating various AI models hosted on Replicate.

“The revelation, for me, was that these language models have become so sophisticated, that there are so many different amazing apps that people can build on top of them where all of the hard work is being done by the language model. And all you have to do is build a compelling and simple user interface on top that delivers value to users.” — Zeke

video | code

Research radar

Odyssey: Empowering Agents with Open-World Skills

Odyssey is a new framework that empowers language model agents with open-world skills to explore the vast Minecraft world. It includes an interactive agent with a skill library, a fine-tuned LLaMA-3 model, and a new open-world benchmark.

The framework demonstrates effective planning and exploration capabilities, making it a significant advancement in autonomous agent solutions. More importantly, they have cool videos on their GitHub page. Go watch them.

code | paper

Bye for now

Thanks for reading! If you have any thoughts or feedback, hit reply and let me know. Forward this to a friend who might find it interesting! Smash that subscribe button. Confirm and submit! Consume and obey. Eat at Joe’s. Run AI with an API on Replicate. I love you.

--- deepfates