nomic-ai/gpt4all | Run with an API on Replicate

Run time and cost

A run of this model on Replicate costs approximately $0.019 (about 52 runs per $1), though the cost varies with your inputs. The model is also open source, so you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 14 seconds. The predict time for this model varies significantly based on the inputs.
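The cost and timing figures quoted above can be related with a little arithmetic. The following is a minimal sketch, not Replicate's billing logic: the function names are hypothetical, and the linear scaling of cost with predict time is an assumption made for illustration, using only the approximate $0.019 per run and 14 second figures from this page.

```python
import math

# Approximate figures quoted on this page; both vary with inputs.
COST_PER_RUN_USD = 0.019
TYPICAL_SECONDS = 14


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in $1 at the given per-run cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(predict_seconds: float,
                   cost_per_run: float = COST_PER_RUN_USD,
                   typical_seconds: float = TYPICAL_SECONDS) -> float:
    """Rough cost of one run, ASSUMING cost scales linearly with predict time.

    The linear model is an illustrative assumption, not a documented
    pricing rule.
    """
    return cost_per_run * predict_seconds / typical_seconds


if __name__ == "__main__":
    print("runs per $1:", runs_per_dollar())
    print("estimated cost of a 28 s run (USD):", round(estimated_cost(28), 4))
```

Under that assumption, a prediction that takes twice the typical time costs roughly twice the quoted amount, which is one way to read the note that cost varies with inputs.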

Readme

This is a GPT4All implementation written using pyllamacpp, the supported Python bindings for llama.cpp and GPT4All.

Model description

GPT4All-J is an Apache-2 licensed chatbot trained over a massive curated corpus of assistant interactions, including word problems, multi-turn dialogue, code, poems, songs, and stories. It builds on the March 2023 GPT4All release in three ways: it is trained on a significantly larger corpus, it derives its weights from the Apache-licensed GPT-J model rather than the GPL-licensed LLaMA, and it demonstrates improved performance on creative tasks such as writing stories, poems, songs, and plays.

Citation

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}