nomic-ai / gpt4all

An open-source chatbot trained on collections of clean assistant data

  • Public
  • 527 runs
  • GitHub
  • License

This is the GPT4all implementation written using pyllamacpp, the support Python bindings for llama.cpp and GPT4all.

Model description

GPT4All-J is an Apache-2 licensed chatbot trained over a massive curated corpus of assistant interactions including word problems, multi-turn dialogue, code, poems, songs, and stories. It builds on the March 2023 GPT4All release by training on a significantly larger corpus, by deriving its weights from the Apache-licensed GPT-J model rather than the GPL-licensed of LLaMA, and by demonstrating improved performance on creative tasks such as writing stories, poems, songs and plays.

Citation

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}