Collections

Streaming language models

Language models that support streaming responses. See https://replicate.com/docs/streaming
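The models in this collection emit output tokens incrementally rather than returning one final blob. A minimal sketch of consuming such a stream with the Replicate Python client (assuming the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment; the model slug and prompt below are illustrative only):

```python
def join_chunks(chunks) -> str:
    """Concatenate streamed text chunks into the final completion."""
    return "".join(str(c) for c in chunks)


def stream_completion(model: str, prompt: str) -> str:
    """Print each token as it arrives and return the assembled text."""
    import replicate  # third-party client; needs REPLICATE_API_TOKEN set

    chunks = []
    for event in replicate.stream(model, input={"prompt": prompt}):
        # Each event is a server-sent chunk of generated text.
        print(str(event), end="", flush=True)
        chunks.append(event)
    return join_chunks(chunks)


# Example usage (performs a network call, so it is commented out):
# stream_completion(
#     "meta/meta-llama-3-70b-instruct",
#     "Write a haiku about streaming APIs.",
# )
```

Printing chunks as they arrive gives users visible progress on long generations; the same loop works for any model on this page that supports streaming.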

Recommended models

meta/meta-llama-3-8b

Base version of Llama 3, an 8 billion parameter language model from Meta.

10.7M runs

meta/meta-llama-3-70b-instruct

A 70 billion parameter language model from Meta, fine-tuned for chat completions

10M runs

yorickvp/llava-13b

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities

7.5M runs

mistralai/mixtral-8x7b-instruct-v0.1

The Mixtral-8x7B-instruct-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts tuned to be a helpful assistant.

6.8M runs

meta/llama-2-70b-chat

A 70 billion parameter language model from Meta, fine-tuned for chat completions

5.6M runs

meta/llama-2-7b-chat

A 7 billion parameter language model from Meta, fine-tuned for chat completions

5.3M runs

meta/llama-2-13b-chat

A 13 billion parameter language model from Meta, fine-tuned for chat completions

4.1M runs

mistralai/mistral-7b-instruct-v0.2

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1.

2.4M runs

meta/meta-llama-3-8b-instruct

An 8 billion parameter language model from Meta, fine-tuned for chat completions

1.8M runs

fofr/prompt-classifier

Determines the toxicity of text-to-image prompts; a llama-13b fine-tune. Returns a [SAFETY_RANKING] between 0 (safe) and 10 (toxic)

1.6M runs

yorickvp/llava-v1.6-34b

LLaVA v1.6: Large Language and Vision Assistant (Nous-Hermes-2-34B)

1.2M runs

mistralai/mistral-7b-v0.1

A 7 billion parameter language model from Mistral.

1.2M runs

replicate-internal/llama-2-70b-triton

1.1M runs

mistralai/mistral-7b-instruct-v0.1

An instruction-tuned 7 billion parameter language model from Mistral

883.5K runs

yorickvp/llava-v1.6-vicuna-13b

LLaVA v1.6: Large Language and Vision Assistant (Vicuna-13B)

717.2K runs

replicate/dolly-v2-12b

An open source instruction-tuned large language model developed by Databricks

453.1K runs

yorickvp/llava-v1.6-mistral-7b

LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)

391.8K runs

meta/llama-2-7b

Base version of Llama 2 7B, a 7 billion parameter language model

271.6K runs

spuuntries/flatdolphinmaid-8x7b-gguf

Undi95's FlatDolphinMaid 8x7B Mixtral Merge, GGUF Q5_K_M quantized by TheBloke.

264.1K runs

joehoover/instructblip-vicuna13b

An instruction-tuned multi-modal model based on BLIP-2 and Vicuna-13B

257.5K runs

replicate/vicuna-13b

A large language model that's been fine-tuned on ChatGPT interactions

251.2K runs

01-ai/yi-34b-chat

The Yi series models are large language models trained from scratch by developers at 01.AI.

248.6K runs

snowflake/snowflake-arctic-instruct

An efficient, intelligent, and truly open-source language model

231.7K runs

andreasjansson/sheep-duck-llama-2-70b-v1-1-gguf

211.8K runs

meta/llama-2-70b

Base version of Llama 2, a 70 billion parameter language model from Meta.

186.8K runs

nateraw/goliath-120b

An auto-regressive causal LM created by combining two fine-tuned Llama 2 70B models into one.

184.8K runs

01-ai/yi-6b

The Yi series models are large language models trained from scratch by developers at 01.AI.

158.1K runs

antoinelyset/openhermes-2-mistral-7b-awq

135.6K runs

replicate/flan-t5-xl

A language model by Google for tasks like classification, summarization, and more

132.2K runs

meta/codellama-13b

A 13 billion parameter Llama tuned for code completion

112.4K runs

stability-ai/stablelm-tuned-alpha-7b

7 billion parameter version of Stability AI's language model

110.8K runs

replicate/llama-7b

Transformers implementation of the LLaMA language model

98K runs

meta/codellama-34b-instruct

A 34 billion parameter Llama tuned for coding and conversation

98K runs

nateraw/openchat_3.5-awq

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

87.2K runs

google-deepmind/gemma-2b-it

2B instruct version of Google’s Gemma model

79K runs

lucataco/phi-3-mini-4k-instruct

Phi-3-Mini-4K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets

75.8K runs

nateraw/mistral-7b-openorca

Mistral-7B-v0.1 fine-tuned for chat with the OpenOrca dataset.

65.6K runs

meta/meta-llama-3-70b

Base version of Llama 3, a 70 billion parameter language model from Meta.

62K runs

meta/llama-2-13b

Base version of Llama 2 13B, a 13 billion parameter language model

60.8K runs

joehoover/mplug-owl

An instruction-tuned multimodal large language model that generates text based on user-provided prompts and images

55.8K runs

fofr/image-prompts

Generate image prompts for Midjourney. Prefix inputs with "Image: "

51.6K runs

nateraw/nous-hermes-2-solar-10.7b

Nous Hermes 2 - SOLAR 10.7B is the flagship Nous Research model built on the SOLAR 10.7B base model.

48.6K runs

joehoover/falcon-40b-instruct

A 40 billion parameter language model trained to follow human instructions.

38.2K runs

kcaverly/dolphin-2.5-mixtral-8x7b-gguf

Mixtral-8x7b MOE model trained for chat with the dolphin dataset, quantized

37.9K runs

meta/codellama-13b-instruct

A 13 billion parameter Llama tuned for coding and conversation

36.4K runs

google-deepmind/gemma-7b-it

7B instruct version of Google’s Gemma model

34.2K runs

replicate/oasst-sft-1-pythia-12b

An open source instruction-tuned large language model developed by Open-Assistant

32.4K runs

meta/codellama-7b-instruct

A 7 billion parameter Llama tuned for coding and conversation

30.4K runs

lucataco/dolphin-2.2.1-mistral-7b

Mistral-7B-v0.1 fine tuned for chat with the Dolphin dataset (an open-source implementation of Microsoft's Orca)

29.2K runs

lucataco/moondream2

moondream2 is a small vision language model designed to run efficiently on edge devices

29.1K runs

yorickvp/llava-v1.6-vicuna-7b

LLaVA v1.6: Large Language and Vision Assistant (Vicuna-7B)

26.8K runs

nateraw/defog-sqlcoder-7b-2

A capable large language model for natural language to SQL generation.

20.7K runs

andreasjansson/codellama-7b-instruct-gguf

CodeLlama-7B-instruct with support for grammars and jsonschema

20.6K runs

meta/codellama-70b-instruct

A 70 billion parameter Llama tuned for coding and conversation

17.7K runs

uwulewd/airoboros-llama-2-70b

Inference for Airoboros L2 70B 2.1 (GPTQ) using ExLlama.

17.1K runs

replicate/lifeboat-70b

15.5K runs

meta/codellama-7b

A 7 billion parameter Llama tuned for coding and conversation

15.1K runs

nomagick/chatglm3-6b

A 6B parameter open bilingual chat LLM | 开源双语对话语言模型

14.6K runs

spuuntries/miqumaid-v1-70b-gguf

NeverSleep's MiquMaid v1 70B Miqu Finetune, GGUF Q3_K_M quantized by NeverSleep.

14K runs

gregwdata/defog-sqlcoder-q8

Defog's SQLCoder is a state-of-the-art LLM for converting natural language questions to SQL queries. SQLCoder is a 15B parameter model fine-tuned on a base StarCoder model.

12.4K runs

lucataco/dolphin-2.1-mistral-7b

Mistral-7B-v0.1 fine tuned for chat with the Dolphin dataset (an open-source implementation of Microsoft's Orca)

12.1K runs

kcaverly/neuralbeagle14-7b-gguf

NeuralBeagle14-7B is (probably) the best 7B model you can find!

12.1K runs

antoinelyset/openhermes-2.5-mistral-7b

12K runs

nomagick/chatglm2-6b

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

10.6K runs

lucataco/moondream1

(Research only) Moondream1 is a vision language model that performs on par with models twice its size

9.9K runs

kcaverly/openchat-3.5-1210-gguf

The "Overall Best Performing Open Source 7B Model" for Coding + Generalization or Mathematical Reasoning

9.8K runs

meta/codellama-34b

A 34 billion parameter Llama tuned for coding and conversation

9.7K runs

kcaverly/nous-hermes-2-yi-34b-gguf

Nous Hermes 2 - Yi-34B is a state-of-the-art Yi fine-tune, trained on GPT-4-generated synthetic data

8.5K runs

replicate/mpt-7b-storywriter

A 7B parameter LLM fine-tuned to support contexts with more than 65K tokens

8.3K runs

replicate/gpt-j-6b

A large language model by EleutherAI

8.1K runs

andreasjansson/llama-2-13b-chat-gguf

Llama-2 13B chat with support for grammars and jsonschema

7.6K runs

nateraw/nous-hermes-llama2-awq

TheBloke/Nous-Hermes-Llama2-AWQ served with vLLM

7.2K runs

joehoover/zephyr-7b-alpha

A high-performing language model trained to act as a helpful assistant

6.9K runs

google-deepmind/gemma-7b

7B base version of Google’s Gemma model

6.7K runs

meta/codellama-34b-python

A 34 billion parameter Llama tuned for coding with Python

6.3K runs

nateraw/zephyr-7b-beta

Zephyr-7B-beta, an LLM trained to act as a helpful assistant.

5.6K runs

hikikomori-haven/solar-uncensored

5.5K runs

replicate/llama-13b-lora

Transformers implementation of the LLaMA 13B language model

5K runs

nomagick/qwen-14b-chat

Qwen-14B-Chat is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc.

4.5K runs

kcaverly/dolphin-2.7-mixtral-8x7b-gguf

Uncensored Mixtral-8x7b MOE model trained for chat with the Dolphin dataset

4.2K runs

lucataco/llama-3-vision-alpha

Projection module trained to add vision capabilities to Llama 3 using SigLIP

4.1K runs

meta/codellama-13b-python

A 13 billion parameter Llama tuned for coding with Python

3.8K runs

lucataco/qwen1.5-72b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

3.7K runs

meta/codellama-7b-python

A 7 billion parameter Llama tuned for coding with Python

3.7K runs

01-ai/yi-6b-chat

The Yi series models are large language models trained from scratch by developers at 01.AI.

3.6K runs

joehoover/sql-generator

3.6K runs

nateraw/salmonn

SALMONN: Speech Audio Language Music Open Neural Network

3.5K runs

anotherjesse/llava-lies

LLaVA injecting randomness into the image

2.9K runs

kcaverly/dolphin-2.6-mixtral-8x7b-gguf

Mixtral-8x7b MOE model trained for chat with the dolphin + samantha's empathy dataset

2.6K runs

lucataco/qwen1.5-110b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

2.4K runs

organisciak/ocsai-llama2-7b

2.3K runs

01-ai/yi-34b

The Yi series models are large language models trained from scratch by developers at 01.AI.

2.1K runs

andreasjansson/llama-2-70b-chat-gguf

Llama-2 70B chat with support for grammars and jsonschema

2K runs

kcaverly/deepseek-coder-33b-instruct-gguf

A quantized 33B parameter language model from Deepseek for SOTA repository level code completion

1.9K runs

replit/replit-code-v1-3b

Generate code with Replit's replit-code-v1-3b large language model

1.9K runs

01-ai/yi-34b-200k

The Yi series models are large language models trained from scratch by developers at 01.AI.

1.6K runs

lucataco/deepseek-vl-7b-base

DeepSeek-VL: An open-source Vision-Language Model designed for real-world vision and language understanding applications

1.4K runs

mattt/orca-2-13b

1.4K runs

niron1/openorca-platypus2-13b

OpenOrca-Platypus2-13B is a merge of garage-bAInd/Platypus2-13B and Open-Orca/OpenOrcaxOpenChat-Preview2-13B.

1.3K runs

lucataco/phi-3-mini-128k-instruct

Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets

1.3K runs

camenduru/wizardlm-2-8x22b

WizardLM 2 8x22B

1.3K runs

daanelson/flan-t5-large

A language model for tasks like classification, summarization, and more.

1.2K runs

meta/codellama-70b

A 70 billion parameter Llama tuned for coding and conversation

1.1K runs

anotherjesse/sdxl-recur

explore img2img zooming sdxl

1.1K runs

niron1/qwen-7b-chat

Qwen-7B is the 7B-parameter version of the large language model series Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, pretrained on a large volume of data, including web texts, books, codes, etc.

959 runs

nateraw/causallm-14b

CausalLM/14B model with AWQ quantization. Perhaps better than all existing models < 70B, in most quantitative evaluations...

956 runs

meta/codellama-70b-python

A 70 billion parameter Llama tuned for coding with Python

911 runs

deepseek-ai/deepseek-math-7b-instruct

Pushing the Limits of Mathematical Reasoning in Open Language Models - Instruct model

849 runs

nateraw/samsum-llama-2-13b

835 runs

spuuntries/miqumaid-v2-2x70b-dpo-gguf

NeverSleep's MiquMaid v2 2x70B Miqu-Mixtral MoE DPO Finetune, GGUF Q2_K quantized by NeverSleep.

771 runs

nomagick/qwen-vl-chat

Qwen-VL-Chat but with raw ChatML prompt interface and streaming

726 runs

andreasjansson/codellama-34b-instruct-gguf

CodeLlama-34B-instruct with support for grammars and jsonschema

710 runs

andreasjansson/wizardcoder-python-34b-v1-gguf

WizardCoder-python-34B-v1.0 with support for grammars and jsonschema

705 runs

nwhitehead/llama2-7b-chat-gptq

671 runs

moinnadeem/vllm-engine-llama-7b

666 runs

andreasjansson/llama-2-13b-gguf

Llama-2 13B with support for grammars and jsonschema

654 runs

charles-dyfis-net/llama-2-13b-hf--lmtp-8bit

647 runs

ruben-svensson/llama2-aqua-test1

605 runs

lucataco/wizardcoder-33b-v1.1-gguf

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

605 runs

antoinelyset/openhermes-2.5-mistral-7b-awq

564 runs

papermoose/llama-pajama

547 runs

stability-ai/stablelm-base-alpha-7b

7B parameter base version of Stability AI's language model

532 runs

fofr/llama2-prompter

Llama2 13b base model fine-tuned on text to image prompts

487 runs

kcaverly/deepseek-coder-6.7b-instruct

A ~7B parameter language model from Deepseek for SOTA repository level code completion

448 runs

google-deepmind/gemma-2b

2B base version of Google’s Gemma model

437 runs

fofr/star-trek-gpt-j-6b

gpt-j-6b trained on the Memory Alpha Star Trek Wiki

425 runs

replicate-internal/staging-llama-2-7b

412 runs

andreasjansson/plasma

Generate plasma shader equations

392 runs

stability-ai/stablelm-base-alpha-3b

3B parameter base version of Stability AI's language model

385 runs

camenduru/mixtral-8x22b-instruct-v0.1

Mixtral 8x22b Instruct v0.1

369 runs

theghoul21/srl

365 runs

kcaverly/dolphin-2.6-mistral-7b-gguf

Mistral 7B v2 fine-tuned on the Dolphin dataset

363 runs

spuuntries/erosumika-7b-v3-0.2-gguf

localfultonextractor's Erosumika 7B Mistral Merge, GGUF Q4_K_S-imat quantized by Lewdiculous.

357 runs

nomagick/chatglm3-6b-32k

A 6B parameter open bilingual chat LLM (optimized for 8k+ context) | 开源双语对话语言模型

316 runs

lucataco/tinyllama-1.1b-chat-v1.0

This is the chat model finetuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T

281 runs

nateraw/axolotl-llama-2-7b-english-to-hinglish

281 runs

ignaciosgithub/pllava

279 runs

kcaverly/nous-capybara-34b-gguf

A SOTA Nous Research finetune of Yi-34B-200K, fine-tuned on the Capybara dataset.

266 runs

nateraw/sqlcoder-70b-alpha

265 runs

peter65374/openbuddy-llemma-34b-gguf

A Cog implementation of the "openbuddy-llemma-34b" 4-bit quantized model.

264 runs

niron1/llama-2-7b-chat

LLaMA-2 7B chat version by Meta, with streaming support, an unaltered prompt, working temperature, and economical hardware.

262 runs

cbh123/dylan-lyrics

Llama 2 13B fine-tuned on Bob Dylan lyrics

240 runs

antoinelyset/openhermes-2-mistral-7b

Simple version of https://huggingface.co/teknium/OpenHermes-2-Mistral-7B

238 runs

lucataco/qwen1.5-14b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

225 runs

camenduru/mixtral-8x22b-v0.1-instruct-oh

Mixtral-8x22b-v0.1-Instruct-Open-Hermes

221 runs

kcaverly/phind-codellama-34b-v2-gguf

A quantized 34B parameter language model from Phind for code completion

219 runs

hayooucom/llm-60k

An LLM for Chinese (CN)

209 runs

hamelsmu/llama-3-70b-instruct-awq-with-tools

Function calling with llama-3 with prompting only.

207 runs

nomagick/chatglm2-6b-int4

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型 (int4)

206 runs

zeke/nyu-llama-2-7b-chat-training-test

A test model for fine-tuning Llama 2

197 runs

deepseek-ai/deepseek-math-7b-base

Pushing the Limits of Mathematical Reasoning in Open Language Models - Base model

187 runs

kcaverly/nous-hermes-2-solar-10.7b-gguf

Nous Hermes 2 - SOLAR 10.7B is the flagship Nous Research model built on the SOLAR 10.7B base model.

184 runs

xrunda/med

179 runs

adirik/mamba-2.8b

Base version of Mamba 2.8B, a 2.8 billion parameter state space language model

179 runs

lucataco/phixtral-2x2_8

phixtral-2x2_8 is the first Mixture of Experts (MoE) made with two microsoft/phi-2 models, inspired by the mistralai/Mixtral-8x7B-v0.1 architecture

179 runs

kcaverly/nexus-raven-v2-13b-gguf

A quantized 13B parameter language model from NexusFlow for SOTA zero-shot function calling

177 runs

lucataco/wizard-vicuna-13b-uncensored

This is wizard-vicuna-13b trained with a subset of the dataset - responses that contained alignment / moralizing were removed

177 runs

hamelsmu/honeycomb-2

Honeycomb NLQ Generator

161 runs

fofr/star-trek-adventure

156 runs

nateraw/stablecode-completion-alpha-3b-4k

155 runs

zallesov/super-real-llama2

148 runs

fofr/neuromancer-13b

llama-13b-base fine-tuned on Neuromancer style

144 runs

swartype/lanne-m1-70b

Lanne M1 is the first language model produced by Lanne Tech. It has over 70B parameters, with performance equivalent to GPT-3.5.

141 runs

m1guelpf/mario-gpt

Using language models to generate Super Mario Bros levels

137 runs

camenduru/zephyr-orpo-141b-a35b-v0.1

Mixtral 8x22b v0.1 Zephyr Orpo 141b A35b v0.1

132 runs

nateraw/samsum-llama-7b

llama-2-7b fine-tuned on the samsum dataset for dialogue summarization

131 runs

fofr/star-trek-flan

flan-t5-xl trained on the Memory Alpha Star Trek Wiki

130 runs

fofr/star-trek-llama

llama-7b trained on the Memory Alpha Star Trek Wiki

125 runs

titocosta/notus-7b-v1

Notus-7b-v1 model

122 runs

moinnadeem/fastervicuna_13b

Re-implements LLaMA using a higher-MFU implementation

119 runs

nateraw/llama-2-7b-paraphrase-v1

111 runs

rybens92/una-cybertron-7b-v2--lmtp-8bit

92 runs

cjwbw/opencodeinterpreter-ds-6.7b

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

89 runs

nateraw/wizardcoder-python-34b-v1.0

88 runs

automorphic-ai/runhouse

88 runs

crowdy/line-lang-3.6b

An implementation of a 3.6B-parameter Japanese large language model

82 runs

nateraw/llama-2-7b-chat-hf

80 runs

cjwbw/starcoder2-15b

Language Models for Code

80 runs

moinnadeem/codellama-34b-instruct-vllm

75 runs

tanzir11/merge

74 runs

lucataco/olmo-7b

OLMo is a series of Open Language Models designed to enable the science of language models

73 runs

spuuntries/borealis-10.7b-dpo-gguf

Undi95's Borealis 10.7B Mistral DPO Finetune, GGUF Q5_K_M quantized by Undi95.

72 runs

lucataco/qwen1.5-1.8b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

72 runs

nateraw/codellama-7b-instruct-hf

69 runs

adirik/mamba-130m

Base version of Mamba 130M, a 130 million parameter state space language model

67 runs

juanjaragavi/abby-llama-2-7b-chat

Abby is a stoic philosopher and a loving and caring mature woman.

66 runs

peter65374/openbuddy-mistral-7b

OpenBuddy fine-tuned Mistral-7B, GPTQ-quantized to 4 bits by TheBloke

66 runs

lucataco/qwen1.5-7b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

64 runs

lucataco/hermes-2-pro-llama-3-8b

Hermes 2 Pro is trained on an updated and cleaned version of the OpenHermes 2.5 dataset, as well as a newly introduced function calling and JSON mode dataset developed in-house

63 runs

nateraw/aidc-ai-business-marcoroni-13b

62 runs

replicate-internal/mixtral-8x7b-instruct-v0.1-pget

61 runs

martintmv-git/moondream2

small vision language model

58 runs

chigozienri/llava-birds

56 runs

cjwbw/c4ai-command-r-v01

CohereForAI c4ai-command-r-v01, Quantized model through bitsandbytes, 8-bit precision

54 runs

lidarbtc/kollava-v1.5

korean version of llava-v1.5

53 runs

dsingal0/mixtral-single-gpu

Runs Mixtral 8x7B on a single A40 GPU

51 runs

lucataco/nous-hermes-2-mixtral-8x7b-dpo

Nous Hermes 2 Mixtral 8x7B DPO is a Nous Research model trained over the Mixtral 8x7B MoE LLM

48 runs

cbh123/samsum

47 runs

replicate-internal/staging-honeycomb-triton

A fast version of replicate.com/hamelsmu/honeycomb-2 using TRT-LLM

47 runs

cbh123/homerbot

46 runs

titocosta/starling

Starling-LM-7B-alpha

46 runs

replicate/elixir-gen

Fine-tuned Llama 13b on Elixir docstrings (WIP)

45 runs

technillogue/mixtral-instruct-nix

45 runs

adirik/mamba-2.8b-slimpj

Base version of Mamba 2.8B Slim Pyjama, a 2.8 billion parameter state space language model

40 runs

adirik/mamba-1.4b

Base version of Mamba 1.4B, a 1.4 billion parameter state space language model

38 runs

sruthiselvaraj/finetuned-llama2

35 runs

lucataco/qwen1.5-0.5b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

35 runs

hamelsmu/honeycomb

Honeycomb NLQ Generator

33 runs

adirik/mamba-370m

Base version of Mamba 370M, a 370 million parameter state space language model

31 runs

nateraw/gairmath-abel-7b

27 runs

seanoliver/bob-dylan-fun-tuning

Llama fine-tune-athon project: training Llama 2 on Bob Dylan lyrics.

26 runs

adirik/mamba-790m

Base version of Mamba 790M, a 790 million parameter state space language model

24 runs

intentface/poro-34b-gguf-checkpoint

Try out akx/Poro-34B-gguf (Q5_K). This is the 1000B-token checkpoint model.

23 runs

lucataco/qwen1.5-32b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

21 runs

fleshgordo/orni2-chat

20 runs

nateraw/codellama-7b-instruct

18 runs

lucataco/deepseek-67b-base

DeepSeek LLM, an advanced language model comprising 67 billion parameters, trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese

18 runs

lucataco/qwen1.5-4b

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data

17 runs

charles-dyfis-net/llama-2-7b-hf--lmtp-4bit

16 runs

nateraw/llama-2-7b-samsum

16 runs

juanjaragavi/abbot-llama-2-7b-chat

Abbot is a brutally honest stoic philosopher. He is here to help the 'User' be their best self, no coddling.

14 runs

msamogh/iiu-generator-llama2-7b-2

14 runs

divyavanmahajan/my-pet-llama

13 runs

nateraw/codellama-7b

13 runs

nateraw/codellama-34b

13 runs

charles-dyfis-net/llama-2-13b-hf--lmtp

11 runs

jquintanilla4/qwen1.5-32b-chat

Qwen1.5 32B Chat variant. A transformer-based decoder-only language model. Good with Chinese and English.

11 runs

replicate-internal/gemma-2b-it

2B instruct version of the Gemma model

10 runs

halevi/sandbox1

10 runs

demonpore-sys/llamaxine0.1

8 runs

charles-dyfis-net/llama-2-13b-hf--lmtp-4bit

7 runs

nateraw/codellama-13b

7 runs

nateraw/codellama-13b-instruct

5 runs