kcaverly / openchat-3.5-1210-gguf

The "Overall Best Performing Open Source 7B Model" for Coding + Generalization or Mathematical Reasoning

  • Public
  • 8.4K runs
  • GitHub
  • Paper



Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 2 seconds.


A quantized, version of the OpenChat 3.5 1210 model.

The original model can be found here. The quantized version used is Q5_K_M, and can be found here.

There are two separate ‘modes’ available, which can be chosen via the prompt. This model has no system prompt.

Default Mode (GPT4 Correct): Best for coding, chat and general tasks

GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:

Mathematical Reasoning Mode: Tailored for solving math problems

Math Correct User: 10.3 − 7988.8133=<|end_of_turn|>Math Correct Assistant: