kcaverly / openchat-3.5-1210-gguf

The "Overall Best Performing Open Source 7B Model" for Coding + Generalization or Mathematical Reasoning

  • Public
  • 26.1K runs

Run time and cost

This model costs approximately $0.00072 to run on Replicate, or 1388 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
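
The per-run figure and the runs-per-dollar figure above are two views of the same number. A minimal sketch of the arithmetic, assuming the quoted $0.00072 is a flat average (the helper name `estimate_cost` is hypothetical, and real cost varies with your inputs):

```python
import math

# Approximate cost per run quoted above, in US dollars.
COST_PER_RUN_USD = 0.00072

# Whole runs that fit into one dollar at that average price.
runs_per_dollar = math.floor(1 / COST_PER_RUN_USD)


def estimate_cost(num_runs: int) -> float:
    """Rough cost in dollars for num_runs predictions at the average price."""
    return num_runs * COST_PER_RUN_USD


print(runs_per_dollar)        # 1388
print(estimate_cost(10_000))  # roughly 7.2 dollars
```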

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 1 second.

Readme

A quantized version of the OpenChat 3.5 1210 model.

The original model can be found here. The quantized version used is Q5_K_M, and can be found here.

There are two separate ‘modes’ available, selected by the role prefix used in the prompt. This model has no system prompt.

Default Mode (GPT4 Correct): Best for coding, chat and general tasks

GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:
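
Writing this template by hand is error-prone once a conversation has several turns. The sketch below assembles it from a list of turns; the function name `build_chat_prompt` and the `("user" | "assistant", text)` tuple convention are illustrative assumptions, not part of the model's API.

```python
END_OF_TURN = "<|end_of_turn|>"

# Role labels used by the default (GPT4 Correct) mode.
LABELS = {
    "user": "GPT4 Correct User",
    "assistant": "GPT4 Correct Assistant",
}


def build_chat_prompt(turns):
    """Build a default-mode prompt from (role, text) pairs.

    Every completed turn is closed with the end-of-turn token, and the
    prompt ends with an open assistant label for the model to continue.
    """
    parts = [f"{LABELS[role]}: {text}{END_OF_TURN}" for role, text in turns]
    parts.append(f"{LABELS['assistant']}:")
    return "".join(parts)


prompt = build_chat_prompt([
    ("user", "Hello"),
    ("assistant", "Hi"),
    ("user", "How are you today?"),
])
print(prompt)
```

For the three turns shown, the output reproduces the example prompt above character for character.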

Mathematical Reasoning Mode: Tailored for solving math problems

Math Correct User: 10.3 − 7988.8133=<|end_of_turn|>Math Correct Assistant:
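
The math mode uses the same layout with different role labels. A minimal single-turn sketch, assuming one question per prompt (the helper name `build_math_prompt` is hypothetical):

```python
END_OF_TURN = "<|end_of_turn|>"


def build_math_prompt(question: str) -> str:
    """Wrap a single math question in the Math Correct template.

    The prompt ends with an open assistant label so the model's
    completion is the answer.
    """
    return f"Math Correct User: {question}{END_OF_TURN}Math Correct Assistant:"


question = "10.3 − 7988.8133="
math_prompt = build_math_prompt(question)
print(math_prompt)
```

Because the mode is selected purely by these labels, switching between modes only requires changing the prefix strings; no other setting is involved.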