deepseek-ai / deepseek-math-7b-instruct

Pushing the Limits of Mathematical Reasoning in Open Language Models - Instruct model

  • Public
  • 1.6K runs
  • L40S
  • GitHub
  • Paper
  • License

Input

text (string)

Input text. Note: backslashes aren't interpreted as an escape sequence; for example, "\n" is two characters, not a newline.

Default: "what is the integral of x^2 from 0 to 2?\nPlease reason step by step, and put your final answer within \boxed{}."

max_new_tokens (integer)

The maximum number of tokens to generate, ignoring the number of tokens in the prompt.

Default: 100

temperature (number)

The value used to modulate the next-token probabilities.

Default: 1

top_k (integer)

The number of highest-probability vocabulary tokens to keep for top-k filtering.

Default: 50

top_p (number)

If set to a float < 1, only the smallest set of most probable tokens whose probabilities add up to top_p or higher is kept for generation (nucleus sampling; see the sketch below).

Default: 0.9
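These three knobs compose in a standard way: temperature rescales the logits, top_k truncates to the k most probable tokens, and top_p then keeps the smallest nucleus of tokens whose cumulative probability reaches the threshold. The NumPy sketch below is purely illustrative; it shows the usual filtering order, not this model's actual sampler (which is the Hugging Face transformers implementation), and the function name sample_next_token is invented for this example.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=50, top_p=0.9):
    """Illustrative top-k + top-p (nucleus) sampling over raw logits."""
    # Temperature scales the logits before softmax: lower values sharpen
    # the distribution, higher values flatten it.
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # top_k: mask out everything below the k-th largest logit.
    if 0 < top_k < logits.size:
        kth_best = np.sort(logits)[-top_k]
        logits = np.where(logits < kth_best, -np.inf, logits)

    # Softmax over the surviving logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # top_p: keep the smallest prefix of the sorted distribution whose
    # cumulative probability reaches the threshold.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]

    # Renormalize and sample from the kept tokens.
    kept_probs = probs[keep] / probs[keep].sum()
    return np.random.choice(keep, p=kept_probs)
```

With the defaults above (temperature = 1, top_k = 50, top_p = 0.9), sampling is restricted to at most 50 tokens covering at least 90% of the probability mass.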

Output

To find the integral of x^2 with respect to x, we use the power rule of integration, which states that the integral of x^n with respect to x is (x^(n+1))/(n+1) + C, where n is a constant and C is the constant of integration.

So, the integral of x^2 with respect to x is (x^(2+1))/(2+1) + C = (x^3)/3 + C.

Now, we need to evaluate this integral from 0 to 2. We do this by plugging in the limits of integration: [(2^3)/3 + C] - [(0^3)/3 + C] = (8/3) - (0/3) = 8/3.

So, the integral of x^2 from 0 to 2 is 8/3. The answer is $\boxed{\frac{8}{3}}$.
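For readability, here is the same computation in standard notation (a plain LaTeX rendering of the model's steps, not part of the model output):

```latex
\int_{0}^{2} x^{2}\,dx
  = \left[\frac{x^{3}}{3}\right]_{0}^{2}
  = \frac{2^{3}}{3} - \frac{0^{3}}{3}
  = \frac{8}{3}
```

The constant of integration cancels when the limits are applied, so the definite integral is 8/3.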

Run time and cost

This model costs approximately $1.45 to run on Replicate (roughly 0.7 runs per $1), but this varies depending on your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 25 minutes. The predict time for this model varies significantly based on the inputs.
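If you call the model programmatically, the standard Replicate Python client works as usual. A minimal sketch, assuming the replicate package is installed, a REPLICATE_API_TOKEN environment variable is set, and that the input names mirror the parameters documented above:

```python
# Minimal sketch: calling deepseek-math-7b-instruct via the Replicate
# Python client. Input names are assumed to match the Input section above.
import replicate

output = replicate.run(
    "deepseek-ai/deepseek-math-7b-instruct",
    input={
        "text": (
            "what is the integral of x^2 from 0 to 2?\n"
            "Please reason step by step, and put your final answer "
            "within \\boxed{}."
        ),
        "max_new_tokens": 200,
        "temperature": 1,
        "top_k": 50,
        "top_p": 0.9,
    },
)
print(output)
```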

Readme


Demo for deepseek-math-7b-base: https://replicate.com/cjwbw/deepseek-math-7b-base

Introduction

DeepSeekMath is initialized from DeepSeek-Coder-v1.5 7B and continues pre-training on math-related tokens sourced from Common Crawl, together with natural language and code data, for a total of 500B tokens. DeepSeekMath 7B achieves an impressive score of 51.7% on the competition-level MATH benchmark without relying on external toolkits or voting techniques, approaching the performance level of Gemini-Ultra and GPT-4. For research purposes, we release checkpoints of the base, instruct, and RL models to the public.
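To run the instruct checkpoint locally, the usual Hugging Face transformers flow applies. A sketch, assuming the checkpoint published on the Hub under the same name and the prompt format used in the default input above:

```python
# Sketch: local inference with Hugging Face transformers. Assumes the
# deepseek-ai/deepseek-math-7b-instruct checkpoint on the Hub and a GPU
# with enough memory for a 7B model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-math-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {
        "role": "user",
        "content": "what is the integral of x^2 from 0 to 2?\n"
        "Please reason step by step, and put your final answer "
        "within \\boxed{}.",
    }
]
# Build the chat-formatted prompt and generate a completion.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids.to(model.device), max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```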


License

This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.

See the LICENSE-CODE and LICENSE-MODEL for more details.

Citation

@misc{deepseek-math,
  author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y.K. Li and Y. Wu and Daya Guo},
  title = {DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models},
  journal = {CoRR},
  volume = {abs/2402.03300},
  year = {2024},
  url = {https://arxiv.org/abs/2402.03300},
}

Contact

If you have any questions, please raise an issue or contact us at service@deepseek.com.