zsxkib / qwen2-1.5b-instruct

Qwen 2: A 1.5 billion parameter language model from Alibaba Cloud, fine-tuned for chat completions


Run time and cost

This model costs approximately $0.00072 per run on Replicate, or about 1,388 runs per $1, though this varies with your inputs. It is also open source, so you can run it on your own computer with Docker.

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 1 second.
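As a minimal sketch of the local Docker option mentioned above, assuming Replicate publishes this model's image under its usual r8.im/&lt;owner&gt;/&lt;model&gt; naming convention and that you have an NVIDIA GPU with the NVIDIA Container Toolkit installed:

```sh
# Start the model's Cog HTTP server (image name assumed from Replicate's
# usual r8.im/<owner>/<model> convention; not confirmed by this page).
docker run -d -p 5000:5000 --gpus=all r8.im/zsxkib/qwen2-1.5b-instruct

# Send a prediction request to the local Cog server.
curl http://localhost:5000/predictions -X POST \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "Write a haiku about GPUs"}}'
```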

Readme

Qwen2-1.5B-Instruct on Replicate

This Replicate model provides access to the Qwen2-1.5B-Instruct model, part of the Qwen2 language model series. It offers three variants (a selection example follows this list):

  • Qwen/Qwen2-1.5B-Instruct: Full precision model
  • Qwen/Qwen2-1.5B-Instruct-GPTQ-Int8: 8-bit quantized model
  • Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4: 4-bit quantized model
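A minimal sketch of selecting a variant, assuming the model_type input (used in the Quickstart below) accepts the variant names listed above; the GPTQ-Int4 value here is inferred from those names rather than confirmed:

```sh
# Run the 4-bit quantized variant locally with Cog.
# The model_type value is assumed to mirror the variant names above.
cog predict \
  -I 'prompt="Summarize the plot of Hamlet in two sentences"' \
  -I 'model_type="Qwen2-1.5B-Instruct-GPTQ-Int4"'
```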

Introduction

Qwen2 is the latest series of Qwen large language models, offering both pretrained and instruction-tuned models in five sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. This Replicate implementation focuses on the instruction-tuned 1.5B Qwen2 model.

Qwen2 demonstrates competitive performance against state-of-the-art open-source and proprietary models across various benchmarks, including language understanding, generation, multilingual capability, coding, mathematics, and reasoning.

For more details about Qwen2, see the official Qwen repository and blog.

Model Details

Qwen2 is based on the Transformer architecture and incorporates:

  • SwiGLU activation
  • Attention QKV bias
  • Group query attention
  • Improved tokenizer for multiple natural languages and code

Training Details

The model underwent pretraining with a large dataset, followed by post-training using both supervised fine-tuning and direct preference optimization.

Quickstart

To use this Replicate implementation:

  1. Visit the Replicate model page.

  2. Use the web interface or the HTTP API to run a prediction with your desired parameters (an example API call follows).
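A sketch of such an API call, assuming Replicate's standard predictions endpoint; the &lt;version-id&gt; placeholder is deliberately left unfilled and should be copied from the model's Replicate page:

```sh
# Create a prediction via the Replicate HTTP API.
# <version-id> is a placeholder: copy the current version ID from the
# model's Replicate page. REPLICATE_API_TOKEN must be set in your shell.
curl -s -X POST https://api.replicate.com/v1/predictions \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "<version-id>",
    "input": {
      "prompt": "Tell me a fun fact about alpacas",
      "model_type": "Qwen2-1.5B-Instruct",
      "max_new_tokens": 256
    }
  }'
```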

For local testing or development:

  1. Clone the repository:

     ```sh
     git clone -b Qwen2-1.5B-Instruct https://github.com/zsxkib/cog-qwen-2.git
     cd cog-qwen-2
     ```

  2. Run a prediction using Cog:

     ```sh
     cog predict \
       -I 'top_k=1' \
       -I 'top_p=1' \
       -I 'prompt="Tell me a funny joke about cowboys in the style of Yoda from Star Wars"' \
       -I 'model_type="Qwen2-1.5B-Instruct"' \
       -I 'temperature=1' \
       -I 'system_prompt="You are a funny and helpful assistant."' \
       -I 'max_new_tokens=512' \
       -I 'repetition_penalty=1'
     ```

Evaluation

Performance comparison of the Qwen2 instruct models against their Qwen1.5 chat counterparts (higher is better):

| Dataset | Qwen1.5-0.5B-Chat | Qwen2-0.5B-Instruct | Qwen1.5-1.8B-Chat | Qwen2-1.5B-Instruct |
| --- | --- | --- | --- | --- |
| MMLU | 35.0 | 37.9 | 43.7 | 52.4 |
| HumanEval | 9.1 | 17.1 | 25.0 | 37.8 |
| GSM8K | 11.3 | 40.1 | 35.3 | 61.6 |
| C-Eval | 37.2 | 45.2 | 55.3 | 63.8 |
| IFEval (Prompt Strict-Acc.) | 14.6 | 20.0 | 16.8 | 29.0 |

Citation

If you find the Qwen2 model helpful in your work, please cite:

@article{qwen2,
  title={Qwen2 Technical Report},
  year={2024}
}

License

The Qwen2 model is licensed under the Apache 2.0 License.

Credits and Support