Qwen2-1.5B-Instruct on Replicate
This Replicate model provides access to the Qwen2-1.5B-Instruct model, part of the Qwen2 language model series. It offers three variants:
- Qwen/Qwen2-1.5B-Instruct: Full-precision model
- Qwen/Qwen2-1.5B-Instruct-GPTQ-Int8: 8-bit quantized model
- Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4: 4-bit quantized model
Introduction
Qwen2 is the latest series of Qwen large language models, offering both pretrained and instruction-tuned models in five sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. This Replicate implementation focuses on the instruction-tuned 1.5B Qwen2 model.
Qwen2 demonstrates competitive performance against state-of-the-art open-source and proprietary models across various benchmarks, including language understanding, generation, multilingual capability, coding, mathematics, and reasoning.
For more details about Qwen2, visit:
Model Details
Qwen2 is based on the Transformer architecture and incorporates:
- SwiGLU activation
- Attention QKV bias
- Group query attention
- Improved tokenizer for multiple natural languages and code
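As a rough illustration of two of the components listed above, the following Python sketch shows a SwiGLU feed-forward block and the head mapping used by group query attention. It is a toy, pure-Python version under stated assumptions: the function names, weight matrices, and head counts are illustrative and are not taken from the Qwen2 source code or configuration.

```python
import math

def silu(v):
    """SiLU (Swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x)), biases omitted."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """Group query attention: consecutive query heads share one key/value head."""
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size

# Toy usage with made-up weights (illustrative values only).
out = swiglu_mlp(
    [1.0, 2.0],
    w_gate=[[0.0, 0.0], [1.0, 0.0]],
    w_up=[[1.0, 1.0], [0.0, 1.0]],
    w_down=[[1.0, 1.0]],
)

# With 8 query heads and 2 key/value heads (illustrative counts),
# query heads 0-3 share KV head 0 and query heads 4-7 share KV head 1.
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]
```

Sharing key/value heads across groups of query heads shrinks the key/value cache, which is the usual motivation for this design.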
Training Details
The model underwent pretraining with a large dataset, followed by post-training using both supervised fine-tuning and direct preference optimization.
Quickstart
To use this Replicate implementation:
- Visit the Replicate model page.
- Use the web interface or API to run a prediction with your desired parameters.
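For the API route, a prediction could be requested from Python roughly as follows. This is a minimal sketch, assuming the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. `MODEL_REF` is a placeholder, not the real model identifier (copy the actual one from the Replicate model page), and the input names are taken from the `cog predict` example in this README.

```python
import os

# Placeholder (assumption): replace with the identifier shown on the Replicate model page.
MODEL_REF = "owner/model-name"

def build_input(prompt, system_prompt="You are a funny and helpful assistant."):
    """Assemble the prediction inputs, mirroring the cog predict example."""
    return {
        "prompt": prompt,
        "system_prompt": system_prompt,
        "model_type": "Qwen2-1.5B-Instruct",
        "max_new_tokens": 512,
        "temperature": 1,
        "top_k": 1,
        "top_p": 1,
        "repetition_penalty": 1,
    }

def run_prediction(prompt):
    """Send the prediction to Replicate and return the generated text."""
    import replicate  # imported lazily so build_input works without the package

    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("Set REPLICATE_API_TOKEN before calling run_prediction")
    output = replicate.run(MODEL_REF, input=build_input(prompt))
    # Text models on Replicate often return chunks of text; join them if so.
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)

# Example (requires network access and a valid token):
# print(run_prediction("Tell me a funny joke about cowboys in the style of Yoda from Star Wars"))
```

Depending on the client version, the model reference may need a `:version` suffix.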
For local testing or development:
- Clone the repository:

      git clone -b Qwen2-1.5B-Instruct https://github.com/zsxkib/cog-qwen-2.git
      cd cog-qwen-2
- Run a prediction using Cog:

      cog predict \
        -i 'top_k=1' \
        -i 'top_p=1' \
        -i 'prompt="Tell me a funny joke about cowboys in the style of Yoda from Star Wars"' \
        -i 'model_type="Qwen2-1.5B-Instruct"' \
        -i 'temperature=1' \
        -i 'system_prompt="You are a funny and helpful assistant."' \
        -i 'max_new_tokens=512' \
        -i 'repetition_penalty=1'
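The sampling parameters in the command above interact in a way that is easy to miss: with `top_k=1` only the single most likely token survives filtering, so the output is effectively greedy regardless of `temperature` and `top_p`. The sketch below illustrates the conventional meaning of these three parameters on a toy distribution. It assumes the predictor follows the usual temperature, top-k, then nucleus (top-p) filtering semantics, which this README does not confirm; `filter_probs` is a hypothetical helper, not part of the repository.

```python
import math

def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering.

    In this sketch, top_k=0 means "no top-k limit" (an assumed convention).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens from most to least likely, then keep at most top_k of them.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    # Nucleus filtering: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving probabilities so they sum to 1.
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}

toy_logits = [2.0, 1.0, 0.5]
greedy = filter_probs(toy_logits, temperature=1.0, top_k=1, top_p=1.0)  # only token 0 remains
full = filter_probs(toy_logits, temperature=1.0, top_k=0, top_p=1.0)    # all three tokens remain
```

For more varied output, one would typically raise `top_k` (or disable it) and lower `top_p` slightly below 1.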
Evaluation
Performance of Qwen2-0.5B-Instruct and Qwen2-1.5B-Instruct compared with their similarly sized Qwen1.5 predecessors, Qwen1.5-0.5B-Chat and Qwen1.5-1.8B-Chat:
Dataset | Qwen1.5-0.5B-Chat | Qwen2-0.5B-Instruct | Qwen1.5-1.8B-Chat | Qwen2-1.5B-Instruct |
---|---|---|---|---|
MMLU | 35.0 | 37.9 | 43.7 | 52.4 |
HumanEval | 9.1 | 17.1 | 25.0 | 37.8 |
GSM8K | 11.3 | 40.1 | 35.3 | 61.6 |
C-Eval | 37.2 | 45.2 | 55.3 | 63.8 |
IFEval (Prompt Strict-Acc.) | 14.6 | 20.0 | 16.8 | 29.0 |
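
To read the table as generation-over-generation gains, the absolute differences can be computed directly from the scores above. The following sketch only does arithmetic on the table's values; the variable and function names are illustrative.

```python
# Scores copied from the table above, in column order.
MODELS = [
    "Qwen1.5-0.5B-Chat",
    "Qwen2-0.5B-Instruct",
    "Qwen1.5-1.8B-Chat",
    "Qwen2-1.5B-Instruct",
]
SCORES = {
    "MMLU": [35.0, 37.9, 43.7, 52.4],
    "HumanEval": [9.1, 17.1, 25.0, 37.8],
    "GSM8K": [11.3, 40.1, 35.3, 61.6],
    "C-Eval": [37.2, 45.2, 55.3, 63.8],
    "IFEval (Prompt Strict-Acc.)": [14.6, 20.0, 16.8, 29.0],
}

def gains(old_model, new_model):
    """Absolute score difference (new minus old) per benchmark, in points."""
    old_i, new_i = MODELS.index(old_model), MODELS.index(new_model)
    return {name: round(row[new_i] - row[old_i], 1) for name, row in SCORES.items()}

small = gains("Qwen1.5-0.5B-Chat", "Qwen2-0.5B-Instruct")
large = gains("Qwen1.5-1.8B-Chat", "Qwen2-1.5B-Instruct")

for name in SCORES:
    print(f"{name}: +{small[name]} (0.5B pair), +{large[name]} (1.5B vs. 1.8B)")
```

In both pairs the largest absolute gain is on GSM8K (28.8 and 26.3 points respectively).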
Citation
If you find the Qwen2 model helpful in your work, please cite:
@article{qwen2,
title={Qwen2 Technical Report},
year={2024}
}
License
The Qwen2 model is licensed under the Apache 2.0 License.
Credits and Support
- The Qwen2 model was developed by the Qwen team.
- This Replicate implementation was created by @zsakib_.
- For issues related to the Replicate implementation, please use the GitHub issue tracker.
- For questions about the underlying Qwen2 model, refer to the official Qwen repository.