titocosta / notus-7b-v1

Notus-7b-v1 model

  • Public
  • 130 runs
  • L40S

Input

string (required)

Prompt

string

Prompt template

Default: "<|system|>\n{system_message}</s>\n<|user|>\n{prompt}</s>\n<|assistant|>\n"

string

System message

Default: "You are a helpful AI assistant trained in the medical domain"

integer

The maximum number of tokens the model should generate as output.

Default: 512

number

Model temperature

Default: 0.2

number

Top P

Default: 0.95

integer

Top K

Default: 50
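The default prompt template and system message listed above can be combined into the final text sent to the model. The sketch below is only an illustration: it assumes the `\n` sequences in the displayed default are newline escapes and that the placeholders are filled by plain string substitution, which the page does not confirm. The function name `build_prompt` and the example question are hypothetical.

```python
# Defaults copied from the input listing above.
# Assumption: "\n" in the displayed default stands for a newline character.
DEFAULT_TEMPLATE = (
    "<|system|>\n{system_message}</s>\n<|user|>\n{prompt}</s>\n<|assistant|>\n"
)
DEFAULT_SYSTEM_MESSAGE = "You are a helpful AI assistant trained in the medical domain"


def build_prompt(prompt, system_message=DEFAULT_SYSTEM_MESSAGE, template=DEFAULT_TEMPLATE):
    """Fill the {system_message} and {prompt} placeholders of the template.

    Hypothetical helper: the hosted model may apply the template differently.
    """
    return template.format(system_message=system_message, prompt=prompt)


if __name__ == "__main__":
    # Hypothetical user question, used only to show the resulting layout.
    print(build_prompt("What is the tallest building in the world?"))
```

Overriding the system message is a matter of passing a different `system_message` argument; the template itself only needs to change if a different chat format is wanted.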

Output

As of 2021, the tallest building in the world is the Burj Khalifa in Dubai, United Arab Emirates, which stands at a height of 828 meters (2,716 feet and 8 inches). Here are the top 10 tallest buildings in the world:

1. Burj Khalifa, Dubai, UAE (828 meters)
2. Shanghai Tower, Shanghai, China (632 meters)
3. Makkah Royal Clock Tower Hotel, Mecca, Saudi Arabia (601 meters)
4. Ping An Finance Center, Shenzhen, China (599 meters)
5. One World Trade Center, New York City, USA (546 meters)
6. Taipei 101, Taipei, Taiwan (508 meters)
7. Lotte World Tower, Seoul, South Korea (555 meters)
8. Guangzhou CTF Finance Centre, Guangzhou, China (530 meters)
9. Changsha IFS Tower 1, Changsha, China (528 meters)
10. Nanjing Green Towers, Nanjing, China (506 meters)

Note: The heights listed above are of the main structural spire, excluding antennas and other non-structural features.

This output was created using a different version of the model, titocosta/notus-7b-v1:dbcd2277.

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Notus-7b-v1 model

Model card below.

Notus is a collection of models fine-tuned using Direct Preference Optimization (DPO) and related RLHF techniques. This model is the first version, fine-tuned with DPO over zephyr-7b-sft-full, the SFT model used to create zephyr-7b-beta.

Following a data-first approach, the only difference between Notus-7B-v1 and Zephyr-7B-beta is the preference dataset used for dDPO.
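For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not the training code used for Notus. It assumes the summed log-probabilities of the chosen and rejected responses are already available from the policy being trained and from a frozen reference model (in DPO this is typically the SFT starting point). The function name and the value `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the total log-probability of a full response under the
    policy or the frozen reference model. beta=0.1 is an assumed value.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log(2) when the policy and the reference agree, and it falls as the policy assigns relatively more probability to the chosen response. This is why the choice of which response counts as "chosen" in the preference dataset matters so much.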

In particular, when we started building distilabel, we invested time understanding and deep-diving into the UltraFeedback dataset. Using Argilla, we found data issues in the original UltraFeedback dataset that led to high scores for bad responses (more details in the training data section). After curating several hundred data points, we decided to binarize the dataset using the preference ratings, instead of the original critique overall_score, and verified the new dataset with Argilla.

Using preference ratings instead of critique scores led to a new dataset where the chosen response is different in ~50% of the cases. Using this new dataset with DPO, we fine-tuned Notus, a 7B model that surpasses Zephyr-7B-beta and Claude 2 on AlpacaEval.

Important note: We opted for the average of the multi-aspect ratings while we fix the original dataset, but a very interesting open question remains: once the critique data is fixed, which works better, the critique scores or the preference ratings? We’re very excited to do this comparison in the coming weeks, stay tuned!
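The difference between the two binarization strategies discussed above can be sketched as follows. This is an illustration, not the actual curation pipeline: the record layout, the aspect names, and all numeric values are invented toy data. Picking the lowest-scoring completion as the rejected one is an assumption made here for determinism; the text above does not say how the rejected response was selected.

```python
from statistics import mean


def by_avg_rating(completion):
    """Score a completion by the average of its multi-aspect ratings."""
    return mean(completion["ratings"].values())


def by_overall_score(completion):
    """Score a completion by the critique overall_score."""
    return completion["overall_score"]


def binarize(completions, score):
    """Turn several scored completions into one chosen/rejected pair.

    Assumption: rejected is the lowest-scoring completion.
    """
    ranked = sorted(completions, key=score, reverse=True)
    return {"chosen": ranked[0]["text"], "rejected": ranked[-1]["text"]}


# Toy record (invented values): the critique score is high for a bad response.
example = [
    {"text": "good answer",
     "ratings": {"helpfulness": 5, "honesty": 5, "instruction_following": 4, "truthfulness": 5},
     "overall_score": 7.0},
    {"text": "bad answer",
     "ratings": {"helpfulness": 1, "honesty": 2, "instruction_following": 1, "truthfulness": 1},
     "overall_score": 10.0},
    {"text": "middling answer",
     "ratings": {"helpfulness": 3, "honesty": 3, "instruction_following": 3, "truthfulness": 3},
     "overall_score": 6.0},
]

if __name__ == "__main__":
    print(binarize(example, by_avg_rating))
    print(binarize(example, by_overall_score))
```

On this toy record the two strategies disagree about the chosen response, which is the kind of flip the ~50% figure above refers to.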

This model wouldn’t have been possible without the amazing Alignment Handbook and OpenBMB, who released the UltraFeedback dataset, and it’s based on fruitful discussions with the Hugging Face H4 team. In particular, we used zephyr-7b-beta’s recipe, which worked out of the box and enabled us to focus on what we do best: high-quality data.

Notus models are intended to be used as assistants via chat-like applications, and are evaluated with Chat (MT-Bench, AlpacaEval) and Academic (Open LLM Leaderboard) benchmarks for a direct comparison with the original Zephyr dDPO model and other 7B models.

Why Notus? The name Notus comes from the ancient Greek god Notus, as a wink to Zephyr, which comes from the ancient Greek god Zephyrus; the difference is that Notus is the god of the south wind and Zephyrus the god of the west wind. More information at https://en.wikipedia.org/wiki/Anemoi.