titocosta / meditron-70b-awq

Meditron-70B-v1.0, from the open-source Meditron suite of medical LLMs, quantized with AWQ.

  • Public
  • 165 runs
  • GitHub
  • Paper

Input

Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.
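If you prefer not to export the token in your shell, you can also set it from Python before making any API calls; a minimal sketch (the replicate client reads REPLICATE_API_TOKEN from the environment):

import os

# Alternative to the shell export above: set the token programmatically.
os.environ["REPLICATE_API_TOKEN"] = "<paste-your-token-here>"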

Import the client:
import replicate

Run titocosta/meditron-70b-awq using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "titocosta/meditron-70b-awq:7cbcd02ebd1baa7f800969f60dada8bd33c30e8d467223ce78842ecba2fbbc86",
    input={
        "prompt": "What are the common causes of iron-deficiency anemia?",  # example question; replace with your own
        "top_k": 50,  # sample only from the 50 most likely tokens
        "top_p": 0.95,  # nucleus sampling threshold
        "temperature": 0.2,  # low temperature for more focused, deterministic answers
        "max_new_tokens": 512,  # maximum length of the generated answer
        "system_message": "You are a helpful AI assistant trained in the medical domain",
        "prompt_template": "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>question\n{prompt}<|im_end|>\n<|im_start|>answer\n"
    }
)

# The titocosta/meditron-70b-awq model can stream output as it's running.
# replicate.run() returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/titocosta/meditron-70b-awq/api#output-schema
    print(item, end="")

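If you want the full answer as a single string instead of printing tokens as they arrive, you can join the streamed items once generation finishes. A minimal sketch; the prompt is an example, and the other inputs are omitted on the assumption that the model's defaults apply:

import replicate

# Collect the streamed text fragments into one string.
output = replicate.run(
    "titocosta/meditron-70b-awq:7cbcd02ebd1baa7f800969f60dada8bd33c30e8d467223ce78842ecba2fbbc86",
    input={
        "prompt": "What is the recommended first-line treatment for hypertension?",  # example question
        "max_new_tokens": 256
    }
)
answer = "".join(output)
print(answer)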
To learn more, take a look at the guide on getting started with Python.


Run time and cost

This model runs on 8x Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Meditron is a suite of open-source medical Large Language Models (LLMs).

We release Meditron-7B and Meditron-70B, which are adapted to the medical domain from Llama-2 through continued pretraining on a comprehensively curated medical corpus, including selected PubMed papers and abstracts, a new dataset of internationally-recognized medical guidelines, and a general domain corpus.

Meditron-70B, finetuned on relevant data, outperforms Llama-2-70B, GPT-3.5 and Flan-PaLM on multiple medical reasoning tasks.

Advisory Notice

While Meditron is designed to encode medical knowledge from sources of high-quality evidence, it is not yet adapted to deliver this knowledge appropriately, safely, or within professional actionable constraints. We recommend against using Meditron in medical applications without extensive use-case alignment, as well as additional testing, specifically including randomized controlled trials in real-world practice settings.

Model Details

  • Developed by: EPFL LLM Team
  • Model type: Causal decoder-only transformer language model
  • Language(s): English (mainly)
  • Model License: LLAMA 2 COMMUNITY LICENSE AGREEMENT
  • Code License: APACHE 2.0 LICENSE
  • Continue-pretrained from model: Llama-2-70B
  • Context length: 4k tokens
  • Input: Text only data
  • Output: Model generates text only
  • Status: This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we enhance the model's performance.
  • Knowledge Cutoff: August 2023
  • Trainer: epflLLM/Megatron-LLM
  • Paper: Meditron-70B: Scaling Medical Pretraining for Large Language Models
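For reference, inference against this model uses the ChatML-style prompt_template shown in the Input example above. The sketch below simply renders that template for an illustrative system message and question, to show the final text the model receives (both strings are examples, not part of the model card):

# Render the prompt_template from the Input example to see the exact
# text the model is given (system message and question are examples).
prompt_template = (
    "<|im_start|>system\n{system_message}<|im_end|>\n"
    "<|im_start|>question\n{prompt}<|im_end|>\n"
    "<|im_start|>answer\n"
)
final_prompt = prompt_template.format(
    system_message="You are a helpful AI assistant trained in the medical domain",
    prompt="What are the diagnostic criteria for sepsis?",
)
print(final_prompt)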