moinnadeem/exllama-llama-7b

Public
15 runs

Run this model in Node.js with one line of code:

npx create-replicate --model=moinnadeem/exllama-llama-7b

Or set up a project from scratch. Install the client:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
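
If REPLICATE_API_TOKEN is unset, the client is constructed with an undefined token and the problem only shows up on the first request. A minimal sketch of failing early instead is shown here; `requireToken` is a hypothetical helper written for this example, not part of the replicate package.

```javascript
// Hypothetical helper (not part of the replicate package): read the token
// from an environment-like object and fail early if it is missing or blank.
function requireToken(env) {
  const token = (env.REPLICATE_API_TOKEN ?? "").trim();
  if (token === "") {
    throw new Error(
      "REPLICATE_API_TOKEN is not set; export it before running this script."
    );
  }
  return token;
}

// Usage with the client setup shown above:
// const replicate = new Replicate({ auth: requireToken(process.env) });
```

Passing the environment in as an argument, rather than reading process.env inside the helper, keeps the check easy to exercise without touching the real environment.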

Run moinnadeem/exllama-llama-7b using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "moinnadeem/exllama-llama-7b:f9facb823706ea2a25a69a009e8f8fe59e3d6461549fb60f43aefc8926aad3e6",
  {
    input: {
      debug: false,
      top_k: 50,
      top_p: 0.9,
      temperature: 0.75,
      system_prompt: "You are a helpful assistant.",
      max_new_tokens: 128,
      min_new_tokens: -1
    }
  }
);

console.log(output);
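
The call above passes only sampling defaults and logs whatever comes back. The sketch below shows one way to merge per-request fields over those defaults and to turn the result into a single string. Two things in it are assumptions rather than facts from this page: that the model accepts a `prompt` input, and that the output is either a string or an array of text chunks. Check the model's schema before relying on either. `buildInput` and `outputToText` are hypothetical helper names.

```javascript
// Sampling defaults copied from the example call above.
const DEFAULT_INPUT = Object.freeze({
  debug: false,
  top_k: 50,
  top_p: 0.9,
  temperature: 0.75,
  system_prompt: "You are a helpful assistant.",
  max_new_tokens: 128,
  min_new_tokens: -1,
});

// Hypothetical helper: merge per-request fields over the defaults.
// Returns a new object; the defaults are never modified.
function buildInput(overrides = {}) {
  return { ...DEFAULT_INPUT, ...overrides };
}

// Hypothetical helper: normalize a result into one string.
// Assumption: output is a string or an array of text chunks.
// Any other shape is serialized as JSON so nothing is silently lost.
function outputToText(output) {
  if (output == null) return "";
  if (typeof output === "string") return output;
  if (Array.isArray(output)) return output.join("");
  return JSON.stringify(output);
}

// Usage with the client from the setup above (needs a token and network access);
// MODEL stands for the model identifier string used in the call above,
// and `prompt` is an assumed input name:
// const output = await replicate.run(MODEL, {
//   input: buildInput({ prompt: "Write a haiku about GPUs." }),
// });
// console.log(outputToText(output));
```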

To learn more, take a look at the guide on getting started with Node.js.

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.