Input

Run this model in Node.js with one line of code:

npx create-replicate --model=cjwbw/gorilla

Or set up a project from scratch:

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run cjwbw/gorilla using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "cjwbw/gorilla:4a1a7ce831b1315ab68a6b3ccf54bf376de787c93198a95a329582105b70e083",
  {
    input: {
      prompt: "I would like to translate 'I feel very good today.' from English to French.",
      model_name: "gorilla-llm/gorilla-falcon-7b-hf-v0",
      temperature: 0.7,
      max_new_tokens: 1024
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run cjwbw/gorilla using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "cjwbw/gorilla:4a1a7ce831b1315ab68a6b3ccf54bf376de787c93198a95a329582105b70e083",
    input={
        "prompt": "I would like to translate 'I feel very good today.' from English to French.",
        "model_name": "gorilla-llm/gorilla-falcon-7b-hf-v0",
        "temperature": 0.7,
        "max_new_tokens": 1024
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run cjwbw/gorilla using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "cjwbw/gorilla:4a1a7ce831b1315ab68a6b3ccf54bf376de787c93198a95a329582105b70e083",
    "input": {
      "prompt": "I would like to translate \'I feel very good today.\' from English to French.",
      "model_name": "gorilla-llm/gorilla-falcon-7b-hf-v0",
      "temperature": 0.7,
      "max_new_tokens": 1024
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
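The same request can be assembled without shell quoting by serializing the body with a JSON library. The following is a minimal sketch using only the Python standard library, not an official client. The helper name `build_prediction_request` is hypothetical; the URL, headers, version string, and input values are copied from the curl command above. The request is only built here, not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = "cjwbw/gorilla:4a1a7ce831b1315ab68a6b3ccf54bf376de787c93198a95a329582105b70e083"


def build_prediction_request(token: str, prompt: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) the POST request shown in the curl example."""
    body = {
        "version": VERSION,
        "input": {
            "prompt": prompt,
            "model_name": "gorilla-llm/gorilla-falcon-7b-hf-v0",
            "temperature": 0.7,
            "max_new_tokens": 1024,
        },
    }
    return urllib.request.Request(
        API_URL,
        # json.dumps handles the quotes inside the prompt, so no manual escaping is needed.
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",
        },
        method="POST",
    )


req = build_prediction_request(
    os.environ.get("REPLICATE_API_TOKEN", "<paste-your-token-here>"),
    "I would like to translate 'I feel very good today.' from English to French.",
)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(req) as resp:
#     prediction = json.load(resp)
```

Building the body as a dictionary avoids the `\'` escapes that the shell version needs inside its `$'...'` string.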

Output

code_parsed

from transformers import pipeline
translator = pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')
translated_text = translator("I feel very good today.")[0]['translation_text']"

domain_parsed

Natural Language Processing Translation\n

api_call_parsed

pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\n

raw_text_response

\n<<<domain>>>: Natural Language Processing Translation\n<<<api_call>>>: pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\n<<<api_provider>>>: Hugging Face Transformers\n<<<explanation>>>: 1. We first import the pipeline function from the transformers library provided by Hugging Face.\n2. We then use the pipeline function to create a translation model.\n3. We specify the model 'Helsinki-NLP/opus-mt-en-fr' to be loaded. This model is trained for English to French translation tasks, which is exactly what we need for translating 'I feel very good today.'\n4. We can pass the English text as input and the model will return the translated French text.\n<<<code>>>: from transformers import pipeline\ntranslator = pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\ntranslated_text = translator(\"I feel very good today.\")[0]['translation_text']"

explanation_parsed

1. We first import the pipeline function from the transformers library provided by Hugging Face. 2. We then use the pipeline function to create a translation model. 3. We specify the model 'Helsinki-NLP/opus-mt-en-fr' to be loaded. This model is trained for English to French translation tasks, which is exactly what we need for translating 'I feel very good today.'n4. We can pass the English text as input and the model will return the translated French text.

api_provider_parsed

Hugging Face Transformers\n
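The parsed fields above appear to be derived from the `<<<name>>>:` markers in raw_text_response. As a rough illustration of how such a response could be split, here is a minimal sketch. The function name `parse_gorilla_response` and the normalization of literal `\n` sequences are assumptions made for this example; this is not the model's actual parsing code.

```python
import re

# Section markers of the form <<<name>>>: as they appear in raw_text_response.
MARKER = re.compile(r"<<<(\w+)>>>:\s*")


def parse_gorilla_response(raw: str) -> dict[str, str]:
    """Hypothetical helper: split a raw response into {section_name: text}."""
    # Assumption: the sample output contains literal backslash-n sequences,
    # so they are turned into real newlines first. This would also rewrite a
    # literal "\n" inside generated code, which a real parser should avoid.
    text = raw.replace("\\n", "\n")
    # re.split with one capture group yields
    # [preamble, name1, body1, name2, body2, ...].
    parts = MARKER.split(text)
    names = parts[1::2]
    bodies = parts[2::2]
    return {name: body.strip() for name, body in zip(names, bodies)}


sample = (
    "\\n<<<domain>>>: Natural Language Processing Translation"
    "\\n<<<api_call>>>: pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')"
    "\\n<<<api_provider>>>: Hugging Face Transformers\\n"
)
sections = parse_gorilla_response(sample)
print(sections["api_call"])
```

Stripping each section also removes the trailing `\n` that shows up in fields such as domain_parsed and api_provider_parsed.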

{
  "completed_at": "2023-11-07T12:32:36.416744Z",
  "created_at": "2023-11-07T12:28:21.908586Z",
  "data_removed": false,
  "error": null,
  "id": "bc2uwn3btg3wvuu5tejtocsi5m",
  "input": {
    "prompt": "I would like to translate 'I feel very good today.' from English to French.",
    "model_name": "gorilla-llm/gorilla-falcon-7b-hf-v0",
    "temperature": 0.7,
    "max_new_tokens": 1024
  },
  "logs": "/root/.pyenv/versions/3.11.4/lib/python3.11/site-packages/transformers/generation/utils.py:1270: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation )\nwarnings.warn(\nThe attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\nSetting `pad_token_id` to `eos_token_id`:11 for open-end generation.",
  "metrics": {
    "predict_time": 9.00843,
    "total_time": 254.508158
  },
  "output": {
    "code_parsed": "from transformers import pipeline\ntranslator = pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\ntranslated_text = translator(\"I feel very good today.\")[0]['translation_text']\"",
    "domain_parsed": "Natural Language Processing Translation\\n",
    "api_call_parsed": "pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\\n",
    "raw_text_response": "\\n<<<domain>>>: Natural Language Processing Translation\\n<<<api_call>>>: pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\\n<<<api_provider>>>: Hugging Face Transformers\\n<<<explanation>>>: 1. We first import the pipeline function from the transformers library provided by Hugging Face.\\n2. We then use the pipeline function to create a translation model.\\n3. We specify the model 'Helsinki-NLP/opus-mt-en-fr' to be loaded. This model is trained for English to French translation tasks, which is exactly what we need for translating 'I feel very good today.'\\n4. We can pass the English text as input and the model will return the translated French text.\\n<<<code>>>: from transformers import pipeline\\ntranslator = pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')\\ntranslated_text = translator(\\\"I feel very good today.\\\")[0]['translation_text']\"",
    "explanation_parsed": "1. We first import the pipeline function from the transformers library provided by Hugging Face.\n2. We then use the pipeline function to create a translation model.\n3. We specify the model 'Helsinki-NLP/opus-mt-en-fr' to be loaded. This model is trained for English to French translation tasks, which is exactly what we need for translating 'I feel very good today.'n4. We can pass the English text as input and the model will return the translated French text.\n",
    "api_provider_parsed": "Hugging Face Transformers\\n"
  },
  "started_at": "2023-11-07T12:32:27.408314Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/bc2uwn3btg3wvuu5tejtocsi5m",
    "cancel": "https://api.replicate.com/v1/predictions/bc2uwn3btg3wvuu5tejtocsi5m/cancel"
  },
  "version": "4a1a7ce831b1315ab68a6b3ccf54bf376de787c93198a95a329582105b70e083"
}
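The two values under "metrics" can be reproduced from the timestamps in the same record: predict_time matches completed_at minus started_at, and total_time matches completed_at minus created_at. The sketch below recomputes them with the standard library. The name `wait_before_start` is a label chosen for this example, not an API field; it is assumed to cover queueing and model start-up.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # also accepts the value on Python versions before 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def timing_breakdown(prediction: dict) -> dict[str, float]:
    """Recompute timing figures (in seconds) from a prediction's timestamps."""
    created = parse_ts(prediction["created_at"])
    started = parse_ts(prediction["started_at"])
    completed = parse_ts(prediction["completed_at"])
    return {
        # Hypothetical label: time between creation and start of the run.
        "wait_before_start": (started - created).total_seconds(),
        "predict_time": (completed - started).total_seconds(),
        "total_time": (completed - created).total_seconds(),
    }


# Timestamps copied from the prediction record above.
prediction = {
    "created_at": "2023-11-07T12:28:21.908586Z",
    "started_at": "2023-11-07T12:32:27.408314Z",
    "completed_at": "2023-11-07T12:32:36.416744Z",
}
timing = timing_breakdown(prediction)
print(timing)
```

For this record, most of the roughly 254 seconds of total time elapsed before the run started; the prediction itself took about 9 seconds.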

Generated in 9.0 seconds


Run time and cost

This model costs approximately $0.30 to run on Replicate, or 3 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
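The "3 runs per $1" figure follows from dividing the budget by the approximate per-run cost and rounding down. A minimal sketch of that arithmetic, assuming the $0.30 estimate holds for every run (the page notes it varies with inputs); the function name `runs_for_budget` is hypothetical:

```python
from decimal import Decimal


def runs_for_budget(budget_usd: str, cost_per_run_usd: str = "0.30") -> int:
    """Whole number of runs a budget covers at a fixed per-run cost."""
    # Decimal avoids binary floating-point rounding when dividing money amounts.
    return int(Decimal(budget_usd) // Decimal(cost_per_run_usd))


print(runs_for_budget("1"))   # budget of $1
print(runs_for_budget("10"))  # budget of $10
```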

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 6 minutes. The predict time for this model varies significantly based on the inputs.

Readme

Gorilla: Large Language Model Connected with Massive APIs [Project Website]

For our dataset collections, the documentation for all 1,640 APIs is in data/api. We also include the APIBench dataset, created by self-instruct, in data/apibench. For evaluation, we convert this into an LLM-friendly chat format; the questions are in eval/eval-data/questions, and the corresponding responses are in eval/eval-data/responses. The evaluation scripts are in eval/eval-scripts. This is entirely sufficient to train Gorilla yourself and reproduce our results. Please see evaluation for details on how to use our evaluation pipeline.

Additionally, we have released all the model weights. gorilla-7b-hf-v0 lets you invoke over 925 Hugging Face APIs. Similarly, gorilla-7b-tf-v0 and gorilla-7b-th-v0 have 626 (exhaustive) Tensorflow v2, and 94 (exhaustive) Torch Hub APIs. gorilla-mpt-7b-hf-v0 and gorilla-falcon-7b-hf-v0 are Apache 2.0 licensed models (commercially usable) fine-tuned on MPT-7B and Falcon-7B respectively. We will release a model with all three combined with generic chat capability and community contributed APIs as soon as we can scale our serving infrastructure. You can run Gorilla locally from instructions in the inference/ sub-

FAQ(s)

I would like to use Gorilla commercially. Is there going to be an Apache 2.0 licensed version?

Yes! We now have models that you can use commercially without any obligations.

Can we use Gorilla with Langchain, Toolformer, AutoGPT etc?

Absolutely! You’ve highlighted a great aspect of our tools. Gorilla is an end-to-end model, specifically tailored to serve correct API calls without requiring any additional coding. It’s designed to work as part of a wider ecosystem and can be flexibly integrated with other tools.

Langchain is a versatile developer tool. Its “agents” can efficiently swap in any LLM, Gorilla included, making it a highly adaptable solution for various needs.

AutoGPT, on the other hand, concentrates on the art of prompting GPT series models. It’s worth noting that Gorilla, as a fully fine-tuned model, consistently shows remarkable accuracy and lower hallucination, outperforming GPT-4 in making specific API calls.

Toolformer, for its part, zeroes in on a select set of tools, providing specialized functionalities. Gorilla, in contrast, can manage thousands of API calls, offering broader coverage over a more extensive range of tools.

The beauty of these tools truly shines when they collaborate, complementing each other’s strengths and capabilities to create an even more powerful and comprehensive solution. This is where your contribution can make a difference. We enthusiastically welcome any inputs to further refine and enhance these tools.

How to train your own Gorilla models?

We will release the training code as soon as we can get GPUs to test and finalize the pipeline. Given the demand for our hosted endpoints, we have dedicated all of our GPUs to serving the models. If you would like to help with resources, get in touch!

Citation

If you use Gorilla or APIBench, please cite our paper:

@article{patil2023gorilla,
  title={Gorilla: Large Language Model Connected with Massive APIs},
  author={Shishir G. Patil and Tianjun Zhang and Xin Wang and Joseph E. Gonzalez},
  year={2023},
  journal={arXiv preprint arXiv:2305.15334},
}
