gregwdata / defog-sqlcoder-q8

Defog's SQLCoder is a state-of-the-art LLM for converting natural language questions to SQL queries. SQLCoder is a 15B parameter fine-tuned on a base StarCoder model.

  • Public
  • 12.4K runs
  • L40S
  • GitHub
  • License
Iterate in playground

Input

*string
Shift + Return to add a new line

Text prompt

integer
(minimum: 1)

Maximum number of tokens to generate. A word is generally 2-3 tokens

Default: 1200

integer
(minimum: 1)

Number of beams for beam search

Default: 5

string
Shift + Return to add a new line

Prompt template for SQLcoder - see https://github.com/defog-ai/sqlcoder/blob/main/prompt.md . Must contain {user_question} and {table_metadata_string} parameters.

Default: "### Instructions:\nYour task is convert a question into a SQL query, given a Postgres database schema.\nAdhere to these rules:\n- **Deliberately go through the question and database schema word by word** to appropriately answer the question\n- **Use Table Aliases** to prevent ambiguity. For example, `SELECT table1.col1, table2.col1 FROM table1 JOIN table2 ON table1.id = table2.id`.\n- When creating a ratio, always cast the numerator as float\n\n### Input:\nGenerate a SQL query that answers the question `{user_question}`.\nThis query will run on a database whose schema is represented in this string:\n{table_metadata_string}\n\n### Response:\nBased on your instructions, here is the SQL query I have generated to answer the question `{user_question}`:\n```sql"

string
Shift + Return to add a new line

Description of database schema. See https://github.com/defog-ai/sqlcoder/blob/main/metadata.sql

Default: "CREATE TABLE products (\n product_id INTEGER PRIMARY KEY, -- Unique ID for each product\n name VARCHAR(50), -- Name of the product\n price DECIMAL(10,2), -- Price of each unit of the product\n quantity INTEGER -- Current quantity in stock\n);\n\nCREATE TABLE customers (\n customer_id INTEGER PRIMARY KEY, -- Unique ID for each customer\n name VARCHAR(50), -- Name of the customer\n address VARCHAR(100) -- Mailing address of the customer\n);\n\nCREATE TABLE salespeople (\n salesperson_id INTEGER PRIMARY KEY, -- Unique ID for each salesperson \n name VARCHAR(50), -- Name of the salesperson\n region VARCHAR(50) -- Geographic sales region \n);\n\nCREATE TABLE sales (\n sale_id INTEGER PRIMARY KEY, -- Unique ID for each sale\n product_id INTEGER, -- ID of product sold\n customer_id INTEGER, -- ID of customer who made purchase\n salesperson_id INTEGER, -- ID of salesperson who made the sale\n sale_date DATE, -- Date the sale occurred \n quantity INTEGER -- Quantity of product sold\n);\n\nCREATE TABLE product_suppliers (\n supplier_id INTEGER PRIMARY KEY, -- Unique ID for each supplier\n product_id INTEGER, -- Product ID supplied\n supply_price DECIMAL(10,2) -- Unit price charged by supplier\n);\n\n-- sales.product_id can be joined with products.product_id\n-- sales.customer_id can be joined with customers.customer_id \n-- sales.salesperson_id can be joined with salespeople.salesperson_id\n-- product_suppliers.product_id can be joined with products.product_id\n"

integer
(minimum: -1)

Set seed for reproducible outputs. Set to -1 for random seed.

Default: -1

boolean

provide debugging output in logs

Default: false

Output

SELECT salespeople.name, sum(sales.quantity) AS total_sales FROM salespeople join sales on salespeople.salesperson_id = sales.salesperson_id GROUP BY salespeople.name ORDER BY total_sales DESC LIMIT 1;
Generated in

This output was created using a different version of the model, gregwdata/defog-sqlcoder-q8:ad5a6671.

Run time and cost

This model costs approximately $0.00098 to run on Replicate, or 1020 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 1 seconds.

Readme

This is a cog wrapper around Defog’s SQLCoder, to deploy it on Replicate .

It has initially been deployed with 8-bit quantization.

Defog released their code with an Apache v2 license, and the model weights as CC BY-SA 4.0.