andreasjansson / codegen

An open-source model for program synthesis. Competitive with OpenAI Codex.

  • Public
  • 1.1K runs
  • A100 (80GB)
  • GitHub
  • Paper
  • License

Input

string

Some starting Python code. CodeGen will try to complete the code you provide. Providing examples of what you want to do before your prompt can improve performance.

Default: "# Implement a function that computes the square of an integer argument.\n"

boolean

Whether to prepend your input to the output.

Default: true

integer
(minimum: 1, maximum: 10)

Number of code completions to generate from context.

Default: 1

number
(minimum: 0, maximum: 1)

Increase to improve the diversity of outputs; higher values may cause artifacts.

Default: 0.2

boolean

Whether to prepend a numpy import to the context as in the paper.

Default: true

number
(minimum: 0, maximum: 1)

Top-p sampling probability.

Default: 0.95

integer
(minimum: 32, maximum: 2048)

Max length of returned sequences.

Default: 128

integer

Seed for reproducibility. Use -1 for a random seed.

Default: -1

boolean

Whether to return a single long string (multiple code snippets separated by `======`) or a markdown URL to download. This may be useful when calling the model through the API.

Default: false
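
As a rough sketch of how these inputs fit together, here is an example call using the Replicate Python client. The input names used below (context, prepend_context, num_return_sequences, temperature, import_numpy, top_p, max_length, seed, return_as_string) are assumptions for illustration only, since the form above lists only types and descriptions; check the model's API schema for the actual names.

# Minimal sketch of calling this model through the Replicate Python client.
# NOTE: the input names below are assumed for illustration; consult the
# model's API schema for the real parameter names before relying on them.
import replicate

output = replicate.run(
    "andreasjansson/codegen",  # append ":<version>" to pin a specific version
    input={
        "context": "# Implement a function that computes the square of an integer argument.\n",
        "prepend_context": True,        # prepend the input to the output
        "num_return_sequences": 3,      # 1-10 completions
        "temperature": 0.2,             # 0-1; higher values are more diverse
        "import_numpy": True,           # prepend a numpy import, as in the paper
        "top_p": 0.95,                  # top-p sampling probability
        "max_length": 128,              # 32-2048
        "seed": -1,                     # -1 picks a random seed
        "return_as_string": True,       # one long string instead of a markdown URL
    },
)

# With the string output option, completions are separated by "======".
for snippet in output.split("======"):
    if snippet.strip():
        print(snippet.strip(), end="\n\n")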

Output

Depending on the options above, the model returns either a single long string of generated completions or a markdown URL that can be downloaded.

Run time and cost

This model costs approximately $0.19 to run on Replicate, or 5 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 136 seconds. The predict time for this model varies significantly based on the inputs.

Readme

CodeGen

This is an unofficial implementation not affiliated with Salesforce.

This model samples from the codegen-6B-mono checkpoint, which is trained on Python.

arXiv: A Conversational Paradigm for Program Synthesis

Authors: Erik Nijkamp*, Bo Pang*, Hiroaki Hayashi*, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, and Caiming Xiong (* indicates equal contribution)
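
To sample from this checkpoint locally rather than through Replicate, a minimal sketch with Hugging Face transformers follows. It assumes the Salesforce/codegen-6B-mono checkpoint published on the Hugging Face Hub (Salesforce/codegen-350M-mono is a lighter choice for a quick test) rather than the raw tar.gz checkpoints listed under "Download the model parameters" below.

# Minimal local-sampling sketch with Hugging Face transformers.
# Assumes the Salesforce/codegen-6B-mono checkpoint on the Hugging Face Hub;
# swap in Salesforce/codegen-350M-mono for a lighter test run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/codegen-6B-mono"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16 if device == "cuda" else torch.float32
).to(device)

context = "# Implement a function that computes the square of an integer argument.\n"
inputs = tokenizer(context, return_tensors="pt").to(device)

with torch.no_grad():
    generated = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.2,
        top_p=0.95,
        max_length=128,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(generated[0], skip_special_tokens=True))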

Released models

The models are named in the following format:

codegen-{model-size}-{data}

model-size has 4 options: 350M, 2B, 6B, 16B, which represent the number of parameters in each model.

data has 3 options: nl, multi, mono.

  • nl models are randomly initialized and trained on The Pile, an 825.18 GB English text corpus.
  • multi models are initialized from nl models and then trained on a corpus with code data consisting of multiple programming languages.
  • mono models are initialized from multi models and then trained on a corpus with Python code data.

Download the model parameters

codegen-350M-nl,multi,mono

https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-350M-nl.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-350M-multi.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-350M-mono.tar.gz

codegen-2B-nl,multi,mono

https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-2B-nl.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-2B-multi.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-2B-mono.tar.gz

codegen-6B-nl,multi,mono

https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-6B-nl.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-6B-multi.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-6B-mono.tar.gz

codegen-16B-nl,multi,mono

https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-16B-nl.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-16B-multi.tar.gz
https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-16B-mono.tar.gz
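
The sketch below downloads and unpacks one of these checkpoints using only the URL pattern shown above and the Python standard library; the name of the extracted directory is assumed to match the archive name.

# Download and unpack a CodeGen checkpoint from the URLs listed above.
# URLs follow the pattern codegen-{model-size}-{data}.tar.gz.
import tarfile
import urllib.request
from pathlib import Path

BASE = "https://storage.googleapis.com/sfr-codegen-research/checkpoints"

def download_checkpoint(model_size="350M", data="mono", dest=Path("checkpoints")):
    name = f"codegen-{model_size}-{data}"
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{name}.tar.gz"
    urllib.request.urlretrieve(f"{BASE}/{name}.tar.gz", archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    # Assumes the archive extracts to a directory named after the checkpoint.
    return dest / name

if __name__ == "__main__":
    print(download_checkpoint("350M", "mono"))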

Citation

If you find our code or paper useful, please cite the paper:

@article{Nijkamp2022ACP,
  title={A Conversational Paradigm for Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint},
  year={2022}
}

License

Our code is BSD-3 licensed. See LICENSE.txt for details.