andreasjansson / codegen

An open-source model for program synthesis. Competitive with OpenAI Codex.


Run time and cost

Predictions run on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 136 seconds. The predict time for this model varies significantly based on the inputs.


This is an unofficial implementation not affiliated with Salesforce.

Sample from the codegen-6B-mono checkpoint, trained on Python.
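As a sketch of what sampling from this checkpoint can look like: the CodeGen checkpoints are also published on the Hugging Face Hub, so one way to reproduce the demo locally is via the `transformers` library. The `Salesforce/codegen-6B-mono` identifier and the helper names below are assumptions for illustration, not part of this repo:

```python
def sample_completion(prompt: str,
                      checkpoint: str = "Salesforce/codegen-6B-mono",
                      max_new_tokens: int = 64,
                      temperature: float = 0.2) -> str:
    """Sample a completion for `prompt` from a CodeGen checkpoint."""
    # Lazy import: `transformers` is a heavy dependency, and the
    # truncation helper below does not need it.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=temperature,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)


def truncate_completion(text: str) -> str:
    """Heuristic post-processing: keep only the first generated block by
    cutting at the first blank line, since a causal LM keeps writing
    past the end of the function it was asked for."""
    return text.split("\n\n")[0]
```

Smaller checkpoints (e.g. 350M) follow the same naming scheme and are a cheaper way to try the pipeline before pulling the 6B weights.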

arXiv: A Conversational Paradigm for Program Synthesis

Authors: Erik Nijkamp*, Bo Pang*, Hiroaki Hayashi*, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, and Caiming Xiong (* indicates equal contribution)

Released models

The models are named in the following format:

  codegen-{model-size}-{data}

model-size has 4 options: 350M, 2B, 6B, 16B, which represent the number of parameters in each model.

data has 3 options: nl, multi, mono.

  • nl models are randomly initialized and trained on The Pile, an 825.18 GB English text corpus.
  • multi models are initialized from nl models and then trained on a corpus with code data consisting of multiple programming languages.
  • mono models are initialized from multi models and then trained on a corpus with Python code data.
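The naming scheme above can be sketched as a small helper; the constant and function names here are illustrative, not from the repo:

```python
MODEL_SIZES = ("350M", "2B", "6B", "16B")   # number of parameters
DATA_VARIANTS = ("nl", "multi", "mono")     # training-data stage

def checkpoint_name(model_size: str, data: str) -> str:
    """Build a checkpoint name like `codegen-6B-mono` from the two axes."""
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model_size!r}")
    if data not in DATA_VARIANTS:
        raise ValueError(f"unknown data variant: {data!r}")
    return f"codegen-{model_size}-{data}"

# The 4 sizes x 3 data variants give 12 released checkpoints in total.
ALL_CHECKPOINTS = [checkpoint_name(s, d)
                   for s in MODEL_SIZES for d in DATA_VARIANTS]
```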

Download the model parameters






If you find our code or paper useful, please cite the paper:

@article{Nijkamp2022ACP,
  title={A Conversational Paradigm for Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint},
  year={2022}
}


Our code is BSD-3 licensed. See LICENSE.txt for details.