Readme
This is a Cog implementation of the "openbuddy-llemma-34b" model with 4-bit quantization.
See OpenBuddy's repo (https://github.com/openbuddy/openbuddy) and EleutherAI's official Llemma site (https://blog.eleuther.ai/llemma/).
This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 6 seconds, though predict time varies significantly with the inputs.