A quick setup of gemma-3-4b with INT8 weight-only quantization and sparsity for efficient inference.
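To make the two techniques concrete, here is a minimal, self-contained sketch of what INT8 weight-only quantization and magnitude-based sparsity do to a weight matrix. This is an illustration in plain NumPy, not the actual gemma-3-4b pipeline: the function names (`quantize_int8`, `prune_magnitude`) and the per-row symmetric scheme are assumptions for demonstration, and real deployments would use a quantization library against the full model.

```python
import numpy as np

def prune_magnitude(w, sparsity=0.5):
    """Zero out the smallest-magnitude weights until `sparsity` of entries are zero."""
    k = int(w.size * sparsity)
    thresh = np.sort(np.abs(w), axis=None)[k]  # k-th smallest magnitude
    return np.where(np.abs(w) < thresh, 0.0, w)

def quantize_int8(w):
    """Symmetric per-row INT8 quantization: int8 weights plus fp32 scales."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate fp32 weights from int8 values and scales."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)  # stand-in weight matrix

w_sparse = prune_magnitude(w, sparsity=0.5)   # 50% of entries zeroed
q, scale = quantize_int8(w_sparse)            # weights stored as int8 (4x smaller than fp32)
w_hat = dequantize(q, scale)                  # dequantized view used at matmul time
```

Weight-only quantization keeps activations in floating point and only stores the weights as INT8, so accuracy loss per element is bounded by half a quantization step (`scale / 2` per row); sparsity then lets kernels skip the zeroed entries entirely.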