paragekbote/gemma3-torchao-quant-sparse
A quick setup of gemma-3-4b with INT8 weight-only quantization and sparsity for efficient inference.
- Author: @paragekbote
- Version: cuda12.1-python3.10-X64
- 396049cb (Latest)