## Provided Quants
(sorted by size, not necessarily quality; IQ-quants are often preferable to similarly sized non-IQ quants)
| Link | Type | Size/GB | Notes |
|---|---|---|---|
| GGUF | Q2_K | 1.2 | |
| GGUF | Q3_K_S | 1.4 | |
| GGUF | Q3_K_M | 1.5 | lower quality |
| GGUF | Q3_K_L | 1.6 | |
| GGUF | IQ4_XS | 1.6 | |
| GGUF | Q4_K_S | 1.7 | fast, recommended |
| GGUF | Q4_K_M | 1.8 | fast, recommended |
| GGUF | Q5_K_S | 2.0 | |
| GGUF | Q5_K_M | 2.1 | |
| GGUF | Q6_K | 2.4 | very good quality |
| GGUF | Q8_0 | 3.1 | fast, best quality |
| GGUF | f16 | 5.7 | 16 bpw, overkill |
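The size column maps roughly onto bits per weight (bpw). Taking the f16 file (16 bpw, 5.7 GB) as a baseline, a short sketch estimates the effective bpw of the other quants. The derived parameter count (~2.85B) is an assumption: real GGUF files also carry metadata and a few unquantized tensors, so these figures are approximate.

```python
# Rough bits-per-weight estimate per quant, derived from the table above.
# Assumption: the f16 file (5.7 GB at 16 bpw) determines the parameter count.

F16_GB = 5.7
PARAMS = F16_GB * 1e9 / 2  # 2 bytes per weight at f16 -> ~2.85B parameters

def approx_bpw(size_gb: float) -> float:
    """Estimate bits per weight from a quant's file size in GB."""
    return size_gb * 1e9 * 8 / PARAMS

for name, gb in [("Q2_K", 1.2), ("Q4_K_M", 1.8), ("Q6_K", 2.4), ("Q8_0", 3.1)]:
    print(f"{name}: ~{approx_bpw(gb):.1f} bpw")
# Q2_K: ~3.4 bpw, Q4_K_M: ~5.1 bpw, Q6_K: ~6.7 bpw, Q8_0: ~8.7 bpw
```

The estimates line up with the nominal bpw of each quant type, which is a quick sanity check that the listed file sizes are consistent with a ~3B-parameter base model.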
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
## Model tree for sujalrajpoot/Jarvis-3B-GGUF
- Base model: sujalrajpoot/Jarvis-3B