Update README.md
README.md
CHANGED
@@ -25,7 +25,7 @@ This optimization reduces the number of bits per parameter 4/8, significantly re
 ## Use with SGLANG
 This model can be deployed efficiently using the SGLANG backend with only H200x4, as shown in the example below.
 ```bash
-python -m sglang.launch_server --model novita/Deepseek-V3-
+python -m sglang.launch_server --model novita/Deepseek-V3.1-W4AFP8 --mem-fraction-static 0.85 --disable-shared-experts-fusion --tp-size 4
 ```
 
 