novita
/

Deepseek-V3.1-W4AFP8

8-bit precision

Model card Files Files and versions

RandomXiong commited on Oct 24, 2025

Commit

d41150b

·

verified ·

1 Parent(s): aeefe97

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -25,7 +25,7 @@ This optimization reduces the number of bits per parameter 4/8, significantly re
 ## Use with SGLANG
 This model can be deployed efficiently using the SGLANG backend with only H200x4, as shown in the example below.
 ```bash
-python -m sglang.launch_server --model novita/Deepseek-V3-0324-W4AFP8 --mem-fraction-static 0.85 --disable-shared-experts-fusion --tp-size 4
 ```

 ## Use with SGLANG
 This model can be deployed efficiently using the SGLANG backend with only H200x4, as shown in the example below.
 ```bash
+python -m sglang.launch_server --model novita/Deepseek-V3.1-W4AFP8  --mem-fraction-static 0.85 --disable-shared-experts-fusion --tp-size 4
 ```