Perplexity table

#6
by Nexesenex - opened

Hey Ubergarm.

Could you share a full perplexity run with each model you publish, so I can see the intermediate values that lead up to the final one and calibrate my own tests against the corresponding number of chunks?

Heya @Nexesenex, so the goal is that if you had the intermediate chunk perplexity scores, you could test your own quants without waiting for the full run?

It's a bit of work to go publish all of those in an organized fashion. I could probably zip up a bunch of my logs if you wanted to dig through them yourself?
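
In the meantime, here is a rough sketch of how you could compare a reference log against a short run of your own quant at the same chunk index. The log file names are hypothetical, and it assumes the usual llama-perplexity output where the running estimate is printed as [chunk]value, per chunk:

$ # compare the running PPL estimate at, say, chunk 100 in both logs
$ grep -oE '\[100\][0-9]+\.[0-9]+' reference-quant.ppl.log
$ grep -oE '\[100\][0-9]+\.[0-9]+' my-quant.ppl.log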

As for my workflow, it has changed a bit over time depending on which rig I had access to and on changes to ik_llama.cpp's arguments and features.

Here is a summary of my most recent workflow for calculating perplexity, for reference, taken from this recent post: https://github.com/ikawrakow/ik_llama.cpp/issues/942#issuecomment-3536933398


Can you please share your command to measure perplexity?

Sure, here it is again. Right, I always use the default 512 context and an unquantized f16 KV cache for my published numbers in the charts. And yes, the usual wiki.test.raw file.

$ wget https://huggingface.co/datasets/ikawrakow/validation-datasets-for-llama.cpp/resolve/main/wiki.test.raw.gz
$ gunzip wiki.test.raw.gz
$ ls -lah wiki.test.raw
-rw-rw-r-- 1 w w 1.3M Mar  5  2025 wiki.test.raw
$ sha1sum wiki.test.raw
6f1fe2054a940eebfc76b284b09680763b37f5ea  wiki.test.raw
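
If you want to check the download non-interactively, you can feed that same hash back into sha1sum -c:

$ echo "6f1fe2054a940eebfc76b284b09680763b37f5ea  wiki.test.raw" | sha1sum -c -
wiki.test.raw: OK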

$ numactl -N ${SOCKET} -m ${SOCKET} \
./build/bin/llama-perplexity \
    -m "$model" \
    -f wiki.test.raw \
    --seed 1337 \
    -mla 3 \
    --ctx-size 512 \
    -ub 4096 -b 4096 \
    --numa numactl \
    --threads 96 \
    --threads-batch 128 \
    --no-mmap

The seed does nothing here; it is just for fun. I don't think you need -mla 3 anymore as that is the default now. I specify the context size just to be explicit, but 512 is the default value. You can adjust the batch size as needed for your rig (generally I avoid going over 4096) and it doesn't affect the results. Of course adjust threads, offload, and the rest as desired.
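
If you're comparing several quants of the same model, a minimal sketch (the model paths here are hypothetical) is to loop the same invocation and tee each run into its own log:

$ for model in /models/MyModel-IQ4_KS.gguf /models/MyModel-Q8_0.gguf; do
      ./build/bin/llama-perplexity \
          -m "$model" \
          -f wiki.test.raw \
          --ctx-size 512 \
          -ub 4096 -b 4096 \
          --threads 96 \
          --no-mmap 2>&1 | tee "$(basename "$model").ppl.log"
  done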


Hopefully this is enough to keep you going for now: https://ubergarm.com/images/various-perplexity-logs-2025-11-16.zip - I'll probably delete this file within a few days.
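
If you just want the headline numbers out of those, something like this should do it (assuming the logs end with the usual "Final estimate: PPL = ..." line; adjust the glob to whatever the files are actually named):

$ unzip various-perplexity-logs-2025-11-16.zip -d ppl-logs
$ grep -H "Final estimate" ppl-logs/*.log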
