How to use Large/Gundam mode with vLLM online serving?

#100
by prudant

Hi! I'm successfully using DeepSeek-OCR with vLLM's OpenAI-compatible API, but I can't figure out how to configure the image resolution mode (Large, Gundam, Base, etc.).
In the official demo (app.py), you pass base_size=1280, image_size=1280, crop_mode=False (Large mode) to model.infer(). But vLLM doesn't call model.infer(); it runs its own inference pipeline.
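For reference, this is roughly the demo path I mean. The base_size/image_size/crop_mode values are from app.py; the prompt, file paths, and the Gundam values in the comment are my reading of the demo, so treat them as placeholders:

```python
# Sketch of the demo's transformers-based path (not vLLM). Parameter names
# come from app.py; the prompt and paths below are placeholders.
from transformers import AutoModel, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-OCR"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True).eval().cuda()

# Large mode: base_size=1280, image_size=1280, crop_mode=False.
# (If I read the demo right, Gundam would be base_size=1024, image_size=640, crop_mode=True.)
model.infer(
    tokenizer,
    prompt="<image>\nFree OCR.",  # placeholder prompt
    image_file="page.png",        # placeholder input
    output_path="out/",           # placeholder output dir
    base_size=1280,
    image_size=1280,
    crop_mode=False,
)
```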

Questions:
1. Is there a way to configure these parameters when using vLLM?
2. Does vLLM's default processing use candidate_resolutions: [[1024, 1024]] from processor_config.json?
3. Would you consider adding support for dynamic resolution modes in the processor?
I've tried --mm-processor-kwargs with various parameters but get errors.
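For example, attempts along these lines fail. The JSON keys here are just guesses mirroring app.py's parameters; I don't know which keys, if any, the processor actually accepts:

```
vllm serve deepseek-ai/DeepSeek-OCR \
  --trust-remote-code \
  --mm-processor-kwargs '{"base_size": 1280, "image_size": 1280, "crop_mode": false}'
```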
