---
title: Docker Model Runner
emoji: 🐳
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
suggested_hardware: cpu-basic
pinned: false
---

# Docker Model Runner

**Anthropic API Compatible** with **Interleaved Thinking** support.

## Hardware

- **CPU Basic**: 2 vCPU · 16 GB RAM

## Quick Start

```bash
pip install anthropic
export ANTHROPIC_BASE_URL=https://likhonsheikhdev-docker-model-runner.hf.space
export ANTHROPIC_API_KEY=any-key
```

```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="MiniMax-M2",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Hi, how are you?"}]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking:\n{block.thinking}\n")
    elif block.type == "text":
        print(f"Text:\n{block.text}\n")
```

## Interleaved Thinking

Enable thinking to receive reasoning steps interleaved with the response:

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://likhonsheikhdev-docker-model-runner.hf.space"
)

message = client.messages.create(
    model="MiniMax-M2",
    max_tokens=1024,
    thinking={
        "type": "enabled",
        "budget_tokens": 200
    },
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Response contains interleaved thinking and text blocks
for block in message.content:
    if block.type == "thinking":
        print(f"💭 Thinking: {block.thinking}")
    elif block.type == "text":
        print(f"📝 Response: {block.text}")
```

## Streaming with Thinking

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://likhonsheikhdev-docker-model-runner.hf.space"
)

with client.messages.stream(
    model="MiniMax-M2",
    max_tokens=1024,
    thinking={"type": "enabled", "budget_tokens": 100},
    messages=[{"role": "user", "content": "Hello!"}]
) as stream:
    for event in stream:
        if hasattr(event, 'type'):
            if event.type == 'content_block_start':
                print(f"\n[{event.content_block.type}]", end=" ")
            elif event.type == 'content_block_delta':
                if hasattr(event.delta, 'thinking'):
                    print(event.delta.thinking, end="")
                elif hasattr(event.delta, 'text'):
                    print(event.delta.text, end="")
```

## Multi-Turn with Thinking History

**Important**: In multi-turn conversations, append the complete model response (including thinking blocks) to maintain reasoning-chain continuity.

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://likhonsheikhdev-docker-model-runner.hf.space"
)

messages = [{"role": "user", "content": "What is 2+2?"}]

# First turn
response = client.messages.create(
    model="MiniMax-M2",
    max_tokens=1024,
    thinking={"type": "enabled", "budget_tokens": 100},
    messages=messages
)

# Append full response (including thinking) to history
messages.append({
    "role": "assistant",
    "content": response.content  # Includes both thinking and text blocks
})

# Second turn
messages.append({"role": "user", "content": "Now multiply that by 3"})

response2 = client.messages.create(
    model="MiniMax-M2",
    max_tokens=1024,
    thinking={"type": "enabled", "budget_tokens": 100},
    messages=messages
)
```

## Supported Models

| Model | Description |
|-------|-------------|
| MiniMax-M2 | Agentic capabilities, advanced reasoning |
| MiniMax-M2-Stable | High concurrency and commercial use |

## API Compatibility

### Parameters

| Parameter | Status |
|-----------|--------|
| model | ✅ Fully supported |
| messages | ✅ Partial (text, tool calls) |
| max_tokens | ✅ Fully supported |
| stream | ✅ Fully supported |
| system | ✅ Fully supported |
| temperature | ✅ Range (0.0, 1.0] |
| thinking | ✅ Fully supported |
| thinking.budget_tokens | ✅ Fully supported |
| tools | ✅ Fully supported |
| tool_choice | ✅ Fully supported |
| top_p | ✅ Fully supported |
| metadata | ✅ Fully supported |
| top_k | ⚪ Ignored |
| stop_sequences | ⚪ Ignored |

### Message Types

| Type | Status |
|------|--------|
| text | ✅ Supported |
| thinking | ✅ Supported |
| tool_use | ✅ Supported |
| tool_result | ✅ Supported |
| image | ❌ Not supported |
| document | ❌ Not supported |

## Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/messages` | POST | Anthropic Messages API |
| `/v1/chat/completions` | POST | OpenAI Chat API |
| `/v1/models` | GET | List models |
| `/health` | GET | Health check |
| `/info` | GET | API info |

## cURL Example

```bash
curl -X POST https://likhonsheikhdev-docker-model-runner.hf.space/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: any-key" \
  -d '{
    "model": "MiniMax-M2",
    "max_tokens": 1024,
    "thinking": {"type": "enabled", "budget_tokens": 100},
    "messages": [
      {"role": "user", "content": "Explain AI briefly"}
    ]
  }'
```