Bringing Math to Life: Building StepWise Math for the MCP Hackathon

Community Article Published November 27, 2025

Math shouldn't just be solved; it should be experienced.

That was the core idea behind StepWise Math, my submission for Track 1 (Building MCP) of the Model Context Protocol (MCP) 1st Birthday Hackathon.

We’ve all been there: staring at a static textbook problem, trying to visualize how a triangle transforms or how a probability distribution shifts. Calculators give you the answer, but they don't give you the insight.

StepWise Math changes that. It is an AI-powered application that transforms static text, images, or URLs into living, interactive HTML5 visual proofs. And thanks to the new MCP support in Gradio, it's not just a web app—it's a fully compliant MCP Server that any AI agent can talk to.

The Challenge: "Show, Don't Just Tell"

The goal was to build a tool that acts as a "Digital Montessori" for students in grades 6-10. If a student asks, "Why is the sum of angles in a triangle 180 degrees?", the AI shouldn't just output text. It should generate a mini-app where the student can drag the vertices of a triangle and watch the angles update in real time.
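To make the triangle example concrete, here is a minimal Python sketch of the invariant such a mini-app would let a student discover. It is only an illustration of the math, not code from StepWise Math; the function name `interior_angles` and the sample coordinates are my own assumptions.

```python
import math


def interior_angles(a, b, c):
    """Return the interior angles (in degrees) at vertices a, b and c."""

    def angle_at(p, q, r):
        # Angle at p between the rays p->q and p->r.
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.degrees(math.atan2(abs(cross), dot))

    return angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)


# Simulate "dragging" the top vertex to a few different positions.
for top in [(2.0, 3.0), (5.0, 1.0), (-1.0, 4.0)]:
    angles = interior_angles((0.0, 0.0), (4.0, 0.0), top)
    print([round(x, 1) for x in angles], "sum =", round(sum(angles), 6))
```

The individual angles change with every position of the top vertex, but the sum stays at 180 degrees (up to floating-point rounding), which is exactly what the draggable version is meant to show.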

To achieve this, I needed an architecture that could "think" like a teacher and "code" like an engineer.

Under the Hood: The Two-Stage AI Pipeline

Building this required a robust pipeline that could finish within the request timeouts that MCP clients enforce while still delivering high-quality code. I used a Two-Stage Architecture:

Stage 1: The Architect (Gemini 2.5 Flash)

First, the system analyzes the user's input (whether it's a text query, a photo of a textbook, or a web page URL). I used Google Gemini 2.5 Flash for this stage because of its low latency.

Role: Analyzes the math concept.
Output: A JSON "MathSpec"—a pedagogical blueprint defining the steps, visual elements, and learning goals.
Time: ~10-15 seconds.
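To give a feel for what a "MathSpec" might look like, here is a hypothetical sketch in Python. The real schema is not shown in this article, so every field name below (`concept`, `grade_range`, `learning_goals`, `steps`, and the fields of `Step`) is an assumption chosen to match the description above: steps, visual elements, and learning goals.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Step:
    # One "slide" of the proof. Field names are illustrative assumptions.
    title: str
    explanation: str
    interactive_elements: list = field(default_factory=list)


@dataclass
class MathSpec:
    # Hypothetical blueprint shape; the actual MathSpec schema may differ.
    concept: str
    grade_range: str
    learning_goals: list
    steps: list


def parse_math_spec(raw: str) -> MathSpec:
    """Parse model output into a MathSpec, rejecting specs without steps."""
    data = json.loads(raw)
    steps = [Step(**item) for item in data["steps"]]
    if not steps:
        raise ValueError("a MathSpec needs at least one step")
    return MathSpec(
        concept=data["concept"],
        grade_range=data["grade_range"],
        learning_goals=list(data["learning_goals"]),
        steps=steps,
    )


example = MathSpec(
    concept="Angle sum of a triangle",
    grade_range="6-10",
    learning_goals=["See that the interior angles always add up to 180 degrees"],
    steps=[
        Step(
            title="Drag a vertex",
            explanation="Move any corner and watch the three angles update.",
            interactive_elements=["draggable_vertex", "angle_labels"],
        )
    ],
)

raw_json = json.dumps(asdict(example), indent=2)
print(raw_json)
```

Parsing the model's JSON into a typed object like this is one way to catch a malformed blueprint before the slower build stage starts.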

Stage 2: The Builder (Gemini 3.0 Pro)

Next, the "MathSpec" is passed to Gemini 3.0 Pro. This model is the heavy lifter.

Role: Takes the JSON blueprint and writes a complete, self-contained HTML5/Canvas application.
Output: Interactive code with sliders, drag events, and step-by-step navigation.
Time: ~60-100 seconds.
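"Self-contained" is a property you can check mechanically. The sketch below is not part of StepWise Math as far as this article describes; it is one possible sanity check, using only Python's standard library, that flags tags in the generated HTML which load something from an external URL. The helper name `find_external_refs` and the set of tags it inspects are assumptions.

```python
from html.parser import HTMLParser


class ExternalRefFinder(HTMLParser):
    """Collect (tag, url) pairs for resources loaded from another host."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("script", "img", "iframe"):
            url = attributes.get("src")
        elif tag == "link":
            url = attributes.get("href")
        else:
            url = None
        # Absolute and protocol-relative URLs point outside the document.
        if url and url.startswith(("http://", "https://", "//")):
            self.external.append((tag, url))


def find_external_refs(html: str) -> list:
    finder = ExternalRefFinder()
    finder.feed(html)
    return finder.external


self_contained = (
    "<html><body><canvas id='c'></canvas>"
    "<script>var angleSum = 180;</script></body></html>"
)
with_cdn = '<html><head><script src="https://cdn.example.com/lib.js"></script></head></html>'

print(find_external_refs(self_contained))
print(find_external_refs(with_cdn))
```

An empty list for the first document and a single `script` entry for the second is the behavior to expect; a build that returns external references could be sent back to the model for another pass.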

Splitting the task means the "thinking" is captured in a spec that can be checked before a single line of code is written.
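The control flow of the two stages can be sketched in a few lines. This is a simplified illustration, not the app's actual code: the model calls are replaced by plain Python callables (`fake_architect` and `fake_builder` are stand-ins I made up), and the only validation shown is that the spec parses as JSON and contains steps.

```python
import json
from typing import Callable


def run_pipeline(
    user_input: str,
    architect: Callable[[str], str],
    builder: Callable[[str], str],
) -> str:
    """Stage 1 produces a JSON spec; Stage 2 only runs if the spec is usable."""
    raw_spec = architect(user_input)
    spec = json.loads(raw_spec)  # raises ValueError on malformed JSON
    if not isinstance(spec, dict) or not spec.get("steps"):
        raise ValueError("spec has no steps; skipping the slow build stage")
    return builder(json.dumps(spec))


def fake_architect(question: str) -> str:
    # Stand-in for the fast "Architect" model call.
    return json.dumps(
        {"concept": question, "steps": ["Draw a triangle", "Measure the angles"]}
    )


def fake_builder(spec_json: str) -> str:
    # Stand-in for the slower "Builder" model call.
    spec = json.loads(spec_json)
    items = "".join(f"<li>{step}</li>" for step in spec["steps"])
    return (
        "<!DOCTYPE html><html><body>"
        f"<h1>{spec['concept']}</h1><ol>{items}</ol>"
        "</body></html>"
    )


html = run_pipeline("Angle sum of a triangle", fake_architect, fake_builder)
print(html)
```

The point of the shape is that a bad blueprint fails in seconds, before the 60-100 second build is ever started.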

Entering the Matrix: MCP Integration

This project isn't just a standalone website; it's an MCP Server.

Using Gradio 6.0+, I was able to expose the application's internal functions as MCP tools with a single flag (mcp_server=True, passed to launch()). In addition to MCP tools, I also added Prompts and Resources. This means you can connect Claude Desktop, VS Code, or any other MCP client directly to StepWise Math.
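Here is a minimal sketch of what that flag looks like in practice. The tool itself (`count_proof_steps`) is a hypothetical toy and not one of StepWise Math's real functions; it assumes Gradio is installed with its MCP extra (`pip install "gradio[mcp]"`). Gradio uses the function name, type hints, and docstring to describe the tool to MCP clients.

```python
import json


def count_proof_steps(spec_json: str) -> int:
    """Count the steps in a MathSpec-like JSON document.

    Args:
        spec_json: JSON text with a top-level "steps" list.

    Returns:
        The number of entries in "steps".
    """
    spec = json.loads(spec_json)
    return len(spec.get("steps", []))


def main() -> None:
    # Imported here so the tool function above can be used and tested
    # without Gradio installed.
    import gradio as gr

    demo = gr.Interface(
        fn=count_proof_steps,
        inputs=gr.Textbox(label="MathSpec JSON"),
        outputs=gr.Number(label="Steps"),
    )
    # The same app serves the web UI and, with this flag, an MCP server.
    demo.launch(mcp_server=True)
```

Calling `main()` starts the web UI and the MCP endpoint together, so there is no separate server process to maintain.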

The Exposed Tools, Prompts & Resources:

The server exposes a rich set of capabilities that allow AI agents to interact with the application programmatically:

Tools (The Functions)

create_math_specification_from_text: Analyzes a word problem to create a pedagogical JSON blueprint.
create_math_specification_from_image: Extracts mathematical concepts from diagrams or textbook photos.
create_math_specification_from_url: Processes web content (like Khan Academy links) into learning steps.
build_interactive_proof_from_specification: Synthesizes the final interactive HTML5 application from a JSON blueprint.
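Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request in Python. The tool name comes from the list above, but the argument name `text` is an assumption, and a deployed server may expose tool names in a slightly different form, so a real client should first ask the server for its tool list and input schemas.

```python
import json
from itertools import count

# Each JSON-RPC request needs its own id so replies can be matched to it.
_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a plain dictionary."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


spec_request = tool_call_request(
    "create_math_specification_from_text",
    # "text" is a guessed parameter name, used here for illustration only.
    {"text": "Why is the sum of angles in a triangle 180 degrees?"},
)
print(json.dumps(spec_request, indent=2))
```

In practice an MCP client library builds and sends these messages for you; seeing the raw shape mostly helps when debugging a connection.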

Prompts (The Workflows)

create_visual_math_proof: A guided conversation starter that orchestrates the entire "Teacher -> Engineer" pipeline for you.

Resources (The Data)

stepwise://specification-template: Access standardized JSON templates for creating new math specifications.
stepwise://example-pythagorean: Retrieve full example data to understand the system's inputs and outputs.
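Resources are read by URI. As a rough illustration of the idea, the sketch below serves one of the URIs above from an in-memory table and wraps it in the `contents` shape that an MCP `resources/read` result uses. The template body is a placeholder I invented, not the real template, and `read_resource` is a hypothetical helper rather than the server's actual handler.

```python
import json

# Placeholder content: the real template's fields are not shown in this article.
RESOURCES = {
    "stepwise://specification-template": {
        "concept": "",
        "learning_goals": [],
        "steps": [],
    },
}


def read_resource(uri: str) -> dict:
    """Return a `resources/read`-style result for a known URI."""
    if uri not in RESOURCES:
        raise KeyError(f"unknown resource: {uri}")
    return {
        "contents": [
            {
                "uri": uri,
                "mimeType": "application/json",
                "text": json.dumps(RESOURCES[uri]),
            }
        ]
    }


result = read_resource("stepwise://specification-template")
template = json.loads(result["contents"][0]["text"])
print(sorted(template))
```

An agent can fetch the template this way, fill it in, and pass the result to the build tool without guessing the expected structure.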

Imagine an AI agent in your IDE that can not only "see" a math problem but also pull up the correct templates, generate a visual proof, and verify it against known examples—all without leaving the chat interface. That's what this server enables.

The "Vibe Coding" Experience

This project was a testament to the power of Vibe Coding. I didn't write every line of boilerplate myself. Instead, I leaned heavily on GitHub Copilot and a robust Product Requirements Document (PRD).

The workflow was surprisingly fluid:

Write the PRD: I defined the "Soul" of the app in a detailed Markdown file (User flow, data models, constraints).
Prompt Copilot Agent: "Build StepMath application using the PRD.md file"
Iterate: I focused on the high-level architecture and the core functionality of the application, while the AI handled the implementation details of the Gradio components.

It was a huge learning experience in Spec-Driven Development. When the LLM knows exactly what you want (thanks to the PRD), the code generation is nearly flawless.

Key Features

Multi-Modal Input: Snap a picture of homework, paste a website link, or just type a question.
Interactive Canvas: Drag points, slide variables, and see the math change instantly.
Step-by-Step Guidance: The app breaks complex proofs into digestible "slides."
Feedback Loop: Don't like the color? Want to add a label? Just tell the AI, and it refines the app in seconds.

What's Next?

This hackathon was just the beginning. The roadmap includes:

Persistent Library: Saving your favorite proofs to a personal collection.
Teacher Mode: Exporting proofs as standalone files for classroom use.

A huge thank you to Hugging Face for hosting this hackathon and to the Google Gemini team for the powerful models that made this possible.

Go check out the Space and let me know what you think!

StepWise Math AI on Hugging Face

#MCPHackathon #BuildWithGemini #Gradio #AI #EdTech #Math
