VibecoderMcSwaggins committed on
Commit 7b20f5d · 1 Parent(s): 8625ded

docs: add comprehensive hackathon documentation and integration plans

- Introduced detailed priority summaries for hackathon tasks, including deadlines and current stack status.
- Documented requirements for Track 2 (MCP in Action) and outlined necessary actions for integration.
- Added specific integration plans for MCP server and Modal, including implementation options and demo scripts.
- Created submission checklists and prize opportunity analyses to guide project completion.

Files added:
- docs/pending/00_priority_summary.md
- docs/pending/01_hackathon_requirements.md
- docs/pending/02_mcp_server_integration.md
- docs/pending/03_modal_integration.md

docs/pending/00_priority_summary.md ADDED
@@ -0,0 +1,113 @@
# DeepCritical Hackathon Priority Summary

## 4 Days Left (Deadline: Nov 30, 2025 11:59 PM UTC)

---

## Git Contribution Analysis

```
The-Obstacle-Is-The-Way: 20+ commits (Phases 1-11, all demos, all fixes)
MarioAderman: 3 commits (Modal, LlamaIndex, PubMed fix)
JJ (Maintainer): 0 code commits (merge button only)
```

**Conclusion:** You built 90%+ of this codebase.

---

## Current Stack (What We Have)

| Component | Status | Files |
|-----------|--------|-------|
| PubMed Search | ✅ Working | `src/tools/pubmed.py` |
| ClinicalTrials Search | ✅ Working | `src/tools/clinicaltrials.py` |
| bioRxiv Search | ✅ Working | `src/tools/biorxiv.py` |
| Search Handler | ✅ Working | `src/tools/search_handler.py` |
| Embeddings/ChromaDB | ✅ Working | `src/services/embeddings.py` |
| LlamaIndex RAG | ✅ Working | `src/services/llamaindex_rag.py` |
| Hypothesis Agent | ✅ Working | `src/agents/hypothesis_agent.py` |
| Report Agent | ✅ Working | `src/agents/report_agent.py` |
| Judge Agent | ✅ Working | `src/agents/judge_agent.py` |
| Orchestrator | ✅ Working | `src/orchestrator.py` |
| Gradio UI | ✅ Working | `src/app.py` |
| Modal Code Execution | ⚠️ Built, not wired | `src/tools/code_execution.py` |
| **MCP Server** | ❌ **MISSING** | Need to create |

---

## What's Required for Track 2 (MCP in Action)

| Requirement | Have It? | Priority |
|-------------|----------|----------|
| Autonomous agent behavior | ✅ Yes | - |
| Must use MCP servers as tools | ❌ **NO** | **P0** |
| Must be Gradio app | ✅ Yes | - |
| Planning/reasoning/execution | ✅ Yes | - |

**Bottom Line:** Without an MCP server, we're potentially disqualified from Track 2.

---

## 3 Things To Do (In Order)

### 1. MCP Server (P0 - Required)
- **File:** `src/mcp_server.py`
- **Time:** 2-4 hours
- **Doc:** `02_mcp_server_integration.md`
- **Why:** Required for Track 2. No MCP = no entry.

### 2. Modal Wiring (P1 - $2,500 Prize)
- **File:** Update `src/agents/analysis_agent.py`
- **Time:** 2-3 hours
- **Doc:** `03_modal_integration.md`
- **Why:** The Modal Innovation Award is $2,500.

### 3. Demo Video + Submission (P0 - Required)
- **Time:** 1-2 hours
- **Why:** Required for all submissions

---

## Submission Checklist

- [ ] Space in MCP-1st-Birthday org
- [ ] Tag: `mcp-in-action-track-enterprise`
- [ ] Social media post link
- [ ] Demo video (1-5 min)
- [ ] MCP server working
- [ ] All tests passing

---

## Prize Math

| Award | Amount | Eligible? |
|-------|--------|-----------|
| Track 2 1st Place | $2,500 | If MCP works |
| Modal Innovation | $2,500 | If Modal wired |
| LlamaIndex | $1,000 | Yes (have it) |
| Community Choice | $1,000 | Maybe |
| **Total Potential** | **$7,000** | With MCP + Modal |

---

## Next Actions

```bash
# 1. Read MCP integration doc
cat docs/pending/02_mcp_server_integration.md

# 2. Create MCP server
#    (implement based on doc)

# 3. Test MCP works
uv run python src/mcp_server.py

# 4. Wire Modal into pipeline
#    (see 03_modal_integration.md)

# 5. Record demo video

# 6. Submit to MCP-1st-Birthday org
```
docs/pending/01_hackathon_requirements.md ADDED
@@ -0,0 +1,97 @@
# MCP's 1st Birthday Hackathon - Requirements Analysis

## Deadline: November 30, 2025 11:59 PM UTC

---

## Track Selection: MCP in Action (Track 2)

DeepCritical fits **Track 2: MCP in Action** - AI agent applications.

### Required Tags (pick one)
```yaml
tags:
  - mcp-in-action-track-enterprise  # Drug repurposing = enterprise/healthcare
  # OR
  - mcp-in-action-track-consumer    # If targeting patients/consumers
```

### Track 2 Requirements

| Requirement | DeepCritical Status | Action Needed |
|-------------|---------------------|---------------|
| Autonomous agent behavior | ✅ Have it | Search-Judge-Synthesize loop |
| Must use MCP servers as tools | ❌ **MISSING** | Add MCP server wrapper |
| Must be a Gradio app | ✅ Have it | `src/app.py` |
| Planning, reasoning, execution | ✅ Have it | Orchestrator + Judge |
| Context Engineering / RAG | ✅ Have it | LlamaIndex + ChromaDB |

---

## Prize Opportunities

### Current Eligibility vs With MCP Integration

| Award | Prize | Current | With MCP |
|-------|-------|---------|----------|
| MCP in Action (1st) | $2,500 | ✅ Eligible | ✅ STRONGER |
| Modal Innovation | $2,500 | ❌ Not using | ✅ ELIGIBLE (code execution) |
| Blaxel Choice | $2,500 | ❌ Not using | ⚠️ Could integrate |
| LlamaIndex | $1,000 | ✅ Using (Mario's code) | ✅ ELIGIBLE |
| Google Gemini | $10K credits | ❌ Not using | ⚠️ Could add |
| Community Choice | $1,000 | ⚠️ Possible | ✅ Better demo helps |
| **TOTAL POTENTIAL** | | ~$2,500 | **$8,500+** |

---

## Submission Checklist

- [ ] HuggingFace Space in `MCP-1st-Birthday` organization
- [ ] Track tags in Space README.md
- [ ] Social media post link (X, LinkedIn)
- [ ] Demo video (1-5 minutes)
- [ ] All team members registered
- [ ] Original work (Nov 14-30)

---

## Priority Integration Order

### P0 - MUST HAVE (Required for Track 2)
1. **MCP Server Wrapper** - Expose search tools as MCP servers
   - See: `02_mcp_server_integration.md`

### P1 - HIGH VALUE ($2,500 each)
2. **Modal Integration** - Already have the code, need to wire it up
   - See: `03_modal_integration.md`

### P2 - NICE TO HAVE
3. **Blaxel** - MCP hosting platform (if time permits)
4. **Gemini API** - Add as an LLM option for the Google prize

---

## What MCP Actually Means for Us

MCP (Model Context Protocol) is an open standard, introduced by Anthropic, for connecting AI models to tools.

**Current state:**
- We have `PubMedTool`, `ClinicalTrialsTool`, `BioRxivTool`
- They're Python classes with `search()` methods

**What we need:**
- Wrap these as MCP servers
- So Claude Desktop, Cursor, or any MCP client can use them

**Why this matters:**
- Judges will test whether our tools work with Claude Desktop
- No MCP = disqualified from Track 2

---

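Conceptually the wrapper is thin: MCP only needs each `search()` method exposed as a named, described tool. An SDK-free sketch of that mapping (the `PubMedToolStub` class and `TOOLS` registry are illustrative stand-ins, not project code; the real wiring is in `02_mcp_server_integration.md`):

```python
import asyncio


class PubMedToolStub:  # stand-in for src.tools.pubmed.PubMedTool
    async def search(self, query: str, max_results: int = 10) -> list[str]:
        # The real class calls the PubMed API; this stub just echoes.
        return [f"result for {query!r}"]


# An MCP server boils down to a registry of named async tools.
TOOLS = {
    "search_pubmed": PubMedToolStub().search,
}


async def call_tool(name: str, **kwargs) -> list[str]:
    """Dispatch a tool call by name, the way an MCP client would."""
    return await TOOLS[name](**kwargs)


print(asyncio.run(call_tool("search_pubmed", query="metformin alzheimer")))
```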
## Reference Links

- [Hackathon Page](https://huggingface.co/MCP-1st-Birthday)
- [MCP Documentation](https://modelcontextprotocol.io/)
- [Gradio MCP Guide](https://www.gradio.app/guides/building-mcp-server-with-gradio)
- [Discord: #agents-mcp-hackathon-winter25](https://discord.gg/huggingface)
docs/pending/02_mcp_server_integration.md ADDED
@@ -0,0 +1,164 @@
# MCP Server Integration

## Priority: P0 - REQUIRED FOR TRACK 2

---

## What We Need

Expose our search tools as MCP servers so Claude Desktop/Cursor can use them.

### Current Tools to Expose

| Tool | File | MCP Tool Name |
|------|------|---------------|
| PubMed Search | `src/tools/pubmed.py` | `search_pubmed` |
| ClinicalTrials Search | `src/tools/clinicaltrials.py` | `search_clinical_trials` |
| bioRxiv Search | `src/tools/biorxiv.py` | `search_biorxiv` |
| Combined Search | `src/tools/search_handler.py` | `search_all_sources` |

---

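MCP clients render tool output as text, so each tool should return one markdown string rather than rich objects. A sketch of that formatting step, using hypothetical stand-ins for the project's evidence/citation models (field names assumed, not confirmed against `src/`):

```python
from dataclasses import dataclass


@dataclass
class Citation:  # hypothetical stand-in for the project's citation model
    title: str


@dataclass
class Evidence:  # hypothetical stand-in for the project's evidence model
    citation: Citation
    content: str


def format_results(results: list[Evidence]) -> str:
    """Render search hits as the single markdown string an MCP tool returns."""
    return "\n\n".join(f"**{e.citation.title}**\n{e.content}" for e in results)


hits = [
    Evidence(Citation("Paper A"), "Abstract text A"),
    Evidence(Citation("Paper B"), "Abstract text B"),
]
print(format_results(hits))
```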
## Implementation Options

### Option 1: Gradio MCP (Recommended)

Gradio 5.x (installed with the `mcp` extra, i.e. `pip install "gradio[mcp]"`) can expose a Gradio app's functions as MCP tools automatically.

```python
# src/mcp_server.py
import gradio as gr

from src.tools.pubmed import PubMedTool
from src.tools.clinicaltrials import ClinicalTrialsTool
from src.tools.biorxiv import BioRxivTool

pubmed = PubMedTool()
trials = ClinicalTrialsTool()
biorxiv = BioRxivTool()


async def search_pubmed(query: str, max_results: int = 10) -> str:
    """Search PubMed for biomedical literature."""
    results = await pubmed.search(query, max_results)
    return "\n\n".join(f"**{e.citation.title}**\n{e.content}" for e in results)


async def search_clinical_trials(query: str, max_results: int = 10) -> str:
    """Search ClinicalTrials.gov for clinical trial data."""
    results = await trials.search(query, max_results)
    return "\n\n".join(f"**{e.citation.title}**\n{e.content}" for e in results)


async def search_biorxiv(query: str, max_results: int = 10) -> str:
    """Search bioRxiv/medRxiv for preprints."""
    results = await biorxiv.search(query, max_results)
    return "\n\n".join(f"**{e.citation.title}**\n{e.content}" for e in results)


# One sub-interface per tool; each named function becomes an MCP tool.
# (gr.Interface takes a single fn, so multiple tools need TabbedInterface.)
demo = gr.TabbedInterface(
    [
        gr.Interface(
            fn=fn,
            inputs=[gr.Textbox(label="Query"), gr.Number(label="Max Results", value=10)],
            outputs=gr.Textbox(label="Results"),
        )
        for fn in (search_pubmed, search_clinical_trials, search_biorxiv)
    ],
    tab_names=["PubMed", "ClinicalTrials", "bioRxiv"],
)

# Launch as MCP server
if __name__ == "__main__":
    demo.launch(mcp_server=True)
```

### Option 2: Native MCP SDK

Use the official MCP Python SDK; its high-level `FastMCP` API registers tools with a decorator:

```bash
uv add mcp
```

```python
# src/mcp_server.py
from mcp.server.fastmcp import FastMCP

from src.tools.pubmed import PubMedTool
from src.tools.clinicaltrials import ClinicalTrialsTool
from src.tools.biorxiv import BioRxivTool

mcp = FastMCP("deepcritical-research")


@mcp.tool()
async def search_pubmed(query: str, max_results: int = 10) -> str:
    """Search PubMed for biomedical literature on drug repurposing."""
    results = await PubMedTool().search(query, max_results)
    return "\n\n".join(e.content for e in results)


@mcp.tool()
async def search_clinical_trials(query: str, max_results: int = 10) -> str:
    """Search ClinicalTrials.gov for clinical trials."""
    results = await ClinicalTrialsTool().search(query, max_results)
    return "\n\n".join(e.content for e in results)


@mcp.tool()
async def search_biorxiv(query: str, max_results: int = 10) -> str:
    """Search bioRxiv/medRxiv for preprints (not peer-reviewed)."""
    results = await BioRxivTool().search(query, max_results)
    return "\n\n".join(e.content for e in results)


if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```

---

## Claude Desktop Configuration

After implementing, users add to `claude_desktop_config.json` (Claude Desktop has no working-directory option, so point `uv` at the repo with `--directory`):

```json
{
  "mcpServers": {
    "deepcritical": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/DeepCritical-1", "python", "src/mcp_server.py"]
    }
  }
}
```

---

## Testing MCP Server

1. Start the MCP server:
```bash
uv run python src/mcp_server.py
```

2. Test with Claude Desktop or MCP Inspector:
```bash
npx @modelcontextprotocol/inspector
```

3. Verify the tools appear and work

---

## Demo Video Script

For the hackathon submission video:

1. Show Claude Desktop with DeepCritical MCP tools
2. Ask: "Search PubMed for metformin Alzheimer's"
3. Show real results appearing
4. Ask: "Now search clinical trials for the same"
5. Show combined analysis

This proves the MCP integration works.

---

## Files to Create

- [ ] `src/mcp_server.py` - MCP server implementation
- [ ] `examples/mcp_demo/test_mcp.py` - Test script
- [ ] Update `README.md` with MCP usage instructions
docs/pending/03_modal_integration.md ADDED
@@ -0,0 +1,156 @@
# Modal Integration

## Priority: P1 - HIGH VALUE ($2,500 Modal Innovation Award)

---

## What Modal Is For

Modal provides serverless GPU/CPU compute. For DeepCritical:

### Current Use Case (Mario's Code)
- `src/tools/code_execution.py` - Run LLM-generated analysis code in sandboxes
- Scientific computing (pandas, scipy, numpy) in isolated containers

### Potential Additional Use Cases

| Use Case | Benefit | Complexity |
|----------|---------|------------|
| Code Execution Sandbox | Run statistical analysis safely | ✅ Already built |
| LLM Inference | Run local models (no API costs) | Medium |
| Batch Processing | Process many papers in parallel | Medium |
| Embedding Generation | GPU-accelerated embeddings | Low |

---

## Current State

Mario implemented `src/tools/code_execution.py`:

```python
# Already exists - ModalCodeExecutor
executor = get_code_executor()
result = executor.execute("""
import pandas as pd
import numpy as np
# LLM-generated statistical analysis
""")
```

### What's Missing

1. **Not wired into the main pipeline** - The executor exists but isn't used
2. **No Modal tokens configured** - Needs `MODAL_TOKEN_ID`/`MODAL_TOKEN_SECRET`
3. **No demo showing it works** - Judges need to see it

---

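Before wiring the executor into the pipeline, it helps to fail fast when those tokens are absent. A small guard (sketch; the variable names match the `.env` keys used in this doc, and `modal_configured` is a hypothetical helper, not existing project code):

```python
import os


def modal_configured() -> bool:
    """True when both Modal credentials are present in the environment."""
    return bool(os.environ.get("MODAL_TOKEN_ID")) and bool(
        os.environ.get("MODAL_TOKEN_SECRET")
    )


# Example: the orchestrator can gate the analysis step on this check.
if not modal_configured():
    print("Modal analysis disabled: set MODAL_TOKEN_ID / MODAL_TOKEN_SECRET")
```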
## Integration Plan

### Step 1: Wire Into Agent Pipeline

Add an `AnalysisAgent` that uses Modal:

```python
# src/agents/analysis_agent.py
from src.tools.code_execution import get_code_executor


class AnalysisAgent:
    """Run statistical analysis on evidence using a Modal sandbox."""

    async def analyze(self, evidence: list[Evidence], query: str) -> str:
        # 1. LLM generates analysis code
        code = await self._generate_analysis_code(evidence, query)

        # 2. Execute in Modal sandbox
        executor = get_code_executor()
        result = executor.execute(code)

        # 3. Return captured stdout
        return result["stdout"]
```

### Step 2: Add to Orchestrator

```python
# In the orchestrator, after gathering evidence:
if settings.enable_modal_analysis:
    analysis_agent = AnalysisAgent()
    stats_results = await analysis_agent.analyze(evidence, query)
```

### Step 3: Create Demo

```python
# examples/modal_demo/run_analysis.py
"""Demo: Modal-powered statistical analysis of drug evidence."""

# Show:
# 1. Gather evidence from PubMed
# 2. Generate analysis code with LLM
# 3. Execute in Modal sandbox
# 4. Return statistical insights
```

---

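Step 1 returns `result["stdout"]` directly; a defensive unwrap keeps sandbox failures from silently producing empty reports. The result shape (`stdout`/`stderr`/`success` keys) is an assumption here, not the executor's documented contract:

```python
def unwrap_execution(result: dict) -> str:
    """Return stdout from an executor result, surfacing sandbox failures.

    Assumes the executor returns {"stdout": str, "stderr": str, "success": bool};
    adjust to whatever ModalCodeExecutor actually returns.
    """
    if not result.get("success", True):
        raise RuntimeError(f"Sandbox execution failed: {result.get('stderr', '')}")
    return result.get("stdout", "")


print(unwrap_execution({"stdout": "r = 0.42 (p = 0.03)", "stderr": "", "success": True}))
```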
## Modal Setup

### 1. Install Modal CLI
```bash
pip install modal
modal setup  # Authenticates with Modal
```

### 2. Set Environment Variables
```bash
# In .env
MODAL_TOKEN_ID=your-token-id
MODAL_TOKEN_SECRET=your-token-secret
```

### 3. Deploy (Optional)
```bash
modal deploy src/tools/code_execution.py
```

---

## What to Show Judges

For the Modal Innovation Award ($2,500):

1. **Sandbox Isolation** - Code runs in a container, not locally
2. **Scientific Computing** - Real pandas/scipy analysis
3. **Safety** - Sandboxed code can't touch the local filesystem
4. **Speed** - Modal's fast cold starts

### Demo Script

```bash
# Run the Modal verification script
uv run python examples/modal_demo/verify_sandbox.py
```

This proves the code runs in Modal, not locally.

---

## Files to Update

- [ ] Wire `code_execution.py` into pipeline
- [ ] Create `src/agents/analysis_agent.py`
- [ ] Update `examples/modal_demo/` with working demo
- [ ] Add Modal setup to README
- [ ] Test with real Modal account

---

## Cost Estimate

Modal pricing for our use case:

- CPU sandbox: ~$0.0001 per execution
- For demo/judging: < $1 total
- Free tier: 30 hours/month

Not a cost concern.