This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .cursorrules +0 -240
  2. .env.example +0 -107
  3. .github/README.md +0 -56
  4. .github/scripts/deploy_to_hf_space.py +0 -391
  5. .github/workflows/ci.yml +0 -127
  6. .github/workflows/deploy-hf-space.yml +0 -47
  7. .gitignore +0 -84
  8. .pre-commit-config.yaml +0 -64
  9. .pre-commit-hooks/run_pytest.ps1 +0 -19
  10. .pre-commit-hooks/run_pytest.sh +0 -20
  11. .pre-commit-hooks/run_pytest_embeddings.ps1 +0 -14
  12. .pre-commit-hooks/run_pytest_embeddings.sh +0 -15
  13. .pre-commit-hooks/run_pytest_unit.ps1 +0 -14
  14. .pre-commit-hooks/run_pytest_unit.sh +0 -15
  15. .pre-commit-hooks/run_pytest_with_sync.ps1 +0 -25
  16. .pre-commit-hooks/run_pytest_with_sync.py +0 -235
  17. .python-version +0 -1
  18. AGENTS.txt +0 -236
  19. CONTRIBUTING.md +0 -494
  20. Dockerfile +0 -52
  21. LICENSE.md +0 -25
  22. README.md +8 -56
  23. deployments/README.md +0 -46
  24. deployments/modal_tts.py +0 -97
  25. dev/.cursorrules +0 -241
  26. dev/AGENTS.txt +0 -236
  27. dev/docs_plugins.py +0 -74
  28. docs/LICENSE.md +0 -35
  29. docs/api/agents.md +0 -211
  30. docs/api/models.md +0 -191
  31. docs/api/orchestrators.md +0 -149
  32. docs/api/services.md +0 -279
  33. docs/api/tools.md +0 -259
  34. docs/architecture/agents.md +0 -293
  35. docs/architecture/graph_orchestration.md +0 -302
  36. docs/architecture/middleware.md +0 -146
  37. docs/architecture/orchestrators.md +0 -201
  38. docs/architecture/services.md +0 -146
  39. docs/architecture/tools.md +0 -167
  40. docs/architecture/workflow-diagrams.md +0 -655
  41. docs/configuration/index.md +0 -564
  42. docs/contributing/code-quality.md +0 -120
  43. docs/contributing/code-style.md +0 -83
  44. docs/contributing/error-handling.md +0 -54
  45. docs/contributing/implementation-patterns.md +0 -67
  46. docs/contributing/index.md +0 -254
  47. docs/contributing/prompt-engineering.md +0 -55
  48. docs/contributing/testing.md +0 -115
  49. docs/getting-started/examples.md +0 -198
  50. docs/getting-started/installation.md +0 -152
.cursorrules DELETED
@@ -1,240 +0,0 @@
- # DeepCritical Project - Cursor Rules
-
- ## Project-Wide Rules
-
- **Architecture**: Multi-agent research system using Pydantic AI for agent orchestration, supporting iterative and deep research patterns. Uses middleware for state management, budget tracking, and workflow coordination.
-
- **Type Safety**: ALWAYS use complete type hints. All functions must have parameter and return type annotations. Use `mypy --strict` compliance. Use `TYPE_CHECKING` imports for circular dependencies: `from typing import TYPE_CHECKING; if TYPE_CHECKING: from src.services.embeddings import EmbeddingService`
-
- **Async Patterns**: ALL I/O operations must be async (`async def`, `await`). Use `asyncio.gather()` for parallel operations. CPU-bound work must use `run_in_executor()`: `loop = asyncio.get_running_loop(); result = await loop.run_in_executor(None, cpu_bound_function, args)`. Never block the event loop.
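The async rule in the deleted `.cursorrules` can be sketched with plain stdlib `asyncio`; `cpu_bound_function` here is an illustrative stand-in, not code from the repository:

```python
import asyncio
import hashlib


def cpu_bound_function(data: bytes) -> str:
    # CPU-bound work that would block the event loop if run inline.
    return hashlib.sha256(data).hexdigest()


async def main() -> str:
    loop = asyncio.get_running_loop()
    # Offload to the default thread-pool executor so the loop stays responsive.
    return await loop.run_in_executor(None, cpu_bound_function, b"evidence")


digest = asyncio.run(main())
```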
-
- **Error Handling**: Use custom exceptions from `src/utils/exceptions.py`: `DeepCriticalError`, `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions: `raise SearchError(...) from e`. Log with structlog: `logger.error("Operation failed", error=str(e), context=value)`.
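The exception-chaining rule above can be shown with a minimal hierarchy; the class names come from the rule, while `fetch` and the triggering `TimeoutError` are illustrative:

```python
class DeepCriticalError(Exception):
    """Base error, mirroring the hierarchy described in the rule."""


class SearchError(DeepCriticalError):
    """Raised when a search backend fails."""


def fetch(url: str) -> str:
    try:
        raise TimeoutError("connect timed out")  # simulated I/O failure
    except TimeoutError as e:
        # Chain the original exception so the traceback keeps the root cause.
        raise SearchError(f"search failed for {url}") from e


try:
    fetch("https://example.org")
except SearchError as err:
    caught = err
    cause = err.__cause__
```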
-
- **Logging**: Use `structlog` for ALL logging (NOT `print` or `logging`). Import: `import structlog; logger = structlog.get_logger()`. Log with structured data: `logger.info("event", key=value)`. Use appropriate levels: DEBUG, INFO, WARNING, ERROR.
-
- **Pydantic Models**: All data exchange uses Pydantic models from `src/utils/models.py`. Models are frozen (`model_config = {"frozen": True}`) for immutability. Use `Field()` with descriptions. Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints.
-
- **Code Style**: Ruff with 100-char line length. Ignore rules: `PLR0913` (too many arguments), `PLR0912` (too many branches), `PLR0911` (too many returns), `PLR2004` (magic values), `PLW0603` (global statement), `PLC0415` (lazy imports).
-
- **Docstrings**: Google-style docstrings for all public functions. Include Args, Returns, Raises sections. Use type hints in docstrings only if needed for clarity.
-
- **Testing**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`). Use `respx` for httpx mocking, `pytest-mock` for general mocking.
-
- **State Management**: Use `ContextVar` in middleware for thread-safe isolation. Never use global mutable state (except singletons via `@lru_cache`). Use `WorkflowState` from `src/middleware/state_machine.py` for workflow state.
-
- **Citation Validation**: ALWAYS validate references before returning reports. Use `validate_references()` from `src/utils/citation_validator.py`. Remove hallucinated citations. Log warnings for removed citations.
-
- ---
-
- ## src/agents/ - Agent Implementation Rules
-
- **Pattern**: All agents use Pydantic AI `Agent` class. Agents have structured output types (Pydantic models) or return strings. Use factory functions in `src/agent_factory/agents.py` for creation.
-
- **Agent Structure**:
- - System prompt as module-level constant (with date injection: `datetime.now().strftime("%Y-%m-%d")`)
- - Agent class with `__init__(model: Any | None = None)`
- - Main method (e.g., `async def evaluate()`, `async def write_report()`)
- - Factory function: `def create_agent_name(model: Any | None = None) -> AgentName`
-
- **Model Initialization**: Use `get_model()` from `src/agent_factory/judges.py` if no model provided. Support OpenAI/Anthropic/HF Inference via settings.
-
- **Error Handling**: Return fallback values (e.g., `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`) on failure. Log errors with context. Use retry logic (3 retries) in Pydantic AI Agent initialization.
-
- **Input Validation**: Validate query/inputs are not empty. Truncate very long inputs with warnings. Handle None values gracefully.
-
- **Output Types**: Use structured output types from `src/utils/models.py` (e.g., `KnowledgeGapOutput`, `AgentSelectionPlan`, `ReportDraft`). For text output (writer agents), return `str` directly.
-
- **Agent-Specific Rules**:
- - `knowledge_gap.py`: Outputs `KnowledgeGapOutput`. Evaluates research completeness.
- - `tool_selector.py`: Outputs `AgentSelectionPlan`. Selects tools (RAG/web/database).
- - `writer.py`: Returns markdown string. Includes citations in numbered format.
- - `long_writer.py`: Uses `ReportDraft` input/output. Handles section-by-section writing.
- - `proofreader.py`: Takes `ReportDraft`, returns polished markdown.
- - `thinking.py`: Returns observation string from conversation history.
- - `input_parser.py`: Outputs `ParsedQuery` with research mode detection.
-
- ---
-
- ## src/tools/ - Search Tool Rules
-
- **Protocol**: All tools implement `SearchTool` protocol from `src/tools/base.py`: `name` property and `async def search(query, max_results) -> list[Evidence]`.
-
- **Rate Limiting**: Use `@retry` decorator from tenacity: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))`. Implement `_rate_limit()` method for APIs with limits. Use shared rate limiters from `src/tools/rate_limiter.py`.
-
- **Error Handling**: Raise `SearchError` or `RateLimitError` on failures. Handle HTTP errors (429, 500, timeout). Return empty list on non-critical errors (log warning).
-
- **Query Preprocessing**: Use `preprocess_query()` from `src/tools/query_utils.py` to remove noise and expand synonyms.
-
- **Evidence Conversion**: Convert API responses to `Evidence` objects with `Citation`. Extract metadata (title, url, date, authors). Set relevance scores (0.0-1.0). Handle missing fields gracefully.
-
- **Tool-Specific Rules**:
- - `pubmed.py`: Use NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Parse XML with `xmltodict`. Handle single vs. multiple articles.
- - `clinicaltrials.py`: Use `requests` library (NOT httpx - WAF blocks httpx). Run in thread pool: `await asyncio.to_thread(requests.get, ...)`. Filter: Only interventional studies, active/completed.
- - `europepmc.py`: Handle preprint markers: `[PREPRINT - Not peer-reviewed]`. Build URLs from DOI or PMID.
- - `rag_tool.py`: Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results. Handles ingestion.
- - `search_handler.py`: Orchestrates parallel searches across multiple tools. Uses `asyncio.gather()` with `return_exceptions=True`. Aggregates results into `SearchResult`.
-
- ---
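The `SearchTool` protocol described above can be sketched with `typing.Protocol`; the `Evidence` dataclass and `DummyTool` here are simplified stand-ins for the real models in `src/utils/models.py`:

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Evidence:
    # Minimal stand-in for the richer Evidence model.
    url: str
    relevance: float


class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int) -> list[Evidence]: ...


class DummyTool:
    @property
    def name(self) -> str:
        return "dummy"

    async def search(self, query: str, max_results: int) -> list[Evidence]:
        return [Evidence(url="https://example.org", relevance=0.9)][:max_results]


tool: SearchTool = DummyTool()
results = asyncio.run(tool.search("aspirin cardioprotection", 5))
```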
-
- ## src/middleware/ - Middleware Rules
-
- **State Management**: Use `ContextVar` for thread-safe isolation. `WorkflowState` uses `ContextVar[WorkflowState | None]`. Initialize with `init_workflow_state(embedding_service)`. Access with `get_workflow_state()` (auto-initializes if missing).
-
- **WorkflowState**: Tracks `evidence: list[Evidence]`, `conversation: Conversation`, `embedding_service: Any`. Methods: `add_evidence()` (deduplicates by URL), `async search_related()` (semantic search).
-
- **WorkflowManager**: Manages parallel research loops. Methods: `add_loop()`, `run_loops_parallel()`, `update_loop_status()`, `sync_loop_evidence_to_state()`. Uses `asyncio.gather()` for parallel execution. Handles errors per loop (don't fail all if one fails).
-
- **BudgetTracker**: Tracks tokens, time, iterations per loop and globally. Methods: `create_budget()`, `add_tokens()`, `start_timer()`, `update_timer()`, `increment_iteration()`, `check_budget()`, `can_continue()`. Token estimation: `estimate_tokens(text)` (~4 chars per token), `estimate_llm_call_tokens(prompt, response)`.
-
- **Models**: All middleware models in `src/utils/models.py`. `IterationData`, `Conversation`, `ResearchLoop`, `BudgetStatus` are used by middleware.
-
- ---
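The token-estimation heuristic named in the `BudgetTracker` rule (~4 characters per token) is simple enough to sketch directly; the exact rounding here is an assumption:

```python
def estimate_tokens(text: str) -> int:
    # Heuristic from the rule above: roughly 4 characters per token.
    return max(1, len(text) // 4)


def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    # Budget both sides of an LLM call.
    return estimate_tokens(prompt) + estimate_tokens(response)


budget_used = estimate_llm_call_tokens("What is CRISPR?", "CRISPR is a gene-editing tool.")
```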
-
- ## src/orchestrator/ - Orchestration Rules
-
- **Research Flows**: Two patterns: `IterativeResearchFlow` (single loop) and `DeepResearchFlow` (plan → parallel loops → synthesis). Both support agent chains (`use_graph=False`) and graph execution (`use_graph=True`).
-
- **IterativeResearchFlow**: Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete. Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`, `JudgeHandler`. Tracks iterations, time, budget.
-
- **DeepResearchFlow**: Pattern: Planner → Parallel iterative loops per section → Synthesizer. Uses `PlannerAgent`, `IterativeResearchFlow` (per section), `LongWriterAgent` or `ProofreaderAgent`. Uses `WorkflowManager` for parallel execution.
-
- **Graph Orchestrator**: Uses Pydantic AI Graphs (when available) or agent chains (fallback). Routes based on research mode (iterative/deep/auto). Streams `AgentEvent` objects for UI.
-
- **State Initialization**: Always call `init_workflow_state()` before running flows. Initialize `BudgetTracker` per loop. Use `WorkflowManager` for parallel coordination.
-
- **Event Streaming**: Yield `AgentEvent` objects during execution. Event types: "started", "search_complete", "judge_complete", "hypothesizing", "synthesizing", "complete", "error". Include iteration numbers and data payloads.
-
- ---
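The "don't fail all loops if one fails" behavior noted for parallel execution can be sketched with `asyncio.gather(..., return_exceptions=True)`; `run_loop` is an illustrative stand-in for a research loop:

```python
import asyncio


async def run_loop(section: str) -> str:
    # Stand-in for one iterative research loop over a report section.
    if section == "bad":
        raise RuntimeError("loop failed")
    return f"report for {section}"


async def run_loops_parallel(sections: list[str]) -> list[str]:
    # return_exceptions=True keeps one failing loop from sinking the others.
    outcomes = await asyncio.gather(
        *(run_loop(s) for s in sections), return_exceptions=True
    )
    return [o for o in outcomes if isinstance(o, str)]


reports = asyncio.run(run_loops_parallel(["intro", "bad", "methods"]))
```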
-
- ## src/services/ - Service Rules
-
- **EmbeddingService**: Local sentence-transformers (NO API key required). All operations async-safe via `run_in_executor()`. ChromaDB for vector storage. Deduplication threshold: 0.85 (85% similarity = duplicate).
-
- **LlamaIndexRAGService**: Uses OpenAI embeddings (requires `OPENAI_API_KEY`). Methods: `ingest_evidence()`, `retrieve()`, `query()`. Returns documents with metadata (source, title, url, date, authors). Lazy initialization with graceful fallback.
-
- **StatisticalAnalyzer**: Generates Python code via LLM. Executes in Modal sandbox (secure, isolated). Library versions pinned in `SANDBOX_LIBRARIES` dict. Returns `AnalysisResult` with verdict (SUPPORTED/REFUTED/INCONCLUSIVE).
-
- **Singleton Pattern**: Use `@lru_cache(maxsize=1)` for singletons: `@lru_cache(maxsize=1); def get_service() -> Service: return Service()`. Lazy initialization to avoid requiring dependencies at import time.
-
- ---
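The `@lru_cache(maxsize=1)` singleton pattern above can be shown with a toy service; `EmbeddingService` here is a hypothetical stand-in whose construction is treated as expensive:

```python
from functools import lru_cache


class EmbeddingService:
    # Hypothetical stand-in; in the real project construction loads a model.
    def __init__(self) -> None:
        self.model_name = "sentence-transformers/all-MiniLM-L6-v2"


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # Lazy singleton: built on first call, cached for every later call.
    return EmbeddingService()


first = get_embedding_service()
second = get_embedding_service()
```

Because the cache holds one entry and the function takes no arguments, every caller shares the same instance without any import-time construction.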
-
- ## src/utils/ - Utility Rules
-
- **Models**: All Pydantic models in `src/utils/models.py`. Use frozen models (`model_config = {"frozen": True}`) except where mutation needed. Use `Field()` with descriptions. Validate with constraints.
-
- **Config**: Settings via Pydantic Settings (`src/utils/config.py`). Load from `.env` automatically. Use `settings` singleton: `from src.utils.config import settings`. Validate API keys with properties: `has_openai_key`, `has_anthropic_key`.
-
- **Exceptions**: Custom exception hierarchy in `src/utils/exceptions.py`. Base: `DeepCriticalError`. Specific: `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions.
-
- **LLM Factory**: Centralized LLM model creation in `src/utils/llm_factory.py`. Supports OpenAI, Anthropic, HF Inference. Use `get_model()` or factory functions. Check requirements before initialization.
-
- **Citation Validator**: Use `validate_references()` from `src/utils/citation_validator.py`. Removes hallucinated citations (URLs not in evidence). Logs warnings. Returns validated report string.
-
- ---
-
- ## src/orchestrator_factory.py Rules
-
- **Purpose**: Factory for creating orchestrators. Supports "simple" (legacy) and "advanced" (magentic) modes. Auto-detects mode based on API key availability.
-
- **Pattern**: Lazy import for optional dependencies (`_get_magentic_orchestrator_class()`). Handles `ImportError` gracefully with clear error messages.
-
- **Mode Detection**: `_determine_mode()` checks explicit mode or auto-detects: "advanced" if `settings.has_openai_key`, else "simple". Maps "magentic" → "advanced".
-
- **Function Signature**: `create_orchestrator(search_handler, judge_handler, config, mode) -> Any`. Simple mode requires handlers. Advanced mode uses MagenticOrchestrator.
-
- **Error Handling**: Raise `ValueError` with clear messages if requirements not met. Log mode selection with structlog.
-
- ---
-
- ## src/orchestrator_hierarchical.py Rules
-
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams. Adapts Magentic ChatAgent to SubIterationTeam protocol.
-
- **Pattern**: Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`. Event-driven via callback queue.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated, but kept for compatibility).
-
- **Event Streaming**: Uses `asyncio.Queue` for event coordination. Yields `AgentEvent` objects. Handles event callback pattern with `asyncio.wait()`.
-
- **Error Handling**: Log errors with context. Yield error events. Process remaining events after task completion.
-
- ---
-
- ## src/orchestrator_magentic.py Rules
-
- **Purpose**: Magentic-based orchestrator using ChatAgent pattern. Each agent has internal LLM. Manager orchestrates agents.
-
- **Pattern**: Uses `MagenticBuilder` with participants (searcher, hypothesizer, judge, reporter). Manager uses `OpenAIChatClient`. Workflow built in `_build_workflow()`.
-
- **Event Processing**: `_process_event()` converts Magentic events to `AgentEvent`. Handles: `MagenticOrchestratorMessageEvent`, `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, `MagenticAgentDeltaEvent`, `WorkflowOutputEvent`.
-
- **Text Extraction**: `_extract_text()` defensively extracts text from messages. Priority: `.content` → `.text` → `str(message)`. Handles buggy message objects.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated).
-
- **Requirements**: Must call `check_magentic_requirements()` in `__init__`. Requires `agent-framework-core` and OpenAI API key.
-
- **Event Types**: Maps agent names to event types: "search" → "search_complete", "judge" → "judge_complete", "hypothes" → "hypothesizing", "report" → "synthesizing".
-
- ---
-
- ## src/agent_factory/ - Factory Rules
-
- **Pattern**: Factory functions for creating agents and handlers. Lazy initialization for optional dependencies. Support OpenAI/Anthropic/HF Inference.
-
- **Judges**: `create_judge_handler()` creates `JudgeHandler` with structured output (`JudgeAssessment`). Supports `MockJudgeHandler`, `HFInferenceJudgeHandler` as fallbacks.
-
- **Agents**: Factory functions in `agents.py` for all Pydantic AI agents. Pattern: `create_agent_name(model: Any | None = None) -> AgentName`. Use `get_model()` if model not provided.
-
- **Graph Builder**: `graph_builder.py` contains utilities for building research graphs. Supports iterative and deep research graph construction.
-
- **Error Handling**: Raise `ConfigurationError` if required API keys missing. Log agent creation. Handle import errors gracefully.
-
- ---
-
- ## src/prompts/ - Prompt Rules
-
- **Pattern**: System prompts stored as module-level constants. Include date injection: `datetime.now().strftime("%Y-%m-%d")`. Format evidence with truncation (1500 chars per item).
-
- **Judge Prompts**: In `judge.py`. Handle empty evidence case separately. Always request structured JSON output.
-
- **Hypothesis Prompts**: In `hypothesis.py`. Use diverse evidence selection (MMR algorithm). Sentence-aware truncation.
-
- **Report Prompts**: In `report.py`. Include full citation details. Use diverse evidence selection (n=20). Emphasize citation validation rules.
-
- ---
-
- ## Testing Rules
-
- **Structure**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`).
-
- **Mocking**: Use `respx` for httpx mocking. Use `pytest-mock` for general mocking. Mock LLM calls in unit tests (use `MockJudgeHandler`).
-
- **Fixtures**: Common fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`.
-
- **Coverage**: Aim for >80% coverage. Test error handling, edge cases, and integration paths.
-
- ---
-
- ## File-Specific Agent Rules
-
- **knowledge_gap.py**: Outputs `KnowledgeGapOutput`. System prompt evaluates research completeness. Handles conversation history. Returns fallback on error.
-
- **writer.py**: Returns markdown string. System prompt includes citation format examples. Validates inputs. Truncates long findings. Retry logic for transient failures.
-
- **long_writer.py**: Uses `ReportDraft` input/output. Writes sections iteratively. Reformats references (deduplicates, renumbers). Reformats section headings.
-
- **proofreader.py**: Takes `ReportDraft`, returns polished markdown. Removes duplicates. Adds summary. Preserves references.
-
- **tool_selector.py**: Outputs `AgentSelectionPlan`. System prompt lists available agents (WebSearchAgent, SiteCrawlerAgent, RAGAgent). Guidelines for when to use each.
-
- **thinking.py**: Returns observation string. Generates observations from conversation history. Uses query and background context.
-
- **input_parser.py**: Outputs `ParsedQuery`. Detects research mode (iterative/deep). Extracts entities and research questions. Improves/refines query.
-
.env.example DELETED
@@ -1,107 +0,0 @@
- # HuggingFace
- HF_TOKEN=your_huggingface_token_here
-
- # OpenAI (optional)
- OPENAI_API_KEY=your_openai_key_here
-
- # Anthropic (optional)
- ANTHROPIC_API_KEY=your_anthropic_key_here
-
- # Model names (optional - sensible defaults set in config.py)
- # ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
- # OPENAI_MODEL=gpt-5.1
-
-
- # ============================================
- # Audio Processing Configuration (TTS)
- # ============================================
- # Kokoro TTS Model Configuration
- TTS_MODEL=hexgrad/Kokoro-82M
- TTS_VOICE=af_heart
- TTS_SPEED=1.0
- TTS_GPU=T4
- TTS_TIMEOUT=60
-
- # Available TTS Voices:
- # American English Female: af_heart, af_bella, af_nicole, af_aoede, af_kore, af_sarah, af_nova, af_sky, af_alloy, af_jessica, af_river
- # American English Male: am_michael, am_fenrir, am_puck, am_echo, am_eric, am_liam, am_onyx, am_santa, am_adam
-
- # Available GPU Types (Modal):
- # T4 - Cheapest, good for testing (default)
- # A10 - Good balance of cost/performance
- # A100 - Fastest, most expensive
- # L4 - NVIDIA L4 GPU
- # L40S - NVIDIA L40S GPU
- # Note: GPU type is set at function definition time. Changes require app restart.
-
- # ============================================
- # Audio Processing Configuration (STT)
- # ============================================
- # Speech-to-Text API Configuration
- STT_API_URL=nvidia/canary-1b-v2
- STT_SOURCE_LANG=English
- STT_TARGET_LANG=English
-
- # Available STT Languages:
- # English, Bulgarian, Croatian, Czech, Danish, Dutch, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Russian, Ukrainian
-
- # ============================================
- # Audio Feature Flags
- # ============================================
- ENABLE_AUDIO_INPUT=true
- ENABLE_AUDIO_OUTPUT=true
-
- # ============================================
- # Image OCR Configuration
- # ============================================
- OCR_API_URL=prithivMLmods/Multimodal-OCR3
- ENABLE_IMAGE_INPUT=true
-
- # ============== EMBEDDINGS ==============
-
- # OpenAI Embedding Model (used if LLM_PROVIDER is openai and performing RAG/Embeddings)
- OPENAI_EMBEDDING_MODEL=text-embedding-3-small
-
- # Local Embedding Model (used for local/offline embeddings)
- LOCAL_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
-
- # ============== HUGGINGFACE (FREE TIER) ==============
-
- # HuggingFace Token - enables Llama 3.1 (best quality free model)
- # Get yours at: https://huggingface.co/settings/tokens
- #
- # WITHOUT HF_TOKEN: Falls back to ungated models (zephyr-7b-beta)
- # WITH HF_TOKEN: Uses Llama 3.1 8B Instruct (requires accepting license)
- #
- # For HuggingFace Spaces deployment:
- # Set this as a "Secret" in Space Settings -> Variables and secrets
- # Users/judges don't need their own token - the Space secret is used
- #
- HF_TOKEN=hf_your-token-here
-
- # ============== AGENT CONFIGURATION ==============
-
- MAX_ITERATIONS=10
- SEARCH_TIMEOUT=30
- LOG_LEVEL=INFO
-
- # ============================================
- # Modal Configuration (Required for TTS)
- # ============================================
- # Modal credentials are required for TTS (Text-to-Speech) functionality
- # Get your credentials from: https://modal.com/
- MODAL_TOKEN_ID=your_modal_token_id_here
- MODAL_TOKEN_SECRET=your_modal_token_secret_here
-
- # ============== EXTERNAL SERVICES ==============
-
- # PubMed (optional - higher rate limits)
- NCBI_API_KEY=your-ncbi-key-here
-
- # Vector Database (optional - for LlamaIndex RAG)
- CHROMA_DB_PATH=./chroma_db
- # Neo4j Knowledge Graph
- NEO4J_URI=bolt://localhost:7687
- NEO4J_USER=neo4j
- NEO4J_PASSWORD=your_neo4j_password_here
- NEO4J_DATABASE=your_database_name
.github/README.md DELETED
@@ -1,56 +0,0 @@
-
- > [!IMPORTANT]
- > **You are reading the GitHub README!**
- >
- > - 📚 **Documentation**: See our [technical documentation](https://deepcritical.github.io/GradioDemo/) for detailed information
- > - 📖 **Demo README**: Check out the [Demo README](../README.md) for more information
- > - 🏆 **Demo**: Kindly consider using our [Free Demo](https://hf.co/DataQuests/GradioDemo)
-
-
- <div align="center">
-
- [![GitHub](https://img.shields.io/github/stars/DeepCritical/GradioDemo?style=for-the-badge&logo=github&logoColor=white&label=🐙%20GitHub&labelColor=181717&color=181717)](https://github.com/DeepCritical/GradioDemo)
- [![Documentation](https://img.shields.io/badge/Docs-0080FF?style=for-the-badge&logo=readthedocs&logoColor=white&labelColor=0080FF&color=0080FF)](https://deepcritical.github.io/GradioDemo/)
- [![Demo](https://img.shields.io/badge/🚀%20Demo-FFD21E?style=for-the-badge&logo=huggingface&logoColor=white&labelColor=FFD21E&color=FFD21E)](https://huggingface.co/spaces/DataQuests/DeepCritical)
- [![codecov](https://codecov.io/gh/DeepCritical/GradioDemo/graph/badge.svg?token=B1f05RCGpz)](https://codecov.io/gh/DeepCritical/GradioDemo)
- [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP)
-
- </div>
-
- ## Quick Start
-
- ### 1. Environment Setup
-
- ```bash
- # Install uv if you haven't already
- pip install uv
-
- # Sync dependencies
- uv sync --all-extras
- ```
-
- ### 2. Run the UI
-
- ```bash
- # Start the Gradio app
- gradio src/app.py
- ```
-
- Open your browser to `http://localhost:7860`.
-
- ### 3. Connect via MCP
-
- This application exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
-
- **MCP Server URL**: `http://localhost:7860/gradio_api/mcp/`
-
- **Claude Desktop Configuration**:
- Add this to your `claude_desktop_config.json`:
- ```json
- {
-   "mcpServers": {
-     "deepcritical": {
-       "url": "http://localhost:7860/gradio_api/mcp/"
-     }
-   }
- }
- ```
.github/scripts/deploy_to_hf_space.py DELETED
@@ -1,391 +0,0 @@
- """Deploy repository to Hugging Face Space, excluding unnecessary files."""
-
- import os
- import shutil
- import subprocess
- import tempfile
- from pathlib import Path
-
- from huggingface_hub import HfApi
-
-
- def get_excluded_dirs() -> set[str]:
-     """Get set of directory names to exclude from deployment."""
-     return {
-         "docs",
-         "dev",
-         "folder",
-         "site",
-         "tests",  # Optional - can be included if desired
-         "examples",  # Optional - can be included if desired
-         ".git",
-         ".github",
-         "__pycache__",
-         ".pytest_cache",
-         ".mypy_cache",
-         ".ruff_cache",
-         ".venv",
-         "venv",
-         "env",
-         "ENV",
-         "node_modules",
-         ".cursor",
-         "reference_repos",
-         "burner_docs",
-         "chroma_db",
-         "logs",
-         "build",
-         "dist",
-         ".eggs",
-         "htmlcov",
-         "hf_space",  # Exclude the cloned HF Space directory itself
-     }
-
-
- def get_excluded_files() -> set[str]:
-     """Get set of file names to exclude from deployment."""
-     return {
-         ".pre-commit-config.yaml",
-         "mkdocs.yml",
-         "uv.lock",
-         "AGENTS.txt",
-         ".env",
-         ".env.local",
-         "*.local",
-         ".DS_Store",
-         "Thumbs.db",
-         "*.log",
-         ".coverage",
-         "coverage.xml",
-     }
-
-
- def should_exclude(path: Path, excluded_dirs: set[str], excluded_files: set[str]) -> bool:
-     """Check if a path should be excluded from deployment."""
-     # Check if any parent directory is excluded
-     for parent in path.parents:
-         if parent.name in excluded_dirs:
-             return True
-
-     # Check if the path itself is a directory that should be excluded
-     if path.is_dir() and path.name in excluded_dirs:
-         return True
-
-     # Check if the file name matches excluded patterns
-     if path.is_file():
-         # Check exact match
-         if path.name in excluded_files:
-             return True
-         # Check pattern matches (simple wildcard support)
-         for pattern in excluded_files:
-             if "*" in pattern:
-                 # Simple pattern matching (e.g., "*.log")
-                 suffix = pattern.replace("*", "")
-                 if path.name.endswith(suffix):
-                     return True
-
-     return False
-
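The hand-rolled wildcard matching in `should_exclude` could also lean on the stdlib `fnmatch` module, which handles shell-style patterns like `*.log` directly; a minimal sketch, with `matches_excluded` and the pattern set as illustrative names:

```python
from fnmatch import fnmatch
from pathlib import Path

EXCLUDED_FILES = {".env", "*.log", ".coverage"}


def matches_excluded(path: Path, patterns: set[str]) -> bool:
    # fnmatch covers exact names and shell-style wildcards alike.
    return any(fnmatch(path.name, pattern) for pattern in patterns)


hit = matches_excluded(Path("logs/app.log"), EXCLUDED_FILES)
miss = matches_excluded(Path("src/app.py"), EXCLUDED_FILES)
```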
-
- def deploy_to_hf_space() -> None:
-     """Deploy repository to Hugging Face Space.
-
-     Supports both user and organization Spaces:
-     - User Space: username/space-name
-     - Organization Space: organization-name/space-name
-
-     Works with both classic tokens and fine-grained tokens.
-     """
-     # Get configuration from environment variables
-     hf_token = os.getenv("HF_TOKEN")
-     hf_username = os.getenv("HF_USERNAME")  # Can be username or organization name
-     space_name = os.getenv("HF_SPACE_NAME")
-
-     # Check which variables are missing and provide helpful error message
-     missing = []
-     if not hf_token:
-         missing.append("HF_TOKEN (should be in repository secrets)")
-     if not hf_username:
-         missing.append("HF_USERNAME (should be in repository variables)")
-     if not space_name:
-         missing.append("HF_SPACE_NAME (should be in repository variables)")
-
-     if missing:
-         raise ValueError(
-             f"Missing required environment variables: {', '.join(missing)}\n"
-             f"Please configure:\n"
-             f"  - HF_TOKEN in Settings > Secrets and variables > Actions > Secrets\n"
-             f"  - HF_USERNAME in Settings > Secrets and variables > Actions > Variables\n"
-             f"  - HF_SPACE_NAME in Settings > Secrets and variables > Actions > Variables"
-         )
-
-     # HF_USERNAME can be either a username or organization name
-     # Format: {username|organization}/{space_name}
-     repo_id = f"{hf_username}/{space_name}"
-     local_dir = "hf_space"
-
-     print(f"🚀 Deploying to Hugging Face Space: {repo_id}")
-
-     # Initialize HF API
-     api = HfApi(token=hf_token)
-
-     # Create Space if it doesn't exist
-     try:
-         api.repo_info(repo_id=repo_id, repo_type="space", token=hf_token)
-         print(f"✅ Space exists: {repo_id}")
-     except Exception:
-         print(f"⚠️ Space does not exist, creating: {repo_id}")
-         # Create new repository
-         # Note: For organizations, repo_id should be "org/space-name"
-         # For users, repo_id should be "username/space-name"
-         api.create_repo(
-             repo_id=repo_id,  # Full repo_id including owner
-             repo_type="space",
-             space_sdk="gradio",
-             token=hf_token,
-             exist_ok=True,
-         )
-         print(f"✅ Created new Space: {repo_id}")
-
-     # Configure Git credential helper for authentication
-     # This is needed for Git LFS to work properly with fine-grained tokens
-     print("🔐 Configuring Git credentials...")
-
-     # Use Git credential store to store the token
-     # This allows Git LFS to authenticate properly
-     temp_dir = Path(tempfile.gettempdir())
-     credential_store = temp_dir / ".git-credentials-hf"
-
-     # Write credentials in the format: https://username:token@huggingface.co
-     credential_store.write_text(
-         f"https://{hf_username}:{hf_token}@huggingface.co\n", encoding="utf-8"
-     )
-     try:
-         credential_store.chmod(0o600)  # Secure permissions (Unix only)
-     except OSError:
-         # Windows doesn't support chmod, skip
-         pass
-
-     # Configure Git to use the credential store
-     subprocess.run(
-         ["git", "config", "--global", "credential.helper", f"store --file={credential_store}"],
-         check=True,
-         capture_output=True,
-     )
-
-     # Also set environment variable for Git LFS
-     os.environ["GIT_CREDENTIAL_HELPER"] = f"store --file={credential_store}"
-
-     # Clone repository using git
-     # Use the token in the URL for initial clone, but LFS will use credential store
-     space_url = f"https://{hf_username}:{hf_token}@huggingface.co/spaces/{repo_id}"
-
-     if Path(local_dir).exists():
-         print(f"🧹 Removing existing {local_dir} directory...")
-         shutil.rmtree(local_dir)
-
-     print("📥 Cloning Space repository...")
-     try:
-         result = subprocess.run(
-             ["git", "clone", space_url, local_dir],
-             check=True,
-             capture_output=True,
-             text=True,
-         )
-         print("✅ Cloned Space repository")
-
-         # After clone, configure the remote to use credential helper
-         # This ensures future operations (like push) use the credential store
-         os.chdir(local_dir)
-         subprocess.run(
-             ["git", "remote", "set-url", "origin", f"https://huggingface.co/spaces/{repo_id}"],
-             check=True,
-             capture_output=True,
-         )
-         os.chdir("..")
-
-     except subprocess.CalledProcessError as e:
-         error_msg = e.stderr if e.stderr else e.stdout if e.stdout else "Unknown error"
209
- print(f"❌ Failed to clone Space repository: {error_msg}")
210
-
211
- # Try alternative: clone with LFS skip, then fetch LFS files separately
212
- print("🔄 Trying alternative clone method (skip LFS during clone)...")
213
- try:
214
- env = os.environ.copy()
215
- env["GIT_LFS_SKIP_SMUDGE"] = "1" # Skip LFS during clone
216
-
217
- subprocess.run(
218
- ["git", "clone", space_url, local_dir],
219
- check=True,
220
- capture_output=True,
221
- text=True,
222
- env=env,
223
- )
224
- print("✅ Cloned Space repository (LFS skipped)")
225
-
226
- # Configure remote
227
- os.chdir(local_dir)
228
- subprocess.run(
229
- ["git", "remote", "set-url", "origin", f"https://huggingface.co/spaces/{repo_id}"],
230
- check=True,
231
- capture_output=True,
232
- )
233
-
234
- # Try to fetch LFS files with proper authentication
235
- print("📥 Fetching LFS files...")
236
- subprocess.run(
237
- ["git", "lfs", "pull"],
238
- check=False, # Don't fail if LFS pull fails - we'll continue without LFS files
239
- capture_output=True,
240
- text=True,
241
- )
242
- os.chdir("..")
243
- print("✅ Repository cloned (LFS files may be incomplete, but deployment can continue)")
244
- except subprocess.CalledProcessError as e2:
245
- error_msg2 = e2.stderr if e2.stderr else e2.stdout if e2.stdout else "Unknown error"
246
- print(f"❌ Alternative clone method also failed: {error_msg2}")
247
- raise RuntimeError(f"Git clone failed: {error_msg}") from e
248
-
249
- # Get exclusion sets
250
- excluded_dirs = get_excluded_dirs()
251
- excluded_files = get_excluded_files()
252
-
253
- # Remove all existing files in HF Space (except .git)
254
- print("🧹 Cleaning existing files...")
255
- for item in Path(local_dir).iterdir():
256
- if item.name == ".git":
257
- continue
258
- if item.is_dir():
259
- shutil.rmtree(item)
260
- else:
261
- item.unlink()
262
-
263
- # Copy files from repository root
264
- print("📦 Copying files...")
265
- repo_root = Path(".")
266
- files_copied = 0
267
- dirs_copied = 0
268
-
269
- for item in repo_root.rglob("*"):
270
- # Skip if in .git directory
271
- if ".git" in item.parts:
272
- continue
273
-
274
- # Skip if in hf_space directory (the cloned Space directory)
275
- if "hf_space" in item.parts:
276
- continue
277
-
278
- # Skip if should be excluded
279
- if should_exclude(item, excluded_dirs, excluded_files):
280
- continue
281
-
282
- # Calculate relative path
283
- try:
284
- rel_path = item.relative_to(repo_root)
285
- except ValueError:
286
- # Item is outside repo root, skip
287
- continue
288
-
289
- # Skip if in excluded directory
290
- if any(part in excluded_dirs for part in rel_path.parts):
291
- continue
292
-
293
- # Destination path
294
- dest_path = Path(local_dir) / rel_path
295
-
296
- # Create parent directories
297
- dest_path.parent.mkdir(parents=True, exist_ok=True)
298
-
299
- # Copy file or directory
300
- if item.is_file():
301
- shutil.copy2(item, dest_path)
302
- files_copied += 1
303
- elif item.is_dir():
304
- # Directory will be created by parent mkdir, but we track it
305
- dirs_copied += 1
306
-
307
- print(f"✅ Copied {files_copied} files and {dirs_copied} directories")
308
-
309
- # Commit and push changes using git
310
- print("💾 Committing changes...")
311
-
312
- # Change to the Space directory
313
- original_cwd = os.getcwd()
314
- os.chdir(local_dir)
315
-
316
- try:
317
- # Configure git user (required for commit)
318
- subprocess.run(
319
- ["git", "config", "user.name", "github-actions[bot]"],
320
- check=True,
321
- capture_output=True,
322
- )
323
- subprocess.run(
324
- ["git", "config", "user.email", "github-actions[bot]@users.noreply.github.com"],
325
- check=True,
326
- capture_output=True,
327
- )
328
-
329
- # Add all files
330
- subprocess.run(
331
- ["git", "add", "."],
332
- check=True,
333
- capture_output=True,
334
- )
335
-
336
- # Check if there are changes to commit
337
- result = subprocess.run(
338
- ["git", "status", "--porcelain"],
339
- check=False,
340
- capture_output=True,
341
- text=True,
342
- )
343
-
344
- if result.stdout.strip():
345
- # There are changes, commit and push
346
- subprocess.run(
347
- ["git", "commit", "-m", "Deploy to Hugging Face Space [skip ci]"],
348
- check=True,
349
- capture_output=True,
350
- )
351
- print("📤 Pushing to Hugging Face Space...")
352
- # Ensure remote URL uses credential helper (not token in URL)
353
- subprocess.run(
354
- ["git", "remote", "set-url", "origin", f"https://huggingface.co/spaces/{repo_id}"],
355
- check=True,
356
- capture_output=True,
357
- )
358
- subprocess.run(
359
- ["git", "push"],
360
- check=True,
361
- capture_output=True,
362
- )
363
- print("✅ Deployment complete!")
364
- else:
365
- print("ℹ️ No changes to commit (repository is up to date)")
366
- except subprocess.CalledProcessError as e:
367
- error_msg = e.stderr if e.stderr else (e.stdout if e.stdout else str(e))
368
- if isinstance(error_msg, bytes):
369
- error_msg = error_msg.decode("utf-8", errors="replace")
370
- if "nothing to commit" in error_msg.lower():
371
- print("ℹ️ No changes to commit (repository is up to date)")
372
- else:
373
- print(f"⚠️ Error during git operations: {error_msg}")
374
- raise RuntimeError(f"Git operation failed: {error_msg}") from e
375
- finally:
376
- # Return to original directory
377
- os.chdir(original_cwd)
378
-
379
- # Clean up credential store for security
380
- try:
381
- if credential_store.exists():
382
- credential_store.unlink()
383
- except Exception:
384
- # Ignore cleanup errors
385
- pass
386
-
387
- print(f"🎉 Successfully deployed to: https://huggingface.co/spaces/{repo_id}")
388
-
389
-
390
- if __name__ == "__main__":
391
- deploy_to_hf_space()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
.github/workflows/ci.yml DELETED
@@ -1,127 +0,0 @@
- name: CI
- 
- on:
-   push:
-     branches: [main, dev, develop]
-   pull_request:
-     branches: [main, dev, develop]
- 
- jobs:
-   test:
-     runs-on: ubuntu-latest
-     strategy:
-       matrix:
-         python-version: ["3.11"]
- 
-     steps:
-       - uses: actions/checkout@v4
- 
-       - name: Set up Python ${{ matrix.python-version }}
-         uses: actions/setup-python@v5
-         with:
-           python-version: ${{ matrix.python-version }}
- 
-       - name: Install dependencies
-         run: |
-           python -m pip install --upgrade pip
-           pip install -e ".[dev]"
- 
-       - name: Lint with ruff
-         run: |
-           ruff check . --exclude tests
-           ruff format --check . --exclude tests
-         continue-on-error: true
- 
-       - name: Type check with mypy
-         run: |
-           mypy src
-         continue-on-error: true
- 
-       - name: Install embedding dependencies
-         run: |
-           pip install -e ".[embeddings]"
- 
-       - name: Run unit tests (excluding OpenAI and embedding providers)
-         env:
-           HF_TOKEN: ${{ secrets.HF_TOKEN }}
-         run: |
-           pytest tests/unit/ -v -m "not openai and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term
- 
-       - name: Run local embeddings tests
-         env:
-           HF_TOKEN: ${{ secrets.HF_TOKEN }}
-         run: |
-           pytest tests/ -v -m "local_embeddings" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term --cov-append || true
-         continue-on-error: true # Allow failures if dependencies not available
- 
-       - name: Run HuggingFace integration tests
-         env:
-           HF_TOKEN: ${{ secrets.HF_TOKEN }}
-         run: |
-           pytest tests/integration/ -v -m "huggingface and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term --cov-append || true
-         continue-on-error: true # Allow failures if HF_TOKEN not set
- 
-       - name: Run non-OpenAI integration tests (excluding embedding providers)
-         env:
-           HF_TOKEN: ${{ secrets.HF_TOKEN }}
-         run: |
-           pytest tests/integration/ -v -m "integration and not openai and not embedding_provider" --tb=short -p no:logfire --cov --cov-branch --cov-report=xml --cov-report=term --cov-append || true
-         continue-on-error: true # Allow failures if dependencies not available
- 
-       - name: Upload coverage reports to Codecov
-         uses: codecov/codecov-action@v5
-         with:
-           token: ${{ secrets.CODECOV_TOKEN }}
-           slug: DeepCritical/GradioDemo
-           files: ./coverage.xml
-           fail_ci_if_error: false
-         continue-on-error: true
- 
-   docs:
-     runs-on: ubuntu-latest
-     permissions:
-       contents: write
-     if: github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/dev' || github.ref == 'refs/heads/develop')
-     steps:
-       - uses: actions/checkout@v4
-         with:
-           fetch-depth: 0
- 
-       - name: Set up Python
-         uses: actions/setup-python@v5
-         with:
-           python-version: '3.11'
- 
-       - name: Install uv
-         uses: astral-sh/setup-uv@v5
-         with:
-           version: "latest"
- 
-       - name: Install dependencies
-         run: |
-           uv sync --extra dev
- 
-       - name: Configure Git
-         run: |
-           git config user.name "github-actions[bot]"
-           git config user.email "github-actions[bot]@users.noreply.github.com"
-           git remote set-url origin https://x-access-token:${{ secrets.GITHUB_TOKEN }}@github.com/${{ github.repository }}.git
- 
-       - name: Deploy to GitHub Pages
-         run: |
-           # mkdocs gh-deploy automatically creates .nojekyll, but let's verify
-           uv run mkdocs gh-deploy --force --message "Deploy docs [skip ci]" --strict
-           # Verify .nojekyll was created in gh-pages branch
-           git fetch origin gh-pages:gh-pages || true
-           git checkout gh-pages || true
-           if [ -f .nojekyll ]; then
-             echo "✓ .nojekyll file exists"
-           else
-             echo "⚠ .nojekyll file missing, creating it..."
-             touch .nojekyll
-             git add .nojekyll
-             git commit -m "Add .nojekyll to disable Jekyll [skip ci]" || true
-             git push origin gh-pages || true
-           fi
-         env:
-           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/deploy-hf-space.yml DELETED
@@ -1,47 +0,0 @@
- name: Deploy to Hugging Face Space
- 
- on:
-   push:
-     branches: [main]
-   workflow_dispatch: # Allow manual triggering
- 
- jobs:
-   deploy:
-     runs-on: ubuntu-latest
-     permissions:
-       contents: read
-       # No write permissions needed for GitHub repo (we're pushing to HF Space)
- 
-     steps:
-       - name: Checkout Repository
-         uses: actions/checkout@v4
-         with:
-           fetch-depth: 0
- 
-       - name: Set up Python
-         uses: actions/setup-python@v5
-         with:
-           python-version: '3.11'
- 
-       - name: Install dependencies
-         run: |
-           pip install --upgrade pip
-           pip install huggingface-hub
- 
-       - name: Deploy to Hugging Face Space
-         env:
-           # Token from secrets (sensitive data)
-           HF_TOKEN: ${{ secrets.HF_TOKEN }}
-           # Username/Organization from repository variables (non-sensitive)
-           HF_USERNAME: ${{ vars.HF_USERNAME }}
-           # Space name from repository variables (non-sensitive)
-           HF_SPACE_NAME: ${{ vars.HF_SPACE_NAME }}
-         run: |
-           python .github/scripts/deploy_to_hf_space.py
- 
-       - name: Verify deployment
-         if: success()
-         run: |
-           echo "✅ Deployment completed successfully!"
-           echo "Space URL: https://huggingface.co/spaces/${{ vars.HF_USERNAME }}/${{ vars.HF_SPACE_NAME }}"
- 

.gitignore DELETED
@@ -1,84 +0,0 @@
- folder/
- site/
- .cursor/
- .ruff_cache/
- # Python
- __pycache__/
- *.py[cod]
- *$py.class
- *.so
- .Python
- build/
- develop-eggs/
- dist/
- downloads/
- eggs/
- .eggs/
- lib/
- lib64/
- parts/
- sdist/
- var/
- wheels/
- *.egg-info/
- .installed.cfg
- *.egg
- 
- # Virtual environments
- .venv/
- venv/
- ENV/
- env/
- 
- # IDE
- .vscode/
- .idea/
- *.swp
- *.swo
- 
- # Environment
- .env
- .env.local
- *.local
- 
- # Claude
- .claude/
- 
- # Burner docs (working drafts, not for commit)
- burner_docs/
- 
- # Reference repos (clone locally, don't commit)
- reference_repos/autogen-microsoft/
- reference_repos/claude-agent-sdk/
- reference_repos/pydanticai-research-agent/
- reference_repos/pubmed-mcp-server/
- reference_repos/DeepCritical/
- 
- # Keep the README in reference_repos
- !reference_repos/README.md
- 
- # Development directory
- dev/
- 
- # OS
- .DS_Store
- Thumbs.db
- 
- # Logs
- *.log
- logs/
- 
- # Testing
- .pytest_cache/
- .mypy_cache/
- .coverage
- htmlcov/
- test_output*.txt
- 
- # Database files
- chroma_db/
- *.sqlite3
- 
- 
- # Trigger rebuild Wed Nov 26 17:51:41 EST 2025
- .env

.pre-commit-config.yaml DELETED
@@ -1,64 +0,0 @@
- repos:
-   - repo: https://github.com/astral-sh/ruff-pre-commit
-     rev: v0.4.4
-     hooks:
-       - id: ruff
-         args: [--fix, --exclude, tests]
-         exclude: ^reference_repos/
-       - id: ruff-format
-         args: [--exclude, tests]
-         exclude: ^reference_repos/
- 
-   - repo: https://github.com/pre-commit/mirrors-mypy
-     rev: v1.10.0
-     hooks:
-       - id: mypy
-         files: ^src/
-         exclude: ^folder|^src/app.py
-         additional_dependencies:
-           - pydantic>=2.7
-           - pydantic-settings>=2.2
-           - tenacity>=8.2
-           - pydantic-ai>=0.0.16
-         args: [--ignore-missing-imports]
- 
-   - repo: local
-     hooks:
-       - id: pytest-unit
-         name: pytest unit tests (no OpenAI)
-         entry: uv
-         language: system
-         types: [python]
-         args: [
-           "run",
-           "pytest",
-           "tests/unit/",
-           "-v",
-           "-m",
-           "not openai and not embedding_provider",
-           "--tb=short",
-           "-p",
-           "no:logfire",
-         ]
-         pass_filenames: false
-         always_run: true
-         require_serial: false
-       - id: pytest-local-embeddings
-         name: pytest local embeddings tests
-         entry: uv
-         language: system
-         types: [python]
-         args: [
-           "run",
-           "pytest",
-           "tests/",
-           "-v",
-           "-m",
-           "local_embeddings",
-           "--tb=short",
-           "-p",
-           "no:logfire",
-         ]
-         pass_filenames: false
-         always_run: true
-         require_serial: false

.pre-commit-hooks/run_pytest.ps1 DELETED
@@ -1,19 +0,0 @@
- # PowerShell pytest runner for pre-commit (Windows)
- # Uses uv if available, otherwise falls back to pytest
- 
- if (Get-Command uv -ErrorAction SilentlyContinue) {
-     # Sync dependencies before running tests
-     uv sync
-     uv run pytest $args
- } else {
-     Write-Warning "uv not found, using system pytest (may have missing dependencies)"
-     pytest $args
- }
- 

.pre-commit-hooks/run_pytest.sh DELETED
@@ -1,20 +0,0 @@
- #!/bin/bash
- # Cross-platform pytest runner for pre-commit
- # Uses uv if available, otherwise falls back to pytest
- 
- if command -v uv >/dev/null 2>&1; then
-     # Sync dependencies before running tests
-     uv sync
-     uv run pytest "$@"
- else
-     echo "Warning: uv not found, using system pytest (may have missing dependencies)"
-     pytest "$@"
- fi
- 

.pre-commit-hooks/run_pytest_embeddings.ps1 DELETED
@@ -1,14 +0,0 @@
- # PowerShell wrapper to sync embeddings dependencies and run embeddings tests
- 
- $ErrorActionPreference = "Stop"
- 
- if (Get-Command uv -ErrorAction SilentlyContinue) {
-     Write-Host "Syncing embeddings dependencies..."
-     uv sync --extra embeddings
-     Write-Host "Running embeddings tests..."
-     uv run pytest tests/ -v -m local_embeddings --tb=short -p no:logfire
- } else {
-     Write-Error "uv not found"
-     exit 1
- }
- 

.pre-commit-hooks/run_pytest_embeddings.sh DELETED
@@ -1,15 +0,0 @@
- #!/bin/bash
- # Wrapper script to sync embeddings dependencies and run embeddings tests
- 
- set -e
- 
- if command -v uv >/dev/null 2>&1; then
-     echo "Syncing embeddings dependencies..."
-     uv sync --extra embeddings
-     echo "Running embeddings tests..."
-     uv run pytest tests/ -v -m local_embeddings --tb=short -p no:logfire
- else
-     echo "Error: uv not found"
-     exit 1
- fi
- 

.pre-commit-hooks/run_pytest_unit.ps1 DELETED
@@ -1,14 +0,0 @@
- # PowerShell wrapper to sync dependencies and run unit tests
- 
- $ErrorActionPreference = "Stop"
- 
- if (Get-Command uv -ErrorAction SilentlyContinue) {
-     Write-Host "Syncing dependencies..."
-     uv sync
-     Write-Host "Running unit tests..."
-     uv run pytest tests/unit/ -v -m "not openai and not embedding_provider" --tb=short -p no:logfire
- } else {
-     Write-Error "uv not found"
-     exit 1
- }
- 

.pre-commit-hooks/run_pytest_unit.sh DELETED
@@ -1,15 +0,0 @@
- #!/bin/bash
- # Wrapper script to sync dependencies and run unit tests
- 
- set -e
- 
- if command -v uv >/dev/null 2>&1; then
-     echo "Syncing dependencies..."
-     uv sync
-     echo "Running unit tests..."
-     uv run pytest tests/unit/ -v -m "not openai and not embedding_provider" --tb=short -p no:logfire
- else
-     echo "Error: uv not found"
-     exit 1
- fi
- 

.pre-commit-hooks/run_pytest_with_sync.ps1 DELETED
@@ -1,25 +0,0 @@
- # PowerShell wrapper for pytest runner
- # Ensures uv is available and runs the Python script
- 
- param(
-     [Parameter(Position=0)]
-     [string]$TestType = "unit"
- )
- 
- $ErrorActionPreference = "Stop"
- 
- # Check if uv is available
- if (-not (Get-Command uv -ErrorAction SilentlyContinue)) {
-     Write-Error "uv not found. Please install uv: https://github.com/astral-sh/uv"
-     exit 1
- }
- 
- # Get the script directory
- $ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
- $PythonScript = Join-Path $ScriptDir "run_pytest_with_sync.py"
- 
- # Run the Python script using uv
- uv run python $PythonScript $TestType
- 
- exit $LASTEXITCODE
- 

.pre-commit-hooks/run_pytest_with_sync.py DELETED
@@ -1,235 +0,0 @@
- #!/usr/bin/env python3
- """Cross-platform pytest runner that syncs dependencies before running tests."""
- 
- import shutil
- import subprocess
- import sys
- from pathlib import Path
- 
- 
- def clean_caches(project_root: Path) -> None:
-     """Remove pytest and Python cache directories and files.
- 
-     Comprehensively removes all cache files and directories to ensure
-     clean test runs. Only scans specific directories to avoid resource
-     exhaustion from scanning large directories like .venv on Windows.
-     """
-     # Directories to scan for caches (only project code, not dependencies)
-     scan_dirs = ["src", "tests", ".pre-commit-hooks"]
- 
-     # Directories to exclude (to avoid resource issues)
-     exclude_dirs = {
-         ".venv",
-         "venv",
-         "ENV",
-         "env",
-         ".git",
-         "node_modules",
-         "dist",
-         "build",
-         ".eggs",
-         "reference_repos",
-         "folder",
-     }
- 
-     # Comprehensive list of cache patterns to remove
-     cache_patterns = [
-         ".pytest_cache",
-         "__pycache__",
-         "*.pyc",
-         "*.pyo",
-         "*.pyd",
-         ".mypy_cache",
-         ".ruff_cache",
-         ".coverage",
-         "coverage.xml",
-         "htmlcov",
-         ".hypothesis",  # Hypothesis testing framework cache
-         ".tox",  # Tox cache (if used)
-         ".cache",  # General Python cache
-     ]
- 
-     def should_exclude(path: Path) -> bool:
-         """Check if a path should be excluded from cache cleanup."""
-         # Check if any parent directory is in exclude list
-         for parent in path.parents:
-             if parent.name in exclude_dirs:
-                 return True
-         # Check if the path itself is excluded
-         if path.name in exclude_dirs:
-             return True
-         return False
- 
-     cleaned = []
- 
-     # Only scan specific directories to avoid resource exhaustion
-     for scan_dir in scan_dirs:
-         scan_path = project_root / scan_dir
-         if not scan_path.exists():
-             continue
- 
-         for pattern in cache_patterns:
-             if "*" in pattern:
-                 # Handle glob patterns for files
-                 try:
-                     for cache_file in scan_path.rglob(pattern):
-                         if should_exclude(cache_file):
-                             continue
-                         try:
-                             if cache_file.is_file():
-                                 cache_file.unlink()
-                                 cleaned.append(str(cache_file.relative_to(project_root)))
-                         except OSError:
-                             pass  # Ignore errors (file might be locked or already deleted)
-                 except OSError:
-                     pass  # Ignore errors during directory traversal
-             else:
-                 # Handle directory patterns
-                 try:
-                     for cache_dir in scan_path.rglob(pattern):
-                         if should_exclude(cache_dir):
-                             continue
-                         try:
-                             if cache_dir.is_dir():
-                                 shutil.rmtree(cache_dir, ignore_errors=True)
-                                 cleaned.append(str(cache_dir.relative_to(project_root)))
-                         except OSError:
-                             pass  # Ignore errors (directory might be locked)
-                 except OSError:
-                     pass  # Ignore errors during directory traversal
- 
-     # Also clean root-level caches (like .pytest_cache in project root)
-     root_cache_patterns = [
-         ".pytest_cache",
-         ".mypy_cache",
-         ".ruff_cache",
-         ".coverage",
-         "coverage.xml",
-         "htmlcov",
-         ".hypothesis",
-         ".tox",
-         ".cache",
-         ".pytest",
-     ]
-     for pattern in root_cache_patterns:
-         cache_path = project_root / pattern
-         if cache_path.exists():
-             try:
-                 if cache_path.is_dir():
-                     shutil.rmtree(cache_path, ignore_errors=True)
-                 elif cache_path.is_file():
-                     cache_path.unlink()
-                 cleaned.append(pattern)
-             except OSError:
-                 pass
- 
-     # Also remove any .pyc files in root directory
-     try:
-         for pyc_file in project_root.glob("*.pyc"):
-             try:
-                 pyc_file.unlink()
-                 cleaned.append(pyc_file.name)
-             except OSError:
-                 pass
-     except OSError:
-         pass
- 
-     if cleaned:
-         print(
-             f"Cleaned {len(cleaned)} cache items: {', '.join(cleaned[:10])}{'...' if len(cleaned) > 10 else ''}"
-         )
-     else:
-         print("No cache files found to clean")
- 
- 
- def run_command(
-     cmd: list[str], check: bool = True, shell: bool = False, cwd: str | None = None
- ) -> int:
-     """Run a command and return exit code."""
-     try:
-         result = subprocess.run(
-             cmd,
-             check=check,
-             shell=shell,
-             cwd=cwd,
-             env=None,  # Use current environment, uv will handle venv
-         )
-         return result.returncode
-     except subprocess.CalledProcessError as e:
-         return e.returncode
-     except FileNotFoundError:
-         print(f"Error: Command not found: {cmd[0]}")
-         return 1
- 
- 
- def main() -> int:
-     """Main entry point."""
-     import os
- 
-     # Get the project root (where pyproject.toml is)
-     script_dir = Path(__file__).parent
-     project_root = script_dir.parent
- 
-     # Change to project root to ensure uv works correctly
-     os.chdir(project_root)
- 
-     # Clean caches before running tests
-     print("Cleaning pytest and Python caches...")
-     clean_caches(project_root)
- 
-     # Check if uv is available
-     if run_command(["uv", "--version"], check=False) != 0:
-         print("Error: uv not found. Please install uv: https://github.com/astral-sh/uv")
-         return 1
- 
-     # Parse arguments
-     test_type = sys.argv[1] if len(sys.argv) > 1 else "unit"
-     extra_args = sys.argv[2:] if len(sys.argv) > 2 else []
- 
-     # Sync dependencies - always include dev
-     # Note: embeddings dependencies are now in main dependencies, not optional
-     # Use --extra dev for [project.optional-dependencies].dev (not --dev which is for [dependency-groups])
-     sync_cmd = ["uv", "sync", "--extra", "dev"]
- 
-     print(f"Syncing dependencies for {test_type} tests...")
-     if run_command(sync_cmd, cwd=project_root) != 0:
-         return 1
- 
-     # Build pytest command - use uv run to ensure correct environment
-     if test_type == "unit":
-         pytest_args = [
-             "tests/unit/",
-             "-v",
-             "-m",
-             "not openai and not embedding_provider",
-             "--tb=short",
-             "-p",
-             "no:logfire",
-             "--cache-clear",  # Clear pytest cache before running
-         ]
-     elif test_type == "embeddings":
-         pytest_args = [
-             "tests/",
-             "-v",
-             "-m",
-             "local_embeddings",
-             "--tb=short",
-             "-p",
-             "no:logfire",
-             "--cache-clear",  # Clear pytest cache before running
-         ]
-     else:
-         pytest_args = []
- 
-     pytest_args.extend(extra_args)
- 
-     # Use uv run python -m pytest to ensure we use the venv's pytest
-     # This is more reliable than uv run pytest which might find system pytest
-     pytest_cmd = ["uv", "run", "python", "-m", "pytest", *pytest_args]
- 
-     print(f"Running {test_type} tests...")
-     return run_command(pytest_cmd, cwd=project_root)
- 
- 
- if __name__ == "__main__":
-     sys.exit(main())

.python-version DELETED
@@ -1 +0,0 @@
- 3.11

AGENTS.txt DELETED
@@ -1,236 +0,0 @@
1
- # DeepCritical Project - Rules
2
-
3
- ## Project-Wide Rules
4
-
5
- **Architecture**: Multi-agent research system using Pydantic AI for agent orchestration, supporting iterative and deep research patterns. Uses middleware for state management, budget tracking, and workflow coordination.
6
-
7
- **Type Safety**: ALWAYS use complete type hints. All functions must have parameter and return type annotations. Use `mypy --strict` compliance. Use `TYPE_CHECKING` imports for circular dependencies: `from typing import TYPE_CHECKING; if TYPE_CHECKING: from src.services.embeddings import EmbeddingService`
8
-
9
- **Async Patterns**: ALL I/O operations must be async (`async def`, `await`). Use `asyncio.gather()` for parallel operations. CPU-bound work must use `run_in_executor()`: `loop = asyncio.get_running_loop(); result = await loop.run_in_executor(None, cpu_bound_function, args)`. Never block the event loop.
10
-
11
- **Error Handling**: Use custom exceptions from `src/utils/exceptions.py`: `DeepCriticalError`, `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions: `raise SearchError(...) from e`. Log with structlog: `logger.error("Operation failed", error=str(e), context=value)`.
12
-
13
- **Logging**: Use `structlog` for ALL logging (NOT `print` or `logging`). Import: `import structlog; logger = structlog.get_logger()`. Log with structured data: `logger.info("event", key=value)`. Use appropriate levels: DEBUG, INFO, WARNING, ERROR.
14
-
15
- **Pydantic Models**: All data exchange uses Pydantic models from `src/utils/models.py`. Models are frozen (`model_config = {"frozen": True}`) for immutability. Use `Field()` with descriptions. Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints.
-
- **Code Style**: Ruff with 100-char line length. Ignore rules: `PLR0913` (too many arguments), `PLR0912` (too many branches), `PLR0911` (too many returns), `PLR2004` (magic values), `PLW0603` (global statement), `PLC0415` (lazy imports).
-
- **Docstrings**: Google-style docstrings for all public functions. Include Args, Returns, Raises sections. Use type hints in docstrings only if needed for clarity.
-
- **Testing**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`). Use `respx` for httpx mocking, `pytest-mock` for general mocking.
-
- **State Management**: Use `ContextVar` in middleware for thread-safe isolation. Never use global mutable state (except singletons via `@lru_cache`). Use `WorkflowState` from `src/middleware/state_machine.py` for workflow state.
-
- **Citation Validation**: ALWAYS validate references before returning reports. Use `validate_references()` from `src/utils/citation_validator.py`. Remove hallucinated citations. Log warnings for removed citations.
-
- ---
-
- ## src/agents/ - Agent Implementation Rules
-
- **Pattern**: All agents use Pydantic AI `Agent` class. Agents have structured output types (Pydantic models) or return strings. Use factory functions in `src/agent_factory/agents.py` for creation.
-
- **Agent Structure**:
- - System prompt as module-level constant (with date injection: `datetime.now().strftime("%Y-%m-%d")`)
- - Agent class with `__init__(model: Any | None = None)`
- - Main method (e.g., `async def evaluate()`, `async def write_report()`)
- - Factory function: `def create_agent_name(model: Any | None = None) -> AgentName`
-
- **Model Initialization**: Use `get_model()` from `src/agent_factory/judges.py` if no model provided. Support OpenAI/Anthropic/HF Inference via settings.
-
- **Error Handling**: Return fallback values (e.g., `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`) on failure. Log errors with context. Use retry logic (3 retries) in Pydantic AI Agent initialization.
-
- **Input Validation**: Validate query/inputs are not empty. Truncate very long inputs with warnings. Handle None values gracefully.
-
- **Output Types**: Use structured output types from `src/utils/models.py` (e.g., `KnowledgeGapOutput`, `AgentSelectionPlan`, `ReportDraft`). For text output (writer agents), return `str` directly.
-
- **Agent-Specific Rules**:
- - `knowledge_gap.py`: Outputs `KnowledgeGapOutput`. Evaluates research completeness.
- - `tool_selector.py`: Outputs `AgentSelectionPlan`. Selects tools (RAG/web/database).
- - `writer.py`: Returns markdown string. Includes citations in numbered format.
- - `long_writer.py`: Uses `ReportDraft` input/output. Handles section-by-section writing.
- - `proofreader.py`: Takes `ReportDraft`, returns polished markdown.
- - `thinking.py`: Returns observation string from conversation history.
- - `input_parser.py`: Outputs `ParsedQuery` with research mode detection.
-
- ---
-
- ## src/tools/ - Search Tool Rules
-
- **Protocol**: All tools implement `SearchTool` protocol from `src/tools/base.py`: `name` property and `async def search(query, max_results) -> list[Evidence]`.
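The protocol can be sketched with `typing.Protocol` as follows. The `Evidence` dataclass and `DummyTool` are stand-ins for illustration; the real protocol lives in `src/tools/base.py`:

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

@dataclass
class Evidence:  # placeholder for the real Pydantic model
    title: str
    url: str

@runtime_checkable
class SearchTool(Protocol):
    """Structural shape of the tool protocol described above."""
    @property
    def name(self) -> str: ...
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...

class DummyTool:
    """Satisfies the protocol structurally; no inheritance required."""
    @property
    def name(self) -> str:
        return "dummy"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        return [Evidence(title=f"result for {query}", url="https://example.org")][:max_results]

assert isinstance(DummyTool(), SearchTool)  # runtime structural check
assert asyncio.run(DummyTool().search("metformin"))[0].name if False else True
```

Because the protocol is structural, new tools only need the right attribute shapes, which keeps the tool registry decoupled from any base class.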
-
- **Rate Limiting**: Use `@retry` decorator from tenacity: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))`. Implement `_rate_limit()` method for APIs with limits. Use shared rate limiters from `src/tools/rate_limiter.py`.
-
- **Error Handling**: Raise `SearchError` or `RateLimitError` on failures. Handle HTTP errors (429, 500, timeout). Return empty list on non-critical errors (log warning).
-
- **Query Preprocessing**: Use `preprocess_query()` from `src/tools/query_utils.py` to remove noise and expand synonyms.
-
- **Evidence Conversion**: Convert API responses to `Evidence` objects with `Citation`. Extract metadata (title, url, date, authors). Set relevance scores (0.0-1.0). Handle missing fields gracefully.
-
- **Tool-Specific Rules**:
- - `pubmed.py`: Use NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Parse XML with `xmltodict`. Handle single vs. multiple articles.
- - `clinicaltrials.py`: Use `requests` library (NOT httpx - WAF blocks httpx). Run in thread pool: `await asyncio.to_thread(requests.get, ...)`. Filter: Only interventional studies, active/completed.
- - `europepmc.py`: Handle preprint markers: `[PREPRINT - Not peer-reviewed]`. Build URLs from DOI or PMID.
- - `rag_tool.py`: Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results. Handles ingestion.
- - `search_handler.py`: Orchestrates parallel searches across multiple tools. Uses `asyncio.gather()` with `return_exceptions=True`. Aggregates results into `SearchResult`.
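The `search_handler.py` fan-out pattern can be sketched as below (the two tool coroutines are hypothetical; the real handler aggregates `Evidence` objects into a `SearchResult` and logs a warning per failed tool):

```python
import asyncio

async def search_a(query: str) -> list[str]:
    return [f"a:{query}"]

async def search_b(query: str) -> list[str]:
    raise RuntimeError("backend down")  # simulate one tool failing

async def run_parallel(query: str) -> list[str]:
    # return_exceptions=True: a single failing tool must not sink the whole search.
    results = await asyncio.gather(search_a(query), search_b(query), return_exceptions=True)
    evidence: list[str] = []
    for r in results:
        if isinstance(r, Exception):
            continue  # real code would logger.warning("Search tool failed", ...) here
        evidence.extend(r)
    return evidence

assert asyncio.run(run_parallel("metformin")) == ["a:metformin"]
```

Without `return_exceptions=True`, the first raised exception would cancel the sibling searches and propagate, losing the partial results.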
-
- ---
-
- ## src/middleware/ - Middleware Rules
-
- **State Management**: Use `ContextVar` for thread-safe isolation. `WorkflowState` uses `ContextVar[WorkflowState | None]`. Initialize with `init_workflow_state(embedding_service)`. Access with `get_workflow_state()` (auto-initializes if missing).
-
- **WorkflowState**: Tracks `evidence: list[Evidence]`, `conversation: Conversation`, `embedding_service: Any`. Methods: `add_evidence()` (deduplicates by URL), `async search_related()` (semantic search).
-
- **WorkflowManager**: Manages parallel research loops. Methods: `add_loop()`, `run_loops_parallel()`, `update_loop_status()`, `sync_loop_evidence_to_state()`. Uses `asyncio.gather()` for parallel execution. Handles errors per loop (don't fail all if one fails).
-
- **BudgetTracker**: Tracks tokens, time, iterations per loop and globally. Methods: `create_budget()`, `add_tokens()`, `start_timer()`, `update_timer()`, `increment_iteration()`, `check_budget()`, `can_continue()`. Token estimation: `estimate_tokens(text)` (~4 chars per token), `estimate_llm_call_tokens(prompt, response)`.
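The token side of the budget logic can be sketched like this (a minimal illustration of the ~4 chars/token heuristic; the real tracker also tracks wall-clock time and iterations, per loop and globally):

```python
def estimate_tokens(text: str) -> int:
    # Heuristic from the rule above: roughly 4 characters per token.
    return max(1, len(text) // 4)

def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    return estimate_tokens(prompt) + estimate_tokens(response)

class BudgetTracker:
    """Token-only sketch; method names follow the list above, bodies are assumptions."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def add_tokens(self, n: int) -> None:
        self.used += n

    def can_continue(self) -> bool:
        return self.used < self.max_tokens

tracker = BudgetTracker(max_tokens=100)
tracker.add_tokens(estimate_llm_call_tokens("p" * 200, "r" * 200))  # 50 + 50 tokens
assert tracker.can_continue() is False
```

A character-count heuristic is deliberately cheap: it is checked before every iteration, so an exact tokenizer would add latency without changing the stop decision much.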
-
- **Models**: All middleware models in `src/utils/models.py`. `IterationData`, `Conversation`, `ResearchLoop`, `BudgetStatus` are used by middleware.
-
- ---
-
- ## src/orchestrator/ - Orchestration Rules
-
- **Research Flows**: Two patterns: `IterativeResearchFlow` (single loop) and `DeepResearchFlow` (plan → parallel loops → synthesis). Both support agent chains (`use_graph=False`) and graph execution (`use_graph=True`).
-
- **IterativeResearchFlow**: Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete. Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`, `JudgeHandler`. Tracks iterations, time, budget.
-
- **DeepResearchFlow**: Pattern: Planner → Parallel iterative loops per section → Synthesizer. Uses `PlannerAgent`, `IterativeResearchFlow` (per section), `LongWriterAgent` or `ProofreaderAgent`. Uses `WorkflowManager` for parallel execution.
-
- **Graph Orchestrator**: Uses Pydantic AI Graphs (when available) or agent chains (fallback). Routes based on research mode (iterative/deep/auto). Streams `AgentEvent` objects for UI.
-
- **State Initialization**: Always call `init_workflow_state()` before running flows. Initialize `BudgetTracker` per loop. Use `WorkflowManager` for parallel coordination.
-
- **Event Streaming**: Yield `AgentEvent` objects during execution. Event types: "started", "search_complete", "judge_complete", "hypothesizing", "synthesizing", "complete", "error". Include iteration numbers and data payloads.
-
- ---
-
- ## src/services/ - Service Rules
-
- **EmbeddingService**: Local sentence-transformers (NO API key required). All operations async-safe via `run_in_executor()`. ChromaDB for vector storage. Deduplication threshold: 0.85 (85% similarity = duplicate).
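The async-safety rule above amounts to wrapping the synchronous encoder in `run_in_executor()`. A dependency-free sketch (the `_embed_sync` body stands in for sentence-transformers encoding):

```python
import asyncio

def _embed_sync(text: str) -> list[float]:
    # CPU-bound stand-in for sentence-transformers encoding.
    return [float(len(text))]

async def embed(text: str) -> list[float]:
    loop = asyncio.get_running_loop()
    # Off-load CPU-bound work so the event loop stays responsive.
    return await loop.run_in_executor(None, _embed_sync, text)

assert asyncio.run(embed("hello")) == [5.0]
```

Running the encoder inline in a coroutine would block every other search and UI event for the duration of the forward pass; the executor keeps the loop free.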
-
- **LlamaIndexRAGService**: Uses OpenAI embeddings (requires `OPENAI_API_KEY`). Methods: `ingest_evidence()`, `retrieve()`, `query()`. Returns documents with metadata (source, title, url, date, authors). Lazy initialization with graceful fallback.
-
- **StatisticalAnalyzer**: Generates Python code via LLM. Executes in Modal sandbox (secure, isolated). Library versions pinned in `SANDBOX_LIBRARIES` dict. Returns `AnalysisResult` with verdict (SUPPORTED/REFUTED/INCONCLUSIVE).
-
- **Singleton Pattern**: Use `@lru_cache(maxsize=1)` for singletons: decorate a module-level accessor such as `def get_service() -> Service: return Service()` with `@lru_cache(maxsize=1)`. Lazy initialization avoids requiring dependencies at import time.
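Spelled out, the accessor pattern looks like this (the `EmbeddingService` body is a placeholder for the real heavy setup):

```python
from functools import lru_cache

class EmbeddingService:
    def __init__(self) -> None:
        # Heavy setup (model load, DB connection) happens once, on first use.
        self.ready = True

@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    return EmbeddingService()

assert get_embedding_service() is get_embedding_service()  # same instance every call
```

`lru_cache(maxsize=1)` on a zero-argument function memoizes the single return value, giving a singleton without module-level globals or `global` statements.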
-
- ---
-
- ## src/utils/ - Utility Rules
-
- **Models**: All Pydantic models in `src/utils/models.py`. Use frozen models (`model_config = {"frozen": True}`) except where mutation needed. Use `Field()` with descriptions. Validate with constraints.
-
- **Config**: Settings via Pydantic Settings (`src/utils/config.py`). Load from `.env` automatically. Use `settings` singleton: `from src.utils.config import settings`. Validate API keys with properties: `has_openai_key`, `has_anthropic_key`.
-
- **Exceptions**: Custom exception hierarchy in `src/utils/exceptions.py`. Base: `DeepCriticalError`. Specific: `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions.
-
- **LLM Factory**: Centralized LLM model creation in `src/utils/llm_factory.py`. Supports OpenAI, Anthropic, HF Inference. Use `get_model()` or factory functions. Check requirements before initialization.
-
- **Citation Validator**: Use `validate_references()` from `src/utils/citation_validator.py`. Removes hallucinated citations (URLs not in evidence). Logs warnings. Returns validated report string.
-
- ---
-
- ## src/orchestrator_factory.py Rules
-
- **Purpose**: Factory for creating orchestrators. Supports "simple" (legacy) and "advanced" (magentic) modes. Auto-detects mode based on API key availability.
-
- **Pattern**: Lazy import for optional dependencies (`_get_magentic_orchestrator_class()`). Handles `ImportError` gracefully with clear error messages.
-
- **Mode Detection**: `_determine_mode()` checks explicit mode or auto-detects: "advanced" if `settings.has_openai_key`, else "simple". Maps "magentic" → "advanced".
-
- **Function Signature**: `create_orchestrator(search_handler, judge_handler, config, mode) -> Any`. Simple mode requires handlers. Advanced mode uses MagenticOrchestrator.
-
- **Error Handling**: Raise `ValueError` with clear messages if requirements not met. Log mode selection with structlog.
-
- ---
-
- ## src/orchestrator_hierarchical.py Rules
-
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams. Adapts Magentic ChatAgent to SubIterationTeam protocol.
-
- **Pattern**: Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`. Event-driven via callback queue.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated, but kept for compatibility).
-
- **Event Streaming**: Uses `asyncio.Queue` for event coordination. Yields `AgentEvent` objects. Handles event callback pattern with `asyncio.wait()`.
-
- **Error Handling**: Log errors with context. Yield error events. Process remaining events after task completion.
-
- ---
-
- ## src/orchestrator_magentic.py Rules
-
- **Purpose**: Magentic-based orchestrator using ChatAgent pattern. Each agent has internal LLM. Manager orchestrates agents.
-
- **Pattern**: Uses `MagenticBuilder` with participants (searcher, hypothesizer, judge, reporter). Manager uses `OpenAIChatClient`. Workflow built in `_build_workflow()`.
-
- **Event Processing**: `_process_event()` converts Magentic events to `AgentEvent`. Handles: `MagenticOrchestratorMessageEvent`, `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, `MagenticAgentDeltaEvent`, `WorkflowOutputEvent`.
-
- **Text Extraction**: `_extract_text()` defensively extracts text from messages. Priority: `.content` → `.text` → `str(message)`. Handles buggy message objects.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated).
-
- **Requirements**: Must call `check_magentic_requirements()` in `__init__`. Requires `agent-framework-core` and OpenAI API key.
-
- **Event Types**: Maps agent names to event types: "search" → "search_complete", "judge" → "judge_complete", "hypothes" → "hypothesizing", "report" → "synthesizing".
-
- ---
-
- ## src/agent_factory/ - Factory Rules
-
- **Pattern**: Factory functions for creating agents and handlers. Lazy initialization for optional dependencies. Support OpenAI/Anthropic/HF Inference.
-
- **Judges**: `create_judge_handler()` creates `JudgeHandler` with structured output (`JudgeAssessment`). Supports `MockJudgeHandler`, `HFInferenceJudgeHandler` as fallbacks.
-
- **Agents**: Factory functions in `agents.py` for all Pydantic AI agents. Pattern: `create_agent_name(model: Any | None = None) -> AgentName`. Use `get_model()` if model not provided.
-
- **Graph Builder**: `graph_builder.py` contains utilities for building research graphs. Supports iterative and deep research graph construction.
-
- **Error Handling**: Raise `ConfigurationError` if required API keys missing. Log agent creation. Handle import errors gracefully.
-
- ---
-
- ## src/prompts/ - Prompt Rules
-
- **Pattern**: System prompts stored as module-level constants. Include date injection: `datetime.now().strftime("%Y-%m-%d")`. Format evidence with truncation (1500 chars per item).
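The date-injection and per-item truncation conventions can be sketched together (`SYSTEM_PROMPT` wording and the `format_evidence` helper are illustrative, not the real prompt modules):

```python
from datetime import datetime

# Module-level system prompt constant with date injection, per the pattern above.
SYSTEM_PROMPT = (
    "You are a research assistant. Today's date is "
    f"{datetime.now().strftime('%Y-%m-%d')}."
)

def format_evidence(items: list[str], limit: int = 1500) -> str:
    # Truncate each evidence item (1500 chars per item) to keep prompts bounded.
    return "\n".join(item[:limit] for item in items)

assert datetime.now().strftime("%Y-%m-%d") in SYSTEM_PROMPT
```

Injecting the current date at module import keeps "recent findings" style instructions accurate without threading a clock through every call site.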
-
- **Judge Prompts**: In `judge.py`. Handle empty evidence case separately. Always request structured JSON output.
-
- **Hypothesis Prompts**: In `hypothesis.py`. Use diverse evidence selection (MMR algorithm). Sentence-aware truncation.
-
- **Report Prompts**: In `report.py`. Include full citation details. Use diverse evidence selection (n=20). Emphasize citation validation rules.
-
- ---
-
- ## Testing Rules
-
- **Structure**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`).
-
- **Mocking**: Use `respx` for httpx mocking. Use `pytest-mock` for general mocking. Mock LLM calls in unit tests (use `MockJudgeHandler`).
-
- **Fixtures**: Common fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`.
-
- **Coverage**: Aim for >80% coverage. Test error handling, edge cases, and integration paths.
-
- ---
-
- ## File-Specific Agent Rules
-
- **knowledge_gap.py**: Outputs `KnowledgeGapOutput`. System prompt evaluates research completeness. Handles conversation history. Returns fallback on error.
-
- **writer.py**: Returns markdown string. System prompt includes citation format examples. Validates inputs. Truncates long findings. Retry logic for transient failures.
-
- **long_writer.py**: Uses `ReportDraft` input/output. Writes sections iteratively. Reformats references (deduplicates, renumbers). Reformats section headings.
-
- **proofreader.py**: Takes `ReportDraft`, returns polished markdown. Removes duplicates. Adds summary. Preserves references.
-
- **tool_selector.py**: Outputs `AgentSelectionPlan`. System prompt lists available agents (WebSearchAgent, SiteCrawlerAgent, RAGAgent). Guidelines for when to use each.
-
- **thinking.py**: Returns observation string. Generates observations from conversation history. Uses query and background context.
-
- **input_parser.py**: Outputs `ParsedQuery`. Detects research mode (iterative/deep). Extracts entities and research questions. Improves/refines query.
-
CONTRIBUTING.md DELETED
@@ -1,494 +0,0 @@
- # Contributing to The DETERMINATOR
-
- Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started.
-
- ## Table of Contents
-
- - [Git Workflow](#git-workflow)
- - [Getting Started](#getting-started)
- - [Development Commands](#development-commands)
- - [MCP Integration](#mcp-integration)
- - [Common Pitfalls](#common-pitfalls)
- - [Key Principles](#key-principles)
- - [Pull Request Process](#pull-request-process)
-
- > **Note**: Additional sections (Code Style, Error Handling, Testing, Implementation Patterns, Code Quality, and Prompt Engineering) are available as separate pages in the [documentation](https://deepcritical.github.io/GradioDemo/contributing/).
- > **Note on Project Names**: "The DETERMINATOR" is the product name, "DeepCritical" is the organization/project name, and "determinator" is the Python package name.
-
- ## Repository Information
-
- - **GitHub Repository**: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo) (source of truth, PRs, code review)
- - **HuggingFace Space**: [`DataQuests/DeepCritical`](https://huggingface.co/spaces/DataQuests/DeepCritical) (deployment/demo)
- - **Package Name**: `determinator` (Python package name in `pyproject.toml`)
-
- ## Git Workflow
-
- - `main`: Production-ready (GitHub)
- - `dev`: Development integration (GitHub)
- - Use feature branches: `yourname-dev`
- - **NEVER** push directly to `main` or `dev` on HuggingFace
- - GitHub is source of truth; HuggingFace is for deployment
-
- ### Dual Repository Setup
-
- This project uses a dual repository setup:
-
- - **GitHub (`DeepCritical/GradioDemo`)**: Source of truth for code, PRs, and code review
- - **HuggingFace (`DataQuests/DeepCritical`)**: Deployment target for the Gradio demo
-
- #### Remote Configuration
-
- When cloning, set up remotes as follows:
-
- ```bash
- # Clone from GitHub
- git clone https://github.com/DeepCritical/GradioDemo.git
- cd GradioDemo
-
- # Add HuggingFace remote (optional, for deployment)
- git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical
- ```
-
- **Important**: Never push directly to `main` or `dev` on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.
-
- ## Getting Started
-
- 1. **Fork the repository** on GitHub: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo)
- 2. **Clone your fork**:
-
- ```bash
- git clone https://github.com/yourusername/GradioDemo.git
- cd GradioDemo
- ```
-
- 3. **Install dependencies**:
-
- ```bash
- uv sync --all-extras
- uv run pre-commit install
- ```
-
- 4. **Create a feature branch**:
-
- ```bash
- git checkout -b yourname-feature-name
- ```
-
- 5. **Make your changes** following the guidelines below
- 6. **Run checks**:
-
- ```bash
- uv run ruff check src tests
- uv run mypy src
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
- ```
-
- 7. **Commit and push**:
-
- ```bash
- git commit -m "Description of changes"
- git push origin yourname-feature-name
- ```
-
- 8. **Create a pull request** on GitHub
-
- ## Package Manager
-
- This project uses [`uv`](https://github.com/astral-sh/uv) as the package manager. All commands should be prefixed with `uv run` to ensure they run in the correct environment.
-
- ### Installation
-
- ```bash
- # Install uv if you haven't already (recommended: standalone installer)
- # Unix/macOS/Linux:
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Windows (PowerShell):
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
- # Alternative: pipx install uv
- # Or: pip install uv
-
- # Sync all dependencies including dev extras
- uv sync --all-extras
-
- # Install pre-commit hooks
- uv run pre-commit install
- ```
-
- ## Development Commands
-
- ```bash
- # Installation
- uv sync --all-extras              # Install all dependencies including dev
- uv run pre-commit install         # Install pre-commit hooks
-
- # Code Quality Checks (run all before committing)
- uv run ruff check src tests       # Lint with ruff
- uv run ruff format src tests      # Format with ruff
- uv run mypy src                   # Type checking
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with coverage
-
- # Testing Commands
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire   # Run unit tests (excludes OpenAI tests)
- uv run pytest tests/ -v -m "huggingface" -p no:logfire       # Run HuggingFace tests
- uv run pytest tests/ -v -p no:logfire                        # Run all tests
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with terminal coverage
- uv run pytest --cov=src --cov-report=html -p no:logfire      # Generate HTML coverage report (opens htmlcov/index.html)
-
- # Documentation Commands
- uv run mkdocs build               # Build documentation
- uv run mkdocs serve               # Serve documentation locally (http://127.0.0.1:8000)
- ```
-
- ### Test Markers
-
- The project uses pytest markers to categorize tests. See [Testing Guidelines](docs/contributing/testing.md) for details:
-
- - `unit`: Unit tests (mocked, fast)
- - `integration`: Integration tests (real APIs)
- - `slow`: Slow tests
- - `openai`: Tests requiring OpenAI API key
- - `huggingface`: Tests requiring HuggingFace API key
- - `embedding_provider`: Tests requiring API-based embedding providers
- - `local_embeddings`: Tests using local embeddings
-
- **Note**: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.
-
- ## Code Style & Conventions
-
- ### Type Safety
-
- - **ALWAYS** use type hints for all function parameters and return types
- - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
- - Use `TYPE_CHECKING` imports for circular dependencies:
-
- <!--codeinclude-->
- [TYPE_CHECKING Import Pattern](../src/utils/citation_validator.py) start_line:8 end_line:11
- <!--/codeinclude-->
-
- ### Pydantic Models
-
- - All data exchange uses Pydantic models (`src/utils/models.py`)
- - Models are frozen (`model_config = {"frozen": True}`) for immutability
- - Use `Field()` with descriptions for all model fields
- - Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints
-
- ### Async Patterns
-
- - **ALL** I/O operations must be async (`async def`, `await`)
- - Use `asyncio.gather()` for parallel operations
- - CPU-bound work (embeddings, parsing) must use `run_in_executor()`:
-
- ```python
- loop = asyncio.get_running_loop()
- result = await loop.run_in_executor(None, cpu_bound_function, args)
- ```
-
- - Never block the event loop with synchronous I/O
-
- ### Linting
-
- - Ruff with 100-char line length
- - Ignore rules documented in `pyproject.toml`:
-   - `PLR0913`: Too many arguments (agents need many params)
-   - `PLR0912`: Too many branches (complex orchestrator logic)
-   - `PLR0911`: Too many return statements (complex agent logic)
-   - `PLR2004`: Magic values (statistical constants)
-   - `PLW0603`: Global statement (singleton pattern)
-   - `PLC0415`: Lazy imports for optional dependencies
-
- ### Pre-commit
-
- - Pre-commit hooks run automatically on commit
- - Must pass: lint + typecheck + test-cov
- - Install hooks with: `uv run pre-commit install`
- - Note: `uv sync --all-extras` installs the pre-commit package, but you must run `uv run pre-commit install` separately to set up the git hooks
-
- ## Error Handling & Logging
-
- ### Exception Hierarchy
-
- Use custom exception hierarchy (`src/utils/exceptions.py`):
-
- <!--codeinclude-->
- [Exception Hierarchy](../src/utils/exceptions.py) start_line:4 end_line:31
- <!--/codeinclude-->
-
- ### Error Handling Rules
-
- - Always chain exceptions: `raise SearchError(...) from e`
- - Log errors with context using `structlog`:
-
- ```python
- logger.error("Operation failed", error=str(e), context=value)
- ```
-
- - Never silently swallow exceptions
- - Provide actionable error messages
-
- ### Logging
-
- - Use `structlog` for all logging (NOT `print` or `logging`)
- - Import: `import structlog; logger = structlog.get_logger()`
- - Log with structured data: `logger.info("event", key=value)`
- - Use appropriate levels: DEBUG, INFO, WARNING, ERROR
-
- ### Logging Examples
-
- ```python
- logger.info("Starting search", query=query, tools=[t.name for t in tools])
- logger.warning("Search tool failed", tool=tool.name, error=str(result))
- logger.error("Assessment failed", error=str(e))
- ```
-
- ### Error Chaining
-
- Always preserve exception context:
-
- ```python
- try:
-     result = await api_call()
- except httpx.HTTPError as e:
-     raise SearchError(f"API call failed: {e}") from e
- ```
-
- ## Testing Requirements
-
- ### Test Structure
-
- - Unit tests in `tests/unit/` (mocked, fast)
- - Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`)
- - Use markers: `unit`, `integration`, `slow`
-
- ### Mocking
-
- - Use `respx` for httpx mocking
- - Use `pytest-mock` for general mocking
- - Mock LLM calls in unit tests (use `MockJudgeHandler`)
- - Fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`
-
- ### TDD Workflow
-
- 1. Write failing test in `tests/unit/`
- 2. Implement in `src/`
- 3. Ensure test passes
- 4. Run checks: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
-
- ### Test Examples
-
- ```python
- @pytest.mark.unit
- async def test_pubmed_search(mock_httpx_client):
-     tool = PubMedTool()
-     results = await tool.search("metformin", max_results=5)
-     assert len(results) > 0
-     assert all(isinstance(r, Evidence) for r in results)
-
- @pytest.mark.integration
- async def test_real_pubmed_search():
-     tool = PubMedTool()
-     results = await tool.search("metformin", max_results=3)
-     assert len(results) <= 3
- ```
-
- ### Test Coverage
-
- - Run `uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire` for coverage report
- - Run `uv run pytest --cov=src --cov-report=html -p no:logfire` for HTML coverage report (opens `htmlcov/index.html`)
- - Aim for >80% coverage on critical paths
- - Exclude: `__init__.py`, `TYPE_CHECKING` blocks
-
- ## Implementation Patterns
-
- ### Search Tools
-
- All tools implement `SearchTool` protocol (`src/tools/base.py`):
-
- - Must have `name` property
- - Must implement `async def search(query, max_results) -> list[Evidence]`
- - Use `@retry` decorator from tenacity for resilience
- - Rate limiting: Implement `_rate_limit()` for APIs with limits (e.g., PubMed)
- - Error handling: Raise `SearchError` or `RateLimitError` on failures
-
- Example pattern:
-
- ```python
- class MySearchTool:
-     @property
-     def name(self) -> str:
-         return "mytool"
-
-     @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))
-     async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
-         # Implementation
-         return evidence_list
- ```
-
- ### Judge Handlers
-
- - Implement `JudgeHandlerProtocol` (`async def assess(question, evidence) -> JudgeAssessment`)
- - Use pydantic-ai `Agent` with `output_type=JudgeAssessment`
- - System prompts in `src/prompts/judge.py`
- - Support fallback handlers: `MockJudgeHandler`, `HFInferenceJudgeHandler`
- - Always return valid `JudgeAssessment` (never raise exceptions)
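The "never raise" contract can be sketched with a dependency-free stand-in (the `JudgeAssessment` fields and the sufficiency heuristic are assumptions, not the real model or the real `MockJudgeHandler`):

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol

@dataclass
class JudgeAssessment:  # simplified stand-in for the real Pydantic model
    sufficient: bool
    reasoning: str

class JudgeHandlerProtocol(Protocol):
    async def assess(self, question: str, evidence: list[str]) -> JudgeAssessment: ...

class MockJudgeHandler:
    """Fallback handler: always returns an assessment, never raises."""

    async def assess(self, question: str, evidence: list[str]) -> JudgeAssessment:
        try:
            sufficient = len(evidence) >= 3  # hypothetical heuristic for the sketch
            return JudgeAssessment(sufficient=sufficient, reasoning="mock heuristic")
        except Exception:
            # The contract above: degrade to a conservative default instead of raising.
            return JudgeAssessment(sufficient=False, reasoning="assessment failed")

result = asyncio.run(MockJudgeHandler().assess("Does metformin activate AMPK?", ["e1"]))
assert result.sufficient is False
```

Because the orchestrator drives its loop off the assessment, a judge that raises would abort the whole research run; returning a conservative default keeps the loop alive.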
-
- ### Agent Factory Pattern
-
- - Use factory functions for creating agents (`src/agent_factory/`)
- - Lazy initialization for optional dependencies (e.g., embeddings, Modal)
- - Check requirements before initialization:
-
- <!--codeinclude-->
- [Check Magentic Requirements](../src/utils/llm_factory.py) start_line:152 end_line:170
- <!--/codeinclude-->
-
- ### State Management
-
- - **Magentic Mode**: Use `ContextVar` for thread-safe state (`src/agents/state.py`)
- - **Simple Mode**: Pass state via function parameters
- - Never use global mutable state (except singletons via `@lru_cache`)
-
- ### Singleton Pattern
-
- Use `@lru_cache(maxsize=1)` for singletons:
-
- <!--codeinclude-->
- [Singleton Pattern Example](../src/services/statistical_analyzer.py) start_line:252 end_line:255
- <!--/codeinclude-->
-
- - Lazy initialization to avoid requiring dependencies at import time
-
- ## Code Quality & Documentation
-
- ### Docstrings
-
- - Google-style docstrings for all public functions
- - Include Args, Returns, Raises sections
- - Use type hints in docstrings only if needed for clarity
-
- Example:
-
- <!--codeinclude-->
- [Search Method Docstring Example](../src/tools/pubmed.py) start_line:51 end_line:58
- <!--/codeinclude-->
-
- ### Code Comments
-
- - Explain WHY, not WHAT
- - Document non-obvious patterns (e.g., why `requests` not `httpx` for ClinicalTrials)
- - Mark critical sections: `# CRITICAL: ...`
- - Document rate limiting rationale
- - Explain async patterns when non-obvious
-
- ## Prompt Engineering & Citation Validation
-
- ### Judge Prompts
-
- - System prompt in `src/prompts/judge.py`
- - Format evidence with truncation (1500 chars per item)
- - Handle empty evidence case separately
- - Always request structured JSON output
- - Use `format_user_prompt()` and `format_empty_evidence_prompt()` helpers
-
- ### Hypothesis Prompts
-
- - Use diverse evidence selection (MMR algorithm)
- - Sentence-aware truncation (`truncate_at_sentence()`)
- - Format: Drug → Target → Pathway → Effect
- - System prompt emphasizes mechanistic reasoning
- - Use `format_hypothesis_prompt()` with embeddings for diversity
-
- ### Report Prompts
-
- - Include full citation details for validation
- - Use diverse evidence selection (n=20)
- - **CRITICAL**: Emphasize citation validation rules
- - Format hypotheses with support/contradiction counts
- - System prompt includes explicit JSON structure requirements
-
- ### Citation Validation
-
- - **ALWAYS** validate references before returning reports
- - Use `validate_references()` from `src/utils/citation_validator.py`
- - Remove hallucinated citations (URLs not in evidence)
- - Log warnings for removed citations
- - Never trust LLM-generated citations without validation
-
- ### Citation Validation Rules
-
- 1. Every reference URL must EXACTLY match a provided evidence URL
- 2. Do NOT invent, fabricate, or hallucinate any references
- 3. Do NOT modify paper titles, authors, dates, or URLs
- 4. If unsure about a citation, OMIT it rather than guess
- 5. Copy URLs exactly as provided - do not create similar-looking URLs
-
- ### Evidence Selection
-
- - Use `select_diverse_evidence()` for MMR-based selection
- - Balance relevance vs diversity (lambda=0.7 default)
- - Sentence-aware truncation preserves meaning
- - Limit evidence per prompt to avoid context overflow
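The MMR selection rule above can be sketched as follows. This is a toy version under stated assumptions: relevance scores and the similarity function are supplied by the caller, whereas the real `select_diverse_evidence()` computes similarity from embeddings:

```python
def select_diverse(candidates: list[str], relevance: dict[str, float],
                   sim, k: int, lam: float = 0.7) -> list[str]:
    """MMR: score = lam * relevance - (1 - lam) * max similarity to already picked."""
    selected: list[str] = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def mmr(c: str) -> float:
            redundancy = max((sim(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1 - lam) * redundancy
        best = max(pool, key=mmr)   # greedily take the best relevance/diversity trade-off
        selected.append(best)
        pool.remove(best)
    return selected

# Toy similarity: 1.0 when the first word matches, else 0.0.
sim = lambda a, b: 1.0 if a.split()[0] == b.split()[0] else 0.0
docs = ["drug A trial", "drug A review", "pathway B study"]
rel = {"drug A trial": 0.9, "drug A review": 0.85, "pathway B study": 0.6}
assert select_diverse(docs, rel, sim, k=2) == ["drug A trial", "pathway B study"]
```

With `lam=0.7` the second pick skips the near-duplicate "drug A review" (penalized by its similarity to the first pick) in favor of the less relevant but novel "pathway B study", which is exactly the relevance/diversity balance the bullet list describes.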
432
-
433
- ## MCP Integration
434
-
435
- ### MCP Tools
436
-
437
- - Functions in `src/mcp_tools.py` for Claude Desktop
438
- - Full type hints required
439
- - Google-style docstrings with Args/Returns sections
440
- - Formatted string returns (markdown)
441
-
442
- ### Gradio MCP Server
443
-
444
- - Enable with `mcp_server=True` in `demo.launch()`
445
- - Endpoint: `/gradio_api/mcp/`
446
- - Use `ssr_mode=False` to fix hydration issues in HF Spaces
447
-
- ## Common Pitfalls
-
- 1. **Blocking the event loop**: Never use sync I/O in async functions
- 2. **Missing type hints**: All functions must have complete type annotations
- 3. **Hallucinated citations**: Always validate references
- 4. **Global mutable state**: Use ContextVar or pass via parameters
- 5. **Import errors**: Lazy-load optional dependencies (magentic, modal, embeddings)
- 6. **Rate limiting**: Always implement for external APIs
- 7. **Error chaining**: Always use `from e` when raising exceptions
-
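Pitfalls 1 and 7 in one runnable sketch: CPU-bound work is offloaded with `run_in_executor()` so the event loop stays responsive, and failures are re-raised with `from e` to preserve the traceback. `SearchError` here is a stand-in for the project's exception class:

```python
import asyncio
import hashlib

class SearchError(Exception):
    """Stand-in for src/utils/exceptions.SearchError."""

def expensive_digest(data: bytes) -> str:
    # CPU-bound: must not run directly inside a coroutine
    return hashlib.sha256(data).hexdigest()

async def fetch_and_digest(data: bytes) -> str:
    loop = asyncio.get_running_loop()
    try:
        # Offload to the default executor so the event loop is not blocked
        return await loop.run_in_executor(None, expensive_digest, data)
    except Exception as e:
        # Chain the exception: the original context survives in __cause__
        raise SearchError("digest failed") from e
```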
- ## Key Principles
-
- 1. **Type Safety First**: All code must pass `mypy --strict`
- 2. **Async Everything**: All I/O must be async
- 3. **Test-Driven**: Write tests before implementation
- 4. **No Hallucinations**: Validate all citations
- 5. **Graceful Degradation**: Support free tier (HF Inference) when no API keys
- 6. **Lazy Loading**: Don't require optional dependencies at import time
- 7. **Structured Logging**: Use structlog, never print()
- 8. **Error Chaining**: Always preserve exception context
-
- ## Pull Request Process
-
- 1. Ensure all checks pass: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
- 2. Update documentation if needed
- 3. Add tests for new features
- 4. Update CHANGELOG if applicable
- 5. Request review from maintainers
- 6. Address review feedback
- 7. Wait for approval before merging
-
- ## Project Structure
-
- - `src/`: Main source code
- - `tests/`: Test files (`unit/` and `integration/`)
- - `docs/`: Documentation source files (MkDocs)
- - `examples/`: Example usage scripts
- - `pyproject.toml`: Project configuration and dependencies
- - `.pre-commit-config.yaml`: Pre-commit hook configuration
-
- ## Questions?
-
- - Open an issue on [GitHub](https://github.com/DeepCritical/GradioDemo)
- - Check existing [documentation](https://deepcritical.github.io/GradioDemo/)
- - Review code examples in the codebase
-
- Thank you for contributing to The DETERMINATOR!
Dockerfile DELETED
@@ -1,52 +0,0 @@
- # Dockerfile for DeepCritical
- FROM python:3.11-slim
-
- # Set working directory
- WORKDIR /app
-
- # Install system dependencies (curl needed for HEALTHCHECK)
- RUN apt-get update && apt-get install -y \
-     git \
-     curl \
-     && rm -rf /var/lib/apt/lists/*
-
- # Install uv
- RUN pip install uv==0.5.4
-
- # Copy project files
- COPY pyproject.toml .
- COPY uv.lock .
- COPY src/ src/
- COPY README.md .
-
- # Install runtime dependencies only (no dev/test tools)
- RUN uv sync --frozen --no-dev --extra embeddings --extra magentic
-
- # Create non-root user BEFORE downloading models
- RUN useradd --create-home --shell /bin/bash appuser
-
- # Set cache directory for HuggingFace models (must be writable by appuser)
- ENV HF_HOME=/app/.cache
- ENV TRANSFORMERS_CACHE=/app/.cache
-
- # Create cache dir with correct ownership
- RUN mkdir -p /app/.cache && chown -R appuser:appuser /app/.cache
-
- # Pre-download the embedding model during build (as appuser to set correct ownership)
- USER appuser
- RUN uv run python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
-
- # Expose port
- EXPOSE 7860
-
- # Health check
- HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-     CMD curl -f http://localhost:7860/ || exit 1
-
- # Set environment variables
- ENV GRADIO_SERVER_NAME=0.0.0.0
- ENV GRADIO_SERVER_PORT=7860
- ENV PYTHONPATH=/app
-
- # Run the app
- CMD ["uv", "run", "python", "-m", "src.app"]
LICENSE.md DELETED
@@ -1,25 +0,0 @@
- # License
-
- DeepCritical is licensed under the MIT License.
-
- ## MIT License
-
- Copyright (c) 2024 DeepCritical Team
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
README.md CHANGED
@@ -1,63 +1,15 @@
  ---
- title: The DETERMINATOR
- emoji: 🐉
- colorFrom: red
- colorTo: yellow
  sdk: gradio
- sdk_version: "6.0.1"
- python_version: "3.11"
  app_file: src/app.py
- hf_oauth: true
- hf_oauth_expiration_minutes: 480
- hf_oauth_scopes:
-   # Required for HuggingFace Inference API (includes all third-party providers)
-   # This scope grants access to:
-   #   - HuggingFace's own Inference API
-   #   - Third-party inference providers (nebius, together, scaleway, hyperbolic, novita, nscale, sambanova, ovh, fireworks, etc.)
-   #   - All models available through the Inference Providers API
-   - inference-api
-   # Optional: Uncomment if you need to access user's billing information
-   # - read-billing
- pinned: true
  license: mit
- tags:
-   - mcp-in-action-track-enterprise
-   - mcp-hackathon
-   - deep-research
-   - biomedical-ai
-   - pydantic-ai
-   - llamaindex
-   - modal
-   - building-mcp-track-enterprise
-   - building-mcp-track-consumer
-   - mcp-in-action-track-consumer
-   - building-mcp-track-modal
-   - building-mcp-track-blaxel
-   - building-mcp-track-llama-index
-   - building-mcp-track-HUGGINGFACE
  ---

- > [!IMPORTANT]
- > **You are reading the Gradio demo README!**
- >
- > - 📚 **Documentation**: See our [technical documentation](https://deepcritical.github.io/GradioDemo/) for detailed information
- > - 📖 **Complete README**: Check out the [GitHub README](.github/README.md) for setup, configuration, and contribution guidelines
- > - ⚠️ **This README is for the Gradio demo only!**

- <div align="center">
-
- [![GitHub](https://img.shields.io/github/stars/DeepCritical/GradioDemo?style=for-the-badge&logo=github&logoColor=white&label=GitHub&labelColor=181717&color=181717)](https://github.com/DeepCritical/GradioDemo)
- [![Documentation](https://img.shields.io/badge/Docs-0080FF?style=for-the-badge&logo=readthedocs&logoColor=white&labelColor=0080FF&color=0080FF)](https://deepcritical.github.io/GradioDemo/)
- [![Demo](https://img.shields.io/badge/Demo-FFD21E?style=for-the-badge&logo=huggingface&logoColor=white&labelColor=FFD21E&color=FFD21E)](https://huggingface.co/spaces/DataQuests/DeepCritical)
- [![codecov](https://codecov.io/gh/DeepCritical/GradioDemo/graph/badge.svg?token=B1f05RCGpz)](https://codecov.io/gh/DeepCritical/GradioDemo)
- [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP)
-
- </div>
-
- # The DETERMINATOR
-
- ## About
-
- The DETERMINATOR is a powerful generalist deep research agent system that stops at nothing until finding precise answers to complex questions. It uses iterative search-and-judge loops to comprehensively investigate any research question from any domain.
 
  ---
+ title: DeepCritical
+ emoji: 📈
+ colorFrom: blue
+ colorTo: purple
  sdk: gradio
+ sdk_version: 6.0.0
  app_file: src/app.py
+ pinned: false
  license: mit
+ short_description: Deep Search for Critical Research [BigData] -> [Actionable]
  ---

+ ### DeepCritical
deployments/README.md DELETED
@@ -1,46 +0,0 @@
- # Deployments
-
- This directory contains infrastructure deployment scripts for DeepCritical services.
-
- ## Modal Deployments
-
- ### TTS Service (`modal_tts.py`)
-
- Deploys the Kokoro TTS (Text-to-Speech) function to Modal's GPU infrastructure.
-
- **Deploy:**
- ```bash
- modal deploy deployments/modal_tts.py
- ```
-
- **Features:**
- - Kokoro 82M TTS model
- - GPU-accelerated (T4)
- - Voice options: af_heart, af_bella, am_michael, etc.
- - Configurable speech speed
-
- **Requirements:**
- - Modal account and credentials (`MODAL_TOKEN_ID`, `MODAL_TOKEN_SECRET` in `.env`)
- - GPU quota on Modal
-
- **After Deployment:**
- The function will be available at:
- - App: `deepcritical-tts`
- - Function: `kokoro_tts_function`
-
- The main application (`src/services/tts_modal.py`) will call this deployed function.
-
- ---
-
- ## Adding New Deployments
-
- When adding new deployment scripts:
-
- 1. Create a new file: `deployments/<service_name>.py`
- 2. Use Modal's app pattern:
-    ```python
-    import modal
-    app = modal.App("deepcritical-<service-name>")
-    ```
- 3. Document in this README
- 4. Test deployment: `modal deploy deployments/<service_name>.py`
deployments/modal_tts.py DELETED
@@ -1,97 +0,0 @@
- """Deploy Kokoro TTS function to Modal.
-
- This script deploys the TTS function to Modal so it can be called
- from the main DeepCritical application.
-
- Usage:
-     modal deploy deployments/modal_tts.py
-
- After deployment, the function will be available at:
-     App: deepcritical-tts
-     Function: kokoro_tts_function
- """
-
- import modal
- import numpy as np
-
- # Create Modal app
- app = modal.App("deepcritical-tts")
-
- # Define Kokoro TTS dependencies
- KOKORO_DEPENDENCIES = [
-     "torch>=2.0.0",
-     "transformers>=4.30.0",
-     "numpy<2.0",
- ]
-
- # Create Modal image with Kokoro
- tts_image = (
-     modal.Image.debian_slim(python_version="3.11")
-     .apt_install("git")  # Install git first for pip install from GitHub
-     .pip_install(*KOKORO_DEPENDENCIES)
-     .pip_install("git+https://github.com/hexgrad/kokoro.git")
- )
-
-
- @app.function(
-     image=tts_image,
-     gpu="T4",
-     timeout=60,
- )
- def kokoro_tts_function(text: str, voice: str, speed: float) -> tuple[int, np.ndarray]:
-     """Modal GPU function for Kokoro TTS.
-
-     This function runs on Modal's GPU infrastructure.
-     Based on: https://huggingface.co/spaces/hexgrad/Kokoro-TTS
-
-     Args:
-         text: Text to synthesize
-         voice: Voice ID (e.g., af_heart, af_bella, am_michael)
-         speed: Speech speed multiplier (0.5-2.0)
-
-     Returns:
-         Tuple of (sample_rate, audio_array)
-     """
-     import numpy as np
-
-     try:
-         import torch  # noqa: F401 - ensures torch is present before kokoro imports
-         from kokoro import KModel, KPipeline
-
-         # Initialize model (cached on GPU)
-         model = KModel().to("cuda").eval()
-         pipeline = KPipeline(lang_code=voice[0])
-         pack = pipeline.load_voice(voice)
-
-         # Generate audio - accumulate all chunks
-         audio_chunks = []
-         for _, ps, _ in pipeline(text, voice, speed):
-             ref_s = pack[len(ps) - 1]
-             audio = model(ps, ref_s, speed)
-             # Move to CPU before .numpy(): the output tensor lives on the GPU
-             audio_chunks.append(audio.cpu().numpy())
-
-         # Concatenate all audio chunks
-         if audio_chunks:
-             full_audio = np.concatenate(audio_chunks)
-             return (24000, full_audio)
-
-         # If no audio generated, return empty
-         return (24000, np.zeros(1, dtype=np.float32))
-
-     except ImportError as e:
-         raise RuntimeError(
-             f"Kokoro not installed: {e}. "
-             "Install with: pip install git+https://github.com/hexgrad/kokoro.git"
-         ) from e
-     except Exception as e:
-         raise RuntimeError(f"TTS synthesis failed: {e}") from e
-
-
- # Optional: Add a test entrypoint
- @app.local_entrypoint()
- def test():
-     """Test the TTS function."""
-     print("Testing Modal TTS function...")
-     sample_rate, audio = kokoro_tts_function.remote("Hello, this is a test.", "af_heart", 1.0)
-     print(f"Generated audio: {sample_rate}Hz, shape={audio.shape}")
-     print("✓ TTS function works!")
dev/.cursorrules DELETED
@@ -1,241 +0,0 @@
- # DeepCritical Project - Cursor Rules
-
- ## Project-Wide Rules
-
- **Architecture**: Multi-agent research system using Pydantic AI for agent orchestration, supporting iterative and deep research patterns. Uses middleware for state management, budget tracking, and workflow coordination.
-
- **Type Safety**: ALWAYS use complete type hints. All functions must have parameter and return type annotations. Use `mypy --strict` compliance. Use `TYPE_CHECKING` imports for circular dependencies: `from typing import TYPE_CHECKING; if TYPE_CHECKING: from src.services.embeddings import EmbeddingService`
-
- **Async Patterns**: ALL I/O operations must be async (`async def`, `await`). Use `asyncio.gather()` for parallel operations. CPU-bound work must use `run_in_executor()`: `loop = asyncio.get_running_loop(); result = await loop.run_in_executor(None, cpu_bound_function, args)`. Never block the event loop.
-
- **Error Handling**: Use custom exceptions from `src/utils/exceptions.py`: `DeepCriticalError`, `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions: `raise SearchError(...) from e`. Log with structlog: `logger.error("Operation failed", error=str(e), context=value)`.
-
- **Logging**: Use `structlog` for ALL logging (NOT `print` or `logging`). Import: `import structlog; logger = structlog.get_logger()`. Log with structured data: `logger.info("event", key=value)`. Use appropriate levels: DEBUG, INFO, WARNING, ERROR.
-
- **Pydantic Models**: All data exchange uses Pydantic models from `src/utils/models.py`. Models are frozen (`model_config = {"frozen": True}`) for immutability. Use `Field()` with descriptions. Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints.
-
- **Code Style**: Ruff with 100-char line length. Ignore rules: `PLR0913` (too many arguments), `PLR0912` (too many branches), `PLR0911` (too many returns), `PLR2004` (magic values), `PLW0603` (global statement), `PLC0415` (lazy imports).
-
- **Docstrings**: Google-style docstrings for all public functions. Include Args, Returns, Raises sections. Use type hints in docstrings only if needed for clarity.
-
- **Testing**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`). Use `respx` for httpx mocking, `pytest-mock` for general mocking.
-
- **State Management**: Use `ContextVar` in middleware for thread-safe isolation. Never use global mutable state (except singletons via `@lru_cache`). Use `WorkflowState` from `src/middleware/state_machine.py` for workflow state.
-
- **Citation Validation**: ALWAYS validate references before returning reports. Use `validate_references()` from `src/utils/citation_validator.py`. Remove hallucinated citations. Log warnings for removed citations.
-
- ---
-
- ## src/agents/ - Agent Implementation Rules
-
- **Pattern**: All agents use Pydantic AI `Agent` class. Agents have structured output types (Pydantic models) or return strings. Use factory functions in `src/agent_factory/agents.py` for creation.
-
- **Agent Structure**:
- - System prompt as module-level constant (with date injection: `datetime.now().strftime("%Y-%m-%d")`)
- - Agent class with `__init__(model: Any | None = None)`
- - Main method (e.g., `async def evaluate()`, `async def write_report()`)
- - Factory function: `def create_agent_name(model: Any | None = None) -> AgentName`
-
- **Model Initialization**: Use `get_model()` from `src/agent_factory/judges.py` if no model provided. Support OpenAI/Anthropic/HF Inference via settings.
-
- **Error Handling**: Return fallback values (e.g., `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`) on failure. Log errors with context. Use retry logic (3 retries) in Pydantic AI Agent initialization.
-
- **Input Validation**: Validate query/inputs are not empty. Truncate very long inputs with warnings. Handle None values gracefully.
-
- **Output Types**: Use structured output types from `src/utils/models.py` (e.g., `KnowledgeGapOutput`, `AgentSelectionPlan`, `ReportDraft`). For text output (writer agents), return `str` directly.
-
- **Agent-Specific Rules**:
- - `knowledge_gap.py`: Outputs `KnowledgeGapOutput`. Evaluates research completeness.
- - `tool_selector.py`: Outputs `AgentSelectionPlan`. Selects tools (RAG/web/database).
- - `writer.py`: Returns markdown string. Includes citations in numbered format.
- - `long_writer.py`: Uses `ReportDraft` input/output. Handles section-by-section writing.
- - `proofreader.py`: Takes `ReportDraft`, returns polished markdown.
- - `thinking.py`: Returns observation string from conversation history.
- - `input_parser.py`: Outputs `ParsedQuery` with research mode detection.
-
- ---
-
- ## src/tools/ - Search Tool Rules
-
- **Protocol**: All tools implement `SearchTool` protocol from `src/tools/base.py`: `name` property and `async def search(query, max_results) -> list[Evidence]`.
-
- **Rate Limiting**: Use `@retry` decorator from tenacity: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))`. Implement `_rate_limit()` method for APIs with limits. Use shared rate limiters from `src/tools/rate_limiter.py`.
-
- **Error Handling**: Raise `SearchError` or `RateLimitError` on failures. Handle HTTP errors (429, 500, timeout). Return empty list on non-critical errors (log warning).
-
- **Query Preprocessing**: Use `preprocess_query()` from `src/tools/query_utils.py` to remove noise and expand synonyms.
-
- **Evidence Conversion**: Convert API responses to `Evidence` objects with `Citation`. Extract metadata (title, url, date, authors). Set relevance scores (0.0-1.0). Handle missing fields gracefully.
-
- **Tool-Specific Rules**:
- - `pubmed.py`: Use NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Parse XML with `xmltodict`. Handle single vs. multiple articles.
- - `clinicaltrials.py`: Use `requests` library (NOT httpx - WAF blocks httpx). Run in thread pool: `await asyncio.to_thread(requests.get, ...)`. Filter: Only interventional studies, active/completed.
- - `europepmc.py`: Handle preprint markers: `[PREPRINT - Not peer-reviewed]`. Build URLs from DOI or PMID.
- - `rag_tool.py`: Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results. Handles ingestion.
- - `search_handler.py`: Orchestrates parallel searches across multiple tools. Uses `asyncio.gather()` with `return_exceptions=True`. Aggregates results into `SearchResult`.
-
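The shared-limiter idea behind `_rate_limit()` can be sketched with the standard library alone (the project's `src/tools/rate_limiter.py` may differ in detail); an interval of 0.34s keeps PubMed traffic under NCBI's 3-requests-per-second limit:

```python
import asyncio
import time

class RateLimiter:
    """Minimal async rate limiter: at most one request per `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self._interval = interval
        self._last = 0.0
        self._lock = asyncio.Lock()  # serializes concurrent callers

    async def wait(self) -> None:
        async with self._lock:
            now = time.monotonic()
            delay = self._interval - (now - self._last)
            if delay > 0:
                await asyncio.sleep(delay)
            self._last = time.monotonic()
```

A tool would call `await limiter.wait()` immediately before each HTTP request, sharing one limiter instance per upstream API.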
- ---
-
- ## src/middleware/ - Middleware Rules
-
- **State Management**: Use `ContextVar` for thread-safe isolation. `WorkflowState` uses `ContextVar[WorkflowState | None]`. Initialize with `init_workflow_state(embedding_service)`. Access with `get_workflow_state()` (auto-initializes if missing).
-
- **WorkflowState**: Tracks `evidence: list[Evidence]`, `conversation: Conversation`, `embedding_service: Any`. Methods: `add_evidence()` (deduplicates by URL), `async search_related()` (semantic search).
-
- **WorkflowManager**: Manages parallel research loops. Methods: `add_loop()`, `run_loops_parallel()`, `update_loop_status()`, `sync_loop_evidence_to_state()`. Uses `asyncio.gather()` for parallel execution. Handles errors per loop (don't fail all if one fails).
-
- **BudgetTracker**: Tracks tokens, time, iterations per loop and globally. Methods: `create_budget()`, `add_tokens()`, `start_timer()`, `update_timer()`, `increment_iteration()`, `check_budget()`, `can_continue()`. Token estimation: `estimate_tokens(text)` (~4 chars per token), `estimate_llm_call_tokens(prompt, response)`.
-
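The ~4-characters-per-token heuristic used by `BudgetTracker` can be sketched as follows (illustrative, not the project's exact code):

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    """Estimate total tokens consumed by one LLM round-trip."""
    return estimate_tokens(prompt) + estimate_tokens(response)
```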
- **Models**: All middleware models in `src/utils/models.py`. `IterationData`, `Conversation`, `ResearchLoop`, `BudgetStatus` are used by middleware.
-
- ---
-
- ## src/orchestrator/ - Orchestration Rules
-
- **Research Flows**: Two patterns: `IterativeResearchFlow` (single loop) and `DeepResearchFlow` (plan → parallel loops → synthesis). Both support agent chains (`use_graph=False`) and graph execution (`use_graph=True`).
-
- **IterativeResearchFlow**: Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete. Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`, `JudgeHandler`. Tracks iterations, time, budget.
-
- **DeepResearchFlow**: Pattern: Planner → Parallel iterative loops per section → Synthesizer. Uses `PlannerAgent`, `IterativeResearchFlow` (per section), `LongWriterAgent` or `ProofreaderAgent`. Uses `WorkflowManager` for parallel execution.
-
- **Graph Orchestrator**: Uses Pydantic AI Graphs (when available) or agent chains (fallback). Routes based on research mode (iterative/deep/auto). Streams `AgentEvent` objects for UI.
-
- **State Initialization**: Always call `init_workflow_state()` before running flows. Initialize `BudgetTracker` per loop. Use `WorkflowManager` for parallel coordination.
-
- **Event Streaming**: Yield `AgentEvent` objects during execution. Event types: "started", "search_complete", "judge_complete", "hypothesizing", "synthesizing", "complete", "error". Include iteration numbers and data payloads.
-
- ---
-
- ## src/services/ - Service Rules
-
- **EmbeddingService**: Local sentence-transformers (NO API key required). All operations async-safe via `run_in_executor()`. ChromaDB for vector storage. Deduplication threshold: 0.85 (85% similarity = duplicate).
-
- **LlamaIndexRAGService**: Uses OpenAI embeddings (requires `OPENAI_API_KEY`). Methods: `ingest_evidence()`, `retrieve()`, `query()`. Returns documents with metadata (source, title, url, date, authors). Lazy initialization with graceful fallback.
-
- **StatisticalAnalyzer**: Generates Python code via LLM. Executes in Modal sandbox (secure, isolated). Library versions pinned in `SANDBOX_LIBRARIES` dict. Returns `AnalysisResult` with verdict (SUPPORTED/REFUTED/INCONCLUSIVE).
-
- **Singleton Pattern**: Use `@lru_cache(maxsize=1)` for singletons: `@lru_cache(maxsize=1); def get_service() -> Service: return Service()`. Lazy initialization to avoid requiring dependencies at import time.
-
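The singleton pattern above as a runnable sketch; the `EmbeddingService` body is a stand-in for the real service, which wraps sentence-transformers:

```python
from functools import lru_cache

class EmbeddingService:
    """Illustrative stand-in for src/services/embeddings.EmbeddingService."""

    def __init__(self) -> None:
        # Heavy initialization (model load, DB connection) happens exactly once
        self.ready = True

@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # Lazy singleton: nothing is constructed until the first call,
    # so optional dependencies are not required at import time
    return EmbeddingService()
```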
- ---
-
- ## src/utils/ - Utility Rules
-
- **Models**: All Pydantic models in `src/utils/models.py`. Use frozen models (`model_config = {"frozen": True}`) except where mutation needed. Use `Field()` with descriptions. Validate with constraints.
-
- **Config**: Settings via Pydantic Settings (`src/utils/config.py`). Load from `.env` automatically. Use `settings` singleton: `from src.utils.config import settings`. Validate API keys with properties: `has_openai_key`, `has_anthropic_key`.
-
- **Exceptions**: Custom exception hierarchy in `src/utils/exceptions.py`. Base: `DeepCriticalError`. Specific: `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions.
-
- **LLM Factory**: Centralized LLM model creation in `src/utils/llm_factory.py`. Supports OpenAI, Anthropic, HF Inference. Use `get_model()` or factory functions. Check requirements before initialization.
-
- **Citation Validator**: Use `validate_references()` from `src/utils/citation_validator.py`. Removes hallucinated citations (URLs not in evidence). Logs warnings. Returns validated report string.
-
- ---
-
- ## src/orchestrator_factory.py Rules
-
- **Purpose**: Factory for creating orchestrators. Supports "simple" (legacy) and "advanced" (magentic) modes. Auto-detects mode based on API key availability.
-
- **Pattern**: Lazy import for optional dependencies (`_get_magentic_orchestrator_class()`). Handles `ImportError` gracefully with clear error messages.
-
- **Mode Detection**: `_determine_mode()` checks explicit mode or auto-detects: "advanced" if `settings.has_openai_key`, else "simple". Maps "magentic" → "advanced".
-
- **Function Signature**: `create_orchestrator(search_handler, judge_handler, config, mode) -> Any`. Simple mode requires handlers. Advanced mode uses MagenticOrchestrator.
-
- **Error Handling**: Raise `ValueError` with clear messages if requirements not met. Log mode selection with structlog.
-
- ---
-
- ## src/orchestrator_hierarchical.py Rules
-
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams. Adapts Magentic ChatAgent to SubIterationTeam protocol.
-
- **Pattern**: Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`. Event-driven via callback queue.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated, but kept for compatibility).
-
- **Event Streaming**: Uses `asyncio.Queue` for event coordination. Yields `AgentEvent` objects. Handles event callback pattern with `asyncio.wait()`.
-
- **Error Handling**: Log errors with context. Yield error events. Process remaining events after task completion.
-
- ---
-
- ## src/orchestrator_magentic.py Rules
-
- **Purpose**: Magentic-based orchestrator using ChatAgent pattern. Each agent has internal LLM. Manager orchestrates agents.
-
- **Pattern**: Uses `MagenticBuilder` with participants (searcher, hypothesizer, judge, reporter). Manager uses `OpenAIChatClient`. Workflow built in `_build_workflow()`.
-
- **Event Processing**: `_process_event()` converts Magentic events to `AgentEvent`. Handles: `MagenticOrchestratorMessageEvent`, `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, `MagenticAgentDeltaEvent`, `WorkflowOutputEvent`.
-
- **Text Extraction**: `_extract_text()` defensively extracts text from messages. Priority: `.content` → `.text` → `str(message)`. Handles buggy message objects.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated).
-
- **Requirements**: Must call `check_magentic_requirements()` in `__init__`. Requires `agent-framework-core` and OpenAI API key.
-
- **Event Types**: Maps agent names to event types: "search" → "search_complete", "judge" → "judge_complete", "hypothes" → "hypothesizing", "report" → "synthesizing".
-
- ---
-
- ## src/agent_factory/ - Factory Rules
-
- **Pattern**: Factory functions for creating agents and handlers. Lazy initialization for optional dependencies. Support OpenAI/Anthropic/HF Inference.
-
- **Judges**: `create_judge_handler()` creates `JudgeHandler` with structured output (`JudgeAssessment`). Supports `MockJudgeHandler`, `HFInferenceJudgeHandler` as fallbacks.
-
- **Agents**: Factory functions in `agents.py` for all Pydantic AI agents. Pattern: `create_agent_name(model: Any | None = None) -> AgentName`. Use `get_model()` if model not provided.
-
- **Graph Builder**: `graph_builder.py` contains utilities for building research graphs. Supports iterative and deep research graph construction.
-
- **Error Handling**: Raise `ConfigurationError` if required API keys missing. Log agent creation. Handle import errors gracefully.
-
- ---
-
- ## src/prompts/ - Prompt Rules
-
- **Pattern**: System prompts stored as module-level constants. Include date injection: `datetime.now().strftime("%Y-%m-%d")`. Format evidence with truncation (1500 chars per item).
-
- **Judge Prompts**: In `judge.py`. Handle empty evidence case separately. Always request structured JSON output.
-
- **Hypothesis Prompts**: In `hypothesis.py`. Use diverse evidence selection (MMR algorithm). Sentence-aware truncation.
-
- **Report Prompts**: In `report.py`. Include full citation details. Use diverse evidence selection (n=20). Emphasize citation validation rules.
-
- ---
-
- ## Testing Rules
-
- **Structure**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`).
-
- **Mocking**: Use `respx` for httpx mocking. Use `pytest-mock` for general mocking. Mock LLM calls in unit tests (use `MockJudgeHandler`).
-
- **Fixtures**: Common fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`.
-
- **Coverage**: Aim for >80% coverage. Test error handling, edge cases, and integration paths.
-
- ---
-
- ## File-Specific Agent Rules
-
- **knowledge_gap.py**: Outputs `KnowledgeGapOutput`. System prompt evaluates research completeness. Handles conversation history. Returns fallback on error.
-
- **writer.py**: Returns markdown string. System prompt includes citation format examples. Validates inputs. Truncates long findings. Retry logic for transient failures.
-
- **long_writer.py**: Uses `ReportDraft` input/output. Writes sections iteratively. Reformats references (deduplicates, renumbers). Reformats section headings.
-
- **proofreader.py**: Takes `ReportDraft`, returns polished markdown. Removes duplicates. Adds summary. Preserves references.
-
- **tool_selector.py**: Outputs `AgentSelectionPlan`. System prompt lists available agents (WebSearchAgent, SiteCrawlerAgent, RAGAgent). Guidelines for when to use each.
-
- **thinking.py**: Returns observation string. Generates observations from conversation history. Uses query and background context.
-
- **input_parser.py**: Outputs `ParsedQuery`. Detects research mode (iterative/deep). Extracts entities and research questions. Improves/refines query.
-
 
dev/AGENTS.txt DELETED
@@ -1,236 +0,0 @@
- # DeepCritical Project - Rules
-
- ## Project-Wide Rules
-
- **Architecture**: Multi-agent research system using Pydantic AI for agent orchestration, supporting iterative and deep research patterns. Uses middleware for state management, budget tracking, and workflow coordination.
-
- **Type Safety**: ALWAYS use complete type hints. All functions must have parameter and return type annotations. Use `mypy --strict` compliance. Use `TYPE_CHECKING` imports for circular dependencies: `from typing import TYPE_CHECKING; if TYPE_CHECKING: from src.services.embeddings import EmbeddingService`
-
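The `TYPE_CHECKING` idiom quoted in that rule expands to roughly the following sketch (the `src.services.embeddings` path is the one quoted above; the function is illustrative):

```python
from __future__ import annotations  # annotations stay strings at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by mypy/IDEs, never executed, so a circular or
    # heavyweight import costs nothing at runtime.
    from src.services.embeddings import EmbeddingService


def count_texts(service: EmbeddingService, texts: list[str]) -> int:
    """Sketch: annotate with the lazily-imported type."""
    return len(texts)


# Runs fine even though src.services.embeddings is never imported here.
n = count_texts(None, ["alpha", "beta"])  # type: ignore[arg-type]
```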
- **Async Patterns**: ALL I/O operations must be async (`async def`, `await`). Use `asyncio.gather()` for parallel operations. CPU-bound work must use `run_in_executor()`: `loop = asyncio.get_running_loop(); result = await loop.run_in_executor(None, cpu_bound_function, args)`. Never block the event loop.
-
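The `run_in_executor()` snippet in that rule, made runnable (the CPU-bound function is a stand-in):

```python
import asyncio


def cpu_bound_function(n: int) -> int:
    # Stand-in for CPU-heavy work (parsing, scoring, embedding math).
    return sum(i * i for i in range(n))


async def main() -> int:
    loop = asyncio.get_running_loop()
    # Off-load CPU-bound work to the default thread pool so the
    # event loop stays free to schedule other coroutines.
    return await loop.run_in_executor(None, cpu_bound_function, 1_000)


result = asyncio.run(main())
```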
- **Error Handling**: Use custom exceptions from `src/utils/exceptions.py`: `DeepCriticalError`, `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions: `raise SearchError(...) from e`. Log with structlog: `logger.error("Operation failed", error=str(e), context=value)`.
-
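The `raise ... from e` chaining rule can be sketched as follows; the exception names mirror the hierarchy named above, while `fetch` and `search` are hypothetical:

```python
class DeepCriticalError(Exception):
    """Base class, mirroring the hierarchy described above."""


class SearchError(DeepCriticalError):
    """Raised when a search backend fails."""


def fetch(url: str) -> str:
    raise TimeoutError(f"timed out fetching {url}")


def search(url: str) -> str:
    try:
        return fetch(url)
    except TimeoutError as e:
        # `from e` stores the original error as __cause__, so the
        # traceback shows both the backend failure and the wrapper.
        raise SearchError(f"search failed for {url}") from e


cause = None
try:
    search("https://example.org")
except SearchError as err:
    cause = err.__cause__
```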
- **Logging**: Use `structlog` for ALL logging (NOT `print` or `logging`). Import: `import structlog; logger = structlog.get_logger()`. Log with structured data: `logger.info("event", key=value)`. Use appropriate levels: DEBUG, INFO, WARNING, ERROR.
-
- **Pydantic Models**: All data exchange uses Pydantic models from `src/utils/models.py`. Models are frozen (`model_config = {"frozen": True}`) for immutability. Use `Field()` with descriptions. Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints.
-
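A stdlib stand-in for the frozen-model rule (the real code uses Pydantic's `model_config = {"frozen": True}` and `Field(..., ge=0.0, le=1.0)`; a frozen dataclass emulates the same immutability and constraint checking, and the field names are assumptions):

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """Immutable record; __post_init__ emulates ge=/le= constraints."""

    content: str
    relevance: float = 0.5
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Mirrors Field(ge=0.0, le=1.0) on the relevance score.
        if not 0.0 <= self.relevance <= 1.0:
            raise ValueError("relevance must be in [0.0, 1.0]")


ev = Evidence(content="finding", relevance=0.9)

bad = None
try:
    Evidence(content="x", relevance=2.0)  # out of range -> rejected
except ValueError as exc:
    bad = exc
```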
- **Code Style**: Ruff with 100-char line length. Ignore rules: `PLR0913` (too many arguments), `PLR0912` (too many branches), `PLR0911` (too many returns), `PLR2004` (magic values), `PLW0603` (global statement), `PLC0415` (lazy imports).
-
- **Docstrings**: Google-style docstrings for all public functions. Include Args, Returns, Raises sections. Use type hints in docstrings only if needed for clarity.
-
- **Testing**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`). Use `respx` for httpx mocking, `pytest-mock` for general mocking.
-
- **State Management**: Use `ContextVar` in middleware for thread-safe isolation. Never use global mutable state (except singletons via `@lru_cache`). Use `WorkflowState` from `src/middleware/state_machine.py` for workflow state.
-
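The `ContextVar` isolation rule can be sketched like this; the state dict and function names only approximate the real `WorkflowState` API:

```python
from contextvars import ContextVar, copy_context

_workflow_state = ContextVar("workflow_state", default=None)


def get_workflow_state() -> dict:
    # Auto-initializes on first access, like the real get_workflow_state().
    state = _workflow_state.get()
    if state is None:
        state = {"evidence": []}
        _workflow_state.set(state)
    return state


def add_evidence(url: str) -> int:
    state = get_workflow_state()
    if url not in state["evidence"]:  # deduplicate by URL
        state["evidence"].append(url)
    return len(state["evidence"])


# Each copied context gets its own state: no cross-task leakage.
n_a = copy_context().run(add_evidence, "https://example.org/a")
n_b = copy_context().run(add_evidence, "https://example.org/b")
```

Because each `copy_context()` run is isolated, both calls see an empty state and return 1, which is exactly the thread-safety property the rule asks for.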
- **Citation Validation**: ALWAYS validate references before returning reports. Use `validate_references()` from `src/utils/citation_validator.py`. Remove hallucinated citations. Log warnings for removed citations.
-
- ---
-
- ## src/agents/ - Agent Implementation Rules
-
- **Pattern**: All agents use Pydantic AI `Agent` class. Agents have structured output types (Pydantic models) or return strings. Use factory functions in `src/agent_factory/agents.py` for creation.
-
- **Agent Structure**:
- - System prompt as module-level constant (with date injection: `datetime.now().strftime("%Y-%m-%d")`)
- - Agent class with `__init__(model: Any | None = None)`
- - Main method (e.g., `async def evaluate()`, `async def write_report()`)
- - Factory function: `def create_agent_name(model: Any | None = None) -> AgentName`
-
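The four structural elements above can be sketched in one stdlib-only skeleton. The real agents wrap a Pydantic AI `Agent`; here the model is a plain callable, and the prompt text and fallback values are illustrative:

```python
import asyncio
from dataclasses import dataclass, field
from datetime import datetime

# 1. Date-injected system prompt as a module-level constant.
SYSTEM_PROMPT = f"You evaluate research gaps. Today is {datetime.now().strftime('%Y-%m-%d')}."


@dataclass
class KnowledgeGapOutput:
    research_complete: bool
    outstanding_gaps: list = field(default_factory=list)


class KnowledgeGapAgent:
    # 2. Agent class taking an optional model.
    def __init__(self, model=None):
        self.model = model or (lambda prompt: KnowledgeGapOutput(False, ["no evidence yet"]))

    # 3. Main async method with a fallback on failure.
    async def evaluate(self, query: str) -> KnowledgeGapOutput:
        try:
            return self.model(f"{SYSTEM_PROMPT}\n\n{query}")
        except Exception:
            return KnowledgeGapOutput(research_complete=False, outstanding_gaps=["evaluation failed"])


# 4. Factory function.
def create_knowledge_gap_agent(model=None) -> KnowledgeGapAgent:
    return KnowledgeGapAgent(model)


out = asyncio.run(create_knowledge_gap_agent().evaluate("Does drug X lower HbA1c?"))
```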
- **Model Initialization**: Use `get_model()` from `src/agent_factory/judges.py` if no model provided. Support OpenAI/Anthropic/HF Inference via settings.
-
- **Error Handling**: Return fallback values (e.g., `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`) on failure. Log errors with context. Use retry logic (3 retries) in Pydantic AI Agent initialization.
-
- **Input Validation**: Validate query/inputs are not empty. Truncate very long inputs with warnings. Handle None values gracefully.
-
- **Output Types**: Use structured output types from `src/utils/models.py` (e.g., `KnowledgeGapOutput`, `AgentSelectionPlan`, `ReportDraft`). For text output (writer agents), return `str` directly.
-
- **Agent-Specific Rules**:
- - `knowledge_gap.py`: Outputs `KnowledgeGapOutput`. Evaluates research completeness.
- - `tool_selector.py`: Outputs `AgentSelectionPlan`. Selects tools (RAG/web/database).
- - `writer.py`: Returns markdown string. Includes citations in numbered format.
- - `long_writer.py`: Uses `ReportDraft` input/output. Handles section-by-section writing.
- - `proofreader.py`: Takes `ReportDraft`, returns polished markdown.
- - `thinking.py`: Returns observation string from conversation history.
- - `input_parser.py`: Outputs `ParsedQuery` with research mode detection.
-
- ---
-
- ## src/tools/ - Search Tool Rules
-
- **Protocol**: All tools implement `SearchTool` protocol from `src/tools/base.py`: `name` property and `async def search(query, max_results) -> list[Evidence]`.
-
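The `SearchTool` protocol described above can be sketched with `typing.Protocol`; the dummy tool and its return values are invented for illustration:

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class SearchTool(Protocol):
    """Sketch of the protocol in src/tools/base.py."""

    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int) -> list: ...


class DummyTool:
    # Satisfies the protocol structurally: no inheritance needed.
    @property
    def name(self) -> str:
        return "dummy"

    async def search(self, query: str, max_results: int) -> list:
        return [f"{self.name}: {query}"][:max_results]


tool = DummyTool()
hits = asyncio.run(tool.search("metformin", 5))
is_tool = isinstance(tool, SearchTool)  # runtime_checkable allows this
```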
- **Rate Limiting**: Use `@retry` decorator from tenacity: `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))`. Implement `_rate_limit()` method for APIs with limits. Use shared rate limiters from `src/tools/rate_limiter.py`.
-
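What tenacity's `@retry(stop=stop_after_attempt(3), wait=wait_exponential(...))` does can be approximated stdlib-only; this minimal stand-in is not tenacity's API, just the same retry-with-exponential-backoff behavior:

```python
import functools
import time


def retry(stop_after: int = 3, base_wait: float = 0.01):
    """Minimal stand-in for tenacity's stop/wait combination."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, stop_after + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == stop_after:
                        raise  # budget exhausted: re-raise the last error
                    # Exponential backoff: base * 2^(attempt - 1).
                    time.sleep(base_wait * 2 ** (attempt - 1))

        return wrapper

    return decorator


calls = {"n": 0}


@retry(stop_after=3)
def flaky_search() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("HTTP 429")  # simulated rate limit
    return "ok"


result = flaky_search()
```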
- **Error Handling**: Raise `SearchError` or `RateLimitError` on failures. Handle HTTP errors (429, 500, timeout). Return empty list on non-critical errors (log warning).
-
- **Query Preprocessing**: Use `preprocess_query()` from `src/tools/query_utils.py` to remove noise and expand synonyms.
-
- **Evidence Conversion**: Convert API responses to `Evidence` objects with `Citation`. Extract metadata (title, url, date, authors). Set relevance scores (0.0-1.0). Handle missing fields gracefully.
-
- **Tool-Specific Rules**:
- - `pubmed.py`: Use NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Parse XML with `xmltodict`. Handle single vs. multiple articles.
- - `clinicaltrials.py`: Use `requests` library (NOT httpx - WAF blocks httpx). Run in thread pool: `await asyncio.to_thread(requests.get, ...)`. Filter: Only interventional studies, active/completed.
- - `europepmc.py`: Handle preprint markers: `[PREPRINT - Not peer-reviewed]`. Build URLs from DOI or PMID.
- - `rag_tool.py`: Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results. Handles ingestion.
- - `search_handler.py`: Orchestrates parallel searches across multiple tools. Uses `asyncio.gather()` with `return_exceptions=True`. Aggregates results into `SearchResult`.
-
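The `search_handler.py` pattern above (parallel tools, one failure must not sink the rest) can be sketched as follows; the tool names are real, but the stub coroutine and result strings are invented:

```python
import asyncio


async def search_tool(name: str, fail: bool = False) -> list:
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return [f"{name}-hit"]


async def run_all() -> list:
    results = await asyncio.gather(
        search_tool("pubmed"),
        search_tool("clinicaltrials", fail=True),
        search_tool("europepmc"),
        return_exceptions=True,  # failures come back as values, not raises
    )
    evidence = []
    for r in results:
        if isinstance(r, Exception):
            continue  # the real handler logs a warning here
        evidence.extend(r)
    return evidence


evidence = asyncio.run(run_all())
```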
- ---
-
- ## src/middleware/ - Middleware Rules
-
- **State Management**: Use `ContextVar` for thread-safe isolation. `WorkflowState` uses `ContextVar[WorkflowState | None]`. Initialize with `init_workflow_state(embedding_service)`. Access with `get_workflow_state()` (auto-initializes if missing).
-
- **WorkflowState**: Tracks `evidence: list[Evidence]`, `conversation: Conversation`, `embedding_service: Any`. Methods: `add_evidence()` (deduplicates by URL), `async search_related()` (semantic search).
-
- **WorkflowManager**: Manages parallel research loops. Methods: `add_loop()`, `run_loops_parallel()`, `update_loop_status()`, `sync_loop_evidence_to_state()`. Uses `asyncio.gather()` for parallel execution. Handles errors per loop (don't fail all if one fails).
-
- **BudgetTracker**: Tracks tokens, time, iterations per loop and globally. Methods: `create_budget()`, `add_tokens()`, `start_timer()`, `update_timer()`, `increment_iteration()`, `check_budget()`, `can_continue()`. Token estimation: `estimate_tokens(text)` (~4 chars per token), `estimate_llm_call_tokens(prompt, response)`.
-
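The ~4-chars-per-token heuristic and the `BudgetTracker` accounting can be sketched as follows; only the named methods and the heuristic come from the rule above, the rest (the `max_tokens` field, the floor of 1) is assumed:

```python
def estimate_tokens(text: str) -> int:
    # Heuristic from the rule above: roughly 4 characters per token.
    return max(1, len(text) // 4)


def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    return estimate_tokens(prompt) + estimate_tokens(response)


class BudgetTracker:
    """Token-budget slice of the tracker described above."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def add_tokens(self, n: int) -> None:
        self.used += n

    def can_continue(self) -> bool:
        return self.used < self.max_tokens


tracker = BudgetTracker(max_tokens=100)
# 200 chars of prompt (~50 tokens) + 100 chars of response (~25 tokens).
tracker.add_tokens(estimate_llm_call_tokens("q" * 200, "a" * 100))
```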
- **Models**: All middleware models in `src/utils/models.py`. `IterationData`, `Conversation`, `ResearchLoop`, `BudgetStatus` are used by middleware.
-
- ---
-
- ## src/orchestrator/ - Orchestration Rules
-
- **Research Flows**: Two patterns: `IterativeResearchFlow` (single loop) and `DeepResearchFlow` (plan → parallel loops → synthesis). Both support agent chains (`use_graph=False`) and graph execution (`use_graph=True`).
-
- **IterativeResearchFlow**: Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete. Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`, `JudgeHandler`. Tracks iterations, time, budget.
-
- **DeepResearchFlow**: Pattern: Planner → Parallel iterative loops per section → Synthesizer. Uses `PlannerAgent`, `IterativeResearchFlow` (per section), `LongWriterAgent` or `ProofreaderAgent`. Uses `WorkflowManager` for parallel execution.
-
- **Graph Orchestrator**: Uses Pydantic AI Graphs (when available) or agent chains (fallback). Routes based on research mode (iterative/deep/auto). Streams `AgentEvent` objects for UI.
-
- **State Initialization**: Always call `init_workflow_state()` before running flows. Initialize `BudgetTracker` per loop. Use `WorkflowManager` for parallel coordination.
-
- **Event Streaming**: Yield `AgentEvent` objects during execution. Event types: "started", "search_complete", "judge_complete", "hypothesizing", "synthesizing", "complete", "error". Include iteration numbers and data payloads.
-
- ---
-
- ## src/services/ - Service Rules
-
- **EmbeddingService**: Local sentence-transformers (NO API key required). All operations async-safe via `run_in_executor()`. ChromaDB for vector storage. Deduplication threshold: 0.85 (85% similarity = duplicate).
-
- **LlamaIndexRAGService**: Uses OpenAI embeddings (requires `OPENAI_API_KEY`). Methods: `ingest_evidence()`, `retrieve()`, `query()`. Returns documents with metadata (source, title, url, date, authors). Lazy initialization with graceful fallback.
-
- **StatisticalAnalyzer**: Generates Python code via LLM. Executes in Modal sandbox (secure, isolated). Library versions pinned in `SANDBOX_LIBRARIES` dict. Returns `AnalysisResult` with verdict (SUPPORTED/REFUTED/INCONCLUSIVE).
-
- **Singleton Pattern**: Use `@lru_cache(maxsize=1)` for singletons: `@lru_cache(maxsize=1); def get_service() -> Service: return Service()`. Lazy initialization to avoid requiring dependencies at import time.
-
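The `@lru_cache(maxsize=1)` singleton pattern above, expanded into a runnable sketch (the service class is a placeholder):

```python
from functools import lru_cache


class EmbeddingService:
    """Placeholder service; construction is deliberately deferred."""

    def __init__(self) -> None:
        self.ready = True


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # Constructed on first call only; every later call returns the
    # cached instance, so heavy dependencies load lazily and exactly once.
    return EmbeddingService()


a = get_embedding_service()
b = get_embedding_service()
```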
- ---
-
- ## src/utils/ - Utility Rules
-
- **Models**: All Pydantic models in `src/utils/models.py`. Use frozen models (`model_config = {"frozen": True}`) except where mutation needed. Use `Field()` with descriptions. Validate with constraints.
-
- **Config**: Settings via Pydantic Settings (`src/utils/config.py`). Load from `.env` automatically. Use `settings` singleton: `from src.utils.config import settings`. Validate API keys with properties: `has_openai_key`, `has_anthropic_key`.
-
- **Exceptions**: Custom exception hierarchy in `src/utils/exceptions.py`. Base: `DeepCriticalError`. Specific: `SearchError`, `RateLimitError`, `JudgeError`, `ConfigurationError`. Always chain exceptions.
-
- **LLM Factory**: Centralized LLM model creation in `src/utils/llm_factory.py`. Supports OpenAI, Anthropic, HF Inference. Use `get_model()` or factory functions. Check requirements before initialization.
-
- **Citation Validator**: Use `validate_references()` from `src/utils/citation_validator.py`. Removes hallucinated citations (URLs not in evidence). Logs warnings. Returns validated report string.
-
- ---
-
- ## src/orchestrator_factory.py Rules
-
- **Purpose**: Factory for creating orchestrators. Supports "simple" (legacy) and "advanced" (magentic) modes. Auto-detects mode based on API key availability.
-
- **Pattern**: Lazy import for optional dependencies (`_get_magentic_orchestrator_class()`). Handles `ImportError` gracefully with clear error messages.
-
- **Mode Detection**: `_determine_mode()` checks explicit mode or auto-detects: "advanced" if `settings.has_openai_key`, else "simple". Maps "magentic" → "advanced".
-
- **Function Signature**: `create_orchestrator(search_handler, judge_handler, config, mode) -> Any`. Simple mode requires handlers. Advanced mode uses MagenticOrchestrator.
-
- **Error Handling**: Raise `ValueError` with clear messages if requirements not met. Log mode selection with structlog.
-
- ---
-
- ## src/orchestrator_hierarchical.py Rules
-
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams. Adapts Magentic ChatAgent to SubIterationTeam protocol.
-
- **Pattern**: Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`. Event-driven via callback queue.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated, but kept for compatibility).
-
- **Event Streaming**: Uses `asyncio.Queue` for event coordination. Yields `AgentEvent` objects. Handles event callback pattern with `asyncio.wait()`.
-
- **Error Handling**: Log errors with context. Yield error events. Process remaining events after task completion.
-
- ---
-
- ## src/orchestrator_magentic.py Rules
-
- **Purpose**: Magentic-based orchestrator using ChatAgent pattern. Each agent has internal LLM. Manager orchestrates agents.
-
- **Pattern**: Uses `MagenticBuilder` with participants (searcher, hypothesizer, judge, reporter). Manager uses `OpenAIChatClient`. Workflow built in `_build_workflow()`.
-
- **Event Processing**: `_process_event()` converts Magentic events to `AgentEvent`. Handles: `MagenticOrchestratorMessageEvent`, `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, `MagenticAgentDeltaEvent`, `WorkflowOutputEvent`.
-
- **Text Extraction**: `_extract_text()` defensively extracts text from messages. Priority: `.content` → `.text` → `str(message)`. Handles buggy message objects.
-
- **State Initialization**: Initialize embedding service with graceful fallback. Use `init_magentic_state()` (deprecated).
-
- **Requirements**: Must call `check_magentic_requirements()` in `__init__`. Requires `agent-framework-core` and OpenAI API key.
-
- **Event Types**: Maps agent names to event types: "search" → "search_complete", "judge" → "judge_complete", "hypothes" → "hypothesizing", "report" → "synthesizing".
-
- ---
-
- ## src/agent_factory/ - Factory Rules
-
- **Pattern**: Factory functions for creating agents and handlers. Lazy initialization for optional dependencies. Support OpenAI/Anthropic/HF Inference.
-
- **Judges**: `create_judge_handler()` creates `JudgeHandler` with structured output (`JudgeAssessment`). Supports `MockJudgeHandler`, `HFInferenceJudgeHandler` as fallbacks.
-
- **Agents**: Factory functions in `agents.py` for all Pydantic AI agents. Pattern: `create_agent_name(model: Any | None = None) -> AgentName`. Use `get_model()` if model not provided.
-
- **Graph Builder**: `graph_builder.py` contains utilities for building research graphs. Supports iterative and deep research graph construction.
-
- **Error Handling**: Raise `ConfigurationError` if required API keys missing. Log agent creation. Handle import errors gracefully.
-
- ---
-
- ## src/prompts/ - Prompt Rules
-
- **Pattern**: System prompts stored as module-level constants. Include date injection: `datetime.now().strftime("%Y-%m-%d")`. Format evidence with truncation (1500 chars per item).
-
- **Judge Prompts**: In `judge.py`. Handle empty evidence case separately. Always request structured JSON output.
-
- **Hypothesis Prompts**: In `hypothesis.py`. Use diverse evidence selection (MMR algorithm). Sentence-aware truncation.
-
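One way the MMR-style diverse evidence selection named above can work, sketched with a toy word-overlap similarity (the real implementation presumably scores with embeddings; the λ value, similarity function, and example documents are all assumptions):

```python
def similarity(a: str, b: str) -> float:
    # Toy Jaccard similarity over words; a stand-in for cosine
    # similarity between embeddings.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def mmr_select(query: str, docs: list, k: int, lam: float = 0.3) -> list:
    """Greedy Maximal Marginal Relevance selection."""
    selected = []
    candidates = list(docs)
    while candidates and len(selected) < k:

        def score(d: str) -> float:
            rel = similarity(query, d)
            red = max((similarity(d, s) for s in selected), default=0.0)
            # Trade relevance off against redundancy with picks so far.
            return lam * rel - (1 - lam) * red

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


picked = mmr_select(
    "metformin dosage",
    ["metformin dosage trial", "metformin dosage study", "aspirin safety review"],
    k=2,
)
```

With a low λ the second pick favors the dissimilar document over a near-duplicate of the first, which is the diversity property the prompt rule relies on.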
- **Report Prompts**: In `report.py`. Include full citation details. Use diverse evidence selection (n=20). Emphasize citation validation rules.
-
- ---
-
- ## Testing Rules
-
- **Structure**: Unit tests in `tests/unit/` (mocked, fast). Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`).
-
- **Mocking**: Use `respx` for httpx mocking. Use `pytest-mock` for general mocking. Mock LLM calls in unit tests (use `MockJudgeHandler`).
-
- **Fixtures**: Common fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`.
-
- **Coverage**: Aim for >80% coverage. Test error handling, edge cases, and integration paths.
-
- ---
-
- ## File-Specific Agent Rules
-
- **knowledge_gap.py**: Outputs `KnowledgeGapOutput`. System prompt evaluates research completeness. Handles conversation history. Returns fallback on error.
-
- **writer.py**: Returns markdown string. System prompt includes citation format examples. Validates inputs. Truncates long findings. Retry logic for transient failures.
-
- **long_writer.py**: Uses `ReportDraft` input/output. Writes sections iteratively. Reformats references (deduplicates, renumbers). Reformats section headings.
-
- **proofreader.py**: Takes `ReportDraft`, returns polished markdown. Removes duplicates. Adds summary. Preserves references.
-
- **tool_selector.py**: Outputs `AgentSelectionPlan`. System prompt lists available agents (WebSearchAgent, SiteCrawlerAgent, RAGAgent). Guidelines for when to use each.
-
- **thinking.py**: Returns observation string. Generates observations from conversation history. Uses query and background context.
-
- **input_parser.py**: Outputs `ParsedQuery`. Detects research mode (iterative/deep). Extracts entities and research questions. Improves/refines query.
-
 
dev/docs_plugins.py DELETED
@@ -1,74 +0,0 @@
- """Custom MkDocs extension to handle code anchor format: ```start:end:filepath"""
-
- import re
- from pathlib import Path
-
- from markdown import Markdown
- from markdown.extensions import Extension
- from markdown.preprocessors import Preprocessor
-
-
- class CodeAnchorPreprocessor(Preprocessor):
-     """Preprocess code blocks with anchor format: ```start:end:filepath"""
-
-     def __init__(self, md: Markdown, base_path: Path):
-         super().__init__(md)
-         self.base_path = base_path
-         self.pattern = re.compile(r"^```(\d+):(\d+):([^\n]+)\n(.*?)```$", re.MULTILINE | re.DOTALL)
-
-     def run(self, lines: list[str]) -> list[str]:
-         """Process lines and convert code anchor format to standard code blocks."""
-         text = "\n".join(lines)
-         new_text = self.pattern.sub(self._replace_code_anchor, text)
-         return new_text.split("\n")
-
-     def _replace_code_anchor(self, match) -> str:
-         """Replace code anchor format with standard code block + link."""
-         start_line = int(match.group(1))
-         end_line = int(match.group(2))
-         file_path = match.group(3).strip()
-         existing_code = match.group(4)
-
-         # Determine language from file extension
-         ext = Path(file_path).suffix.lower()
-         lang_map = {
-             ".py": "python",
-             ".js": "javascript",
-             ".ts": "typescript",
-             ".md": "markdown",
-             ".yaml": "yaml",
-             ".yml": "yaml",
-             ".toml": "toml",
-             ".json": "json",
-             ".html": "html",
-             ".css": "css",
-             ".sh": "bash",
-         }
-         language = lang_map.get(ext, "python")
-
-         # Generate GitHub link
-         repo_url = "https://github.com/DeepCritical/GradioDemo"
-         github_link = f"{repo_url}/blob/main/{file_path}#L{start_line}-L{end_line}"
-
-         # Return standard code block with source link
-         return (
-             f'[View source: `{file_path}` (lines {start_line}-{end_line})]({github_link}){{: target="_blank" }}\n\n'
-             f"```{language}\n{existing_code}\n```"
-         )
-
-
- class CodeAnchorExtension(Extension):
-     """Markdown extension for code anchors."""
-
-     def __init__(self, base_path: str = ".", **kwargs):
-         super().__init__(**kwargs)
-         self.base_path = Path(base_path)
-
-     def extendMarkdown(self, md: Markdown):  # noqa: N802
-         """Register the preprocessor."""
-         md.preprocessors.register(CodeAnchorPreprocessor(md, self.base_path), "codeanchor", 25)
-
-
- def makeExtension(**kwargs):  # noqa: N802
-     """Create the extension."""
-     return CodeAnchorExtension(**kwargs)
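The anchor syntax this deleted preprocessor rewrote can be illustrated stdlib-only with the same regex (the sample document string is invented; running the full extension would require the `markdown` package):

```python
import re

# Same pattern as CodeAnchorPreprocessor: ```start:end:filepath ... ```
pattern = re.compile(r"^```(\d+):(\d+):([^\n]+)\n(.*?)```$", re.MULTILINE | re.DOTALL)

doc = "```10:20:src/utils/models.py\nclass Evidence: ...\n```"

match = pattern.search(doc)
start, end, path = int(match.group(1)), int(match.group(2)), match.group(3)
body = match.group(4)  # the code the anchor wraps
```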
 
docs/LICENSE.md DELETED
@@ -1,35 +0,0 @@
- # License
-
- DeepCritical is licensed under the MIT License.
-
- ## MIT License
-
- Copyright (c) 2024 DeepCritical Team
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
-
 
docs/api/agents.md DELETED
@@ -1,211 +0,0 @@
- # Agents API Reference
-
- This page documents the API for DeepCritical agents.
-
- ## KnowledgeGapAgent
-
- **Module**: `src.agents.knowledge_gap`
-
- **Purpose**: Evaluates research state and identifies knowledge gaps.
-
- ### Methods
-
- #### `evaluate`
-
- <!--codeinclude-->
- [KnowledgeGapAgent.evaluate](../src/agents/knowledge_gap.py) start_line:66 end_line:74
- <!--/codeinclude-->
-
- Evaluates research completeness and identifies outstanding knowledge gaps.
-
- **Parameters**:
- - `query`: Research query string
- - `background_context`: Background context for the query (default: "")
- - `conversation_history`: History of actions, findings, and thoughts as string (default: "")
- - `iteration`: Current iteration number (default: 0)
- - `time_elapsed_minutes`: Elapsed time in minutes (default: 0.0)
- - `max_time_minutes`: Maximum time limit in minutes (default: 10)
-
- **Returns**: `KnowledgeGapOutput` with:
- - `research_complete`: Boolean indicating if research is complete
- - `outstanding_gaps`: List of remaining knowledge gaps
-
- ## ToolSelectorAgent
-
- **Module**: `src.agents.tool_selector`
-
- **Purpose**: Selects appropriate tools for addressing knowledge gaps.
-
- ### Methods
-
- #### `select_tools`
-
- <!--codeinclude-->
- [ToolSelectorAgent.select_tools](../src/agents/tool_selector.py) start_line:78 end_line:84
- <!--/codeinclude-->
-
- Selects tools for addressing a knowledge gap.
-
- **Parameters**:
- - `gap`: The knowledge gap to address
- - `query`: Research query string
- - `background_context`: Optional background context (default: "")
- - `conversation_history`: History of actions, findings, and thoughts as string (default: "")
-
- **Returns**: `AgentSelectionPlan` with list of `AgentTask` objects.
-
- ## WriterAgent
-
- **Module**: `src.agents.writer`
-
- **Purpose**: Generates final reports from research findings.
-
- ### Methods
-
- #### `write_report`
-
- <!--codeinclude-->
- [WriterAgent.write_report](../src/agents/writer.py) start_line:67 end_line:73
- <!--/codeinclude-->
-
- Generates a markdown report from research findings.
-
- **Parameters**:
- - `query`: Research query string
- - `findings`: Research findings to include in report
- - `output_length`: Optional description of desired output length (default: "")
- - `output_instructions`: Optional additional instructions for report generation (default: "")
-
- **Returns**: Markdown string with numbered citations.
-
- ## LongWriterAgent
-
- **Module**: `src.agents.long_writer`
-
- **Purpose**: Long-form report generation with section-by-section writing.
-
- ### Methods
-
- #### `write_next_section`
-
- <!--codeinclude-->
- [LongWriterAgent.write_next_section](../src/agents/long_writer.py) start_line:94 end_line:100
- <!--/codeinclude-->
-
- Writes the next section of a long-form report.
-
- **Parameters**:
- - `original_query`: The original research query
- - `report_draft`: Current report draft as string (all sections written so far)
- - `next_section_title`: Title of the section to write
- - `next_section_draft`: Draft content for the next section
-
- **Returns**: `LongWriterOutput` with formatted section and references.
-
- #### `write_report`
-
- <!--codeinclude-->
- [LongWriterAgent.write_report](../src/agents/long_writer.py) start_line:263 end_line:268
- <!--/codeinclude-->
-
- Generates final report from draft.
-
- **Parameters**:
- - `query`: Research query string
- - `report_title`: Title of the report
- - `report_draft`: Complete report draft
-
- **Returns**: Final markdown report string.
-
- ## ProofreaderAgent
-
- **Module**: `src.agents.proofreader`
-
- **Purpose**: Proofreads and polishes report drafts.
-
- ### Methods
-
- #### `proofread`
-
- <!--codeinclude-->
- [ProofreaderAgent.proofread](../src/agents/proofreader.py) start_line:72 end_line:76
- <!--/codeinclude-->
-
- Proofreads and polishes a report draft.
-
- **Parameters**:
- - `query`: Research query string
- - `report_title`: Title of the report
- - `report_draft`: Report draft to proofread
-
- **Returns**: Polished markdown string.
-
- ## ThinkingAgent
-
- **Module**: `src.agents.thinking`
-
- **Purpose**: Generates observations from conversation history.
-
- ### Methods
-
- #### `generate_observations`
-
- <!--codeinclude-->
- [ThinkingAgent.generate_observations](../src/agents/thinking.py) start_line:70 end_line:76
- <!--/codeinclude-->
-
- Generates observations from conversation history.
-
- **Parameters**:
- - `query`: Research query string
- - `background_context`: Optional background context (default: "")
- - `conversation_history`: History of actions, findings, and thoughts as string (default: "")
- - `iteration`: Current iteration number (default: 1)
-
- **Returns**: Observation string.
-
- ## InputParserAgent
-
- **Module**: `src.agents.input_parser`
-
- **Purpose**: Parses and improves user queries, detects research mode.
-
- ### Methods
-
- #### `parse`
-
- <!--codeinclude-->
- [InputParserAgent.parse](../src/agents/input_parser.py) start_line:82 end_line:82
- <!--/codeinclude-->
-
- Parses and improves a user query.
-
- **Parameters**:
- - `query`: Original query string
-
- **Returns**: `ParsedQuery` with:
- - `original_query`: Original query string
- - `improved_query`: Refined query string
- - `research_mode`: "iterative" or "deep"
- - `key_entities`: List of key entities
- - `research_questions`: List of research questions
-
- ## Factory Functions
-
- All agents have factory functions in `src.agent_factory.agents`:
-
- <!--codeinclude-->
- [Factory Functions](../src/agent_factory/agents.py) start_line:30 end_line:50
- <!--/codeinclude-->
-
- **Parameters**:
- - `model`: Optional Pydantic AI model. If None, uses `get_model()` from settings.
- - `oauth_token`: Optional OAuth token from HuggingFace login (takes priority over env vars)
-
- **Returns**: Agent instance.
-
- ## See Also
-
- - [Architecture - Agents](../architecture/agents.md) - Architecture overview
- - [Models API](models.md) - Data models used by agents
 
docs/api/models.md DELETED
@@ -1,191 +0,0 @@
- # Models API Reference
-
- This page documents the Pydantic models used throughout DeepCritical.
-
- ## Evidence
-
- **Module**: `src.utils.models`
-
- **Purpose**: Represents evidence from search results.
-
- <!--codeinclude-->
- [Evidence Model](../src/utils/models.py) start_line:33 end_line:44
- <!--/codeinclude-->
-
- **Fields**:
- - `citation`: Citation information (title, URL, date, authors)
- - `content`: Evidence text content
- - `relevance`: Relevance score (0.0-1.0)
- - `metadata`: Additional metadata dictionary
-
- ## Citation
-
- **Module**: `src.utils.models`
-
- **Purpose**: Citation information for evidence.
-
- <!--codeinclude-->
- [Citation Model](../src/utils/models.py) start_line:12 end_line:30
- <!--/codeinclude-->
-
- **Fields**:
- - `source`: Source name (e.g., "pubmed", "clinicaltrials", "europepmc", "web", "rag")
- - `title`: Article/trial title
- - `url`: Source URL
- - `date`: Publication date (YYYY-MM-DD or "Unknown")
- - `authors`: List of authors (optional)
-
- ## KnowledgeGapOutput
-
- **Module**: `src.utils.models`
-
- **Purpose**: Output from knowledge gap evaluation.
-
- <!--codeinclude-->
- [KnowledgeGapOutput Model](../src/utils/models.py) start_line:494 end_line:504
- <!--/codeinclude-->
-
- **Fields**:
- - `research_complete`: Boolean indicating if research is complete
- - `outstanding_gaps`: List of remaining knowledge gaps
-
- ## AgentSelectionPlan
-
- **Module**: `src.utils.models`
-
- **Purpose**: Plan for tool/agent selection.
-
- <!--codeinclude-->
- [AgentSelectionPlan Model](../src/utils/models.py) start_line:521 end_line:526
- <!--/codeinclude-->
-
- **Fields**:
- - `tasks`: List of agent tasks to execute
-
- ## AgentTask
-
- **Module**: `src.utils.models`
-
- **Purpose**: Individual agent task.
-
- <!--codeinclude-->
- [AgentTask Model](../src/utils/models.py) start_line:507 end_line:518
- <!--/codeinclude-->
-
- **Fields**:
- - `gap`: The knowledge gap being addressed (optional)
- - `agent`: Name of agent to use
- - `query`: The specific query for the agent
- - `entity_website`: The website of the entity being researched, if known (optional)
-
- ## ReportDraft
-
- **Module**: `src.utils.models`
-
- **Purpose**: Draft structure for long-form reports.
-
- <!--codeinclude-->
- [ReportDraft Model](../src/utils/models.py) start_line:538 end_line:545
- <!--/codeinclude-->
-
- **Fields**:
- - `sections`: List of report sections
-
- ## ReportSection
-
- **Module**: `src.utils.models`
-
- **Purpose**: Individual section in a report draft.
-
- <!--codeinclude-->
- [ReportDraftSection Model](../src/utils/models.py) start_line:529 end_line:535
- <!--/codeinclude-->
-
- **Fields**:
- - `section_title`: The title of the section
- - `section_content`: The content of the section
-
- ## ParsedQuery
-
- **Module**: `src.utils.models`
-
- **Purpose**: Parsed and improved query.
-
- <!--codeinclude-->
- [ParsedQuery Model](../src/utils/models.py) start_line:557 end_line:572
- <!--/codeinclude-->
-
- **Fields**:
- - `original_query`: Original query string
- - `improved_query`: Refined query string
- - `research_mode`: Research mode ("iterative" or "deep")
- - `key_entities`: List of key entities
- - `research_questions`: List of research questions
-
- ## Conversation
-
- **Module**: `src.utils.models`
-
- **Purpose**: Conversation history with iterations.
-
- <!--codeinclude-->
- [Conversation Model](../src/utils/models.py) start_line:331 end_line:337
- <!--/codeinclude-->
-
- **Fields**:
- - `history`: List of iteration data
-
- ## IterationData
-
- **Module**: `src.utils.models`
-
- **Purpose**: Data for a single iteration.
-
- <!--codeinclude-->
- [IterationData Model](../src/utils/models.py) start_line:315 end_line:328
- <!--/codeinclude-->
-
- **Fields**:
- - `gap`: The gap addressed in the iteration
- - `tool_calls`: The tool calls made
- - `findings`: The findings collected from tool calls
- - `thought`: The thinking done to reflect on the success of the iteration and next steps
-
- ## AgentEvent
-
- **Module**: `src.utils.models`
-
- **Purpose**: Event emitted during research execution.
-
- <!--codeinclude-->
- [AgentEvent Model](../src/utils/models.py) start_line:104 end_line:125
- <!--/codeinclude-->
-
- **Fields**:
- - `type`: Event type (e.g., "started", "search_complete", "complete")
- - `iteration`: Iteration number (optional)
- - `data`: Event data dictionary
-
- ## BudgetStatus
-
- **Module**: `src.middleware.budget_tracker`
-
- **Purpose**: Current budget status.
-
- <!--codeinclude-->
- [BudgetStatus Model](../src/middleware/budget_tracker.py) start_line:15 end_line:25
- <!--/codeinclude-->
-
- **Fields**:
- - `tokens_used`: Total tokens used
- - `tokens_limit`: Token budget limit
- - `time_elapsed_seconds`: Time elapsed in seconds
- - `time_limit_seconds`: Time budget limit (default: 600.0 seconds / 10 minutes)
- - `iterations`: Number of iterations completed
- - `iterations_limit`: Maximum iterations (default: 10)
- - `iteration_tokens`: Tokens used per iteration (iteration number -> token count)
-
- ## See Also
-
- - [Architecture - Agents](../architecture/agents.md) - How models are used
- - [Configuration](../configuration/index.md) - Model configuration
docs/api/orchestrators.md DELETED
@@ -1,149 +0,0 @@
- # Orchestrators API Reference
-
- This page documents the API for DeepCritical orchestrators.
-
- ## IterativeResearchFlow
-
- **Module**: `src.orchestrator.research_flow`
-
- **Purpose**: Single-loop research with search-judge-synthesize cycles.
-
- ### Methods
-
- #### `run`
-
- <!--codeinclude-->
- [IterativeResearchFlow.run](../src/orchestrator/research_flow.py) start_line:134 end_line:140
- <!--/codeinclude-->
-
- Runs iterative research flow.
-
- **Parameters**:
- - `query`: Research query string
- - `background_context`: Background context (default: "")
- - `output_length`: Optional description of desired output length (default: "")
- - `output_instructions`: Optional additional instructions for report generation (default: "")
- - `message_history`: Optional user conversation history in Pydantic AI `ModelMessage` format (default: None)
-
- **Returns**: Final report string.
-
- **Note**: The `message_history` parameter enables multi-turn conversations by providing context from previous interactions.
-
- **Note**: `max_iterations`, `max_time_minutes`, and `token_budget` are constructor parameters, not `run()` parameters.
-
- ## DeepResearchFlow
-
- **Module**: `src.orchestrator.research_flow`
-
- **Purpose**: Multi-section parallel research with planning and synthesis.
-
- ### Methods
-
- #### `run`
-
- <!--codeinclude-->
- [DeepResearchFlow.run](../src/orchestrator/research_flow.py) start_line:778 end_line:778
- <!--/codeinclude-->
-
- Runs deep research flow.
-
- **Parameters**:
- - `query`: Research query string
- - `message_history`: Optional user conversation history in Pydantic AI `ModelMessage` format (default: None)
-
- **Returns**: Final report string.
-
- **Note**: The `message_history` parameter enables multi-turn conversations by providing context from previous interactions.
-
- **Note**: `max_iterations_per_section`, `max_time_minutes`, and `token_budget` are constructor parameters, not `run()` parameters.
-
- ## GraphOrchestrator
-
- **Module**: `src.orchestrator.graph_orchestrator`
-
- **Purpose**: Graph-based execution using Pydantic AI agents as nodes.
-
- ### Methods
-
- #### `run`
-
- <!--codeinclude-->
- [GraphOrchestrator.run](../src/orchestrator/graph_orchestrator.py) start_line:177 end_line:177
- <!--/codeinclude-->
-
- Runs graph-based research orchestration.
-
- **Parameters**:
- - `query`: Research query string
- - `message_history`: Optional user conversation history in Pydantic AI `ModelMessage` format (default: None)
-
- **Yields**: `AgentEvent` objects during graph execution.
-
- **Note**:
- - `research_mode` and `use_graph` are constructor parameters, not `run()` parameters.
- - The `message_history` parameter enables multi-turn conversations by providing context from previous interactions. Message history is stored in `GraphExecutionContext` and passed to agents during execution.
-
- ## Orchestrator Factory
-
- **Module**: `src.orchestrator_factory`
-
- **Purpose**: Factory for creating orchestrators.
-
- ### Functions
-
- #### `create_orchestrator`
-
- <!--codeinclude-->
- [create_orchestrator](../src/orchestrator_factory.py) start_line:44 end_line:50
- <!--/codeinclude-->
-
- Creates an orchestrator instance.
-
- **Parameters**:
- - `search_handler`: Search handler protocol implementation (optional, required for simple mode)
- - `judge_handler`: Judge handler protocol implementation (optional, required for simple mode)
- - `config`: Configuration object (optional)
- - `mode`: Orchestrator mode ("simple", "advanced", "magentic", "iterative", "deep", "auto", or None for auto-detect)
- - `oauth_token`: Optional OAuth token from HuggingFace login (takes priority over env vars)
-
- **Returns**: Orchestrator instance.
-
- **Raises**:
- - `ValueError`: If requirements not met
-
- **Modes**:
- - `"simple"`: Legacy orchestrator
- - `"advanced"` or `"magentic"`: Magentic orchestrator (requires OpenAI API key)
- - `None`: Auto-detect based on API key availability
-
- ## MagenticOrchestrator
-
- **Module**: `src.orchestrator_magentic`
-
- **Purpose**: Multi-agent coordination using Microsoft Agent Framework.
-
- ### Methods
-
- #### `run`
-
- <!--codeinclude-->
- [MagenticOrchestrator.run](../src/orchestrator_magentic.py) start_line:101 end_line:101
- <!--/codeinclude-->
-
- Runs Magentic orchestration.
-
- **Parameters**:
- - `query`: Research query string
-
- **Yields**: `AgentEvent` objects converted from Magentic events.
-
- **Note**: `max_rounds` and `max_stalls` are constructor parameters, not `run()` parameters.
-
- **Requirements**:
- - `agent-framework-core` package
- - OpenAI API key
-
- ## See Also
-
- - [Architecture - Orchestrators](../architecture/orchestrators.md) - Architecture overview
- - [Graph Orchestration](../architecture/graph_orchestration.md) - Graph execution details
docs/api/services.md DELETED
@@ -1,279 +0,0 @@
- # Services API Reference
-
- This page documents the API for DeepCritical services.
-
- ## EmbeddingService
-
- **Module**: `src.services.embeddings`
-
- **Purpose**: Local sentence-transformers for semantic search and deduplication.
-
- ### Methods
-
- #### `embed`
-
- <!--codeinclude-->
- [EmbeddingService.embed](../src/services/embeddings.py) start_line:55 end_line:55
- <!--/codeinclude-->
-
- Generates embedding for a text string.
-
- **Parameters**:
- - `text`: Text to embed
-
- **Returns**: Embedding vector as list of floats.
-
- #### `embed_batch`
-
- ```python
- async def embed_batch(self, texts: list[str]) -> list[list[float]]
- ```
-
- Generates embeddings for multiple texts.
-
- **Parameters**:
- - `texts`: List of texts to embed
-
- **Returns**: List of embedding vectors.
-
- #### `similarity`
-
- ```python
- async def similarity(self, text1: str, text2: str) -> float
- ```
-
- Calculates similarity between two texts.
-
- **Parameters**:
- - `text1`: First text
- - `text2`: Second text
-
- **Returns**: Similarity score (0.0-1.0).
-
- #### `find_duplicates`
-
- ```python
- async def find_duplicates(
-     self,
-     texts: list[str],
-     threshold: float = 0.85
- ) -> list[tuple[int, int]]
- ```
-
- Finds duplicate texts based on similarity threshold.
-
- **Parameters**:
- - `texts`: List of texts to check
- - `threshold`: Similarity threshold (default: 0.85)
-
- **Returns**: List of (index1, index2) tuples for duplicate pairs.
-
- #### `add_evidence`
-
- ```python
- async def add_evidence(
-     self,
-     evidence_id: str,
-     content: str,
-     metadata: dict[str, Any]
- ) -> None
- ```
-
- Adds evidence to vector store for semantic search.
-
- **Parameters**:
- - `evidence_id`: Unique identifier for the evidence
- - `content`: Evidence text content
- - `metadata`: Additional metadata dictionary
-
- #### `search_similar`
-
- ```python
- async def search_similar(
-     self,
-     query: str,
-     n_results: int = 5
- ) -> list[dict[str, Any]]
- ```
-
- Finds semantically similar evidence.
-
- **Parameters**:
- - `query`: Search query string
- - `n_results`: Number of results to return (default: 5)
-
- **Returns**: List of dictionaries with `id`, `content`, `metadata`, and `distance` keys.
-
- #### `deduplicate`
-
- ```python
- async def deduplicate(
-     self,
-     new_evidence: list[Evidence],
-     threshold: float = 0.9
- ) -> list[Evidence]
- ```
-
- Removes semantically duplicate evidence.
-
- **Parameters**:
- - `new_evidence`: List of evidence items to deduplicate
- - `threshold`: Similarity threshold (default: 0.9, where 0.9 = 90% similar is duplicate)
-
- **Returns**: List of unique evidence items (not already in vector store).
-
- ### Factory Function
-
- #### `get_embedding_service`
-
- ```python
- @lru_cache(maxsize=1)
- def get_embedding_service() -> EmbeddingService
- ```
-
- Returns singleton EmbeddingService instance.
-
- ## LlamaIndexRAGService
-
- **Module**: `src.services.llamaindex_rag`
-
- **Purpose**: Retrieval-Augmented Generation using LlamaIndex.
-
- ### Methods
-
- #### `ingest_evidence`
-
- <!--codeinclude-->
- [LlamaIndexRAGService.ingest_evidence](../src/services/llamaindex_rag.py) start_line:290 end_line:290
- <!--/codeinclude-->
-
- Ingests evidence into RAG service.
-
- **Parameters**:
- - `evidence_list`: List of Evidence objects to ingest
-
- **Note**: Supports multiple embedding providers (OpenAI, local sentence-transformers, Hugging Face).
-
- #### `retrieve`
-
- ```python
- def retrieve(
-     self,
-     query: str,
-     top_k: int | None = None
- ) -> list[dict[str, Any]]
- ```
-
- Retrieves relevant documents for a query.
-
- **Parameters**:
- - `query`: Search query string
- - `top_k`: Number of top results to return (defaults to `similarity_top_k` from constructor)
-
- **Returns**: List of dictionaries with `text`, `score`, and `metadata` keys.
-
- #### `query`
-
- ```python
- def query(
-     self,
-     query_str: str,
-     top_k: int | None = None
- ) -> str
- ```
-
- Queries RAG service and returns synthesized response.
-
- **Parameters**:
- - `query_str`: Query string
- - `top_k`: Number of results to use (defaults to `similarity_top_k` from constructor)
-
- **Returns**: Synthesized response string.
-
- **Raises**:
- - `ConfigurationError`: If no LLM API key is available for query synthesis
-
- #### `ingest_documents`
-
- ```python
- def ingest_documents(self, documents: list[Any]) -> None
- ```
-
- Ingests raw LlamaIndex Documents.
-
- **Parameters**:
- - `documents`: List of LlamaIndex Document objects
-
- #### `clear_collection`
-
- ```python
- def clear_collection(self) -> None
- ```
-
- Clears all documents from the collection.
-
- ### Factory Function
-
- #### `get_rag_service`
-
- ```python
- def get_rag_service(
-     collection_name: str = "deepcritical_evidence",
-     oauth_token: str | None = None,
-     **kwargs: Any
- ) -> LlamaIndexRAGService
- ```
-
- Gets or creates a RAG service instance.
-
- **Parameters**:
- - `collection_name`: Name of the ChromaDB collection (default: "deepcritical_evidence")
- - `oauth_token`: Optional OAuth token from HuggingFace login (takes priority over env vars)
- - `**kwargs`: Additional arguments for LlamaIndexRAGService (e.g., `use_openai_embeddings=False`)
-
- **Returns**: Configured LlamaIndexRAGService instance.
-
- **Note**: By default, uses local embeddings (sentence-transformers) which require no API keys.
-
- ## StatisticalAnalyzer
-
- **Module**: `src.services.statistical_analyzer`
-
- **Purpose**: Secure execution of AI-generated statistical code.
-
- ### Methods
-
- #### `analyze`
-
- ```python
- async def analyze(
-     self,
-     query: str,
-     evidence: list[Evidence],
-     hypothesis: dict[str, Any] | None = None
- ) -> AnalysisResult
- ```
-
- Analyzes a research question using statistical methods.
-
- **Parameters**:
- - `query`: The research question
- - `evidence`: List of Evidence objects to analyze
- - `hypothesis`: Optional hypothesis dict with `drug`, `target`, `pathway`, `effect`, `confidence` keys
-
- **Returns**: `AnalysisResult` with:
- - `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
- - `confidence`: Confidence in verdict (0.0-1.0)
- - `statistical_evidence`: Summary of statistical findings
- - `code_generated`: Python code that was executed
- - `execution_output`: Output from code execution
- - `key_takeaways`: Key takeaways from analysis
- - `limitations`: List of limitations
-
- **Note**: Requires Modal credentials for sandbox execution.
-
- ## See Also
-
- - [Architecture - Services](../architecture/services.md) - Architecture overview
- - [Configuration](../configuration/index.md) - Service configuration
-
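The `similarity`/`find_duplicates` semantics documented in the deleted services page reduce to a cosine-similarity threshold over embedding vectors. A dependency-free sketch of that logic (the real `EmbeddingService` produces its vectors with sentence-transformers; the vectors below are made up):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_duplicates(
    vectors: list[list[float]], threshold: float = 0.85
) -> list[tuple[int, int]]:
    """Return (i, j) index pairs whose similarity meets the threshold."""
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Two near-parallel vectors and one orthogonal vector:
vecs = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0]]
print(find_duplicates(vecs))  # [(0, 1)]
```

`deduplicate` applies the same test against vectors already in the store, keeping only items below the threshold.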
docs/api/tools.md DELETED
@@ -1,259 +0,0 @@
- # Tools API Reference
-
- This page documents the API for DeepCritical search tools.
-
- ## SearchTool Protocol
-
- All tools implement the `SearchTool` protocol:
-
- ```python
- class SearchTool(Protocol):
-     @property
-     def name(self) -> str: ...
-
-     async def search(
-         self,
-         query: str,
-         max_results: int = 10
-     ) -> list[Evidence]: ...
- ```
-
- ## PubMedTool
-
- **Module**: `src.tools.pubmed`
-
- **Purpose**: Search peer-reviewed biomedical literature from PubMed.
-
- ### Properties
-
- #### `name`
-
- ```python
- @property
- def name(self) -> str
- ```
-
- Returns tool name: `"pubmed"`
-
- ### Methods
-
- #### `search`
-
- ```python
- async def search(
-     self,
-     query: str,
-     max_results: int = 10
- ) -> list[Evidence]
- ```
-
- Searches PubMed for articles.
-
- **Parameters**:
- - `query`: Search query string
- - `max_results`: Maximum number of results to return (default: 10)
-
- **Returns**: List of `Evidence` objects with PubMed articles.
-
- **Raises**:
- - `SearchError`: If search fails (timeout, HTTP error, XML parsing error)
- - `RateLimitError`: If rate limit is exceeded (429 status code)
-
- **Note**: Uses NCBI E-utilities (ESearch → EFetch). Rate limit: 0.34s between requests. Handles single vs. multiple articles.
-
- ## ClinicalTrialsTool
-
- **Module**: `src.tools.clinicaltrials`
-
- **Purpose**: Search ClinicalTrials.gov for interventional studies.
-
- ### Properties
-
- #### `name`
-
- ```python
- @property
- def name(self) -> str
- ```
-
- Returns tool name: `"clinicaltrials"`
-
- ### Methods
-
- #### `search`
-
- ```python
- async def search(
-     self,
-     query: str,
-     max_results: int = 10
- ) -> list[Evidence]
- ```
-
- Searches ClinicalTrials.gov for trials.
-
- **Parameters**:
- - `query`: Search query string
- - `max_results`: Maximum number of results to return (default: 10)
-
- **Returns**: List of `Evidence` objects with clinical trials.
-
- **Note**: Only returns interventional studies with status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION. Uses `requests` library (NOT httpx - WAF blocks httpx). Runs in thread pool for async compatibility.
-
- **Raises**:
- - `SearchError`: If search fails (HTTP error, request exception)
-
- ## EuropePMCTool
-
- **Module**: `src.tools.europepmc`
-
- **Purpose**: Search Europe PMC for preprints and peer-reviewed articles.
-
- ### Properties
-
- #### `name`
-
- ```python
- @property
- def name(self) -> str
- ```
-
- Returns tool name: `"europepmc"`
-
- ### Methods
-
- #### `search`
-
- ```python
- async def search(
-     self,
-     query: str,
-     max_results: int = 10
- ) -> list[Evidence]
- ```
-
- Searches Europe PMC for articles and preprints.
-
- **Parameters**:
- - `query`: Search query string
- - `max_results`: Maximum number of results to return (default: 10)
-
- **Returns**: List of `Evidence` objects with articles/preprints.
-
- **Note**: Includes both preprints (marked with `[PREPRINT - Not peer-reviewed]`) and peer-reviewed articles. Handles preprint markers. Builds URLs from DOI or PMID.
-
- **Raises**:
- - `SearchError`: If search fails (HTTP error, connection error)
-
- ## RAGTool
-
- **Module**: `src.tools.rag_tool`
-
- **Purpose**: Semantic search within collected evidence.
-
- ### Initialization
-
- ```python
- def __init__(
-     self,
-     rag_service: LlamaIndexRAGService | None = None,
-     oauth_token: str | None = None
- ) -> None
- ```
-
- **Parameters**:
- - `rag_service`: Optional RAG service instance. If None, will be lazy-initialized.
- - `oauth_token`: Optional OAuth token from HuggingFace login (for RAG LLM)
-
- ### Properties
-
- #### `name`
-
- ```python
- @property
- def name(self) -> str
- ```
-
- Returns tool name: `"rag"`
-
- ### Methods
-
- #### `search`
-
- ```python
- async def search(
-     self,
-     query: str,
-     max_results: int = 10
- ) -> list[Evidence]
- ```
-
- Searches collected evidence using semantic similarity.
-
- **Parameters**:
- - `query`: Search query string
- - `max_results`: Maximum number of results to return (default: 10)
-
- **Returns**: List of `Evidence` objects from collected evidence.
-
- **Raises**:
- - `ConfigurationError`: If RAG service is unavailable
-
- **Note**: Requires evidence to be ingested into RAG service first. Wraps `LlamaIndexRAGService`. Returns Evidence from RAG results.
-
- ## SearchHandler
-
- **Module**: `src.tools.search_handler`
-
- **Purpose**: Orchestrates parallel searches across multiple tools.
-
- ### Initialization
-
- ```python
- def __init__(
-     self,
-     tools: list[SearchTool],
-     timeout: float = 30.0,
-     include_rag: bool = False,
-     auto_ingest_to_rag: bool = True,
-     oauth_token: str | None = None
- ) -> None
- ```
-
- **Parameters**:
- - `tools`: List of search tools to use
- - `timeout`: Timeout for each search in seconds (default: 30.0)
- - `include_rag`: Whether to include RAG tool in searches (default: False)
- - `auto_ingest_to_rag`: Whether to automatically ingest results into RAG (default: True)
- - `oauth_token`: Optional OAuth token from HuggingFace login (for RAG LLM)
-
- ### Methods
-
- #### `execute`
-
- <!--codeinclude-->
- [SearchHandler.execute](../src/tools/search_handler.py) start_line:86 end_line:86
- <!--/codeinclude-->
-
- Searches multiple tools in parallel.
-
- **Parameters**:
- - `query`: Search query string
- - `max_results_per_tool`: Maximum results per tool (default: 10)
-
- **Returns**: `SearchResult` with:
- - `query`: The search query
- - `evidence`: Aggregated list of evidence
- - `sources_searched`: List of source names searched
- - `total_found`: Total number of results
- - `errors`: List of error messages from failed tools
-
- **Raises**:
- - `SearchError`: If search times out
-
- **Note**: Uses `asyncio.gather()` for parallel execution. Handles tool failures gracefully (returns errors in `SearchResult.errors`). Automatically ingests evidence into RAG if enabled.
-
- ## See Also
-
- - [Architecture - Tools](../architecture/tools.md) - Architecture overview
- - [Models API](models.md) - Data models used by tools
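The deleted tools page defines a `SearchTool` protocol and notes that `SearchHandler` fans searches out with `asyncio.gather()` while isolating per-tool failures. A self-contained sketch of both ideas (the `StubTool` class, the simplified `Evidence` stand-in, and the `execute` helper are illustrative, not the real implementations):

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Evidence:
    """Simplified stand-in for src.utils.models.Evidence."""

    content: str
    source: str


class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...


class StubTool:
    """Hypothetical tool that structurally satisfies the SearchTool protocol."""

    def __init__(self, name: str) -> None:
        self._name = name

    @property
    def name(self) -> str:
        return self._name

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        return [Evidence(content=f"{query} via {self._name}", source=self._name)]


async def execute(tools: list[SearchTool], query: str) -> list[Evidence]:
    # Parallel fan-out with per-tool error isolation, in the spirit of SearchHandler.
    results = await asyncio.gather(
        *(t.search(query) for t in tools), return_exceptions=True
    )
    evidence: list[Evidence] = []
    for r in results:
        if not isinstance(r, BaseException):
            evidence.extend(r)
    return evidence


ev = asyncio.run(execute([StubTool("pubmed"), StubTool("europepmc")], "statins"))
print([e.source for e in ev])  # ['pubmed', 'europepmc']
```

`return_exceptions=True` is what lets one failing tool surface as an error entry instead of aborting the whole search round.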
docs/architecture/agents.md DELETED
@@ -1,293 +0,0 @@
1
- # Agents Architecture
2
-
3
- DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types.
4
-
5
- ## Agent Pattern
6
-
7
- ### Pydantic AI Agents
8
-
9
- Pydantic AI agents use the `Agent` class with the following structure:
10
-
11
- - **System Prompt**: Module-level constant with date injection
12
- - **Agent Class**: `__init__(model: Any | None = None)`
13
- - **Main Method**: Async method (e.g., `async def evaluate()`, `async def write_report()`)
14
- - **Factory Function**: `def create_agent_name(model: Any | None = None, oauth_token: str | None = None) -> AgentName`
15
-
16
- **Note**: Factory functions accept an optional `oauth_token` parameter for HuggingFace authentication, which takes priority over environment variables.
17
-
18
- ## Model Initialization
19
-
20
- Agents use `get_model()` from `src/agent_factory/judges.py` if no model is provided. This supports:
21
-
22
- - OpenAI models
23
- - Anthropic models
24
- - HuggingFace Inference API models
25
-
26
- The model selection is based on the configured `LLM_PROVIDER` in settings.
27
-
28
- ## Error Handling
29
-
30
- Agents return fallback values on failure rather than raising exceptions:
31
-
32
- - `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`
33
- - Empty strings for text outputs
34
- - Default structured outputs
35
-
36
- All errors are logged with context using structlog.
37
-
38
- ## Input Validation
39
-
40
- All agents validate inputs:
41
-
42
- - Check that queries/inputs are not empty
43
- - Truncate very long inputs with warnings
44
- - Handle None values gracefully
45
-
46
- ## Output Types
47
-
48
- Agents use structured output types from `src/utils/models.py`:
49
-
50
- - `KnowledgeGapOutput`: Research completeness evaluation
51
- - `AgentSelectionPlan`: Tool selection plan
52
- - `ReportDraft`: Long-form report structure
53
- - `ParsedQuery`: Query parsing and mode detection
54
-
55
- For text output (writer agents), agents return `str` directly.
56
-
57
- ## Agent Types
58
-
59
- ### Knowledge Gap Agent
60
-
61
- **File**: `src/agents/knowledge_gap.py`
62
-
63
- **Purpose**: Evaluates research state and identifies knowledge gaps.
64
-
65
- **Output**: `KnowledgeGapOutput` with:
66
- - `research_complete`: Boolean indicating if research is complete
67
- - `outstanding_gaps`: List of remaining knowledge gaps
68
-
69
- **Methods**:
70
- - `async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput`
71
-
72
- ### Tool Selector Agent
73
-
74
- **File**: `src/agents/tool_selector.py`
75
-
76
- **Purpose**: Selects appropriate tools for addressing knowledge gaps.
77
-
78
- **Output**: `AgentSelectionPlan` with list of `AgentTask` objects.
79
-
80
- **Available Agents**:
81
- - `WebSearchAgent`: General web search for fresh information
82
- - `SiteCrawlerAgent`: Research specific entities/companies
83
- - `RAGAgent`: Semantic search within collected evidence
84
-
85
- ### Writer Agent
86
-
87
- **File**: `src/agents/writer.py`
88
-
89
- **Purpose**: Generates final reports from research findings.
90
-
91
- **Output**: Markdown string with numbered citations.
92
-
93
- **Methods**:
94
- - `async def write_report(query, findings, output_length, output_instructions) -> str`
95
-
96
- **Features**:
97
- - Validates inputs
98
- - Truncates very long findings (max 50000 chars) with warning
99
- - Retry logic for transient failures (3 retries)
100
- - Citation validation before returning
101
-
102
- ### Long Writer Agent
103
-
104
- **File**: `src/agents/long_writer.py`
105
-
106
- **Purpose**: Long-form report generation with section-by-section writing.
107
-
108
- **Input/Output**: Uses `ReportDraft` models.
109
-
110
- **Methods**:
111
- - `async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput`
112
- - `async def write_report(query, report_title, report_draft) -> str`
113
-
114
- **Features**:
115
- - Writes sections iteratively
116
- - Aggregates references across sections
117
- - Reformats section headings and references
118
- - Deduplicates and renumbers references
119
-
120
- ### Proofreader Agent
121
-
122
- **File**: `src/agents/proofreader.py`
123
-
124
- **Purpose**: Proofreads and polishes report drafts.
125
-
126
- **Input**: `ReportDraft`
127
- **Output**: Polished markdown string
128
-
129
- **Methods**:
130
- - `async def proofread(query, report_title, report_draft) -> str`
131
-
132
- **Features**:
133
- - Removes duplicate content across sections
134
- - Adds executive summary if multiple sections
135
- - Preserves all references and citations
136
- - Improves flow and readability
137
-
138
- ### Thinking Agent
139
-
140
- **File**: `src/agents/thinking.py`
141
-
142
- **Purpose**: Generates observations from conversation history.
143
-
144
- **Output**: Observation string
145
-
146
- **Methods**:
147
- - `async def generate_observations(query, background_context, conversation_history) -> str`
148
-
149
- ### Input Parser Agent
150
-
151
- **File**: `src/agents/input_parser.py`
152
-
153
- **Purpose**: Parses and improves user queries, detects research mode.
154
-
155
- **Output**: `ParsedQuery` with:
156
- - `original_query`: Original query string
157
- - `improved_query`: Refined query string
158
- - `research_mode`: "iterative" or "deep"
159
- - `key_entities`: List of key entities
160
- - `research_questions`: List of research questions
161
-
- ## Magentic Agents
-
- The following agents use the `BaseAgent` pattern from `agent-framework` and are used exclusively with `MagenticOrchestrator`:
-
- ### Hypothesis Agent
-
- **File**: `src/agents/hypothesis_agent.py`
-
- **Purpose**: Generates mechanistic hypotheses based on evidence.
-
- **Pattern**: `BaseAgent` from `agent-framework`
-
- **Methods**:
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
-
- **Features**:
- - Uses an internal Pydantic AI `Agent` with a `HypothesisAssessment` output type
- - Accesses the shared `evidence_store` for evidence
- - Uses the embedding service for diverse evidence selection (MMR algorithm)
- - Stores hypotheses in the shared context
-
- ### Search Agent
-
- **File**: `src/agents/search_agent.py`
-
- **Purpose**: Wraps `SearchHandler` as an agent for the Magentic orchestrator.
-
- **Pattern**: `BaseAgent` from `agent-framework`
-
- **Methods**:
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
-
- **Features**:
- - Executes searches via `SearchHandlerProtocol`
- - Deduplicates evidence using the embedding service
- - Searches for semantically related evidence
- - Updates the shared evidence store
-
- ### Analysis Agent
-
- **File**: `src/agents/analysis_agent.py`
-
- **Purpose**: Performs statistical analysis using a Modal sandbox.
-
- **Pattern**: `BaseAgent` from `agent-framework`
-
- **Methods**:
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
-
- **Features**:
- - Wraps the `StatisticalAnalyzer` service
- - Analyzes evidence and hypotheses
- - Returns a verdict (SUPPORTED/REFUTED/INCONCLUSIVE)
- - Stores analysis results in the shared context
-
- ### Report Agent (Magentic)
-
- **File**: `src/agents/report_agent.py`
-
- **Purpose**: Generates structured scientific reports from evidence and hypotheses.
-
- **Pattern**: `BaseAgent` from `agent-framework`
-
- **Methods**:
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
-
- **Features**:
- - Uses an internal Pydantic AI `Agent` with a `ResearchReport` output type
- - Accesses the shared evidence store and hypotheses
- - Validates citations before returning
- - Formats the report as markdown
-
- ### Judge Agent
-
- **File**: `src/agents/judge_agent.py`
-
- **Purpose**: Evaluates evidence quality and determines whether it is sufficient for synthesis.
-
- **Pattern**: `BaseAgent` from `agent-framework`
-
- **Methods**:
- - `async def run(messages, thread, **kwargs) -> AgentRunResponse`
- - `async def run_stream(messages, thread, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]`
-
- **Features**:
- - Wraps `JudgeHandlerProtocol`
- - Accesses the shared evidence store
- - Returns a `JudgeAssessment` with a sufficient flag, confidence, and recommendation
-
- ## Agent Patterns
-
- DeepCritical uses two distinct agent patterns:
-
- ### 1. Pydantic AI Agents (Traditional Pattern)
-
- These agents use the Pydantic AI `Agent` class directly and are used in the iterative and deep research flows:
-
- - **Pattern**: `Agent(model, output_type, system_prompt)`
- - **Initialization**: `__init__(model: Any | None = None)`
- - **Methods**: Agent-specific async methods (e.g., `async def evaluate()`, `async def write_report()`)
- - **Examples**: `KnowledgeGapAgent`, `ToolSelectorAgent`, `WriterAgent`, `LongWriterAgent`, `ProofreaderAgent`, `ThinkingAgent`, `InputParserAgent`
-
- ### 2. Magentic Agents (Agent-Framework Pattern)
-
- These agents use the `BaseAgent` class from `agent-framework` and are used in the Magentic orchestrator:
-
- - **Pattern**: `BaseAgent` from `agent-framework` with an `async def run()` method
- - **Initialization**: `__init__(evidence_store, embedding_service, ...)`
- - **Methods**: `async def run(messages, thread, **kwargs) -> AgentRunResponse`
- - **Examples**: `HypothesisAgent`, `SearchAgent`, `AnalysisAgent`, `ReportAgent`, `JudgeAgent`
-
- **Note**: Magentic agents are used exclusively with the `MagenticOrchestrator` and follow the agent-framework protocol for multi-agent coordination.
-
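A minimal sketch of the Magentic pattern, under stated assumptions: `AgentRunResponse` here is a stand-in dataclass for the agent-framework type, and the search logic is stubbed. What the sketch shows is the contract itself: shared dependencies injected at construction, and all work done in an async `run()` that reads and writes the shared stores.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentRunResponse:
    """Stand-in for agent-framework's response type (assumption)."""

    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


class SearchAgentSketch:
    """Sketch of the Magentic pattern: dependencies at init, work in run()."""

    def __init__(self, evidence_store: dict[str, Any],
                 embedding_service: Any = None) -> None:
        self.evidence_store = evidence_store
        self.embedding_service = embedding_service

    async def run(self, messages: list[str], thread: Any = None,
                  **kwargs: Any) -> AgentRunResponse:
        query = messages[-1]
        # Real agent: search via SearchHandlerProtocol, deduplicate with the
        # embedding service, then update the shared evidence store.
        self.evidence_store.setdefault("evidence", []).append({"query": query})
        return AgentRunResponse(text=f"searched: {query}")
```

Because every participant exposes the same `run()` signature, the orchestrator's manager can route messages between them without knowing their internals.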
- ## Factory Functions
-
- All agents have factory functions in `src/agent_factory/agents.py`:
-
- <!--codeinclude-->
- [Factory Functions](../src/agent_factory/agents.py) start_line:79 end_line:100
- <!--/codeinclude-->
-
- Factory functions:
- - Use `get_model()` if no model is provided
- - Accept an `oauth_token` parameter for HuggingFace authentication
- - Raise `ConfigurationError` if creation fails
- - Log agent creation
-
- ## See Also
-
- - [Orchestrators](orchestrators.md) - How agents are orchestrated
- - [API Reference - Agents](../api/agents.md) - API documentation
- - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
docs/architecture/graph_orchestration.md DELETED
@@ -1,302 +0,0 @@
- # Graph Orchestration Architecture
-
- ## Overview
-
- DeepCritical implements a graph-based orchestration system for research workflows using Pydantic AI agents as nodes. This enables better parallel execution, conditional routing, and state management compared to simple agent chains.
-
- ## Conversation History
-
- DeepCritical supports multi-turn conversations through Pydantic AI's native message history format. The system maintains two types of history:
-
- 1. **User Conversation History**: Multi-turn user interactions (from the Gradio chat interface) stored as `list[ModelMessage]`
- 2. **Research Iteration History**: Internal research process state (the existing `Conversation` model)
-
- ### Message History Flow
-
- ```
- Gradio Chat History → convert_gradio_to_message_history() → GraphOrchestrator.run(message_history)
-
- GraphExecutionContext (stores message_history)
-
- Agent Nodes (receive message_history via agent.run())
-
- WorkflowState (persists user_message_history)
- ```
-
- ### Usage
-
- Message history is automatically converted from Gradio format and passed through the orchestrator:
-
- ```python
- # In app.py - automatic conversion
- message_history = convert_gradio_to_message_history(history) if history else None
- async for event in orchestrator.run(query, message_history=message_history):
-     yield event
- ```
-
- Agents receive message history through their `run()` methods:
-
- ```python
- # In agent execution
- if message_history:
-     result = await agent.run(input_data, message_history=message_history)
- ```
-
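A hedged sketch of what the conversion step does. The real helper returns Pydantic AI `ModelMessage` objects; plain role/content dicts stand in for them here, and the handling of both Gradio history formats is an assumption.

```python
def convert_gradio_to_message_history(history: list) -> list[dict]:
    """Flatten Gradio chat history into an ordered message list (sketch).

    Accepts either [[user, assistant], ...] pair format or
    [{"role": ..., "content": ...}, ...] messages format.
    """
    messages: list[dict] = []
    for turn in history:
        if isinstance(turn, dict):
            # Messages format: pass the role/content pair through.
            messages.append({"role": turn["role"], "content": turn["content"]})
        else:
            # Pair format: [user_text, assistant_text]; skip empty slots.
            user_text, assistant_text = turn
            if user_text:
                messages.append({"role": "user", "content": user_text})
            if assistant_text:
                messages.append({"role": "assistant", "content": assistant_text})
    return messages
```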
- ## Graph Patterns
-
- ### Iterative Research Graph
-
- The iterative research graph follows this pattern:
-
- ```
- [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?]
-                                             ↓ No          ↓ Yes
-                                      [Tool Selector]   [Writer]
-                                             ↓
-                                      [Execute Tools] → [Loop Back]
- ```
-
- **Node IDs**: `thinking` → `knowledge_gap` → `continue_decision` → `tool_selector`/`writer` → `execute_tools` → (loop back to `thinking`)
-
- **Special Node Handling**:
- - `execute_tools`: State node that uses `search_handler` to execute searches and add evidence to workflow state
- - `continue_decision`: Decision node that routes based on the `research_complete` flag from `KnowledgeGapOutput`
-
- ### Deep Research Graph
-
- The deep research graph follows this pattern:
-
- ```
- [Input] → [Planner] → [Store Plan] → [Parallel Loops] → [Collect Drafts] → [Synthesizer]
-                                       ↓       ↓       ↓
-                                   [Loop1] [Loop2] [Loop3]
- ```
-
- **Node IDs**: `planner` → `store_plan` → `parallel_loops` → `collect_drafts` → `synthesizer`
-
- **Special Node Handling**:
- - `planner`: Agent node that creates a `ReportPlan` with the report outline
- - `store_plan`: State node that stores the `ReportPlan` in context for the parallel loops
- - `parallel_loops`: Parallel node that executes `IterativeResearchFlow` instances for each section
- - `collect_drafts`: State node that collects section drafts from the parallel loops
- - `synthesizer`: Agent node that calls `LongWriterAgent.write_report()` directly with a `ReportDraft`
-
- ### Deep Research
-
- ```mermaid
- sequenceDiagram
-     actor User
-     participant GraphOrchestrator
-     participant InputParser
-     participant GraphBuilder
-     participant GraphExecutor
-     participant Agent
-     participant BudgetTracker
-     participant WorkflowState
-
-     User->>GraphOrchestrator: run(query)
-     GraphOrchestrator->>InputParser: detect_research_mode(query)
-     InputParser-->>GraphOrchestrator: mode (iterative/deep)
-     GraphOrchestrator->>GraphBuilder: build_graph(mode)
-     GraphBuilder-->>GraphOrchestrator: ResearchGraph
-     GraphOrchestrator->>WorkflowState: init_workflow_state()
-     GraphOrchestrator->>BudgetTracker: create_budget()
-     GraphOrchestrator->>GraphExecutor: _execute_graph(graph)
-
-     loop For each node in graph
-         GraphExecutor->>Agent: execute_node(agent_node)
-         Agent->>Agent: process_input
-         Agent-->>GraphExecutor: result
-         GraphExecutor->>WorkflowState: update_state(result)
-         GraphExecutor->>BudgetTracker: add_tokens(used)
-         GraphExecutor->>BudgetTracker: check_budget()
-         alt Budget exceeded
-             GraphExecutor->>GraphOrchestrator: emit(error_event)
-         else Continue
-             GraphExecutor->>GraphOrchestrator: emit(progress_event)
-         end
-     end
-
-     GraphOrchestrator->>User: AsyncGenerator[AgentEvent]
- ```
-
- ### Iterative Research
-
- ```mermaid
- sequenceDiagram
-     participant IterativeFlow
-     participant ThinkingAgent
-     participant KnowledgeGapAgent
-     participant ToolSelector
-     participant ToolExecutor
-     participant JudgeHandler
-     participant WriterAgent
-
-     IterativeFlow->>IterativeFlow: run(query)
-
-     loop Until complete or max_iterations
-         IterativeFlow->>ThinkingAgent: generate_observations()
-         ThinkingAgent-->>IterativeFlow: observations
-
-         IterativeFlow->>KnowledgeGapAgent: evaluate_gaps()
-         KnowledgeGapAgent-->>IterativeFlow: KnowledgeGapOutput
-
-         alt Research complete
-             IterativeFlow->>WriterAgent: create_final_report()
-             WriterAgent-->>IterativeFlow: final_report
-         else Gaps remain
-             IterativeFlow->>ToolSelector: select_agents(gap)
-             ToolSelector-->>IterativeFlow: AgentSelectionPlan
-
-             IterativeFlow->>ToolExecutor: execute_tool_tasks()
-             ToolExecutor-->>IterativeFlow: ToolAgentOutput[]
-
-             IterativeFlow->>JudgeHandler: assess_evidence()
-             JudgeHandler-->>IterativeFlow: should_continue
-         end
-     end
- ```
-
- ## Graph Structure
-
- ### Nodes
-
- Graph nodes represent different stages in the research workflow:
-
- 1. **Agent Nodes**: Execute Pydantic AI agents
-    - Input: Prompt/query
-    - Output: Structured or unstructured response
-    - Examples: `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`
-
- 2. **State Nodes**: Update or read workflow state
-    - Input: Current state
-    - Output: Updated state
-    - Examples: Update evidence, update conversation history
-
- 3. **Decision Nodes**: Make routing decisions based on conditions
-    - Input: Current state/results
-    - Output: Next node ID
-    - Examples: Continue research vs. complete research
-
- 4. **Parallel Nodes**: Execute multiple nodes concurrently
-    - Input: List of node IDs
-    - Output: Aggregated results
-    - Examples: Parallel iterative research loops
-
- ### Edges
-
- Edges define transitions between nodes:
-
- 1. **Sequential Edges**: Always traversed (no condition)
-    - From: Source node
-    - To: Target node
-    - Condition: None (always True)
-
- 2. **Conditional Edges**: Traversed based on a condition
-    - From: Source node
-    - To: Target node
-    - Condition: Callable that returns bool
-    - Example: If research complete → go to writer, else → continue loop
-
- 3. **Parallel Edges**: Used for parallel execution branches
-    - From: Parallel node
-    - To: Multiple target nodes
-    - Execution: All targets run concurrently
-
- ## State Management
-
- State is managed via `WorkflowState`, which uses `ContextVar` for thread-safe isolation:
-
- - **Evidence**: Collected evidence from searches
- - **Conversation**: Iteration history (gaps, tool calls, findings, thoughts)
- - **Embedding Service**: For semantic search
-
- State transitions occur at state nodes, which update the global workflow state.
-
- ## Execution Flow
-
- 1. **Graph Construction**: Build the graph from nodes and edges using `create_iterative_graph()` or `create_deep_graph()`
- 2. **Graph Validation**: Ensure the graph is valid (no cycles, all nodes reachable) via `ResearchGraph.validate_structure()`
- 3. **Graph Execution**: Traverse the graph from the entry node using `GraphOrchestrator._execute_graph()`
- 4. **Node Execution**: Execute each node based on type:
-    - **Agent Nodes**: Call `agent.run()` with transformed input
-    - **State Nodes**: Update workflow state via the `state_updater` function
-    - **Decision Nodes**: Evaluate the `decision_function` to get the next node ID
-    - **Parallel Nodes**: Execute all parallel nodes concurrently via `asyncio.gather()`
- 5. **Edge Evaluation**: Determine the next node(s) based on edges and conditions
- 6. **Parallel Execution**: Use `asyncio.gather()` for parallel nodes
- 7. **State Updates**: Update state at state nodes via `GraphExecutionContext.update_state()`
- 8. **Event Streaming**: Yield `AgentEvent` objects during execution for the UI
-
- ### GraphExecutionContext
-
- The `GraphExecutionContext` class manages execution state during graph traversal:
-
- - **State**: Current `WorkflowState` instance
- - **Budget Tracker**: `BudgetTracker` instance for budget enforcement
- - **Node Results**: Dictionary storing results from each node execution
- - **Visited Nodes**: Set of node IDs that have been executed
- - **Current Node**: ID of the node currently being executed
-
- Methods:
- - `set_node_result(node_id, result)`: Store the result from a node execution
- - `get_node_result(node_id)`: Retrieve a stored result
- - `has_visited(node_id)`: Check whether a node was visited
- - `mark_visited(node_id)`: Mark a node as visited
- - `update_state(updater, data)`: Update the workflow state
-
- ## Conditional Routing
-
- Decision nodes evaluate conditions and return next node IDs:
-
- - **Knowledge Gap Decision**: If `research_complete` → writer, else → tool selector
- - **Budget Decision**: If budget exceeded → exit, else → continue
- - **Iteration Decision**: If max iterations reached → exit, else → continue
-
- ## Parallel Execution
-
- Parallel nodes execute multiple nodes concurrently:
-
- - Each parallel branch runs independently
- - Results are aggregated after all branches complete
- - State is synchronized after parallel execution
- - Errors in one branch don't stop other branches
-
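The error-isolation behavior above is exactly what `asyncio.gather(..., return_exceptions=True)` provides: a failing branch yields its exception as a result instead of cancelling the siblings. A minimal sketch (branch names and failure logic are illustrative):

```python
import asyncio


async def run_branch(name: str) -> str:
    if name == "loop2":
        raise RuntimeError("branch failed")
    return f"{name}: ok"


async def run_parallel(names: list[str]) -> list[str]:
    # return_exceptions=True keeps one failing branch from stopping the rest;
    # the exception object appears in the results list in order.
    results = await asyncio.gather(*(run_branch(n) for n in names),
                                   return_exceptions=True)
    return [r for r in results if not isinstance(r, Exception)]


# asyncio.run(run_parallel(["loop1", "loop2", "loop3"])) -> ["loop1: ok", "loop3: ok"]
```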
- ## Budget Enforcement
-
- Budget constraints are enforced at decision nodes:
-
- - **Token Budget**: Track LLM token usage
- - **Time Budget**: Track elapsed time
- - **Iteration Budget**: Track iteration count
-
- If any budget is exceeded, execution routes to the exit node.
-
- ## Error Handling
-
- Errors are handled at multiple levels:
-
- 1. **Node Level**: Catch errors in individual node execution
- 2. **Graph Level**: Handle errors during graph traversal
- 3. **State Level**: Roll back state changes on error
-
- Errors are logged and yield error events for the UI.
-
- ## Backward Compatibility
-
- Graph execution is optional via a feature flag:
-
- - `USE_GRAPH_EXECUTION=true`: Use graph-based execution
- - `USE_GRAPH_EXECUTION=false`: Use agent chain execution (existing)
-
- This allows gradual migration and fallback if needed.
-
- ## See Also
-
- - [Orchestrators](orchestrators.md) - Overview of all orchestrator patterns
- - [Workflow Diagrams](workflow-diagrams.md) - Detailed workflow diagrams
- - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
docs/architecture/middleware.md DELETED
@@ -1,146 +0,0 @@
- # Middleware Architecture
-
- DeepCritical uses middleware for state management, budget tracking, and workflow coordination.
-
- ## State Management
-
- ### WorkflowState
-
- **File**: `src/middleware/state_machine.py`
-
- **Purpose**: Thread-safe state management for research workflows
-
- **Implementation**: Uses `ContextVar` for thread-safe isolation
-
- **State Components**:
- - `evidence: list[Evidence]`: Collected evidence from searches
- - `conversation: Conversation`: Iteration history (gaps, tool calls, findings, thoughts)
- - `embedding_service: Any`: Embedding service for semantic search
-
- **Methods**:
- - `add_evidence(new_evidence: list[Evidence]) -> int`: Adds evidence with URL-based deduplication. Returns the number of new items added (excluding duplicates).
- - `async search_related(query: str, n_results: int = 5) -> list[Evidence]`: Semantic search for related evidence using the embedding service
-
- **Initialization**:
-
- <!--codeinclude-->
- [Initialize Workflow State](../src/middleware/state_machine.py) start_line:98 end_line:110
- <!--/codeinclude-->
-
- **Access**:
-
- <!--codeinclude-->
- [Get Workflow State](../src/middleware/state_machine.py) start_line:115 end_line:129
- <!--/codeinclude-->
-
- ## Workflow Manager
-
- **File**: `src/middleware/workflow_manager.py`
-
- **Purpose**: Coordinates parallel research loops
-
- **Methods**:
- - `async add_loop(loop_id: str, query: str) -> ResearchLoop`: Add a new research loop to manage
- - `async run_loops_parallel(loop_configs: list[dict], loop_func: Callable, judge_handler: Any | None = None, budget_tracker: Any | None = None) -> list[Any]`: Run multiple research loops in parallel. Takes configuration dicts and a loop function.
- - `async update_loop_status(loop_id: str, status: LoopStatus, error: str | None = None)`: Update loop status
- - `async sync_loop_evidence_to_state(loop_id: str)`: Synchronize evidence from a specific loop to global state
-
- **Features**:
- - Uses `asyncio.gather()` for parallel execution
- - Handles errors per loop (doesn't fail all if one fails)
- - Tracks loop status: `pending`, `running`, `completed`, `failed`, `cancelled`
- - Evidence deduplication across parallel loops
-
- **Usage**:
- ```python
- from src.middleware.workflow_manager import WorkflowManager
-
- manager = WorkflowManager()
- await manager.add_loop("loop1", "Research query 1")
- await manager.add_loop("loop2", "Research query 2")
-
- async def run_research(config: dict) -> str:
-     loop_id = config["loop_id"]
-     query = config["query"]
-     # ... research logic ...
-     return "report"
-
- results = await manager.run_loops_parallel(
-     loop_configs=[
-         {"loop_id": "loop1", "query": "Research query 1"},
-         {"loop_id": "loop2", "query": "Research query 2"},
-     ],
-     loop_func=run_research,
- )
- ```
-
- ## Budget Tracker
-
- **File**: `src/middleware/budget_tracker.py`
-
- **Purpose**: Tracks and enforces resource limits
-
- **Budget Components**:
- - **Tokens**: LLM token usage
- - **Time**: Elapsed time in seconds
- - **Iterations**: Number of iterations
-
- **Methods**:
- - `create_budget(loop_id: str, tokens_limit: int = 100000, time_limit_seconds: float = 600.0, iterations_limit: int = 10) -> BudgetStatus`: Create a budget for a specific loop
- - `add_tokens(loop_id: str, tokens: int)`: Add token usage to a loop's budget
- - `start_timer(loop_id: str)`: Start time tracking for a loop
- - `update_timer(loop_id: str)`: Update elapsed time for a loop
- - `increment_iteration(loop_id: str)`: Increment the iteration count for a loop
- - `check_budget(loop_id: str) -> tuple[bool, str]`: Check whether a loop's budget has been exceeded. Returns `(exceeded: bool, reason: str)`
- - `can_continue(loop_id: str) -> bool`: Check whether a loop can continue based on its budget
-
- **Token Estimation**:
- - `estimate_tokens(text: str) -> int`: ~4 characters per token
- - `estimate_llm_call_tokens(prompt: str, response: str) -> int`: Estimate tokens for an LLM call
-
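The ~4-characters-per-token heuristic from the list above is simple enough to sketch directly; the clamp to a minimum of one token is an assumption about how the real helper treats empty strings.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic from the docs: ~4 characters per token."""
    return max(1, len(text) // 4)


def estimate_llm_call_tokens(prompt: str, response: str) -> int:
    """Estimate the total tokens consumed by one LLM call."""
    return estimate_tokens(prompt) + estimate_tokens(response)
```

This is deliberately crude: it avoids loading a tokenizer, which is good enough for budget enforcement where only the order of magnitude matters.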
- **Usage**:
- ```python
- from src.middleware.budget_tracker import BudgetTracker
-
- tracker = BudgetTracker()
- budget = tracker.create_budget(
-     loop_id="research_loop",
-     tokens_limit=100000,
-     time_limit_seconds=600,
-     iterations_limit=10,
- )
- tracker.start_timer("research_loop")
- # ... research operations ...
- tracker.add_tokens("research_loop", 5000)
- tracker.update_timer("research_loop")
- exceeded, reason = tracker.check_budget("research_loop")
- if exceeded:
-     # Budget exceeded, stop research
-     pass
- if not tracker.can_continue("research_loop"):
-     # Budget exceeded, stop research
-     pass
- ```
-
- ## Models
-
- All middleware models are defined in `src/utils/models.py`:
-
- - `IterationData`: Data for a single iteration
- - `Conversation`: Conversation history with iterations
- - `ResearchLoop`: Research loop state and configuration
- - `BudgetStatus`: Current budget status
-
- ## Thread Safety
-
- All middleware components use `ContextVar` for thread-safe isolation:
-
- - Each request/thread has its own workflow state
- - No global mutable state
- - Safe for concurrent requests
-
- ## See Also
-
- - [Orchestrators](orchestrators.md) - How middleware is used in orchestration
- - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
- - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
docs/architecture/orchestrators.md DELETED
@@ -1,201 +0,0 @@
- # Orchestrators Architecture
-
- DeepCritical supports multiple orchestration patterns for research workflows.
-
- ## Research Flows
-
- ### IterativeResearchFlow
-
- **File**: `src/orchestrator/research_flow.py`
-
- **Pattern**: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete
-
- **Agents Used**:
- - `KnowledgeGapAgent`: Evaluates research completeness
- - `ToolSelectorAgent`: Selects tools for addressing gaps
- - `ThinkingAgent`: Generates observations
- - `WriterAgent`: Creates the final report
- - `JudgeHandler`: Assesses evidence sufficiency
-
- **Features**:
- - Tracks iterations, time, and budget
- - Supports graph execution (`use_graph=True`) and agent chains (`use_graph=False`)
- - Iterates until research is complete or constraints are met
-
- **Usage**:
-
- <!--codeinclude-->
- [IterativeResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:57 end_line:80
- <!--/codeinclude-->
-
- ### DeepResearchFlow
-
- **File**: `src/orchestrator/research_flow.py`
-
- **Pattern**: Planner → Parallel iterative loops per section → Synthesizer
-
- **Agents Used**:
- - `PlannerAgent`: Breaks the query into report sections
- - `IterativeResearchFlow`: Per-section research (parallel)
- - `LongWriterAgent` or `ProofreaderAgent`: Final synthesis
-
- **Features**:
- - Uses `WorkflowManager` for parallel execution
- - Budget tracking per section and globally
- - State synchronization across parallel loops
- - Supports graph execution and agent chains
-
- **Usage**:
-
- <!--codeinclude-->
- [DeepResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:709 end_line:728
- <!--/codeinclude-->
-
- ## Graph Orchestrator
-
- **File**: `src/orchestrator/graph_orchestrator.py`
-
- **Purpose**: Graph-based execution using Pydantic AI agents as nodes
-
- **Features**:
- - Uses graph execution (`use_graph=True`) or agent chains (`use_graph=False`) as a fallback
- - Routes based on research mode (iterative/deep/auto)
- - Streams `AgentEvent` objects for the UI
- - Uses `GraphExecutionContext` to manage execution state
-
- **Node Types**:
- - **Agent Nodes**: Execute Pydantic AI agents
- - **State Nodes**: Update or read workflow state
- - **Decision Nodes**: Make routing decisions
- - **Parallel Nodes**: Execute multiple nodes concurrently
-
- **Edge Types**:
- - **Sequential Edges**: Always traversed
- - **Conditional Edges**: Traversed based on a condition
- - **Parallel Edges**: Used for parallel execution branches
-
- **Special Node Handling**:
-
- The `GraphOrchestrator` has special handling for certain nodes:
-
- - **`execute_tools` node**: State node that uses `search_handler` to execute searches and add evidence to workflow state
- - **`parallel_loops` node**: Parallel node that executes `IterativeResearchFlow` instances for each section in deep research mode
- - **`synthesizer` node**: Agent node that calls `LongWriterAgent.write_report()` directly with a `ReportDraft` instead of using `agent.run()`
- - **`writer` node**: Agent node that calls `WriterAgent.write_report()` directly with findings instead of using `agent.run()`
-
- **GraphExecutionContext**:
-
- The orchestrator uses `GraphExecutionContext` to manage execution state:
- - Tracks the current node, visited nodes, and node results
- - Manages workflow state and the budget tracker
- - Provides methods to store and retrieve node execution results
-
- ## Orchestrator Factory
-
- **File**: `src/orchestrator_factory.py`
-
- **Purpose**: Factory for creating orchestrators
-
- **Modes**:
- - **Simple**: Legacy orchestrator (backward compatible)
- - **Advanced**: Magentic orchestrator (requires an OpenAI API key)
- - **Auto-detect**: Chooses based on API key availability
-
- **Usage**:
-
- <!--codeinclude-->
- [Create Orchestrator](../src/orchestrator_factory.py) start_line:44 end_line:66
- <!--/codeinclude-->
-
- ## Magentic Orchestrator
-
- **File**: `src/orchestrator_magentic.py`
-
- **Purpose**: Multi-agent coordination using the Microsoft Agent Framework
-
- **Features**:
- - Uses `agent-framework-core`
- - ChatAgent pattern with internal LLMs per agent
- - `MagenticBuilder` with participants:
-   - `searcher`: SearchAgent (wraps SearchHandler)
-   - `hypothesizer`: HypothesisAgent (generates hypotheses)
-   - `judge`: JudgeAgent (evaluates evidence)
-   - `reporter`: ReportAgent (generates the final report)
- - Manager orchestrates agents via a chat client (OpenAI or HuggingFace)
- - Event-driven: converts Magentic events to `AgentEvent` for UI streaming via the `_process_event()` method
- - Supports max rounds, stall detection, and reset handling
-
- **Event Processing**:
-
- The orchestrator processes Magentic events and converts them to `AgentEvent`:
- - `MagenticOrchestratorMessageEvent` → `AgentEvent` with type based on message content
- - `MagenticAgentMessageEvent` → `AgentEvent` with type based on agent name
- - `MagenticAgentDeltaEvent` → `AgentEvent` for streaming updates
- - `MagenticFinalResultEvent` → `AgentEvent` with type "complete"
-
- **Requirements**:
- - `agent-framework-core` package
- - OpenAI API key or HuggingFace authentication
-
- ## Hierarchical Orchestrator
-
- **File**: `src/orchestrator_hierarchical.py`
-
- **Purpose**: Hierarchical orchestrator using middleware and sub-teams
-
- **Features**:
- - Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`
- - Adapts Magentic ChatAgent to the `SubIterationTeam` protocol
- - Event-driven via `asyncio.Queue` for coordination
- - Supports sub-iteration patterns for complex research tasks
-
- ## Legacy Simple Mode
-
- **File**: `src/legacy_orchestrator.py`
-
- **Purpose**: Linear search-judge-synthesize loop
-
- **Features**:
- - Uses `SearchHandlerProtocol` and `JudgeHandlerProtocol`
- - Generator-based design yielding `AgentEvent` objects
- - Backward compatibility for simple use cases
-
- ## State Initialization
-
- All orchestrators must initialize workflow state:
-
- <!--codeinclude-->
- [Initialize Workflow State](../src/middleware/state_machine.py) start_line:98 end_line:112
- <!--/codeinclude-->
-
- ## Event Streaming
-
- All orchestrators yield `AgentEvent` objects:
-
- **Event Types**:
- - `started`: Research started
- - `searching`: Search in progress
- - `search_complete`: Search completed
- - `judging`: Evidence evaluation in progress
- - `judge_complete`: Evidence evaluation completed
- - `looping`: Iteration in progress
- - `hypothesizing`: Generating hypotheses
- - `analyzing`: Statistical analysis in progress
- - `analysis_complete`: Statistical analysis completed
- - `synthesizing`: Synthesizing results
- - `complete`: Research completed
- - `error`: Error occurred
- - `streaming`: Streaming update (delta events)
-
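A consumer of this event stream can be sketched as below. The `AgentEvent` stand-in and the fake orchestrator are assumptions used to make the sketch self-contained; the real model lives in `src/utils/models.py` and yields richer payloads.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentEvent:
    """Minimal stand-in for the real event model (fields assumed)."""

    type: str
    message: str = ""


async def fake_orchestrator_run(query: str):
    # Stand-in for orchestrator.run(query), which is an async generator.
    for t in ("started", "searching", "search_complete", "complete"):
        yield AgentEvent(type=t, message=query)


async def collect(query: str) -> list[str]:
    seen: list[str] = []
    async for event in fake_orchestrator_run(query):
        seen.append(event.type)
        # Terminal event types end the stream for the UI.
        if event.type in ("complete", "error"):
            break
    return seen


# asyncio.run(collect("q")) -> ["started", "searching", "search_complete", "complete"]
```

Treating `complete` and `error` as terminal keeps UI code from waiting on a generator that has nothing more to yield.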
- **Event Structure**:
-
- <!--codeinclude-->
- [AgentEvent Model](../src/utils/models.py) start_line:104 end_line:126
- <!--/codeinclude-->
-
- ## See Also
-
- - [Graph Orchestration](graph_orchestration.md) - Graph-based execution details
- - [Workflow Diagrams](workflow-diagrams.md) - Detailed workflow diagrams
- - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
-
docs/architecture/services.md DELETED
@@ -1,146 +0,0 @@
1
- # Services Architecture
2
-
3
- DeepCritical provides several services for embeddings, RAG, and statistical analysis.
4
-
5
- ## Embedding Service
6
-
7
- **File**: `src/services/embeddings.py`
8
-
9
- **Purpose**: Local sentence-transformers for semantic search and deduplication
10
-
11
- **Features**:
12
- - **No API Key Required**: Uses local sentence-transformers models
13
- - **Async-Safe**: All operations use `run_in_executor()` to avoid blocking the event loop
14
- - **ChromaDB Storage**: In-memory vector storage for embeddings
15
- - **Deduplication**: 0.9 similarity threshold by default (90% similarity = duplicate, configurable)
16
-
17
- **Model**: Configurable via `settings.local_embedding_model` (default: `all-MiniLM-L6-v2`)
18
-
19
- **Methods**:
20
- - `async def embed(text: str) -> list[float]`: Generate embeddings (async-safe via `run_in_executor()`)
21
- - `async def embed_batch(texts: list[str]) -> list[list[float]]`: Batch embedding (more efficient)
22
- - `async def add_evidence(evidence_id: str, content: str, metadata: dict[str, Any]) -> None`: Add evidence to vector store
23
- - `async def search_similar(query: str, n_results: int = 5) -> list[dict[str, Any]]`: Find semantically similar evidence
24
- - `async def deduplicate(new_evidence: list[Evidence], threshold: float = 0.9) -> list[Evidence]`: Remove semantically duplicate evidence
25
-
26
- **Usage**:
27
- ```python
28
- from src.services.embeddings import get_embedding_service
29
-
30
- service = get_embedding_service()
31
- embedding = await service.embed("text to embed")
32
- ```
33
-
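The 0.9-threshold deduplication reduces to a pairwise cosine-similarity check over embeddings; a self-contained sketch of the idea (not the service's actual code):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(embeddings: list[list[float]], threshold: float = 0.9) -> list[int]:
    """Return indices of embeddings to keep; later near-duplicates are dropped."""
    kept: list[int] = []
    for i, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


vectors = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]]
print(deduplicate(vectors))  # [0, 2]: the second vector is ~1.0 similar to the first
```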
34
- ## LlamaIndex RAG Service
35
-
36
- **File**: `src/services/llamaindex_rag.py`
37
-
38
- **Purpose**: Retrieval-Augmented Generation using LlamaIndex
39
-
40
- **Features**:
41
- - **Multiple Embedding Providers**: OpenAI embeddings (requires `OPENAI_API_KEY`) or local sentence-transformers (no API key)
42
- - **Multiple LLM Providers**: HuggingFace LLM (preferred) or OpenAI LLM (fallback) for query synthesis
43
- - **ChromaDB Storage**: Vector database for document storage (supports in-memory mode)
44
- - **Metadata Preservation**: Preserves source, title, URL, date, authors
45
- - **Lazy Initialization**: Graceful fallback if dependencies not available
46
-
47
- **Initialization Parameters**:
48
- - `use_openai_embeddings: bool | None`: Force OpenAI embeddings (None = auto-detect)
49
- - `use_in_memory: bool`: Use in-memory ChromaDB client (useful for tests)
50
- - `oauth_token: str | None`: Optional OAuth token from HuggingFace login (takes priority over env vars)
51
-
52
- **Methods**:
53
- - `async def ingest_evidence(evidence: list[Evidence]) -> None`: Ingest evidence into RAG
54
- - `async def retrieve(query: str, top_k: int = 5) -> list[Document]`: Retrieve relevant documents
55
- - `async def query(query: str, top_k: int = 5) -> str`: Query with RAG
56
-
57
- **Usage**:
58
- ```python
59
- from src.services.llamaindex_rag import get_rag_service
60
-
61
- service = get_rag_service(
62
- use_openai_embeddings=False, # Use local embeddings
63
- use_in_memory=True, # Use in-memory ChromaDB
64
- oauth_token=token # Optional HuggingFace token
65
- )
66
- if service:
67
- documents = await service.retrieve("query", top_k=5)
68
- ```
69
-
70
- ## Statistical Analyzer
71
-
72
- **File**: `src/services/statistical_analyzer.py`
73
-
74
- **Purpose**: Secure execution of AI-generated statistical code
75
-
76
- **Features**:
77
- - **Modal Sandbox**: Secure, isolated execution environment
78
- - **Code Generation**: Generates Python code via LLM
79
- - **Library Pinning**: Version-pinned libraries in `SANDBOX_LIBRARIES`
80
- - **Network Isolation**: `block_network=True` by default
81
-
82
- **Libraries Available**:
83
- - pandas, numpy, scipy
84
- - matplotlib, scikit-learn
85
- - statsmodels
86
-
87
- **Output**: `AnalysisResult` with:
88
- - `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
89
- - `code`: Generated analysis code
90
- - `output`: Execution output
91
- - `error`: Error message if execution failed
92
-
93
- **Usage**:
94
- ```python
95
- from src.services.statistical_analyzer import StatisticalAnalyzer
96
-
97
- analyzer = StatisticalAnalyzer()
98
- result = await analyzer.analyze(
99
- hypothesis="Metformin reduces cancer risk",
100
- evidence=evidence_list
101
- )
102
- ```
103
-
104
- ## Singleton Pattern
105
-
106
- Services use singleton patterns for lazy initialization:
107
-
108
- **EmbeddingService**: Uses a global variable pattern:
109
-
110
- <!--codeinclude-->
111
- [EmbeddingService Singleton](../src/services/embeddings.py) start_line:164 end_line:172
112
- <!--/codeinclude-->
113
-
114
- **LlamaIndexRAGService**: Direct instantiation (no caching):
115
-
116
- <!--codeinclude-->
117
- [LlamaIndexRAGService Factory](../src/services/llamaindex_rag.py) start_line:440 end_line:466
118
- <!--/codeinclude-->
119
-
120
- The EmbeddingService singleton ensures:
121
- - Single instance per process
122
- - Lazy initialization
123
- - No dependencies required at import time
124
-
125
- ## Service Availability
126
-
127
- Services check availability before use:
128
-
129
- ```python
130
- from src.utils.config import settings
131
-
132
- if settings.modal_available:
133
- # Use Modal sandbox
134
- pass
135
-
136
- if settings.has_openai_key:
137
- # Use OpenAI embeddings for RAG
138
- pass
139
- ```
140
-
141
- ## See Also
142
-
143
- - [Tools](tools.md) - How services are used by search tools
144
- - [API Reference - Services](../api/services.md) - API documentation
145
- - [Configuration](../configuration/index.md) - Service configuration
146
-
 
docs/architecture/tools.md DELETED
@@ -1,167 +0,0 @@
1
- # Tools Architecture
2
-
3
- DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources.
4
-
5
- ## SearchTool Protocol
6
-
7
- All tools implement the `SearchTool` protocol from `src/tools/base.py`:
8
-
9
- <!--codeinclude-->
10
- [SearchTool Protocol](../src/tools/base.py) start_line:8 end_line:31
11
- <!--/codeinclude-->
12
-
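A structural sketch of what such a protocol typically looks like (the exact signature in `src/tools/base.py` may differ):

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class SearchTool(Protocol):
    """Anything with a name and an async search() satisfies the protocol."""

    name: str

    async def search(self, query: str, max_results: int = 10) -> list[dict]: ...


class DummyTool:
    name = "dummy"

    async def search(self, query: str, max_results: int = 10) -> list[dict]:
        return [{"query": query}]


# Structural typing: no inheritance required.
assert isinstance(DummyTool(), SearchTool)
print(asyncio.run(DummyTool().search("metformin")))  # [{'query': 'metformin'}]
```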
13
- ## Rate Limiting
14
-
15
- All tools use the `@retry` decorator from tenacity:
16
-
17
- <!--codeinclude-->
18
- [Retry Decorator Pattern](../src/tools/pubmed.py) start_line:46 end_line:50
19
- <!--/codeinclude-->
20
-
21
- Tools with API rate limits implement a `_rate_limit()` method and use shared rate limiters from `src/tools/rate_limiter.py`.
22
-
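A `_rate_limit()` helper is usually just an async minimum-interval guard; a simplified sketch (the shared limiters in `src/tools/rate_limiter.py` are more elaborate):

```python
import asyncio
import time


class MinIntervalLimiter:
    """Ensure at least `interval` seconds between successive acquires."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._last = 0.0
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            wait = self._last + self.interval - time.monotonic()
            if wait > 0:
                await asyncio.sleep(wait)
            self._last = time.monotonic()


async def main() -> float:
    limiter = MinIntervalLimiter(0.05)
    start = time.monotonic()
    for _ in range(3):
        await limiter.acquire()   # call before each outbound request
    return time.monotonic() - start


print(asyncio.run(main()))  # roughly >= 0.1s: two enforced gaps between three calls
```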
23
- ## Error Handling
24
-
25
- Tools raise custom exceptions:
26
-
27
- - `SearchError`: General search failures
28
- - `RateLimitError`: Rate limit exceeded
29
-
30
- Tools handle HTTP errors (429, 500, timeout) and return empty lists on non-critical errors (with warning logs).
31
-
32
- ## Query Preprocessing
33
-
34
- Tools use `preprocess_query()` from `src/tools/query_utils.py` to:
35
-
36
- - Remove noise from queries
37
- - Expand synonyms
38
- - Normalize query format
39
-
40
- ## Evidence Conversion
41
-
42
- All tools convert API responses to `Evidence` objects with:
43
-
44
- - `Citation`: Title, URL, date, authors
45
- - `content`: Evidence text
46
- - `relevance_score`: 0.0-1.0 relevance score
47
- - `metadata`: Additional metadata
48
-
49
- Missing fields are handled gracefully with defaults.
50
-
51
- ## Tool Implementations
52
-
53
- ### PubMed Tool
54
-
55
- **File**: `src/tools/pubmed.py`
56
-
57
- **API**: NCBI E-utilities (ESearch → EFetch)
58
-
59
- **Rate Limiting**:
60
- - 0.34s between requests (3 req/sec without API key)
61
- - 0.1s between requests (10 req/sec with NCBI API key)
62
-
63
- **Features**:
64
- - XML parsing with `xmltodict`
65
- - Handles single vs. multiple articles
66
- - Query preprocessing
67
- - Evidence conversion with metadata extraction
68
-
69
- ### ClinicalTrials Tool
70
-
71
- **File**: `src/tools/clinicaltrials.py`
72
-
73
- **API**: ClinicalTrials.gov API v2
74
-
75
- **Important**: Uses the `requests` library (not `httpx`) because the site's WAF blocks httpx's TLS fingerprint.
76
-
77
- **Execution**: Runs in thread pool: `await asyncio.to_thread(requests.get, ...)`
78
-
79
- **Filtering**:
80
- - Only interventional studies
81
- - Status: `COMPLETED`, `ACTIVE_NOT_RECRUITING`, `RECRUITING`, `ENROLLING_BY_INVITATION`
82
-
83
- **Features**:
84
- - Parses nested JSON structure
85
- - Extracts trial metadata
86
- - Evidence conversion
87
-
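The thread-pool pattern keeps the blocking `requests` call off the event loop; a self-contained sketch with a stand-in for `requests.get` (URL and helper names are illustrative):

```python
import asyncio
import time


def blocking_fetch(url: str) -> dict:
    """Stand-in for requests.get(url).json(): blocks the calling thread."""
    time.sleep(0.01)
    return {"url": url, "status": 200}


async def search_trials(query: str) -> dict:
    url = f"https://clinicaltrials.gov/api/v2/studies?query.term={query}"
    # Run the blocking call in the default thread pool so the loop stays free.
    return await asyncio.to_thread(blocking_fetch, url)


result = asyncio.run(search_trials("metformin"))
print(result["status"])  # 200
```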
88
- ### Europe PMC Tool
89
-
90
- **File**: `src/tools/europepmc.py`
91
-
92
- **API**: Europe PMC REST API
93
-
94
- **Features**:
95
- - Handles preprint markers: `[PREPRINT - Not peer-reviewed]`
96
- - Builds URLs from DOI or PMID
97
- - Checks `pubTypeList` for preprint detection
98
- - Includes both preprints and peer-reviewed articles
99
-
100
- ### RAG Tool
101
-
102
- **File**: `src/tools/rag_tool.py`
103
-
104
- **Purpose**: Semantic search within collected evidence
105
-
106
- **Implementation**: Wraps `LlamaIndexRAGService`
107
-
108
- **Features**:
109
- - Returns Evidence from RAG results
110
- - Handles evidence ingestion
111
- - Semantic similarity search
112
- - Metadata preservation
113
-
114
- ### Search Handler
115
-
116
- **File**: `src/tools/search_handler.py`
117
-
118
- **Purpose**: Orchestrates parallel searches across multiple tools
119
-
120
- **Initialization Parameters**:
121
- - `tools: list[SearchTool]`: List of search tools to use
122
- - `timeout: float = 30.0`: Timeout for each search in seconds
123
- - `include_rag: bool = False`: Whether to include RAG tool in searches
124
- - `auto_ingest_to_rag: bool = True`: Whether to automatically ingest results into RAG
125
- - `oauth_token: str | None = None`: Optional OAuth token from HuggingFace login (for RAG LLM)
126
-
127
- **Methods**:
128
- - `async def execute(query: str, max_results_per_tool: int = 10) -> SearchResult`: Execute search across all tools in parallel
129
-
130
- **Features**:
131
- - Uses `asyncio.gather()` with `return_exceptions=True` for parallel execution
132
- - Aggregates results into `SearchResult` with evidence and metadata
133
- - Handles tool failures gracefully (continues with other tools)
134
- - Deduplicates results by URL
135
- - Automatically ingests results into RAG if `auto_ingest_to_rag=True`
136
- - Can add RAG tool dynamically via `add_rag_tool()` method
137
-
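The parallel fan-out with graceful failure handling and URL deduplication can be sketched as follows (the tool functions here are stubs):

```python
import asyncio


async def tool_ok(query: str) -> list[dict]:
    return [{"url": "https://example.org/a", "content": query}]


async def tool_dup(query: str) -> list[dict]:
    return [{"url": "https://example.org/a", "content": query}]  # same URL


async def tool_broken(query: str) -> list[dict]:
    raise RuntimeError("API down")


async def execute(query: str) -> list[dict]:
    tools = [tool_ok, tool_dup, tool_broken]
    # return_exceptions=True: one failing tool does not cancel the others.
    results = await asyncio.gather(
        *(t(query) for t in tools), return_exceptions=True
    )
    evidence: list[dict] = []
    seen: set[str] = set()
    for r in results:
        if isinstance(r, Exception):     # failed tool: skip, keep the rest
            continue
        for item in r:
            if item["url"] not in seen:  # deduplicate by URL
                seen.add(item["url"])
                evidence.append(item)
    return evidence


print(len(asyncio.run(execute("q"))))  # 1
```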
138
- ## Tool Registration
139
-
140
- Tools are registered in the search handler:
141
-
142
- ```python
143
- from src.tools.pubmed import PubMedTool
144
- from src.tools.clinicaltrials import ClinicalTrialsTool
145
- from src.tools.europepmc import EuropePMCTool
146
- from src.tools.search_handler import SearchHandler
147
-
148
- search_handler = SearchHandler(
149
- tools=[
150
- PubMedTool(),
151
- ClinicalTrialsTool(),
152
- EuropePMCTool(),
153
- ],
154
- include_rag=True, # Include RAG tool for semantic search
155
- auto_ingest_to_rag=True, # Automatically ingest results into RAG
156
- oauth_token=token # Optional HuggingFace token for RAG LLM
157
- )
158
-
159
- # Execute search
160
- result = await search_handler.execute("query", max_results_per_tool=10)
161
- ```
162
-
163
- ## See Also
164
-
165
- - [Services](services.md) - RAG and embedding services
166
- - [API Reference - Tools](../api/tools.md) - API documentation
167
- - [Contributing - Implementation Patterns](../contributing/implementation-patterns.md) - Development guidelines
 
docs/architecture/workflow-diagrams.md DELETED
@@ -1,655 +0,0 @@
1
- # DeepCritical Workflow - Simplified Magentic Architecture
2
-
3
- > **Architecture Pattern**: Microsoft Magentic Orchestration
4
- > **Design Philosophy**: Simple, dynamic, manager-driven coordination
5
- > **Key Innovation**: Intelligent manager replaces rigid sequential phases
6
-
7
- ---
8
-
9
- ## 1. High-Level Magentic Workflow
10
-
11
- ```mermaid
12
- flowchart TD
13
- Start([User Query]) --> Manager[Magentic Manager<br/>Plan • Select • Assess • Adapt]
14
-
15
- Manager -->|Plans| Task1[Task Decomposition]
16
- Task1 --> Manager
17
-
18
- Manager -->|Selects & Executes| HypAgent[Hypothesis Agent]
19
- Manager -->|Selects & Executes| SearchAgent[Search Agent]
20
- Manager -->|Selects & Executes| AnalysisAgent[Analysis Agent]
21
- Manager -->|Selects & Executes| ReportAgent[Report Agent]
22
-
23
- HypAgent -->|Results| Manager
24
- SearchAgent -->|Results| Manager
25
- AnalysisAgent -->|Results| Manager
26
- ReportAgent -->|Results| Manager
27
-
28
- Manager -->|Assesses Quality| Decision{Good Enough?}
29
- Decision -->|No - Refine| Manager
30
- Decision -->|No - Different Agent| Manager
31
- Decision -->|No - Stalled| Replan[Reset Plan]
32
- Replan --> Manager
33
-
34
- Decision -->|Yes| Synthesis[Synthesize Final Result]
35
- Synthesis --> Output([Research Report])
36
-
37
- style Start fill:#e1f5e1
38
- style Manager fill:#ffe6e6
39
- style HypAgent fill:#fff4e6
40
- style SearchAgent fill:#fff4e6
41
- style AnalysisAgent fill:#fff4e6
42
- style ReportAgent fill:#fff4e6
43
- style Decision fill:#ffd6d6
44
- style Synthesis fill:#d4edda
45
- style Output fill:#e1f5e1
46
- ```
47
-
48
- ## 2. Magentic Manager: The 6-Phase Cycle
49
-
50
- ```mermaid
51
- flowchart LR
52
- P1[1. Planning<br/>Analyze task<br/>Create strategy] --> P2[2. Agent Selection<br/>Pick best agent<br/>for subtask]
53
- P2 --> P3[3. Execution<br/>Run selected<br/>agent with tools]
54
- P3 --> P4[4. Assessment<br/>Evaluate quality<br/>Check progress]
55
- P4 --> Decision{Quality OK?<br/>Progress made?}
56
- Decision -->|Yes| P6[6. Synthesis<br/>Combine results<br/>Generate report]
57
- Decision -->|No| P5[5. Iteration<br/>Adjust plan<br/>Try again]
58
- P5 --> P2
59
- P6 --> Done([Complete])
60
-
61
- style P1 fill:#fff4e6
62
- style P2 fill:#ffe6e6
63
- style P3 fill:#e6f3ff
64
- style P4 fill:#ffd6d6
65
- style P5 fill:#fff3cd
66
- style P6 fill:#d4edda
67
- style Done fill:#e1f5e1
68
- ```
69
-
70
- ## 3. Simplified Agent Architecture
71
-
72
- ```mermaid
73
- graph TB
74
- subgraph "Orchestration Layer"
75
- Manager[Magentic Manager<br/>• Plans workflow<br/>• Selects agents<br/>• Assesses quality<br/>• Adapts strategy]
76
- SharedContext[(Shared Context<br/>• Hypotheses<br/>• Search Results<br/>• Analysis<br/>• Progress)]
77
- Manager <--> SharedContext
78
- end
79
-
80
- subgraph "Specialist Agents"
81
- HypAgent[Hypothesis Agent<br/>• Domain understanding<br/>• Hypothesis generation<br/>• Testability refinement]
82
- SearchAgent[Search Agent<br/>• Multi-source search<br/>• RAG retrieval<br/>• Result ranking]
83
- AnalysisAgent[Analysis Agent<br/>• Evidence extraction<br/>• Statistical analysis<br/>• Code execution]
84
- ReportAgent[Report Agent<br/>• Report assembly<br/>• Visualization<br/>• Citation formatting]
85
- end
86
-
87
- subgraph "MCP Tools"
88
- WebSearch[Web Search<br/>PubMed • arXiv • bioRxiv]
89
- CodeExec[Code Execution<br/>Sandboxed Python]
90
- RAG[RAG Retrieval<br/>Vector DB • Embeddings]
91
- Viz[Visualization<br/>Charts • Graphs]
92
- end
93
-
94
- Manager -->|Selects & Directs| HypAgent
95
- Manager -->|Selects & Directs| SearchAgent
96
- Manager -->|Selects & Directs| AnalysisAgent
97
- Manager -->|Selects & Directs| ReportAgent
98
-
99
- HypAgent --> SharedContext
100
- SearchAgent --> SharedContext
101
- AnalysisAgent --> SharedContext
102
- ReportAgent --> SharedContext
103
-
104
- SearchAgent --> WebSearch
105
- SearchAgent --> RAG
106
- AnalysisAgent --> CodeExec
107
- ReportAgent --> CodeExec
108
- ReportAgent --> Viz
109
-
110
- style Manager fill:#ffe6e6
111
- style SharedContext fill:#ffe6f0
112
- style HypAgent fill:#fff4e6
113
- style SearchAgent fill:#fff4e6
114
- style AnalysisAgent fill:#fff4e6
115
- style ReportAgent fill:#fff4e6
116
- style WebSearch fill:#e6f3ff
117
- style CodeExec fill:#e6f3ff
118
- style RAG fill:#e6f3ff
119
- style Viz fill:#e6f3ff
120
- ```
121
-
122
- ## 4. Dynamic Workflow Example
123
-
124
- ```mermaid
125
- sequenceDiagram
126
- participant User
127
- participant Manager
128
- participant HypAgent
129
- participant SearchAgent
130
- participant AnalysisAgent
131
- participant ReportAgent
132
-
133
- User->>Manager: "Research protein folding in Alzheimer's"
134
-
135
- Note over Manager: PLAN: Generate hypotheses → Search → Analyze → Report
136
-
137
- Manager->>HypAgent: Generate 3 hypotheses
138
- HypAgent-->>Manager: Returns 3 hypotheses
139
- Note over Manager: ASSESS: Good quality, proceed
140
-
141
- Manager->>SearchAgent: Search literature for hypothesis 1
142
- SearchAgent-->>Manager: Returns 15 papers
143
- Note over Manager: ASSESS: Good results, continue
144
-
145
- Manager->>SearchAgent: Search for hypothesis 2
146
- SearchAgent-->>Manager: Only 2 papers found
147
- Note over Manager: ASSESS: Insufficient, refine search
148
-
149
- Manager->>SearchAgent: Refined query for hypothesis 2
150
- SearchAgent-->>Manager: Returns 12 papers
151
- Note over Manager: ASSESS: Better, proceed
152
-
153
- Manager->>AnalysisAgent: Analyze evidence for all hypotheses
154
- AnalysisAgent-->>Manager: Returns analysis with code
155
- Note over Manager: ASSESS: Complete, generate report
156
-
157
- Manager->>ReportAgent: Create comprehensive report
158
- ReportAgent-->>Manager: Returns formatted report
159
- Note over Manager: SYNTHESIZE: Combine all results
160
-
161
- Manager->>User: Final Research Report
162
- ```
163
-
164
- ## 5. Manager Decision Logic
165
-
166
- ```mermaid
167
- flowchart TD
168
- Start([Manager Receives Task]) --> Plan[Create Initial Plan]
169
-
170
- Plan --> Select[Select Agent for Next Subtask]
171
- Select --> Execute[Execute Agent]
172
- Execute --> Collect[Collect Results]
173
-
174
- Collect --> Assess[Assess Quality & Progress]
175
-
176
- Assess --> Q1{Quality Sufficient?}
177
- Q1 -->|No| Q2{Same Agent Can Fix?}
178
- Q2 -->|Yes| Feedback[Provide Specific Feedback]
179
- Feedback --> Execute
180
- Q2 -->|No| Different[Try Different Agent]
181
- Different --> Select
182
-
183
- Q1 -->|Yes| Q3{Task Complete?}
184
- Q3 -->|No| Q4{Making Progress?}
185
- Q4 -->|Yes| Select
186
- Q4 -->|No - Stalled| Replan[Reset Plan & Approach]
187
- Replan --> Plan
188
-
189
- Q3 -->|Yes| Synth[Synthesize Final Result]
190
- Synth --> Done([Return Report])
191
-
192
- style Start fill:#e1f5e1
193
- style Plan fill:#fff4e6
194
- style Select fill:#ffe6e6
195
- style Execute fill:#e6f3ff
196
- style Assess fill:#ffd6d6
197
- style Q1 fill:#ffe6e6
198
- style Q2 fill:#ffe6e6
199
- style Q3 fill:#ffe6e6
200
- style Q4 fill:#ffe6e6
201
- style Synth fill:#d4edda
202
- style Done fill:#e1f5e1
203
- ```
204
-
205
- ## 6. Hypothesis Agent Workflow
206
-
207
- ```mermaid
208
- flowchart LR
209
- Input[Research Query] --> Domain[Identify Domain<br/>& Key Concepts]
210
- Domain --> Context[Retrieve Background<br/>Knowledge]
211
- Context --> Generate[Generate 3-5<br/>Initial Hypotheses]
212
- Generate --> Refine[Refine for<br/>Testability]
213
- Refine --> Rank[Rank by<br/>Quality Score]
214
- Rank --> Output[Return Top<br/>Hypotheses]
215
-
216
- Output --> Struct[Hypothesis Structure:<br/>• Statement<br/>• Rationale<br/>• Testability Score<br/>• Data Requirements<br/>• Expected Outcomes]
217
-
218
- style Input fill:#e1f5e1
219
- style Output fill:#fff4e6
220
- style Struct fill:#e6f3ff
221
- ```
222
-
223
- ## 7. Search Agent Workflow
224
-
225
- ```mermaid
226
- flowchart TD
227
- Input[Hypotheses] --> Strategy[Formulate Search<br/>Strategy per Hypothesis]
228
-
229
- Strategy --> Multi[Multi-Source Search]
230
-
231
- Multi --> PubMed[PubMed Search<br/>via MCP]
232
- Multi --> ArXiv[arXiv Search<br/>via MCP]
233
- Multi --> BioRxiv[bioRxiv Search<br/>via MCP]
234
-
235
- PubMed --> Aggregate[Aggregate Results]
236
- ArXiv --> Aggregate
237
- BioRxiv --> Aggregate
238
-
239
- Aggregate --> Filter[Filter & Rank<br/>by Relevance]
240
- Filter --> Dedup[Deduplicate<br/>Cross-Reference]
241
- Dedup --> Embed[Embed Documents<br/>via MCP]
242
- Embed --> Vector[(Vector DB)]
243
- Vector --> RAGRetrieval[RAG Retrieval<br/>Top-K per Hypothesis]
244
- RAGRetrieval --> Output[Return Contextualized<br/>Search Results]
245
-
246
- style Input fill:#fff4e6
247
- style Multi fill:#ffe6e6
248
- style Vector fill:#ffe6f0
249
- style Output fill:#e6f3ff
250
- ```
251
-
252
- ## 8. Analysis Agent Workflow
253
-
254
- ```mermaid
255
- flowchart TD
256
- Input1[Hypotheses] --> Extract
257
- Input2[Search Results] --> Extract[Extract Evidence<br/>per Hypothesis]
258
-
259
- Extract --> Methods[Determine Analysis<br/>Methods Needed]
260
-
261
- Methods --> Branch{Requires<br/>Computation?}
262
- Branch -->|Yes| GenCode[Generate Python<br/>Analysis Code]
263
- Branch -->|No| Qual[Qualitative<br/>Synthesis]
264
-
265
- GenCode --> Execute[Execute Code<br/>via MCP Sandbox]
266
- Execute --> Interpret1[Interpret<br/>Results]
267
- Qual --> Interpret2[Interpret<br/>Findings]
268
-
269
- Interpret1 --> Synthesize[Synthesize Evidence<br/>Across Sources]
270
- Interpret2 --> Synthesize
271
-
272
- Synthesize --> Verdict[Determine Verdict<br/>per Hypothesis]
273
- Verdict --> Support[• Supported<br/>• Refuted<br/>• Inconclusive]
274
- Support --> Gaps[Identify Knowledge<br/>Gaps & Limitations]
275
- Gaps --> Output[Return Analysis<br/>Report]
276
-
277
- style Input1 fill:#fff4e6
278
- style Input2 fill:#e6f3ff
279
- style Execute fill:#ffe6e6
280
- style Output fill:#e6ffe6
281
- ```
282
-
283
- ## 9. Report Agent Workflow
284
-
285
- ```mermaid
286
- flowchart TD
287
- Input1[Query] --> Assemble
288
- Input2[Hypotheses] --> Assemble
289
- Input3[Search Results] --> Assemble
290
- Input4[Analysis] --> Assemble[Assemble Report<br/>Sections]
291
-
292
- Assemble --> Exec[Executive Summary]
293
- Assemble --> Intro[Introduction]
294
- Assemble --> Methods[Methods]
295
- Assemble --> Results[Results per<br/>Hypothesis]
296
- Assemble --> Discussion[Discussion]
297
- Assemble --> Future[Future Directions]
298
- Assemble --> Refs[References]
299
-
300
- Results --> VizCheck{Needs<br/>Visualization?}
301
- VizCheck -->|Yes| GenViz[Generate Viz Code]
302
- GenViz --> ExecViz[Execute via MCP<br/>Create Charts]
303
- ExecViz --> Combine
304
- VizCheck -->|No| Combine[Combine All<br/>Sections]
305
-
306
- Exec --> Combine
307
- Intro --> Combine
308
- Methods --> Combine
309
- Discussion --> Combine
310
- Future --> Combine
311
- Refs --> Combine
312
-
313
- Combine --> Format[Format Output]
314
- Format --> MD[Markdown]
315
- Format --> PDF[PDF]
316
- Format --> JSON[JSON]
317
-
318
- MD --> Output[Return Final<br/>Report]
319
- PDF --> Output
320
- JSON --> Output
321
-
322
- style Input1 fill:#e1f5e1
323
- style Input2 fill:#fff4e6
324
- style Input3 fill:#e6f3ff
325
- style Input4 fill:#e6ffe6
326
- style Output fill:#d4edda
327
- ```
328
-
329
- ## 10. Data Flow & Event Streaming
330
-
331
- ```mermaid
332
- flowchart TD
333
- User[👤 User] -->|Research Query| UI[Gradio UI]
334
- UI -->|Submit| Manager[Magentic Manager]
335
-
336
- Manager -->|Event: Planning| UI
337
- Manager -->|Select Agent| HypAgent[Hypothesis Agent]
338
- HypAgent -->|Event: Delta/Message| UI
339
- HypAgent -->|Hypotheses| Context[(Shared Context)]
340
-
341
- Context -->|Retrieved by| Manager
342
- Manager -->|Select Agent| SearchAgent[Search Agent]
343
- SearchAgent -->|MCP Request| WebSearch[Web Search Tool]
344
- WebSearch -->|Results| SearchAgent
345
- SearchAgent -->|Event: Delta/Message| UI
346
- SearchAgent -->|Documents| Context
347
- SearchAgent -->|Embeddings| VectorDB[(Vector DB)]
348
-
349
- Context -->|Retrieved by| Manager
350
- Manager -->|Select Agent| AnalysisAgent[Analysis Agent]
351
- AnalysisAgent -->|MCP Request| CodeExec[Code Execution Tool]
352
- CodeExec -->|Results| AnalysisAgent
353
- AnalysisAgent -->|Event: Delta/Message| UI
354
- AnalysisAgent -->|Analysis| Context
355
-
356
- Context -->|Retrieved by| Manager
357
- Manager -->|Select Agent| ReportAgent[Report Agent]
358
- ReportAgent -->|MCP Request| CodeExec
359
- ReportAgent -->|Event: Delta/Message| UI
360
- ReportAgent -->|Report| Context
361
-
362
- Manager -->|Event: Final Result| UI
363
- UI -->|Display| User
364
-
365
- style User fill:#e1f5e1
366
- style UI fill:#e6f3ff
367
- style Manager fill:#ffe6e6
368
- style Context fill:#ffe6f0
369
- style VectorDB fill:#ffe6f0
370
- style WebSearch fill:#f0f0f0
371
- style CodeExec fill:#f0f0f0
372
- ```
373
-
374
- ## 11. MCP Tool Architecture
375
-
376
- ```mermaid
377
- graph TB
378
- subgraph "Agent Layer"
379
- Manager[Magentic Manager]
380
- HypAgent[Hypothesis Agent]
381
- SearchAgent[Search Agent]
382
- AnalysisAgent[Analysis Agent]
383
- ReportAgent[Report Agent]
384
- end
385
-
386
- subgraph "MCP Protocol Layer"
387
- Registry[MCP Tool Registry<br/>• Discovers tools<br/>• Routes requests<br/>• Manages connections]
388
- end
389
-
390
- subgraph "MCP Servers"
391
- Server1[Web Search Server<br/>localhost:8001<br/>• PubMed<br/>• arXiv<br/>• bioRxiv]
392
- Server2[Code Execution Server<br/>localhost:8002<br/>• Sandboxed Python<br/>• Package management]
393
- Server3[RAG Server<br/>localhost:8003<br/>• Vector embeddings<br/>• Similarity search]
394
- Server4[Visualization Server<br/>localhost:8004<br/>• Chart generation<br/>• Plot rendering]
395
- end
396
-
397
- subgraph "External Services"
398
- PubMed[PubMed API]
399
- ArXiv[arXiv API]
400
- BioRxiv[bioRxiv API]
401
- Modal[Modal Sandbox]
402
- ChromaDB[(ChromaDB)]
403
- end
404
-
405
- SearchAgent -->|Request| Registry
406
- AnalysisAgent -->|Request| Registry
407
- ReportAgent -->|Request| Registry
408
-
409
- Registry --> Server1
410
- Registry --> Server2
411
- Registry --> Server3
412
- Registry --> Server4
413
-
414
- Server1 --> PubMed
415
- Server1 --> ArXiv
416
- Server1 --> BioRxiv
417
- Server2 --> Modal
418
- Server3 --> ChromaDB
419
-
420
- style Manager fill:#ffe6e6
421
- style Registry fill:#fff4e6
422
- style Server1 fill:#e6f3ff
423
- style Server2 fill:#e6f3ff
424
- style Server3 fill:#e6f3ff
425
- style Server4 fill:#e6f3ff
426
- ```
427
-
428
- ## 12. Progress Tracking & Stall Detection
429
-
430
- ```mermaid
431
- stateDiagram-v2
432
- [*] --> Initialization: User Query
433
-
434
- Initialization --> Planning: Manager starts
435
-
436
- Planning --> AgentExecution: Select agent
437
-
438
- AgentExecution --> Assessment: Collect results
439
-
440
- Assessment --> QualityCheck: Evaluate output
441
-
442
- QualityCheck --> AgentExecution: Poor quality<br/>(retry < max_rounds)
443
- QualityCheck --> Planning: Poor quality<br/>(try different agent)
444
- QualityCheck --> NextAgent: Good quality<br/>(task incomplete)
445
- QualityCheck --> Synthesis: Good quality<br/>(task complete)
446
-
447
- NextAgent --> AgentExecution: Select next agent
448
-
449
- state StallDetection <<choice>>
450
- Assessment --> StallDetection: Check progress
451
- StallDetection --> Planning: No progress<br/>(stall count < max)
452
- StallDetection --> ErrorRecovery: No progress<br/>(max stalls reached)
453
-
454
- ErrorRecovery --> PartialReport: Generate partial results
455
- PartialReport --> [*]
456
-
457
- Synthesis --> FinalReport: Combine all outputs
458
- FinalReport --> [*]
459
-
460
- note right of QualityCheck
461
- Manager assesses:
462
- • Output completeness
463
- • Quality metrics
464
- • Progress made
465
- end note
466
-
467
- note right of StallDetection
468
- Stall = no new progress
469
- after agent execution
470
- Triggers plan reset
471
- end note
472
- ```
473
-
474
- ## 13. Gradio UI Integration
475
-
476
- ```mermaid
477
- graph TD
478
- App[Gradio App<br/>DeepCritical Research Agent]
479
-
480
- App --> Input[Input Section]
481
- App --> Status[Status Section]
482
- App --> Output[Output Section]
483
-
484
- Input --> Query[Research Question<br/>Text Area]
485
- Input --> Controls[Controls]
486
- Controls --> MaxHyp[Max Hypotheses: 1-10]
487
- Controls --> MaxRounds[Max Rounds: 5-20]
488
- Controls --> Submit[Start Research Button]
489
-
490
- Status --> Log[Real-time Event Log<br/>• Manager planning<br/>• Agent selection<br/>• Execution updates<br/>• Quality assessment]
491
- Status --> Progress[Progress Tracker<br/>• Current agent<br/>• Round count<br/>• Stall count]
492
-
493
- Output --> Tabs[Tabbed Results]
494
- Tabs --> Tab1[Hypotheses Tab<br/>Generated hypotheses with scores]
495
- Tabs --> Tab2[Search Results Tab<br/>Papers & sources found]
496
- Tabs --> Tab3[Analysis Tab<br/>Evidence & verdicts]
497
- Tabs --> Tab4[Report Tab<br/>Final research report]
498
- Tab4 --> Download[Download Report<br/>MD / PDF / JSON]
499
-
500
- Submit -.->|Triggers| Workflow[Magentic Workflow]
501
- Workflow -.->|MagenticOrchestratorMessageEvent| Log
502
- Workflow -.->|MagenticAgentDeltaEvent| Log
503
- Workflow -.->|MagenticAgentMessageEvent| Log
504
- Workflow -.->|MagenticFinalResultEvent| Tab4
505
-
506
- style App fill:#e1f5e1
507
- style Input fill:#fff4e6
508
- style Status fill:#e6f3ff
509
- style Output fill:#e6ffe6
510
- style Workflow fill:#ffe6e6
511
- ```
512
-
513
- ## 14. Complete System Context
514
-
515
- ```mermaid
516
- graph LR
517
- User[👤 Researcher<br/>Asks research questions] -->|Submits query| DC[DeepCritical<br/>Magentic Workflow]
518
-
519
- DC -->|Literature search| PubMed[PubMed API<br/>Medical papers]
520
- DC -->|Preprint search| ArXiv[arXiv API<br/>Scientific preprints]
521
- DC -->|Biology search| BioRxiv[bioRxiv API<br/>Biology preprints]
522
- DC -->|Agent reasoning| Claude[Claude API<br/>Sonnet 4 / Opus]
523
- DC -->|Code execution| Modal[Modal Sandbox<br/>Safe Python env]
524
- DC -->|Vector storage| Chroma[ChromaDB<br/>Embeddings & RAG]
525
-
526
- DC -->|Deployed on| HF[HuggingFace Spaces<br/>Gradio 6.0]
-
- PubMed -->|Results| DC
- ArXiv -->|Results| DC
- BioRxiv -->|Results| DC
- Claude -->|Responses| DC
- Modal -->|Output| DC
- Chroma -->|Context| DC
-
- DC -->|Research report| User
-
- style User fill:#e1f5e1
- style DC fill:#ffe6e6
- style PubMed fill:#e6f3ff
- style ArXiv fill:#e6f3ff
- style BioRxiv fill:#e6f3ff
- style Claude fill:#ffd6d6
- style Modal fill:#f0f0f0
- style Chroma fill:#ffe6f0
- style HF fill:#d4edda
- ```
-
- ## 15. Workflow Timeline (Simplified)
-
- ```mermaid
- gantt
- title DeepCritical Magentic Workflow - Typical Execution
- dateFormat mm:ss
- axisFormat %M:%S
-
- section Manager Planning
- Initial planning :p1, 00:00, 10s
-
- section Hypothesis Agent
- Generate hypotheses :h1, after p1, 30s
- Manager assessment :h2, after h1, 5s
-
- section Search Agent
- Search hypothesis 1 :s1, after h2, 20s
- Search hypothesis 2 :s2, after s1, 20s
- Search hypothesis 3 :s3, after s2, 20s
- RAG processing :s4, after s3, 15s
- Manager assessment :s5, after s4, 5s
-
- section Analysis Agent
- Evidence extraction :a1, after s5, 15s
- Code generation :a2, after a1, 20s
- Code execution :a3, after a2, 25s
- Synthesis :a4, after a3, 20s
- Manager assessment :a5, after a4, 5s
-
- section Report Agent
- Report assembly :r1, after a5, 30s
- Visualization :r2, after r1, 15s
- Formatting :r3, after r2, 10s
-
- section Manager Synthesis
- Final synthesis :f1, after r3, 10s
- ```
-
- ---
-
- ## Key Differences from Original Design
-
- | Aspect | Original (Judge-in-Loop) | New (Magentic) |
- |--------|-------------------------|----------------|
- | **Control Flow** | Fixed sequential phases | Dynamic agent selection |
- | **Quality Control** | Separate Judge Agent | Manager assessment built-in |
- | **Retry Logic** | Phase-level with feedback | Agent-level with adaptation |
- | **Flexibility** | Rigid 4-phase pipeline | Adaptive workflow |
- | **Complexity** | 5 agents (including Judge) | 4 agents (no Judge) |
- | **Progress Tracking** | Manual state management | Built-in round/stall detection |
- | **Agent Coordination** | Sequential handoff | Manager-driven dynamic selection |
- | **Error Recovery** | Retry same phase | Try different agent or replan |
-
- ---
-
- ## Simplified Design Principles
-
- 1. **Manager is Intelligent**: LLM-powered manager handles planning, selection, and quality assessment
- 2. **No Separate Judge**: Manager's assessment phase replaces dedicated Judge Agent
- 3. **Dynamic Workflow**: Agents can be called multiple times in any order based on need
- 4. **Built-in Safety**: max_round_count (15) and max_stall_count (3) prevent infinite loops
- 5. **Event-Driven UI**: Real-time streaming updates to Gradio interface
- 6. **MCP-Powered Tools**: All external capabilities via Model Context Protocol
- 7. **Shared Context**: Centralized state accessible to all agents
- 8. **Progress Awareness**: Manager tracks what's been done and what's needed
-
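The round and stall guards in principle 4 can be sketched as a plain loop (names here are illustrative, not the actual orchestrator API):

```python
# Illustrative sketch of max_round_count / max_stall_count guards;
# function and variable names are hypothetical, not DeepCritical's API.
def run_rounds(progress_flags, max_round_count=15, max_stall_count=3):
    """Return (stop_reason, rounds_used) for a sequence of per-round progress flags."""
    stalls = 0
    for rounds, progressed in enumerate(progress_flags, start=1):
        stalls = 0 if progressed else stalls + 1
        if stalls >= max_stall_count:
            return ("stalled", rounds)      # too many rounds without progress
        if rounds >= max_round_count:
            return ("max_rounds", rounds)   # hard cap on total rounds
    return ("completed", len(progress_flags))
```

A workflow that keeps making progress never trips either guard; three fruitless rounds in a row stop it early.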
- ---
-
- ## Legend
-
- - 🔴 **Red/Pink**: Manager, orchestration, decision-making
- - 🟡 **Yellow/Orange**: Specialist agents, processing
- - 🔵 **Blue**: Data, tools, MCP services
- - 🟣 **Purple/Pink**: Storage, databases, state
- - 🟢 **Green**: User interactions, final outputs
- - ⚪ **Gray**: External services, APIs
-
- ---
-
- ## Implementation Highlights
-
- **Simple 4-Agent Setup:**
-
- <!--codeinclude-->
- [Magentic Workflow Builder](../src/orchestrator_magentic.py) start_line:72 end_line:99
- <!--/codeinclude-->
-
- **Manager handles quality assessment in its instructions:**
- - Checks hypothesis quality (testable, novel, clear)
- - Validates search results (relevant, authoritative, recent)
- - Assesses analysis soundness (methodology, evidence, conclusions)
- - Ensures report completeness (all sections, proper citations)
-
- No separate Judge Agent needed - manager does it all!
-
- ---
-
- **Document Version**: 2.0 (Magentic Simplified)
- **Last Updated**: 2025-11-24
- **Architecture**: Microsoft Magentic Orchestration Pattern
- **Agents**: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager
- **License**: MIT
-
- ## See Also
-
- - [Orchestrators](orchestrators.md) - Overview of all orchestrator patterns
- - [Graph Orchestration](graph_orchestration.md) - Graph-based execution overview
- - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
docs/configuration/index.md DELETED
@@ -1,564 +0,0 @@
- # Configuration Guide
-
- ## Overview
-
- DeepCritical uses **Pydantic Settings** for centralized configuration management. All settings are defined in the `Settings` class in `src/utils/config.py` and can be configured via environment variables or a `.env` file.
-
- The configuration system provides:
-
- - **Type Safety**: Strongly-typed fields with Pydantic validation
- - **Environment File Support**: Automatically loads from `.env` file (if present)
- - **Case-Insensitive**: Environment variables are case-insensitive
- - **Singleton Pattern**: Global `settings` instance for easy access throughout the codebase
- - **Validation**: Automatic validation on load with helpful error messages
-
- ## Quick Start
-
- 1. Create a `.env` file in the project root
- 2. Set at least one LLM API key (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `HF_TOKEN`)
- 3. Optionally configure other services as needed
- 4. The application will automatically load and validate your configuration
-
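A minimal `.env` covering those steps might look like this (placeholder values; Anthropic chosen arbitrarily):

```bash
# .env - minimal working configuration (values are placeholders)
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
LOG_LEVEL=INFO
```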
- ## Configuration System Architecture
-
- ### Settings Class
-
- The [`Settings`][settings-class] class extends `BaseSettings` from `pydantic_settings` and defines all application configuration:
-
- <!--codeinclude-->
- [Settings Class Definition](../src/utils/config.py) start_line:13 end_line:21
- <!--/codeinclude-->
-
- [View source](https://github.com/DeepCritical/GradioDemo/blob/main/src/utils/config.py#L13-L21)
-
- ### Singleton Instance
-
- A global `settings` instance is available for import:
-
- <!--codeinclude-->
- [Singleton Instance](../src/utils/config.py) start_line:234 end_line:235
- <!--/codeinclude-->
-
- [View source](https://github.com/DeepCritical/GradioDemo/blob/main/src/utils/config.py#L234-L235)
-
- ### Usage Pattern
-
- Access configuration throughout the codebase:
-
- ```python
- from src.utils.config import settings
-
- # Check if API keys are available
- if settings.has_openai_key:
-     # Use OpenAI
-     pass
-
- # Access configuration values
- max_iterations = settings.max_iterations
- web_search_provider = settings.web_search_provider
- ```
-
- ## Required Configuration
-
- ### LLM Provider
-
- You must configure at least one LLM provider. The system supports:
-
- - **OpenAI**: Requires `OPENAI_API_KEY`
- - **Anthropic**: Requires `ANTHROPIC_API_KEY`
- - **HuggingFace**: Optional `HF_TOKEN` or `HUGGINGFACE_API_KEY` (can work without key for public models)
-
- #### OpenAI Configuration
-
- ```bash
- LLM_PROVIDER=openai
- OPENAI_API_KEY=your_openai_api_key_here
- OPENAI_MODEL=gpt-5.1
- ```
-
- The default model is defined in the `Settings` class:
-
- <!--codeinclude-->
- [OpenAI Model Configuration](../src/utils/config.py) start_line:29 end_line:29
- <!--/codeinclude-->
-
- #### Anthropic Configuration
-
- ```bash
- LLM_PROVIDER=anthropic
- ANTHROPIC_API_KEY=your_anthropic_api_key_here
- ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
- ```
-
- The default model is defined in the `Settings` class:
-
- <!--codeinclude-->
- [Anthropic Model Configuration](../src/utils/config.py) start_line:30 end_line:32
- <!--/codeinclude-->
-
- #### HuggingFace Configuration
-
- HuggingFace can work without an API key for public models, but an API key provides higher rate limits:
-
- ```bash
- # Option 1: Using HF_TOKEN (preferred)
- HF_TOKEN=your_huggingface_token_here
-
- # Option 2: Using HUGGINGFACE_API_KEY (alternative)
- HUGGINGFACE_API_KEY=your_huggingface_api_key_here
-
- # Default model
- HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
- ```
-
- The HuggingFace token can be set via either environment variable:
-
- <!--codeinclude-->
- [HuggingFace Token Configuration](../src/utils/config.py) start_line:33 end_line:35
- <!--/codeinclude-->
-
- <!--codeinclude-->
- [HuggingFace API Key Configuration](../src/utils/config.py) start_line:57 end_line:59
- <!--/codeinclude-->
-
- ## Optional Configuration
-
- ### Embedding Configuration
-
- DeepCritical supports multiple embedding providers for semantic search and RAG:
-
- ```bash
- # Embedding Provider: "openai", "local", or "huggingface"
- EMBEDDING_PROVIDER=local
-
- # OpenAI Embedding Model (used by LlamaIndex RAG)
- OPENAI_EMBEDDING_MODEL=text-embedding-3-small
-
- # Local Embedding Model (sentence-transformers, used by EmbeddingService)
- LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
-
- # HuggingFace Embedding Model
- HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
- ```
-
- The embedding provider configuration:
-
- <!--codeinclude-->
- [Embedding Provider Configuration](../src/utils/config.py) start_line:47 end_line:50
- <!--/codeinclude-->
-
- **Note**: OpenAI embeddings require `OPENAI_API_KEY`. The local provider (default) uses sentence-transformers and requires no API key.
-
- ### Web Search Configuration
-
- DeepCritical supports multiple web search providers:
-
- ```bash
- # Web Search Provider: "serper", "searchxng", "brave", "tavily", or "duckduckgo"
- # Default: "duckduckgo" (no API key required)
- WEB_SEARCH_PROVIDER=duckduckgo
-
- # Serper API Key (for Google search via Serper)
- SERPER_API_KEY=your_serper_api_key_here
-
- # SearchXNG Host URL (for self-hosted search)
- SEARCHXNG_HOST=http://localhost:8080
-
- # Brave Search API Key
- BRAVE_API_KEY=your_brave_api_key_here
-
- # Tavily API Key
- TAVILY_API_KEY=your_tavily_api_key_here
- ```
-
- The web search provider configuration:
-
- <!--codeinclude-->
- [Web Search Provider Configuration](../src/utils/config.py) start_line:71 end_line:74
- <!--/codeinclude-->
-
- **Note**: DuckDuckGo is the default and requires no API key, making it ideal for development and testing.
-
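How a provider choice interacts with its required credential can be sketched like this (a simplified illustration, not the actual resolution logic in `src/utils/config.py`):

```python
# Simplified illustration: each non-default provider needs one setting;
# anything unconfigured falls back to the keyless DuckDuckGo default.
REQUIRED_SETTING = {
    "serper": "SERPER_API_KEY",
    "searchxng": "SEARCHXNG_HOST",
    "brave": "BRAVE_API_KEY",
    "tavily": "TAVILY_API_KEY",
}

def resolve_provider(provider: str, env: dict) -> str:
    if provider == "duckduckgo":
        return provider                      # no key required
    if env.get(REQUIRED_SETTING.get(provider, "")):
        return provider                      # credential present
    return "duckduckgo"                      # keyless fallback
```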
- ### PubMed Configuration
-
- PubMed search supports an optional NCBI API key for higher rate limits:
-
- ```bash
- # NCBI API Key (optional, for higher rate limits: 10 req/sec vs 3 req/sec)
- NCBI_API_KEY=your_ncbi_api_key_here
- ```
-
- The PubMed tool uses this configuration:
-
- <!--codeinclude-->
- [PubMed Tool Configuration](../src/tools/pubmed.py) start_line:22 end_line:29
- <!--/codeinclude-->
-
- ### Agent Configuration
-
- Control agent behavior and research loop execution:
-
- ```bash
- # Maximum iterations per research loop (1-50, default: 10)
- MAX_ITERATIONS=10
-
- # Search timeout in seconds
- SEARCH_TIMEOUT=30
-
- # Use graph-based execution for research flows
- USE_GRAPH_EXECUTION=false
- ```
-
- The agent configuration fields:
-
- <!--codeinclude-->
- [Agent Configuration](../src/utils/config.py) start_line:80 end_line:85
- <!--/codeinclude-->
-
- ### Budget & Rate Limiting Configuration
-
- Control resource limits for research loops:
-
- ```bash
- # Default token budget per research loop (1000-1000000, default: 100000)
- DEFAULT_TOKEN_LIMIT=100000
-
- # Default time limit per research loop in minutes (1-120, default: 10)
- DEFAULT_TIME_LIMIT_MINUTES=10
-
- # Default iterations limit per research loop (1-50, default: 10)
- DEFAULT_ITERATIONS_LIMIT=10
- ```
-
- The budget configuration with validation:
-
- <!--codeinclude-->
- [Budget Configuration](../src/utils/config.py) start_line:87 end_line:105
- <!--/codeinclude-->
-
- ### RAG Service Configuration
-
- Configure the Retrieval-Augmented Generation service:
-
- ```bash
- # ChromaDB collection name for RAG
- RAG_COLLECTION_NAME=deepcritical_evidence
-
- # Number of top results to retrieve from RAG (1-50, default: 5)
- RAG_SIMILARITY_TOP_K=5
-
- # Automatically ingest evidence into RAG
- RAG_AUTO_INGEST=true
- ```
-
- The RAG configuration:
-
- <!--codeinclude-->
- [RAG Service Configuration](../src/utils/config.py) start_line:127 end_line:141
- <!--/codeinclude-->
-
- ### ChromaDB Configuration
-
- Configure the vector database for embeddings and RAG:
-
- ```bash
- # ChromaDB storage path
- CHROMA_DB_PATH=./chroma_db
-
- # Whether to persist ChromaDB to disk
- CHROMA_DB_PERSIST=true
-
- # ChromaDB server host (for remote ChromaDB, optional)
- CHROMA_DB_HOST=localhost
-
- # ChromaDB server port (for remote ChromaDB, optional)
- CHROMA_DB_PORT=8000
- ```
-
- The ChromaDB configuration:
-
- <!--codeinclude-->
- [ChromaDB Configuration](../src/utils/config.py) start_line:113 end_line:125
- <!--/codeinclude-->
-
- ### External Services
-
- #### Modal Configuration
-
- Modal is used for secure sandbox execution of statistical analysis:
-
- ```bash
- # Modal Token ID (for Modal sandbox execution)
- MODAL_TOKEN_ID=your_modal_token_id_here
-
- # Modal Token Secret
- MODAL_TOKEN_SECRET=your_modal_token_secret_here
- ```
-
- The Modal configuration:
-
- <!--codeinclude-->
- [Modal Configuration](../src/utils/config.py) start_line:110 end_line:112
- <!--/codeinclude-->
-
- ### Logging Configuration
-
- Configure structured logging:
-
- ```bash
- # Log Level: "DEBUG", "INFO", "WARNING", or "ERROR"
- LOG_LEVEL=INFO
- ```
-
- The logging configuration:
-
- <!--codeinclude-->
- [Logging Configuration](../src/utils/config.py) start_line:107 end_line:108
- <!--/codeinclude-->
-
- Logging is configured via the `configure_logging()` function:
-
- <!--codeinclude-->
- [Configure Logging Function](../src/utils/config.py) start_line:212 end_line:231
- <!--/codeinclude-->
-
- ## Configuration Properties
-
- The `Settings` class provides helpful properties for checking configuration state:
-
- ### API Key Availability
-
- Check which API keys are available:
-
- <!--codeinclude-->
- [API Key Availability Properties](../src/utils/config.py) start_line:171 end_line:189
- <!--/codeinclude-->
-
- **Usage:**
-
- ```python
- from src.utils.config import settings
-
- # Check API key availability
- if settings.has_openai_key:
-     # Use OpenAI
-     pass
-
- if settings.has_anthropic_key:
-     # Use Anthropic
-     pass
-
- if settings.has_huggingface_key:
-     # Use HuggingFace
-     pass
-
- if settings.has_any_llm_key:
-     # At least one LLM is available
-     pass
- ```
-
- ### Service Availability
-
- Check if external services are configured:
-
- <!--codeinclude-->
- [Modal Availability Property](../src/utils/config.py) start_line:143 end_line:146
- <!--/codeinclude-->
-
- <!--codeinclude-->
- [Web Search Availability Property](../src/utils/config.py) start_line:191 end_line:204
- <!--/codeinclude-->
-
- **Usage:**
-
- ```python
- from src.utils.config import settings
-
- # Check service availability
- if settings.modal_available:
-     # Use Modal sandbox
-     pass
-
- if settings.web_search_available:
-     # Web search is configured
-     pass
- ```
-
- ### API Key Retrieval
-
- Get the API key for the configured provider:
-
- <!--codeinclude-->
- [Get API Key Method](../src/utils/config.py) start_line:148 end_line:160
- <!--/codeinclude-->
-
- For OpenAI-specific operations (e.g., Magentic mode):
-
- <!--codeinclude-->
- [Get OpenAI API Key Method](../src/utils/config.py) start_line:162 end_line:169
- <!--/codeinclude-->
-
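The behavior these methods describe, returning the configured provider's key or raising, can be mimicked in a few lines (a stand-in sketch; the real implementation is the included `config.py` source above):

```python
# Stand-in sketch; ConfigurationError here shadows the real class in
# src/utils/exceptions.py, and the lookup table is hypothetical.
class ConfigurationError(Exception):
    """Raised when configuration is invalid."""

def get_api_key(provider: str, keys: dict) -> str:
    key = keys.get(provider)
    if not key:
        raise ConfigurationError(f"No API key configured for provider '{provider}'")
    return key
```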
- ## Configuration Usage in Codebase
-
- The configuration system is used throughout the codebase:
-
- ### LLM Factory
-
- The LLM factory uses settings to create appropriate models:
-
- <!--codeinclude-->
- [LLM Factory Usage](../src/utils/llm_factory.py) start_line:129 end_line:144
- <!--/codeinclude-->
-
- ### Embedding Service
-
- The embedding service uses the local embedding model configuration:
-
- <!--codeinclude-->
- [Embedding Service Usage](../src/services/embeddings.py) start_line:29 end_line:31
- <!--/codeinclude-->
-
- ### Orchestrator Factory
-
- The orchestrator factory uses settings to determine mode:
-
- <!--codeinclude-->
- [Orchestrator Factory Mode Detection](../src/orchestrator_factory.py) start_line:69 end_line:80
- <!--/codeinclude-->
-
- ## Environment Variables Reference
-
- ### Required (at least one LLM)
-
- - `OPENAI_API_KEY` - OpenAI API key (required for OpenAI provider)
- - `ANTHROPIC_API_KEY` - Anthropic API key (required for Anthropic provider)
- - `HF_TOKEN` or `HUGGINGFACE_API_KEY` - HuggingFace API token (optional, can work without for public models)
-
- #### LLM Configuration Variables
-
- - `LLM_PROVIDER` - Provider to use: `"openai"`, `"anthropic"`, or `"huggingface"` (default: `"huggingface"`)
- - `OPENAI_MODEL` - OpenAI model name (default: `"gpt-5.1"`)
- - `ANTHROPIC_MODEL` - Anthropic model name (default: `"claude-sonnet-4-5-20250929"`)
- - `HUGGINGFACE_MODEL` - HuggingFace model ID (default: `"meta-llama/Llama-3.1-8B-Instruct"`)
-
- #### Embedding Configuration Variables
-
- - `EMBEDDING_PROVIDER` - Provider: `"openai"`, `"local"`, or `"huggingface"` (default: `"local"`)
- - `OPENAI_EMBEDDING_MODEL` - OpenAI embedding model (default: `"text-embedding-3-small"`)
- - `LOCAL_EMBEDDING_MODEL` - Local sentence-transformers model (default: `"all-MiniLM-L6-v2"`)
- - `HUGGINGFACE_EMBEDDING_MODEL` - HuggingFace embedding model (default: `"sentence-transformers/all-MiniLM-L6-v2"`)
-
- #### Web Search Configuration Variables
-
- - `WEB_SEARCH_PROVIDER` - Provider: `"serper"`, `"searchxng"`, `"brave"`, `"tavily"`, or `"duckduckgo"` (default: `"duckduckgo"`)
- - `SERPER_API_KEY` - Serper API key (required for Serper provider)
- - `SEARCHXNG_HOST` - SearchXNG host URL (required for SearchXNG provider)
- - `BRAVE_API_KEY` - Brave Search API key (required for Brave provider)
- - `TAVILY_API_KEY` - Tavily API key (required for Tavily provider)
-
- #### PubMed Configuration Variables
-
- - `NCBI_API_KEY` - NCBI API key (optional, increases rate limit from 3 to 10 req/sec)
-
- #### Agent Configuration Variables
-
- - `MAX_ITERATIONS` - Maximum iterations per research loop (1-50, default: `10`)
- - `SEARCH_TIMEOUT` - Search timeout in seconds (default: `30`)
- - `USE_GRAPH_EXECUTION` - Use graph-based execution (default: `false`)
-
- #### Budget Configuration Variables
-
- - `DEFAULT_TOKEN_LIMIT` - Default token budget per research loop (1000-1000000, default: `100000`)
- - `DEFAULT_TIME_LIMIT_MINUTES` - Default time limit in minutes (1-120, default: `10`)
- - `DEFAULT_ITERATIONS_LIMIT` - Default iterations limit (1-50, default: `10`)
-
- #### RAG Configuration Variables
-
- - `RAG_COLLECTION_NAME` - ChromaDB collection name (default: `"deepcritical_evidence"`)
- - `RAG_SIMILARITY_TOP_K` - Number of top results to retrieve (1-50, default: `5`)
- - `RAG_AUTO_INGEST` - Automatically ingest evidence into RAG (default: `true`)
-
- #### ChromaDB Configuration Variables
-
- - `CHROMA_DB_PATH` - ChromaDB storage path (default: `"./chroma_db"`)
- - `CHROMA_DB_PERSIST` - Whether to persist ChromaDB to disk (default: `true`)
- - `CHROMA_DB_HOST` - ChromaDB server host (optional, for remote ChromaDB)
- - `CHROMA_DB_PORT` - ChromaDB server port (optional, for remote ChromaDB)
-
- #### External Services Variables
-
- - `MODAL_TOKEN_ID` - Modal token ID (optional, for Modal sandbox execution)
- - `MODAL_TOKEN_SECRET` - Modal token secret (optional, for Modal sandbox execution)
-
- #### Logging Configuration Variables
-
- - `LOG_LEVEL` - Log level: `"DEBUG"`, `"INFO"`, `"WARNING"`, or `"ERROR"` (default: `"INFO"`)
-
- ## Validation
-
- Settings are validated on load using Pydantic validation:
-
- - **Type Checking**: All fields are strongly typed
- - **Range Validation**: Numeric fields have min/max constraints (e.g., `ge=1, le=50` for `max_iterations`)
- - **Literal Validation**: Enum fields only accept specific values (e.g., `Literal["openai", "anthropic", "huggingface"]`)
- - **Required Fields**: API keys are checked when accessed via `get_api_key()` or `get_openai_api_key()`
-
- ### Validation Examples
-
- The `max_iterations` field has range validation:
-
- <!--codeinclude-->
- [Max Iterations Validation](../src/utils/config.py) start_line:81 end_line:81
- <!--/codeinclude-->
-
- The `llm_provider` field has literal validation:
-
- <!--codeinclude-->
- [LLM Provider Literal Validation](../src/utils/config.py) start_line:26 end_line:28
- <!--/codeinclude-->
-
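In plain Python, the two rules those fields declare amount to the following checks (pydantic-settings enforces them declaratively; this sketch just mirrors the behavior):

```python
# Mirrors Field(ge=1, le=50) and Literal["openai", "anthropic", "huggingface"].
VALID_PROVIDERS = ("openai", "anthropic", "huggingface")

def check_settings(max_iterations: int, llm_provider: str) -> None:
    if not 1 <= max_iterations <= 50:
        raise ValueError("max_iterations must be between 1 and 50")
    if llm_provider not in VALID_PROVIDERS:
        raise ValueError(f"llm_provider must be one of {VALID_PROVIDERS}")
```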
- ## Error Handling
-
- Configuration errors raise `ConfigurationError` from `src/utils/exceptions.py`:
-
- ```22:25:src/utils/exceptions.py
- class ConfigurationError(DeepCriticalError):
-     """Raised when configuration is invalid."""
-
-     pass
- ```
-
- ### Error Handling Example
-
- ```python
- from src.utils.config import settings
- from src.utils.exceptions import ConfigurationError
-
- try:
-     api_key = settings.get_api_key()
- except ConfigurationError as e:
-     print(f"Configuration error: {e}")
- ```
-
- ### Common Configuration Errors
-
- 1. **Missing API Key**: When `get_api_key()` is called but the required API key is not set
- 2. **Invalid Provider**: When `llm_provider` is set to an unsupported value
- 3. **Out of Range**: When numeric values exceed their min/max constraints
- 4. **Invalid Literal**: When enum fields receive unsupported values
-
- ## Configuration Best Practices
-
- 1. **Use `.env` File**: Store sensitive keys in `.env` file (add to `.gitignore`)
- 2. **Check Availability**: Use properties like `has_openai_key` before accessing API keys
- 3. **Handle Errors**: Always catch `ConfigurationError` when calling `get_api_key()`
- 4. **Validate Early**: Configuration is validated on import, so errors surface immediately
- 5. **Use Defaults**: Leverage sensible defaults for optional configuration
-
- ## Future Enhancements
-
- The following configurations are planned for future phases:
-
- 1. **Additional LLM Providers**: DeepSeek, OpenRouter, Gemini, Perplexity, Azure OpenAI, Local models
- 2. **Model Selection**: Reasoning/main/fast model configuration
- 3. **Service Integration**: Additional service integrations and configurations
docs/contributing/code-quality.md DELETED
@@ -1,120 +0,0 @@
- # Code Quality & Documentation
-
- This document outlines code quality standards and documentation requirements for The DETERMINATOR.
-
- ## Linting
-
- - Ruff with 100-char line length
- - Ignore rules documented in `pyproject.toml`:
-   - `PLR0913`: Too many arguments (agents need many params)
-   - `PLR0912`: Too many branches (complex orchestrator logic)
-   - `PLR0911`: Too many return statements (complex agent logic)
-   - `PLR2004`: Magic values (statistical constants)
-   - `PLW0603`: Global statement (singleton pattern)
-   - `PLC0415`: Lazy imports for optional dependencies
-   - `E402`: Module level import not at top (needed for pytest.importorskip)
-   - `E501`: Line too long (ignore line length violations)
-   - `RUF100`: Unused noqa (version differences between local/CI)
-
- ## Type Checking
-
- - `mypy --strict` compliance
- - `ignore_missing_imports = true` (for optional dependencies)
- - Exclude: `reference_repos/`, `examples/`
- - All functions must have complete type annotations
-
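For illustration, a function that satisfies `mypy --strict` annotates every parameter and the return value, with no implicit `Any`:

```python
from collections.abc import Sequence

def mean_score(scores: Sequence[float]) -> float:
    """Return the arithmetic mean of a non-empty score sequence."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(scores) / len(scores)
```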
- ## Pre-commit
-
- Pre-commit hooks run automatically on commit to ensure code quality. Configuration is in `.pre-commit-config.yaml`.
-
- ### Installation
-
- ```bash
- # Install dependencies (includes pre-commit package)
- uv sync --all-extras
-
- # Set up git hooks (must be run separately)
- uv run pre-commit install
- ```
-
- **Note**: `uv sync --all-extras` installs the pre-commit package, but you must run `uv run pre-commit install` separately to set up the git hooks.
-
- ### Pre-commit Hooks
-
- The following hooks run automatically on commit:
-
- 1. **ruff**: Lints code and fixes issues automatically
-    - Runs on: `src/` (excludes `tests/`, `reference_repos/`)
-    - Auto-fixes: Yes
-
- 2. **ruff-format**: Formats code with ruff
-    - Runs on: `src/` (excludes `tests/`, `reference_repos/`)
-    - Auto-fixes: Yes
-
- 3. **mypy**: Type checking
-    - Runs on: `src/` (excludes `folder/`)
-    - Additional dependencies: pydantic, pydantic-settings, tenacity, pydantic-ai
-
- 4. **pytest-unit**: Runs unit tests (excludes OpenAI and embedding_provider tests)
-    - Runs: `tests/unit/` with `-m "not openai and not embedding_provider"`
-    - Always runs: Yes (not just on changed files)
-
- 5. **pytest-local-embeddings**: Runs local embedding tests
-    - Runs: `tests/` with `-m "local_embeddings"`
-    - Always runs: Yes
-
- ### Manual Pre-commit Run
-
- To run pre-commit hooks manually (without committing):
-
- ```bash
- uv run pre-commit run --all-files
- ```
-
- ### Troubleshooting
-
- - **Hooks failing**: Fix the issues shown in the output, then commit again
- - **Skipping hooks**: Use `git commit --no-verify` (not recommended)
- - **Hook not running**: Ensure hooks are installed with `uv run pre-commit install`
- - **Type errors**: Check that all dependencies are installed with `uv sync --all-extras`
-
- ## Documentation
-
- ### Building Documentation
-
- Documentation is built using MkDocs. Source files are in `docs/`, and the configuration is in `mkdocs.yml`.
-
- ```bash
- # Build documentation
- uv run mkdocs build
-
- # Serve documentation locally (http://127.0.0.1:8000)
- uv run mkdocs serve
- ```
-
- The documentation site is published at: <https://deepcritical.github.io/GradioDemo/>
-
- ### Docstrings
-
- - Google-style docstrings for all public functions
- - Include Args, Returns, Raises sections
- - Use type hints in docstrings only if needed for clarity
-
- Example:
-
- <!--codeinclude-->
- [Search Method Docstring Example](../src/tools/pubmed.py) start_line:51 end_line:70
- <!--/codeinclude-->
-
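Since the included source is not reproduced here, a generic function following the same Google-style convention looks like this (hypothetical helper, not from the codebase):

```python
def normalize_title(title: str, max_length: int = 120) -> str:
    """Normalize an article title for display.

    Args:
        title: Raw title text, possibly with surrounding whitespace.
        max_length: Maximum length of the returned string.

    Returns:
        The stripped title, truncated to max_length characters.

    Raises:
        ValueError: If the title is empty after stripping.
    """
    cleaned = title.strip()
    if not cleaned:
        raise ValueError("title is empty")
    return cleaned[:max_length]
```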
- ### Code Comments
-
- - Explain WHY, not WHAT
- - Document non-obvious patterns (e.g., why `requests` not `httpx` for ClinicalTrials)
- - Mark critical sections: `# CRITICAL: ...`
- - Document rate limiting rationale
- - Explain async patterns when non-obvious
-
- ## See Also
-
- - [Code Style](code-style.md) - Code style guidelines
- - [Testing](testing.md) - Testing guidelines
docs/contributing/code-style.md DELETED
@@ -1,83 +0,0 @@
- # Code Style & Conventions
-
- This document outlines the code style and conventions for The DETERMINATOR.
-
- ## Package Manager
-
- This project uses [`uv`](https://github.com/astral-sh/uv) as the package manager. All commands should be prefixed with `uv run` to ensure they run in the correct environment.
-
- ### Installation
-
- ```bash
- # Install uv if you haven't already (recommended: standalone installer)
- # Unix/macOS/Linux:
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Windows (PowerShell):
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
- # Alternative: pipx install uv
- # Or: pip install uv
-
- # Sync all dependencies including dev extras
- uv sync --all-extras
- ```
-
- ### Running Commands
-
- All development commands should use the `uv run` prefix:
-
- ```bash
- # Instead of: pytest tests/
- uv run pytest tests/
-
- # Instead of: ruff check src
- uv run ruff check src
-
- # Instead of: mypy src
- uv run mypy src
- ```
-
- This ensures commands run in the correct virtual environment managed by `uv`.
-
- ## Type Safety
-
- - **ALWAYS** use type hints for all function parameters and return types
- - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
- - Use `TYPE_CHECKING` imports for circular dependencies:
-
- <!--codeinclude-->
- [TYPE_CHECKING Import Pattern](../src/utils/citation_validator.py) start_line:8 end_line:11
- <!--/codeinclude-->
-
- ## Pydantic Models
-
- - All data exchange uses Pydantic models (`src/utils/models.py`)
- - Models are frozen (`model_config = {"frozen": True}`) for immutability
- - Use `Field()` with descriptions for all model fields
- - Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints
-
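The immutability that frozen models buy can be seen with a stdlib stand-in (a frozen dataclass; Pydantic raises its own error on mutation, but the effect is the same):

```python
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class Citation:  # stand-in for a frozen Pydantic model
    title: str
    url: str

c = Citation(title="Example", url="https://example.org")
try:
    c.title = "changed"          # mutation is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```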
- ## Async Patterns
-
- - **ALL** I/O operations must be async (`async def`, `await`)
- - Use `asyncio.gather()` for parallel operations
- - CPU-bound work (embeddings, parsing) must use `run_in_executor()`:
-
- ```python
- loop = asyncio.get_running_loop()
- result = await loop.run_in_executor(None, cpu_bound_function, args)
- ```
-
- - Never block the event loop with synchronous I/O
-
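Both rules combined in a runnable form: two CPU-bound calls dispatched to the default executor and awaited in parallel with `asyncio.gather()`.

```python
import asyncio

def cpu_bound(n: int) -> int:
    # Stand-in for CPU-heavy work (embeddings, parsing)
    return sum(range(n))

async def main() -> list[int]:
    loop = asyncio.get_running_loop()
    # Off-load to the default thread pool and run both in parallel
    results = await asyncio.gather(
        loop.run_in_executor(None, cpu_bound, 10),
        loop.run_in_executor(None, cpu_bound, 100),
    )
    return list(results)

totals = asyncio.run(main())
```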
73
- ## Common Pitfalls
74
-
75
- 1. **Blocking the event loop**: Never use sync I/O in async functions
76
- 2. **Missing type hints**: All functions must have complete type annotations
77
- 3. **Global mutable state**: Use ContextVar or pass via parameters
78
- 4. **Import errors**: Lazy-load optional dependencies (magentic, modal, embeddings)
79
-
80
- ## See Also
81
-
82
- - [Error Handling](error-handling.md) - Error handling guidelines
83
- - [Implementation Patterns](implementation-patterns.md) - Common patterns
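The removed page leaned on a `codeinclude` for the `TYPE_CHECKING` pattern, and the included source is not part of this diff. A minimal, self-contained sketch of the same idiom (the function and names below are illustrative, not taken from the codebase):

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for the type checker; never executed at runtime,
    # so it cannot create a circular import.
    from collections.abc import Sequence


def first_url(urls: Sequence[str]) -> str | None:
    """Return the first URL, or None if the sequence is empty."""
    return urls[0] if urls else None
```

With `from __future__ import annotations`, the annotation is never evaluated at runtime, so the guarded import is safe even under `mypy --strict`.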
docs/contributing/error-handling.md DELETED
@@ -1,54 +0,0 @@
- # Error Handling & Logging
-
- This document outlines error handling and logging conventions for The DETERMINATOR.
-
- ## Exception Hierarchy
-
- Use the custom exception hierarchy (`src/utils/exceptions.py`):
-
- <!--codeinclude-->
- [Exception Hierarchy](../src/utils/exceptions.py) start_line:4 end_line:31
- <!--/codeinclude-->
-
- ## Error Handling Rules
-
- Always chain exceptions: `raise SearchError(...) from e`
- Log errors with context using `structlog`:
-
- ```python
- logger.error("Operation failed", error=str(e), context=value)
- ```
-
- Never silently swallow exceptions
- Provide actionable error messages
-
- ## Logging
-
- Use `structlog` for all logging (NOT `print` or `logging`)
- Import: `import structlog; logger = structlog.get_logger()`
- Log with structured data: `logger.info("event", key=value)`
- Use appropriate levels: DEBUG, INFO, WARNING, ERROR
-
- ## Logging Examples
-
- ```python
- logger.info("Starting search", query=query, tools=[t.name for t in tools])
- logger.warning("Search tool failed", tool=tool.name, error=str(result))
- logger.error("Assessment failed", error=str(e))
- ```
-
- ## Error Chaining
-
- Always preserve exception context:
-
- ```python
- try:
-     result = await api_call()
- except httpx.HTTPError as e:
-     raise SearchError(f"API call failed: {e}") from e
- ```
-
- ## See Also
-
- [Code Style](code-style.md) - Code style guidelines
- [Testing](testing.md) - Testing guidelines
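The `codeinclude` body for the exception hierarchy is not part of this diff, so here is a minimal sketch of the hierarchy-plus-chaining rules described above (class and function names are illustrative, not the real `src/utils/exceptions.py`):

```python
class DeterminatorError(Exception):
    """Base class for application errors (illustrative name)."""


class SearchError(DeterminatorError):
    """Raised when a search backend fails."""


def parse_result_count(raw: str) -> int:
    """Parse a count from an API response, chaining the original error."""
    try:
        return int(raw)
    except ValueError as e:
        # `from e` preserves the original traceback as __cause__,
        # so the root cause survives into logs and debuggers.
        raise SearchError(f"Invalid result count: {raw!r}") from e
```

Callers can catch `DeterminatorError` to handle any application failure, while `err.__cause__` still points at the underlying `ValueError`.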
docs/contributing/implementation-patterns.md DELETED
@@ -1,67 +0,0 @@
- # Implementation Patterns
-
- This document outlines common implementation patterns used in The DETERMINATOR.
-
- ## Search Tools
-
- All tools implement the `SearchTool` protocol (`src/tools/base.py`):
-
- Must have a `name` property
- Must implement `async def search(query, max_results) -> list[Evidence]`
- Use the `@retry` decorator from `tenacity` for resilience
- Rate limiting: Implement `_rate_limit()` for APIs with limits (e.g., PubMed)
- Error handling: Raise `SearchError` or `RateLimitError` on failures
-
- Example pattern:
-
- ```python
- class MySearchTool:
-     @property
-     def name(self) -> str:
-         return "mytool"
-
-     @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))
-     async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
-         # Implementation
-         return evidence_list
- ```
-
- ## Judge Handlers
-
- Implement `JudgeHandlerProtocol` (`async def assess(question, evidence) -> JudgeAssessment`)
- Use pydantic-ai `Agent` with `output_type=JudgeAssessment`
- System prompts live in `src/prompts/judge.py`
- Support fallback handlers: `MockJudgeHandler`, `HFInferenceJudgeHandler`
- Always return a valid `JudgeAssessment` (never raise exceptions)
-
- ## Agent Factory Pattern
-
- Use factory functions for creating agents (`src/agent_factory/`)
- Lazy initialization for optional dependencies (e.g., embeddings, Modal)
- Check requirements before initialization:
-
- <!--codeinclude-->
- [Check Magentic Requirements](../src/utils/llm_factory.py) start_line:152 end_line:170
- <!--/codeinclude-->
-
- ## State Management
-
- **Magentic Mode**: Use `ContextVar` for thread-safe state (`src/agents/state.py`)
- **Simple Mode**: Pass state via function parameters
- Never use global mutable state (except singletons via `@lru_cache`)
-
- ## Singleton Pattern
-
- Use `@lru_cache(maxsize=1)` for singletons:
-
- <!--codeinclude-->
- [Singleton Pattern Example](../src/services/statistical_analyzer.py) start_line:252 end_line:255
- <!--/codeinclude-->
-
- Lazy initialization avoids requiring dependencies at import time
-
- ## See Also
-
- [Code Style](code-style.md) - Code style guidelines
- [Error Handling](error-handling.md) - Error handling guidelines
-
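The singleton `codeinclude` resolves against the source tree, which this diff does not show. A stand-alone sketch of the `@lru_cache(maxsize=1)` idiom it references (the `StatisticalAnalyzer` stand-in is hypothetical):

```python
from functools import lru_cache


class StatisticalAnalyzer:
    """Stand-in for a service whose construction is expensive."""

    def __init__(self) -> None:
        self.ready = True


@lru_cache(maxsize=1)
def get_statistical_analyzer() -> StatisticalAnalyzer:
    # Constructed on the first call only; every later call returns
    # the same cached instance.
    return StatisticalAnalyzer()
```

Because construction happens inside the factory rather than at module level, optional dependencies are only required when the service is first used, not at import time.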
docs/contributing/index.md DELETED
@@ -1,254 +0,0 @@
- # Contributing to The DETERMINATOR
-
- Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started.
-
- > **Note on Project Names**: "The DETERMINATOR" is the product name, "DeepCritical" is the organization/project name, and "determinator" is the Python package name.
-
- ## Git Workflow
-
- `main`: Production-ready (GitHub)
- `dev`: Development integration (GitHub)
- Use feature branches: `yourname-dev`
- **NEVER** push directly to `main` or `dev` on HuggingFace
- GitHub is the source of truth; HuggingFace is for deployment
-
- ## Repository Information
-
- **GitHub Repository**: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo) (source of truth, PRs, code review)
- **HuggingFace Space**: [`DataQuests/DeepCritical`](https://huggingface.co/spaces/DataQuests/DeepCritical) (deployment/demo)
- **Package Name**: `determinator` (Python package name in `pyproject.toml`)
-
- ### Dual Repository Setup
-
- This project uses a dual repository setup:
-
- **GitHub (`DeepCritical/GradioDemo`)**: Source of truth for code, PRs, and code review
- **HuggingFace (`DataQuests/DeepCritical`)**: Deployment target for the Gradio demo
-
- #### Remote Configuration
-
- When cloning, set up remotes as follows:
-
- ```bash
- # Clone from GitHub
- git clone https://github.com/DeepCritical/GradioDemo.git
- cd GradioDemo
-
- # Add HuggingFace remote (optional, for deployment)
- git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical
- ```
-
- **Important**: Never push directly to `main` or `dev` on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.
-
- ## Package Manager
-
- This project uses [`uv`](https://github.com/astral-sh/uv) as the package manager. All commands should be prefixed with `uv run` to ensure they run in the correct environment.
-
- ### Installation
-
- ```bash
- # Install uv if you haven't already (recommended: standalone installer)
- # Unix/macOS/Linux:
- curl -LsSf https://astral.sh/uv/install.sh | sh
-
- # Windows (PowerShell):
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
- # Alternative: pipx install uv
- # Or: pip install uv
-
- # Sync all dependencies including dev extras
- uv sync --all-extras
-
- # Install pre-commit hooks
- uv run pre-commit install
- ```
-
- ## Development Commands
-
- ```bash
- # Installation
- uv sync --all-extras  # Install all dependencies including dev
- uv run pre-commit install  # Install pre-commit hooks
-
- # Code Quality Checks (run all before committing)
- uv run ruff check src tests  # Lint with ruff
- uv run ruff format src tests  # Format with ruff
- uv run mypy src  # Type checking
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with coverage
-
- # Testing Commands
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire  # Run unit tests (excludes OpenAI tests)
- uv run pytest tests/ -v -m "huggingface" -p no:logfire  # Run HuggingFace tests
- uv run pytest tests/ -v -p no:logfire  # Run all tests
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with terminal coverage
- uv run pytest --cov=src --cov-report=html -p no:logfire  # Generate HTML coverage report (opens htmlcov/index.html)
-
- # Documentation Commands
- uv run mkdocs build  # Build documentation
- uv run mkdocs serve  # Serve documentation locally (http://127.0.0.1:8000)
- ```
-
- ### Test Markers
-
- The project uses pytest markers to categorize tests. See [Testing Guidelines](testing.md) for details:
-
- `unit`: Unit tests (mocked, fast)
- `integration`: Integration tests (real APIs)
- `slow`: Slow tests
- `openai`: Tests requiring an OpenAI API key
- `huggingface`: Tests requiring a HuggingFace API key
- `embedding_provider`: Tests requiring API-based embedding providers
- `local_embeddings`: Tests using local embeddings
-
- **Note**: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.
-
- ## Getting Started
-
- 1. **Fork the repository** on GitHub: [`DeepCritical/GradioDemo`](https://github.com/DeepCritical/GradioDemo)
-
- 2. **Clone your fork**:
-
- ```bash
- git clone https://github.com/yourusername/GradioDemo.git
- cd GradioDemo
- ```
-
- 3. **Install dependencies**:
-
- ```bash
- uv sync --all-extras
- uv run pre-commit install
- ```
-
- 4. **Create a feature branch**:
-
- ```bash
- git checkout -b yourname-feature-name
- ```
-
- 5. **Make your changes** following the guidelines below
-
- 6. **Run checks**:
-
- ```bash
- uv run ruff check src tests
- uv run mypy src
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
- ```
-
- 7. **Commit and push**:
-
- ```bash
- git commit -m "Description of changes"
- git push origin yourname-feature-name
- ```
-
- 8. **Create a pull request** on GitHub
-
- ## Development Guidelines
-
- ### Code Style
-
- Follow [Code Style Guidelines](code-style.md)
- All code must pass `mypy --strict`
- Use `ruff` for linting and formatting
- Line length: 100 characters
-
- ### Error Handling
-
- Follow [Error Handling Guidelines](error-handling.md)
- Always chain exceptions: `raise SearchError(...) from e`
- Use structured logging with `structlog`
- Never silently swallow exceptions
-
- ### Testing
-
- Follow [Testing Guidelines](testing.md)
- Write tests before implementation (TDD)
- Aim for >80% coverage on critical paths
- Use markers: `unit`, `integration`, `slow`
-
- ### Implementation Patterns
-
- Follow [Implementation Patterns](implementation-patterns.md)
- Use factory functions for agent/tool creation
- Implement protocols for extensibility
- Use the singleton pattern with `@lru_cache(maxsize=1)`
-
- ### Prompt Engineering
-
- Follow [Prompt Engineering Guidelines](prompt-engineering.md)
- Always validate citations
- Use diverse evidence selection
- Never trust LLM-generated citations without validation
-
- ### Code Quality
-
- Follow [Code Quality Guidelines](code-quality.md)
- Google-style docstrings for all public functions
- Explain WHY, not WHAT in comments
- Mark critical sections: `# CRITICAL: ...`
-
- ## MCP Integration
-
- ### MCP Tools
-
- Functions in `src/mcp_tools.py` for Claude Desktop
- Full type hints required
- Google-style docstrings with Args/Returns sections
- Formatted string returns (markdown)
-
- ### Gradio MCP Server
-
- Enable with `mcp_server=True` in `demo.launch()`
- Endpoint: `/gradio_api/mcp/`
- Use `ssr_mode=False` to fix hydration issues in HF Spaces
-
- ## Common Pitfalls
-
- 1. **Blocking the event loop**: Never use sync I/O in async functions
- 2. **Missing type hints**: All functions must have complete type annotations
- 3. **Hallucinated citations**: Always validate references
- 4. **Global mutable state**: Use ContextVar or pass via parameters
- 5. **Import errors**: Lazy-load optional dependencies (magentic, modal, embeddings)
- 6. **Rate limiting**: Always implement for external APIs
- 7. **Error chaining**: Always use `from e` when raising exceptions
-
- ## Key Principles
-
- 1. **Type Safety First**: All code must pass `mypy --strict`
- 2. **Async Everything**: All I/O must be async
- 3. **Test-Driven**: Write tests before implementation
- 4. **No Hallucinations**: Validate all citations
- 5. **Graceful Degradation**: Support free tier (HF Inference) when no API keys
- 6. **Lazy Loading**: Don't require optional dependencies at import time
- 7. **Structured Logging**: Use structlog, never print()
- 8. **Error Chaining**: Always preserve exception context
-
- ## Pull Request Process
-
- 1. Ensure all checks pass: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
- 2. Update documentation if needed
- 3. Add tests for new features
- 4. Update CHANGELOG if applicable
- 5. Request review from maintainers
- 6. Address review feedback
- 7. Wait for approval before merging
-
- ## Project Structure
-
- `src/`: Main source code
- `tests/`: Test files (`unit/` and `integration/`)
- `docs/`: Documentation source files (MkDocs)
- `examples/`: Example usage scripts
- `pyproject.toml`: Project configuration and dependencies
- `.pre-commit-config.yaml`: Pre-commit hook configuration
-
- ## Questions?
-
- Open an issue on [GitHub](https://github.com/DeepCritical/GradioDemo)
- Check existing [documentation](https://deepcritical.github.io/GradioDemo/)
- Review code examples in the codebase
-
- Thank you for contributing to The DETERMINATOR!
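The MCP tool requirements above (full type hints, Google-style docstring with Args/Returns, markdown string return) can be sketched as follows. `search_pubmed` here is a stub that shows the required shape, not the real implementation in `src/mcp_tools.py`:

```python
async def search_pubmed(query: str, max_results: int = 5) -> str:
    """Search PubMed and return results as markdown.

    Args:
        query: Free-text search query.
        max_results: Maximum number of results to include.

    Returns:
        A markdown-formatted summary string.
    """
    # A real tool would call the PubMed API here; this stub only
    # demonstrates the signature and return convention.
    return f"## PubMed results for {query!r} (top {max_results})"
```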
docs/contributing/prompt-engineering.md DELETED
@@ -1,55 +0,0 @@
- # Prompt Engineering & Citation Validation
-
- This document outlines prompt engineering guidelines and citation validation rules.
-
- ## Judge Prompts
-
- System prompt in `src/prompts/judge.py`
- Format evidence with truncation (1500 chars per item)
- Handle the empty-evidence case separately
- Always request structured JSON output
- Use the `format_user_prompt()` and `format_empty_evidence_prompt()` helpers
-
- ## Hypothesis Prompts
-
- Use diverse evidence selection (MMR algorithm)
- Sentence-aware truncation (`truncate_at_sentence()`)
- Format: Drug → Target → Pathway → Effect
- System prompt emphasizes mechanistic reasoning
- Use `format_hypothesis_prompt()` with embeddings for diversity
-
- ## Report Prompts
-
- Include full citation details for validation
- Use diverse evidence selection (n=20)
- **CRITICAL**: Emphasize citation validation rules
- Format hypotheses with support/contradiction counts
- System prompt includes explicit JSON structure requirements
-
- ## Citation Validation
-
- **ALWAYS** validate references before returning reports
- Use `validate_references()` from `src/utils/citation_validator.py`
- Remove hallucinated citations (URLs not in evidence)
- Log warnings for removed citations
- Never trust LLM-generated citations without validation
-
- ## Citation Validation Rules
-
- 1. Every reference URL must EXACTLY match a provided evidence URL
- 2. Do NOT invent, fabricate, or hallucinate any references
- 3. Do NOT modify paper titles, authors, dates, or URLs
- 4. If unsure about a citation, OMIT it rather than guess
- 5. Copy URLs exactly as provided - do not create similar-looking URLs
-
- ## Evidence Selection
-
- Use `select_diverse_evidence()` for MMR-based selection
- Balance relevance vs. diversity (lambda=0.7 default)
- Sentence-aware truncation preserves meaning
- Limit evidence per prompt to avoid context overflow
-
- ## See Also
-
- [Code Quality](code-quality.md) - Code quality guidelines
- [Error Handling](error-handling.md) - Error handling guidelines
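The actual `select_diverse_evidence()` implementation is not shown in this diff; the following is a greedy MMR sketch using the documented default of lambda=0.7 (the signature and the score/similarity inputs are simplifying assumptions):

```python
def select_diverse_evidence(
    relevance: list[float],
    similarity: list[list[float]],
    n: int,
    lambda_: float = 0.7,
) -> list[int]:
    """Greedy MMR: trade relevance off against redundancy with picked items."""
    selected: list[int] = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < n:
        def mmr(i: int) -> float:
            # Redundancy is the max similarity to anything already selected.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lambda_ * relevance[i] - (1 - lambda_) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lambda close to 1 the selection is pure relevance ranking; lowering it increasingly penalizes near-duplicate evidence items.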
docs/contributing/testing.md DELETED
@@ -1,115 +0,0 @@
- # Testing Requirements
-
- This document outlines testing requirements and guidelines for The DETERMINATOR.
-
- ## Test Structure
-
- Unit tests in `tests/unit/` (mocked, fast)
- Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`)
- Use markers: `unit`, `integration`, `slow`, `openai`, `huggingface`, `embedding_provider`, `local_embeddings`
-
- ## Test Markers
-
- The project uses pytest markers to categorize tests. These markers are defined in `pyproject.toml`:
-
- `@pytest.mark.unit`: Unit tests (mocked, fast) - Run with `-m "unit"`
- `@pytest.mark.integration`: Integration tests (real APIs) - Run with `-m "integration"`
- `@pytest.mark.slow`: Slow tests - Run with `-m "slow"`
- `@pytest.mark.openai`: Tests requiring an OpenAI API key - Run with `-m "openai"` or exclude with `-m "not openai"`
- `@pytest.mark.huggingface`: Tests requiring a HuggingFace API key or using HuggingFace models - Run with `-m "huggingface"`
- `@pytest.mark.embedding_provider`: Tests requiring API-based embedding providers (OpenAI, etc.) - Run with `-m "embedding_provider"`
- `@pytest.mark.local_embeddings`: Tests using local embeddings (sentence-transformers, ChromaDB) - Run with `-m "local_embeddings"`
-
- ### Running Tests by Marker
-
- ```bash
- # Run only unit tests (excludes OpenAI tests by default)
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire
-
- # Run HuggingFace tests
- uv run pytest tests/ -v -m "huggingface" -p no:logfire
-
- # Run all tests
- uv run pytest tests/ -v -p no:logfire
-
- # Run only local embedding tests
- uv run pytest tests/ -v -m "local_embeddings" -p no:logfire
-
- # Exclude slow tests
- uv run pytest tests/ -v -m "not slow" -p no:logfire
- ```
-
- **Note**: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.
-
- ## Mocking
-
- Use `respx` for httpx mocking
- Use `pytest-mock` for general mocking
- Mock LLM calls in unit tests (use `MockJudgeHandler`)
- Fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`
-
- ## TDD Workflow
-
- 1. Write a failing test in `tests/unit/`
- 2. Implement in `src/`
- 3. Ensure the test passes
- 4. Run checks: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
-
- ### Test Command Examples
-
- ```bash
- # Run unit tests (default, excludes OpenAI tests)
- uv run pytest tests/unit/ -v -m "not openai" -p no:logfire
-
- # Run HuggingFace tests
- uv run pytest tests/ -v -m "huggingface" -p no:logfire
-
- # Run all tests
- uv run pytest tests/ -v -p no:logfire
- ```
-
- ## Test Examples
-
- ```python
- @pytest.mark.unit
- async def test_pubmed_search(mock_httpx_client):
-     tool = PubMedTool()
-     results = await tool.search("metformin", max_results=5)
-     assert len(results) > 0
-     assert all(isinstance(r, Evidence) for r in results)
-
- @pytest.mark.integration
- async def test_real_pubmed_search():
-     tool = PubMedTool()
-     results = await tool.search("metformin", max_results=3)
-     assert len(results) <= 3
- ```
-
- ## Test Coverage
-
- ### Terminal Coverage Report
-
- ```bash
- uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
- ```
-
- This shows coverage with missing lines highlighted in the terminal output.
-
- ### HTML Coverage Report
-
- ```bash
- uv run pytest --cov=src --cov-report=html -p no:logfire
- ```
-
- This generates an HTML coverage report in `htmlcov/index.html`. Open this file in your browser to see detailed coverage information.
-
- ### Coverage Goals
-
- Aim for >80% coverage on critical paths
- Exclude: `__init__.py`, `TYPE_CHECKING` blocks
- Coverage configuration is in `pyproject.toml` under `[tool.coverage.*]`
-
- ## See Also
-
- [Code Style](code-style.md) - Code style guidelines
- [Implementation Patterns](implementation-patterns.md) - Common patterns
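The mocking guidance above can be sketched with the stdlib's `AsyncMock` standing in for a real `SearchTool` (in the project itself, `respx` would mock the HTTP layer instead; the names below are illustrative):

```python
import asyncio
from unittest.mock import AsyncMock


async def run_search(tool) -> list[str]:
    """Code under test: delegates to the tool's async search method."""
    return await tool.search("metformin", max_results=2)


def test_run_search_with_mock() -> None:
    # AsyncMock's attributes are themselves async mocks, so awaiting
    # tool.search(...) returns the configured return_value.
    tool = AsyncMock()
    tool.search.return_value = ["evidence-1", "evidence-2"]
    results = asyncio.run(run_search(tool))
    assert results == ["evidence-1", "evidence-2"]
    # Verify the exact call the production code made.
    tool.search.assert_awaited_once_with("metformin", max_results=2)
```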
docs/getting-started/examples.md DELETED
@@ -1,198 +0,0 @@
- # Examples
-
- This page provides examples of using The DETERMINATOR for various research tasks.
-
- ## Basic Research Query
-
- ### Example 1: Drug Information
-
- **Query**:
- ```
- What are the latest treatments for Alzheimer's disease?
- ```
-
- **What The DETERMINATOR Does**:
- 1. Searches PubMed for recent papers
- 2. Searches ClinicalTrials.gov for active trials
- 3. Evaluates evidence quality
- 4. Synthesizes findings into a comprehensive report
-
- ### Example 2: Clinical Trial Search
-
- **Query**:
- ```
- What clinical trials are investigating metformin for cancer prevention?
- ```
-
- **What The DETERMINATOR Does**:
-
- 1. Searches ClinicalTrials.gov for relevant trials
- 2. Searches PubMed for supporting literature
- 3. Provides trial details and status
- 4. Summarizes findings
-
- ## Advanced Research Queries
-
- ### Example 3: Comprehensive Review
-
- **Query**:
-
- ```
- Review the evidence for using metformin as an anti-aging intervention,
- including clinical trials, mechanisms of action, and safety profile.
- ```
-
- **What The DETERMINATOR Does**:
- 1. Uses deep research mode (multi-section)
- 2. Searches multiple sources in parallel
- 3. Generates sections on:
-    - Clinical trials
-    - Mechanisms of action
-    - Safety profile
- 4. Synthesizes comprehensive report
-
- ### Example 4: Hypothesis Testing
-
- **Query**:
- ```
- Test the hypothesis that regular exercise reduces Alzheimer's disease risk.
- ```
-
- **What The DETERMINATOR Does**:
- 1. Generates testable hypotheses
- 2. Searches for supporting/contradicting evidence
- 3. Performs statistical analysis (if Modal configured)
- 4. Provides verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
-
- ## MCP Tool Examples
-
- ### Using search_pubmed
-
- ```
- Search PubMed for "CRISPR gene editing cancer therapy"
- ```
-
- ### Using search_clinical_trials
-
- ```
- Find active clinical trials for "diabetes type 2 treatment"
- ```
-
- ### Using search_all
-
- ```
- Search all sources for "COVID-19 vaccine side effects"
- ```
-
- ### Using analyze_hypothesis
-
- ```
- Analyze whether vitamin D supplementation reduces COVID-19 severity
- ```
-
- ## Code Examples
-
- ### Python API Usage
-
- ```python
- from src.orchestrator_factory import create_orchestrator
- from src.tools.search_handler import SearchHandler
- from src.agent_factory.judges import create_judge_handler
-
- # Create orchestrator
- search_handler = SearchHandler()
- judge_handler = create_judge_handler()
- ```
-
- <!--codeinclude-->
- [Create Orchestrator](../src/orchestrator_factory.py) start_line:44 end_line:66
- <!--/codeinclude-->
-
- ```python
- # Run research query
- query = "What are the latest treatments for Alzheimer's disease?"
- async for event in orchestrator.run(query):
-     print(f"Event: {event.type} - {event.data}")
- ```
-
- ### Gradio UI Integration
-
- ```python
- import gradio as gr
- from src.app import create_research_interface
-
- # Create interface
- interface = create_research_interface()
-
- # Launch
- interface.launch(server_name="0.0.0.0", server_port=7860)
- ```
-
- ## Research Patterns
-
- ### Iterative Research
-
- Single-loop research with search-judge-synthesize cycles:
-
- ```python
- from src.orchestrator.research_flow import IterativeResearchFlow
- ```
-
- <!--codeinclude-->
- [IterativeResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:56 end_line:77
- <!--/codeinclude-->
-
- ```python
- async for event in flow.run(query):
-     # Handle events
-     pass
- ```
-
- ### Deep Research
-
- Multi-section parallel research:
-
- ```python
- from src.orchestrator.research_flow import DeepResearchFlow
- ```
-
- <!--codeinclude-->
- [DeepResearchFlow Initialization](../src/orchestrator/research_flow.py) start_line:674 end_line:697
- <!--/codeinclude-->
-
- ```python
- async for event in flow.run(query):
-     # Handle events
-     pass
- ```
-
- ## Configuration Examples
-
- ### Basic Configuration
-
- ```bash
- # .env file
- LLM_PROVIDER=openai
- OPENAI_API_KEY=your_key_here
- MAX_ITERATIONS=10
- ```
-
- ### Advanced Configuration
-
- ```bash
- # .env file
- LLM_PROVIDER=anthropic
- ANTHROPIC_API_KEY=your_key_here
- EMBEDDING_PROVIDER=local
- WEB_SEARCH_PROVIDER=duckduckgo
- MAX_ITERATIONS=20
- DEFAULT_TOKEN_LIMIT=200000
- USE_GRAPH_EXECUTION=true
- ```
-
- ## Next Steps
-
- Read the [Configuration Guide](../configuration/index.md) for all options
- Explore the [Architecture Documentation](../architecture/graph_orchestration.md)
- Check out the [API Reference](../api/agents.md) for programmatic usage
-
docs/getting-started/installation.md DELETED
@@ -1,152 +0,0 @@
- # Installation
-
- This guide will help you install and set up DeepCritical on your system.
-
- ## Prerequisites
-
- Python 3.11 or higher
- `uv` package manager (recommended) or `pip`
- At least one LLM API key (OpenAI, Anthropic, or HuggingFace)
-
- ## Installation Steps
-
- ### 1. Install uv (Recommended)
-
- `uv` is a fast Python package installer and resolver. Install it using the standalone installer (recommended):
-
- **Unix/macOS/Linux:**
- ```bash
- curl -LsSf https://astral.sh/uv/install.sh | sh
- ```
-
- **Windows (PowerShell):**
- ```powershell
- powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
- ```
-
- **Alternative methods:**
- ```bash
- # Using pipx (recommended if you have pipx installed)
- pipx install uv
-
- # Or using pip
- pip install uv
- ```
-
- After installation, restart your terminal or add `~/.cargo/bin` to your PATH.
-
- ### 2. Clone the Repository
-
- ```bash
- git clone https://github.com/DeepCritical/GradioDemo.git
- cd GradioDemo
- ```
-
- ### 3. Install Dependencies
-
- Using `uv` (recommended):
-
- ```bash
- uv sync
- ```
-
- Using `pip`:
-
- ```bash
- pip install -e .
- ```
-
- ### 4. Install Optional Dependencies
-
- For embeddings support (local sentence-transformers):
-
- ```bash
- uv sync --extra embeddings
- ```
-
- For Modal sandbox execution:
-
- ```bash
- uv sync --extra modal
- ```
-
- For Magentic orchestration:
-
- ```bash
- uv sync --extra magentic
- ```
-
- Install all extras:
-
- ```bash
- uv sync --all-extras
- ```
-
- ### 5. Configure Environment Variables
-
- Create a `.env` file in the project root:
-
- ```bash
- # Required: At least one LLM provider
- LLM_PROVIDER=openai  # or "anthropic" or "huggingface"
- OPENAI_API_KEY=your_openai_api_key_here
-
- # Optional: Other services
- NCBI_API_KEY=your_ncbi_api_key_here  # For higher PubMed rate limits
- MODAL_TOKEN_ID=your_modal_token_id
- MODAL_TOKEN_SECRET=your_modal_token_secret
- ```
-
- See the [Configuration Guide](../configuration/index.md) for all available options.
-
- ### 6. Verify Installation
-
- Run the application:
-
- ```bash
- uv run python src/app.py
- ```
-
- Open your browser to `http://localhost:7860` to verify the installation.
-
- ## Development Setup
-
- For development, install dev dependencies:
-
- ```bash
- uv sync --all-extras --dev
- ```
-
- Install pre-commit hooks:
-
- ```bash
- uv run pre-commit install
- ```
-
- ## Troubleshooting
-
- ### Common Issues
-
- **Import Errors**:
- Ensure you've installed all required dependencies
- Check that Python 3.11+ is being used
-
- **API Key Errors**:
- Verify your `.env` file is in the project root
- Check that API keys are correctly formatted
- Ensure at least one LLM provider is configured
-
- **Module Not Found**:
- Run `uv sync` or `pip install -e .` again
- Check that you're in the correct virtual environment
-
- **Port Already in Use**:
- Change the port in `src/app.py` or use an environment variable
- Kill the process using port 7860
-
- ## Next Steps
-
- Read the [Quick Start Guide](quick-start.md)
- Learn about [MCP Integration](mcp-integration.md)
- Explore [Examples](examples.md)
-