André Oliveira committed on
Commit
59e6760
·
1 Parent(s): 499d53b

Initial MCP Space push

Files changed (8)
  1. .gitignore +76 -0
  2. LICENSE +19 -0
  3. README.md +70 -10
  4. api.py +347 -0
  5. app.py +50 -0
  6. models.py +133 -0
  7. requirements.txt +6 -0
  8. server.py +7 -0
.gitignore ADDED
@@ -0,0 +1,76 @@
+ # ---- System files ----
+ .DS_Store
+ .idea/
+ .vscode/
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ *.so
+ *.egg
+ *.egg-info/
+ .Python
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ .ipynb_checkpoints/
+
+ # ---- Build / packaging ----
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ *.manifest
+ *.spec
+
+ # ---- Logs and temp ----
+ *.log
+ pip-log.txt
+ pip-delete-this-directory.txt
+ coverage.xml
+ htmlcov/
+ .tox/
+ .nox/
+ .cache/
+ .pytest_cache/
+ .mypy_cache/
+ .dmypy.json
+ .pyre/
+
+ # ---- IDEs ----
+ # Already added:
+ # .idea/
+ # .vscode/
+
+ # ---- Configs ----
+ *.env.local
+ *.env.production
+ *.env.development
+
+ # ---- RAGMint specific ----
+ # Ignore raw datasets and local embeddings
+ data/raw/
+ data/interim/
+ data/tmp/
+ outputs/
+ models/
+ notebooks/
+ data/docs/
+ data/
+
+ # ---- OS ----
+ Thumbs.db
+
+ structure.txt
+ .pypirc
+ leaderboard.jsonl
+ archive
+ experiments
LICENSE ADDED
@@ -0,0 +1,19 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ Copyright 2025 André Oliveira
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README.md CHANGED
@@ -1,13 +1,73 @@
  ---
- title: Ragmint Mcp Server
- emoji: 🐢
- colorFrom: gray
- colorTo: red
- sdk: gradio
- sdk_version: 5.49.1
- app_file: app.py
- pinned: false
- license: apache-2.0
  ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Ragmint MCP HF Space
+
+ This project is a **Ragmint MCP + Gradio Dashboard** designed for Hugging Face Spaces.
+ It allows users to:
+
+ - Optimize RAG pipelines
+ - Run autotune for RAG parameters
+ - Generate QA datasets
+ - Monitor corpus stats and the leaderboard
+
+ The MCP backend handles all computation, and the Gradio frontend communicates with it via HTTP requests.
+
+ ---
+
+ ## Features
+
+ 1. **Health Check** – Confirm the MCP backend is running.
+ 2. **Optimize RAG** – Run RAG optimization using user-defined parameters.
+ 3. **Autotune RAG** – Automatically tune chunk sizes, overlaps, and embedding models.
+ 4. **Generate QA** – Generate validation QA sets dynamically using an LLM.
+
+ ---
+
+ ## Usage
+
+ ### MCP Server (backend)
+
+ Install dependencies and start the MCP server:
+
+ ```bash
+ pip install -r requirements.txt
+ python api.py
+ ```
+
+ The server runs on `http://127.0.0.1:8000`.
+
+ ### Gradio Dashboard (frontend)
+
+ Install dependencies (if not already installed):
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ Launch the Gradio frontend:
+
+ ```bash
+ python app.py
+ ```
+
+ The dashboard runs on `http://127.0.0.1:7860`.
+
  ---
+ ## File Structure
+ ```
+ .
+ ├── app.py            # Gradio frontend
+ ├── api.py            # MCP server (FastAPI)
+ ├── server.py         # Starts the API in a background thread
+ ├── models.py         # Pydantic models
+ ├── README.md
+ ├── requirements.txt
+ └── data/docs         # Example documents and QA sets
+ ```
  ---
+ ## License
+
+ Apache 2.0
 
+ <p align="center">
+ <sub>Built with ❤️ by <a href="https://andyolivers.com">André Oliveira</a> | Apache 2.0 License</sub>
+ </p>
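The endpoints accept JSON payloads that mirror the Pydantic models in `models.py`. As a minimal sketch (the values below are illustrative and simply echo the model defaults), a client could build and send an `OptimizeRequest` payload like this:

```python
import json

# Hypothetical example payload for POST /optimize_rag; field names follow
# the OptimizeRequest model in models.py, values are illustrative defaults.
payload = {
    "docs_path": "data/docs",
    "retriever": ["faiss"],
    "embedding_model": ["sentence-transformers/all-MiniLM-L6-v2"],
    "chunk_sizes": [200, 400],
    "overlaps": [50, 100],
    "search_type": "grid",
    "trials": 5,
    "metric": "faithfulness",
    "validation_choice": "generate",
}

body = json.dumps(payload)
print(body)
# Sending it would look like:
# requests.post("http://127.0.0.1:8000/optimize_rag", json=payload)
```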
api.py ADDED
@@ -0,0 +1,347 @@
+ from __future__ import annotations
+ import os
+ import json
+ import logging
+ import time
+
+ from models import OptimizeRequest, QARequest, AutotuneRequest
+ from fastapi import FastAPI, HTTPException
+ from fastapi.middleware.cors import CORSMiddleware
+ import uvicorn
+
+ try:
+     from ragmint.autotuner import AutoRAGTuner
+     from ragmint.qa_generator import generate_validation_qa
+     from ragmint.explainer import explain_results
+     from ragmint.leaderboard import Leaderboard
+     from ragmint.tuner import RAGMint
+ except Exception as e:
+     AutoRAGTuner = None
+     generate_validation_qa = None
+     explain_results = None
+     Leaderboard = None
+     RAGMint = None
+     _import_error = e
+ else:
+     _import_error = None
+
+ from dotenv import load_dotenv
+ load_dotenv()
+
+ # Logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger("ragmint_mcp_server")
+
+ # FastAPI
+ app = FastAPI(title="Ragmint MCP Server", version="0.1.0")
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ DEFAULT_DATA_DIR = "data/docs"
+ LEADERBOARD_STORAGE = "experiments/leaderboard.jsonl"
+ os.makedirs("experiments", exist_ok=True)
+
+
+ @app.get("/health")
+ def health():
+     return {
+         "status": "ok",
+         "ragmint_imported": _import_error is None,
+         "import_error": str(_import_error) if _import_error else None,
+     }
+
+
+ @app.post("/optimize_rag")
+ def optimize_rag(req: OptimizeRequest):
+     logger.info("Received optimize_rag request: %s", req.json())
+
+     if RAGMint is None:
+         raise HTTPException(
+             status_code=500,
+             detail=f"Ragmint imports failed or RAGMint unavailable: {_import_error}"
+         )
+
+     docs_path = req.docs_path or DEFAULT_DATA_DIR
+     if not os.path.isdir(docs_path):
+         raise HTTPException(status_code=400, detail=f"docs_path does not exist: {docs_path}")
+
+     try:
+         # Build RAGMint exactly from the request
+         rag = RAGMint(
+             docs_path=docs_path,
+             retrievers=req.retriever,
+             embeddings=req.embedding_model,
+             rerankers=(req.rerankers or ["mmr"]),
+             chunk_sizes=req.chunk_sizes,
+             overlaps=req.overlaps,
+             strategies=req.strategy,
+         )
+
+         # Validation selection
+         validation_set = None
+         validation_choice = (req.validation_choice or "").strip()
+         default_val_path = os.path.join(docs_path, "validation_qa.json")
+
+         # Auto
+         if not validation_choice:
+             if os.path.exists(default_val_path):
+                 validation_set = default_val_path
+                 logger.info("Using default validation set: %s", validation_set)
+             else:
+                 logger.warning("No validation_choice provided and no default found.")
+                 validation_set = None
+
+         # Remote HF dataset
+         elif "/" in validation_choice and not os.path.exists(validation_choice):
+             validation_set = validation_choice
+             logger.info("Using Hugging Face validation dataset: %s", validation_set)
+
+         # Local file
+         elif os.path.exists(validation_choice):
+             validation_set = validation_choice
+             logger.info("Using local validation dataset: %s", validation_set)
+
+         # Generate
+         elif validation_choice.lower() == "generate":
+             try:
+                 gen_path = os.path.join(docs_path, "validation_qa.json")
+                 generate_validation_qa(
+                     docs_path=docs_path,
+                     output_path=gen_path,
+                     llm_model=req.llm_model,
+                 )
+                 validation_set = gen_path
+                 logger.info("Generated new validation QA set at: %s", validation_set)
+             except Exception as e:
+                 logger.exception("Failed to generate validation QA dataset: %s", e)
+                 raise HTTPException(status_code=500, detail=f"Failed to generate validation QA dataset: {e}")
+
+         # Optimize
+         start_time = time.time()
+         best, results = rag.optimize(
+             validation_set=validation_set,
+             metric=req.metric,
+             trials=req.trials,
+             search_type=req.search_type,
+         )
+         elapsed = time.time() - start_time
+
+         run_id = f"opt_{int(time.time())}"
+
+         # Corpus stats
+         try:
+             corpus_stats = {
+                 "num_docs": len(rag.documents),
+                 "avg_len": sum(len(d.split()) for d in rag.documents) / max(1, len(rag.documents)),
+                 "corpus_size": sum(len(d) for d in rag.documents),
+             }
+         except Exception:
+             corpus_stats = None
+
+         # Leaderboard
+         try:
+             if Leaderboard:
+                 lb = Leaderboard()
+                 lb.upload(
+                     run_id=run_id,
+                     best_config=best,
+                     best_score=best.get("faithfulness", best.get("score", 0.0)),
+                     all_results=results,
+                     documents=os.listdir(docs_path),
+                     model=best.get("embedding_model", req.embedding_model),
+                     corpus_stats=corpus_stats,
+                 )
+         except Exception:
+             logger.exception("Leaderboard persistence failed for optimize_rag")
+
+         return {
+             "status": "finished",
+             "run_id": run_id,
+             "elapsed_seconds": elapsed,
+             "best_config": best,
+             "results": results,
+             "corpus_stats": corpus_stats,
+         }
+
+     except Exception as exc:
+         logger.exception("optimize_rag failed")
+         raise HTTPException(status_code=500, detail=str(exc))
+
+
+ @app.post("/autotune_rag")
+ def autotune_rag(req: AutotuneRequest):
+     logger.info("Received autotune_rag request: %s", req.json())
+
+     if AutoRAGTuner is None or RAGMint is None:
+         raise HTTPException(
+             status_code=500,
+             detail=f"Ragmint autotuner/RAGMint imports failed: {_import_error}"
+         )
+
+     docs_path = req.docs_path or DEFAULT_DATA_DIR
+     if not os.path.isdir(docs_path):
+         raise HTTPException(status_code=400, detail=f"docs_path does not exist: {docs_path}")
+
+     try:
+         start_time = time.time()
+
+         tuner = AutoRAGTuner(docs_path=docs_path)
+         rec = tuner.recommend(
+             embedding_model=req.embedding_model,
+             num_chunk_pairs=req.num_chunk_pairs,
+         )
+
+         chunk_candidates = tuner.suggest_chunk_sizes(
+             model_name=rec.get("embedding_model"),
+             num_pairs=int(req.num_chunk_pairs),
+             step=20,
+         )
+
+         chunk_sizes = sorted({c for c, _ in chunk_candidates})
+         overlaps = sorted({o for _, o in chunk_candidates})
+
+         rag = RAGMint(
+             docs_path=docs_path,
+             retrievers=[rec["retriever"]],
+             embeddings=[rec["embedding_model"]],
+             rerankers=["mmr"],
+             chunk_sizes=chunk_sizes,
+             overlaps=overlaps,
+             strategies=[rec["strategy"]],
+         )
+
+         # Validation selection
+         validation_set = None
+         validation_choice = (req.validation_choice or "").strip()
+         default_val_path = os.path.join(docs_path, "validation_qa.jsonl")
+
+         if not validation_choice:
+             if os.path.exists(default_val_path):
+                 validation_set = default_val_path
+                 logger.info("Using default validation set: %s", validation_set)
+             else:
+                 logger.warning("No validation_choice provided and no default found.")
+                 validation_set = None
+
+         elif "/" in validation_choice and not os.path.exists(validation_choice):
+             validation_set = validation_choice
+
+         elif os.path.exists(validation_choice):
+             validation_set = validation_choice
+
+         elif validation_choice.lower() == "generate":
+             try:
+                 gen_path = os.path.join(docs_path, "validation_qa.json")
+                 generate_validation_qa(
+                     docs_path=docs_path,
+                     output_path=gen_path,
+                     llm_model=req.llm_model,
+                 )
+                 validation_set = gen_path
+             except Exception as e:
+                 logger.exception("Failed to generate validation QA dataset: %s", e)
+                 raise HTTPException(status_code=500, detail=f"Failed to generate validation QA dataset: {e}")
+
+         # Full optimize
+         best, results = rag.optimize(
+             validation_set=validation_set,
+             metric=req.metric,
+             search_type=req.search_type,
+             trials=req.trials,
+         )
+         elapsed = time.time() - start_time
+
+         run_id = f"autotune_{int(time.time())}"
+
+         # Corpus stats
+         try:
+             corpus_stats = {
+                 "num_docs": len(rag.documents),
+                 "avg_len": sum(len(d.split()) for d in rag.documents) / max(1, len(rag.documents)),
+                 "corpus_size": sum(len(d) for d in rag.documents),
+             }
+         except Exception:
+             corpus_stats = None
+
+         # Leaderboard
+         try:
+             if Leaderboard:
+                 lb = Leaderboard()
+                 lb.upload(
+                     run_id=run_id,
+                     best_config=best,
+                     best_score=best.get("faithfulness", best.get("score", 0.0)),
+                     all_results=results,
+                     documents=os.listdir(docs_path),
+                     model=best.get("embedding_model", rec.get("embedding_model")),
+                     corpus_stats=corpus_stats,
+                 )
+         except Exception:
+             logger.exception("Leaderboard persistence failed for autotune_rag")
+
+         return {
+             "status": "finished",
+             "run_id": run_id,
+             "elapsed_seconds": elapsed,
+             "recommendation": rec,
+             "chunk_candidates": chunk_candidates,
+             "best_config": best,
+             "results": results,
+             "corpus_stats": corpus_stats,
+         }
+
+     except Exception as exc:
+         logger.exception("autotune_rag failed")
+         raise HTTPException(status_code=500, detail=str(exc))
+
+
+ @app.post("/generate_validation_qa")
+ def generate_qa(req: QARequest):
+     logger.info("Received generate_validation_qa request: %s", req.json())
+
+     if generate_validation_qa is None:
+         raise HTTPException(status_code=500, detail=f"Ragmint imports failed: {_import_error}")
+
+     try:
+         out_path = "data/docs/validation_qa.json"
+         os.makedirs(os.path.dirname(out_path), exist_ok=True)
+
+         generate_validation_qa(
+             docs_path=req.docs_path,
+             output_path=out_path,
+             llm_model=req.llm_model,
+             batch_size=req.batch_size,
+             min_q=req.min_q,
+             max_q=req.max_q,
+         )
+
+         with open(out_path, "r", encoding="utf-8") as f:
+             data = json.load(f)
+
+         return {
+             "status": "finished",
+             "output_path": out_path,
+             "preview_count": len(data),
+             "sample": data[:5],
+         }
+
+     except Exception as exc:
+         logger.exception("generate_validation_qa failed")
+         raise HTTPException(status_code=500, detail=str(exc))
+
+
+ # -----------------------
+ # FastAPI launch
+ # -----------------------
+
+ def main():
+     uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")
+
+
+ if __name__ == "__main__":
+     main()
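The validation-set selection in both endpoints follows the same precedence: blank choice falls back to a default file, a string containing `/` that is not a local path is treated as a Hugging Face dataset ID, an existing path is used directly, and `"generate"` triggers QA generation. A standalone sketch of that branch logic (function name and the `"<generated>"` placeholder are illustrative, not part of the API):

```python
import os
from typing import Optional

def choose_validation(choice: Optional[str], docs_path: str) -> Optional[str]:
    """Mirror the endpoints' validation-set precedence (illustrative sketch)."""
    choice = (choice or "").strip()
    default_path = os.path.join(docs_path, "validation_qa.json")
    if not choice:
        # Fall back to the default file when present
        return default_path if os.path.exists(default_path) else None
    if "/" in choice and not os.path.exists(choice):
        return choice          # treated as a Hugging Face dataset ID
    if os.path.exists(choice):
        return choice          # local file
    if choice.lower() == "generate":
        return "<generated>"   # placeholder: the endpoint generates a QA file here
    return None

print(choose_validation("squad/v2", "data/docs"))  # → squad/v2
print(choose_validation("", "/nonexistent"))       # → None
```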
app.py ADDED
@@ -0,0 +1,50 @@
+ import gradio as gr
+ import requests
+ import json
+ import server  # importing server starts the FastAPI backend in a background thread
+
+ API_URL = "http://localhost:8000"
+
+ def optimize_rag_tool(payload: str) -> str:
+     """Run the RAGMint full optimization workflow.
+
+     Args:
+         payload: JSON string containing OptimizeRequest parameters.
+
+     Returns:
+         JSON result with best config and leaderboard stats.
+     """
+     r = requests.post(f"{API_URL}/optimize_rag", json=json.loads(payload))
+     return json.dumps(r.json(), indent=2)
+
+ def autotune_tool(payload: str) -> str:
+     """Run the AutoRAG tuner to recommend best configs and optimize.
+
+     Args:
+         payload: JSON string containing AutotuneRequest parameters.
+
+     Returns:
+         JSON result for tuning and full optimization.
+     """
+     r = requests.post(f"{API_URL}/autotune_rag", json=json.loads(payload))
+     return json.dumps(r.json(), indent=2)
+
+ def generate_qa_tool(payload: str) -> str:
+     """Generate a validation QA set automatically with Gemini or Anthropic.
+
+     Args:
+         payload: JSON string containing QARequest parameters.
+
+     Returns:
+         JSON preview of the generated dataset.
+     """
+     r = requests.post(f"{API_URL}/generate_validation_qa", json=json.loads(payload))
+     return json.dumps(r.json(), indent=2)
+
+ demo = gr.Interface(
+     fn=optimize_rag_tool,
+     inputs=gr.Textbox(lines=12, label="OptimizeRequest JSON"),
+     outputs=gr.Textbox(label="Response"),
+ )
+
+ demo.launch(mcp_server=True)
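Each tool above follows the same shape: parse the JSON payload string, POST it to the backend, pretty-print the JSON response. A network-free sketch of that round trip, with a stub standing in for `requests.post(...).json()` (the helper names here are hypothetical):

```python
import json

def call_tool(payload: str, backend) -> str:
    """Parse a JSON payload string, hand it to a backend callable,
    and pretty-print the JSON response (same shape as the Gradio tools)."""
    data = json.loads(payload)
    response = backend(data)
    return json.dumps(response, indent=2)

# Stub standing in for requests.post(...).json()
def fake_backend(data):
    return {"status": "finished", "echo": data}

out = call_tool('{"trials": 3}', fake_backend)
print(out)
```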
models.py ADDED
@@ -0,0 +1,133 @@
+ from typing import Optional, List, Dict, Any
+ from pydantic import BaseModel, Field
+
+
+ # Models
+ class OptimizeRequest(BaseModel):
+     """
+     🔧 Explicit optimization request: the user provides all pipeline configs manually.
+     """
+     docs_path: Optional[str] = Field(
+         default="data/docs",
+         description="📂 Folder containing your documents for RAG optimization. Example: 'data/docs'"
+     )
+     retriever: Optional[List[str]] = Field(
+         default=["faiss"],
+         description="🔍 Retriever types to use. Example: 'bm25', 'faiss', 'chroma'"
+     )
+     embedding_model: Optional[List[str]] = Field(
+         default=["sentence-transformers/all-MiniLM-L6-v2"],
+         description="🧠 Embedding model names or paths. Example: 'sentence-transformers/all-MiniLM-L6-v2'"
+     )
+     strategy: Optional[List[str]] = Field(
+         default=["fixed"],
+         description="🎯 RAG strategy names. Example: 'fixed', 'token', 'sentence'"
+     )
+     chunk_sizes: Optional[List[int]] = Field(
+         default=[200, 400, 600],
+         description="📏 List of chunk sizes to evaluate. Example: [200, 400, 600]"
+     )
+     overlaps: Optional[List[int]] = Field(
+         default=[50, 100, 200],
+         description="🔁 List of overlap values to test. Example: [50, 100, 200]"
+     )
+     rerankers: Optional[List[str]] = Field(
+         default=["mmr"],
+         description="⚖️ Rerankers to apply after retrieval. Default: ['mmr']"
+     )
+     search_type: Optional[str] = Field(
+         default="grid",
+         description="🔍 Search method to explore the parameter space. Options: 'grid', 'random', 'bayesian'"
+     )
+     trials: Optional[int] = Field(
+         default=5,
+         description="🧪 Number of optimization trials to run."
+     )
+     metric: Optional[str] = Field(
+         default="faithfulness",
+         description="📈 Evaluation metric for optimization. Options: 'faithfulness'"
+     )
+     validation_choice: Optional[str] = Field(
+         default="generate",
+         description=(
+             "✅ Validation data source. Options:\n"
+             " - Leave blank → use the default 'validation_qa.json' if available\n"
+             " - 'generate' → auto-generate a validation QA file from your docs\n"
+             " - Path to a local JSON file (e.g. 'data/validation_qa.json')\n"
+             " - Hugging Face dataset ID (e.g. 'squad')"
+         )
+     )
+     llm_model: Optional[str] = Field(
+         default="gemini-2.5-flash-lite",
+         description="🤖 LLM used to generate the QA dataset when validation_choice='generate'. Example: 'gemini-pro', 'gpt-4o-mini'"
+     )
+
+
+ class AutotuneRequest(BaseModel):
+     docs_path: Optional[str] = Field(
+         default="data/docs",
+         description="📂 Folder containing your documents for RAG optimization. Example: 'data/docs'"
+     )
+     embedding_model: Optional[str] = Field(
+         default="sentence-transformers/all-MiniLM-L6-v2",
+         description="🧠 Embedding model name or path. Example: 'sentence-transformers/all-MiniLM-L6-v2'"
+     )
+     num_chunk_pairs: Optional[int] = Field(
+         default=5,
+         description="🔢 Number of chunk pairs to analyze for tuning."
+     )
+     metric: Optional[str] = Field(
+         default="faithfulness",
+         description="📈 Evaluation metric for optimization. Options: 'faithfulness'"
+     )
+     search_type: Optional[str] = Field(
+         default="grid",
+         description="🔍 Search method to explore the parameter space. Options: 'grid', 'random', 'bayesian'"
+     )
+     trials: Optional[int] = Field(
+         default=5,
+         description="🧪 Number of optimization trials to run."
+     )
+     validation_choice: Optional[str] = Field(
+         default="generate",
+         description=(
+             "✅ Validation data source. Options:\n"
+             " - Leave blank → use the default 'validation_qa.jsonl' if available\n"
+             " - 'generate' → auto-generate a validation QA file from your docs\n"
+             " - Path to a local JSON file (e.g. 'data/validation_qa.json')\n"
+             " - Hugging Face dataset ID (e.g. 'squad')"
+         )
+     )
+     llm_model: Optional[str] = Field(
+         default="gemini-2.5-flash-lite",
+         description="🤖 LLM used to generate the QA dataset when validation_choice='generate'. Example: 'gemini-pro', 'gpt-4o-mini'"
+     )
+
+
+ class QARequest(BaseModel):
+     """
+     🧩 Generates a validation QA dataset for RAG evaluation.
+     """
+     docs_path: str = Field(
+         default="data/docs",
+         description="📂 Folder containing your documents to generate QA pairs from. Example: 'data/docs'"
+     )
+     llm_model: str = Field(
+         default="gemini-2.5-flash-lite",
+         description="🤖 LLM model used for question generation. Example: 'gemini-2.5-flash-lite', 'gpt-4o-mini'"
+     )
+     batch_size: int = Field(
+         default=5,
+         description="📦 Number of documents processed per generation batch."
+     )
+     min_q: int = Field(
+         default=3,
+         description="❓ Minimum number of questions per document."
+     )
+     max_q: int = Field(
+         default=25,
+         description="❓ Maximum number of questions per document."
+     )
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ gradio[mcp]
+ fastapi
+ uvicorn
+ requests
+ ragmint
+ pydantic
+ python-dotenv
server.py ADDED
@@ -0,0 +1,7 @@
+ import threading
+ from api import main
+
+ def start():
+     threading.Thread(target=main, daemon=True).start()
+
+ start()
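server.py relies on a daemon-thread trick: importing the module launches uvicorn in the background so app.py can run Gradio in the main thread, and the daemon flag keeps the backend thread from blocking process exit. A minimal standalone sketch of the same pattern, with a dummy worker standing in for `api.main()`:

```python
import threading

results = []

def worker():
    # Stand-in for api.main() / uvicorn.run(...)
    results.append("backend started")

# daemon=True means the thread will not keep the process alive on exit
t = threading.Thread(target=worker, daemon=True)
t.start()
t.join(timeout=2)  # in the real Space, the main thread runs Gradio instead

print(results)  # → ['backend started']
```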