André Oliveira committed on
Commit 30720a5 · 1 Parent(s): 62897a2

added clear cache tool

Files changed (4)
  1. README.md +95 -22
  2. api.py +36 -0
  3. app.py +32 -3
  4. data/docs/cisco.txt +0 -7
README.md CHANGED
@@ -22,7 +22,7 @@ Gradio-based MCP server for Ragmint, enabling **Retrieval-Augmented Generation (
22
 
23
  ## 🧩 Overview
24
 
25
- Ragmint MCP Server exposes the full power of **Ragmint**, a modular Python library for **evaluating, optimizing, and tuning RAG pipelines**, through a **Multimodal Control Plane (MCP)**. This allows external clients (like Claude Desktop or Cursor) to **run experiments, retrieve leaderboard results, and tune RAG parameters programmatically**.
26
 
27
  ## Ragmint
28
 
@@ -78,17 +78,8 @@ export GOOGLE_API_KEY="your_gemini_key"
78
 
79
  ## 🧠 MCP Usage
80
 
81
- Ragmint MCP Server provides Python-callable interfaces for programmatic control. Example usage with MCP:
82
 
83
- ```python
84
- from mcp_client import MCPClient
85
-
86
- client = MCPClient(server_url="http://localhost:7860")
87
-
88
- # Run Auto-RAG tuning
89
- config, results = client.autotune(docs_path="data/docs/", trials=5)
90
- print("Best config:", config)
91
- ```
92
 
93
  ---
94
 
@@ -142,6 +133,22 @@ ragmint_mcp_server/
142
  ├── models.py
143
  └── api.py
144
  ```
145
  ---
146
 
147
  ## 📥 Inputs
@@ -149,21 +156,35 @@ ragmint_mcp_server/
149
  The Ragmint MCP Server exposes three main endpoints with the following inputs:
150
 
151
 
152
- ### 1. Upload Documents
153
 
154
- Input: `.txt` files to upload to the documents directory (`docs_path`).
155
 
156
  <details>
157
  <summary>View Input Model</summary>
158
 
159
  | Field | Type | Description | Example |
160
- |-------|------|-------------|---------|
161
- | file | File | Text file to be processed | sample.txt |
162
  | docs_path | str | Directory where files are stored | data/docs |
163
 
164
  </details>
165
 
166
- ### 2. Optimize RAG
167
 
168
  Input: JSON object following the `OptimizeRequest` model.
169
 
@@ -187,7 +208,7 @@ Input: JSON object following the `OptimizeRequest` model.
187
 
188
  </details>
189
 
190
- ### 3. Autotune RAG
191
 
192
  Input: JSON object following the `AutotuneRequest` model.
193
 
@@ -207,7 +228,7 @@ Input: JSON object following the `AutotuneRequest` model.
207
 
208
  </details>
209
 
210
- ### 4. Generate QA
211
 
212
  Input: JSON object following the `QARequest` model.
213
  <details>
@@ -223,13 +244,26 @@ Input: JSON object following the `QARequest` model.
223
 
224
  </details>
225
 
226
  ---
227
 
228
  ## 📤 Outputs
229
 
230
  The Ragmint MCP Server exposes three main endpoints with the following example outputs:
231
 
232
- ### 1. Upload Documents Response
233
 
234
  <details>
235
  <summary>View Response Example</summary>
@@ -251,7 +285,28 @@ The Ragmint MCP Server exposes three main endpoints with the following example o
251
  ✅ Confirms your documents are ready for RAG operations.
252
 
253
 
254
- ### 2. Optimize RAG Response
255
 
256
  <details>
257
  <summary>View Response Example</summary>
@@ -312,7 +367,7 @@ The Ragmint MCP Server exposes three main endpoints with the following example o
312
  - **corpus_size** → Total size in characters or tokens.
313
 
314
 
315
- ### 3. Autotune RAG Response
316
 
317
  <details>
318
  <summary>View Response Example</summary>
@@ -383,7 +438,7 @@ The Ragmint MCP Server exposes three main endpoints with the following example o
383
  🧠 **Difference from Optimize**: Autotune automatically selects the best hyperparameters, rather than testing all user-specified combinations.
384
 
385
 
386
- ### 4. Generate QA Response
387
 
388
  <details>
389
  <summary>View Response Example</summary>
@@ -419,6 +474,24 @@ The Ragmint MCP Server exposes three main endpoints with the following example o
419
  - **expected_answer** → The reference answer corresponding to that question.
420
  - **status**: `"finished"` → QA generation completed successfully.
421
 
422
  ---
423
 
424
  ## 📘 License
 
22
 
23
  ## 🧩 Overview
24
 
25
+ Ragmint MCP Server exposes the full power of **Ragmint**, a modular Python library for **evaluating, optimizing, and tuning RAG pipelines**, through the **Model Context Protocol (MCP)**. This allows external clients (like Claude Desktop or Cursor) to **run experiments and tune RAG parameters programmatically**.
26
 
27
  ## Ragmint
28
 
 
78
 
79
  ## 🧠 MCP Usage
80
 
81
+ Ragmint MCP Server provides Python-callable interfaces for programmatic control. You can find an example of MCP usage in the [Ragmint MCP Server Space](https://huggingface.co/spaces/andyolivers/ragmint-mcp-server) on Hugging Face.
82
 
83
 
84
  ---
85
 
 
133
  ├── models.py
134
  └── api.py
135
  ```
136
+ ---
137
+ ## 🔧 MCP Tools (app.py)
138
+
139
+ The `app.py` file provides the Gradio UI and also registers the functions exposed as **MCP Tools**, enabling external MCP clients (Claude Desktop, Cursor, VS Code MCP extension, etc.) to call Ragmint programmatically.
140
+
141
+ `app.py` launches the FastAPI backend (`api.py`) in a background thread and exposes the following MCP tools:
142
+
+ | MCP Tool | Python Function | Description |
+ |----------|-----------------|-------------|
+ | upload_docs | upload_docs_tool() | Uploads `.txt` files or remote URLs into the configured `docs_path`. |
+ | upload_urls | upload_urls_tool() | Downloads remote files from external URLs and stores them inside `docs_path`. |
+ | optimize_rag | optimize_rag_tool() | Runs explicit hyperparameter optimization for a RAG pipeline. |
+ | autotune | autotune_tool() | Automatically recommends the best chunking and embedding configuration. |
+ | generate_qa | generate_qa_tool() | Generates a synthetic QA validation dataset for evaluation. |
+ | clear_cache | clear_cache_tool() | Deletes all docs inside `data/docs` to reset the workspace. |
151
+
152
  ---
153
 
154
  ## 📥 Inputs
 
156
  The Ragmint MCP Server exposes six MCP tools with the following inputs:
157
 
158
 
159
+ ### 1. Upload Documents (`upload_docs`)
160
 
161
+ Input: `.txt` files or file-like objects to upload to the documents directory (`docs_path`).
162
 
163
  <details>
164
  <summary>View Input Model</summary>
165
 
166
  | Field | Type | Description | Example |
167
+ |--------|-------|-------------|---------|
168
+ | files | File[] | Local `.txt` files selected or passed from MCP client | ["sample.txt"] |
169
  | docs_path | str | Directory where files are stored | data/docs |
170
+ </details>
171
+
172
+
173
+ ### 2. Upload URLs (`upload_urls`)
174
+
175
+ Input: List of URLs referencing `.txt` files to download and store in `docs_path`.
176
+
177
+ <details>
178
+ <summary>View Input Model</summary>
179
+
180
+ | Field | Type | Description | Example |
181
+ |--------|-------|-------------|---------|
182
+ | urls | List[str] | List of URLs pointing to remote documents | ["https://example.com/doc.txt"] |
183
+ | docs_path | str | Directory where downloaded files are saved | data/docs |
184
 
185
  </details>
186
 
187
+ ### 3. Optimize RAG (`optimize_rag`)
188
 
189
  Input: JSON object following the `OptimizeRequest` model.
190
 
 
208
 
209
  </details>
210
 
211
+ ### 4. Autotune RAG (`autotune`)
212
 
213
  Input: JSON object following the `AutotuneRequest` model.
214
 
 
228
 
229
  </details>
230
 
231
+ ### 5. Generate QA (`generate_qa`)
232
 
233
  Input: JSON object following the `QARequest` model.
234
  <details>
 
244
 
245
  </details>
246
 
247
+ ### 6. Clear Cache (`clear_cache`)
248
+
249
+ Deletes all stored documents from `data/docs`.
250
+
251
+ <details>
252
+ <summary>View Input Model</summary>
253
+
254
+ | Field | Type | Description | Example |
255
+ |--------|-------|-------------|---------|
256
+ | docs_path | str | Folder to wipe clean | data/docs |
257
+
258
+ </details>
259
+
260
  ---
261
 
262
  ## 📤 Outputs
263
 
264
  The Ragmint MCP Server exposes six MCP tools with the following example outputs:
265
 
266
+ ### 1. Upload Documents Response (`upload_docs`)
267
 
268
  <details>
269
  <summary>View Response Example</summary>
 
285
  ✅ Confirms your documents are ready for RAG operations.
286
 
287
 
288
+ ### 2. Upload URLs Response (`upload_urls`)
289
+
290
+ <details>
291
+ <summary>View Response Example</summary>
292
+
293
+ ```json
294
+ {
295
+ "status": "ok",
296
+ "uploaded_files": ["doc.txt"],
297
+ "docs_path": "data/docs"
298
+ }
299
+ ```
300
+ </details>
301
+
302
+ - **status**: `"ok"` → Indicates that the upload was successful.
303
+ - **uploaded_files**: List of file names that were successfully uploaded.
304
+ - **docs_path**: The directory where the uploaded documents are stored.
305
+
306
+ ✅ Confirms your documents are ready for RAG operations.
307
+
308
+
309
+ ### 3. Optimize RAG Response (`optimize_rag`)
310
 
311
  <details>
312
  <summary>View Response Example</summary>
 
367
  - **corpus_size** → Total size in characters or tokens.
368
 
369
 
370
+ ### 4. Autotune RAG Response (`autotune`)
371
 
372
  <details>
373
  <summary>View Response Example</summary>
 
438
  🧠 **Difference from Optimize**: Autotune automatically selects the best hyperparameters, rather than testing all user-specified combinations.
439
 
440
 
441
+ ### 5. Generate QA Response (`generate_qa`)
442
 
443
  <details>
444
  <summary>View Response Example</summary>
 
474
  - **expected_answer** → The reference answer corresponding to that question.
475
  - **status**: `"finished"` → QA generation completed successfully.
476
 
477
+
478
+ ### 6. Clear Cache Response (`clear_cache`)
479
+
480
+ <details>
481
+ <summary>View Response Example</summary>
482
+
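+ ```json
+ {
+   "status": "cleared",
+   "docs_path": "data/docs",
+   "removed_items": ["doc.txt"],
+   "total_removed": 1
+ }
+ ```
+ </details>
+
+ - **status**: `"cleared"` → Indicates that the workspace was reset.
+ - **docs_path**: The directory whose contents were deleted.
+ - **removed_items**: Names of the files and subdirectories that were removed (subdirectory names end with `/`).
+ - **total_removed**: Number of removed items.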
494
+
495
  ---
496
 
497
  ## 📘 License
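As a reading aid for the `clear_cache` response, here is a minimal sketch of how a client might summarize the result. It assumes the keys returned by the `/clear_cache` endpoint in `api.py` in this commit (`status`, `docs_path`, `removed_items`, `total_removed`) and the `{"error": ...}` dictionary that `clear_cache_tool` returns on failure. The helper name `summarize_clear_cache` is hypothetical and not part of the repository.

```python
def summarize_clear_cache(response: dict) -> str:
    """Turn a clear_cache result into a one-line, human-readable summary."""
    # clear_cache_tool wraps any exception as {"error": "<message>"}.
    if "error" in response:
        return f"clear_cache failed: {response['error']}"

    # The endpoint in api.py reports success with status "cleared".
    status = response.get("status")
    if status != "cleared":
        return f"unexpected status: {status!r}"

    items = response.get("removed_items", [])
    total = response.get("total_removed", len(items))
    return f"removed {total} item(s) from {response.get('docs_path')}"


if __name__ == "__main__":
    example = {
        "status": "cleared",
        "docs_path": "data/docs",
        "removed_items": ["doc.txt"],
        "total_removed": 1,
    }
    print(summarize_clear_cache(example))
```

Checking for the `error` key first keeps the success path simple, since the tool wrapper never raises to its caller.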
api.py CHANGED
@@ -282,6 +282,42 @@ def generate_validation_qa_endpoint(req: QARequest):
282
  raise HTTPException(status_code=500, detail=str(exc))
283
 
284
 
285
  def start_api():
286
  import uvicorn
287
  uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")
 
282
  raise HTTPException(status_code=500, detail=str(exc))
283
 
284
 
+ @app.post("/clear_cache")
+ async def clear_cache(docs_path: str = Form(DEFAULT_DATA_DIR)):
+     """
+     Delete all files inside docs_path but keep the directory.
+     Useful to reset uploaded documents for RAG runs.
+     """
+     if not os.path.exists(docs_path):
+         raise HTTPException(status_code=400, detail=f"docs_path does not exist: {docs_path}")
+
+     removed = []
+     for root, dirs, files in os.walk(docs_path, topdown=False):
+         for name in files:
+             file_path = os.path.join(root, name)
+             try:
+                 os.remove(file_path)
+                 removed.append(name)
+             except Exception as e:
+                 logger.error(f"Failed to remove {file_path}: {e}")
+
+         for name in dirs:
+             dir_path = os.path.join(root, name)
+             try:
+                 shutil.rmtree(dir_path)
+                 removed.append(f"{name}/")
+             except Exception as e:
+                 logger.error(f"Failed to remove {dir_path}: {e}")
+
+     return {
+         "status": "cleared",
+         "docs_path": docs_path,
+         "removed_items": removed,
+         "total_removed": len(removed),
+     }
+
+
+
321
  def start_api():
322
  import uvicorn
323
  uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")
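To try the clearing logic outside FastAPI, the following standalone sketch shows the same idea: remove everything inside a directory while keeping the directory itself. The function name `clear_directory` is hypothetical and not part of this commit. Unlike the endpoint, it iterates only over top-level entries (so nested files are not listed individually) and raises `FileNotFoundError` instead of returning an HTTP 400.

```python
import shutil
import tempfile
from pathlib import Path


def clear_directory(docs_path: str) -> dict:
    """Remove every file and subdirectory inside docs_path, keeping docs_path itself."""
    root = Path(docs_path)
    if not root.is_dir():
        raise FileNotFoundError(f"docs_path does not exist: {docs_path}")

    removed = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: delete it together with its contents.
            shutil.rmtree(entry)
            removed.append(f"{entry.name}/")
        else:
            # Regular file or symlink: delete only the entry itself.
            entry.unlink()
            removed.append(entry.name)

    # Same keys as the dictionary returned by the endpoint in this commit.
    return {
        "status": "cleared",
        "docs_path": docs_path,
        "removed_items": removed,
        "total_removed": len(removed),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "a.txt").write_text("hello")
        (Path(tmp) / "nested").mkdir()
        (Path(tmp) / "nested" / "b.txt").write_text("world")
        print(clear_directory(tmp))
```

Running it against a temporary directory, as in the `__main__` block, avoids touching a real `data/docs` folder while experimenting.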
app.py CHANGED
@@ -23,6 +23,25 @@ def call_api(endpoint: str, payload: dict) -> str:
23
  return str(e)
24
 
25
 
26
  def upload_docs_tool(files, docs_path="data/docs"):
27
  """
28
  Upload documents to the server's docs folder via FastAPI /upload_docs.
@@ -136,11 +155,11 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
136
 
137
  # Upload MCP Documents (no file uploader)
138
  with gr.Column():
139
- gr.Markdown("## Upload Documents via MCP")
140
- gr.Markdown("📂 Upload files (local paths or URLs) to your `data/docs` folder on MCP.")
141
  upload_mcp_input = gr.Textbox(
142
  lines=5,
143
- placeholder='Enter list of files/URLs, e.g., ["file1.txt","file2.pdf"]',
144
  label="Files (JSON list)"
145
  )
146
  upload_mcp_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
@@ -198,6 +217,16 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
198
  qa_btn.click(generate_qa_tool, inputs=qa_input, outputs=qa_out)
199
  gr.Markdown("---")
200
 
201
  if __name__ == "__main__":
202
 
203
  demo.launch(
 
23
  return str(e)
24
 
25
 
+ def clear_cache_tool(docs_path="data/docs"):
+     """
+     🧹 Clear Cache MCP Tool
+     Deletes all files and directories inside docs_path on the server.
+     Accepts:
+     - local paths (str), default='data/docs/'
+     """
+     try:
+         r = requests.post(
+             f"{BASE_INTERNAL}/clear_cache",
+             data={"docs_path": docs_path},
+             timeout=60
+         )
+         r.raise_for_status()
+         return r.json()
+     except Exception as e:
+         return {"error": str(e)}
+
+
45
  def upload_docs_tool(files, docs_path="data/docs"):
46
  """
47
  Upload documents to the server's docs folder via FastAPI /upload_docs.
 
155
 
156
  # Upload MCP Documents (no file uploader)
157
  with gr.Column():
158
+ gr.Markdown("## Upload Documents (URLs) via MCP")
159
+ gr.Markdown("📂 Upload files (URLs) to your `data/docs` folder on MCP.")
160
  upload_mcp_input = gr.Textbox(
161
  lines=5,
162
+ placeholder='Enter list of URLs (e.g., ["https://example.com/example.txt",...])',
163
  label="Files (JSON list)"
164
  )
165
  upload_mcp_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
 
217
  qa_btn.click(generate_qa_tool, inputs=qa_input, outputs=qa_out)
218
  gr.Markdown("---")
219
 
+     # Clear Cache
+     with gr.Column():
+         gr.Markdown("## Clear Cache")
+         gr.Markdown("🧹 Deletes all files and directories inside docs_path on the server.")
+         clear_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path to Clear")
+         clear_btn = gr.Button("Clear Cache", variant="primary")
+         clear_out = gr.JSON(label="Response")
+         clear_btn.click(clear_cache_tool, inputs=[clear_path], outputs=clear_out)
+         gr.Markdown("---")
+
230
  if __name__ == "__main__":
231
 
232
  demo.launch(
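The `clear_cache_tool` wrapper posts `docs_path` as form data, which matches the `Form(...)` parameter on the FastAPI side. As a rough illustration of what that request looks like on the wire, the sketch below builds (but does not send) the equivalent POST using only the standard library. The base URL is an assumption derived from `start_api()` listening on port 8000; the actual value of `BASE_INTERNAL` is not shown in this diff, and `build_clear_cache_request` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Assumption: the backend is reachable locally on the port used by start_api().
BASE_URL = "http://127.0.0.1:8000"


def build_clear_cache_request(docs_path: str = "data/docs", base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a form-encoded POST to /clear_cache."""
    body = urlencode({"docs_path": docs_path}).encode("utf-8")
    return Request(
        url=f"{base_url}/clear_cache",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    req = build_clear_cache_request()
    print(req.get_method(), req.full_url)
    print(parse_qs(req.data.decode("utf-8")))
    # To actually send it against a running server:
    # urllib.request.urlopen(req, timeout=60)
```

Keeping the request construction separate from sending makes it easy to inspect the payload without deleting anything on a live server.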
data/docs/cisco.txt DELETED
@@ -1,7 +0,0 @@
1
- Cisco Systems, Inc. is a global technology company headquartered in San Jose, California, specializing in networking hardware, software, and telecommunications equipment.
2
- Founded in 1984 by Leonard Bosack and Sandy Lerner, Cisco initially focused on connecting computers at Stanford University, pioneering modern network routing technology.
3
- Cisco is widely recognized for its enterprise routers, switches, and networking solutions, which form the backbone of many corporate and service provider networks worldwide.
4
- The company also offers cybersecurity solutions, including firewalls, VPNs, and intrusion prevention systems, to protect enterprises from evolving digital threats.
5
- Cisco has a strong presence in cloud computing, collaboration tools, and Internet of Things (IoT) technology, helping organizations modernize and automate their IT infrastructure.
6
- The company invests heavily in research and development, maintaining innovation in areas such as artificial intelligence (AI), software-defined networking (SDN), and network automation.
7
- Cisco’s stock is publicly traded on the NASDAQ under the ticker symbol CSCO, and it consistently ranks among the largest technology companies globally in revenue and market capitalization.