Balamurugan Thayalan
spend-analyzer-mcp-mbt v1.0.0
# Spend Analyzer MCP - API Documentation
This document describes the API of the Spend Analyzer MCP system: the Modal functions, the MCP protocol integration, and the local Python API.
## Table of Contents
1. [Modal Functions API](#modal-functions-api)
2. [MCP Protocol Integration](#mcp-protocol-integration)
3. [Local Python API](#local-python-api)
4. [Data Formats](#data-formats)
5. [Error Handling](#error-handling)
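6. [Examples](#examples)
7. [Rate Limits and Quotas](#rate-limits-and-quotas)
8. [Support and Troubleshooting](#support-and-troubleshooting)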
## Modal Functions API
### 1. `process_bank_statements`
Process bank statements from email attachments.
**Function Signature:**
```python
def process_bank_statements(
    email_config: Dict,
    days_back: int = 30,
    passwords: Optional[Dict] = None
) -> Dict
```
**Parameters:**
- `email_config` (Dict): Email configuration
  - `email` (str): Email address
  - `password` (str): App-specific password
  - `imap_server` (str): IMAP server address
- `days_back` (int): Number of days to look back (default: 30)
- `passwords` (Dict, optional): PDF passwords by filename
**Returns:**
```python
{
    "processed_statements": [
        {
            "filename": str,
            "bank": str,
            "account": str,
            "period": str,
            "transaction_count": int,
            "status": str  # "success", "password_required", "error"
        }
    ],
    "total_transactions": int,
    "analysis": Dict,  # Financial analysis data
    "timestamp": str   # ISO format
}
```
**Example:**
```python
import modal
app = modal.App.lookup("spend-analyzer-mcp-bmt")
process_statements = app["process_bank_statements"]
email_config = {
    "email": "user@gmail.com",
    "password": "app_password",
    "imap_server": "imap.gmail.com"
}
result = process_statements.remote(email_config, days_back=30)
print(f"Processed {result['total_transactions']} transactions")
```
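Because each entry in `processed_statements` carries its own `status`, a caller may want to pick out the statements that still need a password before calling again. The following is a minimal sketch, assuming the return shape documented above; `split_by_status` is a hypothetical helper, not part of the project, and the sample values are made up.

```python
from collections import defaultdict
from typing import Dict, List


def split_by_status(result: Dict) -> Dict[str, List[str]]:
    """Group statement filenames by their reported status."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for statement in result.get("processed_statements", []):
        grouped[statement["status"]].append(statement["filename"])
    return dict(grouped)


# Sample data shaped like the documented return value (values are invented).
sample_result = {
    "processed_statements": [
        {"filename": "jan.pdf", "status": "success"},
        {"filename": "feb.pdf", "status": "password_required"},
    ],
    "total_transactions": 42,
}

by_status = split_by_status(sample_result)
needs_password = by_status.get("password_required", [])
```

The filenames in `needs_password` can then serve as keys of the `passwords` argument on a second call.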
### 2. `analyze_uploaded_statements`
Analyze directly uploaded PDF statements.
**Function Signature:**
```python
def analyze_uploaded_statements(
    pdf_contents: Dict[str, bytes],
    passwords: Optional[Dict] = None
) -> Dict
```
**Parameters:**
- `pdf_contents` (Dict[str, bytes]): Mapping of filename to PDF content
- `passwords` (Dict, optional): PDF passwords by filename
**Returns:**
```python
{
    "processed_files": [
        {
            "filename": str,
            "bank": str,
            "account": str,
            "transaction_count": int,
            "status": str
        }
    ],
    "total_transactions": int,
    "analysis": Dict
}
```
**Example:**
```python
# Read PDF files
pdf_contents = {}
with open("statement1.pdf", "rb") as f:
    pdf_contents["statement1.pdf"] = f.read()
analyze_pdfs = app["analyze_uploaded_statements"]
result = analyze_pdfs.remote(pdf_contents)
```
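When several statements sit in one folder, the filename-to-bytes mapping can be built in one pass. This is a small sketch using only the standard library; `load_pdf_contents` is a hypothetical helper name, and it assumes the statements use a lowercase `.pdf` extension.

```python
from pathlib import Path
from typing import Dict


def load_pdf_contents(directory: str) -> Dict[str, bytes]:
    """Read every *.pdf file in `directory` into a filename -> bytes mapping."""
    return {
        path.name: path.read_bytes()
        for path in sorted(Path(directory).glob("*.pdf"))
    }
```

The result has the shape expected by the `pdf_contents` parameter, so it can be passed to `analyze_uploaded_statements` unchanged.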
### 3. `get_ai_analysis`
Get AI-powered financial analysis using Claude or SambaNova.
**Function Signature:**
```python
def get_ai_analysis(
    analysis_data: Dict,
    user_question: str = "",
    provider: str = "claude"
) -> Dict
```
**Parameters:**
- `analysis_data` (Dict): Financial analysis data
- `user_question` (str): Specific question for the AI
- `provider` (str): "claude" or "sambanova"
**Returns:**
```python
{
    "ai_analysis": str,  # AI-generated analysis text
    "provider": str,     # AI provider used
    "model": str,        # Model name
    "usage": {
        "input_tokens": int,
        "output_tokens": int,
        "total_tokens": int
    }
}
```
**Example:**
```python
get_analysis = app["get_ai_analysis"]
analysis_data = {
    "spending_insights": [...],
    "financial_summary": {...},
    "recommendations": [...]
}
# Use Claude for detailed analysis
claude_result = get_analysis.remote(
    analysis_data,
    "What are my biggest spending risks?",
    "claude"
)
# Use SambaNova for quick insights
sambanova_result = get_analysis.remote(
    analysis_data,
    "Quick spending summary",
    "sambanova"
)
```
```
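Since every response reports its own `usage`, token consumption across several calls can be tallied on the client side. A minimal sketch, assuming the `usage` keys documented above; `total_usage` is a hypothetical helper.

```python
from typing import Dict, Iterable


def total_usage(results: Iterable[Dict]) -> Dict[str, int]:
    """Sum the token counters of several get_ai_analysis responses."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for result in results:
        usage = result.get("usage", {})
        for key in totals:
            # Missing counters are treated as zero rather than raising.
            totals[key] += usage.get(key, 0)
    return totals
```

For example, `total_usage([claude_result, sambanova_result])` would give the combined counters for the two calls shown above.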
### 4. `save_user_data` / `load_user_data`
Persistent storage for user analysis data.
**Save Function:**
```python
def save_user_data(user_id: str, data: Dict) -> Dict
```
**Load Function:**
```python
def load_user_data(user_id: str) -> Dict
```
**Example:**
```python
save_data = app["save_user_data"]
load_data = app["load_user_data"]
# Save user analysis
save_result = save_data.remote("user123", analysis_data)
# Load user analysis
load_result = load_data.remote("user123")
if load_result["status"] == "found":
    user_data = load_result["data"]
```
## MCP Protocol Integration
### Webhook Endpoint
The system provides an MCP webhook endpoint for external integrations:
**URL:** `https://your-modal-app.modal.run/mcp_webhook`
**Method:** POST
**Content-Type:** application/json
### MCP Tools
#### 1. `process_email_statements`
**Description:** Process bank statements from email
**Input Schema:**
```json
{
  "type": "object",
  "properties": {
    "email_config": {
      "type": "object",
      "properties": {
        "email": {"type": "string"},
        "password": {"type": "string"},
        "imap_server": {"type": "string"}
      }
    },
    "days_back": {"type": "integer", "default": 30},
    "passwords": {"type": "object"}
  }
}
```
#### 2. `analyze_pdf_statements`
**Description:** Analyze uploaded PDF statements
**Input Schema:**
```json
{
  "type": "object",
  "properties": {
    "pdf_contents": {"type": "object"},
    "passwords": {"type": "object"}
  }
}
```
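The schema types `pdf_contents` as a JSON object, and JSON has no binary type, so raw PDF bytes have to be encoded as text before they go into a tool call. The sketch below uses base64, which is an assumption: this document does not state which encoding the server expects, so check the server implementation before relying on it. Both helper names are hypothetical.

```python
import base64
import json
from typing import Dict


def encode_pdf_contents(pdf_contents: Dict[str, bytes]) -> Dict[str, str]:
    """Turn filename -> bytes into filename -> base64 text (JSON-safe)."""
    return {
        filename: base64.b64encode(content).decode("ascii")
        for filename, content in pdf_contents.items()
    }


def decode_pdf_contents(encoded: Dict[str, str]) -> Dict[str, bytes]:
    """Inverse of encode_pdf_contents, as a receiving side might apply it."""
    return {filename: base64.b64decode(text) for filename, text in encoded.items()}


# Placeholder bytes stand in for a real PDF file.
arguments = {
    "pdf_contents": encode_pdf_contents({"statement1.pdf": b"%PDF-1.4 example bytes"})
}
payload = json.dumps(arguments)  # would fail with raw bytes values
```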
#### 3. `get_ai_analysis`
**Description:** Get AI financial analysis
**Input Schema:**
```json
{
  "type": "object",
  "properties": {
    "analysis_data": {"type": "object"},
    "user_question": {"type": "string"},
    "provider": {"type": "string", "enum": ["claude", "sambanova"]}
  }
}
```
### MCP Message Examples
**Initialize:**
```json
{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "initialize",
  "params": {}
}
```
**List Tools:**
```json
{
  "jsonrpc": "2.0",
  "id": "2",
  "method": "tools/list"
}
```
**Call Tool:**
```json
{
  "jsonrpc": "2.0",
  "id": "3",
  "method": "tools/call",
  "params": {
    "name": "get_ai_analysis",
    "arguments": {
      "analysis_data": {...},
      "user_question": "How can I save money?",
      "provider": "claude"
    }
  }
}
```
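The three message shapes above differ only in `method` and `params`, so they can be produced by one small builder. This is an illustrative sketch: `build_request` and `build_tool_call` are hypothetical helpers, and the string ids simply mirror the examples above.

```python
import itertools
from typing import Any, Dict, Optional

# Monotonic request ids, sent as strings to match the examples above.
_request_ids = itertools.count(1)


def build_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request; `params` is omitted when not given."""
    message: Dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": str(next(_request_ids)),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return message


def build_tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a `tools/call` request for the named MCP tool."""
    return build_request("tools/call", {"name": name, "arguments": arguments})


initialize = build_request("initialize", {})
list_tools = build_request("tools/list")
call_tool = build_tool_call(
    "get_ai_analysis",
    {"analysis_data": {}, "user_question": "How can I save money?", "provider": "claude"},
)
```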
## Local Python API
### SpendAnalyzer Class
```python
from spend_analyzer import SpendAnalyzer
analyzer = SpendAnalyzer()
# Load transactions
analyzer.load_transactions(transactions_list)
# Set budgets
analyzer.set_budgets({
    "Food & Dining": 500,
    "Shopping": 300,
    "Gas & Transport": 200
})
# Get insights
insights = analyzer.analyze_spending_by_category()
alerts = analyzer.check_budget_alerts()
summary = analyzer.generate_financial_summary()
recommendations = analyzer.get_spending_recommendations()
# Export all data
export_data = analyzer.export_analysis_data()
```
### EmailProcessor Class
```python
import asyncio

from email_processor import EmailProcessor

email_config = {
    "email": "user@gmail.com",
    "password": "app_password",
    "imap_server": "imap.gmail.com"
}
processor = EmailProcessor(email_config)

async def main():
    # Fetch emails (the processor methods are awaited, so they run inside a coroutine)
    emails = await processor.fetch_bank_emails(days_back=30)
    # Extract attachments
    for email in emails:
        attachments = await processor.extract_attachments(email)
        for filename, content, file_type in attachments:
            if file_type == 'pdf':
                # Process PDF
                pass

asyncio.run(main())
```
### PDFProcessor Class
```python
import asyncio

from email_processor import PDFProcessor

processor = PDFProcessor()

async def main():
    # Read the PDF, then process it (process_pdf is awaited, so it runs inside a coroutine)
    with open("statement.pdf", "rb") as f:
        pdf_content = f.read()
    statement_info = await processor.process_pdf(pdf_content, password="optional")
    print(f"Bank: {statement_info.bank_name}")
    print(f"Account: {statement_info.account_number}")
    print(f"Transactions: {len(statement_info.transactions)}")

asyncio.run(main())
```
## Data Formats
### Transaction Format
```python
{
    "date": "2024-01-15T00:00:00",
    "description": "Amazon Purchase",
    "amount": -45.67,
    "category": "Shopping",
    "account": "****1234",
    "balance": 1500.33
}
```
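For client code that prefers typed objects over raw dictionaries, the transaction format can be mirrored in a dataclass. This is only a sketch: the `Transaction` class here is hypothetical and may differ from whatever the project defines internally, and treating a negative `amount` as an expense is an assumption drawn from the sample above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class Transaction:
    date: datetime
    description: str
    amount: float
    category: str
    account: str
    balance: float

    @classmethod
    def from_dict(cls, raw: Dict) -> "Transaction":
        """Parse a dictionary in the documented transaction format."""
        return cls(
            date=datetime.fromisoformat(raw["date"]),
            description=raw["description"],
            amount=float(raw["amount"]),
            category=raw["category"],
            account=raw["account"],
            balance=float(raw["balance"]),
        )

    @property
    def is_expense(self) -> bool:
        # Assumption: negative amounts are money leaving the account.
        return self.amount < 0


txn = Transaction.from_dict({
    "date": "2024-01-15T00:00:00",
    "description": "Amazon Purchase",
    "amount": -45.67,
    "category": "Shopping",
    "account": "****1234",
    "balance": 1500.33,
})
```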
### Financial Summary Format
```python
{
    "total_income": 3000.0,
    "total_expenses": 1500.0,
    "net_cash_flow": 1500.0,
    "largest_expense": {
        "amount": 200.0,
        "description": "Grocery Store",
        "date": "2024-01-15",
        "category": "Food & Dining"
    },
    "most_frequent_category": "Food & Dining",
    "unusual_transactions": [...],
    "monthly_trends": {...}
}
```
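To make the relationship between the first fields concrete, the sketch below derives them from a list of transactions. It is not the project's implementation: `summarize_cash_flow` is a hypothetical function, and it assumes that negative amounts are expenses and that `total_expenses` is reported as a positive number, as the sample above suggests.

```python
from typing import Dict, List


def summarize_cash_flow(transactions: List[Dict]) -> Dict:
    """Compute income, expenses, net cash flow and the largest expense."""
    income = sum(t["amount"] for t in transactions if t["amount"] > 0)
    expenses = [t for t in transactions if t["amount"] < 0]
    total_expenses = -sum(t["amount"] for t in expenses)
    # The most negative amount is the largest expense.
    largest = min(expenses, key=lambda t: t["amount"]) if expenses else None
    return {
        "total_income": round(income, 2),
        "total_expenses": round(total_expenses, 2),
        "net_cash_flow": round(income - total_expenses, 2),
        "largest_expense": None if largest is None else {
            "amount": abs(largest["amount"]),
            "description": largest["description"],
            "date": largest["date"][:10],  # keep only the YYYY-MM-DD part
            "category": largest["category"],
        },
    }


# Invented sample data in the documented transaction format.
sample_transactions = [
    {"date": "2024-01-01T00:00:00", "description": "Salary", "amount": 3000.0, "category": "Income"},
    {"date": "2024-01-15T00:00:00", "description": "Grocery Store", "amount": -200.0, "category": "Food & Dining"},
    {"date": "2024-01-16T00:00:00", "description": "Amazon Purchase", "amount": -45.67, "category": "Shopping"},
]
summary = summarize_cash_flow(sample_transactions)
```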
### Spending Insight Format
```python
{
    "category": "Food & Dining",
    "total_amount": 500.0,
    "transaction_count": 15,
    "average_transaction": 33.33,
    "percentage_of_total": 33.3,
    "trend": "increasing",
    "top_merchants": ["Restaurant A", "Grocery Store", "Cafe B"]
}
```
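The numeric fields are related by simple arithmetic: in the sample, 500.0 / 15 rounds to 33.33, and 33.3 is the category's share of overall spending. A minimal sketch of that calculation follows; `build_insight` is a hypothetical helper, amounts are assumed to be positive magnitudes, and `trend` and `top_merchants` are left out because they need history and merchant data.

```python
from typing import Dict, List


def build_insight(category: str, amounts: List[float], total_spending: float) -> Dict:
    """Compute the arithmetic fields of a spending insight for one category."""
    total = sum(amounts)
    count = len(amounts)
    return {
        "category": category,
        "total_amount": round(total, 2),
        "transaction_count": count,
        # Guard against empty categories and zero overall spending.
        "average_transaction": round(total / count, 2) if count else 0.0,
        "percentage_of_total": round(total * 100 / total_spending, 1) if total_spending else 0.0,
    }


insight = build_insight("Food & Dining", [20.0, 30.0, 50.0], total_spending=400.0)
```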
### Budget Alert Format
```python
{
    "category": "Food & Dining",
    "budget_limit": 500.0,
    "current_spending": 450.0,
    "percentage_used": 90.0,
    "alert_level": "warning",
    "days_remaining": 10
}
```
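In the sample, `percentage_used` is `current_spending` divided by `budget_limit` (450.0 / 500.0 = 90%). The sketch below reproduces that calculation under several assumptions that this document does not confirm: the 80% and 100% thresholds, the `"ok"` and `"critical"` level names, and the reading of `days_remaining` as the days left in the calendar month. `build_budget_alert` is a hypothetical helper.

```python
import calendar
from datetime import date
from typing import Dict

WARNING_THRESHOLD = 80.0    # assumed, not taken from the project
CRITICAL_THRESHOLD = 100.0  # assumed, not taken from the project


def build_budget_alert(category: str, budget_limit: float,
                       current_spending: float, today: date) -> Dict:
    """Build a budget alert dictionary in the documented shape."""
    if budget_limit <= 0:
        raise ValueError("budget_limit must be positive")
    percentage_used = round(current_spending * 100 / budget_limit, 1)
    if percentage_used >= CRITICAL_THRESHOLD:
        alert_level = "critical"
    elif percentage_used >= WARNING_THRESHOLD:
        alert_level = "warning"
    else:
        alert_level = "ok"
    # Assumption: budgets reset monthly, so count the days left in this month.
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return {
        "category": category,
        "budget_limit": budget_limit,
        "current_spending": current_spending,
        "percentage_used": percentage_used,
        "alert_level": alert_level,
        "days_remaining": days_in_month - today.day,
    }


alert = build_budget_alert("Food & Dining", 500.0, 450.0, date(2024, 1, 21))
```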
## Error Handling
### Common Error Responses
**Authentication Error:**
```python
{
    "error": "Invalid API key or authentication failed",
    "code": "AUTH_ERROR"
}
```
**PDF Password Error:**
```python
{
    "error": "PDF requires password",
    "code": "PASSWORD_REQUIRED",
    "filename": "statement.pdf"
}
```
**Processing Error:**
```python
{
    "error": "Failed to parse PDF content",
    "code": "PARSE_ERROR",
    "details": "Unsupported PDF format"
}
```
**Rate Limit Error:**
```python
{
    "error": "API rate limit exceeded",
    "code": "RATE_LIMIT",
    "retry_after": 60
}
```
### Error Handling Best Practices
1. **Always check for errors** in API responses
2. **Implement retry logic** for transient failures
3. **Handle password-protected PDFs** gracefully
4. **Monitor API usage** to avoid rate limits
5. **Log errors** for debugging
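
Points 1 and 2 can be combined in a small wrapper that inspects the `code` field and waits for `retry_after` seconds before trying again. This is a sketch under two assumptions: errors arrive as dictionaries in the shapes shown above rather than as raised exceptions, and only `RATE_LIMIT` is worth retrying. `call_with_retry` is a hypothetical helper.

```python
import time
from typing import Callable, Dict

# Assumption: only rate-limit errors are transient enough to retry.
RETRYABLE_CODES = {"RATE_LIMIT"}


def call_with_retry(call: Callable[[], Dict], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Invoke `call`, retrying while it returns a retryable error response."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = call()
    for _ in range(max_attempts - 1):
        if response.get("code") not in RETRYABLE_CODES:
            break
        # Honor the server's hint; fall back to one second if it is absent.
        sleep(response.get("retry_after", 1))
        response = call()
    return response
```

A call such as `call_with_retry(lambda: get_analysis.remote(analysis_data))` would return either the first non-retryable response or the response from the final attempt. The `sleep` parameter exists so the wait can be replaced in tests.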
## Examples
### Complete Workflow Example
```python
import modal
import asyncio
async def analyze_finances():
    # Connect to Modal app
    app = modal.App.lookup("spend-analyzer-mcp-bmt")
    # Process email statements
    email_config = {
        "email": "user@gmail.com",
        "password": "app_password",
        "imap_server": "imap.gmail.com"
    }
    process_statements = app["process_bank_statements"]
    email_result = process_statements.remote(email_config, days_back=30)
    # Upload additional PDFs
    pdf_contents = {}
    with open("additional_statement.pdf", "rb") as f:
        pdf_contents["additional.pdf"] = f.read()
    analyze_pdfs = app["analyze_uploaded_statements"]
    pdf_result = analyze_pdfs.remote(pdf_contents)
    # Combine analysis data
    combined_analysis = {
        **email_result["analysis"],
        "additional_transactions": pdf_result["total_transactions"]
    }
    # Get AI analysis
    get_analysis = app["get_ai_analysis"]
    # Use Claude for detailed analysis
    claude_analysis = get_analysis.remote(
        combined_analysis,
        "Provide a comprehensive financial health assessment",
        "claude"
    )
    # Use SambaNova for quick insights
    sambanova_analysis = get_analysis.remote(
        combined_analysis,
        "What are my top 3 spending categories?",
        "sambanova"
    )
    print("Claude Analysis:", claude_analysis["ai_analysis"])
    print("SambaNova Analysis:", sambanova_analysis["ai_analysis"])

# Run the analysis
asyncio.run(analyze_finances())
```
### Integration with External Systems
```python
import requests

def call_mcp_webhook(data):
    """Call the MCP webhook endpoint"""
    webhook_url = "https://your-modal-app.modal.run/mcp_webhook"
    mcp_message = {
        "jsonrpc": "2.0",
        "id": "1",
        "method": "tools/call",
        "params": {
            "name": "get_ai_analysis",
            "arguments": data
        }
    }
    # json= serializes the body and sets the Content-Type header
    response = requests.post(webhook_url, json=mcp_message, timeout=60)
    # Raise on HTTP-level failures before trying to parse the body
    response.raise_for_status()
    return response.json()

# Use the webhook; `arguments` must match the tool's input schema
arguments = {"analysis_data": {"spending_insights": [...]}}
result = call_mcp_webhook(arguments)
```
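A JSON-RPC 2.0 response carries either a `result` member or an `error` member with `code` and `message`. Assuming the webhook follows that convention (this document does not show a response body), the return value of `call_mcp_webhook` could be unpacked as sketched here; `MCPError` and `unwrap_response` are hypothetical names.

```python
from typing import Any, Dict


class MCPError(Exception):
    """Raised when a JSON-RPC response contains an `error` member."""

    def __init__(self, code: Any, message: str):
        super().__init__(f"MCP error {code}: {message}")
        self.code = code
        self.message = message


def unwrap_response(response: Dict[str, Any]) -> Any:
    """Return the `result` of a JSON-RPC response, or raise MCPError."""
    error = response.get("error")
    if error is not None:
        raise MCPError(error.get("code"), error.get("message", "unknown error"))
    return response.get("result")
```

Raising an exception keeps protocol-level failures separate from the tool's normal return value.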
## Rate Limits and Quotas
### Claude API
- **Rate Limit:** 1000 requests/minute
- **Token Limit:** 100K tokens/minute
- **Best Practice:** Use for complex analysis
### SambaNova API
- **Rate Limit:** 5000 requests/minute
- **Token Limit:** 500K tokens/minute
- **Best Practice:** Use for quick insights and batch processing
### Modal Functions
- **Concurrent Executions:** Auto-scaled
- **Timeout:** Configurable per function
- **Memory:** 2GB default for PDF processing
## Support and Troubleshooting
### Common Issues
1. **PDF Processing Fails**
   - Check PDF format compatibility
   - Verify password if protected
   - Ensure sufficient memory allocation
2. **Email Connection Issues**
   - Use app-specific passwords
   - Verify IMAP server settings
   - Check firewall/network restrictions
3. **AI API Errors**
   - Verify API keys are valid
   - Check rate limits
   - Monitor token usage
### Getting Help
1. Check the logs: `modal logs spend-analyzer-mcp-bmt`
2. Review error messages and codes
3. Consult the deployment guide
4. Open an issue with detailed error information
For more detailed information, see the [DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md) file.