---
license: mit
tags:
  - vector-database
  - semantic-search
  - embeddings
  - llm
  - memory
  - hnsw
  - rust
  - python
library_name: arms-hat
pipeline_tag: feature-extraction
---

# HAT: Hierarchical Attention Tree

**A novel index structure for AI memory systems that achieves 100% recall while building indexes 70x faster than HNSW.**

**Also: A new database paradigm for any domain with known hierarchy + semantic similarity.**

[![PyPI](https://img.shields.io/pypi/v/arms-hat.svg)](https://pypi.org/project/arms-hat/)
[![crates.io](https://img.shields.io/crates/v/arms-hat.svg)](https://crates.io/crates/arms-hat)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Rust](https://img.shields.io/badge/Rust-1.70+-orange.svg)](https://www.rust-lang.org/)
[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/)

---

## Architecture

<p align="center">
  <img src="images/fig01_architecture.jpg" alt="HAT Architecture" width="800"/>
</p>

HAT exploits the **known hierarchy** in AI conversations: sessions contain documents, documents contain chunks. This structural prior enables O(log n) queries with 100% recall.

---

## Key Results

<p align="center">
  <img src="images/fig09_summary_results.jpg" alt="Summary Results" width="800"/>
</p>

| Metric | HAT | HNSW | Improvement |
|--------|-----|------|-------------|
| **Recall@10** | **100%** | 70% | +30 pts |
| **Build Time** | 30ms | 2.1s | **70x faster** |
| **Query Latency** | 3.1ms | - | Production-ready |

*Benchmarked on hierarchically-structured AI conversation data*

---

## Recall Comparison

<p align="center">
  <img src="images/fig02_recall_comparison.jpg" alt="HAT vs HNSW Recall" width="700"/>
</p>

HAT achieves **100% recall** where HNSW achieves only ~70% on hierarchically-structured data.

---

## Build Time

<p align="center">
  <img src="images/fig03_build_time.jpg" alt="Build Time Comparison" width="700"/>
</p>

HAT builds indexes **70x faster** than HNSW - critical for real-time applications.

---

## The Problem

Large language models have finite context windows. A 10K context model can only "see" the most recent 10K tokens, losing access to earlier conversation history.

**Current solutions fall short:**
- Longer context models: Expensive to train and run
- Summarization: Lossy compression that discards detail
- RAG retrieval: Re-embeds and recomputes attention every query

## The HAT Solution

<p align="center">
  <img src="images/fig06_hat_vs_rag.jpg" alt="HAT vs RAG" width="800"/>
</p>

HAT exploits **known structure** in AI workloads. Unlike general vector databases that treat data as unstructured point clouds, AI conversations have inherent hierarchy:

```
Session (conversation boundary)
  └── Document (topic or turn)
       └── Chunk (individual message)
```
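
In code, this hierarchy maps directly onto the API calls shown in Quick Start below. A minimal sketch; `embed()` is a placeholder for your embedding model, not part of the library:

```python
from arms_hat import HatIndex

index = HatIndex.cosine(1536)

index.new_session()                              # conversation boundary
index.new_document()                             # first topic
index.add(embed("How do I fine-tune a model?"))  # chunk = one message
index.add(embed("Start with a small dataset."))

index.new_document()                             # topic shift, same session
index.add(embed("Now, what about deployment?"))
```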

### The Hippocampus Analogy

<p align="center">
  <img src="images/fig05_hippocampus.jpg" alt="Hippocampus Analogy" width="800"/>
</p>

HAT mirrors human memory architecture - functioning as an **artificial hippocampus** for AI systems.

---

## How It Works

### Beam Search Query

<p align="center">
  <img src="images/fig10_beam_search.jpg" alt="Beam Search" width="800"/>
</p>

HAT uses beam search through the hierarchy:

```
1. Start at root
2. At each level, score children by cosine similarity to query
3. Keep top-b candidates (beam width)
4. Return top-k from leaf level
```

**Complexity:** O(b · d · c) = O(log n) when the tree is balanced, where b is the beam width, d the tree depth, and c the average number of children per node. For example, with b = 4, d = 3, and c = 32, a query scores at most 4 · 3 · 32 = 384 candidates regardless of corpus size.
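
The sketch below walks the same procedure over a generic tree of `(centroid, children)` nodes. The `Node` layout is assumed for illustration and is not HAT's internal representation:

```python
from dataclasses import dataclass, field

import numpy as np

@dataclass
class Node:
    centroid: np.ndarray   # mean of descendant vectors; the raw vector at a leaf
    children: list = field(default_factory=list)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def beam_search(root: Node, query: np.ndarray, k: int = 10, beam: int = 4) -> list:
    frontier, leaves = [root], []
    while frontier:
        candidates = []
        for node in frontier:
            if node.children:
                candidates.extend(node.children)  # descend one level
            else:
                leaves.append(node)               # reached the chunk level
        # keep only the top-b candidates by similarity to the query
        candidates.sort(key=lambda n: cosine(n.centroid, query), reverse=True)
        frontier = candidates[:beam]
    leaves.sort(key=lambda n: cosine(n.centroid, query), reverse=True)
    return leaves[:k]
```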

### Consolidation Phases

<p align="center">
  <img src="images/fig08_consolidation.jpg" alt="Consolidation Phases" width="800"/>
</p>

Inspired by sleep-staged memory consolidation, HAT maintains index quality through incremental consolidation.
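
As an illustration only, not HAT's documented algorithm: one plausible maintenance pass recomputes parent centroids bottom-up so that beam-search routing stays accurate as new chunks arrive. It reuses the `Node` sketch from the beam-search example above:

```python
import numpy as np

def consolidate(node):
    # Illustrative sketch: refresh each parent centroid from its children;
    # leaves keep their raw vectors. Not the library's actual procedure.
    for child in node.children:
        consolidate(child)
    if node.children:
        node.centroid = np.mean([c.centroid for c in node.children], axis=0)
```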

---

## Scale Performance

<p align="center">
  <img src="images/fig07_scale_performance.jpg" alt="Scale Performance" width="700"/>
</p>

HAT maintains **100% recall** at every tested scale, while HNSW's recall fluctuates between 44.5% and 67.5%.

| Scale | HAT Build | HNSW Build | HAT R@10 | HNSW R@10 |
|-------|-----------|------------|----------|-----------|
| 500   | 16ms      | 1.0s       | **100%** | 55%       |
| 1000  | 25ms      | 2.0s       | **100%** | 44.5%     |
| 2000  | 50ms      | 4.3s       | **100%** | 67.5%     |
| 5000  | 127ms     | 11.9s      | **100%** | 55%       |

---

## End-to-End Pipeline

<p align="center">
  <img src="images/fig04_pipeline.jpg" alt="Integration Pipeline" width="800"/>
</p>

### Core Claim

> **A 10K context model with HAT achieves 100% recall on 60K+ tokens with 3.1ms latency.**

| Messages | Tokens | Fits in 10K window | Recall | Latency | Memory |
|----------|--------|--------------------|--------|---------|--------|
| 1000 | 30K | 33% | 100% | 1.7ms | 1.6MB |
| 2000 | 60K | 17% | 100% | 3.1ms | 3.3MB |
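
A minimal sketch of the retrieval side of this pipeline. Here `embed()`, `store` (an ID-to-text map), and `llm()` are placeholders for your own components; only `index.near()` comes from `arms_hat`:

```python
# Hypothetical glue code: HAT as external memory for a small-context LLM.
def answer(query_text: str) -> str:
    hits = index.near(embed(query_text), k=10)           # retrieve from HAT
    context = "\n".join(store[hit.id] for hit in hits)   # map IDs back to text
    prompt = f"Context:\n{context}\n\nQuestion: {query_text}"
    return llm(prompt)                                   # fits in a 10K window
```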

---

## Quick Start

### Python

```python
from arms_hat import HatIndex

# Create index (1536 dimensions for OpenAI embeddings)
index = HatIndex.cosine(1536)

# Placeholder vectors; in practice, use your embedding model's output
embedding = [0.0] * 1536
query_embedding = [0.0] * 1536

# Add messages with automatic hierarchy
msg_id = index.add(embedding)  # returns the new point's ID

# Session/document management
index.new_session()   # Start new conversation
index.new_document()  # Start new topic

# Query
results = index.near(query_embedding, k=10)
for result in results:
    print(f"ID: {result.id}, Score: {result.score:.4f}")

# Persistence
index.save("memory.hat")
loaded = HatIndex.load("memory.hat")
```

### Rust

```rust
use hat::{HatIndex, HatConfig};

// Create index
let config = HatConfig::default();
let mut index = HatIndex::new(config, 1536);

// Placeholder vectors; in practice, use your embedding model's output
let embedding = vec![0.0f32; 1536];
let query = vec![0.0f32; 1536];

// Add points
let id = index.add(&embedding);

// Query
let results = index.search(&query, 10);
```

---

## Installation

### Python

```bash
pip install arms-hat
```

### From Source (Rust)

```bash
git clone https://github.com/automate-capture/hat.git
cd hat
cargo build --release
```

### Python Development

```bash
cd python
pip install maturin
maturin develop
```

---

## Project Structure

```
hat/
├── src/                  # Rust implementation
│   ├── lib.rs           # Library entry point
│   ├── index.rs         # HatIndex implementation
│   ├── container.rs     # Tree node types
│   ├── consolidation.rs # Background maintenance
│   └── persistence.rs   # Save/load functionality
├── python/              # Python bindings (PyO3)
│   └── arms_hat/        # Python package
├── benchmarks/          # Performance comparisons
├── examples/            # Usage examples
├── paper/               # Research paper (PDF)
├── images/              # Figures and diagrams
└── tests/               # Test suite
```

---

## Reproducing Results

```bash
# Run HAT vs HNSW benchmark
cargo test --test phase31_hat_vs_hnsw -- --nocapture

# Run real embedding dimension tests
cargo test --test phase32_real_embeddings -- --nocapture

# Run persistence tests
cargo test --test phase33_persistence -- --nocapture

# Run end-to-end LLM demo
python examples/demo_hat_memory.py
```

---

## When to Use HAT

**HAT is ideal for:**
- AI conversation memory (chatbots, agents)
- Session-based retrieval systems
- Any hierarchically-structured vector data
- Systems requiring deterministic behavior
- Cold-start scenarios (no training needed)

**Use HNSW instead for:**
- Unstructured point clouds (random embeddings)
- Static knowledge bases (handbooks, catalogs)
- When approximate recall is acceptable

---

## Beyond AI Memory: A New Database Paradigm

HAT represents a fundamentally new approach to indexing: **exploiting known structure rather than learning it**.

| Database Type | Structure | Semantics |
|---------------|-----------|-----------|
| Relational | Explicit (foreign keys) | None |
| Document | Implicit (nesting) | None |
| Vector (HNSW) | Learned from data | Yes |
| **HAT** | **Explicit + exploited** | **Yes** |

Traditional vector databases treat embeddings as unstructured point clouds, spending compute to *discover* topology. HAT inverts this: **known hierarchy is free information - use it.**

### General Applications

Any domain with **hierarchical structure + semantic similarity** benefits from HAT; a sketch mapping one example onto HAT's API follows the list:

- **Legal/Medical Documents:** Case → Filing → Paragraph → Sentence
- **Code Search:** Repository → Module → Function → Line
- **IoT/Sensor Networks:** Facility → Zone → Device → Reading
- **E-commerce:** Catalog → Category → Product → Variant
- **Research Corpora:** Journal → Paper → Section → Citation
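
For instance, a legal corpus could reuse the same three-level API from Quick Start, mapping Case → session, Filing → document, and Paragraph → chunk. A hypothetical sketch; `cases` and `embed()` stand in for your own data and embedding model:

```python
from arms_hat import HatIndex

index = HatIndex.cosine(1536)

for case in cases:
    index.new_session()                       # Case      -> session level
    for filing in case.filings:
        index.new_document()                  # Filing    -> document level
        for paragraph in filing.paragraphs:
            index.add(embed(paragraph.text))  # Paragraph -> chunk level
```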

### The Core Insight

> *"Position IS relationship. No foreign keys needed - proximity defines connection."*

HAT combines the structural guarantees of document databases with the semantic power of vector search, without the computational overhead of learning topology from scratch.

---

## Citation

```bibtex
@article{hat2026,
  title={Hierarchical Attention Tree: Extending LLM Context Through Structural Memory},
  author={Young, Lucas and Automate Capture Research},
  year={2026},
  url={https://research.automate-capture.com/hat}
}
```

---

## Paper

📄 **[Read the Full Paper (PDF)](paper/HAT_Context_Extension_Young_2026.pdf)**

---

## License

MIT License - see [LICENSE](LICENSE) for details.

---

## Links

- **Research Site:** [research.automate-capture.com/hat](https://research.automate-capture.com/hat)
- **Main Site:** [automate-capture.com](https://automate-capture.com)