gtandon/CodeRankEmbed-compressed

	---
	license: mit
	tags:
	- tensor-compression
	- code-embeddings
	- factorized
	- tltorch
	base_model: nomic-ai/CodeRankEmbed
	---

	# CodeRankEmbed-compressed

	This is a compressed version of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed), produced by factorizing 60 of its layers with TLTorch.

	## Compression Details

	- Compression method: Tensor factorization using TLTorch
	- Factorization type: CP (CANDECOMP/PARAFAC)
	- Rank: 4
	- Number of factorized layers: 60
	- Original model size: 136.73M parameters
	- Compressed model size: 23.62M parameters
	- Compression ratio: 5.79x (82.7% fewer parameters)
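
The figures above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the ratio and reduction from the two reported parameter counts, and illustrates why a rank-4 CP factorization is so compact: it stores one factor matrix of shape (dimension, rank) per mode instead of the full tensor. The 768 x 3072 layer shape is an illustrative assumption, not taken from this model, and the helper `cp_param_count` is hypothetical. It ignores the small per-rank weight vector and any reshaping of the matrix into a higher-order tensor that TLTorch may apply.

```python
from math import prod


def cp_param_count(shape, rank):
    """Parameters in the factor matrices of a rank-`rank` CP decomposition.

    Each mode of size d contributes one (d x rank) factor matrix.
    """
    return rank * sum(shape)


# Parameter counts reported in this model card (in millions).
original_m = 136.73
compressed_m = 23.62

ratio = original_m / compressed_m
reduction_pct = (1 - compressed_m / original_m) * 100
print(f"{ratio:.2f}x smaller, {reduction_pct:.1f}% reduction")

# Hypothetical layer shape, for illustration only (not read from the model).
shape = (768, 3072)
dense = prod(shape)
factored = cp_param_count(shape, rank=4)
print(f"dense: {dense:,} params, rank-4 CP: {factored:,} params")
```

Only the factorized layers shrink this way, which is one reason the overall ratio is far below the per-layer ratio suggested by the toy shape.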

	## Usage

	To use this compressed model, install the required dependencies, then load it with the custom loading script (`load_compressed_model.py`):

	```bash
	pip install torch tensorly tltorch sentence-transformers
	```

	### Loading the model

	```python
	import torch
	import json
	from sentence_transformers import SentenceTransformer
	import tensorly as tl
	from tltorch.factorized_layers import FactorizedLinear, FactorizedEmbedding

	# Set TensorLy backend
	tl.set_backend("pytorch")

	# Load the model structure
	model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

	# Load factorization info
	with open("factorization_info.json", "r") as f:
	    factorized_info = json.load(f)

	# Reconstruct factorized layers (see load_compressed_model.py for full implementation)
	# ... reconstruction code ...

	# Load compressed weights
	checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
	model.load_state_dict(checkpoint["state_dict"], strict=False)

	# Use the model
	embeddings = model.encode(["def hello_world():\n print('Hello, World!')"])
	```
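
Because the snippet above passes `strict=False`, `load_state_dict` skips any key that does not match instead of raising an error, so a layer that was not reconstructed as a factorized layer can silently keep its original weights. A minimal sketch of a sanity check follows, using plain Python sets. The helper `diff_state_keys` and the toy key names are hypothetical and are not part of this repository.

```python
def diff_state_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys in the checkpoint that the model does not define
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Toy key names, for illustration only. With the real objects this would be
# diff_state_keys(model.state_dict().keys(), checkpoint["state_dict"].keys())
model_keys = ["encoder.layer0.weight", "encoder.layer0.bias"]
checkpoint_keys = ["encoder.layer0.bias", "encoder.layer0.factors.0"]

missing, unexpected = diff_state_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

PyTorch reports the same information itself: the value returned by `load_state_dict` has `missing_keys` and `unexpected_keys` fields, and both should be empty once every factorized layer has been reconstructed.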

	## Model Files

	- `pytorch_model.bin`: Compressed model weights
	- `factorization_info.json`: Metadata about factorized layers
	- `tokenizer.json`, `vocab.txt`: Tokenizer files
	- `modules.json`: SentenceTransformer modules configuration

	## Performance

	The compressed model keeps its embeddings close to the original's while being significantly smaller:
	- Embedding quality: average cosine similarity > 0.9 with the original model's embeddings
	- Size: 5.79x fewer parameters (136.73M down to 23.62M)
	- Faster loading and inference on CPU
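
The card does not say how the similarity figure was measured. One plausible reading is that the same inputs were encoded with both models and the per-input cosine similarities were averaged. The sketch below shows that computation in plain Python under that assumption. The function names and the toy 3-dimensional vectors are hypothetical stand-ins for real `model.encode(...)` outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(original, compressed):
    """Average cosine similarity between row i of each embedding list."""
    sims = [cosine_similarity(o, c) for o, c in zip(original, compressed)]
    return sum(sims) / len(sims)


# Toy embeddings, for illustration only (not produced by either model).
original = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
compressed = [[1.0, 0.0, 0.0], [0.0, 3.0, 4.0]]

score = mean_pairwise_similarity(original, compressed)
print(f"average cosine similarity: {score:.2f}")
```

To reproduce the comparison on real data, both embedding lists would need to come from the same inputs, in the same order.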

	## Citation

	If you use this compressed model, please cite the original CodeRankEmbed model:

	```bibtex
	@misc{nomic2024coderankembed,
	  title={CodeRankEmbed},
	  author={Nomic AI},
	  year={2024},
	  url={https://huggingface.co/nomic-ai/CodeRankEmbed}
	}
	```

	## License

	This compressed model inherits its license from the original model; this card's metadata declares it as MIT. Check the original model's license for the full usage terms.