---
license: mit
tags:
- tensor-compression
- code-embeddings
- factorized
- tltorch
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed-compressed

This is a tensor-compressed version of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed), produced by factorizing its layers with TLTorch.

## Compression Details

- **Compression method**: Tensor factorization using TLTorch
- **Factorization types**: CP (CANDECOMP/PARAFAC)
- **Ranks used**: 4
- **Number of factorized layers**: 60
- **Original model size**: 136.73M parameters
- **Compressed model size**: 23.62M parameters
- **Compression ratio**: 5.79x (82.7% fewer parameters)

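The ratio and reduction above follow directly from the two parameter counts. The sketch below reproduces that arithmetic and also shows how a rank-4 CP decomposition shrinks a single weight tensor. The 768 x 768 layer shape and the helper names are illustrative assumptions, not values read from this model, and TLTorch may reshape weights into higher-order tensors before factorizing, so the real per-layer counts can differ.

```python
def compression_stats(original_params: float, compressed_params: float) -> tuple[float, float]:
    """Return (compression ratio, percent reduction) for two parameter counts."""
    ratio = original_params / compressed_params
    reduction_pct = (1.0 - compressed_params / original_params) * 100.0
    return ratio, reduction_pct


def cp_param_count(shape: tuple[int, ...], rank: int) -> int:
    """Parameters in a rank-`rank` CP decomposition of a tensor with this shape.

    CP stores one factor matrix of size (dim, rank) per mode,
    plus `rank` scalar weights for the components.
    """
    return rank * sum(shape) + rank


# Figures from the list above, in millions of parameters.
ratio, reduction_pct = compression_stats(136.73, 23.62)
print(f"{ratio:.2f}x, {reduction_pct:.1f}% reduction")

# Hypothetical 768 x 768 weight matrix, factorized directly as a 2-mode tensor.
dense = 768 * 768
factorized = cp_param_count((768, 768), rank=4)
print(f"dense: {dense}, CP rank 4: {factorized}")
```

The whole-model ratio is far lower than the per-layer ratio in this toy case, which is what you would expect if parts of the model are left unfactorized.
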
## Usage

To use this compressed model, install the required dependencies, then load it with the custom loading script (`load_compressed_model.py`):

```bash
pip install torch tensorly tltorch sentence-transformers
```

### Loading the model

```python
import json

import tensorly as tl
import torch
from sentence_transformers import SentenceTransformer
from tltorch.factorized_layers import FactorizedLinear, FactorizedEmbedding

# Set TensorLy backend
tl.set_backend("pytorch")

# Load the model structure (the original, uncompressed architecture)
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

# Load factorization info
with open("factorization_info.json", "r") as f:
    factorized_info = json.load(f)

# Reconstruct factorized layers (see load_compressed_model.py for full implementation)
# ... reconstruction code ...

# Load compressed weights; strict=False tolerates keys that do not match exactly
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"], strict=False)

# Use the model
embeddings = model.encode(["def hello_world():\n    print('Hello, World!')"])
```

## Model Files

- `pytorch_model.bin`: Compressed model weights
- `factorization_info.json`: Metadata about factorized layers
- `tokenizer.json`, `vocab.txt`: Tokenizer files
- `modules.json`: SentenceTransformer modules configuration

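Before running the loading code, it can help to confirm that the files listed above were actually downloaded. The following is a minimal sketch under the assumption that all of them sit in a single local directory. The helper name `missing_model_files` is hypothetical and is not part of this repository.

```python
from pathlib import Path

# File names taken from the "Model Files" list.
EXPECTED_FILES = (
    "pytorch_model.bin",
    "factorization_info.json",
    "tokenizer.json",
    "vocab.txt",
    "modules.json",
)


def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected files that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumes the model files were downloaded into the current directory.
    missing = missing_model_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```
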
## Performance

The compressed model stays close to the original while being much smaller:
- Embedding quality: average cosine similarity with the original model's embeddings is above 0.9
- Size: 5.79x fewer parameters
- Speed: faster loading and faster inference on CPU

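The cosine similarity figure can be checked by embedding the same inputs with both models and averaging the per-input similarity. The sketch below shows only that averaging step on toy vectors. It assumes both models emit embeddings of the same dimension, and the function names are illustrative rather than part of this repository. In practice `original` and `compressed` would be the outputs of each model's `encode` call on the same list of texts.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_cosine(original, compressed):
    """Average cosine similarity between row i of each embedding set."""
    if len(original) != len(compressed):
        raise ValueError("both models must embed the same number of inputs")
    sims = [cosine_similarity(o, c) for o, c in zip(original, compressed)]
    return sum(sims) / len(sims)


# Toy stand-ins for embeddings; real ones would come from the two models.
original = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
compressed = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
print(f"mean cosine similarity: {mean_pairwise_cosine(original, compressed):.4f}")
```
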
## Citation

If you use this compressed model, please cite the original CodeRankEmbed model:

```bibtex
@misc{nomic2024coderankembed,
  title={CodeRankEmbed},
  author={Nomic AI},
  year={2024},
  url={https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```

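## License

This compressed model inherits the license of the original model, [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed), and is tagged `mit` in this card's metadata. Check the original model's license for the authoritative usage terms.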