---
license: apache-2.0
language:
- en
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- embeddings
- text-embeddings
library_name: sentence-transformers
base_model: sentence-transformers/all-MiniLM-L6-v2
---

# Helion-V1-Embeddings
Helion-V1-Embeddings is a lightweight text embedding model designed for semantic similarity, search, and retrieval tasks. It converts text into dense vector representations optimized for the Helion ecosystem.
## Model Description
- Developed by: DeepXR
- Model type: Sentence Transformer / Text Embedding Model
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Language: English
- License: Apache 2.0
- Embedding Dimension: 384
- Max Sequence Length: 256 tokens
## Model Parameters
| Parameter | Value | Description |
|---|---|---|
| Architecture | BERT-based | 6-layer transformer encoder |
| Hidden Size | 384 | Dimension of hidden layers |
| Attention Heads | 12 | Number of attention heads |
| Intermediate Size | 1536 | Feed-forward layer size |
| Vocab Size | 30,522 | WordPiece vocabulary |
| Max Position Embeddings | 512 | Maximum sequence length |
| Pooling Strategy | Mean Pooling | Average of token embeddings |
| Output Dimension | 384 | Final embedding size |
| Total Parameters | ~22.7M | Trainable parameters |
| Model Size | ~80MB | Disk footprint |
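To make the mean-pooling row above concrete, here is a small sketch of how mean pooling over token embeddings works when calling the `transformers` API directly. It uses the base checkpoint listed in the table; everything else (sentence, variable names) is illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative: load the base checkpoint named in the table above
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
encoder = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

inputs = tokenizer(["Hello world"], padding=True, truncation=True,
                   max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**inputs).last_hidden_state  # (batch, seq_len, 384)

# Mean pooling: average the token embeddings, ignoring padding positions
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)  # (1, 384)
```

The `sentence-transformers` wrapper shown in the Usage section performs this pooling for you.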
## Intended Use
Helion-V1-Embeddings is designed for:
- Semantic search and information retrieval
- Document similarity comparison
- Clustering and categorization
- Question-answering systems (retrieval component)
- Recommendation systems
- Duplicate detection
### Primary Users
- Developers building search systems
- Data scientists working on NLP tasks
- Applications requiring text similarity
- RAG (Retrieval-Augmented Generation) pipelines
## Key Features
- Fast Inference: Optimized for quick embedding generation
- Compact Size: Small model footprint (~80MB)
- Good Performance: Balanced accuracy and speed
- Easy Integration: Compatible with sentence-transformers library
- Batch Processing: Efficient for large datasets
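For large datasets, batch processing is just the `batch_size` argument to `encode`; a minimal sketch (corpus contents are placeholders):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')

# Encode a large corpus in batches; larger batches generally improve throughput
corpus = [f"Document number {i}" for i in range(10_000)]
embeddings = model.encode(corpus, batch_size=128, show_progress_bar=True)
print(embeddings.shape)  # (10000, 384)
```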
## Usage

### Basic Usage
```python
from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer('DeepXR/Helion-V1-embeddings')

# Encode sentences
sentences = [
    "How do I reset my password?",
    "What is the process for password recovery?",
    "I forgot my login credentials"
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)
```
### Similarity Search
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')

# Encode query and documents
query = "How to train a machine learning model?"
documents = [
    "Machine learning training requires data preprocessing",
    "The best way to cook pasta is boiling water",
    "Neural networks need proper hyperparameter tuning"
]
query_embedding = model.encode(query)
doc_embeddings = model.encode(documents)

# Calculate cosine similarity between the query and each document
similarities = util.cos_sim(query_embedding, doc_embeddings)
print(similarities)
```
### Integration with FAISS
```python
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')

# Create embeddings
documents = ["doc1", "doc2", "doc3"]
embeddings = model.encode(documents)

# Build a FAISS index over the document embeddings
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings.astype('float32'))

# Search for the 3 nearest documents to a query
query_embedding = model.encode(["search query"])
distances, indices = index.search(query_embedding.astype('float32'), 3)
```
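If you prefer cosine similarity over L2 distance, one common variation (an illustrative pattern, not part of the official examples) is to L2-normalize the embeddings and use an inner-product index:

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')
documents = ["doc1", "doc2", "doc3"]

# L2-normalize so that inner product equals cosine similarity
doc_embeddings = model.encode(documents, normalize_embeddings=True).astype('float32')

index = faiss.IndexFlatIP(doc_embeddings.shape[1])  # inner-product index
index.add(doc_embeddings)

query_embedding = model.encode(["search query"], normalize_embeddings=True).astype('float32')
scores, indices = index.search(query_embedding, 3)  # scores are cosine similarities
```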
## Performance

### Benchmark Results
| Task | Score | Notes |
|---|---|---|
| STS Benchmark | ~0.78 | Semantic Textual Similarity |
| Retrieval (BEIR) | ~0.42 | Average across datasets |
| Speed (CPU) | ~2000 sentences/sec | Batch size 32 |
| Speed (GPU) | ~15000 sentences/sec | Batch size 128 |
Note: These are approximate values. Actual performance may vary.
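A rough way to check throughput on your own hardware (the sentence, corpus size, and batch size below are arbitrary choices, not the exact benchmark setup):

```python
import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')
sentences = ["This is a short benchmark sentence."] * 2000

start = time.perf_counter()
model.encode(sentences, batch_size=32, show_progress_bar=False)
elapsed = time.perf_counter() - start
print(f"{len(sentences) / elapsed:.0f} sentences/sec")
```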
## Training Details

### Training Data
The model was fine-tuned on:
- Question-answer pairs
- Semantic similarity datasets
- Document-query pairs
- Paraphrase detection examples
### Training Procedure
- Base Model: sentence-transformers/all-MiniLM-L6-v2
- Training Method: Contrastive learning with cosine similarity
- Loss Function: MultipleNegativesRankingLoss
- Batch Size: 64
- Epochs: 3
- Pooling: Mean pooling
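A minimal sketch of this procedure using the sentence-transformers training API is shown below. The training pairs and output path are hypothetical; this is not the actual training script.

```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Start from the same base model used for Helion-V1-Embeddings
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Hypothetical positive pairs (query, relevant passage)
train_examples = [
    InputExample(texts=["How do I reset my password?",
                        "Use the account settings page to request a password reset."]),
    InputExample(texts=["What is contrastive learning?",
                        "Contrastive learning pulls similar pairs together in embedding space."]),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=3,
    warmup_steps=100,
    output_path="helion-v1-embeddings-finetuned",  # hypothetical output directory
)
```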
## Technical Specifications

### Model Architecture
- Type: Transformer-based encoder
- Layers: 6
- Hidden Size: 384
- Attention Heads: 12
- Parameters: ~22.7M
- Pooling Strategy: Mean pooling
### Input Format
- Max Length: 256 tokens
- Tokenizer: WordPiece
- Normalization: Applied automatically
### Output Format
- Embedding Dimension: 384
- Dtype: float32
- Normalization: L2 normalized (optional)
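To get unit-length vectors (so that dot product equals cosine similarity), pass the corresponding flag to `encode`; a minimal sketch:

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')

# L2-normalized embeddings: each vector has unit norm
embeddings = model.encode(["Hello world"], normalize_embeddings=True)
print(embeddings.dtype)               # float32
print(np.linalg.norm(embeddings[0]))  # ~1.0
```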
## Limitations
- Sequence Length: Limited to 256 tokens; longer texts are truncated (see the chunking sketch after this list)
- Language: Primarily optimized for English
- Domain: General-purpose, may need fine-tuning for specialized domains
- Context: Does not maintain conversation context across multiple inputs
- Model Size: Smaller than state-of-the-art models, trading some accuracy for speed
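One common workaround for the 256-token limit (an illustrative pattern, not a feature of the model itself) is to split long documents into chunks, embed each chunk, and average the results. The helper below is hypothetical:

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('DeepXR/Helion-V1-embeddings')

def embed_long_text(text: str, words_per_chunk: int = 150) -> np.ndarray:
    """Split a long text into word-based chunks and mean-pool the chunk embeddings."""
    words = text.split()
    chunks = [" ".join(words[i:i + words_per_chunk])
              for i in range(0, len(words), words_per_chunk)] or [""]
    chunk_embeddings = model.encode(chunks, normalize_embeddings=True)
    return chunk_embeddings.mean(axis=0)

long_doc = "word " * 1000  # stand-in for a document longer than 256 tokens
print(embed_long_text(long_doc).shape)  # (384,)
```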
## Use Cases

### ✅ Good For:
- Semantic search in document collections
- Finding similar questions/answers
- Content recommendation
- Duplicate detection
- Clustering similar documents
- Quick similarity comparisons
### ❌ Not Suitable For:
- Long document encoding (>256 tokens)
- Real-time generation tasks
- Multilingual applications (without fine-tuning)
- Highly specialized domains without adaptation
- Tasks requiring deep reasoning
## Comparison with Other Models
| Model | Dim | Speed | Accuracy | Size |
|---|---|---|---|---|
| Helion-V1-Embeddings | 384 | Fast | Good | 80MB |
| all-MiniLM-L6-v2 | 384 | Fast | Good | 80MB |
| all-mpnet-base-v2 | 768 | Medium | Better | 420MB |
| text-embedding-ada-002 | 1536 | API-dependent | Best | Hosted API |
## Ethical Considerations
- Bias: May reflect biases present in training data
- Privacy: Do not embed sensitive personal information
- Fairness: Performance may vary across different text types
- Use Responsibly: Consider implications of similarity matching
## Integration Examples

### LangChain Integration
```python
from langchain.embeddings import HuggingFaceEmbeddings
# Note: newer LangChain releases provide this class via the
# langchain_huggingface (or langchain_community) package instead.

embeddings = HuggingFaceEmbeddings(
    model_name="DeepXR/Helion-V1-embeddings"
)

text = "This is a sample document"
embedding = embeddings.embed_query(text)
```
### LlamaIndex Integration
```python
from llama_index.embeddings import HuggingFaceEmbedding
# Note: newer LlamaIndex releases use the import path
# llama_index.embeddings.huggingface instead.

embed_model = HuggingFaceEmbedding(
    model_name="DeepXR/Helion-V1-embeddings"
)

embeddings = embed_model.get_text_embedding("Hello world")
```
## Citation
```bibtex
@misc{helion-v1-embeddings,
  author = {DeepXR},
  title = {Helion-V1-Embeddings: Lightweight Text Embedding Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/DeepXR/Helion-V1-embeddings}
}
```
## Model Card Authors
DeepXR Team