---
library_name: transformers
license: mit
language:
- hu
base_model:
- jhu-clsp/mmBERT-small
pipeline_tag: token-classification
tags:
- token classification
- hallucination detection
- transformers
- question answer
datasets:
- KRLabsOrg/ragtruth-hu-translated
---


# LettuceDetect: Hungarian Hallucination Detection Model

<p align="center">
  <img src="https://github.com/KRLabsOrg/LettuceDetect/blob/main/assets/lettuce_detective.png?raw=true" alt="LettuceDetect Logo" width="400"/>
</p>

**Model Name:** lettucedect-mmbert-small-hu-v1  
**Organization:** KRLabsOrg  
**GitHub:** https://github.com/KRLabsOrg/LettuceDetect

## Overview

LettuceDetect is a transformer-based model for hallucination detection on context and answer pairs, designed for Retrieval-Augmented Generation (RAG) applications. This model is built on **mmBERT-small**, a multilingual encoder chosen for its extended context support (up to **8192 tokens**). Long context matters here because the model has to read the full retrieved documents to decide whether an answer is supported by them.

**This is our Small Hungarian model, based on mmBERT-small.**

## Model Details

- **Architecture:** mmBERT-small with extended context support (up to 8192 tokens)
- **Task:** Token Classification / Hallucination Detection
- **Training Dataset:** RAGTruth-HU (Hungarian translation of RAGTruth, `KRLabsOrg/ragtruth-hu-translated`)
- **Language:** Hungarian

## How It Works

The model is trained to identify tokens in the answer text that are not supported by the given context. During inference, the model returns token-level predictions which are then aggregated into spans. This allows users to see exactly which parts of the answer are considered hallucinated.
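
The aggregation step can be illustrated with a minimal sketch. It assumes each answer token comes with a character offset pair and a binary label (1 = unsupported), and it merges runs of consecutive flagged tokens into one span. The function name `tokens_to_spans`, the word-level offsets, and the labels below are hypothetical and chosen for illustration; the merging rules and confidence handling inside LettuceDetect may differ.

```python
from typing import Dict, List, Tuple, Union

Span = Dict[str, Union[int, str]]


def tokens_to_spans(
    answer: str,
    offsets: List[Tuple[int, int]],
    labels: List[int],
) -> List[Span]:
    """Merge runs of consecutive tokens labelled 1 into character spans."""
    spans: List[Span] = []
    start = end = None  # boundaries of the span currently being built
    for (tok_start, tok_end), label in zip(offsets, labels):
        if label == 1:
            if start is None:
                start = tok_start  # open a new span
            end = tok_end  # extend the open span to this token
        elif start is not None:
            # A supported token closes the open span.
            spans.append({"start": start, "end": end, "text": answer[start:end]})
            start = end = None
    if start is not None:
        # Close a span that runs to the end of the answer.
        spans.append({"start": start, "end": end, "text": answer[start:end]})
    return spans


# Hypothetical word-level offsets and labels, not real model output.
answer = "Franciaország népessége 125 millió fő."
offsets = [(0, 13), (14, 23), (24, 27), (28, 34), (35, 37), (37, 38)]
labels = [0, 0, 1, 1, 0, 0]
print(tokens_to_spans(answer, offsets, labels))
# [{'start': 24, 'end': 34, 'text': '125 millió'}]
```

Working in character offsets rather than token indices keeps the spans independent of the tokenizer, so they can be mapped straight back onto the answer string.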

## Usage

### Installation

Install the `lettucedetect` package:

```bash
pip install lettucedetect
```

### Using the model

```python
from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/lettucedect-mmbert-small-hu-v1",
    lang="hu",
    trust_remote_code=True
)

contexts = [
    "Franciaország fővárosa Párizs. Franciaország népessége 67 millió fő. Franciaország területe 551 695 km²."
]
question = "Mennyi Franciaország népessége?"
answer = "Franciaország népessége 125 millió fő."

predictions = detector.predict(context=contexts, question=question, answer=answer, output_format="spans")
print("Predictions:", predictions)

# Predictions: [{'start': 0, 'end': 23, 'confidence': 0.9059492349624634, 'text': 'Franciaország népessége'}, {'start': 24, 'end': 34, 'confidence': 0.8549801707267761, 'text': '125 millió'}, {'start': 37, 'end': 38, 'confidence': 0.7141280174255371, 'text': '.'}]
```
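
Because each span carries `start` and `end` character offsets into the answer, the predictions can be turned into a marked-up string for display. The helper below is a small sketch, not part of the `lettucedetect` API: `highlight_spans` and the `[[`/`]]` markers are hypothetical, and the span list is copied from the example output above so the snippet runs without loading the model.

```python
from typing import Dict, List


def highlight_spans(
    answer: str,
    spans: List[Dict],
    open_mark: str = "[[",
    close_mark: str = "]]",
) -> str:
    """Wrap each predicted span of the answer in visible markers."""
    parts = []
    cursor = 0  # position in the answer up to which text has been emitted
    for span in sorted(spans, key=lambda s: s["start"]):
        parts.append(answer[cursor:span["start"]])  # unflagged text before the span
        parts.append(open_mark + answer[span["start"]:span["end"]] + close_mark)
        cursor = span["end"]
    parts.append(answer[cursor:])  # remaining unflagged text
    return "".join(parts)


answer = "Franciaország népessége 125 millió fő."
# Spans copied from the example output shown above.
predictions = [
    {"start": 0, "end": 23, "confidence": 0.9059492349624634, "text": "Franciaország népessége"},
    {"start": 24, "end": 34, "confidence": 0.8549801707267761, "text": "125 millió"},
    {"start": 37, "end": 38, "confidence": 0.7141280174255371, "text": "."},
]
highlighted = highlight_spans(answer, predictions)
print(highlighted)
# [[Franciaország népessége]] [[125 millió]] fő[[.]]
```

The sketch assumes the spans do not overlap, which holds for the output shown here.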


## Performance

**Results on Translated RAGTruth-HU (Class 1: Hallucination)**

We evaluate our Hungarian models on the translated [RAGTruth](https://aclanthology.org/2024.acl-long.585/) dataset. As a prompt-based baseline, we include **meta-llama/Llama-4-Maverick-17B-128E-Instruct**.

| Language  | Model                                         | Precision (%) | Recall (%) | F1 (%)    | Maverick F1 (%) | Δ F1 (pp)  |
|-----------|-----------------------------------------------|---------------|------------|-----------|-----------------|------------|
| Hungarian | meta-llama/Llama-4-Maverick-17B-128E-Instruct | 38.70         | **96.82**  | 55.30     | 55.30           | +0.00      |
| Hungarian | lettucedect-mmBERT-small (ours)               | 70.20         | 72.51      | 71.33     | 55.30           | **+16.03** |
| Hungarian | lettucedect-mmBERT-base (ours)                | **76.62**     | 69.21      | **72.73** | 55.30           | **+17.43** |

*Note:* Percentages are reported for the hallucination class (Class 1). Δ F1 is measured in percentage points vs. the Maverick baseline.
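
As a sanity check on the table, F1 can be recomputed as the harmonic mean of precision and recall, and Δ F1 as the difference from the Maverick F1. The snippet below is a sketch that works from the rounded table values, so the recomputed F1 can differ from the reported one in the last decimal place; `f1_score` is a local helper written for this example, not a library function.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs taken from the table above.
rows = {
    "Llama-4-Maverick": (38.70, 96.82),
    "lettucedect-mmBERT-small": (70.20, 72.51),
    "lettucedect-mmBERT-base": (76.62, 69.21),
}
baseline_f1 = 55.30  # Maverick F1 as reported in the table

for name, (precision, recall) in rows.items():
    f1 = f1_score(precision, recall)
    # Delta is expressed in percentage points relative to the baseline.
    print(f"{name}: F1 = {f1:.2f}, delta = {f1 - baseline_f1:+.2f} pp")
```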

## Citing

If you use the model or the tool, please cite the following paper:

```bibtex
@misc{Kovacs:2025,
      title={LettuceDetect: A Hallucination Detection Framework for RAG Applications}, 
      author={Ádám Kovács and Gábor Recski},
      year={2025},
      eprint={2502.17125},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.17125}, 
}
```