
BoolQ T5

This repository contains a T5-base model fine-tuned on the BoolQ dataset for generating true/false question-answer pairs. Leveraging T5’s text-to-text framework, the model can generate natural language questions and their corresponding yes/no answers directly from a given passage.

Model Overview

Built with PyTorch Lightning, this implementation streamlines training, validation, and hyperparameter tuning. By adapting the pre-trained T5-base model to the task of question generation and answer prediction, it effectively bridges comprehension and generation in a single framework.

Data Processing

Input Construction

Each input sample is formatted as follows:

truefalse: [answer] passage: [passage] </s>

Target Construction

Each target sample is formatted as:

question: [question] answer: [yes/no] </s>

The boolean answer is normalized to “yes” or “no” to ensure consistency during training.
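As a minimal sketch of this preprocessing, assuming a standard T5 tokenizer and the 256-token limit listed under Training Details (build_example is an illustrative helper, not code from this repository):

from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

def build_example(passage, question, answer_bool, max_len=256):
    # Normalize the boolean answer to "yes"/"no" for consistency.
    answer = "yes" if answer_bool else "no"
    # The explicit </s> mirrors the input/target formats shown above.
    source = f"truefalse: {answer} passage: {passage} </s>"
    target = f"question: {question} answer: {answer} </s>"
    source_enc = tokenizer(source, max_length=max_len, padding="max_length",
                           truncation=True, return_tensors="pt")
    target_enc = tokenizer(target, max_length=max_len, padding="max_length",
                           truncation=True, return_tensors="pt")
    return source_enc, target_enc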

Training Details

  • Framework: PyTorch Lightning
  • Optimizer: AdamW with linear learning-rate scheduling and warmup (see the sketch after this list)
  • Batch Sizes:
    • Training: 6
    • Evaluation: 6
  • Maximum Sequence Length: 256 tokens
  • Number of Training Epochs: 4
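
A minimal sketch of how the optimizer and scheduler could be wired up in PyTorch Lightning; the learning rate and step counts here are assumptions for illustration, not the values used to train this model:

import pytorch_lightning as pl
from torch.optim import AdamW
from transformers import T5ForConditionalGeneration, get_linear_schedule_with_warmup

class BoolQT5Module(pl.LightningModule):
    def __init__(self, lr=3e-4, warmup_steps=100, total_steps=10_000):  # assumed values
        super().__init__()
        self.model = T5ForConditionalGeneration.from_pretrained("t5-base")
        self.lr, self.warmup_steps, self.total_steps = lr, warmup_steps, total_steps

    def training_step(self, batch, batch_idx):
        # T5 computes the cross-entropy loss internally when labels are provided.
        outputs = self.model(input_ids=batch["input_ids"],
                             attention_mask=batch["attention_mask"],
                             labels=batch["labels"])
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        optimizer = AdamW(self.parameters(), lr=self.lr)
        scheduler = get_linear_schedule_with_warmup(
            optimizer, num_warmup_steps=self.warmup_steps,
            num_training_steps=self.total_steps)
        # Step the scheduler every batch for a smooth linear decay.
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]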

Evaluation Metrics

The model’s performance was evaluated using BLEU scores for both the generated questions and answers; the question-generation scores are reported below:

Metric    Score
BLEU-1    0.5143
BLEU-2    0.3950
BLEU-3    0.3089
BLEU-4    0.2431

Note: These metrics offer a quantitative assessment of the model’s ability to generate coherent and relevant questions.
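
For reference, cumulative BLEU-1 through BLEU-4 can be computed with NLTK roughly as follows; this is an illustrative sketch, not the evaluation script behind the numbers above:

from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def question_bleu(references, hypotheses):
    # references/hypotheses: lists of reference and generated question strings.
    refs = [[r.split()] for r in references]  # NLTK expects token lists per reference
    hyps = [h.split() for h in hypotheses]
    smooth = SmoothingFunction().method1
    return {f"BLEU-{n}": corpus_bleu(refs, hyps,
                                     weights=tuple([1.0 / n] * n),
                                     smoothing_function=smooth)
            for n in range(1, 5)}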

How to Use

You can run inference with the Hugging Face Transformers pipeline:

from transformers import pipeline

generator = pipeline(
    "text2text-generation",
    model="Fares7elsadek/boolq-t5-base-question-generation",
)

# Replace the bracketed placeholders with a yes/no answer and your passage,
# following the input format described above.
input_text = "truefalse: [answer] passage: [Your passage here] </s>"
result = generator(input_text)
print(result)
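
The pipeline returns a list of dictionaries whose generated_text field mirrors the target format above (question: ... answer: yes/no). A small sketch for splitting that string, assuming the output follows the format (parse_output is an illustrative helper, not part of the repository):

def parse_output(text):
    # Expects the target format: "question: ... answer: yes|no"
    question_part, _, answer_part = text.partition("answer:")
    return question_part.replace("question:", "").strip(), answer_part.strip()

question, answer = parse_output(result[0]["generated_text"])
print(question, answer)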