Futyn-Maker's picture
Upload Russian Constructicon span predictor model
bfd2961 verified
metadata
tags:
  - transformers
  - question-answering
  - russian
  - constructicon
  - nlp
  - linguistics
base_model: DeepPavlov/xlm-roberta-large-en-ru
language:
  - ru
pipeline_tag: question-answering
widget:
  - text: Что вам здесь нужно?
    context: Что вам здесь нужно?
    example_title: Question construction span
  - text: Петр так и замер на месте.
    context: Петр так и замер на месте.
    example_title: Adverbial construction span

Russian Constructicon Span Predictor

A question-answering model for identifying construction spans in Russian text. Fine-tuned to locate specific Constructicon pattern implementations within text examples.

Model Details

  • Base model: DeepPavlov/xlm-roberta-large-en-ru
  • Task: Question Answering / Span Prediction
  • Language: Russian
  • Training: QA task where context=example, question=construction pattern, answer=construction span

Usage

Primary Usage (RusCxnPipe Library)

This model is designed for use with the RusCxnPipe library:

from ruscxnpipe import SpanPredictor

predictor = SpanPredictor(model_name="Futyn-Maker/ruscxn-span-predictor")

examples_with_patterns = [
    {
        "example": "Что вам здесь нужно?",
        "patterns": [{"id": "pattern1", "pattern": "что NP-Dat Cop нужно?"}]
    }
]

results = predictor.predict_spans(examples_with_patterns)
span_info = results[0]['patterns'][0]['span']
print(f"Span: '{span_info['span_string']}' at {span_info['span_start']}-{span_info['span_end']}")

Direct Usage (SimpleTransformers)

from simpletransformers.question_answering import QuestionAnsweringModel

model = QuestionAnsweringModel('xlmroberta', 'Futyn-Maker/ruscxn-span-predictor')

# Format: context = Russian text, question = construction pattern
to_predict = [
    {
        "context": "Что вам здесь нужно?",
        "qas": [
            {
                "question": "что NP-Dat Cop нужно?",
                "id": "0"
            }
        ]
    }
]

predictions, _ = model.predict(to_predict)
print(f"Predicted span: {predictions[0]['answer'][0]}")

Training Data

The model was trained on a question-answering dataset where:

  • Context: Russian text examples containing constructions
  • Question: Constructicon patterns
  • Answer: Text spans implementing the construction in the example

Output

Returns the exact text span where a construction pattern is realized in the input text, including start and end character positions.

Framework Versions

  • SimpleTransformers: 0.70.1
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu126