--- tags: - transformers - question-answering - russian - constructicon - nlp - linguistics base_model: DeepPavlov/xlm-roberta-large-en-ru language: - ru pipeline_tag: question-answering widget: - text: "Что вам здесь нужно?" context: "Что вам здесь нужно?" example_title: "Question construction span" - text: "Петр так и замер на месте." context: "Петр так и замер на месте." example_title: "Adverbial construction span" --- # Russian Constructicon Span Predictor A question-answering model for identifying construction spans in Russian text. Fine-tuned to locate specific Constructicon pattern implementations within text examples. ## Model Details - **Base model:** [DeepPavlov/xlm-roberta-large-en-ru](https://huggingface.co/DeepPavlov/xlm-roberta-large-en-ru) - **Task:** Question Answering / Span Prediction - **Language:** Russian - **Training:** QA task where context=example, question=construction pattern, answer=construction span ## Usage ### Primary Usage (RusCxnPipe Library) This model is designed for use with the [RusCxnPipe](https://github.com/your-username/ruscxnpipe) library: ```python from ruscxnpipe import SpanPredictor predictor = SpanPredictor(model_name="Futyn-Maker/ruscxn-span-predictor") examples_with_patterns = [ { "example": "Что вам здесь нужно?", "patterns": [{"id": "pattern1", "pattern": "что NP-Dat Cop нужно?"}] } ] results = predictor.predict_spans(examples_with_patterns) span_info = results[0]['patterns'][0]['span'] print(f"Span: '{span_info['span_string']}' at {span_info['span_start']}-{span_info['span_end']}") ``` ### Direct Usage (SimpleTransformers) ```python from simpletransformers.question_answering import QuestionAnsweringModel model = QuestionAnsweringModel('xlmroberta', 'Futyn-Maker/ruscxn-span-predictor') # Format: context = Russian text, question = construction pattern to_predict = [ { "context": "Что вам здесь нужно?", "qas": [ { "question": "что NP-Dat Cop нужно?", "id": "0" } ] } ] predictions, _ = model.predict(to_predict) print(f"Predicted span: {predictions[0]['answer'][0]}") ``` ## Training Data The model was trained on a question-answering dataset where: - **Context:** Russian text examples containing constructions - **Question:** Constructicon patterns - **Answer:** Text spans implementing the construction in the example ## Output Returns the exact text span where a construction pattern is realized in the input text, including start and end character positions. ## Framework Versions - SimpleTransformers: 0.70.1 - Transformers: 4.51.3 - PyTorch: 2.7.0+cu126