
Improve model card for mmBERT training checkpoints with metadata and usage

#1
by nielsr - opened

This PR significantly enhances the model card for the mmBERT raw training checkpoints by:

  • Updating the language tag to mul to accurately reflect the model's multilingual coverage (over 1800 languages).
  • Adding pipeline_tag: feature-extraction to improve discoverability on the Hub and enable the automated widget. While this repository contains raw checkpoints, the associated models are commonly used for feature extraction.
  • Specifying library_name: transformers, since the linked GitHub repository provides extensive usage examples with the transformers library.
  • Adding descriptive tags such as bert, multilingual, and encoder-only for better searchability; the resulting front matter is sketched after this list.
  • Updating the main title to reflect the paper's title and linking to the Hugging Face paper page.
  • Incorporating the paper's abstract.
  • Including comprehensive usage examples from the GitHub README's "Quick Start" and "Getting Started" sections, showing how to use the associated mmBERT models (e.g., mmbert-small, mmbert-base) for tasks such as feature extraction, masked language modeling, classification, and retrieval; a minimal sketch of this pattern appears at the end of this description.
  • Clarifying that this repository contains raw training checkpoints and directing users to the model collection for runnable models.
  • Integrating other valuable sections from the GitHub README such as "Model Family", "Training Details", "Evaluation", "FAQ", and "Limitations" to provide a complete overview of the project.
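For reference, here is a minimal sketch of the model card front matter these changes produce, assuming exactly the metadata values listed above (the merged card may carry additional fields such as license or datasets):

```yaml
---
# Model card metadata sketch reflecting the changes described in this PR.
language:
  - mul                            # multilingual tag covering 1800+ languages
pipeline_tag: feature-extraction   # enables the Hub widget and filtering
library_name: transformers         # usage examples rely on the transformers library
tags:
  - bert
  - multilingual
  - encoder-only
---
```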

These changes make the model card more informative, discoverable, and user-friendly on the Hugging Face Hub.
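To give a sense of the usage examples referenced above, the following is a minimal feature-extraction sketch with transformers. The model ID jhu-clsp/mmBERT-base is an assumption here; consult the linked model collection for the exact repository names of the runnable checkpoints.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed model ID; substitute a runnable checkpoint from the mmBERT collection.
model_id = "jhu-clsp/mmBERT-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = [
    "mmBERT is a multilingual encoder.",
    "mmBERT est un encodeur multilingue.",
]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the final hidden states over non-padding tokens
# to obtain one fixed-size embedding per sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, hidden_size)
```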
