| title: README | |
| emoji: π | |
| colorFrom: gray | |
| colorTo: blue | |
| sdk: static | |
| pinned: false | |
| # Marin | |
| Marin is an open lab for building foundation models, together. | |
| We're training powerful models from scratch, and sharing and programmatically documenting every step: | |
| the code, the data, the experiments, and the mistakes, all in real time. | |
| We invite anyone who shares our vision of open science and open source to join and contribute, | |
| whether you want to try out a new architecture, training algorithm, dataset, | |
| or evaluation. There is a lot to do! | |
| Find us here: | |
| - [GitHub](https://github.com/marin-community/marin): where the code for reproducing all experiments lives | |
| - [Discord](https://discord.gg/J9CTk7pqcM): come hang out with us, ask questions, and coordinate on research directions | |
| - [Documentation](https://marin.readthedocs.io/en/latest/): where you can find tutorials and explanations | |
| - [DeepWiki](https://deepwiki.com/marin-community/marin): automatically generated Wiki with an integrated chatbot | |
| - [HuggingFace](https://huggingface.co/marin-community): where to find our datasets and models | |
| - [WandB](https://wandb.ai/marin-community): where to find our training runs | |
| - [Homepage](https://marin.community): where you can find this same content on our website | |
| Want to jump in? [Install](https://marin.readthedocs.io/en/latest/tutorials/installation/) the Marin code and | |
| [run your first experiment](https://marin.readthedocs.io/en/latest/tutorials/first-experiment/)! | |
| ## Experiments | |
| Building a foundation model requires countless experiments trying out endless variants of algorithms and datasets. | |
| All the experiments we're doing are captured as [GitHub issues](https://github.com/marin-community/marin/issues?q=is%3Aissue%20label%3Aexperiment). | |
| Here is a [summary](https://marin.readthedocs.io/en/latest/reports/) to get the lay of the land. | |
| ## Speedrun | |
| Have a new architecture or training procedure that's more efficient? | |
| Participate in the [Marin speedrun](https://marin.community/speedrun) competition | |
| (inspired by the [nanogpt speedrun](https://github.com/KellerJordan/modded-nanogpt?tab=readme-ov-file#world-record-history)) | |
| and create the fastest method to train a model to a target quality! | |
| Get started [here](https://marin.readthedocs.io/en/latest/tutorials/submitting-speedrun/). | |
| ## Datashop | |
| Want to add new capabilities to the Marin models? | |
| Go visit our datashop, where you can upload a dataset or craft a prompt to curate a relevant dataset for your task. | |
| ## Models | |
| We have trained a strong 8B-parameter model. Try it out [here](https://huggingface.co/spaces/WillHeld/soft-racoon-test)! | |
| ## Acknowledgements | |
| Marin wouldn't be possible without the generous support of the [Google TPU Research Cloud program](https://sites.research.google/trc/about/). | |
| We also benefit immensely from the broader open ecosystem, whose members have released numerous tools and datasets, including | |
| [AI2](https://allenai.org), [Hugging Face](https://huggingface.co/), and [NVIDIA](https://www.nvidia.com/en-us/research/). | |