	---
	title: README
	emoji: 📈
	colorFrom: gray
	colorTo: blue
	sdk: static
	pinned: false
	---

	# Marin

	Marin is an open lab for building foundation models, together.
	We're training powerful models from scratch, and sharing and programmatically documenting every step:
	the code, the data, the experiments, the mistakes...all in real time.
	We invite anyone who shares our vision of open science and open source to join and contribute,
	whether you want to try out a new architecture, training algorithm, dataset,
	or evaluation. There is a lot to do!

	Find us here:

	- [GitHub](https://github.com/marin-community/marin): where the code for reproducing all experiments lives
	- [Discord](https://discord.gg/J9CTk7pqcM): come hang out with us, ask questions, and coordinate on research directions
	- [Documentation](https://marin.readthedocs.io/en/latest/): where you can find tutorials and explanations
	- [DeepWiki](https://deepwiki.com/marin-community/marin): automatically generated Wiki with an integrated chatbot
	- [Hugging Face](https://huggingface.co/marin-community): where to find our datasets and models
	- [WandB](https://wandb.ai/marin-community): where to find our training runs
	- [Homepage](https://marin.community): where to see this content on another page

	Want to jump in? [Install](https://marin.readthedocs.io/en/latest/tutorials/installation/) the Marin code and
	[run your first experiment](https://marin.readthedocs.io/en/latest/tutorials/first-experiment/)!

	## Experiments

	Building a foundation model requires countless experiments trying out endless variants of algorithms and datasets.
	All the experiments we're doing are captured as [GitHub issues](https://github.com/marin-community/marin/issues?q=is%3Aissue%20label%3Aexperiment).
	Here is a [summary](https://marin.readthedocs.io/en/latest/reports/) to get the lay of the land.

	## Speedrun 🏃

	Have a new architecture or training procedure that's more efficient?
	Participate in the [Marin speedrun](https://marin.community/speedrun) competition
	(inspired by the [nanogpt speedrun](https://github.com/KellerJordan/modded-nanogpt?tab=readme-ov-file#world-record-history))
	and create the fastest method to train a model to a certain quality!
	Get started [here](https://marin.readthedocs.io/en/latest/tutorials/submitting-speedrun/).

	## Datashop 🛠️

	Want to add new capabilities to the Marin models?
	Go visit our datashop, where you can upload a dataset or craft a prompt to curate a relevant dataset for your task.

	## Models 🌐

	We have trained a strong 8B-parameter model. Try it out [here](https://huggingface.co/spaces/WillHeld/soft-racoon-test)!

	## Acknowledgements

	Marin wouldn't be possible without the generous support of the [Google TPU Research Cloud program](https://sites.research.google/trc/about/).
	We also benefit immensely from the broader open ecosystem, whose contributors have released numerous tools and datasets, including
	[AI2](https://allenai.org), [Hugging Face](https://huggingface.co/), and [NVIDIA](https://www.nvidia.com/en-us/research/).