MosaicML presents MPT-7B, the newest addition to our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1 trillion tokens of text and code, matching the quality of LLaMA-7B while being open-source and commercially usable. The model was trained on the MosaicML platform in just 9.5 days without human intervention, costing approximately $200,000. Today, you can train, fine-tune, and deploy your own private MPT models, starting from one of our checkpoints or from scratch. We're also releasing three fine-tuned models alongside MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which supports a context length of 65k tokens.
MPT Model Series
The MPT (MosaicML Pretrained Transformer) model series aims to address the limitations of existing open-source LLMs, such as LLaMA, Pythia, StableLM, and OpenLLaMA. Our MPT series is commercially usable, trained on 1 trillion tokens, capable of handling very long inputs, optimized for fast training and inference, and equipped with efficient open-source training code.
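As a quick illustration of starting from one of these checkpoints, here is a minimal sketch of loading MPT-7B with Hugging Face Transformers and sampling a completion. The Hub ID mosaicml/mpt-7b, the trust_remote_code flag, and the generation settings are assumptions made for the example, not details stated in this post.

```python
# Minimal sketch: load the MPT-7B base checkpoint and generate a short completion.
# The Hub ID and trust_remote_code usage are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mosaicml/mpt-7b"  # assumed Hugging Face Hub ID for the base model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half precision so the 6.7B parameters fit on a single GPU
    trust_remote_code=True,      # MPT ships its custom modeling code alongside the weights
)
model.eval()

inputs = tokenizer("MosaicML is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```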
Model Evaluation
MPT-7B has been rigorously evaluated on a variety of benchmarks and consistently meets the high-quality bar set by LLaMA-7B.
New Models Released
We are releasing four models today:
MPT-7B Base: A decoder-style transformer with 6.7 billion parameters, trained on 1 trillion tokens of text and code.
MPT-7B-StoryWriter-65k+: A model designed to read and write stories with extremely long context lengths, fine-tuned on a filtered fiction subset of the books3 dataset.
MPT-7B-Instruct: A model for short-form instruction following, fine-tuned on a dataset derived from Databricks Dolly-15k and Anthropic’s Helpful and Harmless datasets (a usage sketch follows this list).
MPT-7B-Chat: A chatbot-like model for dialogue generation, fine-tuned on various datasets including ShareGPT-Vicuna, HC3, Alpaca, Helpful and Harmless, and Evol-Instruct.
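To make the difference between the base and fine-tuned checkpoints concrete, the sketch below prompts MPT-7B-Instruct with an Alpaca/Dolly-style instruction template. The Hub ID mosaicml/mpt-7b-instruct and the exact template wording are assumptions for illustration, not details taken from this announcement.

```python
# Sketch of prompting MPT-7B-Instruct; the Hub ID and the instruction template
# are assumptions about how the fine-tuning data was formatted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mosaicml/mpt-7b-instruct"  # assumed Hub ID for the instruct checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, trust_remote_code=True
)
model.eval()

# Assumed Alpaca/Dolly-style template for short-form instruction following.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\nExplain in two sentences what a tokenizer does.\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Strip the prompt tokens so only the model's response is printed.
response = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```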
MosaicML LLM Foundry
In addition to the model checkpoints, we have open-sourced the entire codebase for pretraining, fine-tuning, and evaluating MPT via our new MosaicML LLM Foundry, emphasizing efficiency, ease-of-use, and rigorous attention to detail.
Training and Deploying Custom MPT
To start building and deploying your own custom MPT models on the MosaicML platform, sign up here.
MPT-7B: Matching LLaMA-7B Quality
MPT-7B matches the quality of LLaMA-7B and outperforms other open-source models in the 7B-20B range on standard academic tasks. We compiled 11 open-source benchmarks commonly used for in-context learning (ICL) and evaluated them in an industry-standard manner. Our evaluation suite is open for the community to use and contribute to, ensuring the most rigorous evaluation possible.
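For intuition about what an in-context learning benchmark measures, here is a toy sketch that scores a single multiple-choice question by comparing the log-likelihood the model assigns to each candidate answer. It is illustrative only and is not the open-source evaluation suite itself; the Hub ID and the example question are assumptions.

```python
# Toy sketch of multiple-choice ICL scoring: pick the answer whose tokens the
# model assigns the highest total log-probability. Not the actual eval harness.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mosaicml/mpt-7b"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, trust_remote_code=True
)
model.eval()

def continuation_logprob(context: str, continuation: str) -> float:
    """Sum of log-probabilities the model assigns to `continuation` given `context`."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids=full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1].float(), dim=-1)
    token_logprobs = log_probs.gather(-1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    # Score only the continuation tokens, not the shared context.
    return token_logprobs[:, ctx_len - 1:].sum().item()

question = "Question: What is the capital of France?\nAnswer:"
choices = [" Paris", " Berlin", " Madrid"]
scores = [continuation_logprob(question, c) for c in choices]
print(choices[scores.index(max(scores))])  # expected: " Paris"
```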
Conclusion
With the introduction of MPT-7B, MosaicML has set a new standard for open-source, commercially usable LLMs. We invite businesses and the open-source community to build on this effort, utilizing the efficient and powerful MosaicML LLM Foundry to develop custom models and applications.