The RedPajama project aims to develop a suite of leading open-source AI models and to rigorously understand the factors that make them perform well. Building on the release of the RedPajama base dataset, which has already inspired numerous open-source models like MPT, OpenLLaMA, and OpenAlpaca, today marks the release of the RedPajama-INCITE family of models, including base, instruction-tuned, and chat versions.
The RedPajama-INCITE models come in 3B and 7B parameter sizes and were trained on the RedPajama base dataset, following the LLaMA recipe as closely as possible. Key takeaways from this release include:
The 3B model offers the best performance in its size class, and it is small and fast enough to run on widely available hardware, including consumer GPUs such as the RTX 2070.
The instruction-tuned models demonstrate strong performance on the HELM benchmarks, making them suitable for various applications such as few-shot learning, entity extraction, classification, and summarization.
The 7B model, which is still undergoing training, already outperforms the Pythia 7B model, highlighting the importance of a larger dataset and the value of the RedPajama base dataset.
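As a rough sanity check on the hardware claim for the 3B model, the weights-only memory footprint can be estimated from the parameter count. This is a back-of-the-envelope sketch, not a measurement: it assumes exactly 3 billion parameters, an 8 GiB card (the RTX 2070's memory size), and ignores activations, the attention cache, and framework overhead. The function name `weight_memory_gib` is made up for this illustration.

```python
# Weights-only memory estimate for a 3B-parameter model (illustrative sketch).
# Assumptions: exactly 3e9 parameters; activations, KV cache, and runtime
# overhead are ignored, so real usage will be higher than these figures.

GIB = 2**30
GPU_MEMORY_GIB = 8  # RTX 2070 memory size


def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


if __name__ == "__main__":
    for label, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
        needed = weight_memory_gib(3e9, bytes_per_param)
        verdict = "fits" if needed < GPU_MEMORY_GIB else "does not fit"
        print(f"{label}: {needed:.1f} GiB of weights, {verdict} in {GPU_MEMORY_GIB} GiB")
```

Under these assumptions, half-precision weights leave some headroom on an 8 GiB card, while full-precision weights alone would exceed it.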
In the coming weeks, the RedPajama team plans to release an improved version of the RedPajama dataset, doubling its size and further enhancing the quality of AI models based on it. The project showcases the potential of open collaborations, which are likely to drive the development of future AI systems.
The RedPajama-INCITE family of models, released under the Apache 2.0 license, includes:
RedPajama-INCITE-Base-3B-v1: A base model that outperforms other open models of similar sizes in benchmarks.
RedPajama-INCITE-Chat-3B-v1: A chat model fine-tuned using data from Dolly 2.0 and Open Assistant over the RedPajama-INCITE-Base-3B-v1 base model.
RedPajama-INCITE-Instruct-3B-v1: A model designed for few-shot prompts, fine-tuned using the same formula as GPT-JT over the RedPajama-INCITE-Base-3B-v1 base model.
RedPajama-INCITE-Base-7B-v0.1: An early preview of the RedPajama 7B model partway through training.
RedPajama-INCITE-Chat-7B-v0.1: An early preview of the chat model, fine-tuned over the RedPajama-INCITE-Base-7B-v0.1 preview base model.
RedPajama-INCITE-Instruct-7B-v0.1: An early preview of the model designed for few-shot prompts, fine-tuned over the RedPajama-INCITE-Base-7B-v0.1 preview base model.
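The naming scheme in the list above (variant, size, version) is regular enough to generate programmatically. The sketch below rebuilds the six names and shows one way a checkpoint might be loaded. It assumes the models are published on the Hugging Face Hub under the `togethercomputer` organization and that the `transformers` library is installed; the helpers `model_name`, `hub_id`, and `load` are hypothetical names introduced only for this example.

```python
# Illustrative sketch: rebuild the RedPajama-INCITE model names and load one.
# Assumption: checkpoints live on the Hugging Face Hub under "togethercomputer".

FAMILY = "RedPajama-INCITE"
VARIANTS = ("Base", "Chat", "Instruct")
RELEASES = {"3B": "v1", "7B": "v0.1"}  # size -> version tag from the list above


def model_name(variant: str, size: str) -> str:
    """Compose a model name such as 'RedPajama-INCITE-Chat-3B-v1'."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if size not in RELEASES:
        raise ValueError(f"unknown size: {size!r}")
    return f"{FAMILY}-{variant}-{size}-{RELEASES[size]}"


def hub_id(variant: str, size: str, org: str = "togethercomputer") -> str:
    """Prefix the model name with the (assumed) Hub organization."""
    return f"{org}/{model_name(variant, size)}"


def load(variant: str, size: str):
    """Download and return (tokenizer, model); needs network access."""
    # Imported lazily so the naming helpers work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = hub_id(variant, size)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)
    return tokenizer, model


if __name__ == "__main__":
    for size in RELEASES:
        for variant in VARIANTS:
            print(model_name(variant, size))
```

Calling `load("Instruct", "3B")` would fetch the instruction-tuned 3B checkpoint if the assumed repository id is correct; the 7B entries are early previews and may be superseded by later versions.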
The open-source community's support, suggestions, and feedback have been invaluable for the RedPajama project. The team is already working on the next version of the RedPajama base dataset, which will be nearly twice the size of the original v1 dataset. As these models continue to evolve, the RedPajama project is poised to make a significant impact on the AI landscape.