ModelCenter
Big Model Warehouse. ModelCenter implements pre-trained language models (PLMs) on top of the BMTrain backend, supporting efficient, low-resource, and extendable model usage and distributed training.
Features
Easy To Use
Compared to DeepSpeed and Megatron, ModelCenter has better and more flexible code packaging, a Python environment that is easy to configure, and training code that follows a uniform PyTorch style.
More Efficient Memory Utilization
Our implementation reduces the memory footprint severalfold, allowing a larger batch size and more efficient use of the GPU's computational power.
Efficient Distributed Training With Low Resources
With the support of BMTrain, ModelCenter can easily extend ZeRO-3 optimization to any PLM, and we optimize communication and time scheduling for faster distributed training.
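As a concrete illustration, here is a minimal setup sketch, not taken from the original page: the model name, seed value, and launcher flags are assumptions. It shows the BMTrain initialization that enables the ZeRO-3-style parameter sharding described above.

```python
# Sketch of a distributed setup with ModelCenter + BMTrain.
# Assumption: the script is started with a standard PyTorch launcher, e.g.
#   torchrun --nproc_per_node=4 train.py
import bmtrain as bmt
from model_center.model import Bert

# Initialize the distributed environment before building any model; BMTrain
# partitions parameters, gradients, and optimizer states across ranks
# (ZeRO-3 style).
bmt.init_distributed(seed=0)

# Each rank then holds only its shard of the weights, so large models fit
# into limited GPU memory.
model = Bert.from_pretrained("bert-base-uncased")
```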
Powerful Performance
Thanks to BMTrain, ModelCenter achieves training performance that compares favorably with other popular frameworks.
Easy Usage
The API follows the conventions of Hugging Face Transformers, which lowers the barrier to getting started: a training speedup can be obtained through a simple code replacement, as sketched below.
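A minimal before/after sketch of the replacement (hedged: the model name and module paths are assumptions drawn from the two libraries' public interfaces, not snippets preserved from the original page):

```python
# Original code (Hugging Face Transformers style):
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Code after replacement (ModelCenter style) -- the interface mirrors the
# Transformers one, so the change is essentially the import paths plus the
# BMTrain initialization call:
import bmtrain as bmt
from model_center.model import Bert
from model_center.tokenizer import BertTokenizer

bmt.init_distributed(seed=0)  # set up the distributed backend first
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = Bert.from_pretrained("bert-base-uncased")
```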
Supported Models
Encoder
bert-base-cased, bert-base-uncased, bert-large-cased, bert-large-uncased, bert-base-chinese, bert-base-multilingual-cased
Decoder
CPM-1 (large), GPT-2 (base), GPT-2 (medium), GPT-2 (large), GPT-2 (XL), GPT-J (6B)
Encoder-Decoder
CPM-2 (large), T5-small, T5-base, T5-large, T5 (3B), T5 (11B)