BMCook
The toolkit for big model "slimming". BMCook efficiently compresses big models to improve their inference efficiency.
By combining quantization, pruning, distillation, and MoEfication, it can retain more than 90% of the original model's performance while accelerating inference by up to 10x.
Features
Model Quantization
Up to 4x faster inference using 1/4 of the storage space.
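As a rough illustration of where the 1/4 storage figure comes from, the sketch below quantizes a float32 weight vector to 8-bit integers plus a single scale factor (the function names are hypothetical and not BMCook's API, which operates on whole model layers):

```python
# Illustrative sketch only: symmetric 8-bit quantization of a weight
# vector. Names here are hypothetical, not BMCook's actual API.

def quantize_int8(weights):
    """Map float weights to int8 values plus one float scale."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.02, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
# Each int8 value needs 1 byte instead of 4 for float32: 1/4 the storage.
```

The speedup comes from running matrix multiplications in integer arithmetic on the quantized values rather than in floating point.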
Model Pruning
Pruning 50% of the parameters can roughly double inference speed.
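A minimal sketch of the underlying idea, magnitude pruning (the helper below is hypothetical, not BMCook's interface): the smallest-magnitude half of the weights is zeroed so sparse kernels can skip them.

```python
# Hypothetical sketch of magnitude pruning: zero out the
# smallest-magnitude `sparsity` fraction of the weights.

def prune_magnitude(weights, sparsity=0.5):
    """Return weights with the smallest `sparsity` fraction set to zero."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = prune_magnitude([0.9, -0.1, 0.4, -0.05])
# The zeroed entries can be skipped at inference time, which is
# where the speedup comes from.
```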
Model MoEfication
Reduces linear-layer computation by 80% and can roughly double inference speed.
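The core idea of MoEfication is to split a feed-forward layer into expert groups and run only the experts a gate selects for each input. The sketch below is a deliberately simplified, hypothetical illustration of that routing step, not BMCook's implementation:

```python
# Hypothetical sketch: a feed-forward layer split into "experts";
# only the top-scoring experts run for a given input, so the
# remaining experts' parameters are skipped entirely.

def moefy_forward(x, experts, gate_scores, top_k=1):
    """Run only the top_k highest-scoring experts on input x."""
    ranked = sorted(range(len(experts)),
                    key=lambda i: gate_scores[i], reverse=True)
    # Skipping the inactive experts is the source of the savings.
    return sum(experts[i](x) for i in ranked[:top_k])

experts = [lambda x: x * 2, lambda x: x * 3, lambda x: x * 10]
out = moefy_forward(5, experts, gate_scores=[0.1, 0.7, 0.2], top_k=1)
```

With `top_k=1` out of 5 equal-sized experts, only 20% of the linear-layer parameters would be touched per input, matching the 80% reduction above.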
Model Distillation
Provides teacher supervision for the modules above, helping compressed models recover the original model's performance.
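Distillation-based supervision typically means matching the compressed (student) model's output distribution to the original (teacher) model's. A minimal, stdlib-only sketch of the standard temperature-softened KL loss (a generic formulation, not necessarily BMCook's exact loss):

```python
# Generic sketch of a knowledge-distillation loss: KL divergence
# between temperature-softened teacher and student distributions.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over softened probabilities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

loss = distill_loss([1.0, 2.0, 0.5], [1.2, 2.1, 0.3])
```

The loss is zero when the student exactly matches the teacher and grows as the distributions diverge, so minimizing it pushes the compressed model back toward the original's behavior.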
Supported Methods
Compared to existing compression toolkits, BMCook supports all mainstream acceleration methods for pre-trained language models.
Combination in Any Way
Thanks to decoupled implementations, the compression methods can be freely combined for extreme acceleration.
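One way to picture this decoupling: if each compression step is a pure weights-to-weights transform, any subset can be chained in any order. The pipeline below is a hypothetical sketch of that composition pattern, not BMCook's configuration interface:

```python
# Hypothetical sketch: compression steps as composable
# weights -> weights transforms that can be chained freely.

def prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(len(weights) * sparsity)
    thr = sorted(abs(w) for w in weights)[k - 1] if k else 0.0
    return [0.0 if abs(w) <= thr else w for w in weights]

def fake_quantize(weights, levels=127):
    """Round weights onto a uniform grid (simulated quantization)."""
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

def compress(weights, steps):
    for step in steps:
        weights = step(weights)
    return weights

out = compress([0.8, -0.05, 0.3, 0.01], [prune, fake_quantize])
```

Because each step only consumes and produces weights, swapping the order or dropping a step requires no changes elsewhere, which is the property that lets methods be combined arbitrarily.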