open-bmb

CPM2

CPM2 is a general bilingual pretrained language model with 11 billion parameters.
CPM2 is based on the encoder-decoder architecture and has 7 general language capabilities. An updated version, CPM2.1, was released in November 2021. CPM2.1 introduces a generative pre-training task and is trained with the continual learning paradigm, which enhances its generation ability.


Features

Knowledge Inheritance

Exploit the knowledge of existing models to accelerate training

Prompt Tuning

Use prompts to tune the model, which reduces the number of tuned parameters and better elicits the model's performance
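
To give a rough sense of why prompt tuning reduces the number of tuned parameters, here is a minimal sketch that counts the trainable values in a soft prompt while the rest of the model stays frozen. The 11 billion total comes from the description above; the prompt length, the hidden size, and the function name are hypothetical values chosen for illustration, not CPM2's actual configuration.

```python
def prompt_tuning_params(prompt_length: int, hidden_size: int) -> int:
    """Number of trainable values in a soft prompt.

    A soft prompt is a matrix of prompt_length learned vectors,
    each of size hidden_size. All other model weights stay frozen.
    """
    return prompt_length * hidden_size


TOTAL_PARAMS = 11_000_000_000  # model size stated in the description above
PROMPT_LENGTH = 100            # hypothetical: number of soft prompt tokens
HIDDEN_SIZE = 4096             # hypothetical: embedding width

trainable = prompt_tuning_params(PROMPT_LENGTH, HIDDEN_SIZE)
fraction = trainable / TOTAL_PARAMS

print(f"trainable parameters: {trainable:,}")
print(f"share of the full model: {fraction:.6%}")
```

Under these assumed sizes, only a few hundred thousand values are updated per task, so each task needs to store just its own small prompt matrix rather than a full copy of the model.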

Mixture of Experts

MoE expands the model's parameters to support training models at the hundred-billion-parameter scale
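
The idea behind MoE can be sketched with a toy top-1 router: each token is sent to a single expert, so adding experts adds parameters without adding per-token compute. This is a simplified illustration, not CPM2's implementation. The function names, the gate weights, and the scaling "experts" are all hypothetical stand-ins, and the sketch omits the softmax gate weighting and load balancing that practical MoE layers typically use.

```python
from typing import Callable, List

Vector = List[float]
Expert = Callable[[Vector], Vector]


def make_expert(scale: float) -> Expert:
    # Hypothetical expert: elementwise scaling stands in for a feed-forward block.
    return lambda x: [scale * v for v in x]


def route_top1(token: Vector, gate_weights: List[Vector]) -> int:
    # One gate score per expert (dot product), then pick the highest-scoring expert.
    scores = [sum(w * v for w, v in zip(row, token)) for row in gate_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def moe_layer(tokens: List[Vector], gate_weights: List[Vector],
              experts: List[Expert]) -> List[Vector]:
    # Each token is processed by exactly one expert, whatever the expert count.
    return [experts[route_top1(t, gate_weights)](t) for t in tokens]


# Assumed toy setup: two experts and two-dimensional tokens.
gate_weights = [[1.0, 0.0], [0.0, 1.0]]
experts = [make_expert(2.0), make_expert(10.0)]
tokens = [[3.0, 1.0], [1.0, 3.0]]

outputs = moe_layer(tokens, gate_weights, experts)
print(outputs)
```

In this toy setup the first token is routed to the first expert and the second token to the second expert. Adding more entries to `experts` and `gate_weights` grows the layer's parameter count while each token still passes through only one expert.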

Performance

CPM2 has strong general language intelligence

Performance of mT5 and CPM-2 with fine-tuning. We use the first 6 datasets, which make up the lite version of CUGE, to compute the overall CUGE scores (%). The numbers in brackets are the CUGE scores (%) for each dataset.