CPM-2 is a general bilingual pretrained language model with 11 billion parameters.
CPM-2 is based on an encoder-decoder architecture and covers 7 general language capabilities. An updated version, CPM-2.1, was released in November 2021. CPM-2.1 introduces a generative pre-training task and is trained via the continual learning paradigm, which enhances its generation ability.
Knowledge Inheritance
Exploit existing model knowledge to accelerate training
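One common form of knowledge inheritance is to guide the new model's pretraining with the predictions of a smaller, already-trained teacher. A minimal sketch of such a blended objective (the blend weight `alpha` and the exact loss form are illustrative assumptions, not the paper's recipe):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) between two discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def inheritance_loss(student_probs, teacher_probs, self_loss, alpha=0.5):
    # Blend the student's own pretraining loss with a distillation term
    # pulling it toward the teacher's output distribution (hypothetical form).
    return alpha * self_loss + (1 - alpha) * kl_divergence(teacher_probs, student_probs)

# When student and teacher already agree, only the self-supervised term remains.
loss = inheritance_loss([0.5, 0.5], [0.5, 0.5], self_loss=2.0)
print(loss)
```

The distillation term shrinks as the student catches up to the teacher, so training is dominated by the teacher's signal early on and by the student's own objective later.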
Prompt Tuning
Use prompts to tune the model, which reduces the number of tuned parameters and better elicits the model's capability
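In prompt tuning, the backbone stays frozen and only a short sequence of soft-prompt embeddings prepended to the input is trained. A minimal sketch of why this is cheap, assuming an illustrative 100-token prompt and a 4096-dimensional hidden size (both numbers are assumptions for the example):

```python
def trainable_fraction(model_params, prompt_len, hidden_size):
    """Fraction of parameters updated when only the soft prompt is tuned."""
    prompt_params = prompt_len * hidden_size  # one embedding per prompt token
    return prompt_params / (model_params + prompt_params)

# 11B frozen backbone parameters; only the prompt embeddings are trainable.
frac = trainable_fraction(11_000_000_000, 100, 4096)
print(f"{frac:.6%}")
```

With these sizes, well under 0.01% of all parameters are updated, which is what makes tuning and storing per-task prompts so much lighter than full fine-tuning.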
Mixture of Experts
The model parameters can be expanded via MoE to support training models at the hundred-billion-parameter scale
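The key idea of MoE is that a gate routes each token to one (or a few) of many expert sub-networks, so total parameters grow with the number of experts while per-token compute stays roughly flat. A minimal top-1 routing sketch with toy experts (all weights and functions here are illustrative, not CPM-2's actual layers):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights):
    """Top-1 MoE: score every expert, run only the best one on this token."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    scores = softmax(logits)
    best = max(range(len(experts)), key=lambda i: scores[i])
    # Scale the chosen expert's output by its gate score.
    return [scores[best] * v for v in experts[best](x)], best

# Two toy experts over a 2-d input.
experts = [lambda x: [2 * v for v in x], lambda x: [v + 1 for v in x]]
gate = [[1.0, 0.0], [0.0, 1.0]]
out, chosen = moe_forward([3.0, -1.0], experts, gate)
print(chosen, out)
```

Only `experts[chosen]` runs for this input; doubling the number of experts doubles capacity without doubling the per-token work.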
CPM-2 has strong general language intelligence
Performance of mT5 and CPM-2 with fine-tuning. We use the first 6 datasets, which make up the lite version of CUGE, to compute the overall CUGE scores (%). The numbers in brackets are the CUGE scores (%) for each dataset.
Reading Comprehension
Text Summarization
Text Generation
Text Classification
In development
Text Completion
__ Enter “__” where text should be filled in
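A fill-in prompt of this shape can be prepared by splitting the text on the “__” marker into the segments surrounding each blank. A minimal sketch (the helper name and interface are hypothetical, not the actual CPM-2 API):

```python
def split_fill_in(prompt, blank="__"):
    """Split a fill-in prompt into the text segments around each blank."""
    segments = prompt.split(blank)
    n_blanks = len(segments) - 1  # one blank between each pair of segments
    return segments, n_blanks

segs, n = split_fill_in("CPM-2 is a __ pretrained language model with __ parameters.")
print(n)     # 2 blanks to complete
print(segs)
```

The model would then generate one span per blank, conditioned on the surrounding segments.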