Publications

  • Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Haitao Zheng, and Maosong Sun. OpenPrompt: An Open-source Framework for Prompt-learning. ACL 2022 Demo.

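    A core OpenBMB module for prompt-learning fine-tuning. Between its release in October 2021 and January 2023 it earned 2,300 stars, and it won the ACL 2022 Best Demo Paper Award.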

  • Zhengyan Zhang, Baitao Gong, Yingfa Chen, Xu Han, Guoyang Zeng, Weilin Zhao, Yanxu Chen, Zhiyuan Liu and Maosong Sun. BMCook: A Task-agnostic Compression Toolkit for Big Models. EMNLP 2022 Demo.

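    A core OpenBMB module for big-model compression. It can compress a model by a factor of 10 while retaining 90% of the original model's performance.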

  • Zheni Zeng, Yuan Yao, Zhiyuan Liu, and Maosong Sun. A deep-learning system bridging molecule structure and biomedical text with comprehension comparable to human professionals. Nature Communications 2022.

    An innovative knowledge-rich pre-trained model technique for biomedicine, selected as a Nature Communications Editors' Highlight.

  • Xu Han, Guoyang Zeng, Weilin Zhao, Zhiyuan Liu, Zhengyan Zhang, Jie Zhou, Jun Zhang, Jia Chao, and Maosong Sun. BMInf: An Efficient Toolkit for Big Model Inference and Tuning. ACL 2022 Demo.

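    A core OpenBMB module for efficient inference. It can run a model at the ten-billion-parameter scale on an Nvidia GTX 1060 graphics card with 6 GB of memory.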

  • Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun. CPM-2: Large-scale Cost-effective Pre-trained Language Models. AI Open, 2022.

    The second version of the CPM large model, released in June 2021. It has 11 billion parameters, and its MoE-based variant reaches 198 billion.

  • Xu Han, Zhengyan Zhang, Zhiyuan Liu. Knowledgeable machine learning for natural language processing. Communications of the ACM 2021.

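    A viewpoint article on knowledge-guided machine learning, published in CACM, the flagship publication of the Association for Computing Machinery.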

  • Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun. Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents. AI Open, 2021.

    China's first large model for legal intelligence on long documents; winner of the SMP 2021 Best Paper Award.

  • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun. CPM: A Large-scale Generative Chinese Pre-trained Language Model. AI Open, 2021.

    China's first Chinese-language large model, released in November 2020, with 2.6 billion parameters.

  • Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu. ERNIE: Enhanced Language Representation with Informative Entities. ACL 2019.

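    The world's first knowledge-guided pre-trained model. As of January 2023 it had been cited more than 900 times, ranking sixth in citations among ACL 2019 papers.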