
Power of Chinese AI Models


Introduction

After the DeepSeek R1 bloodbath in the markets, people have started paying far more attention to China: the West is now looking East, and even many in the East are turning their attention toward Chinese labs.

I have been tracking these models for some time, so I thought I would summarize them in one place for my readers.

Open source: 🚀

Partially or fully closed source: 🔒

List of Chinese Models

| Developer | Model Series | Models Included | Features |
|---|---|---|---|
| Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenBuddy | OpenBuddy 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA 🚀 | Dialogue, instruction-following |
| Shanghai AI Lab | Fengshenbang Series | Fengshenbang-13B 🚀, Fengshenbang-7B 🚀 | General-purpose, multilingual |
| IDEA Research | Ziya Series | Ziya-LLaMA 🚀, Ziya-13B 🚀 | Dialogue, instruction-following |
| Tsinghua University | CPM Series | CPM-1 🚀, CPM-2 🚀, CPM-3 🚀 | Early Chinese LLMs |
| Huawei | PanGu | PanGu 🔒 | Large-scale, multilingual |
| Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA 🚀, Chinese Alpaca 🚀 | Dialogue, instruction-following |
| Fudan University | MOSS | MOSS 🚀 | Dialogue, general-purpose |
| Zhipu AI | ChatGLM Series | ChatGLM3 🚀, ChatGLM2 🚀, ChatGLM 🚀, GLM-4 🚀 | Chinese dialogue, multi-turn, long-context |
| Alibaba Cloud | Qwen Series | Qwen-1.8B 🚀, Qwen-7B 🚀, Qwen-14B 🚀, Qwen-72B 🚀, Qwen-2.5-1M 🚀 | Multimodal, multilingual, 32K tokens, strong performance on benchmarks |
| Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B 🚀, Baichuan-13B 🚀, Baichuan2 🚀 | High performance, quantized versions |
| Shanghai AI Lab | InternLM Series | InternLM 🚀, InternLM-Chat 🚀 | General-purpose, long-context |
| 01.AI | Yi Series | Yi-1.0 🚀, Yi-6B 🚀, Yi-34B 🚀 | Multilingual, long-context |
| DeepSeek AI | DeepSeek Series | DeepSeek-V2 🚀, DeepSeek-LLM-67B 🚀, DeepSeek-R1 🚀 | High performance, Chinese & English, advanced reasoning for math and coding |
| Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B 🚀, XVERSE-13B 🚀, XVERSE-65B 🚀 | Multilingual, 256K tokens |
| Peking University | YuLan Series | YuLan-Base-126B 🚀, YuLan-Chat-3-126B 🚀 | Multilingual, large-scale pretraining |
| Sichuan AI University | gLAW | LAW 🚀, LAWMiner 🚀, LLAMA 🚀, Fuzz 🚀, Mingcha 🚀 | Specialized for legal tasks |
| Baidu | ERNIE | ERNIE 3.0 Titan 🔒 | Knowledge-enhanced, 260 billion parameters, supports multiple industries |
| ByteDance | Doubao | Doubao 1.5 Pro 🔒 | Better than GPT-4o in knowledge retention, coding, and reasoning; optimized for lower hardware costs |
| Tencent | Hunyuan | Hunyuan 🔒 | Supports image and text generation, logical reasoning; aimed at enterprise use |
| Moonshot AI | Kimi | Kimi k1.5 🔒 | Matches or outperforms OpenAI o1; focused on solving complex problems |
| SenseTime | SenseNova | SenseNova 🔒 | Includes models for natural language processing, content generation, data annotation |
| MiniMax | MiniMax-Text | MiniMax-Text-01 🔒 | Large parameter count (456 billion), outperforms on some benchmarks, large context window |
| Kuaishou | Kling | Kling 🔒 | Text-to-video model, free to the public, simulates real-world motion and physics |
| iFlytek | iFlytek Spark | iFlytek Spark V4.0 🔒 | Improved core capabilities; ranks high in international tests compared to GPT-4 Turbo |
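The open/closed legend above can also be expressed as data, which makes the list easy to filter programmatically. A minimal sketch (not from any official API; the entries are just a small sample of rows from the table, and the field names are my own):

```python
# A few sample rows from the table above; "open_source" encodes the
# 🚀 (open) vs 🔒 (partially or fully closed) legend.
models = [
    {"developer": "DeepSeek AI", "model": "DeepSeek-R1", "open_source": True},
    {"developer": "Alibaba Cloud", "model": "Qwen-72B", "open_source": True},
    {"developer": "Zhipu AI", "model": "GLM-4", "open_source": True},
    {"developer": "Baidu", "model": "ERNIE 3.0 Titan", "open_source": False},
    {"developer": "ByteDance", "model": "Doubao 1.5 Pro", "open_source": False},
]

# Split the list by license status.
open_models = [m["model"] for m in models if m["open_source"]]
closed_models = [m["model"] for m in models if not m["open_source"]]

print("Open source:", open_models)
print("Closed source:", closed_models)
```

The same structure extends naturally to the full table if you want to sort or group the models by developer or feature.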
