NO |
1 |
ChatGPT |
2022/11/30 |
Text |
OpenAI |
Multilingual |
https://openai.com/blog/chatgpt/ |
~100B |
- |
- |
https://chat.openai.com/ |
NO |
2 |
Galactica |
2022/11/16 |
Text |
Meta |
English |
https://arxiv.org/abs/2211.09085 |
1.3B , 6.7B , 30B , 120B |
- |
- |
- |
NO |
3 |
ERNIE-ViLG 2.0 |
2022/10/27 |
Text, Vision |
Baidu |
Chinese |
https://arxiv.org/abs/2210.15257 |
24B |
- |
- |
- |
NO |
4 |
WeLM |
2022/10/12 |
Text |
Tencent |
Chinese |
https://arxiv.org/abs/2209.10372 |
1.3B , 2.7B , 10B |
- |
- |
https://welm.weixin.qq.com/docs/api/ |
NO |
5 |
Magneto |
2022/10/12 |
Text |
Microsoft |
English |
https://arxiv.org/abs/2210.06423 |
1B |
- |
- |
- |
NO |
6 |
Imagen Video |
2022/10/5 |
Text, Vision |
Google |
English |
https://arxiv.org/abs/2210.02303 |
11.6B |
- |
- |
- |
NO |
7 |
Whisper |
2022/9/21 |
Audio |
OpenAI |
Multilingual |
https://cdn.openai.com/papers/whisper.pdf |
1.55B |
https://github.com/openai/whisper |
https://github.com/openai/whisper |
- |
NO |
8 |
Sparrow |
2022/9/20 |
Text |
DeepMind |
English |
https://arxiv.org/abs/2209.14375 |
70B |
- |
- |
- |
NO |
9 |
CodeGeeX |
2022/9/19 |
Code |
Tsinghua University, Peng Cheng Laboratory, Zhipu.AI |
- |
http://keg.cs.tsinghua.edu.cn/codegeex/ |
13B |
- |
https://github.com/THUDM/CodeGeeX |
- |
NO |
10 |
CPM-Ant |
2022/9/16 |
Text |
OpenBMB |
Chinese |
https://www.openbmb.org/en/community/blogs/blogpage?id=98afef2ce45f4fe9a4bc15a66d7ccb92 |
1B , 3B , 7B , 10B |
https://github.com/OpenBMB/CPM-Live/tree/master/cpm-live#model-checkpoints |
https://github.com/OpenBMB/CPM-Live/tree/master/cpm-live |
- |
NO |
11 |
PaLI |
2022/9/14 |
Text, Vision |
Google |
Multilingual |
https://arxiv.org/abs/2209.06794 |
3B , 15B , 17B |
- |
- |
- |
NO |
12 |
BEiT-3 |
2022/8/22 |
Text, Vision |
Microsoft |
English |
https://arxiv.org/abs/2208.10442 |
1.9B |
https://huggingface.co/docs/transformers/main/model_doc/beit |
https://github.com/microsoft/unilm/tree/master/beit |
- |
NO |
13 |
Atlas |
2022/8/8 |
Text |
Meta, ENS PSL University, University College London, Inria |
English |
https://arxiv.org/abs/2208.03299 |
3B , 11B |
- |
- |
- |
NO |
14 |
GLM-130B |
2022/8/5 |
Text |
Tsinghua University, Zhipu.AI |
English, Chinese |
http://keg.cs.tsinghua.edu.cn/glm-130b/posts/glm-130b/ |
130B |
https://docs.google.com/forms/d/e/1FAIpQLSehr5Dh_i3TwACmFFi8QEgIVNYGmSPwV0GueIcsUev0NEfUug/viewform |
https://github.com/THUDM/GLM-130B |
- |
NO |
15 |
AlexaTM 20B |
2022/8/2 |
Text |
Amazon |
Multilingual |
https://arxiv.org/abs/2208.01448 |
20B |
https://github.com/amazon-research/alexa-teacher-models |
- |
- |
NO |
16 |
FIM |
2022/7/28 |
Code, Text |
OpenAI |
English |
https://arxiv.org/abs/2207.14255 |
1.4B , 2.8B , 6.9B |
- |
- |
- |
NO |
17 |
PanGu-Coder |
2022/7/22 |
Code |
Huawei |
- |
https://arxiv.org/abs/2207.11280 |
2.6B |
- |
- |
- |
NO |
18 |
ESM-2 |
2022/7/21 |
Protein |
Meta, New York University, Stanford University, MIT |
- |
https://www.biorxiv.org/content/10.1101/2022.07.20.500902v1 |
3B , 15B |
- |
- |
- |
NO |
19 |
BLOOM |
2022/7/12 |
Text |
BigScience |
Multilingual |
https://bigscience.huggingface.co/blog/bloom |
1.3B , 2.5B , 6.3B , 175B |
https://huggingface.co/bigscience/bloom/tree/main |
https://github.com/huggingface/transformers/tree/main/src/transformers/models/bloom |
- |
NO |
20 |
NLLB |
2022/7/11 |
Text |
Meta, UC Berkeley, Johns Hopkins University |
Multilingual |
https://arxiv.org/abs/2207.04672 |
dense: 1.3B , 3.3B |
https://github.com/facebookresearch/fairseq/blob/nllb/README.md#multilingual-translation-models |
https://github.com/facebookresearch/fairseq/blob/nllb/examples/nllb/modeling/README.md |
- |
NO |
21 |
Minerva |
2022/6/29 |
Text |
Google |
English |
https://arxiv.org/abs/2206.14858 |
8B , 62B , 540B |
- |
- |
- |
NO |
22 |
ProGen2 |
2022/6/27 |
Protein |
Salesforce, Johns Hopkins University, Columbia University |
- |
https://arxiv.org/pdf/2206.13517.pdf |
2.7B , 6.4B |
https://storage.googleapis.com/sfr-progen-research/checkpoints/progen2-xlarge.tar.gz |
https://github.com/salesforce/progen |
- |
NO |
23 |
LIMoE |
2022/6/26 |
Text, Vision |
Google |
English |
https://arxiv.org/abs/2206.02770 |
5.6B |
- |
- |
- |
NO |
24 |
YaLM |
2022/6/23 |
Text |
Yandex |
English, Russian |
https://medium.com/yandex/yandex-publishes-yalm-100b-its-the-largest-gpt-like-neural-network-in-open-source-d1df53d0e9a6 |
100B |
https://github.com/yandex/YaLM-100B#downloading-checkpoint |
https://github.com/yandex/YaLM-100B |
- |
NO |
25 |
Parti |
2022/6/22 |
Text, Vision |
Google |
English |
https://arxiv.org/abs/2206.10789 |
3B , 20B |
- |
- |
- |
NO |
26 |
GODEL |
2022/6/22 |
Text |
Microsoft, Columbia University |
English |
https://arxiv.org/abs/2206.11309 |
2.7B |
https://github.com/Microsoft/GODEL#models |
https://github.com/Microsoft/GODEL |
- |
NO |
27 |
Unified-IO |
2022/6/17 |
Text, Vision |
AI2, University of Washington |
English |
https://arxiv.org/abs/2206.08916 |
2.8B |
- |
- |
- |
NO |
28 |
AlexaTM |
2022/6/15 |
Text |
Amazon |
Multilingual |
https://arxiv.org/abs/2206.07808 |
2.68B , 9.9B |
- |
- |
- |
NO |
29 |
SwinV2-MoE |
2022/6/7 |
Vision |
Microsoft |
- |
https://arxiv.org/abs/2206.03382 |
1B , 2B |
- |
https://github.com/microsoft/Swin-Transformer |
- |
NO |
30 |
OBERT |
2022/6/1 |
Text |
OPPO |
Chinese |
https://blog.51cto.com/u_15273780/5440502 |
~1B |
- |
- |
- |
NO |
31 |
CogVideo |
2022/5/29 |
Text, Vision |
Tsinghua University, BAAI |
Chinese |
https://arxiv.org/abs/2205.15868 |
9.4B |
https://github.com/THUDM/CogVideo#download |
https://github.com/THUDM/CogVideo |
- |
NO |
32 |
Imagen |
2022/5/23 |
Text, Vision |
Google |
English |
https://arxiv.org/abs/2205.11487 |
7.6B |
- |
- |
- |
YES |
33 |
ERNIE 3.0 Zeus |
2022/5/20 |
Text |
Baidu |
Chinese |
https://baijiahao.baidu.com/s?id=1733603775259242015&wfr=spider&for=pc |
~100B |
- |
- |
https://wenxin.baidu.com/younger/apiDetail?id=20006 |
NO |
34 |
RankGen |
2022/5/19 |
Text |
University of Massachusetts Amherst, Google |
English |
https://arxiv.org/abs/2205.09726 |
1.2B |
https://huggingface.co/kalpeshk2011 |
https://github.com/martiansideofthemoon/rankgen |
- |
NO |
35 |
Gato |
2022/5/12 |
Text, Vision |
DeepMind |
English |
https://arxiv.org/abs/2205.06175 |
1.2B |
- |
- |
- |
NO |
36 |
HunYuan(混元) |
2022/5/11 |
Text |
Tencent |
Chinese |
http://ex.chinadaily.com.cn/exchange/partners/82/rss/channel/cn/columns/snl9a7/stories/WS628df605a3101c3ee7ad730e.html |
~10B |
- |
- |
- |
NO |
37 |
UL2 |
2022/5/10 |
Text |
Google |
English |
https://arxiv.org/abs/2205.05131 |
20B |
https://github.com/google-research/google-research/tree/master/ul2#checkpoints |
https://github.com/google-research/t5x |
- |
NO |
38 |
CoCa |
2022/5/4 |
Text, Vision |
Google |
English |
https://arxiv.org/abs/2205.01917 |
2.1B |
- |
- |
- |
NO |
39 |
OPT |
2022/5/2 |
Text |
Meta |
English |
https://arxiv.org/abs/2205.01068 |
1.3B , 2.7B , 6.7B , 13B , 30B , 66B , 175B |
https://github.com/facebookresearch/metaseq/tree/main/projects/OPT#pretrained-model-weights |
https://github.com/facebookresearch/metaseq |
- |
NO |
40 |
Flamingo |
2022/4/28 |
Text, Vision |
DeepMind |
English |
https://arxiv.org/abs/2204.14198 |
80B |
- |
- |
- |
NO |
41 |
CogView2 |
2022/4/28 |
Text, Vision |
Tsinghua University, BAAI |
English, Chinese |
https://arxiv.org/abs/2204.14217 |
6B |
https://model.baai.ac.cn/model-detail/100041 |
https://github.com/THUDM/CogView2 |
- |
NO |
42 |
mGPT |
2022/4/15 |
Text |
SberDevices, HSE University, AI Research Institute |
Multilingual |
https://arxiv.org/abs/2204.07580 |
1.3B , 13B |
https://huggingface.co/sberbank-ai/mGPT |
https://github.com/ai-forever/mgpt |
- |
NO |
43 |
GPT-NeoX |
2022/4/14 |
Text |
EleutherAI |
English |
https://arxiv.org/abs/2204.06745 |
20B |
https://github.com/EleutherAI/gpt-neox#download-links |
https://github.com/EleutherAI/gpt-neox |
- |
NO |
44 |
NOOR |
2022/4/13 |
Text |
Technology Innovation Institute |
Arabic |
https://www.tii.ae/news/technology-innovation-institute-announces-launch-noor-worlds-largest-arabic-nlp-model |
10B |
- |
- |
- |
NO |
45 |
METRO-LM |
2022/4/13 |
Text |
Microsoft |
English |
https://arxiv.org/abs/2204.06644 |
5.4B |
- |
- |
- |
NO |
46 |
DALL-E 2 |
2022/4/13 |
Text, Vision |
OpenAI |
English |
https://arxiv.org/abs/2204.06125 |
6.5B |
- |
- |
https://labs.openai.com/waitlist |
NO |
47 |
InCoder |
2022/4/12 |
Code |
Facebook, University of Washington, UC Berkeley, TTIC, CMU |
- |
https://arxiv.org/abs/2204.05999 |
1.3B , 6.7B |
https://sites.google.com/view/incoder-code-models |
https://github.com/dpfried/incoder/blob/main/README.md |
- |
NO |
48 |
PaLM |
2022/4/5 |
Text |
Google |
English |
https://arxiv.org/abs/2204.02311 |
8B , 62B , 540B |
- |
- |
- |
NO |
49 |
Chinchilla |
2022/3/29 |
Text |
DeepMind |
English |
https://arxiv.org/abs/2203.15556 |
70B |
- |
- |
- |
NO |
50 |
Benetnasch(瑶光) |
2022/3/28 |
Text |
Singularity AI |
Chinese |
https://vr.sina.com.cn/2022-03-28/doc-imcwiwss8619202.shtml |
~10B |
- |
- |
https://openapi.singularity-ai.com/index.html#/ |
NO |
51 |
CodeGen |
2022/3/25 |
Code |
Salesforce |
- |
https://arxiv.org/abs/2203.13474 |
2.7B , 6.1B , 16.1B |
https://github.com/salesforce/CodeGen#setup |
https://github.com/salesforce/CodeGen |
- |
NO |
52 |
EVA-2 |
2022/3/17 |
Text |
Tsinghua University, BAAI, York University |
Chinese |
https://arxiv.org/abs/2203.09313 |
2.8B |
https://wudaoai.cn/model/detail/EVA#download |
https://github.com/thu-coai/EVA/ |
- |
NO |
53 |
AlphaCode |
2022/3/16 |
Code |
DeepMind |
- |
https://arxiv.org/abs/2203.07814 |
1.1B , 2.8B , 8.7B , 41.1B |
- |
- |
- |
NO |
54 |
InstructGPT |
2022/3/4 |
Text |
OpenAI |
English |
https://arxiv.org/abs/2203.02155 |
1.3B , 6B , 175B |
- |
- |
- |
NO |
55 |
DeepNet |
2022/3/1 |
Text |
Microsoft |
English |
https://arxiv.org/abs/2203.00555 |
3.2B |
- |
- |
- |
NO |
56 |
PolyCoder |
2022/2/26 |
Code |
CMU |
- |
https://arxiv.org/abs/2202.13169 |
2.7B |
https://github.com/VHellendoorn/Code-LMs#available-models |
https://github.com/VHellendoorn/Code-LMs |
- |
NO |
57 |
SEER |
2022/2/16 |
Vision |
Meta, Inria |
- |
https://arxiv.org/abs/2202.08360 |
1.5B , 10B |
https://github.com/facebookresearch/vissl/tree/main/projects/SEER#pretrained-models-weights |
https://github.com/facebookresearch/vissl/tree/main/projects/SEER |
- |
NO |
58 |
Cedille |
2022/2/7 |
Text |
Cedille AI |
French |
https://arxiv.org/abs/2202.03371 |
6B |
https://github.com/coteries/cedille-ai#mesh-transformer |
https://github.com/coteries/cedille-ai#why-is-this-repository-empty |
https://app.cedille.ai/ |
NO |
59 |
Megatron-Turing NLG |
2022/1/28 |
Text |
Microsoft, NVIDIA |
English |
https://arxiv.org/abs/2201.11990 |
530B |
- |
- |
- |
NO |
60 |
LaMDA |
2022/1/20 |
Text |
Google |
English |
https://arxiv.org/abs/2201.08239 |
2B , 8B , 137B |
- |
- |
- |
NO |
61 |
CM3 |
2022/1/19 |
Text, Vision |
Facebook |
English |
https://arxiv.org/abs/2201.07520 |
2.7B , 13B |
- |
- |
- |
NO |
62 |
ERNIE-ViLG |
2021/12/31 |
Text, Vision |
Baidu |
Chinese |
https://arxiv.org/pdf/2112.15283.pdf |
10B |
- |
- |
https://wenxin.baidu.com/younger/apiDetail?id=20008 |
NO |
63 |
ERNIE 3.0 Titan |
2021/12/23 |
Text |
Peng Cheng Laboratory, Baidu |
Chinese |
https://arxiv.org/abs/2112.12731 |
260B |
- |
- |
- |
NO |
64 |
XGLM |
2021/12/20 |
Text |
Meta |
Multilingual |
https://arxiv.org/abs/2112.10668 |
1.7B , 2.9B , 7.5B |
https://github.com/facebookresearch/fairseq/tree/main/examples/xglm#pre-trained-models |
https://github.com/facebookresearch/fairseq/tree/main/examples/xglm |
- |
NO |
65 |
MOE LM |
2021/12/20 |
Text |
Meta |
English |
https://arxiv.org/abs/2112.10684 |
dense: 1.3B , 2.7B , 6.7B , 13B |
https://github.com/facebookresearch/fairseq/tree/main/examples/moe_lm#pre-trained-models |
https://github.com/facebookresearch/fairseq/tree/main/examples/moe_lm |
- |
NO |
66 |
GLIDE |
2021/12/20 |
Text, Vision |
OpenAI |
English |
https://arxiv.org/abs/2112.10741 |
5B |
- |
https://github.com/openai/glide-text2im |
- |
NO |
67 |
WebGPT |
2021/12/17 |
Text |
OpenAI |
English |
https://arxiv.org/abs/2112.09332 |
13B , 175B |
- |
- |
- |
NO |
68 |
LongT5 |
2021/12/15 |
Text |
Google |
English |
https://arxiv.org/abs/2112.07916 |
3B |
https://github.com/google-research/longt5#released-model-checkpoints |
https://github.com/google-research/longt5 |
- |
NO |
69 |
GLaM |
2021/12/13 |
Text |
Google |
English |
https://arxiv.org/abs/2112.06905 |
dense: 1.7B , 8.7B , 137B |
- |
- |
- |
NO |
70 |
Retro |
2021/12/8 |
Text |
DeepMind |
English |
https://arxiv.org/abs/2112.04426 |
1.5B , 7.5B |
- |
- |
- |
NO |
71 |
Gopher |
2021/12/8 |
Text |
DeepMind |
English |
https://storage.googleapis.com/deepmind-media/research/language-research/Training%20Gopher.pdf |
1.4B , 7.1B , 280B |
- |
- |
- |
NO |
72 |
CodeParrot |
2021/12/8 |
Code |
Hugging Face |
- |
https://huggingface.co/blog/codeparrot |
1.5B |
https://huggingface.co/codeparrot/codeparrot |
https://github.com/huggingface/transformers/tree/main/examples/research_projects/codeparrot |
- |
NO |
73 |
GPT-JT |
2022/11/29 |
Text |
Together |
English |
https://www.together.xyz/blog/releasing-v1-of-gpt-jt-powered-by-open-source-ai |
6B |
https://huggingface.co/togethercomputer/GPT-JT-6B-v1 |
https://huggingface.co/togethercomputer/GPT-JT-6B-v1 |
- |
NO |
74 |
Zhouwenwang(周文王) |
2021/11/22 |
Text |
IDEA |
Chinese |
https://idea.edu.cn/fengshenbang-lm.html |
1.3B |
https://huggingface.co/IDEA-CCNL/Zhouwenwang-Unified-1.3B |
https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/models/roformer |
- |
NO |
75 |
Yuyuan(余元) |
2021/11/22 |
Text |
IDEA |
Chinese |
https://idea.edu.cn/fengshenbang-lm.html |
3.5B |
https://huggingface.co/IDEA-CCNL/YuyuanQA-GPT2-3.5B |
https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/wenzhong_qa |
- |
NO |
76 |
Wenzhong(闻仲) |
2021/11/22 |
Text |
IDEA |
Chinese |
https://idea.edu.cn/fengshenbang-lm.html |
3.5B |
https://huggingface.co/IDEA-CCNL/Wenzhong-GPT2-3.5B |
https://fengshenbang-doc.readthedocs.io/zh/latest/docs/%E9%97%BB%E4%BB%B2%E7%B3%BB%E5%88%97/Wenzhong-GPT2-3.5B.html |
- |
NO |
77 |
ExT5 |
2021/11/22 |
Text |
Google |
English |
https://arxiv.org/abs/2111.10952 |
3B , 11B |
- |
NO |
78 |
Erlangshen(二郎神) |
2021/11/22 |
Text |
IDEA |
Chinese |
https://idea.edu.cn/fengshenbang-lm.html |
1.3B |
https://huggingface.co/IDEA-CCNL/Erlangshen-MegatronBert-1.3B |
https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/pretrain_erlangshen |
- |
NO |
79 |
Bigan(比干) |
2021/11/22 |
Text |
IDEA |
Chinese |
https://idea.edu.cn/fengshenbang-lm.html |
1.1B |
https://huggingface.co/IDEA-CCNL/Bigan-Transformer-XL-denoise-1.1B |
https://fengshenbang-doc.readthedocs.io/zh/latest/docs/%E6%AF%94%E5%B9%B2%E7%B3%BB%E5%88%97/Bigan-Transformer-XL-denoise-1.1B.html |
- |
NO |
80 |
BASIC |
2021/11/19 |
Vision |
Google, MIT |
- |
https://arxiv.org/abs/2111.10050 |
3B |
- |
- |
- |
YES |
81 |
Swin Transformer V2 |
2021/11/18 |
Vision |
Microsoft |
- |
https://arxiv.org/abs/2111.09883 |
3B |
- |
https://github.com/microsoft/Swin-Transformer |
- |
NO |
82 |
PERKS |
2021/11/4 |
Text |
Kuaishou |
Chinese |
https://github.com/KuaiSearchPERKS/PERKS/ |
~1B |
- |
- |
- |
NO |
83 |
M-CTC-T |
2021/10/30 |
Audio |
Facebook, McGill University, Mila |
Multilingual |
https://arxiv.org/abs/2111.00161 |
1.06B |
https://github.com/flashlight/wav2letter/tree/main/recipes/mling_pl#inference |
https://github.com/flashlight/wav2letter/tree/main/recipes/mling_pl |
- |
NO |
84 |
TI-NLP |
2021/10/19 |
Text |
Tencent |
Chinese |
https://www.iheima.com/article-325531.html |
~10B |
- |
- |
- |
NO |
85 |
T0 |
2021/10/15 |
Text |
BigScience |
English |
https://arxiv.org/abs/2110.08207 |
3B , 11B |
https://github.com/bigscience-workshop/t-zero#released-checkpoints |
NO |
86 |
MixQG |
2021/10/15 |
Text |
Salesforce |
English |
https://arxiv.org/abs/2110.08175 |
3B |
https://github.com/salesforce/QGen/tree/main/MixQG#released-model-checkpoints |
https://github.com/salesforce/QGen/tree/main/MixQG |
- |
NO |
87 |
ShenNong(神农) |
2021/10/13 |
Text |
Tencent |
Chinese |
https://mp.weixin.qq.com/s/CavGiy1Rz0MJVtcxXdSn0A |
~1B |
- |
- |
- |
NO |
88 |
Mengzi(孟子) |
2021/10/13 |
Text |
Shanghai JiaoTong University, Beijing Institute of Technology, Beijing Jiaotong University, Peking University, Langboat Technology |
Chinese |
https://arxiv.org/abs/2110.06696 |
1B |
- |
https://github.com/Langboat/Mengzi |
- |
NO |
89 |
Yuan(源) 1.0 |
2021/10/10 |
Text |
Inspur |
Chinese |
https://arxiv.org/abs/2110.04725 |
13B , 245B |
- |
NO |
90 |
M6-10T |
2021/10/8 |
Text, Vision |
Alibaba |
Chinese |
https://arxiv.org/abs/2110.03888 |
dense: 1.4B |
- |
- |
NO |
91 |
Zidong.Taichu(紫东太初) |
2021/9/27 |
Audio, Text, Vision |
Institute of Automation |
Chinese |
http://www.ia.cas.cn/xwzx/kydt/202109/t20210927_6215538.html |
~1B , ~10B , |
NO |
92 |
Z-code M3 |
2021/9/22 |
Text |
Microsoft |
Multilingual |
https://arxiv.org/abs/2109.10465 |
1.8B , 3B , |
NO |
93 |
T5-Efficient |
2021/9/22 |
Text |
Google, DeepMind |
English |
https://arxiv.org/abs/2109.10686 |
3B , 11B , |
NO |
94 |
PLATO-XL |
2021/9/20 |
Text |
Baidu |
English |
https://arxiv.org/abs/2109.09519 |
11B |
https://github.com/PaddlePaddle/Knover/blob/develop/projects/PLATO-XL/README.md#pre-trained-dialogue-generation-model |
https://github.com/PaddlePaddle/Knover/ |
- |
NO |
95 |
ShenZhou(神舟) |
2021/9/19 |
Text |
Tencent |
Chinese |
https://mp.weixin.qq.com/s/PODShmOo0tg9cmchNhzvtw |
~10B |
- |
- |
- |
NO |
96 |
CoAtNet |
2021/9/15 |
Vision |
Google |
English |
https://arxiv.org/abs/2106.04803 |
1.47B , 2.44B |
- |
NO |
97 |
HyperCLOVA |
2021/9/10 |
Text |
NAVER, Search Solutions |
Korean |
https://arxiv.org/abs/2109.04650 |
1.3B , 6.9B , |
NO |
98 |
Macaw |
2021/9/6 |
Text |
AI2 |
English |
https://arxiv.org/abs/2109.02593 |
3B , 11B |
https://github.com/allenai/macaw#available-models |
NO |
99 |
FLAN |
2021/9/3 |
Text |
Google |
English |
https://arxiv.org/abs/2109.01652 |
137B |
- |
https://github.com/google-research/flan |
- |
NO |
100 |
ProteinLM |
2021/8/17 |
Protein |
Tsinghua University, BAAI, Tencent |
- |
https://arxiv.org/abs/2108.07435 |
3B |
https://github.com/THUDM/ProteinLM#download-proteinlm |
https://github.com/THUDM/ProteinLM |
- |
NO |
101 |
Jurassic-1 |
2021/8/11 |
Text |
AI21 Labs |
English |
https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf |
7.5B , 17B , |
NO |
102 |
baseline-1.5B |
2021/8/4 |
Text |
Cohere |
English |
https://arxiv.org/abs/2108.07790 |
1.5B |
- |
- |
- |
NO |
103 |
EVA |
2021/8/3 |
Text |
Tsinghua University, BAAI |
Chinese |
https://arxiv.org/abs/2108.01547 |
2.8B |
https://wudaoai.cn/model/detail/EVA#download |
https://github.com/thu-coai/EVA/ |
- |
NO |
104 |
BlenderBot 2 |
2021/7/16 |
Text |
Facebook |
English |
https://ai.facebook.com/blog/blender-bot-2-an-open-source-chatbot-that-builds-long-term-memory-and-searches-the-internet/ |
2.7B |
https://parl.ai/projects/blenderbot2/ |
https://parl.ai/projects/blenderbot2/ |
- |
NO |
105 |
ProtTrans |
2021/7/7 |
Protein |
Technical University of Munich, Med AI Technology, Google, NVIDIA, Seoul National University, Oak Ridge National Laboratory |
- |
https://doi.org/10.1109/TPAMI.2021.3095381 |
3B , 11B |
https://github.com/agemagician/ProtTrans#%EF%B8%8F-models-availability |
NO |
106 |
Codex |
2021/7/7 |
Code |
OpenAI |
- |
https://arxiv.org/abs/2107.03374 |
2.5B , 12B |
- |
NO |
107 |
ERNIE 3.0 |
2021/7/5 |
Text |
Baidu |
Chinese |
https://arxiv.org/abs/2107.02137 |
10B |
- |
- |
- |
NO |
108 |
CPM-2 |
2021/6/20 |
Text |
Tsinghua University, BAAI |
Chinese |
https://arxiv.org/abs/2106.10715 |
dense: 11B |
https://github.com/OpenBMB/ModelCenter#supported-models |
https://github.com/OpenBMB/ModelCenter |
NO |
109 |
Motian(摩天) |
2021/6/15 |
Text |
Tencent |
Chinese |
https://mp.weixin.qq.com/s/HQL0Hk49UR6kVNtrvcXEGA |
~1B |
- |
- |
- |
NO |
110 |
V-MOE |
2021/6/10 |
Vision |
Google |
- |
https://arxiv.org/abs/2106.05974 |
14.7B |
- |
- |
- |
NO |
111 |
GPT-J |
2021/6/9 |
Text |
EleutherAI |
English |
https://www.infoq.com/news/2021/07/eleutherai-gpt-j/ |
6B |
https://github.com/kingoflolz/mesh-transformer-jax/#links |
https://github.com/kingoflolz/mesh-transformer-jax/ |
- |
NO |
112 |
ViT |
2021/6/8 |
Vision |
Google |
- |
https://arxiv.org/abs/2106.04560 |
1B , 1.8B |
- |
NO |
113 |
Wudao(悟道) 2.0 |
2021/6/1 |
Text, Vision |
BAAI |
English, Chinese |
https://www.sohu.com/na/469857971_473283 |
1750B |
- |
- |
- |
NO |
114 |
ByT5 |
2021/5/28 |
Text |
Google |
Multilingual |
https://arxiv.org/abs/2105.13626 |
1.2B , 3.7B , |
NO |
115 |
CogView |
2021/5/26 |
Text, Vision |
Tsinghua University, Alibaba, BAAI |
Chinese |
https://arxiv.org/abs/2105.13290 |
4B |
https://github.com/THUDM/CogView#download |
https://github.com/THUDM/CogView |
- |
NO |
116 |
QAConv |
2021/5/14 |
Text |
Salesforce, HKUST |
English |
https://arxiv.org/abs/2105.06912 |
3B |
https://github.com/salesforce/QAConv#trained-models |
https://github.com/salesforce/QAConv |
- |
NO |
117 |
XLM-R |
2021/5/2 |
Text |
Facebook |
Multilingual |
https://arxiv.org/abs/2105.00572 |
3.5B , 10.7B |
https://github.com/facebookresearch/fairseq/tree/main/examples/xlmr#pre-trained-models |
NO |
118 |
PanGu(盘古)-α |
2021/4/26 |
Text |
Peng Cheng Laboratory, Huawei |
Chinese |
https://arxiv.org/abs/2104.12369 |
2.6B , 13B , |
NO |
119 |
PLUG |
2021/4/19 |
Text |
Alibaba |
Chinese |
https://www.infoq.cn/article/efiho75sqsvqlvftruke |
27B |
https://www.alice-mind.com/portal#/ |
https://github.com/alibaba/AliceMind/tree/main/PLUG |
- |
NO |
120 |
GPT-Neo |
2021/3/21 |
Text |
EleutherAI |
English |
https://venturebeat.com/2021/05/15/gpt-3s-free-alternative-gpt-neo-is-something-to-be-excited-about/ |
1.3B , 2.7B |
https://github.com/EleutherAI/gpt-neo/#pretrained-models |
NO |
121 |
GLM |
2021/3/18 |
Text |
Tsinghua University, BAAI, MIT, Shanghai QiZhi Institute |
English, Chinese |
https://arxiv.org/abs/2103.10360 |
10B |
https://github.com/THUDM/GLM#pretrained-models |
https://github.com/THUDM/GLM |
- |
NO |
122 |
Chinese-Transformer-XL |
2021/3/17 |
Text |
Tsinghua University |
Chinese |
https://wudaoai.cn/model/detail/Transformer-XL |
2.9B |
http://dorc-model-team.ks3-cn-beijing.ksyun.com/ren-zhi/my-model/mp_rank_00_model_states.pt |
https://github.com/THUDM/Chinese-Transformer-XL |
- |
NO |
123 |
BriVL |
2021/3/11 |
Text, Vision |
Renmin University of China, Institute of Computing Technology |
Chinese |
https://arxiv.org/abs/2103.06561 |
1B |
https://github.com/BAAI-WuDao/BriVL#%E4%B8%8B%E8%BD%BD%E4%B8%93%E5%8C%BA |
https://github.com/BAAI-WuDao/BriVL |
- |
NO |
124 |
M6 |
2021/3/1 |
Text, Vision |
Alibaba, Tsinghua University |
Chinese |
https://arxiv.org/abs/2103.00823 |
10B , 100B |
- |
NO |
125 |
DALL-E |
2021/2/24 |
Text, Vision |
OpenAI |
English |
https://arxiv.org/abs/2102.12092 |
12B |
- |
https://github.com/openai/dall-e |
- |
NO |
126 |
Switch Transformers |
2021/1/11 |
Text |
Google |
English |
https://arxiv.org/abs/2101.03961 |
7B , 26B , |
NO |
127 |
CPM-1 |
2020/12/1 |
Text |
Tsinghua University, BAAI |
Chinese |
https://arxiv.org/abs/2012.00413 |
2.6B |
https://github.com/OpenBMB/ModelCenter#supported-models |
https://github.com/OpenBMB/ModelCenter |
- |
NO |
128 |
mT5 |
2020/10/22 |
Text |
Google |
Multilingual |
https://arxiv.org/abs/2010.11934 |
1.2B , 3.7B , |
NO |
129 |
M2M-100 |
2020/10/21 |
Text |
Facebook |
Multilingual |
https://arxiv.org/abs/2010.11125 |
1.2B , 12B |
https://github.com/facebookresearch/fairseq/tree/main/examples/m2m_100#trained-models |
NO |
130 |
BlenderBot 3 |
2022/8/5 |
Text |
Meta, Mila, McGill University |
English |
https://arxiv.org/abs/2208.03188 |
3B , 30B , |
NO |
131 |
PLATO-2 |
2020/6/30 |
Text |
Baidu |
English |
https://arxiv.org/abs/2006.16779 |
1.6B |
https://github.com/PaddlePaddle/Knover/blob/develop/projects/PLATO-2/README.md#pre-trained-dialogue-generation-model |
https://github.com/PaddlePaddle/Knover/ |
- |
NO |
132 |
GShard |
2020/6/30 |
Text |
Google |
Multilingual |
https://arxiv.org/abs/2006.16668 |
12.5B , 37B , |
NO |
133 |
iGPT |
2020/6/17 |
Vision |
OpenAI |
- |
https://proceedings.mlr.press/v119/chen20s.html |
1.3B , 6.8B |
- |
YES |
134 |
DeBERTa v2 |
2020/6/5 |
Text |
Microsoft |
English |
https://arxiv.org/abs/2006.03654 |
1.5B |
https://huggingface.co/microsoft/deberta-v2-xxlarge |
https://github.com/microsoft/DeBERTa |
- |
NO |
135 |
GPT-3 |
2020/5/28 |
Text |
OpenAI |
English |
https://arxiv.org/abs/2005.14165 |
1.3B , 2.7B , |
NO |
136 |
BlenderBot 1 |
2020/4/28 |
Text |
Facebook |
English |
https://arxiv.org/abs/2004.13637 |
2.7B , 9.4B |
https://parl.ai/projects/recipes/ |
NO |
137 |
ProGen |
2020/3/8 |
Protein |
Salesforce, Stanford University |
- |
https://arxiv.org/abs/2004.03497 |
1.2B |
- |
- |
- |
NO |
138 |
Turing NLG |
2020/2/13 |
Text |
Microsoft |
English |
https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/ |
17B |
- |
- |
- |
NO |
139 |
Meena |
2020/1/27 |
Text |
Google |
English |
https://arxiv.org/abs/2001.09977 |
2.6B |
- |
- |
- |
NO |
140 |
T5 |
2019/10/23 |
Text |
Google |
English |
https://arxiv.org/abs/1910.10683 |
3B , 11B |
https://github.com/google-research/text-to-text-transfer-transformer#released-model-checkpoints |
NO |
141 |
Megatron-LM |
2019/9/17 |
Text |
NVIDIA |
English |
https://arxiv.org/abs/1909.08053 |
1.2B , 2.5B , |
NO |
142 |
CTRL |
2019/9/11 |
Text |
Salesforce |
English |
https://arxiv.org/abs/1909.05858 |
1.63B |
https://console.cloud.google.com/storage/browser/sf-ctrl;tab=objects?prefix=&forceOnObjectsSortingFiltering=false |
https://github.com/salesforce/ctrl |
- |
NO |
143 |
GPT-2 |
2019/2/11 |
Text |
OpenAI |
English |
https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf |
1.5B |
https://github.com/openai/gpt-2 |
https://github.com/openai/gpt-2 |
- |
NO |
144 |
Sparsely-Gated MoE |
2017/1/23 |
Text |
Jagiellonian University, Google |
Multilingual |
https://arxiv.org/abs/1701.06538 |
8.7B , 137B |
- |
NO |
145 |
LLaMA |
2023.02 |
NLP |
Facebook |
en/ |
https://arxiv.org/abs/2302.13971 |
7B/13B/33B/ |
https://github.com/facebookresearch/llama/blob/main/download.sh |
https://github.com/facebookresearch/llama/tree/main/llama |
|
YES |
146 |
ChatGLM |
2023.03 |
NLP |
THUDM |
zh/en |
https://chatglm.cn/blog |
6B |
https://huggingface.co/THUDM/chatglm-6b |
https://huggingface.co/THUDM/chatglm-6b |
|
NO |
147 |
Open Flamingo |
2023.03 |
CV/NLP |
laion.ai |
en |
https://laion.ai/blog/open-flamingo/ |
9B |
https://huggingface.co/openflamingo/OpenFlamingo-9B |
https://github.com/mlfoundations/open_flamingo |
https://7164d2142d11.ngrok.app/ |
NO |
148 |
Alpaca-7B |
2023.03 |
NLP |
Stanford |
en/multilingual |
https://crfm.stanford.edu/2023/03/13/alpaca.html |
7B/13B (Not Open) |
https://huggingface.co/tatsu-lab/alpaca-7b-wdiff/tree/main |
https://github.com/tatsu-lab/stanford_alpaca/blob/main/train.py |
YES |
149 |
Vicuna |
2023.03 |
NLP |
The Vicuna |
en/ |
https://lmsys.org/blog/2023-03-30-vicuna/ |
7B/13B |
https://github.com/lm-sys/FastChat#vicuna-weights |
https://github.com/lm-sys/FastChat/blob/main/scripts/train_vicuna_7b.sh |
https://chat.lmsys.org/ |
YES |
150 |
FastChat-T5 |
2023.04 |
NLP |
The Vicuna |
en |
https://github.com/lm-sys/FastChat#FastChat-T5 |
3B |
https://huggingface.co/lmsys/fastchat-t5-3b-v1.0 |
https://huggingface.co/lmsys/fastchat-t5-3b-v1.0 |
|
NO |
151 |
Pythia |
2023.04 |
NLP |
EleutherAI |
en |
https://arxiv.org/pdf/2304.01373.pdf |
70M/160M/410M/1B/1.4B/2.8B/6.9B/12B |
https://github.com/EleutherAI/pythia/tree/main |
https://github.com/EleutherAI/pythia/tree/main/models |
|
YES |
152 |
StableLM-Tuned-Alpha |
2023.04 |
NLP |
Stability AI |
en |
https://github.com/stability-AI/stableLM |
3B/7B |
https://huggingface.co/stabilityai |
https://huggingface.co/stabilityai |
|
NO |
153 |
Stable-vicuna-13b-delta |
2023.04 |
NLP |
CarperAI |
en/ |
https://stability.ai/blog/stablevicuna-open-source-rlhf-chatbot |
13B |
https://huggingface.co/CarperAI/stable-vicuna-13b-delta/tree/main |
https://huggingface.co/CarperAI/stable-vicuna-13b-delta |
https://huggingface.co/CarperAI/stable-vicuna-13b-delta |
NO |
154 |
Dolly |
2023.04 |
NLP |
Databricks |
en |
https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm |
3B/6B/7B/12B |
https://huggingface.co/databricks |
https://huggingface.co/databricks |
|
NO |
155 |
MPT |
2023.05 |
NLP |
Mosaic ML |
en |
https://the-decoder.com/mpt-7b-the-best-open-source-llm-available-for-commercial-use/ |
7B |
https://huggingface.co/mosaicml |
https://huggingface.co/mosaicml |
|
NO |
156 |
Dromedary |
2023.05 |
NLP |
MIT-IBM |
en/ |
https://arxiv.org/pdf/2305.03047.pdf |
65B |
https://huggingface.co/zhiqings/dromedary-65b-lora-delta-v0 |
https://github.com/IBM/Dromedary |
|
YES |
157 |
VisualGLM-6B |
2023.05.17 |
CV/NLP |
THUDM |
zh/en |
https://github.com/THUDM/VisualGLM-6B |
6B |
https://huggingface.co/THUDM/visualglm-6b |
https://github.com/THUDM/VisualGLM-6B |
https://huggingface.co/spaces/THUDM/visualglm-6b |