Latest commit: 212215c54f "Update Wechat Group" by xiao9905 (9 months ago)

| Name | Last commit message | Age |
|---|---|---|
| configs | Update quantization results | 1 year ago |
| cuda | Add 4-bit quantization and CUDA kernels | 1 year ago |
| docs | Update inference-with-fastertransformer.md | 1 year ago |
| evaluation | Fix device | 1 year ago |
| generation | Fix first beam | 1 year ago |
| kernels | Add 4-bit quantization and CUDA kernels | 1 year ago |
| logs | Update links | 1 year ago |
| quantization | Update quantization docs and scripts | 1 year ago |
| resources | Update Wechat Group | 9 months ago |
| scripts | Set nucleus sampling as the default strategy | 1 year ago |
| tasks | Ethnic evaluation (#28) | 1 year ago |
| tools | Fix conversion script | 1 year ago |
| .gitignore | Initial commit | 1 year ago |
| LICENSE | Initial commit | 1 year ago |
| MODEL_LICENSE | Initial commit | 1 year ago |
| README.md | Update Wechat Group | 9 months ago |
| README_zh.md | add Google Group and Wechat Group | 1 year ago |
| benchmark.py | Update benchmark | 1 year ago |
| evaluate.py | Initial commit | 1 year ago |
| generate.py | Add sMASK for generation | 1 year ago |
| initialize.py | Add sequential initialization | 1 year ago |
| requirements.txt | Update requirements.txt | 1 year ago |

GLM-130B is an open bilingual (Chinese and English) bidirectional dense model based on the GLM architecture, with 130 billion parameters. It is designed to support inference of a 100B-scale model on a single A100 (40G × 8) or V100 (32G × 8) server.
