Sengxian 96eac9f33b Add 4-bit quantization and CUDA kernels 1 year ago
Makefile Add 4-bit quantization and CUDA kernels 1 year ago
quantization.cu Add 4-bit quantization and CUDA kernels 1 year ago

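GLM-130B is an open bilingual (Chinese and English) bidirectional dense model with 130 billion parameters, based on the GLM architecture. It is designed to support inference of a model at the 100-billion-parameter scale on a single A100 (40G * 8) or V100 (32G * 8) server.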
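To make the single-server claim concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. It is only an estimate under stated assumptions: it counts weights only (no activations, caches, or framework overhead), treats 1 GB as 1e9 bytes, and the helper names `weight_gb` and `fits` are hypothetical and not part of this repository.

```python
# Rough weight-memory estimate for a 130-billion-parameter model.
# Assumptions: weights only (activations and runtime overhead are ignored),
# and 1 GB = 1e9 bytes. Helper names are illustrative, not from the repo.
PARAMS = 130e9


def weight_gb(bits_per_param: int) -> float:
    """Return weight storage in GB at the given number of bits per parameter."""
    return PARAMS * bits_per_param / 8 / 1e9


def fits(bits_per_param: int, gpus: int, gb_per_gpu: int) -> bool:
    """Check whether the weights alone fit in the combined GPU memory."""
    return weight_gb(bits_per_param) <= gpus * gb_per_gpu


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(
            f"{bits}-bit: {weight_gb(bits):.0f} GB of weights, "
            f"fits 8 x 40G: {fits(bits, 8, 40)}, "
            f"fits 8 x 32G: {fits(bits, 8, 32)}"
        )
```

Under these assumptions, lower-precision weights shrink the footprint proportionally, which is presumably the motivation for the 4-bit quantization kernels in this directory.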

Python Markdown Shell Cuda Text other