zhanghangit a2e70f02e6 update code and readme 2 years ago
..
__pycache__
audio
encoders
legacy
multilingual
__init__.py
add_target_dataset.py
append_token_dataset.py
backtranslation_dataset.py
base_wrapper_dataset.py
bucket_pad_length_dataset.py
colorize_dataset.py
concat_dataset.py
concat_sentences_dataset.py
data_utils.py
data_utils_fast.cpp
data_utils_fast.cpython-36m-x86_64-linux-gnu.so
data_utils_fast.cpython-38-x86_64-linux-gnu.so
data_utils_fast.pyx
denoising_dataset.py
dictionary.py
fairseq_dataset.py
fasta_dataset.py
id_dataset.py
indexed_dataset.py
iterators.py
language_pair_dataset.py
list_dataset.py
lm_context_window_dataset.py
lru_cache_dataset.py
mask_tokens_dataset.py
monolingual_dataset.py
multi_corpus_dataset.py
multi_corpus_sampled_dataset.py
nested_dictionary_dataset.py
noising.py
num_samples_dataset.py
numel_dataset.py
offset_tokens_dataset.py
pad_dataset.py
plasma_utils.py
prepend_dataset.py
prepend_token_dataset.py
raw_label_dataset.py
replace_dataset.py
resampling_dataset.py
roll_dataset.py
round_robin_zip_datasets.py
shorten_dataset.py
sort_dataset.py
strip_token_dataset.py
subsample_dataset.py
token_block_dataset.py
token_block_utils_fast.cpp
token_block_utils_fast.cpython-36m-x86_64-linux-gnu.so
token_block_utils_fast.cpython-38-x86_64-linux-gnu.so
token_block_utils_fast.pyx
transform_eos_dataset.py
transform_eos_lang_pair_dataset.py

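鹏程-通言 (PengCheng-TongYan) model: TongYan is a multilingual machine translation model that improves on the M2M-100 architecture. Through parameter reuse and incremental training, it scales the model from 1.2B to 13.2B parameters and substantially improves translation quality for many low-resource languages of the Belt and Road region.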
