Xin Yan thomas-yanxin

thomas-yanxin synced commits to main at thomas-yanxin/VLMEvalKit from mirror

1 hour ago

thomas-yanxin synced commits to master at thomas-yanxin/Langchain-Chatchat from mirror

  • dab9ed95e7 Actual fix: the nltk_data path was not set correctly, causing nltk download errors and failures when importing txt and similar documents (#4581)

2 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/LLaMA-Factory from mirror

2 hours ago

thomas-yanxin synced commits to master at thomas-yanxin/llama.cpp from mirror

4 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/Bunny from mirror

5 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/lmdeploy from mirror

  • 0188537758 update daily testcase new (#2035) (squash of repeated incremental updates to daily_ete_test.yml, benchmark.yml, action_tools.py, and config.yaml)

5 hours ago

thomas-yanxin synced commits to tpu-n at thomas-yanxin/vllm from mirror

9 hours ago

thomas-yanxin synced commits to nemotron-support at thomas-yanxin/vllm from mirror

9 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/vllm from mirror

  • 14f91fe67c [Spec Decode] Disable Log Prob serialization to CPU for spec decoding for both draft and target models. (#6485)
  • d7f4178dd9 [Frontend] Move chat utils (#6602) Co-authored-by: Roger Wang <ywang@roblox.com>
  • Compare 2 commits »

9 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/LLaMA-Factory from mirror

9 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/opencompass from mirror

  • 96f644de69 [Fix] Update path and folder (#1344) Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>

12 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/VLMEvalKit from mirror

12 hours ago

thomas-yanxin synced commits to nightly at thomas-yanxin/unsloth from mirror

13 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/unsloth from mirror

13 hours ago

thomas-yanxin synced commits to new-quant-method at thomas-yanxin/transformers from mirror

14 hours ago

thomas-yanxin synced commits to master at thomas-yanxin/llama.cpp from mirror

  • 22f281aa16 examples : Rewrite pydantic_models_to_grammar_examples.py (#8493)
    Changes:
    - Move each example into its own function, making the code much easier to read and understand.
    - Make it easy to run only one test by commenting out function calls in main().
    - Make the output easy to parse by indenting the output of each example.
    - Add a shebang and the +x bit to make it clear it's an executable.
    - Make the host configurable via --host, with a default of 127.0.0.1:8080.
    - Look up the registered tool in the tools list instead of hardcoding the returned values, making the code more copy-pastable.
    - Add error checking so the program exits with status 1 if the LLM did not return the expected values; very useful for checking correctness.
    Testing:
    - Tested with Mistral-7B-Instruct-v0.3 in F16 and Q5_K_M, and with Meta-Llama-3-8B-Instruct in F16 and Q5_K_M.
    - No failure was observed even once with Mistral-7B-Instruct-v0.3.
    - Llama-3 failed about a third of the time in example_concurrent: it returned only one call instead of 3, even in F16.
    Potential follow-ups:
    - The prompt encoding is not fixed yet; surprisingly, it mostly works even when the prompt encoding is not model-optimized.
    - Add chained answer and response.
    Test-only change.
  • 328884f421 gguf-py : fix some metadata name extraction edge cases (#8591)
    - gguf-py : fix some metadata name extraction edge cases
    - convert_lora : use the lora dir for the model card path
    - gguf-py : more metadata edge-case fixes; multiple finetune versions are now joined together, and removing the basename annotation on trailing versions is more robust
    - gguf-py : add more name metadata extraction tests
    - convert_lora : fix the default filename, which was previously hardcoded
    - convert_hf : Model.fname_out can no longer be None
    - gguf-py : do not use title case for the naming convention. Some models use acronyms in lowercase that can't be title-cased like other words, so the case of the original model name is kept. The size label still has an uppercased suffix to distinguish it from the context size of a finetune.
  • c69c63039c convert_hf : fix Gemma v1 conversion (#8597)
    - convert_hf : fix Gemma v1 conversion
    - convert_hf : allow renaming tokens, but with a warning
    - convert_hf : fix Gemma v1 not setting BOS and EOS tokens
  • 69c487f4ed CUDA: MMQ code deduplication + iquant support (#8495)
    - CUDA: MMQ code deduplication + iquant support
    - 1 less parallel job for CI build
  • Compare 4 commits »

14 hours ago

thomas-yanxin synced commits to compilade/fix-metadata-name-extraction at thomas-yanxin/llama.cpp from mirror

  • 1932a1b871 gguf-py : do not use title case for the naming convention. Some models use acronyms in lowercase that can't be title-cased like other words, so the case of the original model name is kept. The size label still has an uppercased suffix to distinguish it from the context size of a finetune.
  • bf8e71b0c0 convert_lora : fix the default filename, which was previously hardcoded; convert_hf : Model.fname_out can no longer be None
  • a3d154b260 gguf-py : add more name metadata extraction tests
  • Compare 3 commits »

14 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/lit-parrot from mirror

  • 2104b5b5b6 Dynamically set kv-cache size in serve (#1602)

17 hours ago

thomas-yanxin synced commits to w8a8-input-scale-none at thomas-yanxin/vllm from mirror

  • 22e9855b4e Merge branch 'main' into w8a8-input-scale-none
  • 9364f74eee [ Kernel ] Enable `fp8-marlin` for `fbgemm-fp8` models (#6606)
  • 06d6c5fe9f [Bugfix][CI/Build][Hardware][AMD] Fix AMD tests, add HF cache, update CK FA, add partially supported model notes (#6543)
  • 683e3cb9c4 [ Misc ] `fbgemm` checkpoints (#6559)
  • 9042d68362 [Misc] Consolidate and optimize logic for building padded tensors (#6541)
  • Compare 10 commits »

19 hours ago

thomas-yanxin synced commits to main at thomas-yanxin/vllm from mirror

  • 082ecd80d5 [ Bugfix ] Fix AutoFP8 fp8 marlin (#6609)
  • f952bbc8ff [Misc] Fix input_scale typing in w8a8_utils.py (#6579)
  • 9364f74eee [ Kernel ] Enable `fp8-marlin` for `fbgemm-fp8` models (#6606)
  • 06d6c5fe9f [Bugfix][CI/Build][Hardware][AMD] Fix AMD tests, add HF cache, update CK FA, add partially supported model notes (#6543)
  • 683e3cb9c4 [ Misc ] `fbgemm` checkpoints (#6559)
  • Compare 5 commits »

19 hours ago