PCL-张晗 Hanlard

Code for the ICLR 2024 paper "CPPO: Continual Learning for Reinforcement Learning with Human Feedback"

Updated 1 week ago Python

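Reproductions of a series of offline alignment algorithms; discussion is welcome. Includes DPO, PRO, RRHF, and SPIN, as well as our team's CPPO (published at ICLR 2024) and our latest research work, COPR.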

Updated 2 months ago Python

Code and datasets for "CLLE: A Benchmark for Continual Language Learning Evaluation in Multilingual Machine Translation"

Updated 1 year ago

Code for KIP-Frame: "Prompt-Based Prototypical Framework for Continual Relation Extraction".

Updated 1 year ago Python