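Reinforcement learning (RL) is a machine learning method based on rewarding desired behaviors and penalizing undesired ones. An agent perceives and interprets its environment, takes actions, and learns through trial and error. Following the success of AlphaGo, reinforcement learning has attracted much wider attention and has been applied in many fields.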
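Currently, when traditional AI frameworks are used to simulate and apply RL algorithms, the flexibility of their interfaces often means that developers have to repeatedly write basic framework code by hand. MindSpore Reinforcement supports scalable, distributed, multi-agent training on heterogeneous hardware while also providing intuitive programming abstractions for algorithms. Users are welcome to take part in the MindSpore reinforcement learning community and share feedback; community feedback plays an important, constructive role in the future development of MindSpore Reinforcement.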
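The Reinforcement Learning SIG focuses on the development of RL computing frameworks, research progress, and applications in real-world scenarios. In particular, it concentrates on optimizing a high-performance, scalable, distributed RL computing framework based on MindSpore and on providing a continuously growing library of classic algorithms, so that RL researchers can more conveniently pursue the work that interests them. The group's key work covers the following areas: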
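To address the limitations of existing reinforcement learning frameworks, suggestions and contributions are welcome in the following areas: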
Reinforcement learning algorithm development and multi-agent distributed training
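MindSpore Reinforcement provides clean API abstractions for writing reinforcement learning algorithms. It decouples the algorithm from its deployment and execution, including the use of hardware accelerators, the degree of parallelism, and the distribution of computation across processes. MindSpore Reinforcement converts an RL algorithm into a series of compiled computational graphs, which the MindSpore framework then runs efficiently on Ascend, GPU, or CPU.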
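The idea of decoupling an algorithm from its execution can be illustrated with a minimal, framework-free sketch. This is not the MindSpore Reinforcement API: `CoinFlipEnv`, `BanditAlgorithm`, `ExecutionConfig`, and `run` are hypothetical names invented for illustration, and the example uses only the Python standard library. It shows one way the learning logic (how to act, how to update) might be kept separate from execution choices (how many environments, how many steps).

```python
import random
from dataclasses import dataclass, field


class CoinFlipEnv:
    """Hypothetical two-armed bandit: arm 1 pays off more often than arm 0."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._payout = (0.2, 0.8)  # assumed reward probabilities per arm

    def step(self, action):
        # Reward 1.0 with the arm's payout probability, otherwise 0.0.
        return 1.0 if self._rng.random() < self._payout[action] else 0.0


@dataclass
class BanditAlgorithm:
    """Algorithm logic only: how to act and how to learn."""

    epsilon: float = 0.1
    values: list = field(default_factory=lambda: [0.0, 0.0])
    counts: list = field(default_factory=lambda: [0, 0])

    def act(self, rng):
        # Epsilon-greedy: explore occasionally, otherwise pick the best estimate.
        if rng.random() < self.epsilon:
            return rng.randrange(2)
        return max(range(2), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental mean of the rewards observed for this arm.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


@dataclass
class ExecutionConfig:
    """Execution choices kept apart from the algorithm (hypothetical fields)."""

    num_envs: int = 4
    steps: int = 500
    seed: int = 0


def run(algorithm, config):
    """Drive the algorithm against several environments, sequentially here."""
    rng = random.Random(config.seed)
    envs = [CoinFlipEnv(seed=config.seed + i) for i in range(config.num_envs)]
    for _ in range(config.steps):
        for env in envs:
            action = algorithm.act(rng)
            reward = env.step(action)
            algorithm.learn(action, reward)
    return algorithm


trained = run(BanditAlgorithm(), ExecutionConfig())
print(trained.values, trained.counts)
```

Changing `ExecutionConfig` (for example, `num_envs=1`) alters how the experiment is run without touching `BanditAlgorithm`. A real framework would map such a configuration onto devices and processes rather than a sequential loop.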
Reinforcement learning suite code repository: https://gitee.com/mindspore/reinforcement