帮助 本站公告
您现在所在的位置:网站首页 > 知识中心 > 文献详情
文献详细Journal detailed

基于偏向信息学习的双层强化学习算法
Dual Reinforcement Learning Based on Bias Learning

作  者: ; ; ; ;

机构地区: 中国科学院计算技术研究所智能信息处理重点实验室

出  处: 《计算机研究与发展》 2008年第9期1455-1462,共8页

摘  要: 传统的强化学习存在收敛速度慢等问题,结合先验知识预置某些偏向可以加快学习速度.但是当先验知识不正确时又可能导致学习过程不收敛.对此,提出基于偏向信息学习的双层强化学习模型.该模型将强化学习过程和偏向信息学习过程结合起来:偏向信息指导强化学习的行为选择策略,同时强化学习指导偏向信息学习过程.该方法在有效利用先验知识的同时能够消除不正确先验知识的影响.针对迷宫问题的实验表明,该方法能够稳定收敛到最优策略;并且能够有效利用先验知识提高学习效率,加快学习过程的收敛. Reinforcement learning has received much attention in the past decade. Its incremental nature and adaptive capabilities make it suitable for use in various domains, such as automatic control, mobile robotics and multi-agent system. A critical problem in conventional reinforcement learning is the slow convergence of the learning process. To accelerate the learning speed, bias information is incorporated to boost learning process with priori knowledge. Current methods use bias information for the action selection strategies in reinforcement learning. They may suffer from the nonconvergence problem when priori knowledge is incorrect. A dual reinforcement learning model based on bias learning is proposed, which integrates reinforcement learning process and bias learning process. Bias information is used for action selection strategies in reinforcement learning and reinforcement learning is used to guide bias learning process. Thus the dual reinforcement learning model could make effective use of priori knowledge, and eliminate the negative effects of incorrect priori knowledge. Finally, the proposed dual model is validated by experiment on maze problem including simple environment and complex environment. The experimental results demonstrate that the model could converge to the optimal strategy steadily. Moreover, the model could improve the learning performance and speed up the convergence of the learning process.

关 键 词: 强化学习 学习算法 偏向信息 偏向信息学习 先验知识

领  域: [自动化与计算机技术] [自动化与计算机技术]

相关作者

相关机构对象

机构 中山大学
机构 暨南大学管理学院

相关领域作者

作者 李文姬
作者 邵慧君
作者 杜松华
作者 周国林
作者 邢弘昊