Quantum Deep Reinforcement Learning Based on Episodic Memory

朱献超; 侯晓凯; 吴绍君; 祝峰

doi:10.12178/1001-0548.2022043

Abstract: As an emerging subfield of quantum machine learning, quantum deep reinforcement learning (QDRL) uses quantum neural networks (QNNs) to build a quantum agent that learns an optimal policy through repeated interaction with an environment, so as to maximize the expected cumulative return. However, existing QDRL methods require a large number of interactions with a classical environment during training, and therefore a large number of calls to the quantum circuit. To address this problem, this work proposes a QDRL model based on episodic memory, called the quantum episodic memory deep Q-network, which uses episodic memory to accelerate the training of the quantum agent. Specifically, the model records previously encountered high-reward experiences in the episodic memory, so that when the current environment state is similar to a stored state, the quantum agent can quickly obtain the desired action from that historical state, which reduces the number of optimization iterations. Numerical simulations on five classic Atari games show that the proposed method significantly reduces the number of iterations needed to train the quantum agent and achieves higher scores than other QDRL methods.
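The episodic-memory idea described in the abstract can be illustrated with a small classical sketch. It is not the authors' implementation: the class `EpisodicMemory`, the function `select_action`, the Euclidean similarity test, and the `radius` threshold are all assumptions made for illustration, and the `q_values` list stands in for the output of a quantum neural network, which is not simulated here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class EpisodicMemory:
    """Hypothetical table of high-return experiences: [state, action, best_return]."""
    radius: float = 0.1  # assumed similarity threshold (Euclidean distance)
    entries: list = field(default_factory=list)

    def write(self, state, action, episode_return):
        """Keep only the highest return seen for a similar (state, action) pair."""
        for entry in self.entries:
            if entry[1] == action and math.dist(entry[0], state) <= self.radius:
                entry[2] = max(entry[2], episode_return)
                return
        self.entries.append([tuple(state), action, episode_return])

    def lookup(self, state):
        """Return the best remembered action for a similar state, or None."""
        best = None
        for stored_state, action, ret in self.entries:
            if math.dist(stored_state, state) <= self.radius:
                if best is None or ret > best[1]:
                    best = (action, ret)
        return None if best is None else best[0]


def select_action(memory, state, q_values):
    """Prefer a remembered high-return action; otherwise act greedily on Q-values.

    `q_values` is a plain list here; in a QDRL setting it would be produced
    by evaluating a quantum circuit (an assumption of this sketch).
    """
    remembered = memory.lookup(state)
    if remembered is not None:
        return remembered
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In this sketch, a memory hit skips the Q-value evaluation entirely, which is one simple way a lookup could reduce the number of circuit calls. The paper may instead use the memory in a different way, for example inside the training objective.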
