Abstract:
This paper considers a distributed reconfigurable intelligent surface (RIS)-assisted multi-user millimeter wave (mmWave) system and uses deep reinforcement learning (DRL) to jointly optimize the transmit beamforming matrix at the base station and the phase shift matrix at the RIS so as to maximize the weighted sum-rate. Specifically, for the discrete action space, we first design a power codebook and a phase codebook and propose a Deep Q-Network (DQN) algorithm to optimize the beamforming matrix and the phase shift matrix; then, for the continuous action space, we use the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm to optimize the same two matrices. The weighted sum-rates of the system in the discrete and continuous action spaces are compared through simulation for different numbers of codebook bits. In addition, the DRL algorithms significantly improve the sum-rate over both the traditional convex optimization algorithm and the zero-forcing precoding with a random PBF algorithm: the sum-rate of the continuous TD3 algorithm exceeds that of the convex optimization algorithm by 23.89%, and the discrete DQN algorithm outperforms the traditional convex optimization algorithm when the number of codebook bits is 4.
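
To make the role of the codebook bits concrete, the following is a minimal sketch of how a b-bit phase codebook could map discrete action indices to an RIS phase shift matrix. It assumes uniformly spaced phases in [0, 2π) and unit-modulus reflection coefficients; neither assumption is stated in the abstract, and the function names `phase_codebook` and `phase_shift_matrix` are hypothetical, introduced only for illustration.

```python
import numpy as np

def phase_codebook(bits: int) -> np.ndarray:
    """Return 2**bits phase values in [0, 2*pi).

    Assumption: uniform spacing; the paper's actual codebook design may differ.
    """
    levels = 2 ** bits
    return 2 * np.pi * np.arange(levels) / levels

def phase_shift_matrix(indices, bits: int) -> np.ndarray:
    """Build diag(exp(j*theta_n)) from one codebook index per RIS element.

    Assumption: each element has unit-modulus reflection (phase-only control).
    """
    theta = phase_codebook(bits)[np.asarray(indices)]
    return np.diag(np.exp(1j * theta))

# Example: a 2-bit codebook offers 4 phases; pick one per RIS element.
codebook = phase_codebook(2)
Phi = phase_shift_matrix([0, 1, 2, 3], bits=2)
```

Under these assumptions, each additional bit doubles the number of phases available per element, so the discrete action space grows quickly with the codebook resolution, while a continuous-action method selects the phases directly without quantization.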