-
Driven by the astounding development of smartphones, mobile applications, and the Internet of Things (IoT), traffic demand in mobile networks grows exponentially. Heterogeneous ultra-dense networks (H-UDNs) [such as femtocells/picocells, cloud radio access networks (Cloud-RANs), and fog radio access networks (Fog-RANs)] have emerged as a promising solution to sustain this enormous mobile traffic demand. Densified access points (APs) and communication links improve the coverage and capacity of wireless networks and greatly enhance the spatial reuse of the limited frequency resources.
H-UDNs, however, bring opportunities as well as formidable challenges. An H-UDN is expected to support an abundance of new applications with various new service requirements. For instance, massive machine-type communications require high connection density; autonomous cars require low latency and ultra-high reliability; augmented reality requires both high throughput and low latency. To effectively and efficiently sustain these manifold requirements in a system as complicated as an H-UDN is undoubtedly a mission worth pursuing. In addition, existing advanced techniques, such as multi-cell coordination and massive multiple-input multiple-output (MIMO), cannot be easily extended to an H-UDN, because many assumptions underlying these techniques are no longer valid or accurate in H-UDNs. For example, the law of large numbers and random matrix theory have been widely used to simplify the analysis of massive MIMO. However, due to the random scattering of APs and users, the channels between APs and users follow heavy-tailed distributions that cannot be analyzed for algorithm design using existing random matrix theory[1]. Moreover, the high density of devices leads to prohibitively high complexity if existing algorithms are applied directly. Consider, for example, a system supporting thousands or even tens of thousands of machine-type devices in a small area. Jointly detecting the signals sent by the devices using even a simple linear minimum mean square error (LMMSE) detector would incur unaffordable computational complexity, since the complexity of LMMSE grows cubically with the number of terminals.
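To make the cubic scaling concrete, the following minimal sketch (a hypothetical toy setup with small dimensions, not taken from the paper) implements an LMMSE detector; solving the K x K linear system costs O(K^3) in the number of terminals K, which is the bottleneck for thousands of devices.

```python
import numpy as np

def lmmse_detect(H, y, noise_var):
    """LMMSE detection: x_hat = (H^H H + sigma^2 I)^{-1} H^H y.
    Solving the K x K system below costs O(K^3) in the number of users K."""
    K = H.shape[1]
    A = H.conj().T @ H + noise_var * np.eye(K)
    return np.linalg.solve(A, H.conj().T @ y)

rng = np.random.default_rng(0)
N, K = 8, 4                                    # receive antennas/APs, users
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
x = rng.choice([-1.0, 1.0], size=K)            # BPSK symbols
y = H @ x + 0.05 * rng.standard_normal(N)      # high-SNR received signal
x_hat = lmmse_detect(H, y, noise_var=0.05 ** 2)
print(np.array_equal(np.sign(x_hat.real), x))  # symbols recovered at high SNR
```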
Machine learning is a family of promising techniques to address the above-mentioned challenges. Unlike traditional model-based approaches that are optimized under mathematically convenient models, machine-learning-based approaches are driven by real-world data, and are thus less sensitive to model imperfections. Machine learning has previously been used extensively to solve a wide variety of problems in image/audio processing, social behavior analysis, project management, etc. Its applications in wireless networks have started to attract research interest in recent years. As discussed in Refs. [2]-[3] and the references therein, machine learning approaches have potential applications in cognitive radios, massive MIMO, device-to-device communications, etc. The goal of this paper is to complement these contributions by investigating the use of machine learning in H-UDNs, especially for solving collaborative signal processing and resource allocation problems. Specifically, we discuss how to utilize graphical representations of H-UDNs to design efficient machine learning algorithms. We first introduce several recently proposed signal processing algorithms (namely, randomized message passing[1], bilinear generalized approximate message passing (BiG-AMP)[4], and deep learning[5]) based on the coverage graph of the APs. Then, we extend the discussion to resource management problems, such as radio resource allocation, power allocation, and cache placement. Reinforcement learning, deep learning, and semi-supervised learning are introduced as potential solutions, where graphical models based on certain features of H-UDNs can greatly improve the efficiency of the solutions.
-
In a UDN, interference is significantly magnified due to the high density of APs. To fully exploit the benefits of network densification, efficient interference management schemes become essential. Network cooperation is commonly believed to be an effective technique to mitigate interference and enhance network performance. However, cooperative signal processing in UDNs is usually very costly for the following reasons. First, the amount of data in an H-UDN is extremely large, and jointly processing these data may involve prohibitively high computational complexity. Second, full-scale cooperation requires the estimation of a very large channel matrix that consists of the channel coefficients from all mobile users to all APs. This causes significant channel estimation overhead, and thereby fundamentally limits the cooperation gain. Third, with the limited capacity of the fronthaul network, excessive information sharing among APs is infeasible. In this section, we discuss potential solutions to the above issues based on graphical representations of the H-UDN and the corresponding machine learning techniques.
-
We first illustrate the use of factor graphs and the associated message passing (MP) algorithms[6] for efficient signal processing in H-UDNs. A factor graph is a bipartite graph representing the factorization of a function, especially a probabilistic function. Assume, for example, that
$X$, $Y$, and $Z$ are random variables that form a Markov chain. Then, their joint probability mass function $P_{XYZ}(x,y,z)$ can be factored as the product of $P_X(x)$, $P_{Y|X}(y|x)$, and $P_{Z|Y}(z|y)$. This factorization can be expressed by the factor graph shown in Fig. 2a. Specifically, the factor graph consists of variable nodes $\{x, y, z\}$ (denoted by circles), factor nodes $\{P_X, P_{Y|X}, P_{Z|Y}\}$ (denoted by squares), and edges. A variable node and a factor node are connected by an edge only when the variable is an input of the corresponding factor function. That is, the edges specify the dependence relations between the two types of nodes.

A typical statistical inference problem is to infer unobserved variables based on observed variables and a given probability model. With the factor-graph representation of the probability model, the problem translates to estimating the values of unobserved nodes conditioned on the values of observed ones. Efficient MP algorithms associated with graphical representations can be designed to solve the inference problem. In an MP algorithm, there are two types of messages over an edge: 1) messages from a factor node to a variable node; 2) messages from a variable node to a factor node. Each message over an edge captures the probabilistic information from nearby nodes, and is iteratively updated based only on the previous messages sent by nearby nodes. In this way, MP provides a distributed solution to the inference problem. Moreover, when the factor graph is sparse, the complexity can be further reduced. Notice that many signal processing problems in communications are inference problems, since the receiver in a communication system estimates the unobserved transmitted signals by processing its observed received signals. Hence, factor-graph-based MP has great potential to provide efficient solutions for signal processing in H-UDNs.
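As a minimal numerical illustration of this chain factorization (toy binary alphabets with hypothetical probabilities), the forward sum-product messages on the factor graph of Fig. 2a reduce to matrix-vector products, and recover the marginal $P_Z$ without ever forming the full joint distribution.

```python
import numpy as np

# Markov chain X -> Y -> Z over binary alphabets, with
# P(x,y,z) = P_X(x) P_{Y|X}(y|x) P_{Z|Y}(z|y) as in the factor graph.
P_X  = np.array([0.6, 0.4])                # P_X(x)
P_YX = np.array([[0.7, 0.3], [0.2, 0.8]])  # P_{Y|X}(y|x), rows indexed by x
P_ZY = np.array([[0.9, 0.1], [0.4, 0.6]])  # P_{Z|Y}(z|y), rows indexed by y

# Forward sum-product messages along the chain:
msg_x = P_X            # message from factor P_X into variable node x
msg_y = msg_x @ P_YX   # sum over x of P_X(x) P_{Y|X}(y|x) = P_Y(y)
msg_z = msg_y @ P_ZY   # sum over y of P_Y(y) P_{Z|Y}(z|y) = P_Z(z)

# Brute-force check against the full joint distribution
joint = P_X[:, None, None] * P_YX[:, :, None] * P_ZY[None, :, :]
print(np.allclose(msg_z, joint.sum(axis=(0, 1))))  # True
```

The messages visit each factor once, so the cost is linear in the chain length, whereas the brute-force marginalization is exponential in the number of variables.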
However, depending on the specific problem, MP algorithms need to be carefully adjusted to achieve satisfactory performance. In the following, we illustrate how to utilize the special features of the underlying systems to design efficient message-passing algorithms for signal processing. In particular, we discuss the application of factor graphs to signal detection problems with and without perfect channel state information (CSI).
1) Signal processing with perfect CSI. In this example, we consider joint uplink signal detection in a Cloud-RAN. Suppose that the APs (i.e., RRHs) perfectly know the channel matrix $\mathbf{H}$, and that the signals transmitted by the users are independently distributed. Then, the widely-used maximum a posteriori (MAP) estimator can be expressed as $\hat{\mathbf{x}} = \arg\max \prod_{n=1}^{N} P(y_n|\mathbf{x},\mathbf{H}) \prod_{k=1}^{K} P(x_k)$, where $y_n$ is the signal received by AP $n$, and $x_k$ is the signal transmitted by user $k$. The corresponding factor graph is given in Fig. 2b. By treating the dashed rectangles as APs and users, the factor graph can be viewed as a graph induced from the coverage graph. More importantly, the factor graph shares certain common properties with the coverage graph. Recall that the coverage graph is sparse and locally dense, since each AP receives reasonably strong signals from only a small number of nearby users. Hence, the factor graph is also globally sparse and locally dense. On one hand, the high sparsity significantly reduces the computational complexity of the messages in the MP algorithm. On the other hand, the local denseness compromises the convergence of the MP algorithm: MP converges when the factor graph is a tree or contains only long random loops, but locally dense factor graphs contain a large number of short loops. To improve the convergence of MP, several efficient methods have been proposed, such as damping and asynchronous updating. With damping, an updated message is a weighted average of the message in the previous iteration and the message calculated by the original message-updating rule. With asynchronous updating, the messages are updated sequentially instead of in parallel, in the hope of breaking the effect of short loops if the updating schedule is appropriately chosen. However, so far there are no efficient and systematic methods to find an updating schedule that ensures the convergence of MP. Recently, Ref. [1] proposed a random asynchronous updating strategy to improve the convergence performance, where the messages are updated sequentially in a random order. Figs. 3a and 3b compare the convergence performance of MP with that of other widely-used algorithms. As shown in Fig. 3a, the MP algorithms converge much faster than the other algorithms. From Fig. 3b, we observe that the number of iterations needed by MP does not increase with the network size, while that of the other algorithms grows roughly linearly with the network size. This implies that MP, with appropriate adjustment, is an effective solution for scalable signal processing in large-scale H-UDNs.

2) Blind signal detection. In the previous example, we assumed perfect CSI at the receiver side. In practical systems, CSI acquisition consumes a substantial amount of system resources when the network size is large, and thus fundamentally limits the system capacity. To avoid the channel estimation overhead, another line of research works on blind-detection approaches, where both the channel matrix $\mathbf{H}$ and the transmitted signals $\mathbf{x}$ are estimated simultaneously from the received signals $\mathbf{y}$. However, it has been proved that blind approaches can, at best, achieve the same degrees of freedom as training-based approaches, since an unknown channel makes signal detection more difficult. Recently, it was shown that channel sparsity in a certain transformed domain can be exploited to improve the performance of blind detection. For example, in massive MIMO systems, the channel matrix $\mathbf{H}$ is sparse in the angular domain. The MAP estimation can be written as
$$(\hat{\mathbf{x}}, \hat{\mathbf{H}}) = \arg\max\nolimits_{\mathbf{x},\mathbf{H}} P(\mathbf{y}|\mathbf{x},\mathbf{H}) P(\mathbf{x}) P(\mathbf{H}),$$
where the sparsity of $\mathbf{H}$ is incorporated in its prior distribution $P(\mathbf{H})$. Similar to the previous example, the above MAP estimation can also be represented by a factor graph. With careful design, MP algorithms, such as projected bilinear generalized approximate message passing (P-BiG-AMP)[4], can potentially be applied to solve the blind detection problem. Fig. 3c shows that the P-BiG-AMP algorithm significantly outperforms other existing detection schemes and performs closely to the ideal case with perfect channel knowledge.
-
In this subsection, we discuss the design of efficient deep learning algorithms for signal processing in H-UDNs. In deep learning, a multi-layer neural network is trained to imitate the relationship between the input and output data based on a large training set. Once the neural network is well trained, it can complete the task it was trained for with very low complexity. Thanks to its outstanding performance, e.g., in image classification and speech recognition, deep learning has been considered a promising tool for future wireless network design. However, how to adapt existing deep learning techniques to wireless systems is still an open problem. Most of the previous work in image/audio processing has used very little prior information during the training of neural networks. In wireless systems, however, there exists a large amount of prior information, including user locations, APs' coverage areas, distributions of channel gains, and so on. Ignoring such prior information would inevitably cause a performance loss. Recently, noticeable efforts have been made to utilize prior information to improve the performance of deep learning[5, 7-8]. In Ref. [5], a novel neural-network architecture was proposed to mimic the updating rule of a well-known signal processing algorithm, approximate message passing (AMP). As shown in Fig. 4a, the message-updating function in each layer of the neural network is the same as that of AMP, but the updating matrices $\{\mathbf{B}_1, \mathbf{B}_2, \cdots, \mathbf{B}_M\}$ are learned from the training data (e.g., a set of received and transmitted signals $\{(\mathbf{y}^{(d)}, \mathbf{x}^{(d)})\}_{d=1}^{D}$). The proposed neural network significantly outperforms the original AMP algorithm in both computational time and accuracy. In addition, Ref. [5] showed that fixing the structure of the updating matrices according to the channel matrix $\mathbf{A}$, i.e., letting $\mathbf{B}_m = \mathbf{C}_m \mathbf{A}$, renders a more efficient learning procedure. This is because the channel matrix, as prior information, represents the relationship between the channel input and the channel output. When applied to large-scale UDNs, however, the approach in Ref. [5] might be impractical due to the high computational complexity and signaling overhead of estimating the channel matrix. On the other hand, as discussed in Subsection 1.2, the coverage graph of an H-UDN is often sparse and easy to obtain. Thus, it is natural to restrict the updating matrices in the neural networks to sparse matrices according to the coverage information. For example, Ref. [8] proposed a convolutional neural network (CNN)-based signal detector for large-scale banded systems, in which each receiver can only receive signals from nearby transmitters. As shown in Fig. 5, by utilizing the banded structure of the channel matrix $\mathbf{H}$, a CNN is used to extract and share messages between consecutive signals. As all neurons in a convolutional layer share the same set of tunable parameters, the number of tunable parameters is significantly reduced, and the complexity of neural network training is drastically lowered.
-
To date, most resource allocation problems in wireless networks have been formulated and solved as optimization problems, thanks to the availability of efficient algorithms when the formulation is in a nice (e.g., convex) form. A limitation of optimization methods is that they heavily rely on rigidly defined and mathematically convenient models. However, in H-UDNs, complicated interactions between various network entities render accurate modeling difficult. Even if an accurate model can be constructed, it is likely too tedious to be mathematically tractable. In contrast, driven by real-world data instead of pre-assumed models, machine learning algorithms hold significant potential. In this section, we discuss the design of efficient machine learning algorithms based on graphical models for resource management in H-UDNs; see the summary in Table 1.
Table 1. Machine Learning Techniques for Resource Management

Learning technique | Key characteristic | Examples | Usage of graphical models
Reinforcement learning | Govern agents in an environment to maximize cumulative rewards | Opportunistic access[9-10]; joint power control and relay selection[11] | Information sharing over edges; environment modeling based on graphs; performance analysis based on graph theory
Deep learning | Use a cascade of layers of nonlinear processing units for feature extraction and transformation | Algorithm approximation[12]; environment/policy modeling[13-14] | Neural network sparsification based on graphs to simplify the learning procedure
Semi-supervised learning | Infer information of unlabeled data from a small amount of labeled data | Vertex labeling[15] | Propagating node information over edges
-
Reinforcement learning is concerned with how agents ought to take actions in a dynamic environment so as to maximize certain cumulative rewards. Regarding network entities as agents and the network as the environment, reinforcement learning is an ideal tool for resource management in H-UDNs. Here, we take a model-free reinforcement learning technique, namely Q-learning, as an example. As shown in Fig. 6, Q-learning finds an optimal action-selection policy by maximizing the accumulated reward, which is approximated by a Q-function. The Q-function is iteratively updated by observing the actual reward after the agent carries out an action in a given environment. For instance, Ref. [9] proposed a Q-learning algorithm for femtocells to design transmission schemes based on their own actions and received reward signals so as to maximize their reward functions. In Ref. [9], femtocells acted as a secondary network and performed opportunistic channel access in a distributed manner, which reduced the implementation complexity. One key challenge of applying existing Q-learning algorithms to H-UDNs is that, when the devices aim at maximizing their own reward functions, their behaviors often reduce each other's rewards. This is because an H-UDN consists of various kinds of devices that may have conflicting interests. To address this challenge, it is necessary to introduce a certain level of information exchange among different devices. For example, Ref. [11] proposed to replace individual reward functions by a global reward function to enhance the performance of joint power control and relay selection. However, obtaining the global reward requires collecting all the reward signals throughout the whole network, which is impractical due to the large network size. Hence, an approximation approach was proposed in Ref. [11] based on the coverage graph.
That is, each device approximates the global reward by averaging the rewards received from its neighboring devices in the coverage graph. Another way to avoid collecting the reward signals from the entire network is to design the scheme such that a good system-level performance is achieved when the individual reward functions are optimized. Ref. [10] considered an opportunistic spectrum access (OSA) problem in cognitive radio networks, where an individual objective function is derived for each user by decoupling the global reward function. Specifically, the OSA problem is transformed into a coloring problem over a conflict graph of the network, with the users being the vertices. For each user, the individual objective function consists of its own rewards and the rewards of the users conflicting with it. It is shown that the proposed algorithm achieves near-optimal performance when the conflict graph is a complete graph.
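The core Q-learning loop described above can be sketched in a few lines. The following toy (a hypothetical single-state channel-access setup, not the scheme of Ref. [9] or Ref. [10]) has an agent choose between two channels with different busy probabilities; the epsilon-greedy policy settles on the mostly-idle channel.

```python
import numpy as np

# Minimal tabular Q-learning sketch for opportunistic channel access
# (hypothetical one-state toy): channel 0 is idle 80% of the time,
# channel 1 only 20%, so the learned policy should prefer channel 0.
rng = np.random.default_rng(0)
busy_prob = np.array([0.2, 0.8])  # probability each channel is busy
Q = np.zeros(2)                   # one state, two actions (channels)
alpha, eps = 0.1, 0.1             # learning rate, exploration rate

for t in range(2000):
    # epsilon-greedy action selection
    a = int(rng.integers(2)) if rng.random() < eps else int(Q.argmax())
    # reward 1 if the chosen channel was idle, 0 otherwise
    reward = 0.0 if rng.random() < busy_prob[a] else 1.0
    # stateless Q-update (no next-state term in this one-state toy)
    Q[a] += alpha * (reward - Q[a])

print(int(Q.argmax()))  # 0: the mostly-idle channel wins
```

In a multi-agent H-UDN setting, the scalar `reward` above would be replaced by, e.g., an average of the rewards reported by neighbors in the coverage graph, as in the approximation of Ref. [11].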
-
Deep learning is potentially a useful tool for resource management in H-UDNs due to its capability of modeling complex non-linear relationships. Deep neural networks can be used to replace existing high-complexity resource allocation algorithms. Recently, Ref. [12] approximated weighted MMSE (WMMSE) based power allocation using deep learning by treating the input and output of WMMSE as an unknown non-linear mapping. Unlike Ref. [5], which approximates the AMP algorithm by mimicking each iteration with a layer of the neural network, Ref. [12] used a fully-connected neural network to approximate the entire WMMSE algorithm (see Fig. 4b). Through simulations, the authors showed that the fully-connected deep neural network can achieve performance similar to that of WMMSE with much lower complexity. Ref. [16] proposed a general deep learning framework for the optimization of downlink beamforming. Specifically, the solution is obtained based on neural networks and the exploitation of expert knowledge, such as the uplink-downlink duality and the known structure of optimal solutions. Through simulations, the authors showed that the proposed framework achieves near-optimal solutions while enjoying significantly reduced computational complexity.
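The "learning to optimize" idea can be sketched with a toy stand-in (a hypothetical setup, not the WMMSE network of Ref. [12]): a small fully-connected network is trained to imitate a known power-allocation mapping, here classic water-filling over three channels, by treating the solver's input-output pairs as training data.

```python
import numpy as np

rng = np.random.default_rng(1)

def water_filling(g, p_total=1.0):
    """Water-filling allocation: p_i = max(mu - 1/g_i, 0), sum p_i = p_total."""
    inv = np.sort(1.0 / g)
    for k in range(len(g), 0, -1):          # find the active-channel count
        mu = (p_total + inv[:k].sum()) / k
        if mu > inv[k - 1]:
            break
    return np.maximum(mu - 1.0 / g, 0.0)

# Training set: channel gains -> powers computed by the solver
G = rng.uniform(0.1, 2.0, size=(2000, 3))
P = np.array([water_filling(g) for g in G])

# One-hidden-layer network trained by full-batch gradient descent on MSE
W1 = 0.5 * rng.standard_normal((3, 32)); b1 = np.zeros(32)
W2 = 0.5 * rng.standard_normal((32, 3)); b2 = np.zeros(3)

def forward(X):
    return np.tanh(X @ W1 + b1) @ W2 + b2

mse_init = np.mean((forward(G) - P) ** 2)
for _ in range(500):
    h = np.tanh(G @ W1 + b1)
    err = (h @ W2 + b2) - P
    dh = (err @ W2.T) * (1 - h ** 2)        # backprop through tanh
    W2 -= 0.05 * h.T @ err / len(G); b2 -= 0.05 * err.mean(0)
    W1 -= 0.05 * G.T @ dh / len(G); b1 -= 0.05 * dh.mean(0)
mse = np.mean((forward(G) - P) ** 2)
print(mse < mse_init)  # the network learns to imitate the solver
```

Once trained, a forward pass costs only a few matrix multiplications, regardless of how expensive the original iterative solver was; this is the complexity saving reported for the WMMSE approximation.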
Neural networks can naturally be applied to learn the network environment, such as prior information, reward functions, states of agents, and interactions between agents, all of which are useful in reinforcement learning schemes. This leads to deep reinforcement learning algorithms. In Ref. [13], the authors proposed to use a deep convolutional neural network to approximate the optimal action-value function of Q-learning. In this way, successful policies are learned directly from high-dimensional inputs in an end-to-end fashion with very little model information. In Ref. [14], a deep reinforcement learning algorithm was proposed for job scheduling, with the policy represented by deep neural networks. Through simulations, it is shown that the proposed algorithm performs comparably to, or even better than, standard heuristic algorithms. Moreover, it does not require any prior information about the system environment. Recently, Ref. [17] proposed to replace the traditional deep neural networks in deep reinforcement learning algorithms with a graph neural network, which is constructed based on the graphical representation of actions. The proposed algorithm not only achieves performance comparable to state-of-the-art reinforcement learning methods, but also generalizes remarkably well to different scenarios. These encouraging observations indicate the great potential of deep reinforcement learning techniques for resource management in H-UDNs. Exploiting the graphical representation of the underlying networks to improve their efficiency is undoubtedly a subject worth pursuing.
-
Semi-supervised learning is a powerful tool for training prediction systems with little supervision (i.e., only a small percentage of the training data is labeled). The learning procedure is usually performed over a graph in which a small fraction of the vertices are labeled. The vertices of the partially-labeled graph are connected by an edge if they share some similarity, with the edge weight quantifying that similarity. The learning algorithm then gradually labels the unlabeled vertices by propagating information across the graph in a distributed manner, leading to efficient implementation on large graphs[15]. Many resource allocation problems in H-UDNs can be interpreted as vertex labeling (a.k.a. graph coloring) problems. For example, frequency channels can be allocated by labeling the devices with different labels (say, different subchannels). In practice, the subchannels occupied by devices at the network boundaries are predetermined to avoid collisions with nearby networks, which means that a small number of vertices corresponding to boundary devices are pre-labeled. The radio resource allocation problem can thereby be transformed into a semi-supervised learning problem by carefully defining the edge weights (i.e., similarities). For example, we can set the weight between two devices to be inversely proportional to the distance between them, so as to avoid interference among nearby devices. Another example is content placement in network caching. Vertices corresponding to caches are connected in a cooperation graph only when they are capable of jointly serving a user with the same contents. Constraints in terms of storage capacity, fronthaul capacity, transmission delay, etc., can be treated as pre-labeled data in semi-supervised learning algorithms.
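A minimal label-propagation sketch of this idea (a hypothetical toy graph, not the streaming algorithm of Ref. [15]): six devices sit on a line, the two boundary devices have pre-assigned subchannels, and each unlabeled device repeatedly averages its neighbors' label scores until the labels spread across the graph.

```python
import numpy as np

# Toy graph-based semi-supervised subchannel assignment: six devices on
# a line; boundary devices 0 and 5 are pre-labeled with subchannels 0
# and 1, and the labels propagate over the weighted edges.
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0   # neighbors at unit distance

F = np.zeros((6, 2))                  # per-vertex label scores
F[0, 0] = F[5, 1] = 1.0               # the two pre-labeled boundary devices

for _ in range(50):
    F = W @ F / W.sum(1, keepdims=True)    # average the neighbors' scores
    F[0], F[5] = [1.0, 0.0], [0.0, 1.0]    # re-clamp the labeled vertices

print(F.argmax(1).tolist())           # [0, 0, 0, 1, 1, 1]
```

Each device ends up with the subchannel of the nearer boundary device; with distance-dependent weights, the same propagation would steer nearby devices toward different subchannels to limit interference.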
Machine Learning for Heterogeneous Ultra-Dense Networks with Graphical Representations
-
Abstract: Heterogeneous ultra-dense network (H-UDN) is envisioned as a promising solution to sustain the explosive mobile traffic demand through network densification. By placing access points, processors, and storage units as close as possible to mobile users, H-UDNs bring forth a number of advantages, including high spectral efficiency, high energy efficiency, and low latency. Nonetheless, the high density and diversity of network entities in H-UDNs introduce formidable design challenges in collaborative signal processing and resource management. This article illustrates the great potential of machine learning techniques in solving these challenges. In particular, we show how to utilize graphical representations of H-UDNs to design efficient machine learning algorithms.
-
[1] FAN Cong-min, YUAN Xiao-jun, ZHANG Ying-jun. Scalable uplink signal detection in C-RANs via randomized Gaussian message passing[J]. IEEE Transactions on Wireless Communications, 2017, 16(8): 5187-5200.
[2] JIANG Chun-xiao, ZHANG Hai-jun, REN Yong, et al. Machine learning paradigms for next-generation wireless networks[J]. IEEE Wireless Communications, 2017, 24(2): 98-105.
[3] WANG Wen-bo, KWASINSKI A, NIYATO D, et al. A survey on applications of model-free strategy learning in cognitive wireless networks[J]. IEEE Communications Surveys & Tutorials, 2016, 18(3): 1717-1757.
[4] ZHANG Jian-wen, YUAN Xiao-jun, ZHANG Ying-jun. Blind signal detection in massive MIMO: Exploiting the channel sparsity[J]. IEEE Transactions on Communications, 2018, 66(2): 700-712.
[5] BORGERDING M, SCHNITER P. Onsager-corrected deep learning for sparse linear inverse problems[C]//2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP). Washington DC: IEEE, 2016: 227-231.
[6] LOELIGER H A, DAUWELS J, HU Jun-li, et al. The factor graph approach to model-based signal processing[J]. Proceedings of the IEEE, 2007, 95(6): 1295-1322.
[7] NACHMANI E, BE'ERY Y, BURSHTEIN D. Learning to decode linear codes using deep learning[C]//2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton). Monticello: IEEE, 2016: 341-346.
[8] FAN Cong-min, YUAN Xiao-jun, ZHANG Ying-jun. CNN-based signal detection for banded linear systems[J]. IEEE Transactions on Wireless Communications, 2019, 18(9): 4394-4407.
[9] ALNWAIMI G, VAHID S, MOESSNER K. Dynamic heterogeneous learning games for opportunistic access in LTE-based macro/femtocell deployments[J]. IEEE Transactions on Wireless Communications, 2015, 14(4): 2294-2308.
[10] NOROOZOLIAEE M, HAMDAOUI B, TUMER K. Efficient objective functions for coordinated learning in large-scale distributed OSA systems[J]. IEEE Transactions on Mobile Computing, 2013, 12(5): 931-944.
[11] NADDAFZADEH-SHIRAZI G, KONG P Y, THAM C K. Distributed reinforcement learning frameworks for cooperative retransmission in wireless networks[J]. IEEE Transactions on Vehicular Technology, 2010, 59(8): 4157-4162.
[12] SUN Hao-ran, CHEN Xiang-yi, SHI Qing-jiang, et al. Learning to optimize: Training deep neural networks for wireless resource management[C]//2017 IEEE 18th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC). Sapporo: IEEE, 2017: 1-6.
[13] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[14] MAO Hong-zi, ALIZADEH M, MENACHE I, et al. Resource management with deep reinforcement learning[C]//Proceedings of the 15th ACM Workshop on Hot Topics in Networks. Atlanta GA: ACM, 2016: 50-56.
[15] RAVI S, DIAO Qi-ming. Large scale distributed semi-supervised learning using streaming approximation[C]//Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS). Cadiz: DBLP, 2016: 519-528.
[16] XIA Wen-chao, ZHENG Gan, ZHU Yong-xu, et al. A deep learning framework for optimization of MISO downlink beamforming[J]. IEEE Transactions on Communications, 2020, 68(3): 1866-1880.
[17] WANG Ting-wu, LIAO Ren-jie, BA J, et al. NerveNet: Learning structured policy with graph neural networks[EB/OL]. [2020-09-18]. http://www.cs.toronto.edu/~tingwuwang/nervenet.html.