Machine Learning for Heterogeneous Ultra-Dense Networks with Graphical Representations

FAN Cong-min ZHANG Ying-jun YUAN Xiao-jun LI Si-xian

Citation: FAN Cong-min, ZHANG Ying-jun, YUAN Xiao-jun, LI Si-xian. Machine Learning for Heterogeneous Ultra-Dense Networks with Graphical Representations[J]. Journal of University of Electronic Science and Technology of China, 2020, 49(6): 826-836. doi: 10.12178/1001-0548.2020356

doi: 10.12178/1001-0548.2020356
  • CLC number: TN92
Publication history
  • Received: 2020-09-15
  • Revised: 2020-09-30
  • Published online: 2020-11-25
  • Issue date: 2020-11-23

    Corresponding author: YUAN Xiao-jun, E-mail: xjyuan@uestc.edu.cn

Abstract: Heterogeneous ultra-dense networks (H-UDNs) are regarded as a solution to sustain the explosive mobile traffic demand through network densification. By placing access points, processors, and storage units as close as possible to mobile users, H-UDNs bring many advantages, including high spectral efficiency, high energy efficiency, and low latency. Nonetheless, the high density and diversity of network entities in H-UDNs pose formidable design challenges to collaborative signal processing and resource management. This article illustrates the great potential of machine learning techniques in addressing these challenges. In particular, it shows how graphical representations of H-UDNs can be exploited to design efficient machine learning algorithms.

  • Driven by the astounding development of smart phones, mobile applications and the Internet of Things (IoT), traffic demand grows exponentially in mobile networks. Heterogeneous ultra-dense networks (H-UDNs) [such as femtocell/picocell, cloud radio access network (Cloud-RAN), and fog radio access network (Fog-RAN)] have emerged as a promising solution to sustain the enormous mobile traffic demand. The coverage and capacity of wireless networks are improved through densified access points (APs) and communication links, which greatly enhances the spatial reuse of limited frequency resources.

    H-UDNs, however, bring opportunities as well as formidable challenges. An H-UDN is expected to support an abundance of new applications with various new service requirements. For instance, massive machine-type communications require high connection density; autonomous driving requires low latency and ultra-high reliability; augmented reality requires both high throughput and low latency. To effectively and efficiently sustain these manifold requirements in a complicated system like an H-UDN is undoubtedly a mission worth pursuing. In addition, existing advanced techniques, such as multi-cell coordination and massive multiple-input multiple-output (MIMO), cannot be easily extended to an H-UDN. This is because many assumptions used in the existing techniques are no longer valid or accurate in H-UDNs. For example, the law of large numbers and random matrix theory have been widely used to simplify the analysis of massive MIMO. However, due to the random scattering of APs and users, the channels between APs and users follow heavy-tailed distributions that cannot be analyzed for algorithm design using the existing random matrix theory[1]. Moreover, the high density of devices leads to prohibitively high complexity if the existing algorithms are directly applied. Consider, for example, a system supporting thousands or even tens of thousands of machine-type devices in a small area. Jointly detecting the signals sent by the devices using, say, a simple linear minimum mean square error (LMMSE) detector would incur unaffordable computational cost, since the complexity of LMMSE grows cubically with the number of terminals.
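    As a rough illustration of this scaling issue (a minimal sketch with assumed dimensions and noise level, not taken from the cited works), the LMMSE estimate requires solving a K-by-K linear system, whose cost grows cubically with the number of devices K:

```python
import numpy as np

def lmmse_detect(H, y, noise_var):
    """LMMSE detection: solve (H^H H + sigma^2 I) x = H^H y.
    The K x K solve costs O(K^3), which dominates for large K."""
    K = H.shape[1]
    A = H.conj().T @ H + noise_var * np.eye(K)
    return np.linalg.solve(A, H.conj().T @ y)

# Toy run with assumed sizes: N receive antennas/APs, K devices sending BPSK symbols.
rng = np.random.default_rng(0)
N, K, noise_var = 64, 32, 0.1
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
x = rng.choice([-1.0, 1.0], size=K)
y = H @ x + np.sqrt(noise_var / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
x_hat = np.sign(lmmse_detect(H, y, noise_var).real)
print("symbol errors:", int(np.sum(x_hat != x)))
```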

    Machine learning is a family of promising techniques to address the above-mentioned challenges. Unlike traditional model-based approaches that are optimized based on mathematically convenient models, machine-learning based approaches are driven by real-world data, and thus are less sensitive to model imperfections. Machine learning has been extensively used to solve a wide variety of problems in image/audio processing, social behavior analysis, project management, etc. Its applications in wireless networks have started to attract research interest in recent years. As discussed in Refs. [2-3] and the references therein, machine learning approaches have potential applications in cognitive radios, massive MIMO, device-to-device communications, etc. The goal of this paper is to complement their contributions by investigating the use of machine learning in H-UDNs, especially for solving collaborative signal processing and resource allocation problems. Specifically, we discuss how to utilize graphical representations of H-UDNs to design efficient machine learning algorithms. We first introduce several recently proposed signal processing algorithms [namely, randomized message passing[1], bilinear generalized approximate message passing (BiG-AMP)[4], and deep learning[5]] based on the coverage graph of APs. Then, we extend the discussion to resource management problems, such as radio resource allocation, power allocation, and cache placement. Reinforcement learning, deep learning, and semi-supervised learning are introduced as potential solutions, where graphical models based on certain features of H-UDNs can greatly improve the efficiency of the solutions.

    • In this section, we first introduce the main entities of an H-UDN. Then, H-UDNs are modelled as graphs that depict various interactions between network entities. Based on these graphs, machine learning algorithms will be discussed in later sections for efficient network operations.

    • An H-UDN consists of various types of network architectures, as illustrated in Fig. 1. In the following, we introduce the typical network architectures as well as their unique features.

      1) Macrocell and Picocell/Femtocell: Macrocells provide wide-area radio coverage and are served by high-power base stations (BSs) in a cellular network. Picocells and femtocells are served by small, low-power BSs to provide uninterrupted coverage for end users even in areas that are difficult or expensive for macrocells to cover. In particular, femtocells are deployed, powered, and connected by end users or small businesses with little active control from network operators. Hence, the operation of femtocells is more autonomous than that of picocells and macrocells. As such, multi-cell coordination in an H-UDN must account for the different levels of processing power, backhaul limitation, information availability, controllability, and willingness to participate of different cells.

      Figure 1.  An H-UDN consists of a femtocell, a picocell, a macrocell, a Cloud-RAN, and a Fog-RAN, where the Fog-RAN incorporates the macrocell and part of the Cloud-RAN. A coverage graph and a conflict graph are induced from the Cloud-RAN.

      2) Cloud-RAN: A Cloud-RAN consists of three key components: ① distributed remote radio heads (RRHs), ② a pool of baseband processing units (BBUs) in a data-center cloud, and ③ a high-bandwidth, low-latency optical transport network connecting the BBUs and RRHs. Compared with traditional BSs, the RRHs are lightweight, allowing them to be deployed at high density and low cost. Meanwhile, the centralized BBU pool enables seamless coordination of RRHs for collaborative signal processing, radio resource allocation, network virtualization, etc. A natural challenge is thus to design scalable RRH coordination algorithms that avoid the high overhead and computational cost caused by the high density of RRHs.

      3) Fog-RAN: The key idea of Fog-RAN is to take full advantage of the local computing, communication, and storage capabilities at edge devices (such as RRHs, smartphones, and laptops), so as to avoid the heavy communication overhead and large latency caused by backhaul/fronthaul transmission and centralized processing. To fully utilize the capabilities of edge devices, tasks should be efficiently decomposed and assigned to different devices. As such, decentralized control is critical in a Fog-RAN. As a task offloading scheme, Fog-RAN is usually overlaid on other network architectures. For example, the Fog-RAN in Fig. 1 partially merges with a macrocell and a Cloud-RAN.

      The above-mentioned network architectures may coexist in a single H-UDN. For example, coordination among devices may rely on centralized processing in the BBU pool of a Cloud-RAN, while computational tasks may be offloaded to the network edge of a Fog-RAN for delay-sensitive applications. In a nutshell, an H-UDN is an inseparable continuum of different types of networks. Signal processing and resource management in such a complicated system become challenging due to the close interaction between different network entities. As a result, interference across different networks must be carefully managed. Likewise, different types of resources, including radio, computation, and storage, are to be coordinated at different levels of the network hierarchy and across different types of network entities. To fully utilize the resources of network entities without incurring significant implementation and computational costs, it is of utmost importance to design efficient resource management schemes for H-UDNs.

    • In this subsection, we describe the interactions between different entities of an H-UDN using graphical models.

      1) Coverage: The coverage of APs in an H-UDN can be modelled as a bipartite graph as shown in Fig. 1, where the APs (denoted by solid squares) and users (denoted by hollow squares) are treated as vertices. An edge connects an AP and a user if the AP serves the user. Similarly, this graphical model can be extended to the coverage of caches, computing units, etc.

      2) Contention: Contentions exist in an H-UDN due to limited resources. For example, nearby links interfere with each other, and thus cannot transmit on the same frequency channel at the same time. Likewise, multiple BSs compete for limited fronthaul and backhaul capacity. Different tasks compete for storage and computing power. By connecting the conflicting entities with edges, a conflict graph is constructed.

      Besides the discussions above, H-UDNs can be represented by graphs in many other aspects, such as cooperation among network entities, the dependency between content popularity and caching placement, time-varying connections between users and APs, and interactions between different network levels. Most of these graphs share two common properties. First, most of them are sparse, i.e., only a limited number of edges exist between vertices. Second, most of the graphs are locally dense, i.e., they contain many short loops. These two properties stem from the scattered physical locations of network entities: nearby entities are more likely to interfere with each other, and it is also easier for them to share information and cooperate, whereas interactions between faraway entities are weak and usually negligible. In what follows, we discuss how to exploit these two properties to design efficient machine learning algorithms for signal processing and resource management in H-UDNs.
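      As a small illustration of how such graphs arise from the physical layout (a minimal sketch; the node counts, coverage radius, and interference radius are assumed purely for illustration), the coverage and conflict graphs of a randomly deployed network can be built from pairwise distances:

```python
import numpy as np

rng = np.random.default_rng(1)
n_aps, n_users = 20, 50
ap_pos = rng.uniform(0, 1000, size=(n_aps, 2))     # AP positions in a 1 km x 1 km area (assumed)
user_pos = rng.uniform(0, 1000, size=(n_users, 2))

# Coverage (bipartite) graph: edge (n, k) if AP n serves user k, here "within 150 m".
cover_radius = 150.0
dist = np.linalg.norm(ap_pos[:, None, :] - user_pos[None, :, :], axis=-1)
coverage = dist < cover_radius                     # n_aps x n_users boolean adjacency

# Conflict graph: edge between two APs if they are close enough to interfere, here "within 300 m".
interf_radius = 300.0
ap_dist = np.linalg.norm(ap_pos[:, None, :] - ap_pos[None, :, :], axis=-1)
conflict = (ap_dist < interf_radius) & ~np.eye(n_aps, dtype=bool)

# Both graphs are sparse: each node interacts with only a few nearby nodes.
print("avg users covered per AP:", coverage.sum(axis=1).mean())
print("avg conflicting neighbors per AP:", conflict.sum(axis=1).mean())
```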

    • In a UDN, interference is significantly magnified due to the high density of APs. To fully exploit the benefits of network densification, efficient interference management schemes become essential. It is commonly believed that network cooperation is an effective technique to mitigate interference and enhance network performance. However, cooperative signal processing in UDNs is usually very costly for the following reasons. First, the amount of data in an H-UDN is extremely large, and jointly processing these data may involve prohibitively high computational complexity. Second, full-scale cooperation requires the estimation of a very large channel matrix consisting of the channel coefficients from all mobile users to all APs. This causes significant channel estimation overhead, and thereby fundamentally limits the cooperation gain. Third, with the limited capacity of the fronthaul network, excessive information sharing among APs is impossible. In this section, we discuss potential solutions to the above issues based on graphical representations of the H-UDN and the corresponding machine learning techniques.

    • We first illustrate the use of factor graphs and the associated message passing (MP) algorithms[6] for efficient signal processing in H-UDNs. A factor graph is a bipartite graph representing the factorization of a function, especially a probabilistic function. Assume, for example, that $X$, $Y$, and $Z$ are random variables that form a Markov chain. Then, their joint probability mass function ${P_{XYZ}}(x,y,z)$ can be factored as a product of ${P_X}(x)$, ${P_{Y|X}}(y|x)$, and ${P_{Z|Y}}(z|y)$. The factorization can be expressed by the factor graph shown in Fig. 2a. Specifically, the factor graph consists of variable nodes $\{ x,y,z\} $ (denoted by circles), factor nodes $\{ {P_X},{P_{Y|X}},{P_{Z|Y}}\} $ (denoted by squares), and edges. A variable node and a factor node are connected by an edge only when the variable is an input of the corresponding factor function. That is, the edges specify the dependence relations between the two types of nodes.

      A typical statistical inference problem is to infer unobserved variables based on observed variables and a given probability model. With the help of the factor-graph representation of the probability model, the problem translates into estimating the values of unobserved nodes conditioned on the values of observed ones. Efficient MP algorithms associated with the graphical representation can be designed to solve the inference problem. In an MP algorithm, there are two types of messages over an edge: 1) messages from a factor node to a variable node, and 2) messages from a variable node to a factor node. Each message over an edge is designed to capture the probabilistic information from nearby nodes, and is iteratively updated based only on the previous messages sent by nearby nodes. In this way, MP provides a distributed solution to the inference problem. Moreover, when the factor graph is sparse, the complexity can be further reduced. Notice that many signal processing problems in communications are inference problems, since the receiver in a communication system wants to estimate the unobserved transmitted signals by processing its observed received signals. Hence, factor-graph-based MP has great potential to provide an efficient solution for signal processing in H-UDNs. However, depending on the specific problem, MP algorithms need to be carefully adjusted to achieve satisfactory performance. In the following, we illustrate how to utilize the special features of the underlying systems to design efficient message-passing algorithms for signal processing. In particular, we discuss the applications of factor graphs to signal detection problems with and without perfect channel state information (CSI).
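      To make the sum-product idea concrete on the Markov-chain factor graph of Fig. 2a, the following minimal sketch (with assumed toy distributions, not code from the cited references) computes the posterior of $X$ given an observation of $Z$ by passing messages along the chain, and checks the result against brute-force marginalization:

```python
import numpy as np

# Toy Markov chain X -> Y -> Z over binary alphabets (all numbers assumed).
P_X = np.array([0.6, 0.4])                      # P_X(x)
P_Y_X = np.array([[0.7, 0.3],                   # P_{Y|X}(y|x): rows indexed by x
                  [0.2, 0.8]])
P_Z_Y = np.array([[0.9, 0.1],                   # P_{Z|Y}(z|y): rows indexed by y
                  [0.3, 0.7]])
z_obs = 1

# Sum-product on the chain: pass messages from the observed end toward X.
msg_Z_to_Y = P_Z_Y[:, z_obs]                    # factor P_{Z|Y} -> variable Y
msg_Y_to_X = P_Y_X @ msg_Z_to_Y                 # factor P_{Y|X} -> variable X
posterior_X = P_X * msg_Y_to_X
posterior_X /= posterior_X.sum()

# Brute-force check by summing the full joint over y.
joint = P_X[:, None] * P_Y_X * P_Z_Y[:, z_obs][None, :]
brute = joint.sum(axis=1)
brute /= brute.sum()
print(posterior_X, brute)                       # the two results should agree
```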

      Figure 2.  Two examples of factor graph.

      1) Signal processing with perfect CSI. In this example, we consider joint uplink signal detection in a Cloud-RAN. Suppose that the APs (i.e., RRHs) perfectly know the channel matrix ${{H}}$, and that the signals transmitted by the users follow an independent distribution. Then, the widely-used maximum a posteriori (MAP) estimator can be expressed as ${{\hat x}} = \arg \max \displaystyle\prod\limits_{n = 1}^N {P({y_n}|{{x}},{{H}})} \displaystyle\prod\limits_{k = 1}^K {P({x_k})} $, where ${y_n}$ is the signal received by AP $n$, and ${x_k}$ is the signal transmitted by user $k$. The corresponding factor graph is given in Fig. 2b. By treating the dashed rectangles as APs and users, the factor graph can be viewed as a graph induced from the coverage graph. More importantly, the factor graph shares certain common properties with the coverage graph. Recall that the coverage graph is a sparse and locally dense graph, since each AP can only receive reasonably strong signals from a small number of nearby users. Hence, the factor graph is also globally sparse and locally dense. On the one hand, the high sparsity significantly reduces the computational complexity of the messages in the message-passing algorithm. On the other hand, the local denseness compromises the convergence of the message-passing algorithm: MP converges when the factor graph is a tree or contains only long random loops, whereas locally dense factor graphs contain a large number of short loops. To improve the convergence of MP, several efficient methods have been proposed, such as damping and asynchronous updating. With damping, an updated message is a weighted average of the message in the previous iteration and the message calculated by the original message updating rule. With asynchronous updating, the messages are updated sequentially instead of in parallel, with the hope of breaking the effect of short loops if the updating schedule is appropriately chosen. However, so far there are no efficient and systematic methods to find an updating schedule that ensures the convergence of MP. Recently, Ref. [1] proposed a random asynchronous updating strategy to improve the convergence performance, where the messages are updated sequentially in a random order. Fig. 3a and Fig. 3b compare the convergence performance of MP with other widely-used algorithms. As shown in Fig. 3a, the MP algorithms converge much faster than the other algorithms. From Fig. 3b, we observe that the number of iterations needed by MP does not increase with the network size, while that of the other algorithms grows roughly linearly with the network size. This implies that MP with appropriate adjustment is an effective solution for scalable signal processing in large-scale H-UDNs.
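      The two convergence aids mentioned above can be sketched generically as follows. The update rule used below is only a stand-in (it is not the MP rule of Ref. [1]); the sketch merely shows where damping and a random asynchronous schedule enter an iterative message-update loop:

```python
import numpy as np

def damped_async_iterate(update, m0, damping=0.5, iters=50, seed=0):
    """Generic damped, randomly-scheduled (asynchronous) fixed-point iteration.
    `update(m, i)` returns the new value of message i given all current messages."""
    rng = np.random.default_rng(seed)
    m = m0.copy()
    for _ in range(iters):
        for i in rng.permutation(len(m)):                          # random asynchronous schedule
            m[i] = (1 - damping) * update(m, i) + damping * m[i]   # damped update
    return m

# Stand-in update rule (NOT the actual MP rule): m_i <- tanh(sum_j W_ij m_j + b_i).
rng = np.random.default_rng(3)
W = 0.3 * rng.standard_normal((8, 8))
b = rng.standard_normal(8)
update = lambda m, i: np.tanh(W[i] @ m + b[i])
print(damped_async_iterate(update, np.zeros(8)))
```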

      2) Blind signal detection. In the previous example, we have assumed perfect CSI at the receiver side. In practical systems, CSI acquisition consumes a substantial amount of system resources when the network size is large, and thus fundamentally limits the system capacity. To avoid the channel estimation overhead, another line of research works on blind-detection approaches, where both the channel matrix ${{H}}$ and the transmitted signals ${{x}}$ are estimated simultaneously from the received signals ${{y}}$. However, it has been proved that blind approaches can, at best, achieve the same degrees of freedom as training-based approaches, since an unknown channel makes signal detection more difficult. Recently, it was shown that channel sparsity in certain transformed domains can be exploited to improve the performance of blind detection. For example, in massive MIMO systems, the channel matrix ${{H}}$ is sparse in the angular domain. The MAP estimation can be written as

      $$({{\hat x}},{{\hat H}}) = \arg {\max\nolimits_{{{x}},{{H}}}}P({{y}}|{{x}},{{H}})P({{x}})P({{H}})$$

      where the sparsity of ${{H}}$ is incorporated in its prior distribution $P({{H}})$. Similar to the previous example, the above MAP estimation can also be represented by a factor graph. With careful design, MP algorithms, such as projected bilinear generalized approximate message passing (P-BiG-AMP)[4], can potentially be applied to solve the blind detection problem. Fig. 3c shows that the P-BiG-AMP algorithm significantly outperforms other existing detection schemes, and performs close to the ideal case with perfect channel knowledge.
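      As an illustration (an assumed modeling choice, not necessarily the exact prior used in Ref. [4]), a common way to encode angular-domain sparsity in $P({{H}})$ is an entry-wise Bernoulli-Gaussian prior on the angular-domain channel coefficients ${\tilde h_{nk}}$:

      $$P({{H}}) = \prod\limits_{n,k} {\left[ {(1 - \lambda )\delta ({{\tilde h}_{nk}}) + \lambda \mathcal{CN}({{\tilde h}_{nk}};0,\sigma _h^2)} \right]} $$

      where $\lambda$ is the fraction of nonzero entries and $\delta ( \cdot )$ denotes the Dirac delta.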

      Figure 3.  Convergence speed and performance of message passing (MP).

    • In this subsection, we discuss the design of efficient deep learning algorithms for signal processing in H-UDNs. In deep learning, a multi-layer neural network is trained to imitate the relationship between the input and output data based on a large training set. Once the neural network is well trained, it can complete the task it was trained for with very low complexity. Thanks to its outstanding performance, e.g., in image classification and speech recognition, deep learning has been considered a promising tool for future wireless network design. However, how to adapt existing deep learning techniques to wireless systems is still an open problem. Most of the previous work in image/audio processing has used very little prior information during the training of neural networks. In wireless systems, by contrast, there exists a large amount of prior information, including user locations, APs' coverage areas, distributions of channel gains, and so on. Ignoring such prior information would inevitably cause a performance loss. Recently, noticeable efforts have been made to utilize prior information to improve the performance of deep learning[5, 7-8]. In Ref. [5], a novel neural-network architecture was proposed to mimic the updating rule of a well-known signal processing algorithm, approximate message passing (AMP). As shown in Fig. 4a, the message updating function in each layer of the neural network is the same as that of AMP, but the updating matrices $\{ {{{B}}_1},{{{B}}_2}, \cdots , {{{B}}_M}\} $ are learned from the training data (e.g., a set of received and transmitted signals $\{ ({{{y}}^{(d)}},{{{x}}^{(d)}})\} _{d = 1}^D$). The proposed neural network significantly outperforms the original AMP algorithm in both computational time and accuracy. In addition, Ref. [5] showed that fixing the structure of the updating matrices according to the channel matrix ${{A}}$, i.e., letting ${{{B}}_m} = {{{C}}_m}{{A}}$, yields a more efficient learning procedure. This is because the channel matrix, as prior information, represents the relationship between the channel input and the channel output. When applied to large-scale UDNs, however, the approach in Ref. [5] might be impractical due to the high computational complexity and signaling overhead of estimating the channel matrix. On the other hand, as discussed in Subsection 1.2, the coverage graph of an H-UDN is often sparse and easy to obtain. Thus, it is natural to restrict the updating matrices in the neural networks to sparse matrices according to the coverage information. For example, Ref. [8] proposed a convolutional neural network (CNN)-based signal detector for large-scale banded systems, in which the receivers can only receive signals from nearby transmitters. As shown in Fig. 5, by utilizing the banded structure of the channel matrix ${{H}}$, a CNN is used to extract and share messages between consecutive signals. As all neurons in a convolutional layer share the same set of tunable parameters, the number of tunable parameters is significantly reduced, and the complexity of neural network training is drastically lowered.
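      The following minimal sketch illustrates the "unfolded iteration" idea behind Ref. [5]: each layer mimics one AMP/ISTA-style update $x \leftarrow \eta (x + {{{B}}_m}({{y}} - {{Ax}}))$, where the matrices ${{{B}}_m}$ and the thresholds would in practice be learned from training data. The soft-threshold nonlinearity, dimensions, and initialization below are illustrative assumptions only:

```python
import numpy as np

def soft_threshold(v, theta):
    """Element-wise shrinkage used as the per-layer nonlinearity (assumed choice)."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unfolded_forward(y, A, B_list, theta_list):
    """Forward pass of an unfolded iterative detector: each layer applies
    x <- eta(x + B_m (y - A x)), with B_m and theta_m as learnable parameters."""
    x = np.zeros(A.shape[1])
    for B, theta in zip(B_list, theta_list):
        x = soft_threshold(x + B @ (y - A @ x), theta)
    return x

# Untrained toy run with assumed dimensions; B_m = A^T / ||A||_2^2 as an initial guess.
rng = np.random.default_rng(4)
N, K, layers = 40, 100, 8
A = rng.standard_normal((N, K)) / np.sqrt(N)
x_true = np.zeros(K)
x_true[rng.choice(K, 10, replace=False)] = rng.standard_normal(10)   # sparse signal
y = A @ x_true + 0.01 * rng.standard_normal(N)
B0 = A.T / np.linalg.norm(A, 2) ** 2
x_hat = unfolded_forward(y, A, [B0] * layers, [0.05] * layers)
print("relative recovery error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```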

      Figure 4.  Neural networks to approximate existing signal processing/resource management algorithms.

      Figure 5.  Architecture of the CNN-based detector.

    • To date, most resource allocation problems in wireless networks have been formulated and solved as optimization problems, thanks to the availability of efficient algorithms when the formulation is in a nice (e.g., convex) form. A limitation of optimization methods is that they rely heavily on rigidly defined and mathematically convenient models. However, in H-UDNs, the complicated interactions between various network entities render accurate modeling difficult. Even if an accurate model can be constructed, it is likely too complicated to be mathematically tractable. In contrast, driven by real-world data instead of pre-assumed models, machine learning algorithms hold significant potential. In this section, we discuss the design of efficient machine learning algorithms based on graphical models for resource management in H-UDNs; see the summary in Table 1.

      Table 1.  Machine Learning Techniques for Resource Management

      Learning technique | Key characteristic | Examples | Usage of graphical models
      Reinforcement learning | Governs agents in an environment to maximize cumulative rewards | Opportunistic access[9-10]; joint power control and relay selection[11] | Information sharing over edges; environment modeling based on graphs; performance analysis based on graph theory
      Deep learning | Uses a cascade of layers of nonlinear processing units for feature extraction and transformation | Algorithm approximation[12]; environment/policy modeling[13-14] | Neural network sparsification based on graphs to simplify the learning procedure
      Semi-supervised learning | Infers information of unlabeled data from a small amount of labeled data | Vertex labeling[15] | Propagating node information over edges
    • Reinforcement learning is concerned with how agents ought to take proper actions in a dynamic environment so as to maximize certain cumulative rewards. Regarding network entities as agents and the network as the environment, reinforcement learning is an ideal tool for resource management in H-UDNs. Here, we take a model-free reinforcement learning technique, namely Q-learning, as an example. As shown in Fig. 6, Q-learning finds an optimal action-selection policy by maximizing the accumulated reward, which is approximated by a Q-function. The Q-function is iteratively updated by observing the actual reward after the agent carries out an action in a given environment. For instance, Ref. [9] proposed a Q-learning algorithm with which femtocells design their transmission schemes based on their own actions and received reward signals so as to maximize their reward functions. In Ref. [9], femtocells act as a secondary network and perform opportunistic channel access in a distributed manner, which reduces the implementation complexity. One key challenge of applying existing Q-learning algorithms to H-UDNs is that, when the devices aim at maximizing their own reward functions, their behaviors often reduce each other's rewards. This is because an H-UDN consists of various kinds of devices that may have conflicting interests. To address this challenge, it is necessary to introduce a certain level of information exchange among different devices. For example, Ref. [11] proposed to replace individual reward functions with a global reward function to enhance the performance of joint power control and relay selection. However, obtaining the global reward requires collecting all the reward signals throughout the whole network, which is impractical due to the large network size. Hence, an approximation approach was proposed in Ref. [11] based on the coverage graph: each device approximates the global reward by averaging the rewards received from its neighboring devices in the coverage graph. Another way to avoid collecting the reward signals from the entire network is to design the scheme such that good system-level performance is achieved when individual reward functions are optimized. Ref. [10] considered an opportunistic spectrum access (OSA) problem in cognitive radio networks, where an individual objective function is derived for each user by decoupling the global reward function. Specifically, the OSA problem is transformed into a coloring problem over a conflict graph of the network with users as the vertices. For each user, the individual objective function consists of its own rewards and the rewards of the users conflicting with it. It is shown that the proposed algorithm achieves near-optimal performance when the conflict graph is a complete graph.
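      The following minimal sketch illustrates the Q-learning update of Fig. 6 for opportunistic channel selection; the environment model, rewards, and hyper-parameters are assumed purely for illustration and do not reproduce the schemes of Refs. [9-11] (the toy problem has a single state, so the update reduces to a bandit-style rule):

```python
import numpy as np

rng = np.random.default_rng(5)
n_channels = 4
p_idle = np.array([0.2, 0.5, 0.7, 0.9])   # assumed per-channel idle probabilities (unknown to the agent)

Q = np.zeros(n_channels)                   # single-state problem: one Q-value per action (channel)
alpha, epsilon = 0.1, 0.1                  # learning rate and exploration rate

for t in range(5000):
    # epsilon-greedy action selection
    a = rng.integers(n_channels) if rng.random() < epsilon else int(np.argmax(Q))
    reward = 1.0 if rng.random() < p_idle[a] else 0.0   # success only if the chosen channel is idle
    Q[a] += alpha * (reward - Q[a])        # Q-update (no next state in this single-state toy)

print("learned Q-values:", np.round(Q, 2))  # should approach p_idle
print("preferred channel:", int(np.argmax(Q)))
```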

      Figure 6.  In Q-learning, the Q-function is iteratively updated through observing the actual reward after the agent carries out an action in a given environment.

    • Deep learning is potentially a useful tool for resource management in H-UDNs due to its capability of modeling complex non-linear relationships. Deep neural networks can be used to replace existing high-complexity resource allocation algorithms. Recently, Ref. [12] approximated weighted MMSE (WMMSE) based power allocation using deep learning by treating the input and output of WMMSE as an unknown non-linear mapping. Unlike Ref. [5], which approximates the AMP algorithm by mimicking each iteration with a layer of the neural network, Ref. [12] used a fully-connected neural network to approximate the WMMSE algorithm (see Fig. 4b). Through simulations, the authors showed that the fully-connected deep neural network can achieve performance similar to that of WMMSE with much lower complexity. Ref. [16] proposed a general deep learning framework for the optimization of downlink beamforming. Specifically, the solution is obtained based on neural networks and the exploitation of expert knowledge, such as the uplink-downlink duality and the known structure of optimal solutions. Through simulations, the authors showed that the proposed framework can achieve near-optimal solutions while enjoying significantly reduced computational complexity.
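      A minimal sketch of this "learning to optimize" idea is given below: a small fully-connected network is treated as a black-box mapping from the channel gains of a K-link network to K transmit powers, and would be trained on input-output pairs generated by running WMMSE offline. The architecture, layer sizes, and initialization are illustrative assumptions and do not reproduce the network of Ref. [12]:

```python
import numpy as np

def mlp_power_allocation(h, weights, p_max=1.0):
    """Forward pass of a small fully-connected network mapping the K^2 channel
    gains of a K-link network to K transmit powers in [0, p_max]."""
    a = h.reshape(-1)
    for W, b in weights[:-1]:
        a = np.maximum(W @ a + b, 0.0)               # ReLU hidden layers
    W, b = weights[-1]
    return p_max / (1.0 + np.exp(-(W @ a + b)))      # sigmoid output keeps powers in range

# Randomly initialised (untrained) network with assumed layer sizes; in practice the
# weights would be fitted to (channel, WMMSE-output) samples generated offline.
rng = np.random.default_rng(6)
K, hidden = 5, 64
sizes = [K * K, hidden, hidden, K]
weights = [(0.1 * rng.standard_normal((n_out, n_in)), np.zeros(n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
h = np.abs(rng.standard_normal((K, K)))              # toy channel-gain matrix
print("predicted powers:", np.round(mlp_power_allocation(h, weights), 3))
```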

      Neural networks can naturally be applied to learn the network environment, such as prior information, reward functions, states of agents, and interactions between agents, all of which are useful in reinforcement learning schemes. This leads to deep reinforcement learning algorithms. In Ref. [13], the authors proposed to use a deep convolutional neural network to approximate the optimal action-value function of Q-learning. In this way, successful policies are learned directly from high-dimensional inputs in an end-to-end manner with very little model information. In Ref. [14], a deep reinforcement learning algorithm was proposed for job scheduling, with the policy represented by deep neural networks. Through simulations, it is shown that the proposed algorithm performs comparably to, or even better than, standard heuristic algorithms. Moreover, it does not require any prior information about the system environment. Recently, Ref. [17] proposed to replace the traditional DNNs in deep reinforcement learning algorithms with a graph neural network, which is constructed based on the graphical representation of actions. The proposed algorithm not only achieves performance comparable to state-of-the-art reinforcement learning methods, but also shows an outstanding capability to generalize to different scenarios. These encouraging observations indicate the great potential of developing deep reinforcement learning techniques for resource management in H-UDNs. Exploiting the graphical representation of the underlying networks to improve efficiency is undoubtedly a subject worth pursuing.

    • Semi-supervised learning is a powerful tool for training prediction systems with little supervision (i.e., only a small percentage of the training data is labeled). The learning procedure is usually performed over a graph in which a small fraction of vertices are labeled. The vertices in the partially-labeled graph are connected by an edge if they share some similarity, with the edge weight quantifying the similarity. Then, the learning algorithm gradually labels the unlabeled vertices by propagating information across the graph in a distributed manner, leading to efficient implementation on large graphs[15]. Many resource allocation problems in H-UDNs can be interpreted as vertex labeling (a.k.a. graph coloring) problems. For example, frequency channels can be allocated by labeling the devices with different labels (say, different subchannels). In practice, the subchannels occupied by devices at the network boundaries are predetermined to avoid collisions with nearby networks, which means that a small number of vertices corresponding to boundary devices are pre-labeled. The radio resource allocation problem can thereby be cast as a semi-supervised learning problem by carefully defining the edge weights (i.e., similarities). For example, we can set the weights between devices to be inversely proportional to the distances between them, so as to avoid interference among nearby devices. Another example is content placement in network caching. Vertices corresponding to caches are connected in a cooperation graph only when they are capable of jointly serving a user with the same contents. Constraints in terms of storage capacity, fronthaul capacity, transmission delay, etc., can be treated as pre-labeled data in semi-supervised learning algorithms.
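      A minimal label-propagation sketch in the spirit of graph-based semi-supervised learning[15] for subchannel assignment is given below: a few boundary devices carry fixed labels, edge weights decay with distance, and the remaining vertices are labeled by repeatedly averaging neighbor information. The network size, weight function, and label encoding are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
n, n_labels = 30, 3
pos = rng.uniform(0, 1000, size=(n, 2))                 # device positions (assumed)
dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
W = np.exp(-(dist / 200.0) ** 2) * (1 - np.eye(n))      # similarity decays with distance
W /= W.sum(axis=1, keepdims=True)                       # row-normalised propagation matrix

# A few pre-labeled "boundary" devices with fixed subchannels (one-hot rows of Y).
labeled = np.array([0, 1, 2])
Y = np.zeros((n, n_labels))
Y[labeled, [0, 1, 2]] = 1.0

# Iterative label propagation: spread labels over edges, then clamp the labeled nodes.
F = Y.copy()
for _ in range(100):
    F = W @ F
    F[labeled] = Y[labeled]
assignment = F.argmax(axis=1)                           # subchannel index per device
print(assignment)
```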

    • This article discussed machine learning as a promising solution for collaborative signal detection and resource management in H-UDNs with graphical representations. Incorporating the network structure by means of various graph representations, machine learning algorithms exhibit great potential in dealing with large-scale systems without relying on mathematically convenient models. The few examples discussed in this paper serve as an inspiration that calls for future research efforts in this field. There is no doubt that machine learning algorithms associated with graphical representations, once successfully designed, will significantly improve the performance of H-UDNs and consequently bring enormous economic benefits to mobile networks.

      Some potential challenges of applying machine learning in H-UDNs are listed below. First, as discussed before, how to incorporate various kinds of prior information into learning algorithms is an interesting future research topic. Meanwhile, the cost of acquiring prior information justifies a thorough investigation of the tradeoff between algorithm performance and the amount and type of prior information acquired (e.g., the exact distribution of a wireless channel vs. the family of distributions of a wireless channel). Second, the scalability in terms of processing cost, memory requirements, computation time, etc., is a key factor that determines the feasibility of machine learning algorithms in large-scale H-UDNs. It is therefore worthwhile to investigate the design of scalable machine learning algorithms by exploiting the underlying structure of physical networks. Third, the network environment of H-UDNs varies dynamically due to the mobility of users, the variation of wireless channels, and the interactions between network entities. A good machine learning algorithm must be able to adapt to environment dynamics in a timely and stable manner.

References (17)
