Workflow Task Scheduling in Cloud Computing Based on Hybrid Improved CS Algorithm and Decision Tree

CHEN Chao

doi:10.3969/j.issn.1001-0548.2016.06.017

Existing workflow task scheduling schemes for cloud computing environments are analyzed. To address their long running time and low resource utilization, a workflow task scheduling scheme based on a hybrid improved cuckoo search (CS) algorithm and a decision tree is proposed. First, deadlines are assigned according to workflow task attributes. Next, the improved cuckoo search algorithm splits the workflow into several sub-workflows so as to minimize the data dependencies among them. Then, a decision tree selects the resources that meet the QoS constraints of the tasks. Finally, whether the deadline constraint is satisfied is judged from the sum of task computing time, queuing time and communication delay, and appropriate resources are configured accordingly. Experimental results show that the proposed scheme achieves a shorter total running time and a higher task completion rate.


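Cloud computing^[1] is a new computing technology in which users rent software, hardware, infrastructure and computing resources as their own basic resources and submit jobs to the cloud for processing or storage. Different users have different QoS requirements, so a cloud scheduler must be able to schedule workflows in a way that meets those requirements to the greatest extent possible. A scientific workflow^[2] turns the data management, computation, analysis and presentation tasks encountered in scientific research into independent services, and then composes these services through data links to meet researchers' needs in scientific experiments and data processing. Traditional computing environments can hardly meet the needs of scientific workflows, whereas cloud computing, with its high-performance computing resources, offers a new way to deploy and execute scientific workflow applications.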

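At present, workflow task scheduling in cloud computing environments, as an important part of cloud workflow technology, has become a research hotspot in the field. The main goals of workflow task scheduling are to reduce the total task execution time and the idle time of resources, and to improve resource utilization^[3]. To this end, this paper proposes a workflow task scheduling scheme for cloud computing environments. The scheme assigns deadlines according to workflow task attributes, uses the cuckoo search (CS) algorithm to split the workflow into several sub-workflows, uses a decision tree to select resources that satisfy the QoS constraints of the tasks, and finally configures the corresponding resources according to the deadline. Experimental results show that the proposed scheme achieves a shorter total running time and a higher task completion rate.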
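The final step of the scheme, as the abstract describes it, compares the sum of computing time, queuing time and communication delay with the task deadline. The following is a minimal sketch of that check, not the paper's implementation. The `Resource` fields, the unit of `speed`, and the rule of picking the feasible resource with the smallest estimate are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Resource:
    # All fields are hypothetical; the paper does not define a resource model here.
    name: str
    speed: float        # work units processed per second (assumed unit)
    queue_time: float   # seconds the task would wait in this resource's queue
    comm_delay: float   # seconds needed to move the task's input data here


def estimated_finish(work: float, r: Resource) -> float:
    """Computing time + queuing time + communication delay, in seconds."""
    return work / r.speed + r.queue_time + r.comm_delay


def pick_resource(work: float, deadline: float,
                  candidates: list[Resource]) -> Resource | None:
    """Return a candidate that meets the deadline, or None if none does.

    Choosing the smallest estimate among feasible candidates is an
    assumption; the source only says an appropriate resource is configured.
    """
    feasible = [r for r in candidates if estimated_finish(work, r) <= deadline]
    return min(feasible, key=lambda r: estimated_finish(work, r), default=None)
```

For example, a task of 10 work units on a resource with `speed=2.0`, `queue_time=5.0` and `comm_delay=1.0` has an estimate of 11 seconds, so it is rejected for any deadline shorter than that.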

1. Related Work

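In cloud computing environments, deciding when to rent virtual cloud resources and how to rent them is a hard problem. Some existing scheduling strategies rely mainly on job waiting time as the basis for their decisions and lack an effective evaluation of the dynamic service capability of resources. Some researchers have also proposed task scheduling schemes constrained by deadlines. For example, reference [4] describes a heterogeneous earliest finish time (HEFT) list scheduling scheme, which first assigns weights to the nodes and edges of the workflow graph to generate an ordered task list and then allocates resources according to that list; however, it has no mechanism for optimizing the scheduling order during subsequent scheduling. Reference [5] improves HEFT by incorporating a rollback mechanism (SHEFT), which enables it to pre-allocate and release resources dynamically. However, these schemes all take the assigned deadline as the criterion and do not estimate the actual execution time of tasks. Reference [6] proposes a task scheduling strategy for hybrid computing environments (Aneka) that takes workflow deadline requirements into account. It predicts job completion time mainly from task size in order to judge whether the time limit will be exceeded, but it does not consider the queue waiting time of tasks.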
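The ordering step of HEFT mentioned above is commonly described in terms of an "upward rank": a task's rank is its own computation weight plus the largest value of (edge weight + successor rank) over its successors, and tasks are listed in decreasing rank. The sketch below illustrates that general idea only. It is not taken from reference [4], and the function name and input dictionaries are hypothetical.

```python
from functools import lru_cache


def heft_order(compute, comm):
    """Order tasks by decreasing upward rank.

    compute: {task: average computation cost} (node weights)
    comm:    {(u, v): average communication cost} for each edge u -> v
    The workflow graph is assumed to be a DAG.
    """
    successors = {t: [] for t in compute}
    for (u, v) in comm:
        successors[u].append(v)

    @lru_cache(maxsize=None)
    def rank(t):
        # Exit tasks have no successors, so their rank is their own cost.
        tail = max((comm[(t, s)] + rank(s) for s in successors[t]), default=0.0)
        return compute[t] + tail

    return sorted(compute, key=rank, reverse=True)
```

Because a task's rank always exceeds that of each of its successors when costs are positive, the resulting list respects the dependency order of the workflow.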

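In addition, most existing cloud workflow scheduling schemes apply a specific algorithm to schedule the workflow directly, and few studies split the workflow into sub-workflows before scheduling. Reference [7] applies a graph partitioning algorithm to reduce data movement during workflow execution as far as possible and to minimize communication among intermediate nodes. However, it does not study resource scheduling for the partitioned workflow and is not suited to multi-resource cloud environments. Reference [8] proposes a dominant-queue task scheduling algorithm that determines the critical path from the computation load and transmission load of the task nodes, aggregates the tasks on the critical path, and runs them on the same resource. This reduces the communication overhead among resources during task execution to some extent, but because it works from the perspective of the large-scale graph structure, it lengthens the task scheduling time. In cloud applications, the tasks of a workflow are assigned to different resources, and because these tasks depend on one another, the resources must exchange data. If the relatively independent smaller tasks in a workflow can be separated out and the resulting sub-workflows assigned to resources that are as concentrated as possible, communication among resources will drop substantially and task execution efficiency will improve. Therefore, this paper uses a heuristic intelligent algorithm to partition the workflow, minimizing the data dependencies among sub-workflows.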
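One simple way to quantify the data dependency between sub-workflows is to add up the data carried by edges whose two endpoints fall in different sub-workflows. The sketch below shows this measure under that assumption; the paper's actual objective function is not given in this excerpt, and the names used are hypothetical.

```python
def cross_partition_data(edges, part):
    """Total data that must move between sub-workflows.

    edges: {(u, v): data size transferred from task u to task v}
    part:  {task: id of the sub-workflow the task is assigned to}
    """
    return sum(size for (u, v), size in edges.items() if part[u] != part[v])
```

A partition that keeps heavily communicating tasks together yields a smaller value, which is the property the partitioning step aims for.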

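The main contributions of this paper are as follows:

1) The CS algorithm is used to partition the workflow into sub-workflows with minimal data dependencies, which improves the efficiency of subsequent task scheduling.

2) The CS algorithm is improved with mutation and crossover mechanisms suited to workflow partitioning, which keep the algorithm from becoming trapped in local optima and improve its partitioning ability.

3) A decision tree is used to classify and select candidate resources, and the corresponding resources are allocated according to task deadline constraints and queuing time.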
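To make the first two items more concrete, the following is a rough sketch of a cuckoo-search-style loop for workflow partitioning. It is not the paper's improved CS algorithm, whose operators are not specified in this excerpt. The Lévy flight of standard cuckoo search is replaced here by a single-task reassignment, and the mutation operator, the crossover operator, the size limit `max_size`, and all parameter values are assumptions chosen for illustration.

```python
import random


def cut_cost(edges, assign, k, max_size):
    """Data crossing sub-workflow boundaries; oversized layouts cost +inf."""
    sizes = [0] * k
    for p in assign.values():
        sizes[p] += 1
    if max(sizes) > max_size:
        return float("inf")
    return sum(w for (u, v), w in edges.items() if assign[u] != assign[v])


def balanced_start(tasks, k, rng):
    """Random layout with near-equal part sizes (needs max_size >= ceil(n/k))."""
    order = list(tasks)
    rng.shuffle(order)
    return {t: i % k for i, t in enumerate(order)}


def mutate(assign, k, rng):
    """Move one randomly chosen task to a random sub-workflow."""
    child = dict(assign)
    child[rng.choice(sorted(child))] = rng.randrange(k)
    return child


def crossover(a, b, rng):
    """Take each task's sub-workflow from one of two parents at random."""
    return {t: (a[t] if rng.random() < 0.5 else b[t]) for t in a}


def cuckoo_partition(tasks, edges, k, max_size,
                     n_nests=10, pa=0.25, iters=200, seed=0):
    rng = random.Random(seed)

    def cost(s):
        return cut_cost(edges, s, k, max_size)

    nests = [balanced_start(tasks, k, rng) for _ in range(n_nests)]
    n_keep = n_nests - int(pa * n_nests)
    for _ in range(iters):
        # A cuckoo lays a mutated egg in a random nest; keep it if not worse.
        egg = mutate(nests[rng.randrange(n_nests)], k, rng)
        j = rng.randrange(n_nests)
        if cost(egg) <= cost(nests[j]):
            nests[j] = egg
        # Abandon the worst fraction pa of nests and rebuild them by
        # crossing two of the surviving nests.
        nests.sort(key=cost)
        for idx in range(n_keep, n_nests):
            a, b = rng.sample(nests[:n_keep], 2)
            nests[idx] = crossover(a, b, rng)
    return min(nests, key=cost)
```

Calling `cuckoo_partition(tasks, edges, k=2, max_size=3)` on a six-task workflow returns a task-to-sub-workflow mapping that respects the size limit. The size limit is there because, without it, placing every task in one sub-workflow would trivially give a cost of zero.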

4. Conclusion

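This paper proposes a workflow task scheduling scheme that combines the CS algorithm with a decision tree. Each task in the workflow is assigned a deadline, the CS algorithm splits the workflow into several sub-workflows, a decision tree selects suitable resources, and the corresponding resources are configured according to the deadline constraint. In experiments on data-intensive and compute-intensive workflows, the proposed scheme achieves a shorter total running time and a higher task completion rate than the SHEFT and Aneka schemes.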

Future work will consider real-world experiments on Amazon spot instances.

References (13)

[1]	ZHANG Peng, WANG Gui-ling, XU Xue-hui. A data placement approach for workflow in cloud[J]. Journal of Computer Research and Development, 2013, 50(3): 636-647. (in Chinese)
[2]	LIU Shao-wei, KONG Ling-mei, REN Kai-jun. A two-step data placement and task scheduling strategy for optimizing scientific workflow performance on cloud computing platform[J]. Chinese Journal of Computers, 2011, 34(11): 2121-2130. doi: 10.3724/SP.J.1016.2011.02121 (in Chinese)
[3]	LI W, WU J, ZHANG Q. Trust-driven and QoS demand clustering analysis based cloud workflow scheduling strategies[J]. Cluster Computing, 2014, 17(3): 1-18.
[4]	CANON L C, JEANNOT E. Evaluation and optimization of the robustness of DAG schedules in heterogeneous environments[J]. IEEE Transactions on Parallel & Distributed Systems, 2010, 21(4): 532-546.
[5]	MAO Y, ZHU L, CHEN X, et al. Associate task scheduling algorithm based on delay-bound constraint in cloud computing[C]//2012 13th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES).[S.l.]:IEEE, 2012:92-96.
[6]	CALHEIROS R N, VECCHIOLA C, KARUNAMOORTHY D. The Aneka platform and QoS-driven resource provisioning for elastic applications on hybrid clouds[J]. Future Generation Computer Systems, 2012, 28(6): 861-870. doi: 10.1016/j.future.2011.07.005
[7]	AHMAD S G, LIEW C S, RAFIQUE M M, et al. Data-intensive workflow optimization based on application task graph partitioning in heterogeneous computing systems[C]//2014 IEEE Fourth International Conference on Big Data and Cloud Computing (BdCloud).[S.l.]:IEEE, 2014:129-136.
[8]	MALAWSKI M, JUVE G, DEELMAN E, et al. Algorithms for cost-and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds[C]//Proceedings of the 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.[S.l.]:IEEE Computer Society, 2012:1-11.
[9]	TIAN Guo-zhong, XIAO Chuang-bai, XIE Jun-qi. Scheduling and fair cost-optimizing methods for concurrent multiple DAGs with deadline sharing resources[J]. Chinese Journal of Computers, 2014, 37(7): 1607-1619. (in Chinese)
[10]	CHEN W, DEELMAN E. Integration of workflow partitioning and resource provisioning[C]//IEEE/ACM International Symposium on Cluster, Cloud & Grid Computing.[S.l.]:IEEE Computer Society, 2012:764-768.
[11]	LI Xiang-tao, YIN Ming-hao. A hybrid cuckoo search via Lévy flights for the permutation flow shop scheduling problem[J]. International Journal of Production Research, 2013, 51(16): 4732-4754. doi: 10.1080/00207543.2013.767988
[12]	LARUMBE F, SANSO B. A cuckoo search algorithm for the location of data centers and software components in green cloud computing networks[J]. IEEE Transactions on Cloud Computing, 2013, 1(1): 22-35. doi: 10.1109/TCC.2013.2
[13]	GHAFARIAN T, DELDARI H, JAVADI B. Cycloid grid: a proximity-aware P2P-based resource discovery architecture in volunteer computing systems[J]. Future Generation Computer Systems, 2013, 29(6): 1583-1595. doi: 10.1016/j.future.2012.08.010


Corresponding author: CHEN Bin, bchen63@163.com
