Abstract:
Scientific workflow systems provide scientific computing with workflow specification, process management, and task parallelism. High performance computing (HPC) platforms provide mechanisms and development interfaces such as cluster management, task management, and task scheduling. As we enter the "big data" era, it is necessary to integrate scientific workflow systems with HPC in order to carry out large-scale parallel computing on HPC platforms. The integration middleware interacts with the upper-level workflow system and the underlying HPC platform to support task submission and status monitoring. The integration architecture can serve as a reference solution for constructing computing platforms in distributed cluster environments. Taking the integration of the Swift scientific workflow system with the Windows HPC platform as a reference, we present a case study using a NASA MODIS image processing workflow to analyze and demonstrate the capability of the integrated system.