A data placement approach for scientific workflow execution in hybrid clouds
Thesis posted on 28.03.2022, 23:44 by Amirmohammad Pasdar
Cloud computing has been widely adopted by industry practitioners and researchers. Recently, applications in science and engineering, such as scientific workflows, have also been increasingly deployed in clouds. As these applications become more resource-intensive in both data and computation, private clouds struggle to meet their resource requirements. Public clouds claim to overcome many shortcomings of private clouds. However, offloading workflow execution entirely to public clouds may introduce excessive data transfer and raise privacy and governance concerns. In this thesis, we propose a hybrid cloud solution for workflow scheduling that explicitly considers data placement. To this end, we present Hybrid Scheduling for Hybrid Clouds (HSHC), which schedules scientific workflows across private and public clouds and incorporates a novel dynamic data placement policy. HSHC consists of two phases: static and dynamic. The static phase uses an extended genetic algorithm to solve the workflow scheduling problem using static information about workflows and cloud resources. The dynamic phase adjusts scheduling and data placement decisions to reflect changing conditions during workflow execution in the hybrid cloud. We evaluate HSHC in terms of performance and cost using both real-world scientific applications and random workflows. Experimental results demonstrate that HSHC’s two-phase approach deals effectively with the dynamic nature of the hybrid cloud.