Publication
SoCC 2022
Short paper

Cloud-native Workflow Scheduling using a Hybrid Priority Rule and Dynamic Task Parallelism

Abstract

Demand for efficient cloud-native workflow scheduling is growing as many data science workloads are composed of several tasks with dependencies. As container technology becomes more prevalent in cloud communities, containerized workflow orchestration tools have been introduced and have become the standard for scheduling workflows. However, current schedulers use simple heuristics and rely on the user’s choice of priority and parallelism level for tasks, without accounting for workflow-specific information. We introduce a workflow-aware scheduling algorithm that uses workflow information to schedule tasks without user input, with the objective of improving resource utilization and minimizing weighted workflow completion time, defined as workflow duration multiplied by the user-specified workflow priority. Our scheduler comprises two strategies: a hybrid priority rule inspired by production planning ideas, and a task splitting rule that sets the parallelism level based on a convex task processing time curve. Using simulation, we demonstrate that our algorithm (1) produces an efficient balance of weighted workflow completion time and resource utilization and (2) outperforms deterministic parallelism.
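
As a rough illustration of the two quantities named in the abstract, the sketch below computes a weighted workflow completion time (duration multiplied by priority) and picks a parallelism level from a convex processing time curve. It is not the paper's algorithm. The `Workflow` class, the curve `work / k + overhead * k`, and all parameter names are hypothetical choices made only for this example, and summing the weighted times across workflows is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    priority: float  # user-specified workflow priority (the weight)
    duration: float  # time from submission to completion


def weighted_completion_time(wf: Workflow) -> float:
    """Duration multiplied by the workflow's priority, as stated in the abstract."""
    return wf.priority * wf.duration


def total_weighted_completion_time(workflows: list[Workflow]) -> float:
    """Assumed aggregate objective: the sum over all workflows."""
    return sum(weighted_completion_time(wf) for wf in workflows)


def processing_time(work: float, overhead: float, k: int) -> float:
    """Hypothetical convex curve: splitting divides the work across k
    parallel pieces but adds a per-piece coordination overhead."""
    return work / k + overhead * k


def best_parallelism(work: float, overhead: float, max_k: int) -> int:
    """Pick the parallelism level in 1..max_k with the shortest processing time."""
    return min(range(1, max_k + 1), key=lambda k: processing_time(work, overhead, k))


if __name__ == "__main__":
    workflows = [Workflow("etl", 2.0, 10.0), Workflow("train", 1.0, 30.0)]
    print(total_weighted_completion_time(workflows))
    print(best_parallelism(work=100.0, overhead=4.0, max_k=16))
```

Under this assumed curve, adding parallelism beyond the minimizing level makes a task slower. That is the general reason a convex curve can give a task-specific parallelism level where a fixed, user-chosen level would not.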
