Ng, T. S. Eugene
2019-05-17
2019-05-17
2018-05
2018-06-11
May 2018
Wu, Dingming. "Oblivious yet High Performance Task Scheduling for Large Shared Clusters." (2018) Master’s Thesis, Rice University. https://hdl.handle.net/1911/105585.

Data analytics in large-scale clusters is gradually shifting from monolithic, centralized scheduling frameworks to distributed or hybrid scheduling frameworks. In these distributed or hybrid frameworks, task queues on workers have been widely adopted to reconcile conflicting task placements by different cluster schedulers. While many task scheduling policies are available for each worker, the impact of each policy on task performance and on ultimate job performance is not well understood. Consequently, the choice of scheduling policy for tasks is usually quite ad hoc, especially when task runtime information is not available beforehand. This thesis explores the task queuing effect by examining and comparing different scheduling policies for workers. We present the design and implementation of a worker-level task scheduler, Runway, that is oblivious to individual task runtime information while still providing high performance and fairness. We demonstrate Runway's effectiveness in reducing average task completion time while guaranteeing starvation-freedom through extensive evaluations. Results show that Runway can provide a 5× improvement in task performance and a 42% improvement in job performance under high load compared to the state-of-the-art solution.

application/pdf
eng
Copyright is held by the author, unless otherwise indicated.
Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
Cluster Scheduling
Task Scheduling
Big data frameworks
Non-clairvoyant Scheduling
Oblivious yet High Performance Task Scheduling for Large Shared Clusters
Thesis
2019-05-17