Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters

dc.contributor.advisor: Sarkar, Vivek
dc.contributor.committeeMember: Cox, Alan
dc.contributor.committeeMember: Mellor-Crummey, John
dc.creator: Zhang, Yunming
dc.date.accessioned: 2016-01-07T22:00:11Z
dc.date.available: 2016-01-07T22:00:11Z
dc.date.created: 2014-12
dc.date.issued: 2014-05-27
dc.date.submitted: December 2014
dc.date.updated: 2016-01-07T22:00:11Z
dc.description.abstract: This research proposes a novel runtime system, Habanero Hadoop, to address the inefficient use of memory on multi-core machines in the existing Hadoop MapReduce runtime system. Insufficient memory for each map task makes it impossible to tackle large-scale problems such as genome sequencing and data clustering. The Habanero Hadoop system integrates a shared memory model into the fully distributed memory model of the Hadoop MapReduce system. The improvements eliminate duplication of in-memory data structures used in the map phase, making more memory available to each map task. Previous work on optimizing multi-core performance for MapReduce runtimes focused on maximizing CPU utilization rather than memory efficiency. My work provides multiple approaches that significantly improve the memory efficiency of the Hadoop MapReduce runtime. The optimized Habanero Hadoop runtime can increase the throughput and maximum input size for certain widely used data analytics applications such as Kmeans and Hash Join by 2x.
dc.format.mimetype: application/pdf
dc.identifier.citation: Zhang, Yunming. "Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters." (2014) Master’s Thesis, Rice University. https://hdl.handle.net/1911/87782.
dc.identifier.uri: https://hdl.handle.net/1911/87782
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: MapReduce
dc.subject: Memory
dc.title: Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Computer Science
thesis.degree.discipline: Engineering
thesis.degree.grantor: Rice University
thesis.degree.level: Masters
thesis.degree.name: Master of Science
Files
Original bundle
Name: ZHANG-DOCUMENT-2014.pdf
Size: 4.9 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text