Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters

dc.contributor.advisor: Sarkar, Vivek [en_US]
dc.contributor.committeeMember: Cox, Alan [en_US]
dc.contributor.committeeMember: Mellor-Crummey, John [en_US]
dc.creator: Zhang, Yunming [en_US]
dc.date.accessioned: 2016-01-07T22:00:11Z [en_US]
dc.date.available: 2016-01-07T22:00:11Z [en_US]
dc.date.created: 2014-12 [en_US]
dc.date.issued: 2014-05-27 [en_US]
dc.date.submitted: December 2014 [en_US]
dc.date.updated: 2016-01-07T22:00:11Z [en_US]
dc.description.abstract: This research proposes a novel runtime system, Habanero Hadoop, to tackle the inefficient utilization of multi-core machines' memory in the existing Hadoop MapReduce runtime system. Insufficient memory for each map task makes it impossible to tackle large-scale problems such as genome sequencing and data clustering. The Habanero Hadoop system integrates a shared memory model into the fully distributed memory model of the Hadoop MapReduce system. The improvements eliminate duplication of in-memory data structures used in the map phase, making more memory available to each map task. Previous work on optimizing multi-core performance for MapReduce runtimes focused on maximizing CPU utilization rather than memory efficiency. This work provides multiple approaches that significantly improve the memory efficiency of the Hadoop MapReduce runtime. The optimized Habanero Hadoop runtime can increase the throughput and maximum input size for certain widely used data analytics applications, such as Kmeans and Hash Join, by 2x. [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Zhang, Yunming. "Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters." (2014) Master’s Thesis, Rice University. https://hdl.handle.net/1911/87782. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/87782 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: MapReduce [en_US]
dc.subject: Memory [en_US]
dc.title: Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Computer Science [en_US]
thesis.degree.discipline: Engineering [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Masters [en_US]
thesis.degree.name: Master of Science [en_US]
Files
Original bundle
Name: ZHANG-DOCUMENT-2014.pdf
Size: 4.9 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text