High-Performance Data Multicast in Hybrid Data Center Networks

dc.contributor.advisor: Ng, T. S. Eugene (en_US)
dc.creator: Sun, Xiaoye Steven (en_US)
dc.date.accessioned: 2019-05-17T16:44:36Z (en_US)
dc.date.available: 2019-05-17T16:44:36Z (en_US)
dc.date.created: 2018-12 (en_US)
dc.date.issued: 2018-11-30 (en_US)
dc.date.submitted: December 2018 (en_US)
dc.date.updated: 2019-05-17T16:44:36Z (en_US)
dc.description.abstract: Nowadays, a significant number of big data processing applications, such as machine learning algorithms and database queries, are implemented on top of various distributed big data processing frameworks. The distributed computation logic in these applications relies heavily on data multicast, a data transfer pattern in which a piece of data is delivered to multiple destination servers. However, in these distributed frameworks, the state-of-the-art data multicast mechanisms are all based on application-layer multicast, in which data is delivered through unicast flows on top of an overlay network. This thesis proposes high-performance system components that solve the data multicast issue by leveraging hybrid data center networks. In a hybrid data center network, the racks are connected via a circuit switch (or a circuit-switched network) in addition to the traditional packet-switched network. Circuit switches fundamentally change the multicast communication capability among the servers, since they can be extended to support physical-layer multicast. This thesis achieves the goal of high performance from two critical aspects: multicast data transfer and multicast data scheduling. In the first part, the thesis presents Republic, a complete platform providing a high-performance "data multicast service" for applications running in hybrid data centers. Republic consists of a Republic agent daemon running on each server and a centralized Republic manager. The Republic agent (1) exposes a unified Republic API to the applications using the data multicast service, (2) talks with the Republic manager to request and return network resources for data multicast, and (3) achieves multicast data transfer efficiently and reliably. The Republic manager takes the multicast data scheduling algorithm as a plug-in module. Republic is implemented and deployed in a hybrid data center testbed. The testbed evaluation shows that Republic can improve data multicast in Apache Spark machine learning applications by as much as 4.0 times. In the second part, the thesis tackles the problem of scheduling multicast data transfer in a high-bandwidth circuit switch. The scheduling algorithm adopts the approaches of multi-hopping and segmented transfer. It aims to minimize the average demand completion time to deliver the most benefit to the applications. The algorithm exhibits up to 13.4 times improvement compared with the state-of-the-art solution. (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Sun, Xiaoye Steven. "High-Performance Data Multicast in Hybrid Data Center Networks." (2018) Diss., Rice University. https://hdl.handle.net/1911/105887. (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/105887 (en_US)
dc.language.iso: eng (en_US)
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. (en_US)
dc.subject: Data center network (en_US)
dc.subject: data multicast (en_US)
dc.subject: circuit switch (en_US)
dc.subject: network traffic scheduling (en_US)
dc.title: High-Performance Data Multicast in Hybrid Data Center Networks (en_US)
dc.type: Thesis (en_US)
dc.type.material: Text (en_US)
thesis.degree.department: Electrical and Computer Engineering (en_US)
thesis.degree.discipline: Engineering (en_US)
thesis.degree.grantor: Rice University (en_US)
thesis.degree.level: Doctoral (en_US)
thesis.degree.name: Doctor of Philosophy (en_US)
Files
Original bundle
Name: SUN-DOCUMENT-2018.pdf
Size: 2.73 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.85 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text