An Experimental Comparison of Complex Objects Implementations in Big Data Systems

dc.contributor.advisor: Jermaine, Christopher (en_US)
dc.creator: Sikdar, Sourav (en_US)
dc.date.accessioned: 2017-08-01T16:34:43Z (en_US)
dc.date.available: 2017-08-01T16:34:43Z (en_US)
dc.date.created: 2016-12 (en_US)
dc.date.issued: 2017-06-07 (en_US)
dc.date.submitted: December 2016 (en_US)
dc.date.updated: 2017-08-01T16:34:43Z (en_US)
dc.description.abstract: Many data management and analytics systems support complex objects. Dataflow platforms such as Spark and Flink allow programmers to manipulate sets consisting of objects from a host programming language, often Java. Document databases such as MongoDB make use of hierarchical interchange formats (most popularly JSON), which embody a data model where individual records can themselves contain sets of records. Systems such as Dremel and AsterixDB allow complex nesting of data structures. The desire to support such complex objects forces a system designer to ask: how should complex objects be implemented in a modern data management system? In this thesis, over a suite of representative data management tasks, I experimentally evaluate the performance implications of a wide variety of complex object implementations. The choice of object implementation can have a profound effect on performance. For example, the same external sort used to perform duplicate removal can take anywhere from half an hour to fourteen and a half hours, depending on the complex object implementation. A corollary is that a bad object implementation can doom system performance. In addition, I reaffirm the value, within a modern big data system, of the classical database way of storing complex objects, in which there is no distinction between the in-memory and over-the-wire data representations. (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Sikdar, Sourav. "An Experimental Comparison of Complex Objects Implementations in Big Data Systems." (2017) Master’s Thesis, Rice University. https://hdl.handle.net/1911/96019. (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/96019 (en_US)
dc.language.iso: eng (en_US)
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. (en_US)
dc.subject: Complex Objects Implementations (en_US)
dc.subject: Experimental Evaluation (en_US)
dc.subject: Big Data Systems (en_US)
dc.title: An Experimental Comparison of Complex Objects Implementations in Big Data Systems (en_US)
dc.type: Thesis (en_US)
dc.type.material: Text (en_US)
thesis.degree.department: Computer Science (en_US)
thesis.degree.discipline: Engineering (en_US)
thesis.degree.grantor: Rice University (en_US)
thesis.degree.level: Masters (en_US)
thesis.degree.name: Master of Science (en_US)
Files
Original bundle
Name: SIKDAR-DOCUMENT-2016.pdf
Size: 515.24 KB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text