Browsing by Author "Luo, Shangyu"
Item: Adding Vector and Matrix Support to SimSQL (2016-04-22)
Luo, Shangyu; Jermaine, Chris

In this thesis, I consider the problem of making linear algebra simple to use and efficient to run in a relational database management system. Relational database systems are widely used, and much of the data in the world is stored within them. Having linear algebra integrated into a relational database would provide great support for tasks such as in-database analytics and in-database machine learning. Currently, when it is necessary to perform such analyses, one must either extract the data from a database and use an external tool such as MATLAB, or else use the awkward linear algebra facilities that exist within the database. In this thesis, I focus on my four main contributions: (1) I add vector and matrix types to SQL, the most commonly used database programming language; (2) I design a few simple SQL language extensions to accommodate vectors and matrices; (3) I consider the problem of making vector and matrix operations efficient via integration with the database query optimizer; and (4) I conduct some experiments to show the efficacy of my language extensions.

Item: Automatic Matrix Format Exploration for Large Scale Linear Algebra (2020-10-30)
Luo, Shangyu; Jermaine, Christopher

The inputs to a linear algebra (LA) operation, such as matrices and vectors, can be stored in multiple ways: rows/columns, strips, blocks, etc. Usually, it is very difficult for a programmer to figure out the proper format to use to make an LA computation run fast. Predicting and optimizing the runtime behavior of an LA computation is not an easy task, even when one has expert knowledge of the underlying execution engine. The situation is particularly difficult if the computation consists of thousands of operations, and those operations must be run in a distributed manner.
In this paper, we argue that a parallel relational database can be made to explore the formats of LA computations automatically. More specifically, our system takes in existing code, analyzes the operations in that code, explores different formats for those operations, selects the most efficient formats, and finally generates new code that runs those operations in their selected formats. We show that our implementation is able to find formats that perform better than the formats chosen manually by an expert user of the system.
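
The explore-and-select step described in this abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes a simple linear chain of operations, hypothetical format names (`row_strip`, `col_strip`, `block`), made-up per-operation costs, and a flat hypothetical cost for converting between formats. Under those assumptions, a dynamic program picks one format per operation so that the total estimated cost is minimal.

```python
from typing import Dict, List, Tuple

# Hypothetical format names, chosen only for illustration.
FORMATS = ["row_strip", "col_strip", "block"]


def choose_formats(op_costs: List[Dict[str, float]],
                   convert_cost: float) -> Tuple[float, List[str]]:
    """Pick one format per operation in a linear chain.

    Minimizes the sum of per-operation costs plus a flat conversion
    cost whenever two consecutive operations use different formats.
    All costs are assumed inputs, not measurements.
    """
    # best[f] = (cheapest total cost of a plan ending in format f, that plan)
    best = {f: (op_costs[0][f], [f]) for f in FORMATS}
    for costs in op_costs[1:]:
        new_best = {}
        for f in FORMATS:
            candidates = []
            for g in FORMATS:
                prev_cost, prev_plan = best[g]
                switch = 0.0 if g == f else convert_cost
                candidates.append((prev_cost + switch + costs[f],
                                   prev_plan + [f]))
            new_best[f] = min(candidates, key=lambda c: c[0])
        best = new_best
    return min(best.values(), key=lambda c: c[0])


# Made-up cost estimates for a chain of three operations.
ops = [
    {"row_strip": 1, "col_strip": 5, "block": 3},
    {"row_strip": 4, "col_strip": 1, "block": 3},
    {"row_strip": 4, "col_strip": 5, "block": 1},
]
cost, plan = choose_formats(ops, convert_cost=1.0)
print(cost, plan)
```

With cheap conversions the sketch switches formats between operations; raising `convert_cost` makes it settle on a single format for the whole chain. A system of the kind the abstract describes would also have to handle computations that are not simple chains and would need cost estimates from the execution engine, both of which this sketch leaves out.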