Deep Graph Representation Learning: Scalability and Applications

dc.contributor.advisorHu, Xia (Ben)
dc.creatorZhou, Kaixiong
dc.date.accessioned2023-09-01T19:57:12Z
dc.date.available2023-09-01T19:57:12Z
dc.date.created2023-08
dc.date.issued2023-08-10
dc.date.submittedAugust 2023
dc.date.updated2023-09-01T19:57:13Z
dc.description.abstractThe ubiquity of graphs in science and industry has motivated the development of graph representation learning algorithms, among which graph neural networks (GNNs) have emerged as one of the predominant computational tools. In general, GNNs apply a recursive message passing mechanism to learn the representation of each node by incorporating the representations of the node itself and its neighbors. Despite the promising results GNNs have achieved in many fields, their scalability and applicability remain too limited to learn complex and large-scale graph data. The scalability of GNNs is defined from two perspectives: model depth scalability and data processing scalability. First, the model depth of GNNs is often less than three layers, which prevents one from effectively modeling high-order neighborhood dependencies. Second, GNNs are notorious for suffering from bottlenecks in memory space and computation time on large graphs, which are characterized by large numbers of nodes and edges. Third, although many GNN prototypes have been proposed on benchmark datasets, it is not straightforward to apply GNNs to a new application at hand that requires specific domain knowledge. To address the above challenges, I have devoted my research to a series of works that advance the optimization of deep GNNs, efficient training on large graphs, and well-performing applications. Part I aims at scaling up the model depth of graph neural architectures to learn complex neighborhood structures. At the fundamental theory level, we analyze the over-smoothing issue within deep models, in which the node representation vectors over the graph converge to similar embeddings. At the algorithm level, we develop a set of novel tricks, including normalization, skip connections, and weight regularization, to tackle over-smoothing.
At the benchmark level, we develop the first platform to comprehensively incorporate the existing tricks, fairly evaluate them, and propose a new deep GNN model with superior generalization performance across tens of benchmark datasets. In Part II, we present algorithms to enhance GNNs' scalability in learning large-scale graph datasets. A novel training paradigm, graph isolated training, is proposed to decouple a large graph into many small clusters and train an expert GNN for each of them. By cutting down inter-cluster communication, our solution significantly accelerates the training process while maintaining node classification accuracy. We also analyze the label bias issue arising in small batches, which can lead GNNs to overfit. An adaptive label smoothing method is then designed to address the label bias and improve the model's generalization performance. In Part III, we further explore the wide applications of GNNs. Based on the "pre-train, prompt, fine-tune" transfer learning paradigm, we design the first graph prompting function. The graph prompt reformulates the downstream task to look the same as the pretext task, so the pre-trained model transfers easily to the downstream problem. In the area of bioinformatics, we extend GNNs to hierarchically learn the different abstract structures of molecular graphs. In the area of tabular data mining, we use GNNs to explicitly learn the feature interactions between columns and make recommendations for each sample. Finally, I discuss future directions for graph machine learning.
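The message passing mechanism and the over-smoothing issue described in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy example (a 4-node path graph with plain mean aggregation and no learned weights), not the dissertation's actual models:

```python
import numpy as np

# Toy 4-node path graph as an adjacency matrix with self-loops
# (a hypothetical example; real GNNs add learned weights and nonlinearities).
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
# Row-normalize so each node averages itself and its neighbors.
P = A / A.sum(axis=1, keepdims=True)

# Initial 2-dimensional node features.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])

def propagate(X, layers):
    """Apply `layers` rounds of mean-aggregation message passing:
    each node mixes in its neighbors' current representations."""
    H = X
    for _ in range(layers):
        H = P @ H
    return H

# As depth grows, repeated averaging drives all node embeddings toward
# the same vector -- the over-smoothing issue analyzed in Part I.
for depth in (1, 3, 10, 50):
    H = propagate(X, depth)
    spread = H.max(axis=0) - H.min(axis=0)  # per-dimension embedding spread
    print(depth, spread)
```

Because `P` is row-stochastic with self-loops, deeper propagation contracts the rows of `H` toward a common stationary vector, which is why the embedding spread shrinks with depth; the normalization, skip-connection, and regularization tricks of Part I are designed to counteract exactly this collapse.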
dc.format.mimetypeapplication/pdf
dc.identifier.citationZhou, Kaixiong. "Deep Graph Representation Learning: Scalability and Applications." (2023) Diss., Rice University. https://hdl.handle.net/1911/115249.
dc.identifier.urihttps://hdl.handle.net/1911/115249
dc.language.isoeng
dc.rightsCopyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subjectDeep graph neural networks
dc.subjectlarge-scale graph machine learning
dc.subjectgraph batch bias
dc.subjectgraph prompt
dc.subjectmolecular graph representation learning
dc.titleDeep Graph Representation Learning: Scalability and Applications
dc.typeThesis
dc.type.materialText
thesis.degree.departmentComputer Science
thesis.degree.disciplineEngineering
thesis.degree.grantorRice University
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy
Files
Original bundle
Name: ZHOU-DOCUMENT-2023.pdf
Size: 3.92 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.98 KB
Format: Plain Text