Deep Graph Representation Learning: Scalability and Applications

Date
2023-08-10
Abstract

The ubiquity of graphs in science and industry has motivated the development of graph representation learning algorithms, among which graph neural networks (GNNs) have emerged as one of the predominant computational tools. In general, GNNs apply a recursive message passing mechanism to learn the representation of each node by aggregating the representations of the node itself and its neighbors. Despite the promising results GNNs have achieved in many fields, their scalability and applicability remain too limited for complex, large-scale graph data. We consider the scalability of GNNs from two perspectives: model depth and data processing. First, GNNs are typically limited to fewer than three layers, which prevents them from effectively modeling high-order neighborhood dependencies. Second, GNNs are notorious for their memory and computation bottlenecks on large graphs, which contain enormous numbers of nodes and edges. Third, although many GNN prototypes have been proposed on benchmark datasets, it is not straightforward to apply GNNs to a new application at hand that requires specific domain knowledge.
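
For orientation, the message passing described above can be sketched as a single GCN-style layer. The following NumPy snippet is a minimal illustration under generic assumptions (the symmetric normalization and the names A, H, W are illustrative, not the specific architecture studied in this thesis):

    import numpy as np

    def message_passing_layer(A, H, W):
        """One GCN-style layer: aggregate neighbor features, then transform.

        A: (n, n) adjacency matrix with self-loops already added
        H: (n, d_in) node representations
        W: (d_in, d_out) learnable weights
        """
        # Symmetric degree normalization so high-degree nodes do not dominate
        deg = A.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
        A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
        # Each node mixes its own representation with its neighbors', then ReLU
        return np.maximum(A_hat @ H @ W, 0.0)

Stacking such layers recursively is what lets each node see higher-order neighborhoods, and is also where the depth and memory bottlenecks discussed above arise.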

To address these challenges, I have devoted my work to a series of studies that advance the optimization of deep GNNs, efficient training on large graphs, and their well-performing applications. Part I aims at scaling up the model depth of graph neural architectures to learn complex neighborhood structures. At the theoretical level, we analyze the over-smoothing issue in deep models, where the node representation vectors across the graph converge to similar embeddings. At the algorithmic level, we develop a set of novel tricks, including normalization, skip connections, and weight regularization, to tackle over-smoothing. At the benchmark level, we develop the first platform to comprehensively incorporate the existing tricks, evaluate them fairly, and propose a new deep GNN model with superior generalization performance across tens of benchmark datasets.
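
As a rough illustration of how such tricks compose, the sketch below combines a skip connection and row-wise embedding normalization in one layer. It is a generic example of these two tricks, not the exact formulation proposed in Part I (the mixing weight alpha and the helper names are assumptions):

    import numpy as np

    def deep_gnn_layer(A_hat, H, W, alpha=0.1):
        """One layer of a deep GNN with two common anti-over-smoothing tricks.

        A_hat: (n, n) pre-normalized adjacency matrix
        H:     (n, d) node embeddings entering the layer
        W:     (d, d) weight matrix
        alpha: skip-connection strength back to the layer input
        """
        Z = np.maximum(A_hat @ H @ W, 0.0)
        # Skip connection: keep part of the layer input so embeddings do not
        # collapse toward a single vector as depth grows
        Z = (1.0 - alpha) * Z + alpha * H
        # Row-wise normalization keeps embedding scales comparable across layers
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        return Z / np.maximum(norms, 1e-12)

The skip connection counteracts over-smoothing by re-injecting the input signal at every layer, while normalization prevents the embedding magnitudes from drifting as depth increases.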

In Part II, we present algorithms to enhance GNNs' scalability on large-scale graph datasets. We propose a novel training paradigm, graph isolated training, which partitions the large graph into many small clusters and trains an expert GNN on each of them. By cutting off inter-cluster communication, our solution significantly accelerates training while maintaining node classification accuracy. We also analyze the label bias issue in small mini-batches, which can lead GNNs to overfit. An adaptive label smoothing method is then designed to address the label bias and improve the model's generalization performance.
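
A minimal sketch of the cluster-wise training idea follows, assuming a precomputed node-to-cluster assignment (e.g., from a METIS-style partitioner) and a generic train_fn routine; these names are hypothetical, and the dissertation's actual procedure may differ:

    import numpy as np

    def cluster_isolated_training(A, X, y, assign, train_fn):
        """Train one expert GNN per cluster, dropping inter-cluster edges.

        A:        (n, n) adjacency matrix of the full graph
        X:        (n, d) node features
        y:        (n,) node labels
        assign:   (n,) cluster id per node, e.g. from a graph partitioner
        train_fn: user-supplied routine that trains one GNN on a subgraph
        """
        experts = {}
        for c in np.unique(assign):
            idx = np.where(assign == c)[0]
            # Keep only intra-cluster edges: inter-cluster communication is cut,
            # so each expert trains on a small, self-contained subgraph
            A_sub = A[np.ix_(idx, idx)]
            experts[c] = train_fn(A_sub, X[idx], y[idx])
        return experts

Similarly, standard label smoothing, of which the adaptive variant above is a batch-dependent refinement, can be written in a few lines; choosing eps per batch, as the adaptive method would, is not shown here:

    import numpy as np

    def smoothed_labels(y, num_classes, eps=0.1):
        """Mix one-hot targets with a uniform prior to soften hard labels."""
        one_hot = np.eye(num_classes)[y]
        return (1.0 - eps) * one_hot + eps / num_classes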

In Part III, we further explore the broad applications of GNNs. Based on the transfer learning paradigm of “pre-train, prompt, fine-tune”, we design the first graph prompting function. The graph prompt reformulates the downstream task to look the same as the pretext task, so that the pre-trained model transfers easily to the downstream problem. In bioinformatics, we extend GNNs to hierarchically learn molecular graph structures at different levels of abstraction. In tabular data mining, we use GNNs to explicitly learn the feature interactions between columns and make recommendations for each sample. Finally, I discuss future directions in graph machine learning.
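
To make the prompting idea concrete, one generic way to reformulate a downstream task is to attach a learnable prompt node to the input graph so that a frozen pre-trained GNN sees the task in a familiar form. The sketch below is an illustrative design under that assumption, not the specific prompting function proposed in Part III:

    import numpy as np

    def add_prompt_node(A, X, x_prompt):
        """Attach a learnable prompt node connected to every existing node.

        A:        (n, n) adjacency matrix of the downstream graph
        X:        (n, d) node features
        x_prompt: (d,) learnable prompt vector (a hypothetical design choice)
        """
        n = A.shape[0]
        A_new = np.zeros((n + 1, n + 1))
        A_new[:n, :n] = A
        # Link the prompt node to all nodes so its learned features steer
        # the pre-trained model toward the downstream task
        A_new[n, :n] = 1.0
        A_new[:n, n] = 1.0
        X_new = np.vstack([X, x_prompt[None, :]])
        return A_new, X_new

Only x_prompt would be optimized for the downstream task, leaving the pre-trained GNN weights untouched, which is the appeal of the prompting paradigm over full fine-tuning.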

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Deep graph neural networks, large-scale graph machine learning, graph batch bias, graph prompt, molecular graph representation learning.
Citation

Zhou, Kaixiong. "Deep Graph Representation Learning: Scalability and Applications." (2023) Diss., Rice University. https://hdl.handle.net/1911/115249.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.