Residual Correlation in Graph Neural Network Regression

Published on Aug 23, 2020 in KDD (Knowledge Discovery and Data Mining)
· DOI: 10.1145/3394486.3403101
Junteng Jia (Cornell University), Austin R. Benson (Cornell University)
A graph neural network transforms features in each vertex's neighborhood into a vector representation of the vertex. Afterward, each vertex's representation is used independently for predicting its label. This standard pipeline implicitly assumes that vertex labels are conditionally independent given their neighborhood features. However, this is a strong assumption, and we show that it is far from true on many real-world graph datasets. Focusing on regression tasks, we find that this conditional independence assumption severely limits predictive power. This should not be that surprising, given that traditional graph-based semi-supervised learning methods such as label propagation work in the opposite fashion by explicitly modeling the correlation in predicted outcomes. Here, we address this problem with an interpretable and efficient framework that can improve any graph neural network architecture simply by exploiting correlation structure in the regression residuals. In particular, we model the joint distribution of residuals on vertices with a parameterized multivariate Gaussian, and estimate the parameters by maximizing the marginal likelihood of the observed labels. Our framework achieves substantially higher accuracy than competing baselines, and the learned parameters can be interpreted as the strength of correlation among connected vertices. Furthermore, we develop linear time algorithms for low-variance, unbiased model parameter estimates, allowing us to scale to large networks. We also provide a basic version of our method that makes stronger assumptions on correlation structure but is painless to implement, often leading to great practical performance with minimal overhead.
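The "basic version" described in the abstract amounts to propagating regression residuals from labeled to unlabeled vertices, label-propagation style: the GNN's errors on observed vertices are spread along edges to correct the predictions on unobserved ones. A minimal numpy sketch of that idea (the toy graph, base predictions, and propagation weight `alpha` are all hypothetical, not taken from the paper):

```python
import numpy as np

# Toy graph: 4 vertices on a path, symmetric adjacency.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
S = A / A.sum(axis=1, keepdims=True)     # row-normalized propagation matrix

y = np.array([1.0, 2.0, 3.0, 4.0])       # true labels
base = np.array([1.5, 2.0, 2.5, 3.0])    # hypothetical GNN predictions
labeled = np.array([0, 3])               # vertices with observed labels

# Residual propagation: spread observed residuals over the graph,
# re-clamping the labeled vertices each step (label-propagation style).
r = np.zeros(4)
r[labeled] = y[labeled] - base[labeled]
alpha = 0.9
for _ in range(50):
    r = alpha * S @ r
    r[labeled] = y[labeled] - base[labeled]  # keep observed residuals fixed

corrected = base + r
```

On labeled vertices the correction recovers the observed label exactly; on unlabeled vertices it interpolates the neighbors' residuals, which helps whenever residuals of connected vertices are positively correlated.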
Benedek Rozemberczki (University of Edinburgh), Carl Allen, Rik Sarkar
We present network embedding algorithms that capture information about a node from the local distribution over node attributes around it, as observed over random walks following an approach similar to Skip-gram. Observations from neighborhoods of different sizes are either pooled (AE) or encoded distinctly in a multi-scale approach (MUSAE). Capturing attribute-neighborhood relationships over multiple scales is useful for a diverse range of applications, including latent feature identification ac...
85 Citations
Jul 25, 2019 in KDD (Knowledge Discovery and Data Mining)
Kun Dong (Cornell University), Austin R. Benson (Cornell University), David Bindel (Cornell University)
Spectral analysis connects graph structure to the eigenvalues and eigenvectors of associated matrices. Much of spectral graph theory descends directly from spectral geometry, the study of differentiable manifolds through the spectra of associated differential operators. But the translation from spectral geometry to spectral graph theory has largely focused on results involving only a few extreme eigenvalues and their associated eigenvectors. Unlike in geometry, the study of graphs through the ove...
15 Citations
Jul 25, 2019 in KDD (Knowledge Discovery and Data Mining)
Hongchang Gao (University of Pittsburgh), Jian Pei (Simon Fraser University), Heng Huang (University of Pittsburgh)
...attention in recent years. Unlike the standard convolutional neural network, graph convolutional neural networks perform the convolutional operation on the graph data. Compared with the generic data, the graph data possess the similarity information between different nodes. Thus, it is important to preserve this kind of similarity information in the hidden layers of graph convolutional neural networks. However, existing works fail to do that. On the other hand, it is challenging to enforce the h...
18 Citations
Jul 25, 2019 in KDD (Knowledge Discovery and Data Mining)
Junteng Jia (Cornell University), Michael T. Schaub (Massachusetts Institute of Technology), ..., Austin R. Benson (Cornell University) (4 authors)
We present a graph-based semi-supervised learning (SSL) method for learning edge flows defined on a graph. Specifically, given flow measurements on a subset of edges, we want to predict the flows on the remaining edges. To this end, we develop a computational framework that imposes certain constraints on the overall flows, such as (approximate) flow conservation. These constraints render our approach different from classical graph-based SSL for vertex labels, which posits that tightly connected ...
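Imposing (approximate) flow conservation can be phrased as a least-squares problem in the unknown edge flows: with B the node-edge incidence matrix, one minimizes the divergence ||B f||^2 over the unobserved entries of f while holding the measured entries fixed. A toy sketch on a directed 3-cycle (the graph and the observed flow value are illustrative, not from the paper):

```python
import numpy as np

# Directed incidence matrix B (nodes x edges) of a 3-cycle:
# edges e0: 0->1, e1: 1->2, e2: 2->0 (tail -1, head +1).
B = np.array([[-1,  0,  1],
              [ 1, -1,  0],
              [ 0,  1, -1]], dtype=float)

observed = {0: 2.0}        # flow measured on edge e0
unknown = [1, 2]

# Split B into observed / unknown columns and solve the least-squares
# problem min ||B_u f_u + B_o f_o||^2 (approximate flow conservation).
f_o = np.array([observed[0]])
B_o = B[:, [0]]
B_u = B[:, unknown]
f_u, *_ = np.linalg.lstsq(B_u, -B_o @ f_o, rcond=None)
```

On the cycle, conservation forces the same flow to circulate around every edge, so both unknown flows come out equal to the measured one.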
29 Citations
May 24, 2019 in ICML (International Conference on Machine Learning)
Jiaxuan You (Stanford University), Rex Ying (Stanford University), Jure Leskovec (Stanford University)
Learning node embeddings that capture a node's position within the broader graph structure is crucial for many prediction tasks on graphs. However, existing Graph Neural Network (GNN) architectures have limited power in capturing the position/location of a given node with respect to all other nodes of the graph. Here we propose Position-aware Graph Neural Networks (P-GNNs), a new class of GNNs for computing position-aware node embeddings. P-GNN first samples sets of anchor nodes, computes the di...
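One simple way to realize position-aware features is to record each node's shortest-path distance to a few sampled anchor nodes. A rough sketch of that idea (the graph, the anchor choice, and the 1/(d+1) transform are illustrative assumptions; P-GNN itself learns message aggregations over anchor sets rather than using raw distances):

```python
import numpy as np
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` via breadth-first search."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Path graph 0-1-2-3; anchors {0, 3} are a hypothetical sample.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
anchors = [0, 3]

# Position feature of each node: 1/(d+1) for its distance d to each
# anchor, so nearby anchors contribute larger values.
pos = np.array([[1.0 / (bfs_distances(adj, a)[v] + 1) for a in anchors]
                for v in sorted(adj)])
```

Unlike purely local aggregation, these features distinguish structurally similar nodes that sit in different parts of the graph.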
79 Citations
May 13, 2019 in WWW (The Web Conference)
Rania Ibrahim (Purdue University), David F. Gleich (Purdue University)
Diffusions, such as the heat kernel diffusion and the PageRank vector, and their relatives are widely used graph mining primitives that have been successful in a variety of contexts including community detection and semi-supervised learning. The majority of existing methods and methodology involves linear diffusions, which then yield simple algorithms involving repeated matrix-vector operations. Recent work, however, has shown that sophisticated and complicated techniques based on network embedd...
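The PageRank vector is the canonical example of such a linear diffusion: it is computable with nothing more than repeated matrix-vector products. A minimal power-iteration sketch (the toy graph and the damping factor 0.85 are illustrative):

```python
import numpy as np

# Undirected toy graph; node 2 has the highest degree.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
alpha, n = 0.85, A.shape[0]

# Linear diffusion: iterate x <- (1-alpha)/n * 1 + alpha * P^T x until
# it converges to the PageRank vector.
x = np.full(n, 1.0 / n)
for _ in range(100):
    x = (1 - alpha) / n + alpha * P.T @ x
```

Each step is a single matrix-vector product, which is what makes linear diffusions so cheap compared to the embedding-based techniques the abstract contrasts them with.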
13 Citations
Oct 1, 2018 in ICLR (International Conference on Learning Representations)
Keyulu Xu (Massachusetts Institute of Technology), Weihua Hu (Stanford University), ..., Stefanie Jegelka (Massachusetts Institute of Technology) (4 authors)
Graph Neural Networks (GNNs) are an effective framework for representation learning of graphs. GNNs follow a neighborhood aggregation scheme, where the representation vector of a node is computed by recursively aggregating and transforming representation vectors of its neighboring nodes. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited ...
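The neighborhood-aggregation scheme fits in a few lines: one layer combines a node's own vector with an aggregate of its neighbors' vectors, applies a weight matrix, then a non-linearity. A toy sketch (the graph, features, identity weights, and the mean aggregator are illustrative choices, not any specific GNN variant):

```python
import numpy as np

# Path graph 0-1-2 and 2-dimensional node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
W = np.eye(2)                          # hypothetical (identity) weights

# One aggregation layer: self vector + mean of neighbor vectors,
# linear transform, ReLU non-linearity.
neigh_mean = (A @ X) / A.sum(axis=1, keepdims=True)
H = np.maximum(0.0, (X + neigh_mean) @ W)
```

Stacking k such layers makes each representation depend on the k-hop neighborhood; the cited paper studies how the choice of aggregator (mean, max, sum) limits what such networks can distinguish.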
1,326 Citations
Jan 1, 2018 in NeurIPS (Neural Information Processing Systems)
..., Kilian Q. Weinberger (Cornell University), David Bindel (Cornell University)
Despite advances in scalable models, the inference tools used for Gaussian processes (GPs) have yet to fully capitalize on developments in computing hardware. We present an efficient and general approach to GP inference based on Blackbox Matrix-Matrix multiplication (BBMM). BBMM inference uses a modified batched version of the conjugate gradients algorithm to derive all terms for training and inference in a single call. BBMM reduces the asymptotic complexity of exact GP inference from O(n^3) to ...
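The core primitive behind this style of GP inference is solving K x = y using only matrix-vector products with the kernel matrix, which the conjugate gradients method provides. A minimal sketch (the 2x2 kernel matrix is a toy stand-in; BBMM itself batches these products and adds preconditioning):

```python
import numpy as np

def conjugate_gradients(mv, b, iters=50, tol=1e-12):
    """Solve A x = b for SPD A given only a matrix-vector product `mv`."""
    x = np.zeros_like(b)
    r = b - mv(x)          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(iters):
        Ap = mv(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Hypothetical SPD kernel matrix and training targets.
K = np.array([[2.0, 0.5], [0.5, 1.0]])
y = np.array([1.0, 2.0])
x = conjugate_gradients(lambda v: K @ v, y)
```

Because only `mv` is needed, the kernel matrix never has to be factorized (or even formed explicitly), which is what reduces the cubic cost of exact GP inference.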
186 Citations
The problem of estimating the trace of matrix functions appears in applications ranging from machine learning and scientific computing to computational biology. This paper presents an inexpensive method to estimate the trace of f(A) for cases where f is analytic inside a closed interval and A is a symmetric positive definite matrix. The method combines three key ingredients, namely, the stochastic trace estimator, Gaussian quadrature, and the Lanczos algorithm. As examples, we consider th...
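The first ingredient, the stochastic (Hutchinson) trace estimator, rests on the identity E[z^T M z] = tr(M) for random z with i.i.d. +/-1 entries. A sketch applied to the simplest case f(A) = A (the test matrix and sample count are illustrative; the paper's method pairs this estimator with Lanczos quadrature to handle general analytic f):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])    # SPD test matrix, exact trace = 9

# Hutchinson estimator: average z^T A z over random sign vectors z.
samples = 2000
est = 0.0
for _ in range(samples):
    z = rng.choice([-1.0, 1.0], size=A.shape[0])
    est += z @ A @ z
est /= samples
```

Each sample touches A only through matrix-vector products, so the estimator scales to matrices that are too large to factorize.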
112 Citations
William L. Hamilton, Rex Ying, Jure Leskovec
Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions...
801 Citations
Cited By 17
Konstantin Klemmer, Tianlin Xu, ..., Daniel B. Neill (4 authors)
From ecology to atmospheric sciences, many academic disciplines deal with data characterized by intricate spatio-temporal complexities, the modeling of which often requires specialized approaches. Generative models of these data are of particular interest, as they enable a range of impactful downstream applications like simulation or creating synthetic training data. Recent work has highlighted the potential of generative adversarial nets (GANs) for generating spatio-temporal data. A new GAN alg...
Austin Clyde, Ashka Shah, ..., Rick Stevens (5 authors)
Scaffold based drug discovery (SBDD) is a technique for drug discovery which pins chemical scaffolds as the framework of design. Scaffolds, or molecular frameworks, organize the design of compounds into local neighborhoods. We formalize scaffold based drug discovery into a network design. Utilizing docking data from SARS-CoV-2 virtual screening studies and JAK2 kinase assay data, we showcase how a scaffold based conception of chemical space is intuitive for design. Lastly, we highlight the utili...
Songtao Liu (Pennsylvania State University), Hanze Dong (Pennsylvania State University), ..., Dinghao Wu (University of Texas at Arlington) (8 authors)
Data augmentation has been widely used in image data and linguistic data but remains under-explored on graph-structured data. Existing methods focus on augmenting the graph data from a global perspective and largely fall into two genres: structural manipulation and adversarial training with feature noise injection. However, the structural manipulation approach suffers information loss issues while the adversarial training approach may downgrade the feature quality by injecting noise. In this wor...
1 Citations
Kaixiong Zhou (Rice University), Ninghao Liu (University of Georgia), ..., Xia Hu (Rice University) (8 authors)
Graph neural networks (GNNs), which learn the node representations by recursively aggregating information from its neighbors, have become a predominant computational tool in many domains. To handle large-scale graphs, most of the existing methods partition the input graph into multiple sub-graphs (e.g., through node clustering) and apply batch training to save memory cost. However, such batch training will lead to label bias within each batch, and then result in over-confidence in model predicti...
1 Citations
Eli Chien, Chao Pan, ..., Olgica Milenkovic (4 authors)
Hypergraphs are used to model higher-order interactions amongst agents and there exist many practically relevant instances of hypergraph datasets. To enable efficient processing of hypergraph-structured data, several hypergraph neural network platforms have been proposed for learning hypergraph properties and structure, with a special focus on node classification. However, almost all existing methods use heuristic propagation rules and offer suboptimal performance on many datasets. We propose Al...
Jiong Zhu (University of Michigan), Junchen Jin (University of Michigan), ..., Danai Koutra (University of Michigan) (4 authors)
Recent studies have exposed that many graph neural networks (GNNs) are sensitive to adversarial attacks, and can suffer from performance loss if the graph structure is intentionally perturbed. A different line of research has shown that many GNN architectures implicitly assume that the underlying graph displays homophily, i.e., connected nodes are more likely to have similar features and class labels, and perform poorly if this assumption is not fulfilled. In this work, we formalize the relation...
Junteng Jia, Cenk Baykal, ..., Austin R. Benson (4 authors)
With the wide-spread availability of complex relational data, semi-supervised node classification in graphs has become a central machine learning problem. Graph neural networks are a recent class of easy-to-train and accurate methods for this problem that map the features in the neighborhood of a node to its label, but they ignore label correlation during inference and their predictions are difficult to interpret. On the other hand, collective classification is a traditional approach based on in...
May 3, 2021 in ICLR (International Conference on Learning Representations)
Jiaqi Ma (University of Michigan), Bo Chang (Google), ..., Qiaozhu Mei (University of Michigan) (4 authors)
Graph-structured data are ubiquitous. However, graphs encode diverse types of information and thus play different roles in data representation. In this paper, we distinguish the \textit{representational} and the \textit{correlational} roles played by the graphs in node-level prediction tasks, and we investigate how Graph Neural Network (GNN) models can effectively leverage both types of information. Conceptually, the representational information provides guidance for the model to construct bette...
2 Citations
Derek Lim (Cornell University), Xiuyu Li (Cornell University), ..., Ser-Nam Lim (4 authors)
Much data with graph structures satisfy the principle of homophily, meaning that connected nodes tend to be similar with respect to a specific attribute. As such, ubiquitous datasets for graph machine learning tasks have generally been highly homophilous, rewarding methods that leverage homophily as an inductive bias. Recent work has pointed out this particular focus, as new non-homophilous datasets have been introduced and graph representation learning models better suited for low-homophily set...
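A common way to quantify this principle is the edge homophily ratio: the fraction of edges whose endpoints share a label. A minimal sketch (the toy graph and labels are illustrative):

```python
import numpy as np

# Edge homophily: fraction of edges joining same-label endpoints.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
labels = np.array([0, 0, 1, 1])

same = sum(labels[u] == labels[v] for u, v in edges)
homophily = same / len(edges)
```

Values near 1 indicate the highly homophilous datasets the abstract describes; the non-homophilous benchmarks it introduces sit much lower on this scale.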
4 Citations
Yongyi Yang, Tang Liu, ..., David Wipf (9 authors)
Despite the recent success of graph neural networks (GNN), common architectures often exhibit significant limitations, including sensitivity to oversmoothing, long-range dependencies, and spurious edges, e.g., as can occur as a result of graph heterophily or adversarial attacks. To at least partially address these issues within a simple transparent framework, we consider a new family of GNN layers designed to mimic and integrate the update rules of two classical iterative algorithms, namely, pro...
2 Citations