Nonlinear Diffusion for Community Detection and Semi-Supervised Learning

Published on May 13, 2019 in WWW (The Web Conference)
· DOI :10.1145/3308558.3313483
Rania Ibrahim6
Estimated H-index: 6
(Purdue University),
David F. Gleich31
Estimated H-index: 31
(Purdue University)
Diffusions, such as the heat kernel diffusion and the PageRank vector, and their relatives are widely used graph mining primitives that have been successful in a variety of contexts including community detection and semi-supervised learning. The majority of existing methods and methodology involves linear diffusions, which then yield simple algorithms involving repeated matrix-vector operations. Recent work, however, has shown that sophisticated and complicated techniques based on network embeddings and neural networks can give empirical results superior to those based on linear diffusions. In this paper, we illustrate a class of nonlinear graph diffusions that are competitive with state of the art embedding techniques and outperform classic diffusions. Our new methods enjoy much of the simplicity underlying classic diffusion methods as well. Formally, they are based on nonlinear dynamical systems that can be realized with an implementation akin to applying a nonlinear function after each matrix-vector product in a classic diffusion. This framework also enables us to easily integrate results from multiple data representations in a principled fashion. Furthermore, we have some theoretical relationships that suggest choices of the nonlinear term. We demonstrate the benefits of these techniques on a variety of synthetic and real-world data.
📖 Papers frequently viewed together
4 Citations
3 Authors (Kai-Wei Chang, ..., Sathiya Keerthi)
10 Citations
1 Citations
#1Han XiaoH-Index: 10
#2Kashif RasulH-Index: 5
Last. Roland VollgrafH-Index: 1
view all 3 authors...
We present Fashion-MNIST, a new dataset comprising of 28x28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as it shares the same image size, data format and the structure of training and testing splits. The dataset is freely available a...
2,685 Citations
#1Juan Luis Vázquez (UAM: Autonomous University of Madrid)H-Index: 68
We describe the mathematical theory of diffusion and heat transport with a view to including some of the main directions of recent research. The linear heat equation is the basic mathematical model that has been thoroughly studied in the last two centuries. It was followed by the theory of parabolic equations of different types. In a parallel development, the theory of stochastic partial differential equations gives a foundation to the probabilistic study of diffusion.
58 CitationsSource
#1Dane Taylor (Statistical and Applied Mathematical Sciences Institute)H-Index: 17
#2Sean A. Myers (UNC: University of North Carolina at Chapel Hill)H-Index: 3
Last. Peter J. Mucha (UNC: University of North Carolina at Chapel Hill)H-Index: 51
view all 5 authors...
Numerous centrality measures have been developed to quantify the importances of nodes in time-independent networks, and many of them can be expressed as the leading eigenvector of some matrix. With the increasing availability of network data that changes in time, it is important to extend such eigenvector-based centrality measures to time-dependent networks. In this paper, we introduce a principled generalization of network centrality measures that is valid for any eigenvector-based centrality. ...
89 CitationsSource
Aug 13, 2016 in KDD (Knowledge Discovery and Data Mining)
#1Aditya Grover (Stanford University)H-Index: 17
#2Jure Leskovec (Stanford University)H-Index: 117
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuou...
4,651 CitationsSource
#1Austin R. Benson (Stanford University)H-Index: 21
#2David F. Gleich (Purdue University)H-Index: 31
Last. Jure Leskovec (Stanford University)H-Index: 117
view all 3 authors...
Networks are a fundamental tool for understanding and modeling complex systems in physics, biology, neuroscience, engineering, and social science. Many networks are known to exhibit rich, lower-order connectivity patterns that can be captured at the level of individual nodes and edges. However, higher-order organization of complex networks—at the level of small network subgraphs—remains largely unknown. Here, we develop a generalized framework for clustering networks on the basis of higher-order...
543 CitationsSource
Jun 19, 2016 in ICML (International Conference on Machine Learning)
#1Zhilin Yang (CMU: Carnegie Mellon University)H-Index: 30
#2William W. Cohen (CMU: Carnegie Mellon University)H-Index: 86
Last. Ruslan Salakhutdinov (CMU: Carnegie Mellon University)H-Index: 96
view all 3 authors...
We present a semi-supervised learning framework based on graph embeddings. Given a graph between instances, we train an embedding for each instance to jointly predict the class label and the neighborhood context in the graph. We develop both transductive and inductive variants of our method. In the transductive variant of our method, the class labels are determined by both the learned embeddings and input feature vectors, while in the inductive variant, the embeddings are defined as a parametric...
553 Citations
#1Xiaoran YanH-Index: 11
#2Shang-Hua Teng (SC: University of Southern California)H-Index: 60
Last. Rumi Ghosh (Bosch)H-Index: 18
view all 4 authors...
Abstract : We study the interplay between a dynamical process and the structure of the network on which it unfolds using the parameterized Laplacian framework. This framework allows for defining and characterizing an ensemble of dynamical processes on a network beyond what the traditional Laplacian is capable of modeling. This, in turn, allows for studying the impact of the interaction between dynamics and network topology on the quality-measure of network clusters and centrality, in order to ef...
7 CitationsSource
#1David F. GleichH-Index: 31
#2Michael W. MahoneyH-Index: 60
7 CitationsSource
Feb 12, 2016 in AAAI (National Conference on Artificial Intelligence)
#1James Voss (OSU: Ohio State University)H-Index: 9
#2Mikhail Belkin (OSU: Ohio State University)H-Index: 45
Last. Luis Rademacher (OSU: Ohio State University)H-Index: 12
view all 3 authors...
In recent years, spectral clustering has become a standard method for data analysis used in a broad range of applications. In this paper we propose a new class of algorithms for multiway spectral clustering based on optimization of a certain "contrast function" over the unit sphere. These algorithms, partly inspired by certain Indepenent Component Analysis techniques, are simple, easy to implement and efficient. Geometrically, the proposed algorithms can be interpreted as hidden basis recovery b...
8 Citations
Aug 10, 2015 in KDD (Knowledge Discovery and Data Mining)
#1David F. Gleich (Purdue University)H-Index: 31
#2Michael W. Mahoney (University of California, Berkeley)H-Index: 60
Graph-based learning methods have a variety of names including semi-supervised and transductive learning. They typically use a diffusion to propagate labels from a small set of nodes with known class labels to the remaining nodes of the graph. While popular, these algorithms, when implemented in a straightforward fashion, are extremely sensitive to the details of the graph construction. Here, we provide four procedures to help make them more robust: recognizing implicit regularization in the dif...
39 CitationsSource
Cited By11
#1Li Chen (Georgia Institute of Technology)H-Index: 2
#2Richard Peng (Georgia Institute of Technology)H-Index: 27
Last. Di Wang (Google)H-Index: 12
view all 3 authors...
Diffusion is a fundamental graph procedure and has been a basic building block in a wide range of theoretical and empirical applications such as graph partitioning and semi-supervised learning on graphs. In this paper, we study computationally efficient diffusion primitives beyond random walk. We design an \widetilde{O}(m)time randomized algorithm for the \ell_2norm flow diffusion problem, a recently proposed diffusion model based on network flow with demonstrated graph clustering related ...
#1Sergio Picart-Armada (UPC: Polytechnic University of Catalonia)H-Index: 5
#2Wesley K. ThompsonH-Index: 68
Last. Alexandre Perera-Lluna (UPC: Polytechnic University of Catalonia)H-Index: 12
view all 4 authors...
MOTIVATION Network diffusion and label propagation are fundamental tools in computational biology, with applications like gene-disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This opens the question of the statistical properties and the presence of bias of such diffusion processes in each of its applicati...
1 CitationsSource
#1Georgios Drakopoulos (Ionian University)H-Index: 10
#2Eleana Kafeza (ZU: Zayed University)H-Index: 1
Last. Haseena Al Katheeri (ZU: Zayed University)H-Index: 3
view all 4 authors...
Startups arguably contribute to the current business landscape by developing innovative products and services. The discovery of business partners and employees with a specific background which can be verified stands out repeatedly as a prime obstacle. LinkedIn is a popular platform where professional milestones, endorsements, recommendations, and skills are posted. A graph search algorithm with a BFS and a DFS strategy for seeking trusted candidates in LinkedIn is proposed. Both strategies rely ...
2 CitationsSource
#1Qian HuangH-Index: 5
#2Horace HeH-Index: 5
Last. Austin R. BensonH-Index: 21
view all 5 authors...
Graph Neural Networks (GNNs) are the predominant technique for learning over graphs. However, there is relatively little understanding of why GNNs are successful in practice and whether they are necessary for good performance. Here, we show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs by combining shallow models that ignore the graph structure with two simple post-processing steps that exploit correlation in t...
16 Citations
Aug 23, 2020 in KDD (Knowledge Discovery and Data Mining)
#1Junteng Jia (Cornell University)H-Index: 9
#2Austin R. Benson (Cornell University)H-Index: 21
A graph neural network transforms features in each vertex's neighborhood into a vector representation of the vertex. Afterward, each vertex's representation is used independently for predicting its label. This standard pipeline implicitly assumes that vertex labels are conditionally independent given their neighborhood features. However, this is a strong assumption, and we show that it is far from true on many real-world graph datasets. Focusing on regression tasks, we find that this conditional...
13 CitationsSource
#1Francesco TudiscoH-Index: 12
#2Austin R. Benson (Cornell University)H-Index: 21
Last. Konstantin ProkopchikH-Index: 1
view all 3 authors...
Label spreading is a general technique for semi-supervised learning with point cloud or network data, which can be interpreted as a diffusion of labels on a graph. While there are many variants of label spreading, nearly all of them are linear models, where the incoming information to a node is a weighted sum of information from neighboring nodes. Here, we add nonlinearity to label spreading through nonlinear functions of higher-order structure in the graph, namely triangles in the graph. For a ...
4 Citations
#1Junteng JiaH-Index: 9
#2Austin R. BensonH-Index: 21
Graph neural networks aggregate features in vertex neighborhoods to learn vector representations of all vertices, using supervision from some labeled vertices during training. The predictor is then a function of the vector representation, and predictions are made independently on unlabeled nodes. This widely-adopted approach implicitly assumes that vertex labels are independent after conditioning on their neighborhoods. We show that this strong assumption is far from true on many real-world grap...
5 Citations