Network Density of States

Published on Jul 25, 2019 in KDD (Knowledge Discovery and Data Mining)
DOI: 10.1145/3292500.3330891
Kun Dong (Cornell University), Estimated H-index: 5
Austin R. Benson (Cornell University), Estimated H-index: 23
David Bindel (Cornell University), Estimated H-index: 28
Abstract
Spectral analysis connects graph structure to the eigenvalues and eigenvectors of associated matrices. Much of spectral graph theory descends directly from spectral geometry, the study of differentiable manifolds through the spectra of associated differential operators. But the translation from spectral geometry to spectral graph theory has largely focused on results involving only a few extreme eigenvalues and their associated eigenvectors. Unlike in geometry, the study of graphs through the overall distribution of eigenvalues --- the spectral density --- is largely limited to simple random graph models. The interior of the spectrum of real-world graphs remains largely unexplored, difficult to compute and to interpret. In this paper, we delve into the heart of spectral densities of real-world graphs. We borrow tools developed in condensed matter physics, and add novel adaptations to handle the spectral signatures of common graph motifs. The resulting methods are highly efficient, as we illustrate by computing spectral densities for graphs with over a billion edges on a single compute node. Beyond providing visually compelling fingerprints of graphs, we show how the estimation of spectral densities facilitates the computation of many common centrality measures, and use spectral densities to estimate meaningful information about graph structure that cannot be inferred from the extremal eigenpairs alone.
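A minimal sketch of the kind of condensed-matter machinery the abstract alludes to: a kernel-polynomial-method (KPM) density-of-states estimator using stochastic Chebyshev moments with Jackson damping. This is not the authors' code; it assumes a symmetric matrix with spectrum in [-1, 1] (e.g., the normalized adjacency matrix D^{-1/2} A D^{-1/2}), and the probe and moment counts are illustrative choices.

```python
import numpy as np

def kpm_dos(H, n_moments=50, n_probes=20, n_grid=200, seed=0):
    """Estimate the spectral density of a symmetric matrix H whose
    eigenvalues lie in [-1, 1], via stochastic Chebyshev moments."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    mu = np.zeros(n_moments)
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)      # Rademacher probe vector
        t_prev, t_cur = z, H @ z                 # T_0(H) z and T_1(H) z
        mu[0] += z @ t_prev
        mu[1] += z @ t_cur
        for m in range(2, n_moments):
            # Chebyshev recurrence: T_m = 2 H T_{m-1} - T_{m-2}
            t_prev, t_cur = t_cur, 2.0 * (H @ t_cur) - t_prev
            mu[m] += z @ t_cur
    mu /= (n_probes * n)                         # estimates tr(T_m(H)) / n

    # Jackson damping suppresses Gibbs oscillations in the expansion.
    M = n_moments
    k = np.arange(M)
    jackson = ((M - k + 1) * np.cos(np.pi * k / (M + 1))
               + np.sin(np.pi * k / (M + 1)) / np.tan(np.pi / (M + 1))) / (M + 1)

    x = np.linspace(-0.999, 0.999, n_grid)
    T = np.cos(np.outer(np.arccos(x), k))        # T_k evaluated on the grid
    coef = jackson * mu
    coef[1:] *= 2.0
    density = (T @ coef) / (np.pi * np.sqrt(1.0 - x ** 2))
    return x, density
```

Because only matrix-vector products with H are required, the cost scales with the number of edges, which is what makes billion-edge graphs feasible on one node.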
References (46)
Jul 19, 2018 in KDD (Knowledge Discovery and Data Mining)
#1David Cohen-Steiner (INRIA: French Institute for Research in Computer Science and Automation)H-Index: 30
#2Weihao Kong (Stanford University)H-Index: 10
Last. Gregory Valiant (Stanford University)H-Index: 31
view all 4 authors...
The spectrum of a network or graph G = (V, E) with adjacency matrix A consists of the eigenvalues of the normalized Laplacian L = I - D^{-1/2} A D^{-1/2}. This set of eigenvalues encapsulates many aspects of the structure of the graph, including the extent to which the graph possesses community structure at multiple scales. We study the problem of approximating the spectrum, lambda = (lambda_1, ..., lambda_{|V|}), of G in the regime where the graph is too large to explicitly calculate the spectrum...
Source
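As a small, concrete illustration of the definitions in the entry above (the toy graph and the dense eigensolver are illustrative only; the paper's whole point is the regime where this is infeasible):

```python
import numpy as np
import networkx as nx

# Build the normalized Laplacian L = I - D^{-1/2} A D^{-1/2}
# for a small graph and compute its full spectrum directly.
G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt
eigenvalues = np.linalg.eigvalsh(L)   # all lie in [0, 2]
```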
Aug 4, 2017 in KDD (Knowledge Discovery and Data Mining)
#1Nicole Eikmeier (Purdue University)H-Index: 5
#2David F. Gleich (Purdue University)H-Index: 33
By studying a large number of real-world graphs, we find empirical evidence that most real-world graphs have a statistically significant power-law distribution with a cutoff in the singular values of the adjacency matrix and eigenvalues of the Laplacian matrix, in addition to the commonly conjectured power-law in the degrees. Among these results, power-laws in the singular values appear more consistently than in the degree distribution. The exponents of the power-law distributions are much larger...
Source
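A crude sketch of the kind of measurement involved: an ordinary least-squares slope over the top singular values in log-log space. The helper name, the cutoff k, and the plain OLS fit are illustrative assumptions; the paper itself uses more careful statistical tests.

```python
import numpy as np
import scipy.sparse.linalg as spla

def powerlaw_slope(A, k=100):
    """Fit log(sigma_i) ~ alpha * log(i) over the top-k singular
    values of a sparse adjacency matrix A (illustrative OLS fit)."""
    sigma = spla.svds(A.asfptype(), k=k, return_singular_vectors=False)
    sigma = np.sort(sigma)[::-1]                 # descending order
    ranks = np.arange(1, k + 1)
    alpha, _ = np.polyfit(np.log(ranks), np.log(sigma), 1)
    return alpha
```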
#1Jeff CheegerH-Index: 46
Source
#1Jure Leskovec (Stanford University)H-Index: 121
#2Andrej Krevl (Stanford University)H-Index: 3
A collection of more than 50 large network datasets, from tens of thousands of nodes and edges to tens of millions of nodes and edges. It includes social networks, web graphs, road networks, internet networks, citation networks, collaboration networks, and communication networks.
#1Lloyd N. TrefethenH-Index: 70
Source
#1Luca Trevisan
We describe a new approximation algorithm for Max Cut. Our algorithm runs in \tilde{O}(n^2) time, where n is the number of vertices, and achieves an approximation ratio of 0.531. In instances in which an optimal solution cuts a (1 - \varepsilon) fraction of edges, our algorithm finds a solution that cuts a (1 - 4\sqrt{\varepsilon} + 8\varepsilon - o(1)) fraction of edges. Our main result is a variant of spectral partitioning, which can be implemented in nearly linear time. Given a graph in which t...
Source
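A toy sketch of the spectral-partitioning idea in the entry above: split vertices by the sign of the eigenvector for the smallest eigenvalue of the normalized adjacency matrix. The sign threshold and the dense eigensolver are simplifying assumptions, not the paper's exact rounding scheme.

```python
import numpy as np

def spectral_maxcut(A):
    """Toy spectral Max Cut: split vertices by the sign of the
    eigenvector for the smallest eigenvalue of D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    M = A / np.sqrt(np.outer(d, d))        # normalized adjacency matrix
    w, V = np.linalg.eigh(M)               # eigenvalues in ascending order
    side = V[:, 0] >= 0                    # eigenvector of smallest eigenvalue
    cut = A[np.ix_(side, ~side)].sum()     # edges crossing the partition
    return side, cut
```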
#1C. Seshadhri (SNL: Sandia National Laboratories)H-Index: 32
#2Tamara G. Kolda (SNL: Sandia National Laboratories)H-Index: 56
Last. Ali Pinar (SNL: Sandia National Laboratories)H-Index: 35
view all 3 authors...
Community structure plays a significant role in the analysis of social networks and similar graphs, yet this structure is little understood and not well captured by most models. We formally define a community to be a subgraph that is internally highly connected and has no deeper substructure. We use tools of combinatorics to show that any such community must contain a dense Erdős-Rényi (ER) subgraph. Based on mathematical arguments, we hypothesize that any graph with a heavy-tailed degree di...
Source
#1Haim Avron (TAU: Tel Aviv University)H-Index: 21
#2Sivan Toledo (TAU: Tel Aviv University)H-Index: 36
We analyze the convergence of randomized trace estimators. Starting in 1989, several algorithms have been proposed for estimating the trace of a matrix by (1/M) \sum_{i=1}^{M} z_i^T A z_i, where the z_i are random vectors; different estimators use different distributions for the z_i, all of which lead to E[(1/M) \sum_{i=1}^{M} z_i^T A z_i] = trace(A). These algorithms are useful in applications in which there is no explicit representation of A but rather an efficient method to compute z^T A z given z. Existing results only analyze th...
Source
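The estimator under analysis is short enough to sketch directly. A minimal version with Rademacher probe vectors, taking only a matrix-vector product routine rather than an explicit A (names and probe count are illustrative):

```python
import numpy as np

def hutchinson_trace(matvec, n, n_probes=100, seed=0):
    """Estimate trace(A) as (1/M) sum_i z_i^T A z_i with Rademacher z_i,
    given only a matrix-vector product routine `matvec` for A."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        total += z @ matvec(z)                # one sample of z^T A z
    return total / n_probes
```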
#1Dragoš CvetkovićH-Index: 33
#2Peter Rowlinson (University of Stirling)H-Index: 19
Last. Slobodan K. SimićH-Index: 23
view all 3 authors...
Preface 1. Introduction 2. Graph operations and modifications 3. Spectrum and structure 4. Characterizations by spectra 5. Structure and one eigenvalue 6. Spectral techniques 7. Laplacians 8. Additional topics 9. Applications Appendix Bibliography Index of symbols Index.
#1Nicholas J. Higham
The only book devoted exclusively to matrix functions, this research monograph gives a thorough treatment of the theory of matrix functions and numerical methods for computing them. The author's elegant presentation focuses on the equivalent definitions of f(A) via the Jordan canonical form, polynomial interpolation, and the Cauchy integral formula, and features an emphasis on results of practical interest and an extensive collection of problems and solutions. Functions of Matrices: Theory and C...
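For symmetric matrices, one of the equivalent definitions the book treats reduces to applying f to the eigenvalues. A small illustrative check against scipy's general-purpose funm (the test matrix is arbitrary):

```python
import numpy as np
from scipy.linalg import funm

A = np.array([[2.0, 1.0], [1.0, 3.0]])
w, V = np.linalg.eigh(A)                    # A = V diag(w) V^T
f_A = V @ np.diag(np.exp(w)) @ V.T          # f(A) with f = exp
assert np.allclose(f_A, funm(A, np.exp))    # matches scipy's funm
```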
Cited By (15)
#1Saurabh SawlaniH-Index: 7
#2Lingxiao Zhao (CMU: Carnegie Mellon University)H-Index: 7
Last. Leman Akoglu (CMU: Carnegie Mellon University)H-Index: 37
view all 3 authors...
Given a node-attributed graph, how can we efficiently represent it with few numerical features that expressively reflect its topology and attribute information? We propose A-DOGE, for Attributed DOS-based Graph Embedding, based on the density of states (DOS, a.k.a. spectral density) to tackle this problem. A-DOGE is designed to fulfill a long list of desiderata. Most notably, it capitalizes on efficient approximation algorithms for DOS, which we extend to blend in node labels and...
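A rough sketch of just the bare idea underneath (A-DOGE itself goes much further, blending in attributes and labels, and uses approximate rather than exact spectra): a normalized eigenvalue histogram as a fixed-length, permutation-invariant graph feature. The helper name and bin count are assumptions.

```python
import numpy as np

def dos_embedding(eigenvalues, n_bins=64, lo=-1.0, hi=1.0):
    """Embed a graph as a normalized histogram of its eigenvalues
    (a crude spectral-density fingerprint) -- a fixed-length,
    permutation-invariant vector usable for graph comparison."""
    hist, _ = np.histogram(eigenvalues, bins=n_bins, range=(lo, hi))
    return hist / hist.sum()
```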
#1Rajarshi Bhattacharjee (UMass: University of Massachusetts Amherst)H-Index: 2
#2Cameron Musco (UMass: University of Massachusetts Amherst)H-Index: 18
Last. Archan Ray (UMass: University of Massachusetts Amherst)H-Index: 1
view all 3 authors...
We study the problem of approximating the eigenspectrum of a symmetric matrix A \in \mathbb{R}^{n \times n} with bounded entries (i.e., \|A\|_{\infty} \leq 1). We present a simple sublinear time algorithm that approximates all eigenvalues of A up to additive error \pm \epsilon n using those of a randomly sampled \tilde{O}(1/\epsilon^4) \times \tilde{O}(1/\epsilon^4) principal submatrix. Our result can be viewed as a concentration bound on the full eigenspectrum of a rand...
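The sampling scheme described above is simple to sketch; the n/s rescaling follows the abstract, while the sizes and seed are illustrative.

```python
import numpy as np

def sampled_eigenvalues(A, s, seed=0):
    """Approximate the spectrum of a bounded-entry symmetric matrix A
    by scaling up the eigenvalues of a random s x s principal submatrix."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    idx = rng.choice(n, size=s, replace=False)  # random index subset
    sub = A[np.ix_(idx, idx)]                   # principal submatrix
    return (n / s) * np.linalg.eigvalsh(sub)    # rescale to the full matrix
```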
#1Tyler Chen (UW: University of Washington)H-Index: 4
#2Thomas Trogdon (UW: University of Washington)H-Index: 13
Last. Shashanka Ubaru (IBM)H-Index: 10
view all 3 authors...
The cumulative empirical spectral measure (CESM) \Phi[\mathbf{A}] : \mathbb{R} \to [0,1] of an n \times n symmetric matrix \mathbf{A} is defined as the fraction of eigenvalues of \mathbf{A} less than a given threshold, i.e., \Phi[\mathbf{A}](x) := \sum_{i=1}^{n} \frac{1}{n} \mathbb{1}[\lambda_i[\mathbf{A}] \leq x]. Spectral sums \operatorname{tr}(f[\mathbf{A}]) can be computed as the Riemann-Stieltjes integral of f against \Phi[\mathbf{A}], so the task of estimating C...
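Both the definition and the spectral-sum identity are easy to verify numerically on a small matrix (the dense eigendecomposition here is purely illustrative):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
A = (A + A.T) / 2                         # symmetric test matrix
lam = np.linalg.eigvalsh(A)

def cesm(x):
    """Phi[A](x): fraction of eigenvalues of A at or below x."""
    return np.mean(lam <= x)

# The spectral sum tr(f(A)) equals the sum of f over the eigenvalues,
# i.e. n times the Riemann-Stieltjes integral of f against the CESM.
assert np.isclose(np.sum(np.exp(lam)), np.trace(expm(A)))
```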
#1Vladimir Braverman (Johns Hopkins University)H-Index: 21
#2Aditya Krishnan (Johns Hopkins University)H-Index: 2
Last. Christopher Musco (NYU: New York University)H-Index: 17
view all 3 authors...
We analyze the popular kernel polynomial method (KPM) for approximating the spectral density (eigenvalue distribution) of a real symmetric (or Hermitian) matrix A \in \mathbb{R}^{n \times n}. We prove that a simple and practical variant of the KPM algorithm can approximate the spectral density to \epsilon accuracy in the Wasserstein-1 distance with roughly O(1/\epsilon) matrix-vector multiplications with A. This yields a provable linear time result for the problem. The KPM variant we ...
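The accuracy metric here is concrete enough to demonstrate: the Wasserstein-1 distance between two empirical spectral densities, computed from eigenvalues with scipy (the matrices are arbitrary stand-ins, not from the paper):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
A = (A + A.T) / 2                                   # symmetric test matrix
B = A + 0.01 * np.diag(rng.standard_normal(100))    # small perturbation

# Wasserstein-1 distance between the two empirical spectral densities.
dist = wasserstein_distance(np.linalg.eigvalsh(A), np.linalg.eigvalsh(B))
```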
#1Junteng Jia (Cornell University)H-Index: 10
#2Austin R. Benson (Cornell University)H-Index: 23
Semi-supervised learning on graphs is a widely applicable problem in network science and machine learning. Two standard algorithms -- label propagation and graph neural networks -- both operate by repeatedly passing information along edges, the former by passing labels and the latter by passing node features, modulated by neural networks. These two types of algorithms have largely developed separately, and there is little understanding about the structure of network data that would make one of t...
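A minimal sketch of the label-propagation half of that comparison, for real-valued labels on an undirected graph; the row-stochastic normalization, clamping, and iteration count are common choices rather than anything specific to this paper.

```python
import numpy as np

def label_propagation(A, y, labeled, n_iters=50):
    """Propagate known labels y (given on the `labeled` index set)
    along edges of A by repeated neighbor averaging.
    Assumes an undirected graph with no isolated nodes."""
    d = A.sum(axis=1)
    P = A / d[:, None]                    # row-stochastic transition matrix
    f = np.zeros_like(y, dtype=float)
    f[labeled] = y[labeled]
    for _ in range(n_iters):
        f = P @ f                         # average over neighbors
        f[labeled] = y[labeled]           # clamp the known labels
    return f
```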
Apr 20, 2020 in WWW (The Web Conference)
#1Anton Tsitsulin (University of Bonn)H-Index: 7
#2Marina Munkhoeva (Skolkovo Institute of Science and Technology)H-Index: 4
Last. Bryan Perozzi (Google)H-Index: 27
view all 3 authors...
Graph comparison is a fundamental operation in data mining and information retrieval. Due to the combinatorial nature of graphs, it is hard to balance the expressiveness of the similarity measure and its scalability. Spectral analysis provides quintessential tools for studying the multi-scale structure of graphs and is a well-suited foundation for reasoning about differences between graphs. However, computing the full spectrum of large graphs is computationally prohibitive; thus, spectral graph comp...
Source
Deep learning image classifiers usually rely on huge training sets, and their training process can be described as learning the similarities and differences among training images. But images in large training sets are not usually studied from this perspective, and fine-level similarities and differences among images are usually overlooked. This is due to a lack of fast and efficient computational methods to analyze the contents of these datasets. Some studies aim to identify the influential and redundant training images, but such methods require a model that is already trained on the entire training set....
#1Junteng JiaH-Index: 10
#2Austin R. BensonH-Index: 23
Graph neural networks aggregate features in vertex neighborhoods to learn vector representations of all vertices, using supervision from some labeled vertices during training. The predictor is then a function of the vector representation, and predictions are made independently on unlabeled nodes. This widely-adopted approach implicitly assumes that vertex labels are independent after conditioning on their neighborhoods. We show that this strong assumption is far from true on many real-world grap...
Pleasingly parallel computations are those that involve completely independent work. We investigate these in the context of a problem we call AllPageRank. The AllPageRank problem involves computing a subset of accurate PageRank entries for each possible seeded PageRank vector. AllPageRank is representative of a wider class of possible computational procedures that will run a large number of experiments on a single graph structure. Our study involves computing the AllPageRank vectors for a multi-...
Source
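A minimal sketch of computing a single seeded (personalized) PageRank vector by power iteration; AllPageRank amounts to repeating this independently for every seed, which is what makes the computation pleasingly parallel. The parameter choices are illustrative.

```python
import numpy as np

def seeded_pagerank(A, seed_node, alpha=0.85, tol=1e-10):
    """Personalized PageRank with all restart mass on one seed node.
    Assumes every node has at least one out-edge (no dangling nodes)."""
    n = A.shape[0]
    P = A / A.sum(axis=1)[:, None]        # row-stochastic transition matrix
    e = np.zeros(n)
    e[seed_node] = 1.0                    # restart distribution on the seed
    x = e.copy()
    while True:
        x_new = alpha * (P.T @ x) + (1 - alpha) * e
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new
```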