arXiv: Machine Learning

Papers (8,216 results)

#1 Yifei Wang (PKU: Peking University)

#2 Yisen Wang (PKU: Peking University) · H-Index: 18

Last. Zhouchen Lin (PKU: Peking University) · H-Index: 61


Recently, sampling methods have been successfully applied to enhance the sample quality of Generative Adversarial Networks (GANs). However, in practice, they typically have poor sample efficiency because of the independent proposal sampling from the generator. In this work, we propose REP-GAN, a novel sampling method that allows general dependent proposals by REParameterizing the Markov chains into the latent space of the generator. Theoretically, we show that our reparameterized proposal admits...
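The key contrast the abstract draws is between independent proposals from the generator and dependent, Markov-chain proposals in latent space. The following is a minimal sketch of that idea using a generic random-walk Metropolis sampler, not the authors' REP-GAN: the standard-normal toy log-density stands in for the (unspecified) reweighted GAN latent target, which is an assumption here.

```python
import numpy as np

def latent_mh(log_target, z0, n_steps=1000, step=0.5, rng=None):
    """Random-walk Metropolis in a latent space: each proposal depends on
    the current state, unlike independent proposals drawn from a prior."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = np.asarray(z0, dtype=float)
    lt = log_target(z)
    samples, accepts = [], 0
    for _ in range(n_steps):
        zp = z + step * rng.standard_normal(z.shape)  # dependent proposal
        ltp = log_target(zp)
        if np.log(rng.uniform()) < ltp - lt:          # MH accept/reject
            z, lt = zp, ltp
            accepts += 1
        samples.append(z.copy())
    return np.array(samples), accepts / n_steps

# Toy target: a standard normal latent density (a stand-in, not the
# paper's reparameterized GAN target).
samples, rate = latent_mh(lambda z: -0.5 * np.sum(z**2), np.zeros(2))
```

Because each proposal starts from the current state, acceptance rates stay reasonable even when an independent proposal would almost always be rejected, which is the sample-efficiency issue the abstract points at.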

#1 Hanna Tseran (MPG: Max Planck Society)

#2 Guido Montúfar (UCLA: University of California, Los Angeles) · H-Index: 17

Learning with neural networks relies on the complexity of the representable functions, but more importantly, the particular assignment of typical parameters to functions of different complexity. Taking the number of activation regions as a complexity measure, recent works have shown that the practical complexity of deep ReLU networks is often far from the theoretical maximum. In this work we show that this phenomenon also occurs in networks with maxout (multi-argument) activation functions and w...
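The "number of activation regions" complexity measure can be probed empirically. Below is a hedged one-hidden-layer ReLU sketch (the abstract's maxout case is analogous): count the distinct activation sign patterns encountered along a 1D line through input space. With 8 ReLU units, a line crosses at most 8 breakpoints, so at most 9 regions are possible; random networks typically realize far fewer than combinatorial bounds suggest, which is the phenomenon the abstract discusses.

```python
import numpy as np

def activation_pattern(x, W1, b1):
    """Sign pattern of the first-layer ReLU preactivations at input x."""
    return tuple((W1 @ x + b1 > 0).astype(int))

def count_regions_on_line(W1, b1, n_pts=2000):
    """Count distinct activation regions crossed along a 1D segment:
    each distinct pattern corresponds to one linear region."""
    ts = np.linspace(-3, 3, n_pts)
    direction = np.ones(W1.shape[1]) / np.sqrt(W1.shape[1])
    patterns = {activation_pattern(t * direction, W1, b1) for t in ts}
    return len(patterns)

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 2)), rng.standard_normal(8)
n_regions = count_regions_on_line(W1, b1)  # at most 9 along a line
```
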

#1 Nicolas Dewolf (UGent: Ghent University)

#2 Bernard De Baets (UGent: Ghent University) · H-Index: 64

Last. Willem Waegeman (UGent: Ghent University) · H-Index: 28


Over the last few decades, various methods have been proposed for estimating prediction intervals in regression settings, including Bayesian methods, ensemble methods, direct interval estimation methods and conformal prediction methods. An important issue is the calibration of these methods: the generated prediction intervals should have a predefined coverage level, without being overly conservative. In this work, we review the above four classes of methods from a conceptual and experimental poi...
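Of the four method classes the abstract lists, conformal prediction gives the calibration guarantee most directly. A minimal split-conformal sketch on synthetic data (the linear toy model and all constants are assumptions, not from the paper): the quantile of calibration residuals yields intervals whose empirical coverage lands near the nominal 1 − α level.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Assumed toy model: y = 2x + Gaussian noise.
    x = rng.uniform(-1, 1, n)
    return x, 2 * x + rng.normal(0, 0.3, n)

x_tr, y_tr = make_data(200)    # fit the point predictor
x_cal, y_cal = make_data(200)  # held-out calibration set
x_te, y_te = make_data(500)    # test set

# Point predictor: least-squares slope (intercept-free for brevity).
beta = np.sum(x_tr * y_tr) / np.sum(x_tr**2)

# Split conformal: the (1 - alpha)-adjusted quantile of calibration residuals.
alpha = 0.1
scores = np.abs(y_cal - beta * x_cal)
level = np.ceil((1 - alpha) * (len(scores) + 1)) / len(scores)
q = np.quantile(scores, level)

lo, hi = beta * x_te - q, beta * x_te + q
coverage = np.mean((y_te >= lo) & (y_te <= hi))  # close to 0.9
```

The guarantee is marginal and distribution-free, which is exactly the "predefined coverage level without being overly conservative" property the abstract asks calibrated methods to have.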

Generalized correlation analysis (GCA) is concerned with uncovering linear relationships across multiple datasets. It generalizes canonical correlation analysis that is designed for two datasets. We study sparse GCA when there are potentially multiple generalized correlation tuples in data and the loading matrix has a small number of nonzero rows. It includes sparse CCA and sparse PCA of correlation matrices as special cases. We first formulate sparse GCA as generalized eigenvalue problems at bo...
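For the two-dataset special case (CCA), the generalized eigenvalue formulation the abstract mentions can be solved by whitening both blocks, after which the leading canonical correlation is an ordinary top singular value. A small sketch under assumed synthetic data with one shared latent signal (none of this is the paper's sparse estimator):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 500, 3, 3
shared = rng.standard_normal((n, 1))  # latent signal common to both views
X = shared @ rng.standard_normal((1, p)) + 0.5 * rng.standard_normal((n, p))
Y = shared @ rng.standard_normal((1, q)) + 0.5 * rng.standard_normal((n, q))
X -= X.mean(0)
Y -= Y.mean(0)

Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n

def inv_sqrt(S):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

# Whitening turns the generalized eigenvalue problem into a plain SVD;
# the top singular value is the leading canonical correlation.
rho = np.linalg.svd(inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy),
                    compute_uv=False)[0]
```

Sparse GCA adds a row-sparsity constraint on the loading matrix on top of this spectral structure, which is what makes the problem hard beyond the classical case.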

#1 Meimei Liu (VT: Virginia Tech)

#2 Zhengwu Zhang (UNC: University of North Carolina at Chapel Hill) · H-Index: 10

Last. David B. Dunson (Duke University) · H-Index: 89


There has been huge interest in studying human brain connectomes inferred from different imaging modalities and exploring their relationship with human traits, such as cognition. Brain connectomes are usually represented as networks, with nodes corresponding to different regions of interest (ROIs) and edges to connection strengths between ROIs. Due to the high-dimensionality and non-Euclidean nature of networks, it is challenging to depict their population distribution and relate them to human t...
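The representation the abstract describes, nodes as ROIs and edges as connection strengths, is a symmetric weighted adjacency matrix. A tiny illustrative sketch (random toy matrix, not real connectome data) of the common flat vectorization, which discards the network structure and thereby motivates the geometry-aware modeling the abstract alludes to:

```python
import numpy as np

rng = np.random.default_rng(0)
n_rois = 5
A = rng.uniform(0, 1, (n_rois, n_rois))
A = (A + A.T) / 2           # symmetric connection strengths
np.fill_diagonal(A, 0.0)    # no self-connections

# Flat representation: vectorize the upper triangle. This is the usual
# high-dimensional Euclidean embedding, but it ignores that the entries
# came from a network.
iu = np.triu_indices(n_rois, k=1)
features = A[iu]            # length n_rois * (n_rois - 1) / 2 = 10
```
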

#1 Achille Thin (École Polytechnique) · H-Index: 2

#2 Nikita Kotelevskii (Skolkovo Institute of Science and Technology) · H-Index: 1

Last. Maxim Panov (Skolkovo Institute of Science and Technology) · H-Index: 7


Variational auto-encoders (VAE) are popular deep latent variable models which are trained by maximizing an Evidence Lower Bound (ELBO). To obtain tighter ELBO and hence better variational approximations, it has been proposed to use importance sampling to get a lower variance estimate of the evidence. However, importance sampling is known to perform poorly in high dimensions. While it has been suggested many times in the literature to use more sophisticated algorithms such as Annealed Importance ...
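The "tighter ELBO via importance sampling" mechanism can be checked on a conjugate toy model where the exact evidence is known. This is a hedged illustration of the importance-weighted bound, not the paper's annealed scheme: with prior z ~ N(0,1) and likelihood x|z ~ N(z,1), the evidence is p(x) = N(x; 0, 2), and the K-sample bound increases toward log p(x) as K grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_norm(x, mu, var):
    """Log density of N(mu, var) at x (vectorized)."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

x = 1.5
log_px = log_norm(x, 0.0, 2.0)  # exact evidence for this conjugate model

def iw_bound(K, n_rep=4000):
    """Importance-weighted ELBO with the prior as proposal: the weights
    reduce to p(x|z), and averaging K of them tightens the bound."""
    z = rng.standard_normal((n_rep, K))
    logw = log_norm(x, z, 1.0)
    return np.mean(np.log(np.mean(np.exp(logw), axis=1)))

b1, b16 = iw_bound(1), iw_bound(16)  # b1 <= b16 <= log_px (up to MC noise)
```

The abstract's point is that this plain importance-sampling route degrades in high dimensions, which is what motivates swapping in more sophisticated estimators such as Annealed Importance Sampling.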

#1 Ghassen Jerfel · H-Index: 9

#2 Serena Wang · H-Index: 8

Last. Michael I. Jordan · H-Index: 172


Variational Inference (VI) is a popular alternative to asymptotically exact sampling in Bayesian inference. Its main workhorse is optimization over a reverse Kullback-Leibler divergence (RKL), which typically underestimates the tail of the posterior leading to miscalibration and potential degeneracy. Importance sampling (IS), on the other hand, is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. The quality of IS crucially depends on the choice of t...
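The claim that reverse KL "underestimates the tail of the posterior" can be demonstrated numerically. A small grid-search sketch (a generic illustration, not the paper's method): fitting a single Gaussian to a bimodal target, the reverse-KL optimum locks onto one narrow mode while the forward KL spreads mass over both.

```python
import numpy as np

xs = np.linspace(-8, 8, 2001)
dx = xs[1] - xs[0]

def gauss(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# Bimodal stand-in for a posterior with well-separated modes.
p = 0.5 * gauss(xs, -2, 0.5) + 0.5 * gauss(xs, 2, 0.5)

def kl(a, b):
    """Numerical KL(a || b) on the grid, guarding against underflow."""
    m = a > 1e-12
    return float(np.sum(a[m] * np.log(a[m] / np.maximum(b[m], 1e-300))) * dx)

mus, sigmas = np.linspace(-3, 3, 31), np.linspace(0.3, 3.0, 28)
best_r, best_f = None, None
for mu in mus:
    for s in sigmas:
        q = gauss(xs, mu, s)
        r, f = kl(q, p), kl(p, q)  # reverse KL(q||p) vs forward KL(p||q)
        if best_r is None or r < best_r[0]:
            best_r = (r, mu, s)
        if best_f is None or f < best_f[0]:
            best_f = (f, mu, s)

mu_r, s_r = best_r[1], best_r[2]  # mode-seeking: one mode, narrow q
mu_f, s_f = best_f[1], best_f[2]  # mass-covering: centered, wide q
```

This mode-seeking behavior is the miscalibration risk the abstract names, and the motivation for correcting VI with importance sampling.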

Deep Linear Networks Dynamics: Low-Rank Biases Induced by Initialization Scale and L2 Regularization

#1 Arthur Jacot (EPFL: École Polytechnique Fédérale de Lausanne) · H-Index: 12

#2 François Gaston Ged (EPFL: École Polytechnique Fédérale de Lausanne) · H-Index: 2

Last. Clément Hongler (EPFL: École Polytechnique Fédérale de Lausanne) · H-Index: 16


For deep linear networks (DLN), various hyperparameters alter the dynamics of training dramatically. We investigate how the rank of the linear map found by gradient descent is affected by (1) the initialization norm and (2) the addition of L2 regularization on the parameters. For (1), we study two regimes: (1a) the linear/lazy regime, for large initialization norm; (1b) a "saddle-to-saddle" regime for small initialization norm. In the (1a) setting, the dynamics of...
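One concrete mechanism behind the L2-induced low-rank bias can be verified directly in the depth-2 case (a known identity, illustrated here as a sketch rather than the paper's full analysis): minimizing the Frobenius penalty over all factorizations W = W2 W1 yields twice the nuclear norm of W, and nuclear-norm penalties promote low rank. The balanced SVD factorization attains the minimum.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))

# Balanced factorization from the SVD: W = W2 @ W1 with the singular
# values split evenly between the two factors.
U, s, Vt = np.linalg.svd(W)
W1 = np.diag(np.sqrt(s)) @ Vt
W2 = U @ np.diag(np.sqrt(s))

penalty = np.sum(W1**2) + np.sum(W2**2)  # ||W1||_F^2 + ||W2||_F^2
nuclear = 2 * np.sum(s)                  # 2 * ||W||_* (nuclear norm)
# penalty == nuclear: L2 on the factors acts on the end-to-end map
# like a rank-promoting nuclear-norm penalty.
```
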

#1 Tomasz Piotrowski · H-Index: 17

#2 Renato L. G. Cavalcante · H-Index: 15

We derive conditions for the existence of fixed points of neural networks, an important research objective to understand their behavior in modern applications involving autoencoders and loop unrolling techniques, among others. In particular, we focus on networks with nonnegative inputs and nonnegative network parameters, as often considered in the literature. We show that such networks can be recognized as monotonic and (weakly) scalable functions within the framework of nonlinear Perron-Frobeni...
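The fixed-point setting the abstract describes can be made concrete with a toy monotone nonnegative layer (the specific map and contraction constant are assumptions for illustration, not the paper's conditions): for a ReLU layer with small nonnegative weights, fixed-point iteration converges to the same point from any nonnegative start.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.uniform(0, 1, (4, 4))  # nonnegative weights, contractive
b = rng.uniform(0, 1, 4)             # nonnegative bias

def f(x):
    """Monotone nonnegative layer: x >= y >= 0 implies f(x) >= f(y)."""
    return np.maximum(W @ x + b, 0.0)

def iterate(g, x0, n=200):
    """Plain fixed-point iteration x <- g(x)."""
    x = x0
    for _ in range(n):
        x = g(x)
    return x

x_star = iterate(f, np.zeros(4))
x_star2 = iterate(f, 10.0 * np.ones(4))  # same fixed point, different start
```

Monotonicity plus (weak) scalability is what lets nonlinear Perron-Frobenius theory deliver existence and uniqueness of such fixed points, which is the framework the abstract invokes.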

#2 Geoffrey Roeder · H-Index: 7

Last. Tai-Danae Bradley


We investigate a correspondence between two formalisms for discrete probabilistic modeling: probabilistic graphical models (PGMs) and tensor networks (TNs), a powerful modeling framework for simulating complex quantum systems. The graphical calculus of PGMs and TNs exhibits many similarities, with discrete undirected graphical models (UGMs) being a special case of TNs. However, more general probabilistic TN models such as Born machines (BMs) employ complex-valued hidden states to produce novel f...
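The special case the abstract states, an undirected graphical model as a tensor network, is easy to exhibit: the pairwise factors of a chain MRF are nonnegative tensors, and contracting the whole network equals summing out all variables, i.e. computing the partition function. A minimal sketch with assumed random binary factors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pairwise factors of a 3-variable chain MRF over binary states:
# the UGM is exactly a tensor network of nonnegative tensors.
phi_ab = rng.uniform(0.5, 2.0, (2, 2))
phi_bc = rng.uniform(0.5, 2.0, (2, 2))

# Full contraction of the network = the partition function Z.
Z_tn = np.einsum('ab,bc->', phi_ab, phi_bc)

# Brute-force check: sum the unnormalized density over all joint states.
Z_bf = sum(phi_ab[a, b] * phi_bc[b, c]
           for a in range(2) for b in range(2) for c in range(2))
```

Born machines generalize this picture by allowing complex-valued tensors whose squared contraction gives the probability, which is where TNs go beyond UGMs.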
