The observer-assisted method for adjusting hyper-parameters in deep learning algorithms

Published on Nov 30, 2016 in arXiv: Learning
Maciej Wielgosz
Abstract
This paper presents the concept of a novel method for adjusting hyper-parameters in Deep Learning (DL) algorithms. An external agent, the observer, monitors the performance of a selected DL algorithm and learns to model it through a series of random experiments. Consequently, the observer can predict the response of the DL algorithm, in terms of a selected quality measurement, to a given set of hyper-parameters. This makes it possible to construct an ensemble of such evaluators, which together form the observer-assisted architecture. The architecture is used to iterate gradually towards the best achievable quality score in tiny steps governed by a unit of progress. The algorithm stops when the maximum number of steps is reached or no further progress is made.
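
The abstract describes the procedure only in outline, so the following minimal Python sketch illustrates one possible reading of it: random experiments train a surrogate observer, an ensemble of such evaluators scores candidate hyper-parameters, and the search advances in small steps until progress falls below a threshold. Every identifier here (quality, random_hparams, observer_assisted_search, unit_of_progress) and the choice of a random-forest surrogate are assumptions made for illustration, not details taken from the paper.

    # Hypothetical sketch of the observer-assisted loop outlined in the abstract.
    # The surrogate model, the search strategy, and all names are assumptions.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def quality(hparams):
        """Stand-in for training the DL algorithm with `hparams` and returning
        a quality measurement (e.g. validation accuracy)."""
        lr, batch_size = hparams
        return -((np.log10(lr) + 2.5) ** 2) - 0.001 * (batch_size - 64) ** 2

    def random_hparams(rng):
        """Draw one random hyper-parameter set for the initial experiments."""
        return np.array([10 ** rng.uniform(-5, -1), rng.integers(8, 257)])

    def observer_assisted_search(n_random=30, n_evaluators=5, max_steps=50,
                                 unit_of_progress=1e-3, seed=0):
        rng = np.random.default_rng(seed)

        # 1. Series of random experiments: the observer watches the DL algorithm.
        X = np.array([random_hparams(rng) for _ in range(n_random)])
        y = np.array([quality(h) for h in X])

        # 2. Ensemble of evaluators, each learning to model the DL algorithm.
        evaluators = [RandomForestRegressor(n_estimators=50, random_state=i).fit(X, y)
                      for i in range(n_evaluators)]

        best_h, best_q = X[np.argmax(y)], y.max()

        # 3. Iterate towards the best achievable score in tiny steps.
        for _ in range(max_steps):
            # Propose candidates near the current best and score them with the
            # ensemble instead of running the expensive DL algorithm each time.
            candidates = best_h * (1 + 0.05 * rng.standard_normal((20, best_h.size)))
            predicted = np.mean([e.predict(candidates) for e in evaluators], axis=0)
            proposal = candidates[np.argmax(predicted)]

            q = quality(proposal)                      # verify on the real algorithm
            if q - best_q < unit_of_progress:          # no further progress made
                break
            best_h, best_q = proposal, q
            X, y = np.vstack([X, proposal]), np.append(y, q)
            evaluators = [e.fit(X, y) for e in evaluators]

        return best_h, best_q

Calling observer_assisted_search() would return the best hyper-parameter vector found and its predicted-then-verified quality score; in a real setting quality() would wrap an actual training run, which is exactly the expensive step the ensemble of evaluators is meant to spare.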