Scholarly Works (4 results)

Thesis
Peer Reviewed

Learning real-time object detectors : probabilistic generative approaches

Fasel, Ian Robert

UC San Diego Electronic Theses and Dissertations (2006)

This dissertation is a computational investigation of the task of locating and recognizing objects in unconstrained images in real-time, and learning to do so with minimal supervision. We take a probabilistic generative modeling approach, which involves formulating analytical models of several real-world vision problems, studying how optimal inference would proceed under such models, developing techniques for learning parameters under these models, and evaluating the performance of the optimal inference algorithms in realistic data. We begin by developing a novel generative model of images under which an image is a collection of sets of pixels which are generated by different object categories. This provides a novel definition of "object" as a set of pixels that are co-dependent, but conditionally independent of the other sets of pixels in the image. We then develop an algorithm for optimal inference (i.e., detection of objects) and maximum likelihood learning when the segmentation of training images is known. We point out a computational tradeoff between robustness of object detection and precision of localization, and propose context-dependent detectors as a way to solve the problem. These techniques are used to develop a state-of-the-art, real-time head, eye, and blink detector. We predict that similar context-dependent detectors may be found in the brain. We develop an algorithm for optimal inference and maximum likelihood learning when the segmentation of training images is unknown. We test this on image datasets labeled with the identity but not the location of objects, and achieve state-of-the-art performance in discovery of object categories. We then test the algorithm in a fully unsupervised context, in which a real-time person detector is learned from just a few minutes of visual information self-labeled through multi-modal contingency detection. This suggests that early face (and other) preferences in human infants may be evidence for rapid statistical learning rather than innate biases. We develop software for learning robust, real-time object detectors from both labeled and unlabeled examples, including a real-time head, eye, and blink detector available to the public.

Article
Peer Reviewed

The emergence of shared attention: Using robots to test developmental theories

UC San Diego Previously Published Works (2001)

The capacity for shared attention is a cornerstone of human social intelligence. Recent accounts attribute the emergence of shared attention to multiple cognitive mechanisms. Current behavioral data support an alternative dynamic systems model, but many questions remain. To answer these questions and test alternative theories, robotic models will play a critical role. Robotic models reduce the scope of the modeling task, permit comparison of empirically supported theories, and encourage parsimonious models of complex behaviors. Current efforts to model the emergence of shared attention are described.

Article
Peer Reviewed

Deep active object recognition by joint label and action prediction

UC San Diego Previously Published Works (2017)

An active object recognition system has the advantage of acting in the environment to capture images that are more suited for training and lead to better performance at test time. In this paper, we utilize deep convolutional neural networks for active object recognition by simultaneously predicting the object label and the next action to be performed on the object with the aim of improving recognition performance. We treat active object recognition as a reinforcement learning problem and derive the cost function to train the network for joint prediction of the object label and the action. A generative model of object similarities based on the Dirichlet distribution is proposed and embedded in the network for encoding the state of the system. The training is carried out by simultaneously minimizing the label and action prediction errors using gradient descent. We empirically show that the proposed network is able to predict both the object label and the actions on GERMS, a dataset for active object recognition. We compare the test label prediction accuracy of the proposed model with Dirichlet and Naive Bayes state encoding. The results of experiments suggest that the proposed model equipped with Dirichlet state encoding is superior in performance, and selects images that lead to better training and higher accuracy of label prediction at test time.

Article
Peer Reviewed

Coordinating Touch and Vision to Learn What Objects Look Like

Proceedings of the Annual Meeting of the Cognitive Science Society, Volume 33 (2011)