Event News

Talk by Tomas Pajdla and Ming-Hsuan Yang


November 28 (Monday), 2016
National Institute of Informatics
20F, Room 2005
15:30 - 16:30
Recent Results on Image Editing and Learning Filters
Ming-Hsuan Yang (UC Merced, USA)
In the first part of this talk, I will present recent results on semantic-aware image editing. Skies are common backgrounds in photos but are often less interesting due to the time of photographing. Professional photographers correct this by using sophisticated tools with painstaking efforts that are beyond the command of ordinary users. In this work, we propose an automatic background replacement algorithm that can generate realistic, artifact-free images with diverse styles of skies. The key idea of our algorithm is to utilize visual semantics to guide the entire process, including sky segmentation, search and replacement. First, we train a deep convolutional neural network for semantic scene parsing, which is used as a visual prior to segment sky regions in a coarse-to-fine manner. Second, in order to find proper skies for replacement, we propose a data-driven sky search scheme based on the semantic layout of the input image. Finally, to re-compose the stylized sky with the original foreground naturally, an appearance transfer method is developed to match statistics locally and semantically. We show that the proposed algorithm can automatically generate a set of visually pleasing results. In addition, we demonstrate the effectiveness of the proposed algorithm with extensive user studies.

In the second part, I will present recent results on learning image filters for low-level vision. We formulate numerous low-level vision problems (e.g., edge-preserving filtering and denoising) as recursive image filtering via a hybrid neural network. The network contains several spatially variant recurrent neural networks (RNNs) as equivalents of a group of distinct recursive filters for each pixel, and a deep convolutional neural network (CNN) that learns the weights of the RNNs. The deep CNN can learn regulations of recurrent propagation for various tasks and effectively guides recurrent propagation over an entire image. The proposed model does not need a large number of convolutional channels or big kernels to learn features for low-level vision filters. It is much smaller and faster than a deep CNN-based image filter. Experimental results show that many low-level vision tasks can be effectively learned and carried out in real time by the proposed algorithm.

Bio: Ming-Hsuan Yang is an associate professor in Electrical Engineering and Computer Science at University of California, Merced. He received the PhD degree in Computer Science from the University of Illinois at Urbana-Champaign in 2000. He serves as an area chair for several conferences including IEEE Conference on Computer Vision and Pattern Recognition, IEEE International Conference on Computer Vision, European Conference on Computer Vision, Asian Conference on Computer Vision, AAAI National Conference on Artificial Intelligence, and IEEE International Conference on Automatic Face and Gesture Recognition. He serves as a program co-chair for IEEE International Conference on Computer Vision in 2019 as well as Asian Conference on Computer Vision in 2014, and general co-chair for Asian Conference on Computer Vision in 2016. He serves as an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence (2007 to 2011), International Journal of Computer Vision, Computer Vision and Image Understanding, Image and Vision Computing, and Journal of Artificial Intelligence Research. Yang received the Google faculty award in 2009, the Distinguished Early Career Research award from the UC Merced senate in 2011, the Faculty Early Career Development (CAREER) award from the National Science Foundation in 2012, and the Distinguished Research Award from UC Merced Senate in 2015.

Degeneracies in Rolling Shutter SfM
Tomas Pajdla (Czech Technical University in Prague)
We present the problem of degeneracies in Structure from Motion (SfM) with rolling shutter cameras. We first show that many common camera configurations, e.g., cameras with parallel readout directions, become critical and allow for a large class of ambiguities in multi-view reconstruction. Then, we provide a mathematical analysis of some multi-view cases, together with related synthetic experiments, and show that bundle adjustment with rolling shutter cameras that are close to critical configurations may still produce drastically deformed reconstructions. Finally, we provide practical recipes for photographing with rolling shutter cameras so as to avoid scene deformations in SfM.