About this Event
1300 University Blvd, Birmingham, AL 35233
Title: Person Re-identification by Deep Structured Prediction:
A Generative Approach
Abstract:
Visual appearance-based person re-identification (re-ID) is the task of assigning the same identifier to all instances of a particular individual captured in a series of images or videos, even after significant gaps in time or space. Major applications include person association for long-term tracking in large video surveillance networks, re-identification of persons-of-interest with unmanned aerial vehicles (UAVs) from different on-board camera views, and person-of-interest retrieval in multimedia forensics databases. Due to changes in camera view, illumination, body pose, and occlusion, among others, it is extremely challenging to separate false positives from the real person-of-interest while accommodating appearance variations of the same person-of-interest. The current state-of-the-art methods for person re-ID can be categorized into two main approaches: given a set of gallery images with known IDs, the task is to infer either the ID label of each probe image individually (person re-ID via image retrieval) or the collective ID labeling of all probe images in a probe set simultaneously (person re-ID via a highly crafted re-ID structure in which each probe image is a node and similar probe images form connections).
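The first of the two approaches above, re-ID via image retrieval, can be illustrated with a minimal sketch. It assumes each image has already been reduced to a feature vector and that cosine similarity is the matching score; the toy vectors, the IDs `id_A` and `id_B`, and the function names are hypothetical and are not taken from the dissertation.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_id(probe, gallery):
    """Return the (id, score) of the gallery entry most similar to the probe.

    gallery is a list of (person_id, feature_vector) pairs with known IDs.
    """
    best_id, best_score = None, -2.0  # cosine similarity is always >= -1
    for person_id, feature in gallery:
        score = cosine_similarity(probe, feature)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id, best_score

# Toy data (assumed): two gallery identities and two probe images.
gallery = [("id_A", [1.0, 0.0, 0.2]), ("id_B", [0.0, 1.0, 0.1])]
probes = [[0.9, 0.1, 0.3], [0.1, 0.8, 0.0]]

# Each probe is labeled on its own, with no interaction between probes.
labels = [retrieve_id(p, gallery)[0] for p in probes]
print(labels)
```

Because every probe is labeled independently here, nothing ties the labels of similar probe images together, which is the gap that the collective-labeling approach aims to close.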
This dissertation primarily explores the following question: without handcrafting a predefined re-ID structure a priori, is it possible to learn this re-ID structure among probe images automatically and thereby infer their ID labels collectively? Initial attempts at learning this re-ID structure in feature space were coupled with another open issue, namely learning the similarity metric used to create the topology underlying the re-ID structure. To decouple these open issues, this dissertation formulates person re-ID, for the first time, as an energy-based structured prediction problem, which still manipulates the feature embeddings of all the nodes but constructs the re-ID structure in the output label space. In contrast to typical structured prediction problems, which assume a predefined structure (e.g., a 2-D grid structure in image segmentation or a 1-D chain structure in sequence labeling), this dissertation takes a generative approach to approximating the unknown re-ID structure by generating ‘snapshot’ structure samples. The baseline formulation is as follows: to infer the unknown IDs of all probe images collectively, allowable uncertainty is introduced into the feature embeddings and the associated intermediate labelings of all probe images. The resulting pairs of manipulated ‘snapshot’ structures and their intermediate labelings are structure samples, which are then fed into a structured prediction model that reasons about the commonality among them, thereby approximating the unknown true re-ID structure that better captures the label interactions among all probe images.
With this baseline formulation, this dissertation instantiates two families of structure sampling and learning paradigms. The first generates structure samples by activating a randomized dropout layer in testing mode, while structured prediction uses a general-graph-based Conditional Random Field (CRF) without predefining the underlying structure. The second generates structure samples by neural-transferring the bias of known gallery images, while structured prediction models possible higher-arity interactions among probe images using fully parameterized structured prediction energy networks (SPENs). As of the end of 2018, the results of the latter approach outperformed the competing methods on the benchmark datasets (i.e., Market-1501 and DukeMTMC-reID). However, this work has its limitations. Future work includes exploring the re-ID structure in feature space instead of the output label space, developing approaches to handle a dynamically changing output label space, customizing the generative approach with a memory-efficient implementation in a multi-GPU environment, and generalizing to other application domains that align well with the structure-generative approach.
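The dropout-based sampling idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the dissertation's implementation: dropout is applied directly to toy feature vectors rather than inside a network, intermediate labels come from a nearest-gallery rule, and a pairwise co-labeling frequency stands in for the CRF, which is not reproduced here. All names and data are hypothetical.

```python
import random

def dropout(vec, rate, rng):
    """Inverted dropout: zero each entry with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vec]

def nearest_id(feature, gallery):
    """Intermediate label: ID of the closest gallery vector (squared Euclidean)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(gallery, key=lambda item: sqdist(feature, item[1]))[0]

def sample_structures(probes, gallery, n_samples=20, rate=0.3, seed=0):
    """Generate (snapshot embeddings, intermediate labeling) pairs with dropout kept on."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        snapshot = [dropout(p, rate, rng) for p in probes]
        labeling = [nearest_id(f, gallery) for f in snapshot]
        samples.append((snapshot, labeling))
    return samples

def co_label_frequency(samples, n_probes):
    """Fraction of samples in which probes i and j receive the same label.

    A crude stand-in (assumption) for the structured prediction step.
    """
    counts = [[0] * n_probes for _ in range(n_probes)]
    for _, labeling in samples:
        for i in range(n_probes):
            for j in range(n_probes):
                if labeling[i] == labeling[j]:
                    counts[i][j] += 1
    return [[c / len(samples) for c in row] for row in counts]

# Toy data (assumed): two gallery identities, three probe images.
gallery = [("id_A", [1.0, 0.0, 0.2, 0.1]), ("id_B", [0.0, 1.0, 0.1, 0.3])]
probes = [[0.9, 0.1, 0.3, 0.1], [0.8, 0.2, 0.1, 0.0], [0.1, 0.9, 0.0, 0.4]]

samples = sample_structures(probes, gallery)
freq = co_label_frequency(samples, len(probes))
```

Pairs of probes that agree across many perturbed snapshots are candidates for an edge in the re-ID structure; in the approach described above, that reasoning is carried out by the structured prediction model rather than by a simple frequency count.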