Person re-identification (in short, re-id) consists in recognizing the same individual in different locations and at different times over several non-overlapping camera views. The re-identification task is fundamental for a range of surveillance applications, especially when dealing with large and structured environments such as museums, shopping malls, airports, etc. The task is challenging because the recognition must be robust to changes in perspective view, human pose, lighting variability, and occlusions.
This topic mainly focuses on exploring new descriptors and methods to deal with the re-identification task. As cameras often do not provide sufficient resolution to work with facial or iris recognition, classical solutions normally rely on appearance information, i.e., clothing and accessories. These appearance-based methods rely primarily on designing and building the person signature by extracting features from the whole region or from specific parts of the human body. Further, learning techniques can also be utilised to increase re-id accuracy.
We address this task broadly, by developing purely appearance-based methods (e.g., SDALF) as well as metric and transfer learning approaches, used respectively to model the entire re-id process or to estimate the brightness transfer function (BTF) among cameras.
Moreover, thanks to novel camera technologies such as RGB-D cameras (Microsoft Kinect® or Asus Xtion Pro®), which acquire depth together with RGB data, re-id can also be approached using 3D soft biometrics and other geometrical information that can be extracted from this type of data.
Our main works on this topic are listed below.
Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks
Most re-id approaches have neglected the dynamic and open-world nature of the re-identification problem, where a new camera may be temporarily inserted into an existing system to gather additional information. To address this novel and very practical problem, we propose an unsupervised adaptation scheme for re-identification models in a dynamic camera network. First, we formulate a domain-perceptive re-identification method based on the geodesic flow kernel that can effectively find the best source camera (already installed) to adapt to a newly introduced target camera, without requiring a very expensive training phase. Second, we introduce a transitive inference algorithm for re-identification that can exploit the information from the best source camera to improve accuracy across other camera pairs in a network of multiple cameras. Extensive experiments on four benchmark datasets demonstrate that the proposed model significantly outperforms state-of-the-art unsupervised learning based alternatives whilst being extremely efficient to compute.
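As a rough illustration of the selection step only, the sketch below ranks the already-installed source cameras by how well their feature subspaces align with the newly introduced target camera. The function names and data layout are illustrative assumptions, and a plain subspace-alignment score stands in for the full geodesic flow kernel machinery.

```python
# Minimal sketch (not the published method): pick the "best" source camera
# for a newly added target camera by comparing PCA subspaces of their features.
import numpy as np

def pca_basis(X, d=20):
    """Top-d principal directions (columns) of feature matrix X (n x D)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T                      # D x d orthonormal basis

def subspace_alignment(Ps, Pt):
    """Sum of cosines of principal angles between two subspaces (higher = closer)."""
    s = np.linalg.svd(Ps.T @ Pt, compute_uv=False)
    return float(np.sum(s))

def best_source_camera(source_feats, target_feats, d=20):
    """source_feats: dict {camera_id: (n_i x D) array}; target_feats: (m x D) array."""
    Pt = pca_basis(target_feats, d)
    scores = {cam: subspace_alignment(pca_basis(X, d), Pt)
              for cam, X in source_feats.items()}
    return max(scores, key=scores.get), scores
```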
Distance Penalization for Person Re-identification
We take advantage of the many feature descriptors available in the literature to devise a fusion approach: the distances of the probe from all the gallery images, computed with each descriptor, are re-ranked on the basis of their confidence and used to build a dictionary, followed by a sparse coding step that yields the final ranking. More specifically, the processing pipeline is composed of two stages. First, a metric learning paradigm is applied to a set of distinct feature extractors to produce an ensemble of estimated distance measures, which are then penalized according to their confidence in estimating the correct matches and averaged to draw a final decision. Second, the gallery persons closest according to the fused distance measures are selected and used to span a dictionary with which the probe image queried to the system is reconstructed. Evaluated on benchmark datasets, the proposed framework advances the state of the art by notable margins. In particular, Rank-1 gains of about 12%, 1%, 6%, and 12% were scored on VIPeR, CAVIAR4REID, iLIDS, and 3DPeS, respectively.
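The following sketch illustrates the two stages under simplifying assumptions of ours (min-max normalised distance matrices, a rank-1/rank-2 margin as the confidence measure, and scikit-learn's Lasso as the sparse coder); it is not the published implementation.

```python
# Hedged sketch of confidence-weighted distance fusion followed by sparse re-ranking.
import numpy as np
from sklearn.linear_model import Lasso

def fuse_distances(dist_list):
    """dist_list: list of (n_probe x n_gallery) distance matrices, one per descriptor.
    Each matrix is min-max normalised, weighted by a simple per-descriptor
    confidence, and the weighted matrices are averaged."""
    fused, weights = [], []
    for D in dist_list:
        D = (D - D.min()) / (D.max() - D.min() + 1e-12)
        srt = np.sort(D, axis=1)
        conf = np.mean(srt[:, 1] - srt[:, 0])    # rank-1 vs rank-2 margin
        fused.append(conf * D)
        weights.append(conf)
    return sum(fused) / (sum(weights) + 1e-12)

def sparse_rerank(probe, gallery, fused_row, k=10, alpha=0.01):
    """Re-rank the k gallery entries closest to one probe by sparse reconstruction.
    probe: (D,) vector; gallery: (n_gallery x D); fused_row: fused distances for this probe."""
    idx = np.argsort(fused_row)[:k]              # candidate dictionary atoms
    A = gallery[idx].T                           # D x k dictionary
    coef = Lasso(alpha=alpha, positive=True, max_iter=5000).fit(A, probe).coef_
    return idx[np.argsort(-coef)]                # larger coefficient -> better match
```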
Person re-identification using sparse representation with manifold constraints
Nowadays, surveillance cameras with high frame rates are capable of capturing several consecutive frames of each person. Images in multi-shot scenarios provide richer information about the target person compared to single-shot conditions. However, they also introduce a high degree of information redundancy, which may degrade the performance of re-id systems. In this paper, we propose a novel framework that combines sparse coding and manifold constraints to extract discriminative information from multi-shot images of one pedestrian for person re-id across a set of non-overlapping surveillance cameras. The evaluation on two standard multi-shot datasets shows very competitive accuracy of our framework against the state of the art.
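A minimal sketch of the idea, under our own assumptions (an unnormalised kNN Laplacian over the shots and a plain ISTA-style solver), is given below: each shot of a probe is coded against a gallery dictionary while a graph Laplacian over the shots keeps the codes of similar frames close to each other.

```python
# Illustrative graph-regularised sparse coding for multi-shot re-id (not the authors' solver).
import numpy as np

def knn_laplacian(X, k=3):
    """Unnormalised graph Laplacian of a kNN graph over the columns of X (d x n)."""
    n = X.shape[1]
    d2 = np.sum((X[:, :, None] - X[:, None, :]) ** 2, axis=0)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(d2[i])[1:k + 1]:
            W[i, j] = W[j, i] = np.exp(-d2[i, j] / (d2.mean() + 1e-12))
    return np.diag(W.sum(1)) - W

def manifold_sparse_codes(X, D, lam=0.1, gamma=0.5, iters=200):
    """ISTA-style iterations for min ||X - D Z||^2 + lam ||Z||_1 + gamma tr(Z L Z^T).
    X: d x n shots of one probe; D: d x k gallery dictionary."""
    L = knn_laplacian(X)
    Z = np.zeros((D.shape[1], X.shape[1]))
    eta = 1.0 / (2 * (np.linalg.norm(D, 2) ** 2 + gamma * np.linalg.norm(L, 2)) + 1e-12)
    for _ in range(iters):
        grad = 2 * D.T @ (D @ Z - X) + 2 * gamma * Z @ L
        Z = np.sign(Z - eta * grad) * np.maximum(np.abs(Z - eta * grad) - eta * lam, 0.0)
    return Z   # k x n sparse codes, one column per shot
```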
Exploiting multiple detections to learn robust brightness transfer functions in re-identification systems
One of the most relevant problems in re-identification systems is that the appearance of the same individual varies across cameras due to illumination and viewpoint changes. This paper proposes the use of Cumulative Weighted Brightness Transfer Functions to model these appearance variations. It is a multiple-frame-based learning approach, which leverages consecutive detections of each individual to transfer the appearance, rather than learning the brightness transfer function from pairs of images. We tested our approach on standard multi-camera surveillance datasets, showing consistent and significant improvements over existing methods on three different datasets without any additional cost. Our approach is general and can be applied to any subsequent appearance-based method.
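The sketch below illustrates how a cumulative brightness transfer function between two cameras can be estimated from multiple detections; the per-person weighting used here (the number of available detections) is an assumption for illustration, not necessarily the paper's weighting scheme.

```python
# Hedged sketch of a cumulative, detection-weighted brightness transfer function.
import numpy as np

def cumulative_hist(images, bins=256):
    """Normalised cumulative brightness histogram over a list of grayscale uint8 images."""
    h = np.zeros(bins)
    for img in images:
        h += np.bincount(img.ravel(), minlength=bins)[:bins]
    c = np.cumsum(h)
    return c / (c[-1] + 1e-12)

def weighted_cbtf(detections_a, detections_b, bins=256):
    """detections_a/b: list (one entry per person) of lists of detections in camera A/B.
    Returns a lookup table f such that f[b] maps brightness b in A to camera B."""
    Ha = np.zeros(bins)
    Hb = np.zeros(bins)
    for imgs_a, imgs_b in zip(detections_a, detections_b):
        w = min(len(imgs_a), len(imgs_b))          # more detections -> higher weight
        Ha += w * cumulative_hist(imgs_a, bins)
        Hb += w * cumulative_hist(imgs_b, bins)
    Ha /= (Ha[-1] + 1e-12)
    Hb /= (Hb[-1] + 1e-12)
    # invert Hb: for every cumulative level of A find the matching brightness in B
    return np.searchsorted(Hb, Ha).clip(0, bins - 1).astype(np.uint8)
```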
Person re-identification by discriminatively selecting parts and features
This paper presents a novel appearance-based method for person re-identification. The core idea is to rank and select different body parts on the basis of the discriminating power of their characteristic features. In our approach, we first segment the pedestrian images into meaningful parts, then we extract features from such parts as well as from the whole body, and finally we perform a salience analysis based on regression coefficients. Given a set of individuals, our method is able to estimate the different importance (or salience) of each body part automatically. To prove the effectiveness of our approach, we considered two standard datasets and demonstrated, through an exhaustive experimental phase, how our method improves significantly upon existing approaches, especially in multi-shot scenarios.
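As an illustration of a regression-based salience analysis, the sketch below fits an L1-regularised logistic regression on per-part matching distances of labelled image pairs and reads the coefficient magnitudes as part importances; the specific regressor and inputs are our assumptions, not the published formulation.

```python
# Illustrative sketch: rank body parts by the discriminative power of their features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def part_salience(part_distances, same_identity):
    """part_distances: (n_pairs x n_parts) matching distances, one column per body part.
    same_identity: (n_pairs,) binary labels (1 = same person). Returns salience weights."""
    clf = LogisticRegression(penalty='l1', solver='liblinear', C=1.0)
    clf.fit(part_distances, same_identity)
    w = np.abs(clf.coef_.ravel())
    return w / (w.sum() + 1e-12)        # normalised importance of each part

# usage: weights = part_salience(D_parts, y); fused_distance = D_parts @ weights
```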
Semi-supervised Multi-feature Learning for Person Re-identification
Learning approaches for re-id are usually based on simple features and are trained on camera pairs to discriminate between individuals. In this paper, we present a method that joins these two ideas: given an arbitrary state-of-the-art set of features, no matter their number, dimensionality, or descriptor, the proposed multi-class learning approach learns how to fuse them, ensuring that the features agree on the classification result. The approach consists of a semi-supervised multi-feature learning strategy that requires as little as a single image per person as training data. To validate our approach, we present results on different datasets, using several heterogeneous features, setting a higher level of performance in the person re-identification problem, even in very poor settings.
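A minimal sketch of a consensus-style fusion is shown below; measuring "agreement" as the rank correlation between each feature's ranking and the fused ranking is our own assumption, not the paper's formulation.

```python
# Illustrative consensus fusion of heterogeneous features via iterative agreement weighting.
import numpy as np

def rank_corr(a, b):
    """Spearman-like correlation between two score vectors."""
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    ra, rb = ra - ra.mean(), rb - rb.mean()
    return float((ra * rb).sum() / (np.linalg.norm(ra) * np.linalg.norm(rb) + 1e-12))

def consensus_fuse(dist_list, iters=10):
    """dist_list: list of (n_probe x n_gallery) distance matrices from different features."""
    dist_list = [(D - D.mean()) / (D.std() + 1e-12) for D in dist_list]
    w = np.ones(len(dist_list)) / len(dist_list)
    for _ in range(iters):
        fused = sum(wi * D for wi, D in zip(w, dist_list))
        agree = np.array([np.mean([rank_corr(D[i], fused[i]) for i in range(D.shape[0])])
                          for D in dist_list])
        w = np.maximum(agree, 0)
        w /= (w.sum() + 1e-12)
    return fused, w
```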
Person Re-identification with a PTZ Camera: an introductory study
We present an introductory study that paves the way for a new kind of person re-identification, exploiting a single Pan-Tilt-Zoom (PTZ) camera. PTZ devices allow zooming in on body regions, acquiring discriminative visual patterns that enrich the appearance description of an individual. This intuition has been translated into a statistical direct re-identification scheme, which collects two images for each probe subject: the first image captures the whole body of the probe individual; the second can be a zoomed body part (head, torso, or legs) or another whole-body image, and is the outcome of an action-selection mechanism driven by feature selection principles. The validation of this technique is also explored: in order to allow repeatability, two novel multi-resolution benchmarks have been created. On these data, we demonstrate that our approach selects effective actions, focusing on the body part that best discriminates each subject. Moreover, we show that the proposed compound of two images outperforms standard multi-shot descriptions composed of many more pictures.
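The sketch below illustrates the action-selection idea with a simple, assumed discriminability criterion (how much the best gallery candidate stands out under each part's features); it is not the statistical scheme of the paper.

```python
# Illustrative action selection: decide which body part to zoom on next.
import numpy as np

ACTIONS = ["whole_body", "head", "torso", "legs"]

def select_action(part_dists):
    """part_dists: dict {action: (n_gallery,) distances of the probe's current
    description to the gallery, computed with features of that body part only}."""
    def discriminability(d):
        d = np.sort(d)
        # how much the best candidate stands out from the rest of the gallery
        return (d[1:].mean() - d[0]) / (d.std() + 1e-12)
    return max(ACTIONS, key=lambda a: discriminability(part_dists[a]))
```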
Re-identification with RGB-D Sensors
Person re-identification is mostly addressed by exploiting appearance cues coming from 2D images, under the hypothesis that individuals do not change their clothes. In this paper, we relax this constraint by presenting and exploiting a set of 3D soft-biometric cues that are invariant to appearance variations and can be gathered using RGB-D technology. The joint use of these characteristics provides encouraging performances on a benchmark of 79 people captured on different days and with different clothing. This promotes a novel research direction for re-identification, supported also by the fact that a new breed of affordable RGB-D cameras has recently entered the worldwide market.
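A minimal sketch of skeleton-based soft biometrics is given below; the joint names and the particular set of measurements follow a generic Kinect-style skeleton and are illustrative, not the exact cue set of the paper.

```python
# Illustrative 3D soft biometrics from skeleton joints and nearest-neighbour matching.
import numpy as np

def limb(joints, a, b):
    return float(np.linalg.norm(np.asarray(joints[a]) - np.asarray(joints[b])))

def soft_biometrics(joints):
    """joints: dict {joint_name: (x, y, z)} in metres for one skeleton frame."""
    height = limb(joints, "head", "foot_left")
    arm    = limb(joints, "shoulder_left", "hand_left")
    leg    = limb(joints, "hip_left", "foot_left")
    torso  = limb(joints, "shoulder_center", "hip_center")
    return np.array([height, arm, leg, torso, torso / (leg + 1e-12)])

def match(probe_joints, gallery_joints_list):
    """Return gallery indices sorted by Euclidean distance in the biometric space."""
    p = soft_biometrics(probe_joints)
    d = [np.linalg.norm(p - soft_biometrics(g)) for g in gallery_joints_list]
    return np.argsort(d)
```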
Custom Pictorial Structures for Re-identification
We propose a novel methodology for re-identification based on Pictorial Structures (PS). Whenever face or other biometric information is missing, humans recognize an individual by selectively focusing on body parts and looking for part-to-part correspondences. We take inspiration from this strategy in a re-identification context, using PS to achieve this objective. For single-image re-identification, we adopt PS to localize the parts, and to extract and match their descriptors. When multiple images of a single individual are available, we propose a new algorithm to customize the fit of PS on that specific person, leading to what we call a Custom Pictorial Structure (CPS). CPS learns the appearance of an individual, improving the localization of its parts and thus obtaining more reliable visual characteristics for re-identification. It is based on the statistical learning of pixel attributes collected through spatio-temporal reasoning. The use of PS and CPS leads to state-of-the-art results on all the available public benchmarks and opens a fresh new direction for research on re-identification.
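The sketch below covers only the matching side, with per-part HSV histograms accumulated over the frames of one person and compared by a weighted sum of Bhattacharyya distances; part localisation and the iterative re-fitting of the pictorial structure are assumed to be given and are not reproduced here.

```python
# Hedged sketch: part-based appearance model accumulated over frames and matched.
import numpy as np

def part_histogram(hsv_patch, bins=(8, 8, 4)):
    """Normalised 3D HSV histogram of a part crop (H x W x 3, uint8)."""
    h, _ = np.histogramdd(hsv_patch.reshape(-1, 3).astype(float),
                          bins=bins, range=((0, 256), (0, 256), (0, 256)))
    return (h / (h.sum() + 1e-12)).ravel()

def person_model(frames_parts):
    """frames_parts: list over frames of dict {part_name: hsv_patch}."""
    model = {}
    for frame in frames_parts:
        for part, patch in frame.items():
            model.setdefault(part, []).append(part_histogram(patch))
    return {part: np.mean(hists, axis=0) for part, hists in model.items()}

def bhattacharyya(p, q):
    return float(-np.log(np.sum(np.sqrt(p * q)) + 1e-12))

def distance(model_a, model_b, part_weights=None):
    parts = sorted(set(model_a) & set(model_b))
    w = part_weights or {p: 1.0 for p in parts}
    return sum(w[p] * bhattacharyya(model_a[p], model_b[p]) for p in parts)
```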
Person Re-Identification by Symmetry-Driven Accumulation of Local Features
We present an appearance-based method for person re-identification. It consists in the extraction of features that model three complementary aspects of the human appearance: the overall chromatic content, the spatial arrangement of colors into stable regions, and the presence of recurrent local motifs with high entropy. All this information is derived from different body parts and weighted appropriately by exploiting symmetry and asymmetry perceptual principles. In this way, robustness against very low resolution, occlusions, and pose, viewpoint, and illumination changes is achieved. The approach applies to situations where the number of candidates varies continuously, considering single images or bunches of frames for each individual. It has been tested on several public benchmark datasets (VIPeR, iLIDS, ETHZ), achieving new state-of-the-art performance.
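A much simplified illustration of the symmetry-driven weighting is sketched below: the vertical symmetry axis of a body region is placed where the colour content of the two halves is most similar, and pixels are then weighted by their distance from that axis. The search range and Gaussian width are assumptions, not the published parameters.

```python
# Illustrative symmetry-axis estimation and pixel weighting for one body region.
import numpy as np

def col_hist(img, bins=16):
    """Per-channel colour histogram of an image region, concatenated and normalised."""
    h = np.concatenate([np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
                        for c in range(img.shape[2])]).astype(float)
    return h / (h.sum() + 1e-12)

def symmetry_axis(region):
    """Column index that minimises the chromatic dissimilarity of the two halves."""
    W = region.shape[1]
    cands = range(W // 4, 3 * W // 4)
    return min(cands, key=lambda x: np.abs(col_hist(region[:, :x]) -
                                           col_hist(region[:, x:])).sum())

def symmetry_weights(region, sigma_ratio=0.25):
    """Per-pixel weights that emphasise pixels close to the symmetry axis."""
    x0 = symmetry_axis(region)
    xs = np.arange(region.shape[1])
    w = np.exp(-((xs - x0) ** 2) / (2 * (sigma_ratio * region.shape[1]) ** 2))
    return np.tile(w, (region.shape[0], 1))
```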
ADDITIONAL REFERENCES:
- B. Mirmahboub, M. L. Mekhalfi, V. Murino, "Distance penalization and fusion for person re-identification", IEEE Winter Conference on Applications of Computer Vision (WACV), 2017
- A. Bhuiyan, B. Mirmahboub, A. Perina, V. Murino, "Person re-identification using robust brightness transfer functions based on multiple detections", 18th International Conference on Image Analysis and Processing, Genova, Italy, 7-11 September 2015
- L. Bazzani, M. Cristani, A. Perina, V. Murino, "Multiple-shot person re-identification by chromatic and epitomic analyses", Pattern Recognition Letters, 33(7):898-903, May 2012