Adaptive Skew-Sensitive Ensembles for Video-to-Video Face Recognition


Video-based face recognition (FR) is increasingly employed to assist operators of intelligent video surveillance (VS) systems in industry and the public sector, due in large part to low-cost camera technologies and to advances in biometrics, pattern recognition and computer vision. Decision support systems are employed in crowded scenes (airports, shopping centers, stadiums, etc.), where a human operator monitors live or archived videos to analyze a scene (Hampapur et al., 2005). VS systems perform a growing number of functions, ranging from real-time recognition and video footage analysis to fusion of video data from different sources (Gouaillier, 2009). FR in VS (FRiVS) can be employed in a range of still-to-video (e.g., watchlist screening) and video-to-video (e.g., face re-identification) applications. In still-to-video FR, a gallery of still images is employed in the construction of facial models, whereas in video-to-video FR facial models are designed from video streams.
Of special interest in this Thesis is the automatic detection of a target individual of interest enrolled to a video-to-video FR system. In this human-centric scenario, live or archived videos are analyzed, and the operator receives an alarm if the system detects the presence of a target individual enrolled to it. Because of the large number of non-target individuals appearing in crowded scenes, avoiding false alarms while maintaining a high detection rate is challenging for such a system. The design of an FR system for real-world applications raises many challenges.

Objective and contributions

In this Thesis, a new framework for adaptive multiple classifier systems (MCSs) is proposed for partially-supervised learning of facial models over time based on facial trajectories. This framework is designed to implement systems for video-to-video FR, as needed for face re-identification applications, where gradual or abrupt environmental changes occur over time. In Bayesian decision theory, these changes correspond to changes in the probability density function of the faces (e.g. appearance of the face), or in the prior probabilities (class proportions). The main contribution of this Thesis is an adaptive MCS for video-to-video FR in video surveillance, capable of spatio-temporal recognition and of self-updating based on highly confident facial trajectories captured in the scene. The system is also capable of adapting the fusion function of individual-specific classifiers to the operational imbalance in video-to-video FR. This contribution is divided into three parts.
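To make the link with Bayesian decision theory concrete, the standard posterior for a two-class (target vs. non-target) problem is recalled here. The symbols (a feature vector \mathbf{a} extracted from a face, and the classes C_+ and C_-) are notation assumed for this illustration, not taken from the Thesis.

```latex
P(C_+ \mid \mathbf{a}) =
  \frac{p(\mathbf{a} \mid C_+)\, P(C_+)}
       {p(\mathbf{a} \mid C_+)\, P(C_+) + p(\mathbf{a} \mid C_-)\, P(C_-)}
```

A change in facial appearance alters the class-conditional densities p(\mathbf{a} \mid C_\pm), whereas a change in class proportions alters the priors P(C_\pm). Either one shifts the posterior, and with it the decision boundary of a classifier designed beforehand.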


Intelligent video surveillance systems that employ face recognition (FR) for decision support are important in many private-sector and, above all, public-sector applications. The extensive use of FR systems is due in part to the universality of the human face as a biometric trait that can be covertly captured, to the availability of low-cost cameras, and to advances in biometrics, pattern recognition and image/video processing. These systems are being considered for video surveillance in crowded scenes (airports, shopping centers, stadiums, etc.). In these scenes, an operator observes the scene through surveillance cameras and monitors who or what is present (Hampapur et al., 2005). Although many decision support systems exist, many functions remain to be developed or improved. These areas of opportunity for researchers range from real-time recognition to fusion of video data from different sources, and include the design of compact biometric models and the preservation of performance over time (Gouaillier, 2009; Ahmad et al., 2008). Of special interest in this Thesis is the automatic detection of individuals of interest enrolled to a system, based on the appearance of their face, and the preservation of the system's performance regardless of variations over time in a target individual's appearance.

 Challenges of FRiVS

Many challenges in FRiVS remain open research problems. As stated by Zhao et al. (2003), FR from outdoor images of dense scenes, under unconstrained conditions, is still a research problem. This problem has been addressed by considering time information in video-based approaches (Matta and Dugelay, 2009). However, noisy sensed data from the complex, changing environment may lead to a biometric model that does not correspond to the true biometric samples, which directly affects the accuracy of the matching algorithm.
Overlapping class distributions due to inter-class similarity also increase the number of false alarms produced by the system. Facial models designed with a limited set of training data from the complex data distribution of faces in feature space are scarcely representative. Even if the facial models are representative, most FR systems assume that face samples in operation are acquired by the same sensor as the one used to acquire training data, which is not necessarily true and affects accuracy. Factors such as inappropriate interaction of the biometric system with the sensor, and inherent scene properties such as environmental or temporal changes in the true distribution of faces in feature space, may also degrade the accuracy of the system (Rattani, 2010; Poh et al., 2009). The quality of facial models is therefore a critical issue for the overall performance of the biometric application. The recognition problem becomes more challenging because faces do not remain static over time, and present either gradual (e.g. aging) or abrupt (e.g. pose, illumination) changes during the system's operation.

 Adaptive Face Recognition
 Semi-Supervised Learning

Many researchers have recently focused on updating biometric models over time using newly acquired data. These adaptive biometric systems can be categorized according to the way class labels are obtained. Unsupervised approaches do not require class labels to update biometric models, and recognition and update are performed simultaneously. Supervised approaches, on the other hand, use only labeled data previously acquired, in an offline update. Approaches in which biometric models are built in a supervised manner and then adapted online without supervision are also called partially-supervised or semi-supervised. Table 1.2 shows different approaches to adapt facial models as new data becomes available, either from daily operations or security reports.

 A Self-Updating System for Spatio-Temporal Face Recognition

The structure of the adaptive MCS for video-to-video FR is shown in Fig. 3.2. It is composed of 7 subsystems: 5 used in normal operation and 2 used in the design/self-update phase. The segmentation module performs face detection, and the feature extraction/selection module, together with the matcher (one EoD per enrolled individual), produces classification predictions. The IVT face tracker follows faces in the scene, allowing the spatio-temporal fusion system to regroup and accumulate target predictions over a fixed-size window for enhanced spatio-temporal FR. Detection (γ^d_k) and update (γ^u_k) thresholds for spatio-temporal fusion are estimated using validation trajectories, and the design/update module avoids knowledge corruption by using a learn-and-combine strategy. Individual-specific EoDs are designed by the design/update module, by training a pool of PFAM 2-class classifiers using a DPSO training strategy, and estimating the fusion function with BC. The sample selection system reduces the negative bias of the training and validation sets using the OSS and random selection strategies.
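The accumulation of predictions over a fixed-size window, followed by comparison against the detection and update thresholds, can be sketched as below. This is a minimal illustration under stated assumptions, not the Thesis implementation: the function name `fuse_trajectory`, the use of binary predictions, and the rule of comparing the peak windowed count to both thresholds are all hypothetical choices made for the example.

```python
from collections import deque
from typing import Iterable, Tuple


def fuse_trajectory(predictions: Iterable[int], window: int,
                    gamma_d: int, gamma_u: int) -> Tuple[bool, bool]:
    """Accumulate binary target predictions (1 = target) along one trajectory.

    Returns (detected, update): whether the windowed count of positive
    predictions ever reached the detection threshold gamma_d, and whether
    it ever reached the update threshold gamma_u.
    Assumption: thresholds are expressed as counts of positive predictions.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` predictions
    peak = 0
    for p in predictions:
        recent.append(p)
        peak = max(peak, sum(recent))
    return peak >= gamma_d, peak >= gamma_u


# Hypothetical trajectory of per-frame predictions for one tracked face.
trajectory = [0, 1, 1, 1, 0, 1, 1, 1, 1, 0]
detected, update = fuse_trajectory(trajectory, window=5, gamma_d=3, gamma_u=5)
print(detected, update)
```

Setting the update threshold above the detection threshold, as in this example, reflects the idea that only highly confident trajectories should trigger self-updating, while less confident ones may still raise an alarm.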

 Performance Analysis

The analysis of simulation results has been divided into three levels. First, a transaction-based analysis shows the performance of the system based on classification decisions on each ROI. Then, a subject-based analysis focuses on specific individuals, showing how performance varies with their particular characteristics. Finally, a trajectory-based analysis shows the overall performance of the system after the decision fusion accumulates predictions for complete input trajectories (shown in Fig. 3.5).
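As an illustration of the first level, transaction-based performance can be summarized by the true and false positive rates computed over per-ROI decisions. The sketch below assumes binary decisions and ground-truth labels (1 = target, 0 = non-target); the function name and the data are hypothetical, and the Thesis may report additional measures.

```python
from typing import Sequence, Tuple


def transaction_rates(decisions: Sequence[int],
                      labels: Sequence[int]) -> Tuple[float, float]:
    """Return (TPR, FPR) computed over individual ROI decisions."""
    tp = sum(1 for d, y in zip(decisions, labels) if d == 1 and y == 1)
    fp = sum(1 for d, y in zip(decisions, labels) if d == 1 and y == 0)
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    tpr = tp / positives if positives else 0.0  # guard against empty classes
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr


# Hypothetical decisions on six ROIs, two of which belong to the target.
tpr, fpr = transaction_rates([1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 0, 0])
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
```

The same function could be applied per enrolled individual for a subject-based view, or to one decision per trajectory for a trajectory-based view.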


Systems for face recognition (FR) in video surveillance are applied in a range of scenarios such as watchlist screening, face re-identification, and search and retrieval. Several challenges are present in these applications, including the common assumptions that the facial appearance of target individuals does not change over time, and that the proportions of faces captured for target and non-target individuals are balanced, known a priori and fixed. However, faces captured during operations vary due to capture conditions, the proportions of target and non-target individuals change continuously during operations, and the facial models used for matching are commonly not representative, since they are designed a priori with a limited number of reference samples that are collected and labeled at a high cost.
In this Thesis, a framework for adaptive systems for video-to-video FR in video surveillance is proposed, contributing new techniques to adapt the facial models of enrolled individuals of interest. This framework allows systems to update facial models automatically through trajectory-based self-updating, accounting for gradual and abrupt changes in the classification environment. In addition, through a modification of Skew-Sensitive Boolean Combination (SSBC), the systems are capable of adapting the individual-specific ensembles to the operational imbalance.
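To see why the operational imbalance matters, consider a detector whose true and false positive rates are fixed at design time. The sketch below computes the expected precision of its alarms as the proportion of target faces drops. The rates and proportions are hypothetical values chosen for illustration, and this is not an implementation of SSBC.

```python
def precision_at_prior(tpr: float, fpr: float, target_prior: float) -> float:
    """Expected precision of a detector with fixed TPR/FPR at a given
    proportion of target samples (target_prior)."""
    true_alarms = tpr * target_prior
    false_alarms = fpr * (1.0 - target_prior)
    total = true_alarms + false_alarms
    return true_alarms / total if total > 0 else 0.0


# Hypothetical detector: TPR = 0.9, FPR = 0.1, under increasing skew.
for prior in (0.5, 0.1, 0.01):
    precision = precision_at_prior(0.9, 0.1, prior)
    print(f"target proportion {prior:.2f}: precision {precision:.3f}")
```

With balanced classes, nine alarms out of ten are correct, but when only one face in a hundred belongs to the target, fewer than one alarm in ten is. This is the effect that adapting the ensemble to the operational imbalance is meant to counter.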

Table of contents

1.1 Face Recognition in Video-Surveillance
1.1.1 Specialized Architectures for FRiVS
1.1.2 Challenges of FRiVS
1.2 Adaptive Face Recognition
1.2.1 Semi-Supervised Learning
1.2.2 Adaptive Biometrics
1.2.3 Challenges of Adaptive FR Systems
1.3 Incremental and On-Line Learning of Classifiers
1.3.1 Fuzzy ARTMAP
1.3.2 PFAM Neural Classifier
1.4 Adaptive Ensembles
1.4.1 Generation of Pools
1.4.2 Selection and Fusion
    Iterative Boolean Combination
1.4.3 Ensembles for Class Imbalance
    Passive Approaches
    Active Approaches
    Skew-Sensitive Boolean Combination
1.4.4 Challenges on Adaptive Ensembles for Class Imbalance
1.5 Measuring Classification Performance
1.6 Summary of Overall Challenges
2.1 Introduction
2.2 Video-to-video Face Recognition
2.2.1 Face Tracking
2.2.2 Specialized Classification Architectures
2.2.3 Decision Fusion
2.2.4 Challenges of Facial Modeling
2.3 Adaptive Biometric Systems
2.3.1 Selection of Representative Samples
2.3.2 Update of Biometric Systems
2.3.3 Adaptive Face Recognition
2.4 A Self-Updating System for Face Recognition in Video Surveillance
2.4.1 Modular Classification System
2.4.2 Tracking System
2.4.3 Decision Fusion System
2.4.4 Design/Update System
2.4.5 Sample Selection
2.5 Experimental Methodology
2.5.1 Video Surveillance Database
2.5.2 Implementation of the Proposed MCS
2.5.3 Experimental Protocol
2.5.4 Performance Analysis
2.6 Results
2.6.1 Transaction-Based Analysis
2.6.2 Subject-Based Analysis
2.6.3 Trajectory-Based Analysis
2.7 Conclusion
3.1 Introduction
3.2 Video-to-Video Face Recognition in Person Re-identification
3.2.1 Face Tracking
3.2.2 Face Matching
3.2.3 Spatio-Temporal Fusion
3.2.4 Key Challenges in Person Re-Identification
3.3 Update of Facial Models
3.3.1 Adaptive Biometrics
3.3.2 Adaptive Face Recognition Systems
3.4 A Self-Updating System for Spatio-Temporal Face Recognition
3.4.1 Modular Classification System
3.4.2 Tracking System
3.4.3 Spatio-Temporal Fusion System
3.4.4 Design/Update System
3.4.5 Sample Selection
3.5 Experimental Methodology
3.5.1 Database for Face Re-Identification
3.5.2 Experimental Protocol
3.5.3 Performance Analysis
3.6 Results
3.6.1 Subject-Based Analysis
3.6.2 LTM Management
3.6.3 Trajectory-Based Analysis
3.7 Conclusions
4.1 Introduction
4.2 Ensemble Methods for Class Imbalance
4.2.1 Passive Approaches
4.2.2 Active Approaches
4.2.3 Estimation of Class Imbalance
4.2.4 Challenges
4.3 Adaptive Skew-Sensitive Ensembles for Video-to-Video Face Recognition
4.3.1 Approximation of Operational Imbalance
4.3.2 Design and Adaptation of Ensembles
4.4 Synthetic Experiments
4.4.1 Experimental Protocol
4.4.2 Results
    Classification on Imbalanced Problems
    Ensemble Generation Using Several Classifiers per Imbalance
    Approximation of Imbalance Through Quantification
4.4.3 Discussion
4.5 Experiments on Video Data
4.5.1 Experimental Protocol
4.5.2 Video Surveillance Data
4.5.3 Experimental Protocol
4.5.4 Results
    Transaction-Based Analysis
    Individual-Specific Analysis
    Approximation of Operational Imbalance
4.5.5 Trajectory-Level Analysis
4.6 Conclusion
