ANOMALY INTRUSION DETECTION SYSTEMS

Overview of Intrusion Detection Systems

IDSs allow for the detection of successful or unsuccessful attempts to compromise system security. An IDS is an important component of any security infrastructure and complements other security mechanisms. An IDS consists of four essential components: sensors, analysis engines, a data repository, and management and reporting modules.
An IDS monitors the activity of a target system through a data source, such as system call traces, audit trails, or network packets. Relevant information from these data sources is captured by IDS sensors, synthesized as events, and forwarded to the analysis engine for on-line analysis or to a repository for off-line analysis.
The analysis engine contains decision-making mechanisms to discriminate malicious events from normal events. It may include anomaly, misuse, or hybrid detection approaches (described next). The output of an analysis engine includes specific information about how suspicious events manifested. This information is stored in a repository for forensic analysis. A management and reporting module receives events that could indicate an attack from the analysis engine, raises an alarm to notify human operators, and reports the relevant information and the level of threat. The management module also controls the operation of IDS components, for example by tuning the decision thresholds of the analysis engine and updating the data repository.
The configuration of analysis engines, the update of the data repository, and the response to alerts are among the responsibilities of the IDS administrator. When alerts are raised, the IDS administrator should prioritize and investigate incidents to refute or confirm that an attack has actually occurred. If an intrusion attempt is confirmed, a response team should react to limit the damage, and a forensic analysis team should investigate the cause of the successful attack. An IDS may include a response module that undertakes further actions, either to prevent an ongoing attack or to collect additional supporting information. Such a system is often referred to as an intrusion prevention system (IPS) or an intrusion detection and prevention system (IDPS) (Ghorbani et al., 2010; Rash et al., 2005; Scarfone and Mell, 2007; Stakhanova et al., 2007).

Host-based Anomaly Detection

Host-based anomaly detection systems typically monitor the behavior of system processes to determine whether a process is behaving normally or has been subverted by an attack. Abnormal behavior of privileged processes is the most dangerous, because attacks that exploit vulnerabilities in privileged processes can lead to full system compromise.
In general, anomaly detection systems (ADSs) monitor events from a variety of data sources, including log files created by a process and audit trails generated by the operating system. While these sources provide important security information, it is fairly easy to deploy attacks that leave no traces in traditional logs or audit trails, and hence evade detection. Sequences of system calls issued by a process to request kernel services have been shown to be effective in describing normal process behavior (Forrest et al., 1996). A substantial amount of research has investigated various techniques for detecting anomalies in system call sequences (Forrest et al., 2008; Warrender et al., 1999).
The remainder of this section provides a brief background on privileged processes and system calls, and then reviews system call-based approaches to process anomaly detection.

Evaluation of Intrusion Detection Systems

The evaluation of intrusion detection systems is an open research topic that faces several challenges, including the lack of representative data and unified methodologies, and the use of inadequate evaluation metrics (Abouzakhar and Manson, 2004; Gadelrab, 2008; Tucker et al., 2007). This thesis evaluates the proposed techniques for system call anomaly detection based on their efficiency, adaptability, and accuracy.
Efficiency covers the costs incurred during the design and operation phases of an anomaly detection system. For the design phase, it considers the time and memory complexity required to design an ADS, including detector training, validation, and selection or combination, as well as the space required to store training data (e.g., for batch techniques) and to store the selected or combined models. The impact on efficiency of training set size, alphabet size, and the complexity of the monitored processes is also assessed. For the operation phase, efficiency considers the time and memory complexity required to operate the ADS with one or multiple detectors, evaluate the likelihood of the input sub-sequences of observations, and make a decision.
A fully adaptive ADS must have mechanisms to detect legitimate changes in normal behavior, collect data that reflect the changes, ensure that the relevant data contain no attack patterns, and update its internal detectors and decision thresholds to adapt to the changes. The scope of this thesis is limited to adaptation at the detector and decision levels. The remaining tasks stay within the administrator's responsibility.
Therefore, the adaptability of an ADS is evaluated by its effectiveness in maintaining or improving overall system accuracy in response to new data. The accuracy of an ADS is determined through receiver operating characteristic (ROC) analysis, and should not be confused with the accuracy measure (or, inversely, the error rate), as described next.
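The distinction between ROC analysis and the plain accuracy measure can be illustrated with a minimal Python sketch. It is not the evaluation code of the thesis: the function names (`roc_points`, `auc`, `accuracy`), the convention that higher scores mean "more anomalous", and the toy scores are all assumptions made for illustration.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Assumptions: higher score = more anomalous; label 1 = attack, 0 = normal;
    both classes are present in `labels`.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def accuracy(scores, labels, threshold):
    """Fraction of correct decisions at ONE fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (hypothetical): four scored sub-sequences, two of them attacks.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
print(curve, auc(curve))
print(accuracy(scores, labels, threshold=0.85))
```

The ROC curve summarizes the detector over all decision thresholds, whereas `accuracy` reports a single operating point and depends on the chosen threshold and on the proportion of attacks in the data.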

A SURVEY OF TECHNIQUES FOR INCREMENTAL LEARNING OF HMM PARAMETERS

The Hidden Markov Model (HMM) is a stochastic model for sequential data. It is a stochastic process determined by two interrelated mechanisms: a latent Markov chain with a finite number of states, and a set of observation probability distributions, each associated with a state. At each discrete time instant, the process is assumed to be in a state, and an observation is generated by the probability distribution corresponding to the current state. The HMM is termed discrete if the output alphabet is finite, and continuous if the output alphabet is not necessarily finite, e.g., when each state is governed by a parametric density function (Elliott, 1994; Ephraim and Merhav, 2002; Rabiner, 1989).
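The generative process described above can be sketched for a discrete HMM as follows. This is an illustrative sketch only: the function name `sample_hmm` and the parameter names `pi` (initial state distribution), `A` (state transition matrix), and `B` (per-state observation distributions) follow common notation but are assumptions, as are the example values.

```python
import random


def sample_hmm(pi, A, B, length, seed=0):
    """Draw a state path and an observation sequence from a discrete HMM.

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    B[i][k] : probability of emitting symbol k while in state i
    """
    rng = random.Random(seed)
    states, symbols = [], []
    state = rng.choices(range(len(pi)), weights=pi)[0]
    for _ in range(length):
        states.append(state)
        # The observation depends only on the current (hidden) state.
        symbols.append(rng.choices(range(len(B[state])), weights=B[state])[0])
        # The next state depends only on the current state (Markov property).
        state = rng.choices(range(len(A[state])), weights=A[state])[0]
    return states, symbols


# Hypothetical two-state model over a three-symbol alphabet.
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]
hidden, observed = sample_hmm(pi, A, B, length=10)
print(observed)  # only this sequence is visible to an observer
```

Only `observed` is available in practice; the state path stays latent, which is why the parameters must be estimated from observation sequences alone.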
Theoretical and empirical results have shown that, given an adequate number of states and a sufficiently rich set of data, HMMs are capable of representing probability distributions corresponding to complex real-world phenomena in terms of simple and compact models (Bengio, 1999; Bilmes, 2002). This is supported by the success of HMMs in various practical applications, most notably automatic speech recognition, where they have become a predominant design methodology (Bahl et al., 1982; Huang and Hon, 2001; Rabiner, 1989). They have also been successfully applied to various other fields, such as communication and control (Elliott, 1994; Hovland and McCarragher, 1998), bioinformatics (Eddy, 1998; Krogh et al., 2001), computer vision (Brand and Kettnaker, 2000; Rittscher et al., 2000), and computer and network security (Cho and Han, 2003; Lane and Brodley, 2003; Warrender et al., 1999). For instance, in the area of computer and network security, a growing number of HMM applications are found in intrusion detection systems (IDSs). HMMs have been applied either in anomaly detection, to model normal patterns of behavior, or in misuse detection, to model a predefined set of attacks. HMM applications in anomaly and misuse detection have emerged in both main categories of IDS: host-based IDS (Cho and Han, 2003; Lane and Brodley, 2003; Warrender et al., 1999; Yeung and Ding, 2003) and network-based IDS (Gao et al., 2003; Tosun, 2005). Moreover, HMMs have recently begun to emerge in wireless IDS applications (Cardenas et al., 2003; Konorski, 2005).

On-Line Learning of HMM Parameters

Several on-line learning techniques from the literature may be applied to incremental learning of HMM parameters from new training sequences. Figure 2.4 presents a taxonomy of techniques for on-line learning of HMM parameters, organized by objective function, optimization technique, and target application. As shown in the figure, they fall into the categories of standard numerical optimization, expectation-maximization, and recursive estimation. Their objective follows one of three criteria: maximum likelihood estimation (MLE), minimum model divergence (MMD) of parameters penalized with the MLE, or minimum output or state prediction error (MPE).
The target application implies a scenario for data organization and learning. Some techniques have been designed for block-wise estimation of HMM parameters, while others have been designed for symbol-wise estimation. Block-wise techniques are designed for scenarios in which training symbols are organized into a block of sub-sequences and the HMM re-estimates its parameters after observing each sub-sequence. In contrast, symbol-wise techniques, also known as recursive or sequential techniques, are designed for scenarios in which training symbols are observed one at a time, from a stream of symbols, and the HMM parameters are re-estimated upon observing each new symbol.
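The difference between the two data-organization scenarios can be sketched as below. To keep the example short, a count-based first-order Markov chain over the observed symbols is used as a stand-in model. It has no hidden states and is not one of the surveyed HMM learning techniques; the class name `TransitionCounts` and its methods are hypothetical. What the sketch shows is when the model is updated, not how an HMM update is computed.

```python
from collections import defaultdict


class TransitionCounts:
    """Stand-in model (assumption): symbol-to-symbol transition counts."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.updates = 0      # number of re-estimation steps performed
        self._prev = None     # last symbol seen in the stream

    def update_block(self, subsequence):
        """Block-wise: one re-estimation per observed sub-sequence."""
        for a, b in zip(subsequence, subsequence[1:]):
            self.counts[a][b] += 1
        self.updates += 1

    def update_symbol(self, symbol):
        """Symbol-wise: one re-estimation per symbol of the stream."""
        if self._prev is not None:
            self.counts[self._prev][symbol] += 1
        self._prev = symbol
        self.updates += 1

    def prob(self, a, b):
        total = sum(self.counts[a].values())
        return self.counts[a][b] / total if total else 0.0


# Block-wise scenario: a block made of two sub-sequences.
block_model = TransitionCounts()
for sub in [[1, 2, 3], [1, 2, 4]]:
    block_model.update_block(sub)

# Symbol-wise scenario: the same symbols arriving as one stream.
stream_model = TransitionCounts()
for sym in [1, 2, 3, 1, 2, 4]:
    stream_model.update_symbol(sym)

print(block_model.updates, stream_model.updates)  # 2 6
```

In the block-wise case the sub-sequences are treated as separate units, so no transition is counted across their boundary, whereas the stream model also records the transition from 3 to 1.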

ITERATIVE BOOLEAN COMBINATION OF CLASSIFIERS IN THE ROC SPACE: AN APPLICATION TO ANOMALY DETECTION WITH HMMS

Intrusion Detection Systems (IDSs) are used to identify, assess, and report unauthorized computer or network activities. Host-based IDSs (HIDSs) are designed to monitor the activities and state of a host system, while network-based IDSs (NIDSs) monitor the network traffic for multiple hosts. HIDSs and NIDSs have been designed to perform misuse detection and anomaly detection. Anomaly-based intrusion detection allows the detection of novel attacks for which signatures have not yet been extracted (Chandola et al., 2009a). In practice, anomaly detectors will typically generate false alarms, due in large part to the limited data used for training and to the complexity of the underlying data distributions, which may change dynamically over time. Since it is very difficult to collect and label representative data to design and validate an Anomaly Detection System (ADS), its internal model of normal behavior will tend to diverge from the underlying data distribution.
In HIDSs applied to anomaly detection, operating system events are usually monitored. Since system calls are the gateway between user and kernel mode, traditional host-based anomaly detection systems monitor deviations in system call sequences. Forrest et al. (1996) confirmed that short sequences of system calls are consistent with normal operation, and that unusual bursts occur during an attack. Their anomaly detection system, called Sequence Time-Delay Embedding (STIDE), is based on look-up tables of memorized normal sequences. During operation, STIDE must compare each input sequence to all "normal" training sequences. The number of comparisons increases exponentially with the detector window size. Moreover, STIDE is often used for the design and validation of other state-of-the-art detectors. Various neural and statistical detectors have been applied to learn normal process behavior from system call sequences (Warrender et al., 1999). Among these, techniques based on discrete Hidden Markov Models (HMMs) (Rabiner, 1989) have been shown to produce a very high level of performance (Warrender et al., 1999). A well-trained HMM is able to capture the underlying structure of the monitored application and detect deviations from "normal" system call sequences. Once trained, an HMM provides a fast and compact detector, with tolerance to noise and uncertainty.
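The look-up idea behind a STIDE-style detector can be sketched as follows. This is a simplified illustration, not the implementation of Forrest et al.: the function names, the use of a Python set as the look-up table, the mismatch-rate score, and the system call names in the traces are all assumptions.

```python
def windows(trace, size):
    """All contiguous sub-sequences of `size` system calls in a trace."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]


def train_table(normal_traces, size):
    """Memorize every window seen in attack-free traces."""
    table = set()
    for trace in normal_traces:
        table.update(windows(trace, size))
    return table


def mismatch_rate(table, trace, size):
    """Fraction of windows in `trace` that were never seen during training."""
    seen = windows(trace, size)
    if not seen:
        return 0.0
    return sum(w not in table for w in seen) / len(seen)


# Hypothetical traces of system call names.
normal = [["open", "read", "mmap", "read", "close"]]
table = train_table(normal, size=3)

suspect = ["open", "read", "mmap", "execve", "close"]
print(mismatch_rate(table, normal[0], size=3))   # 0.0
print(mismatch_rate(table, suspect, size=3))     # 2 of 3 windows are unseen
```

A trace would be flagged when its mismatch rate exceeds a decision threshold. The set of memorized windows grows with the window size and with the variability of the normal behavior, which is the storage and comparison cost that a trained HMM avoids.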

Anomaly Detection with HMMs

A discrete-time finite-state HMM is a stochastic process determined by two interrelated mechanisms: a latent Markov chain with a finite number of states, and a set of observation probability distributions, each associated with a state. At each discrete time instant, the process is assumed to be in a state, and an observation is generated by the probability distribution corresponding to the current state. HMM parameters are usually trained using the Baum-Welch (BW) algorithm (Baum et al., 1970), a specialized expectation-maximization technique that estimates the parameters of the model from the training data. Theoretical and empirical results have shown that, given an adequate number of states and a sufficiently rich set of observations, HMMs are capable of representing probability distributions corresponding to complex real-world phenomena in terms of simple and compact models, with tolerance to noise and uncertainty. For further details regarding HMMs, the reader is referred to the extensive literature (Ephraim and Merhav, 2002; Rabiner, 1989).
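Once an HMM has been trained, a sub-sequence of observations can be scored by its likelihood under the model. The sketch below uses the standard scaled forward algorithm (Rabiner, 1989) for this purpose. The function name `log_likelihood`, the parameter names `pi`, `A`, and `B`, and the numeric values are assumptions for illustration, and no training step (such as Baum-Welch) is shown.

```python
import math


def log_likelihood(pi, A, B, obs):
    """Log P(obs | model) for a discrete HMM, via the scaled forward pass.

    pi[i]: initial state probability; A[i][j]: transition probability;
    B[i][k]: probability of symbol k in state i; obs: non-empty list of
    symbol indices.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescaling avoids numerical underflow on long sequences.
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Hypothetical two-state model over a two-symbol alphabet.
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.5, 0.5],
     [0.1, 0.9]]
print(math.exp(log_likelihood(pi, A, B, [0, 1])))  # approximately 0.2156
```

For anomaly detection, a sub-sequence whose log-likelihood (often normalized by its length) falls below a decision threshold would be reported as anomalous. The threshold itself is a design choice and is not fixed by this sketch.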

Table of Contents

INTRODUCTION
CHAPTER 1 ANOMALY INTRUSION DETECTION SYSTEMS
1.1 Overview of Intrusion Detection Systems
1.1.1 Network-based IDS
1.1.2 Host-based IDS
1.1.3 Misuse Detection
1.1.4 Anomaly Detection
1.2 Host-based Anomaly Detection
1.2.1 Privileged Processes
1.2.2 System Calls
1.2.3 Anomaly Detection using System Calls
1.3 Anomaly Detection with HMMs
1.3.1 HMM-based Anomaly Detection using System Calls
1.4 Data Sets
1.4.1 University of New Mexico (UNM) Data Sets
1.4.2 Synthetic Generator
1.5 Evaluation of Intrusion Detection Systems
1.5.1 Receiver Operating Characteristic (ROC) Analysis
1.6 Anomaly Detection Challenges
1.6.1 Representative Data Assumption
1.6.2 Unrepresentative Data
CHAPTER 2 A SURVEY OF TECHNIQUES FOR INCREMENTAL LEARNING OF HMM PARAMETERS
2.1 Introduction
2.2 Batch Learning of HMM Parameters
2.2.1 Objective Functions
2.2.2 Optimization Techniques
2.2.2.1 Expectation-Maximization
2.2.2.2 Standard Numerical Optimization
2.2.2.3 Expectation-Maximization versus Gradient-based Techniques
2.3 On-Line Learning of HMM Parameters
2.3.1 Minimum Model Divergence (MMD)
2.3.2 Maximum Likelihood Estimation (MLE)
2.3.2.1 On-line Expectation-Maximization
2.3.2.2 Numerical Optimization Methods
2.3.2.3 Recursive Maximum Likelihood Estimation (RMLE)
2.3.3 Minimum Prediction Error (MPE)
2.4 An Analysis of the On-line Learning Algorithms
2.4.1 Convergence Properties
2.4.2 Time and Memory Complexity
2.5 Guidelines for Incremental Learning of HMM Parameters
2.5.1 Abundant Data Scenario
2.5.2 Limited Data Scenario
2.6 Conclusion
CHAPTER 3 ITERATIVE BOOLEAN COMBINATION OF CLASSIFIERS IN THE ROC SPACE: AN APPLICATION TO ANOMALY DETECTION WITH HMMS
3.1 Introduction
3.2 Anomaly Detection with HMMs
3.3 Fusion of Detectors in the Receiver Operating Characteristic (ROC) Space
3.3.1 Maximum Realizable ROC (MRROC)
3.3.2 Repairing Concavities
3.3.3 Conjunction and Disjunction Rules for Crisp Detectors
3.3.4 Conjunction and Disjunction Rules for Combining Soft Detectors
3.4 A Boolean Combination (BCALL) Algorithm for Fusion of Detectors
3.4.1 Boolean Combination of Two ROC Curves
3.4.2 Boolean Combination of Multiple ROC Curves
3.4.3 Time and Memory Complexity
3.4.4 Related Work on Classifiers Combinations
3.5 Experimental Methodology
3.5.1 University of New Mexico (UNM) Data
3.5.2 Synthetic Data
3.5.3 Experimental Protocol
3.6 Simulation Results and Discussion
3.6.1 An Illustrative Example with Synthetic Data
3.6.2 Results with Synthetic and Real Data
3.7 Conclusion
CHAPTER 4 ADAPTIVE ROC-BASED ENSEMBLES OF HMMS APPLIED TO ANOMALY DETECTION
4.1 Introduction
4.2 Adaptive Anomaly Detection Systems
4.2.1 Anomaly Detection Using HMMs
4.2.2 Adaptation in Anomaly Detection
4.2.3 Techniques for Incremental Learning of HMM Parameters
4.2.4 Incremental Learning with Ensembles of Classifiers
4.3 Learn-and-Combine Approach Using Incremental Boolean Combination
4.3.1 Incremental Boolean Combination in the ROC Space
4.3.2 Model Management
4.3.2.1 Model Selection
4.3.2.2 Model Pruning
4.4 Experimental Methodology
4.4.1 Data Sets
4.4.2 Experimental Protocol
4.5 Simulation Results
4.5.1 Evaluation of the Learn-and-Combine Approach
4.5.2 Evaluation of Model Management Strategies
4.6 Conclusion
CONCLUSIONS