You are on page 1of 16

ABSTRACT

Distributed denial of service (DDoS) attack is a continuous critical threat to the Internet. Derived from the low layers, new application-layer-based DDoS attacks utilizing legitimate HTTP requests to overwhelm victim resources are more undetectable. The case may be more serious when such attacks mimic or occur during the flash crowd event of a popular Website. Focusing on the detection for such new DDoS attacks, a scheme based on document popularity is introduced. An Access Matrix is defined to capture the spatial-temporal patterns of a normal flash crowd. Principal component analysis and independent component analysis are applied to abstract the multidimensional Access Matrix. A novel anomaly detector based on hidden semi-Markov model is proposed to describe the dynamics of Access Matrix and to detect the attacks.

Statement About The Problem:


Few existing papers focus on the detection of App-DDoS attacks during the flash crowd event. Net-DDoS attacks versus stable background traffic, Net-DDoS attacks versus flash crowd (i.e., burst background traffic) are dealt the existing system. some simple App-DDoS attacks (e.g., Flood) still can be monitored by improving existing methods designed for Net-DDoS attacks, e.g., we can apply the HTTP request rate, HTTP session rate, and duration of users access for detecting.
Most existing methods used on document popularity for modeling user behavior merely

focus on the average characteristics (e.g., mean and variance), we use a stochastic process to model the variety of the document popularity, in which a random vector is used to represent the spatial distribution of document popularity and is assumed to be changing with time. Existing algorithms of HsMM will be very complex when the observation is a highdimension vector with dependent elements in the spatial-temporal matrix of AM. In the practical implementation, the model is first trained by the stable and low-volume Web workload whose normality can be ensured by most existing anomaly detection systems, and then it is used to monitor the following Web workload for a period of 10 min. Stochastic pulses are very difficult to be detected by the existing methods that are based on traffic volume analysis, because the average rate of the attacks is not remarkably higher than that of a normal user.

Why Is The Particular Topic Chosen?

In the past decades, various security-enhanced measures have been proposed to improve the security of data transmission. Distributed denial of service (DDoS) attack is a continuous critical threat to the Internet. Derived from the low layers, new application-layer-based DDoS attacks utilizing legitimate HTTP requests to overwhelm victim resources are more undetectable. Since many studies have noticed this type of attack and have proposed different schemes (e.g., network measure or anomaly detection) to protect the network and equipment from bandwidth attacks, it is not as easy as in the past for attackers to launch the DDoS attacks based on network layer.

Objective Of The Project:

Distributed denial of service (DDoS) attack is a continuous critical threat to the Internet. Derived from the low layers, new application-layer-based DDoS attacks utilizing legitimate HTTP requests to overwhelm victim resources are more undetectable. The case may be more serious when such attacks mimic or occur during the flash crowd event of a popular Website. Focusing on the detection for such new DDoS attacks, a scheme based on document popularity is introduced. An Access Matrix is defined to capture the spatial-temporal patterns of a normal flash crowd. Principal component analysis and independent component analysis are applied to abstract the multidimensional Access Matrix. A novel anomaly detector based on hidden semi-Markov model is proposed to describe the dynamics of Access Matrix and to detect the attacks. The entropy of document popularity fitting to the model is used to detect the potential application-layer DDoS attacks. Numerical results based on real Web traffic data are presented to demonstrate the effectiveness of the proposed method.

Scope of the Project:

Distributed denial of service (DDoS) attack has caused severe damage to servers and will cause even greater intimidation to the development of new Internet services. Traditionally, DDoS attacks are carried out at the network layer, such as ICMP flooding, SYN flooding, and UDP flooding, which are called Net-DDoS attacks in this paper. The intent of these attacks is to consume the network bandwidth and deny service to legitimate users of the victim systems. Since many studies have noticed this type of attack and have proposed different schemes (e.g., network measure or anomaly detection) to protect the network and equipment from bandwidth attacks, it is not as easy as in the past for attackers to launch the DDoS attacks based on network layer.

Methodology

Software engineering is the practice of using selected process techniques to improve the quality of a software development effort. This is based on the assumption, subject to endless debate and supported by patient experience, that a methodical approach to software development results in fewer defects and, therefore, ultimately provides shorter delivery times and better value. The documented collection of policies, processes and procedures used by a development team or organization to practice software engineering is called its software development methodology (SDM) or system development life cycle (SDLC). The steps for Spiral Model can be generalized as follows: The new system requirements are defined in as much details as possible. This usually involves interviewing a number of users representing all the external or internal users and other aspects of the existing system. A preliminary design is created for the new system. A first prototype of the new system is constructed from the preliminary design. This is usually a scaled-down system, and represents an approximation of the characteristics of the final product. A second prototype is evolved by a fourfold procedure: Evaluating the first prototype in terms of its strengths, weakness, and risks. Defining the requirements of the second prototype. Planning an designing the second prototype. Constructing and testing the second prototype.

At the customer option, the entire project can be aborted if the risk is deemed too great. Risk factors might involve development cost overruns, operating-cost miscalculation, or any other factor that could, in the customers judgment, result in a less-than-satisfactory final product.

The existing prototype is evaluated in the same manner as was the previous prototype, and if necessary, another prototype is developed from it according to the fourfold procedure outlined above. The preceding steps are iterated until the customer is satisfied that the refined prototype represents the final product desired. The final system is constructed, based on the refined prototype. The final system is thoroughly evaluated and tested. Routine maintenance is carried on a continuing basis to prevent large scale failures and to minimize down time. The following diagram shows how a spiral model acts like:

Spiral Model

Advantages:
Estimates(i.e. budget, schedule etc .) become more relistic as work progresses, because important issues discoved earlier. It is more able to cope with the changes that are software development generally entails. Software engineers can get their hands in and start woring on the core of a project earlier.

Hardware Requirements:
Processor Ram Hard Disk Compact Disk Input device Output device : Any Processor above 500 MHz. : 128Mb. : 10 Gb. : 650 Mb. : Standard Keyboard and Mouse. : VGA and High Resolution Monitor.

Software Requirements:
Operating System Language Front End : Windows Family. : JDK 1.5 : Java Swing

What Contribution Would The Project Make?


Can be able to detect the App-DDos attacks during the normal flash crowd event. Can be able to deal with multidimensional data by using PCA and ICA. Can be able to get real web traffic data by using HsMM.

Can be able to find constant rate attacks, increasing rate attacks, and stochastic pulsing

attack.

Topic Of The Project


Distributed denial of service (DDoS) attack has caused severe damage to servers and will cause even greater intimidation to the development of new Internet services. Traditionally, DDoS attacks are carried out at the network layer, such as ICMP flooding, SYN flooding, and UDP flooding, which are called Net-DDoS attacks in this. The intent of these attacks is to consume the network bandwidth and deny service to legitimate users of the victim systems. When the simple Net-DDoS attacks fail, attackers shift their offensive strategies to application-layer attacks and establish a more sophisticated type of DDoS attacks. To circumvent detection, they attack the victim Web servers by HTTP GET requests (e.g., HTTP Flooding) and pulling large image files from the victim server in overwhelming numbers.

Process Description : DFDS:


A data flow diagram is graphical tool used to describe and analyze movement of data through a system. These are the central tool and the basis from which the other components are developed. The transformation of data from input to output, through processed, may be described logically and independently of physical components associated with the system. These are known as the logical data flow diagrams. The physical data flow diagrams show the actual implements and movement of data between people, departments and workstations. A full description of a system actually consists of a set of data flow diagrams. Using two familiar notations Yourdon, Gane and Sarson notation develops the data flow diagrams. Each component in a DFD is labeled with a descriptive name. Process is further identified with a number that will be used for identification purpose. The development of DFDS is done in several levels. Each process in lower level diagrams can be broken down into a more detailed DFD in the next level. The loplevel diagram is often called context diagram. It consists a single process bit, which plays vital role in studying the current system. The process in the context level diagram is exploded into other process at the first level DFD. The idea behind the explosion of a process into more process is that understanding at one level of detail is exploded into greater detail at the next level. This is done until further explosion is necessary and an adequate amount of detail is described for analyst to understand the process.

Larry Constantine first developed the DFD as a way of expressing system requirements in a graphical from, this lead to the modular design. A DFD is also known as a bubble Chart has the purpose of clarifying system requirements and identifying major transformations that will become programs in system design. So it is the starting point of the design to the lowest level of detail. A DFD consists of a series of bubbles joined by data flows in the system.

DFD SYMBOLS:
In the DFD, there are four symbols 1. A square defines a source(originator) or destination of system data 2. An arrow identifies data flow. It is the pipeline through which the information flows 3. A circle or a bubble represents a process that transforms incoming data flow into outgoing data flows. 4. An open rectangle is a data store, data at rest or a temporary repository of data

Process that transforms data flow.

Source or Destination of data

Data flow

Data Store

CONSTRUCTING A DFD:
Several rules of thumb are used in drawing DFDS: 1. Process should be named and numbered for an easy reference. representative of the process. 2. The direction of flow is from top to bottom and from left to right. Data traditionally flow from source to the destination although they may flow back to the source. One way to indicate this is to draw long flow line back to a source. An alternative way is to repeat the source symbol as a destination. Since it is used more than once in the DFD it is marked with a short diagonal. 3. When a process is exploded into lower level details, they are numbered. 4. The names of data stores and destinations are written in capital letters. Process and dataflow names have the first letter of each work capitalized. A DFD typically shows the minimum contents of data store. Each data store should contain all the data elements that flow in and out. Questionnaires should contain all the data elements that flow in and out. interfaces redundancies and like is then accounted for often through interviews. Missing Each name should be

SAILENT FEATURES OF DFDS


1. The DFD shows flow of data, not of control loops and decision are controlled considerations do not appear on a DFD. 2. The DFD does not indicate the time factor involved in any process whether the dataflow take place daily, weekly, monthly or yearly. 3. The sequence of events is not brought out on the DFD.

TYPES OF DATA FLOW DIAGRAMS


1. Current Physical 2. Current Logical 3. New Logical 4. New Physical

CURRENT PHYSICAL:

In Current Physical DFD process label include the name of people or their positions or the names of computer systems that might provide some of the overall system-processing label includes an identification of the technology used to process the data. Similarly data flows and data stores are often labels with the names of the actual physical media on which data are stored such as file folders, computer files, business forms or computer tapes.

CURRENT LOGICAL:
The physical aspects at the system are removed as much as possible so that the current system is reduced to its essence to the data and the processors that transforms them regardless of actual physical form.

NEW LOGICAL:
This is exactly like a current logical model if the user were completely happy with the user were completely happy with the functionality of the current system but had problems with how it was implemented typically through the new logical model will differ from current logical model while having additional functions, absolute function removal and inefficient flows recognized.

NEW PHYSICAL:
The new physical represents only the physical implementation of the new system.

RULES GOVERNING THE DFDS PROCESS


1) 2) 3) No process can have only outputs. No process can have only inputs. If an object has only inputs than it must be a sink. A process has a verb phrase label.

DATA STORE

1) 2) 3)

Data cannot move directly from one data store to another data store, a process must Data cannot move directly from an outside source to a data store, a process, which A data store has a noun phrase label.

move data. receives, must move data from the source and place the data into data store

SOURCE OR SINK
The origin and /or destination of data. 1) 2) Data cannot move direly from a source to sink it must be moved by a process A source and /or sink has a noun phrase land

DATA FLOW
1) A Data Flow has only one direction of flow between symbols. It may flow in both directions between a process and a data store to show a read before an update. The later is usually indicated however by two separate arrows since these happen at different type. 2) 3) A join in DFD means that exactly the same data comes from any of two or more A data flow cannot go directly back to the same process it leads. There must be at least different processes data store or sink to a common location. one other process that handles the data flow produce some other data flow returns the original data into the beginning process. 4) 5) A Data flow to a data store means update (delete or change). A data Flow from a data store means retrieve or use.

A data flow has a noun phrase label more than one data flow noun phrase can appear on a single arrow as long as all of the flows on the same arrow move together as one package.

Context level DFD:

user

Monitoring the Application-Layer DDoS Attacks for Popular Websites

Received Data

Level-1:

user

server

router

flooding select

destination

Architecture:

We will propose a dynamic routing algorithm that could randomize delivery paths for data transmission. The algorithm is easy to implement and compatible with popular routing protocols, such as the Routing Information Protocol in wired networks and Destination-Sequenced Distance Vector protocol in wireless networks, without introducing extra control messages.

Multidimensional Data Processing HsMM


Self-Adaptive SchemeRouting Detection Architecture

Multidimensional Data Processing:

Client-server computing or networking is a distributed application architecture that partitions tasks or workloads between service providers (servers) and service requesters, called clients. A server machine is a high-performance host that is running one or more server programs which share its resources with clients. A client also shares any of its resources; Clients therefore initiate communication sessions with servers which await (listen to) incoming requests.

HsMM:

To propose a distance-vector based algorithm for dynamic routing to improve the security of data transmission. We propose to rely on existing distance information exchanged among neighboring nodes (referred to as routers as well in this paper) for the seeking of routing paths. In many distance-vector-based implementations, e.g., those based on RIP, each node maintains a routing table in which each entry is associated with a tuple, and Next hop denote some unique destination node, an estimated minimal cost to send a packet to t.

Self-Adaptive SchemeRouting :
In order to minimize the probability that packets are eavesdropped over a specific link, a randomization process for packet deliveries, in this process, the previous next-hop for the source node s is identified in the first step of the process. Then, the process randomly picks up a neighboring node as the next hop for the current packet transmission. The exclusion for the next hop selection avoids transmitting two consecutive packets in the same link, and the randomized pickup prevents attackers from easily predicting routing paths for the coming transmitted packets.

Detection Architecture :

In the network be given a routing table and a link table. We assume that the link table of each node is constructed by an existing link discovery protocol, such as the Hello protocol in. On the other hand, the construction and maintenance of routing tables are revised based on the well-known Bellman-Ford algorithm.

CONCLUSION

Creating defenses for attacks requires monitoring dynamic network activities in order to obtain timely and signification information. While most current effort focuses on detecting Net-DDoS attacks with stable background traffic, we proposed a detection architecture in this paper aiming at monitoring Web traffic in order to reveal dynamic shifts in normal burst traffic, which might signal onset of App-DDoS attacks during the flash crowd event. Our method reveals early attacks merely depending on the document popularity obtained from the server log. The propos- -ed method is based on PCA, ICA, and HsMM. We conducted the experiment with different App-DDoS attack modes (i.e., constant rate attacks, increasing rate attacks and stochastic pulsi- -ng attack) during a flash crowd event collected from a real trace. Our simulation results how that the system could capture the shift of Web traffic caused by attacks under the flash crowd and the entropy of the observed data fitting to the HsMM can be used as the measure of abnorm- -ality. It also demonstrates that the proposed architecture is expected to be practical in monitoring App-DDoS attacks and in triggering more dedicated detection on victim network.

You might also like