HOP-BY-HOP MESSAGE AUTHENTICATION AND SOURCE PRIVACY IN WIRELESS SENSOR NETWORKS
By
A
PROJECT REPORT
Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree
Of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
APRIL 2015
BONAFIDE CERTIFICATE
Certified that this project report titled “HOP-BY-HOP MESSAGE AUTHENTICATION AND SOURCE PRIVACY IN WIRELESS SENSOR NETWORKS” is the bonafide work of Mr. _____________ who carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
Signature of the Guide Signature of the H.O.D
Name Name
CHAPTER 1
ABSTRACT:
Message authentication is one of the most effective ways to thwart unauthorized and corrupted messages from being forwarded in wireless sensor networks (WSNs). For this reason, many message authentication schemes have been developed, based on either symmetric-key cryptosystems or public-key cryptosystems. Most of them, however, have the limitations of high computational and communication overhead in addition to lack of scalability and resilience to node compromise attacks. To address these issues, a polynomial-based scheme was recently introduced. However, this scheme and its extensions all have the weakness of a built-in threshold determined by the degree of the polynomial: when the number of messages transmitted is larger than this threshold, the adversary can fully recover the polynomial.
In this paper, we propose a scalable authentication scheme based on elliptic curve cryptography (ECC). While enabling intermediate nodes authentication, our proposed scheme allows any node to transmit an unlimited number of messages without suffering the threshold problem. In addition, our scheme can also provide message source privacy. Both theoretical analysis and simulation results demonstrate that our proposed scheme is more efficient than the polynomial-based approach in terms of computational and communication overhead under comparable security levels while providing message source privacy.
INTRODUCTION:
Message authentication plays a key role in thwarting unauthorized and corrupted messages from being forwarded in networks, saving precious sensor energy. For this reason, many authentication schemes have been proposed in the literature to provide message authenticity and integrity verification for wireless sensor networks (WSNs). These schemes can largely be divided into two categories: public-key based approaches and symmetric-key based approaches. The symmetric-key based approach requires complex key management, lacks scalability, and is not resilient to large numbers of node compromise attacks, since the message sender and the receiver have to share a secret key. The shared key is used by the sender to generate a message authentication code (MAC) for each transmitted message. However, for this method, the authenticity and integrity of the message can only be verified by a node holding the shared secret key, which is generally shared by a group of sensor nodes. An intruder can compromise the key by capturing a single sensor node.
In addition, this method does not work in multicast networks. To solve the scalability problem, a secret polynomial-based message authentication scheme was introduced. The idea of this scheme is similar to threshold secret sharing, where the threshold is determined by the degree of the polynomial. This approach offers information-theoretic security of the shared secret key when the number of messages transmitted is less than the threshold. The intermediate nodes verify the authenticity of the message through a polynomial evaluation. However, when the number of messages transmitted is larger than the threshold, the polynomial can be fully recovered and the system is completely broken. An alternative solution was proposed to thwart the intruder from recovering the polynomial by computing its coefficients. The idea is to add a random noise, also called a perturbation factor, to the polynomial so that the coefficients of the polynomial cannot be easily solved. However, a recent study shows that the random noise can be completely removed from the polynomial using error-correcting code techniques. For the public-key based approach, each message is transmitted along with the digital signature of the message generated using the sender's private key. Every intermediate forwarder and the final receiver can authenticate the message using the sender's public key. One of the limitations of the public-key based scheme is the high computational overhead. However, the recent progress on elliptic curve cryptography (ECC) shows that public-key schemes can be more advantageous in terms of computational complexity, memory usage, and security resilience, since public-key based approaches have simple and clean key management.
In this paper, we propose an unconditionally secure and efficient source anonymous message authentication (SAMA) scheme based on the optimal modified ElGamal signature (MES) scheme on elliptic curves. This MES scheme is secure against adaptive chosen-message attacks in the random oracle model. Our scheme enables the intermediate nodes to authenticate the message so that all corrupted messages can be detected and dropped to conserve sensor power. While achieving compromise resiliency, flexible-time authentication, and source identity protection, our scheme does not have the threshold problem. Both theoretical analysis and simulation results demonstrate that our proposed scheme is more efficient than the polynomial-based algorithms under comparable security levels.
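To make the public-key verification step concrete, the following sketch uses the standard java.security API (available in JDK 1.7, the front end chosen for this project) to sign and verify a message with an elliptic curve key pair. Note that this is plain ECDSA, shown only as an illustrative stand-in for the paper's MES/SAMA construction; it does not provide source anonymity, and the message string and key size are assumptions.
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
public class EccAuthSketch {
    public static void main(String[] args) throws Exception {
        // Elliptic curve key pair; stands in for the sender's signing key in the paper.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
        kpg.initialize(256); // roughly comparable to the security levels discussed later
        KeyPair keys = kpg.generateKeyPair();
        byte[] message = "sensor reading: 23.5C".getBytes("UTF-8"); // hypothetical payload
        // Sending node: sign the message with the private key.
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(keys.getPrivate());
        signer.update(message);
        byte[] signature = signer.sign();
        // Any intermediate forwarder or the sink: verify with the sender's public key.
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(keys.getPublic());
        verifier.update(message);
        System.out.println("message authentic: " + verifier.verify(signature));
    }
}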
The major contributions of this paper are the following:
1. We develop a source anonymous message authentication code (SAMAC) on elliptic curves that can provide unconditional source anonymity.
2. We offer an efficient hop-by-hop message authentication mechanism for WSNs without the threshold limitation.
3. We devise network implementation criteria on source node privacy protection in WSNs.
4. We propose an efficient key management framework to ensure isolation of the compromised nodes.
5. We provide extensive simulation results under ns-2 and TelosB on multiple security levels.
To the best of our knowledge, this is the first scheme that provides hop-by-hop node authentication without the threshold limitation, and has performance better than the symmetric-key based schemes.
The distributed nature of our algorithm makes the scheme suitable for decentralized networks. The remainder of this paper is organized as follows: Section 2 presents the terminology and the preliminaries that will be used in this paper. Section 3 discusses the related work, with a focus on polynomial-based schemes. Section 4 describes the proposed source anonymous message authentication scheme on elliptic curves. Section 5 discusses the ambiguity set (AS) selection strategies for source privacy. Section 6 describes key management and compromised node detection. Performance analysis and simulation results are provided in Section 7. We conclude in Section 8. In our network model, sensor nodes deliver their messages to the sink node through multi-hop communications. We assume there is a security server (SS) that is responsible for the generation, storage, and distribution of the security parameters in the network.
This server will never be compromised. However, after deployment, the sensor nodes may be captured and compromised by attackers. Once compromised, all information stored in the sensor nodes can be accessed by the attackers. The compromised nodes can be reprogrammed and fully controlled by the attackers. However, the compromised nodes will not be able to create new public keys that can be accepted by the SS and other nodes. Based on the above assumptions, this paper considers two types of attacks launched by the adversaries: Passive attacks. Through passive attacks, the adversaries could eavesdrop on messages transmitted in the network and perform traffic analysis. Active attacks. Active attacks can only be launched from the compromised sensor nodes. Once the sensor nodes are compromised, the adversaries will obtain all the information stored in the compromised nodes, including the security parameters of the compromised nodes. The adversaries can modify the contents of the messages, and inject their own messages.
LITERATURE SURVEY:
ATTACKING CRYPTOGRAPHIC SCHEMES BASED ON ‘PERTURBATION POLYNOMIALS’
AUTHOR: M. Albrecht, C. Gentry, S. Halevi, and J. Katz,
PUBLISH: Report 2009/098, http://eprint.iacr.org/, 2009.
We show attacks on several cryptographic schemes that have recently been proposed for achieving various security goals in sensor networks. Roughly speaking, these schemes all use “perturbation polynomials” to add “noise” to polynomial-based systems that offer information-theoretic security, in an attempt to increase the resilience threshold while maintaining efficiency. We show that the heuristic security arguments given for these modified schemes do not hold, and that they can be completely broken once we allow even a slight extension of the parameters beyond those achieved by the underlying information-theoretic schemes. Our attacks apply to the key predistribution scheme of Zhang et al. (MobiHoc 2007), the access-control schemes of Subramanian et al. (PerCom 2007), and the authentication schemes of Zhang et al. (INFOCOM 2008).
CRYPTOGRAPHIC KEY LENGTH RECOMMENDATION
PUBLISH: http://www.keylength.com/en/3/, 2013.
In most cryptographic functions, the key length is an important security parameter. Both academic and private organizations provide recommendations and mathematical formulas to approximate the minimum key size requirement for security. Despite the availability of these publications, choosing an appropriate key size to protect your system from attacks remains a headache, as you need to read and understand all these papers.
This web site implements mathematical formulas and summarizes reports from well-known organizations, allowing you to quickly evaluate the minimum security requirements for your system. You can also easily compare all these techniques and find the appropriate key length for your desired level of protection. The lengths provided here are designed to resist mathematical attacks; they do not take algorithmic attacks, hardware flaws, etc. into account.
LIGHTWEIGHT AND COMPROMISE-RESILIENT MESSAGE AUTHENTICATION IN SENSOR NETWORKS
AUTHOR: W. Zhang, N. Subramanian, and G. Wang
PUBLISH: Proc. IEEE INFOCOM, Apr. 2008.
Numerous authentication schemes have been proposed in the past for protecting communication authenticity and integrity in wireless sensor networks. Most of them, however, have the following limitations: high computation or communication overhead, no resilience to a large number of node compromises, delayed authentication, lack of scalability, etc. To address these issues, we propose in this paper a novel message authentication approach which adopts a perturbed polynomial-based technique to simultaneously accomplish the goals of lightweight authentication, resilience to a large number of node compromises, immediate authentication, scalability, and non-repudiation. Extensive analysis and experiments have also been conducted to evaluate the scheme in terms of security properties and system overhead.
COMPARING SYMMETRIC-KEY AND PUBLIC-KEY BASED SECURITY SCHEMES IN SENSOR NETWORKS: A CASE STUDY OF USER ACCESS CONTROL
AUTHOR: H. Wang, S. Sheng, C. Tan, and Q. Li
PUBLISH: Proc. IEEE 28th Int’l Conf. Distributed Computing Systems (ICDCS), pp. 11-18, 2008.
While symmetric-key schemes are efficient in processing time for sensor networks, they generally require complicated key management, which may introduce large memory and communication overhead. On the contrary, public-key based schemes have simple and clean key management, but cost more computational time. The recent progress of elliptic curve cryptography (ECC) implementation on sensors motivates us to design a public-key scheme and compare its performance with the symmetric-key counterparts. This paper builds the user access control on commercial off-the-shelf sensor devices as a case study to show that the public-key scheme can be more advantageous in terms of the memory usage, message complexity, and security resilience. Meanwhile, our work also provides insights in integrating and designing public-key based security protocols for sensor networks.
CHAPTER 2
EXISTING SYSTEM:
- In the public-key based approach, each message is transmitted along with the digital signature of the message generated using the sender’s private key. Every intermediate forwarder and the final receiver can authenticate the message using the sender’s public key. One of the limitations of the public-key based scheme is the high computational overhead.
- Recent progress on elliptic curve cryptography (ECC) suggests that public-key schemes can become more advantageous in terms of computational complexity, memory usage, and security resilience, since public-key based approaches have simple and clean key management.
DISADVANTAGES OF EXISTING SYSTEM:
- High computational and communication overhead.
- Lack of scalability and resilience to node compromise attacks.
- Polynomial-based schemes have the weakness of a built-in threshold determined by the degree of the polynomial.
PROPOSED SYSTEM:
- We propose an unconditionally secure and efficient SAMA. The main idea is that for each message m to be released, the message sender, or the sending node, generates a source anonymous message authenticator for the message m.
- The generation is based on the MES scheme on elliptic curves. For a ring signature, each ring member is required to compute a forgery signature for all other members in the AS.
- In our scheme, the entire SAMA generation requires only three steps, which link all non-senders and the message sender to the SAMA alike. In addition, our design enables the SAMA to be verified through a single equation without individually verifying the signatures.
ADVANTAGES OF PROPOSED SYSTEM:
- A novel and efficient SAMA based on ECC. While ensuring message sender privacy, SAMA can be applied to any message to provide message content authenticity.
- To provide hop-by-hop message authentication without the weakness of the built- in threshold of the polynomial-based scheme, we then proposed a hop-by-hop message authentication scheme based on the SAMA.
- When applied to WSNs with fixed sink nodes, we also discussed possible techniques for compromised node identification.
HARDWARE & SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
SOFTWARE REQUIREMENTS:
- Operating System : Windows XP or Win7
- Front End : Java JDK 1.7
- Tools : Eclipse
- Document : MS-Office 2007
CHAPTER 3
SYSTEM DESIGN:
Data Flow Diagram / Use Case Diagram / Flow Diagram:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- DFD is also known as bubble chart. A DFD may be used to represent a system at any level of abstraction. DFD may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures or devices that produce data. The physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
MODELING RULES:
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
ARCHITECTURE DIAGRAM:
DATAFLOW DIAGRAM:
UML DIAGRAMS:
USE CASE DIAGRAM:
Client
Server
CLASS DIAGRAM:
SEQUENCE DIAGRAM:
Files Transmitted
ECC Encrypted Data
SAMA Verification
Checking for key
Data Received
ACTIVITY DIAGRAM:
CHAPTER 4
4.0 IMPLEMENTATION:
Privacy is sometimes referred to as anonymity. Communication anonymity in information management has been discussed in a number of previous works; it generally refers to the state of being unidentifiable within a set of subjects. This set is called the AS. Sender anonymity means that a particular message is not linkable to any sender, and no message is linkable to a particular sender. We will start with the definition of the unconditionally secure SAMA.
4.1 ALGORITHM:
4.3 MODULES:
SERVER CLIENT MODULE:
KEY MANAGEMENT AND NODE DETECTION
SYMMETRIC-KEY CRYPTOSYSTEM
PUBLIC-KEY CRYPTOSYSTEM
HOP-BY-HOP AUTHENTICATION
4.4 MODULES DESCRIPTION:
SERVER CLIENT MODULE:
Client-server computing or networking is a distributed application architecture that partitions tasks or workloads between service providers (servers) and service requesters, called clients. Often clients and servers operate over a computer network on separate hardware. A server machine is a high-performance host that runs one or more server programs which share its resources with clients. A client does not share any of its resources; clients therefore initiate communication sessions with servers, which await (listen for) incoming requests.
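As a minimal sketch of this client-server pattern in Java (the project's front end), the server below listens on a hypothetical port 5000 and acknowledges each request; the port and message format are assumptions for illustration only.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
public class MiniServer {
    public static void main(String[] args) throws IOException {
        // The server shares its service on port 5000 and waits for incoming requests.
        try (ServerSocket server = new ServerSocket(5000)) {
            while (true) {
                // Each client initiates a session; the server answers and closes it.
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    out.println("ACK: " + in.readLine());
                }
            }
        }
    }
}
A matching client would simply open new Socket("localhost", 5000), write a line, and read the acknowledgement.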
KEY MANAGEMENT AND NODE DETECTION:
Recently, message sender anonymity based on ring signatures was introduced. This approach enables the message sender to generate a source anonymous message signature with content authenticity assurance. To generate a ring signature, a ring member randomly selects an AS and forges a message signature for all other members. Then he uses his trap-door information to glue the ring together. The original scheme has very limited flexibility and very high complexity.
We focus on the public-key based approach: each message is transmitted along with the digital signature of the message generated using the sender’s private key. Every intermediate forwarder and the final receiver can authenticate the message using the sender’s public key. The recent progress on ECC shows that public-key schemes can be more advantageous in terms of memory usage, message complexity, and security resilience, since public-key based approaches have simple and clean key management.
SYMMETRIC-KEY CRYPTOSYSTEM:
Symmetric-key and hash-based authentication schemes have been proposed for WSNs. In these schemes, each symmetric authentication key is shared by a group of sensor nodes, so an intruder can compromise the key by capturing a single sensor node. Therefore, these schemes are not resilient to node compromise attacks. Another type of symmetric-key scheme requires synchronization among nodes. These schemes, including TESLA and its variants, can also provide message sender authentication. However, they require initial time synchronization, which is not easy to implement in large-scale WSNs. In addition, they introduce delay in message authentication, and the delay increases as the network scales up.
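The node-compromise problem described above can be seen in a short sketch using the standard javax.crypto API: every node that knows the shared group key can compute exactly the same MAC, so capturing any one of them lets an attacker forge valid tags. The key and message literals are assumptions for illustration.
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
public class SharedKeyMacSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical group key shared by a set of sensor nodes.
        byte[] sharedKey = "group-key-known-to-all-nodes".getBytes("UTF-8");
        byte[] message = "sensor reading: 23.5C".getBytes("UTF-8");
        // Sender: compute the MAC that travels with the message.
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedKey, "HmacSHA256"));
        byte[] tag = mac.doFinal(message);
        // Verifier: any node holding sharedKey recomputes the tag and compares it.
        // Compromising a single node that stores sharedKey is enough to forge tags.
        Mac check = Mac.getInstance("HmacSHA256");
        check.init(new SecretKeySpec(sharedKey, "HmacSHA256"));
        System.out.println("tag valid: " + java.util.Arrays.equals(tag, check.doFinal(message)));
    }
}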
PUBLIC-KEY CRYPTOSYSTEM:
A secret polynomial-based message authentication scheme offers information-theoretic security with ideas similar to threshold secret sharing, where the threshold is determined by the degree of the polynomial. When the number of messages transmitted is below the threshold, the scheme enables the intermediate node to verify the authenticity of the message through polynomial evaluation. However, when the number of messages transmitted is larger than the threshold, the polynomial can be fully recovered and the system is completely broken. To increase the threshold and the complexity for the intruder to reconstruct the secret polynomial, a random noise, also called a perturbation factor, was added to the polynomial to thwart the adversary from computing the coefficients of the polynomial.
However, the added perturbation factor can be completely removed using error-correcting code techniques. In the public-key based approach, each message is transmitted along with the digital signature of the message generated using the sender’s private key. Every intermediate forwarder and the final receiver can authenticate the message using the sender’s public key. The recent progress on ECC shows that public-key schemes can be more advantageous in terms of memory usage, message complexity, and security resilience, since public-key based approaches have simple and clean key management.
HOP-BY-HOP AUTHENTICATION:
Hop-by-hop authentication can be achieved through a public-key cryptosystem. Public-key based schemes were generally considered not preferable, mainly due to their high computational overhead. However, our research demonstrates that this is not always true, especially for elliptic curve public-key cryptosystems.
In our scheme, each SAMA contains an AS of n randomly selected nodes that dynamically changes for each message. For n = 1, our scheme can provide at least the same security as the bivariate polynomial-based scheme. For n > 1, we can provide extra source privacy benefits. Even if one message is corrupted, other messages transmitted in the network can still be secure.
Therefore, n can be much smaller than the parameters dx and dy. In fact, even a small n may provide adequate source privacy, while ensuring high system performance. In addition, in the bivariate polynomial-based scheme, there is only one base station that can send messages. All the other nodes can only act as intermediate nodes or receivers. This property makes the base station easy to attack, and severely narrows the applicability of this scheme. In fact, the major traffic in WSNs is packet delivery from the sensor nodes to the sink node.
Our scheme enables every node to transmit the message to the sink node as a message initiator.
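The non-cryptographic part of this idea, choosing a fresh ambiguity set of n node IDs for every message so that the real sender is hidden among them, can be sketched as follows. This is only an illustration of the selection step; the actual SAMA construction that binds the AS to the signature is the cryptographic scheme described above, and the method and parameter names here are assumptions.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
public class AmbiguitySetSketch {
    // Pick n distinct node IDs, including the real sender, as the AS for one message.
    static List<Integer> chooseAS(int senderId, List<Integer> allNodes, int n, Random rng) {
        List<Integer> candidates = new ArrayList<Integer>(allNodes);
        candidates.remove(Integer.valueOf(senderId));
        Collections.shuffle(candidates, rng);                 // a fresh AS for every message
        List<Integer> as = new ArrayList<Integer>(candidates.subList(0, n - 1));
        as.add(senderId);
        Collections.shuffle(as, rng);                         // hide the sender's position in the AS
        return as;
    }
}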
The recent progress on ECC has demonstrated that the public-key based schemes have more advantages in terms of memory usage, message complexity, and security resilience, since public-key based approaches have a simple and clean key management.
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase and business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. The developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources; otherwise, high demands would be placed on the client. The developed system has modest requirements, as only minimal or no changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
The aspect of this study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but instead must accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system is working according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the goal will be successfully achieved. Inadequate testing, or no testing at all, leads to errors that may not appear until many months later. This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best program is worthless if it does not produce the correct outputs.
5.2.1 UNIT TESTING:
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile and test data correctly and tie in properly with other programs. Achieving an error free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logical. Syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error message generated by the computer. For Logic errors the programmer must examine the output carefully.
UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
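As a sketch of how such unit tests could be automated in Eclipse with JUnit 4 (assumed to be on the project classpath), the example below exercises a small, hypothetical KeyStoreTable helper that maps node IDs to public keys; the class and its behavior are illustrative, not part of the original report.
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNull;
import java.util.HashMap;
import java.util.Map;
import org.junit.Test;
public class KeyStoreUnitTest {
    // Minimal stand-in for the unit under test: a table of node IDs to public keys.
    static class KeyStoreTable {
        private final Map<Integer, String> keys = new HashMap<Integer, String>();
        void put(int nodeId, String key) { keys.put(nodeId, key); }
        String get(int nodeId) { return keys.get(nodeId); }
    }
    @Test
    public void storedKeyCanBeRetrieved() {
        KeyStoreTable table = new KeyStoreTable();
        table.put(42, "public-key-bytes");
        assertEquals("public-key-bytes", table.get(42));
    }
    @Test
    public void unknownNodeReturnsNull() {
        assertNull(new KeyStoreTable().get(7));
    }
}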
5.1.3 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that the personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
FUNCTIONAL TESTING:
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework as it displays all users available in the group. | The result after execution should give the accurate result. |
5.1.4 NON-FUNCTIONAL TESTING:
Non-functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing is used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.1.5 LOAD TESTING:
An important tool for implementing system tests is a load generator. A load generator is essential for testing quality requirements such as performance and stress. A load can be a real load; that is, the system can be put under real usage by having actual telephone users connected to it, who generate test input data for the system test.
LOAD TESTING:
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.1.5 PERFORMANCE TESTING:
Performance tests are utilized in order to determine the widely defined performance of the software system such as execution time associated with various parts of the code, response time and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
PERFORMANCE TESTING:
Description | Expected result |
This is required to assure that an application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values, and produce accurate results in the expected time. |
5.1.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and it is ensured in this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. It forms a part of the software quality control effort.
RELIABILITY TESTING:
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.1.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity and confidentiality of the system data and services. Users/clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
SECURITY TESTING:
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure, it should not be connected to the framework. |
Check whether group keys in a tree are shared by all peers. | The peers should know group key in the same group. |
5.1.7 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases. White box testing focuses on the inner structure of the software to be tested.
5.1.8 WHITE BOX TESTING:
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.1.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or code. The contents of the box are hidden, and the stimulated software should produce the desired results.
5.1.10 BLACK BOX TESTING:
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in a data structures or external data base access. | The database updation and retrieval must be done. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All the above system testing strategies are carried out during development, as the documentation and institutionalization of the proposed goals and related policies are essential.
CHAPTER 6
6.0 SOFTWARE DESCRIPTION:
6.1 JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
- Simple
- Architecture neutral
- Object oriented
- Portable
- Distributed
- High performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
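A minimal example of this compile-once, run-anywhere cycle, with the file and class names chosen here purely for illustration:
// HelloWSN.java -- compiled once into platform-independent byte codes,
// then run by any Java VM (Windows, Solaris, Linux, and so on).
public class HelloWSN {
    public static void main(String[] args) {
        System.out.println("Hop-by-hop authentication demo node starting...");
    }
}
// Compile:  javac HelloWSN.java   (produces HelloWSN.class byte codes)
// Run:      java HelloWSN         (the Java VM interprets the byte codes)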
6.2 THE JAVA PLATFORM:
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
- The Java Virtual Machine (Java VM)
- The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, “What Can Java Technology Do?”, highlights what functionality some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that, after compilation, runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
6.3 WHAT CAN JAVA TECHNOLOGY DO?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet.
A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:
- The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
- Applets: The set of conventions used by applets.
- Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
- Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
- Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
- Software components: Known as JavaBeans™, can plug into existing component architectures.
- Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
- Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
6.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
- Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
- Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
- Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
- Develop programs more quickly: Your development time may be as much as twice as fast as writing the same program in C++. Why? You write fewer lines of code, and Java is a simpler programming language than C++.
- Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure Java™ Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
- Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
- Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
6.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you set up a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program, and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn’t change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn’t as efficient as talking directly to the native database interface. ODBC has had many detractors make the charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to those who said that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.
6.6 JDBC:
In an effort to set an independent database standard API for Java; Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.
6.7 JDBC Goals:
Few software packages are designed without goals in mind, and JDBC is no exception; its many goals drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The eight design goals for JDBC are as follows:
SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This goal allows JDBC to use existing ODBC level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.
- Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.
- Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
- Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time; also, fewer errors appear at runtime.
- Keep the common cases simple
Because, more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
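A sketch of such a common case with JDK 1.7's JDBC API is shown below; it uses the JDBC-ODBC bridge driver that ships with JDK 1.7 (it was removed in Java 8), and the data source name, table, and column names are hypothetical.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
public class SimpleJdbcQuery {
    public static void main(String[] args) throws Exception {
        // "cacheDSN" is a hypothetical ODBC data source configured in the ODBC Administrator.
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
        try (Connection con = DriverManager.getConnection("jdbc:odbc:cacheDSN");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT node_id, status FROM cache_table")) {
            while (rs.next()) {
                System.out.println(rs.getInt("node_id") + " -> " + rs.getString("status"));
            }
        }
    }
}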
Finally, we decided to proceed with the implementation using Java networking.
For dynamically updating the cache table, we use an MS Access database.
Java has two things: a programming language and a platform.
Java is a high-level programming language that is all of the following: simple, architecture-neutral, object-oriented, portable, distributed, high-performance, interpreted, multithreaded, robust, dynamic, and secure.
Java is also unusual in that each Java program is both compiled and interpreted. With a compiler, you translate a Java program into an intermediate language called Java byte codes, the platform-independent code instructions that are passed to and run on the computer.
Compilation happens just once; interpretation occurs each time the program is executed. The figure illustrates how this works.
6.7 NETWORKING TCP/IP STACK:
The TCP/IP stack is shorter than the OSI one:
TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.
IP datagram’s:
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model – see later.
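A minimal Java sketch of this connectionless model: the sender below fires a single datagram at a hypothetical port 9876 with no handshake and no delivery guarantee.
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
public class UdpSendSketch {
    public static void main(String[] args) throws Exception {
        byte[] payload = "hello".getBytes("UTF-8");
        DatagramSocket socket = new DatagramSocket();            // ephemeral local port
        DatagramPacket packet = new DatagramPacket(
                payload, payload.length,
                InetAddress.getByName("localhost"), 9876);       // destination host and port
        socket.send(packet);                                     // no connection, no delivery guarantee
        socket.close();
    }
}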
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32 bit integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class D uses all 32.
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is currently on one sub network and uses 10-bit addressing, allowing 1024 different hosts.
Host address:
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.
Total address:
The 32 bit address is usually written as 4 integers separated by dots.
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server, you send it to the port for that service of the host that it is running on. This is not location transparency! Certain of these ports are “well known”.
Sockets:
A socket is a data structure maintained by the system to handle network connections. A socket is created using the call socket(). It returns an integer that is like a file descriptor. In fact, under Windows, this handle can be used with the ReadFile and WriteFile functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here “family” will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe – but the actual pipe does not yet exist.
6.8 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart’s extensive feature set includes:
- A consistent and well-documented API, supporting a wide range of chart types;
- A flexible design that is easy to extend, and targets both server-side and client-side applications;
- Support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG);
JFreeChart is “open source” or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
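As a sketch of how the simulation results in Chapter 7 could be plotted with JFreeChart (assuming the JFreeChart 1.0.x jar is on the classpath), the example below builds an XY line chart and writes it to a PNG file; the data points, titles, and file name are placeholders, not actual results.
import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.data.xy.XYSeries;
import org.jfree.data.xy.XYSeriesCollection;
public class OverheadChartSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder data points; real values would come from the ns-2 / TelosB runs.
        XYSeries series = new XYSeries("Computational overhead");
        series.add(1, 12.0);
        series.add(2, 18.5);
        series.add(3, 24.0);
        JFreeChart chart = ChartFactory.createXYLineChart(
                "Overhead vs. ambiguity set size", "AS size n", "Overhead (ms)",
                new XYSeriesCollection(series), PlotOrientation.VERTICAL,
                true, false, false);
        ChartUtilities.saveChartAsPNG(new File("overhead.png"), chart, 640, 480);
    }
}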
6.8.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: Sourcing freely redistributable vector outlines for the countries of the world, states/provinces in particular countries (USA in particular, but also other areas);
Creating an appropriate dataset interface (plus default implementation), a renderer, and integrating this with the existing XYPlot class in JFreeChart; testing, documenting, testing some more, documenting some more.
6.8.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts — to display a separate control that shows a small version of ALL the time series data, with a sliding “view” rectangle that allows you to select the subset of the time series data to display in the main chart.
6.8.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) that can be delivered easily via both Java Web Start and an applet.
6.8.4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of the properties that can be set for charts. Extend (or reimplement) this mechanism to provide greater end-user control over the appearance of the charts.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
8.1 CONCLUSION:
In this paper, we first proposed a novel and efficient SAMA based on ECC. While ensuring message sender privacy, SAMA can be applied to any message to provide message content authenticity. To provide hop-by-hop message authentication without the weakness of the built-in threshold of the polynomial-based scheme, we then proposed a hop-by-hop message authentication scheme based on the SAMA. When applied to WSNs with fixed sink nodes, we also discussed possible techniques for compromised node identification.
We compared our proposed scheme with the bivariate polynomial-based scheme through simulations using ns-2 and TelosB. Both theoretical and simulation results show that, in comparable scenarios, our proposed scheme is more efficient than the bivariate polynomial-based scheme in terms of computational overhead, energy consumption, delivery ratio, message delay, and memory consumption.
CHAPTER 9
REFERENCES:
[1] M. Albrecht, C. Gentry, S. Halevi, and J. Katz, “Attacking Cryptographic Schemes Based on ‘Perturbation Polynomials’,” Report 2009/098, http://eprint.iacr.org/, 2009.
[2] “Cryptographic Key Length Recommendation,” http://www.keylength.com/en/3/, 2013.
[3] W. Zhang, N. Subramanian, and G. Wang, “Lightweight and Compromise-Resilient Message Authentication in Sensor Networks,” Proc. IEEE INFOCOM, Apr. 2008.
[4] H. Wang, S. Sheng, C. Tan, and Q. Li, “Comparing Symmetric-Key and Public-Key Based Security Schemes in Sensor Networks: A Case Study of User Access Control,” Proc. IEEE 28th Int’l Conf. Distributed Computing Systems (ICDCS), pp. 11-18, 2008.
[5] Q. Liu, Y. Ge, Z. Li, H. Xiong, and E. Chen, “Personalized Travel Package Recommendation,” Proc. IEEE 11th Int’l Conf. Data Mining (ICDM ’11), pp. 407-416, 2011.
[6] Q. Liu, E. Chen, H. Xiong, C. Ding, and J. Chen, “Enhancing Collaborative Filtering by User Interests Expansion via Personalized Ranking,” IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 1, pp. 218-233, Feb. 2012.
Designing an Architecture for Monitoring Patients at Home Ontologies and Web Services for Clinical and Technical Management Integration
DESIGNING AN ARCHITECTURE FOR MONITORING PATIENTS AT HOME ONTOLOGIES AND WEB SERVICES FOR CLINICAL AND TECHNICAL MANAGEMENT INTEGRATION
By
A
PROJECT REPORT
Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree
Of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
APRIL 2015
CERTIFICATE
Certified that this project report titled “Designing an Architecture for Monitoring Patients at Home Ontologies and Web Services for Clinical and Technical Management Integration” is the bonafide work of Mr. _____________ who carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
Signature of the Guide Signature of the H.O.D
Name Name
DECLARATION
I hereby declare that the project work entitled “Designing an Architecture for Monitoring Patients at Home Ontologies and Web Services for Clinical and Technical Management Integration”, submitted to BHARATHIDASAN UNIVERSITY in partial fulfillment of the requirement for the award of the Degree of MASTER OF SCIENCE IN COMPUTER SCIENCE, is a record of original work done by me under the guidance of Prof. A.Vinayagam M.Sc., M.Phil., M.E. To the best of my knowledge, the work reported here is not part of any other thesis or work on the basis of which a degree or award was conferred on an earlier occasion to me or any other candidate.
(Student Name)
(Reg.No)
Place:
Date:
ACKNOWLEDGEMENT
I am extremely glad to present my project “Designing an Architecture for Monitoring Patients at Home Ontologies and Web Services for Clinical and Technical Management Integration” which is a part of my curriculum of third semester Master of Science in Computer science. I take this opportunity to express my sincere gratitude to those who helped me in bringing out this project work.
I would like to express my sincere thanks to my Director, Dr. K. ANANDAN, M.A.(Eco.), M.Ed., M.Phil.,(Edn.), PGDCA., CGT., M.A.(Psy.), who had given me an opportunity to undertake this project.
I am highly indebted to the Co-ordinator, Prof. Muniappan, Department of Physics, and thank her from the bottom of my heart for the valuable comments I received throughout my project.
I wish to express my deep sense of gratitude to my guide, Prof. A.Vinayagam M.Sc., M.Phil., M.E., for her immense help and encouragement towards the successful completion of this project.
I also express my sincere thanks to all the staff members of Computer Science for their kind advice.
And last, but not least, I express my deep gratitude to my parents and friends for their encouragement and support throughout the project.
CHAPTER 1
1.1 ABSTRACT:
This paper presents the design and implementation of an architecture based on the combination of ontologies, rules, web services, and the autonomic computing paradigm to manage data in home-based telemonitoring scenarios.
The architecture includes two layers: 1) a conceptual layer and 2) a data and communication layer. On the one hand, the conceptual layer based on ontologies is proposed to unify the management procedure and integrate incoming data from all the sources involved in the telemonitoring process. On the other hand, the data and communication layer based on REST web service (WS) technologies is proposed to provide practical backup to the use of the ontology, to provide a real implementation of the tasks it describes and thus to provide a means of exchanging data (support communication tasks).
A case study regarding chronic obstructive pulmonary disease (COPD) data management is presented in order to evaluate the efficiency of the architecture. The proposed ontology-based solution defines a flexible and scalable architecture that addresses the main challenges presented in home-based telemonitoring scenarios and thus provides a means to integrate, unify, and transfer data, supporting both clinical and technical management tasks.
1.2 INTRODUCTION
Patient empowerment is considered a philosophy of health care based on the perspective that better outcomes are achieved when patients become active participants in their own health management. This new paradigm is a central idea in the European Union (EU) health strategy, supported by international health organizations including the World Health Organization, and its effectiveness in yielding quality of care is an obvious and essential area of research. This new idea invites us to look for new ways of providing healthcare, e.g., by using information and communications technologies. In this context, home-based telemonitoring systems can be used as self-care management tools, while collaborative processes among healthcare personnel and patients are maintained, thus guaranteeing safe control of the patient. Telemonitoring systems face the problem of delivering medicine to the current growing population with chronic conditions while at the same time covering the dimensions of quality of care and supporting new paradigms such as empowerment.
By periodically collecting patients’ clinical data at their home sites and transferring them to physicians located at remote sites, supervision of the patient’s health status and provision of feedback are possible. This type of telemedicine system guarantees patient control while reducing costs and avoiding hospital overflows. These two sites (the home site and the healthcare site) comprise a typical home-based telemonitoring system. At the home site, data acquired by using MDs, together with the patient’s feedback, are collected in a concentrator device (HG) used to evaluate and/or transfer the acquired data outside the patient’s home if necessary. At the healthcare site, a server device is used to manage information from the home site as well as to manage and store the patient’s monitoring guidelines defined by physicians (TS, telemonitoring server). In fact, this telemonitoring process, and consequently the evolution of the patient’s health status, is managed through the indications or monitoring guidelines provided by physicians.
Although significant contributions have been made in this field in recent decades, telemedicine, and e-health scenarios in general, still pose numerous challenges that need to be addressed by researchers in order to take maximum advantage of the benefits that these systems provide and to support their long-term implementation. Interoperability and integration are critical challenges that also need to be addressed when developing monitoring systems in order to provide effective healthcare and to make possible seamless communication among the different heterogeneous health entities that participate in the monitoring process. This integration should be addressed not only at both end sites of the scenario but also in the communication link, thus integrating the way of transferring and exchanging information efficiently between them.
Providing personalized care services and taking into account the patient’s context have been identified as additional requirements. Furthermore, apart from clinical data aspects, technical issues should also be addressed in this scenario. Technical management of all the devices that comprise the telemonitoring scenario (e.g., the MDs and HG) is an important task that may or may not be integrated under the same architecture as clinical management. Hence, at this technical level, research is still required to address these challenges. Consequently, there is a need for the development of new telemonitoring architectures.
Great efforts have been made in recent years in developing standards to deal with interoperability at different points of the e-health communication infrastructure, such as the ISO/IEEE 11073 (X73) standard for MD interoperability, the OpenEHR initiative for storage, management and retrieval of electronic health record (EHR) information, or the standardized Health Level Seven (HL7) messages to solve clinical data transfers. Nevertheless, additional efforts are required to enable them to work together and ultimately provide a higher level of integration.
Specifically, in this telemonitoring scenario, there is no unique standard-based solution to address data and management integration. Since several standards can be used (some of them in combination with proprietary protocols or other standards) at different points of this scenario, the interoperability problem remains unsolved unless these standards merge into one, or alignments and combinations of them are made. According to Berges et al., interoperability does not mean having a unique representation but a semantically acknowledged equivalent one. That is the reason to propose in this study an ontology-based architecture in order to provide common knowledge about the exchanged data and the management of such data. This ontology constitutes that semantically equivalent knowledge model. Then, at both ends of the architecture, other standards could be used for other management purposes, relating this model to the specific desired approach. Using this alternative, a knowledge model is first provided that avoids aligning models two by two, while all of them remain related through the main ontology.
Ontology-based solutions have become popular over the past few years. Ontologies provide a higher level of abstraction and have been successfully used in telemonitoring scenarios and other areas to provide knowledge representation and semantic integration, and thus a common understanding of the data exchanged by all the entities. Furthermore, their combination with rules makes it possible to provide personalized management services and thus personalized care. Although there are works that describe the details of an ontology approach in this domain, they do not devote much attention to the architecture implementation and the communication used to exchange the information described. Consequently, few works have given details about this practical implementation of the ontology-based system, which may be of interest for the development of other ontology-based applications in and outside the e-health domain.
This paper presents an ontology-driven architecture to integrate data management and enable its communication in a telemonitoring scenario. The proposed architecture includes two layers: the conceptual layer (the ontology) and the communication and data layer. The conceptual layer uses the HOTMES ontology and its extensions introduced previously. Specifically, the OWL-DL language was selected to define this ontology model. The second layer is based on WS technologies. WSs have been successfully used in network management and also in other works to exchange data modeled by ontologies. However, our proposal, inspired by the representational state transfer (REST) style and based on a generic communication method, provides a different design approach that may be reusable for other systems based on ontologies. Furthermore, security issues have been considered. The aim is to define a flexible and scalable architecture in order to address the main challenges presented in home-based telemonitoring scenarios and thus provide a means to integrate and transfer data supporting both clinical and technical data management.
1.3 LITERATURE SURVEY
AUTHOR AND PUBLICATION: J. D. Trigo, I. Martínez, A. Alesanco, A. Kollmann, J. Escayola, D. Hayn, G. Schreier, and J. García, “AN INTEGRATED HEALTHCARE INFORMATION SYSTEM FOR END-TO-END STANDARDIZED EXCHANGE AND HOMOGENEOUS MANAGEMENT OF DIGITAL ECG FORMATS,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 4, pp. 518–529, Jul. 2012.
EXPLANATION:
This paper investigates the application of the enterprise information system (EIS) paradigm to standardized cardiovascular condition monitoring. There are many specifications in cardiology, particularly in the ECG standardization arena. The existence of ECG formats, however, does not guarantee the implementation of homogeneous, standardized solutions for ECG management. In fact, hospital management services need to cope with various ECG formats and, moreover, several different visualization applications. This heterogeneity hampers the normalization of integrated, standardized healthcare information systems, hence the need for finding an appropriate combination of ECG formats and suitable EIS-based software architecture that enables standardized exchange and homogeneous management of ECG formats. Determining such a combination is one objective of this paper.
We develop the integrated healthcare information system that satisfies the requirements posed by the previous determination. The ECG formats selected include ISO/IEEE11073, Standard Communications Protocol for Computer-Assisted Electrocardiography, and an ECG ontology. The EIS-enabling techniques and technologies selected include web services, simple object access protocol, extensible markup language, or business process execution language. Such a selection ensures the standardized exchange of ECGs within, or across, healthcare information systems while providing modularity and accessibility.
AUTHOR AND PUBLICATION: D. Riaño, F. Real, J. A. López-Vallverdú, F. Campana, S. Ercolani, P. Mecocci, R. Annicchiarico, and C. Caltagirone, “AN ONTOLOGY-BASED PERSONALIZATION OF HEALTH-CARE KNOWLEDGE TO SUPPORT CLINICAL DECISIONS FOR CHRONICALLY ILL PATIENTS,” J. Biomed. Informat., vol. 45, no. 3, pp. 429–446, 2012.
EXPLANATION:
Chronically ill patients are complex health care cases that require the coordinated interaction of multiple professionals. Correct intervention for this sort of patient entails the accurate analysis of the conditions of each concrete patient and the adaptation of evidence-based standard intervention plans to these conditions. There are some other clinical circumstances, such as wrong diagnoses, unobserved comorbidities, missing information, unobserved related diseases or prevention, whose detection depends on the capacities of deduction of the professionals involved. In this paper, we introduce an ontology for the care of chronically ill patients and implement two personalization processes and a decision support tool. The first personalization process adapts the contents of the ontology to the particularities observed in the health-care record of a given concrete patient, automatically providing a personalized ontology containing only the clinical information that is relevant for health-care professionals to manage that patient. The second personalization process uses the personalized ontology of a patient to automatically transform intervention plans describing health-care general treatments into individual intervention plans. For comorbid patients, this process concludes with the semi-automatic integration of several individual plans into a single personalized plan. Finally, the ontology is also used as the knowledge base of a decision support tool that helps health-care professionals to detect anomalous circumstances such as wrong diagnoses, unobserved comorbidities, missing information, unobserved related diseases, or preventive actions. Seven health-care centers participating in the K4CARE project, together with the group SAGESA and the Local Health System in the town of Pollenza, have served as the validation platform for these two processes and the tool. Health-care professionals participating in the evaluation agree on the average quality, 84% (5.9/7.0), and utility, 90% (6.3/7.0), of the tools and also on the correct reasoning of the decision support tool, according to clinical standards.
AUTHOR AND PUBLICATION: I. Berges, J. Bermudez, and A. Illarramendi, “TOWARDS SEMANTIC INTEROPERABILITY OF ELECTRONIC HEALTH RECORDS,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 3, pp. 424–431, May 2012.
EXPLANATION:
Although the goal of achieving semantic interoperability of electronic health records (EHRs) is pursued by many researchers, it has not been accomplished yet. In this paper, we present a proposal that smoothes out the way toward the achievement of that goal. In particular, our study focuses on medical diagnoses statements. In summary, the main contributions of our ontology-based proposal are the following: first, it includes a canonical ontology whose EHR-related terms focus on semantic aspects. As a result, their descriptions are independent of languages and technology aspects used in different organizations to represent EHRs. Moreover, those terms are related to their corresponding codes in well-known medical terminologies. Second, it deals with modules that allow obtaining rich ontological representations of EHR information managed by proprietary models of health information systems. The features of one specific module are shown as reference. Third, it considers the necessary mapping axioms between ontological terms enhanced with so-called path mappings. This feature smoothes out structural differences between heterogeneous EHR representations, allowing proper alignment of information.
AUTHOR AND PUBLICATION: N. Lasierra, A. Alesanco, J. García, and D. O’Sullivan, “DATA MANAGEMENT IN HOME SCENARIOS USING AN AUTONOMIC ONTOLOGY-BASED APPROACH,” in Proc. of the 9th IEEE Int. Conf. Pervasive Workshop on Manag. Ubiquitous Commun. Services part of PerCom, 2012, pp. 94–99.
EXPLANATION:
An ontology-based approach to deal with data and management procedure integration in home-based scenarios is presented in this paper. The proposed ontology not only provides a means to represent exchanged data but also to unify the way of accessing, controlling, evaluating and transferring information remotely. The structure of this ontology has been inspired by the autonomic computing paradigm, thus it describes the tasks that comprise the MAPE (Monitor, Analyze, Plan and Execute) process. Furthermore, the use of SPARQL (Simple Protocol and RDF Query Language) is proposed in this paper to express conditions and rules that determine the performance of these tasks according to each situation. Finally, two practical application cases of the proposed ontology-based approach are presented.
CHAPTER 2
2.0 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM:
Telemonitoring systems face the problem of delivering medicine to the current growing population with chronic conditions while at the same time covering the dimensions of quality of care and supporting new paradigms such as empowerment. By periodically collecting patients’ clinical data at their home sites and transferring them to physicians located at remote sites, supervision of the patient’s health status and provision of feedback are possible.
This type of telemedicine system guarantees patient control while reducing costs and avoiding hospital overflows. These two sites (home site and healthcare site) comprise a typical home-based telemonitoring system. At the home site, data acquired by using MDs together with the patient’s feedback are collected in a concentrator device (HG) used to evaluate and/or transfer the acquired data outside the patient’s home if necessary.
2.1.1 DISADVANTAGES:
- Existing models for chronic diseases pose several technology-oriented challenges for home-based care, where assistance services rely on a close collaboration among different stakeholders, such as health operators, patient relatives, and social community members.
- An ontology-based context model and a related context management system providing a configurable and extensible service-oriented framework to ease the development of applications for monitoring and handling patient chronic conditions.
- The system has been developed in a prototypal version, and integrated with a service platform for supporting operators of home-based care networks in cooperating and sharing patient-related information and coordinating mutual interventions for handling critical and alarm situations.
2.2 PROPOSED SYSTEM:
We present an ontology-driven architecture to integrate data management and enable its communication in a telemonitoring scenario. It enables not only the integration of the patient’s clinical data management but also the technical data management of all devices that are included in the scenario. The proposed architecture includes two layers: the conceptual layer (the ontology) and the communication and data layer.
The conceptual layer uses the HOTMES ontology and its extensions; specifically, the OWL-DL language was selected to define this ontology model. The second layer is based on WS technologies. WSs have been successfully used in network management and also in other works to exchange data modeled by ontologies. Our proposal, inspired by the representational state transfer (REST) style and based on a generic communication method, provides a different design approach that may be reusable for other systems based on ontologies.
Furthermore, security issues have been considered. The aim is to define a flexible and scalable architecture in order to address main challenges presented in home-based telemonitoring scenarios and thus provide a means to integrate and transfer data supporting both clinical and technical data management.
2.2.1 ADVANTAGES:
Ontologies provide a higher level of abstraction and have been successfully used in telemonitoring scenarios and other areas to provide knowledge representation and semantic integration, and thus a common understanding of the data exchanged by all the entities. Furthermore, their combination with rules makes it possible to provide personalized management services and thus personalized care.
Unlike previous works that describe the details of an ontology approach in this domain but do not devote much attention to the architecture implementation and the communication used to exchange the information, we describe this practical implementation in detail.
Our implementation of the ontology-based system may be of interest for the development of other ontology-based applications in and outside the e-health domain. The conceptual layer provides the ontology for interpreting the data transferred in the communication between the end sources of the architecture, while the data and communication layer deals with data management and transmission.
2.3 HARDWARE & SOFTWARE REQUIREMENTS:
2.3.1 HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
2.3.2 SOFTWARE REQUIREMENTS:
- Operating System : Windows XP or Win7
- Front End : JAVA JDK 1.7
- Back End : MYSQL Server
- Server : Apache Tomcat Server
- Script : JSP Script
- Document : MS-Office 2007
CHAPTER 3
3.0 SYSTEM DESIGN:
Data Flow Diagram / Use Case Diagram / Flow Diagram:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on these data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- DFD is also known as bubble chart. A DFD may be used to represent a system at any level of abstraction. DFD may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures, or devices that produce data. The physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
3.1 ARCHITECTURE DIAGRAM
3.2 DATAFLOW DIAGRAM
ADMIN:
USER:
UML DIAGRAMS:
3.3 USE CASE DIAGRAM:
3.4 SEQUENCE DIAGRAM:
3.5 ACTIVITY DIAGRAM:
CHAPTER 4
4.0 IMPLEMENTATION:
ONTOLOGIES:
According to one of the most widely accepted definitions of ontologies in computer science, an ontology can be described as “an explicit and formal specification of a shared conceptualization”. In simple words, ontologies represent concepts and basic relationships for the purpose of comprehension of a common knowledge area. To develop an ontology means to formalize a common view of a certain domain.
1) OWL Language: In computer science, there are plenty of formal languages that can be used to define and construct ontologies. These languages allow encoding the knowledge contained in an ontology in a simple and formal way. However, the standardized RDF and OWL have been gaining popularity in the semantic web world. An ontology can be formally described in OWL using the following basic elements: 1) classes; 2) individuals; and 3) properties. These elements are used to describe concepts, instances or members of a class, and relationships between individuals of two classes (object properties), or to link individuals with datatype values (datatype properties). Apart from these basic elements, OWL provides class descriptors used to precisely describe OWL classes, which include property restrictions (value and cardinality constraints), class axioms, property axioms, and properties over individuals.
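As an illustration of how these OWL elements can be handled programmatically, here is a minimal sketch using the Apache Jena ontology API (Jena 3.x assumed). The namespace, class, and property names are hypothetical and are not taken from the HOTMES ontology.
// Create an OWL class, a datatype property and an individual with Apache Jena.
import org.apache.jena.ontology.DatatypeProperty;
import org.apache.jena.ontology.Individual;
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;

public class OwlDemo {
    public static void main(String[] args) {
        String ns = "http://example.org/demo#";                      // hypothetical namespace
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);

        OntClass patient = model.createClass(ns + "Patient");        // a class (concept)
        DatatypeProperty hasAge = model.createDatatypeProperty(ns + "hasAge");
        hasAge.addDomain(patient);                                   // a property axiom (domain)

        Individual p1 = patient.createIndividual(ns + "patient01");  // a member of the class
        p1.addLiteral(hasAge, 67);                                   // a datatype property value

        model.write(System.out, "RDF/XML");                          // serialize the model
    }
}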
2) Rules: Generally, ontology-based solutions combine knowledge presented in ontologies with dynamic knowledge presented by the use of rules. A system based on the use of rules usually contains a set of if-then rules (which indicate what should be done according to a situation) and a rule engine used to apply them. By using rules, the behavior of individuals can be expressed inside a domain. Hence, they can be used to generate new knowledge and can also be used to provide personalized services. One of the most popular languages for rules definition is SWRL.
However, in our study, we used SPARQL to define some rules. Although SPARQL is a query language, it can be used as a rule language by combining the CONSTRUCT clause with FILTER restrictions. On the one hand, the CONSTRUCT query form returns a single RDF graph built from the results of matching the graph pattern of the query and applying the specified graph template. On the other hand, the FILTER clause can be used to restrict solutions to those for which the filter expression evaluates to TRUE. Only if the filter function evaluates to true is the solution included in the solution sequence. Note that although this language was good enough for our purpose, its limitations should be studied for other purposes (e.g., recursive tasks) and the adequacy of SWRL could be studied for complex applications.
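For illustration only, the sketch below shows how such a CONSTRUCT-plus-FILTER rule could be executed with Apache Jena (Jena 3.x assumed). The prefix, properties, and threshold are hypothetical and do not correspond to actual HOTMES rules.
// Apply a simple SPARQL CONSTRUCT "rule": flag individuals whose value exceeds a threshold.
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.rdf.model.Model;

public class SparqlRuleDemo {
    public static Model applyRule(Model data) {
        String rule =
            "PREFIX ex: <http://example.org/demo#> " +
            "CONSTRUCT { ?p ex:needsReview true } " +            // the new knowledge to build
            "WHERE { ?p ex:hasAge ?age . FILTER (?age > 65) }";  // the condition of the rule

        Query query = QueryFactory.create(rule);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, data)) {
            return qexec.execConstruct();                        // graph with the inferred triples
        }
    }
}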
WEB SERVICES
Web services are used in this study as the software technology to access and exchange information modeled by the ontology. According to the W3C, a WS is a “software system designed to support interoperable machine-to-machine interaction over a communication network”. Systems may interact with a web service by exchanging SOAP messages, serialized in XML as the message format and sent over application layer protocols, usually HTTP. Although SOAP-based web services are the most popular type of WS, there are other styles of programming a WS, such as the REST style.
1) REST Style for Designing Web Services: REST is a style of software architecture for distributed hypermedia systems, such as the World Wide Web, first defined in 2000 by Fielding. This style is based on the idea of transferring representations of resources, a resource being any item of interest. Key advantages of the REST architecture are the scalability of components and the generality of interfaces. Although REST was initially described in the context of HTTP, this paradigm can be applied to other protocols or implementations. Web services can also be described using this style. A WS implemented using HTTP and the principles of the REST architecture is designated a REST(ful) WS. Requests made by the client and responses from the WS are used to transfer resource information. Each resource is identified through a URI. Stateless behavior, representation of data using XML and/or JSON, and the explicit use of HTTP methods (PUT, GET, POST, DELETE) to exchange resources are the key characteristics of a REST(ful) WS.
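As a sketch of how a REST(ful) WS can be consumed from plain Java using only the JDK's HttpURLConnection, the snippet below issues a GET on a resource; the resource URI and Accept header are hypothetical.
// Issue an HTTP GET on a REST resource identified by a URI and print the response body.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RestGetDemo {
    public static void main(String[] args) throws Exception {
        URL resource = new URL("http://example.org/ts/ontology");    // hypothetical resource URI
        HttpURLConnection conn = (HttpURLConnection) resource.openConnection();
        conn.setRequestMethod("GET");                                // explicit HTTP method
        conn.setRequestProperty("Accept", "application/rdf+xml");    // assumed representation

        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            in.close();
            conn.disconnect();
        }
    }
}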
4.1 MODULES:
MANAGEMENT PROFILE:
DATA AND COMMUNICATION LAYER:
HG AND TS MANAGEMENT MODULES:
COMMUNICATION FLOW AND WORKFLOW:
4.3 MODULE DESCRIPTION:
CLINICAL MANAGEMENT PROFILE:
COPD patients were identified as candidates to be monitored at home sites. From a clinical point of view, it was an interesting case study (some estimations suggest that up to 10% of the European population suffers from COPD). From a technical point of view, the case of the COPD patient led to the definition of a complex technical management profile (because different MDs are required to be used by the patient) and provided an interesting option to test the performance of the agent. Hence, one patient profile was designed according to the clinical HOTMES ontology and one technical management profile was designed according to the technical HOTMES ontology.
The patient profile includes the tasks required to monitor a COPD patient, such as controlling the FEV1 measurement in order to detect the presence and severity of the airway obstruction. It was configured by a primary care physician by means of published clinical guidelines. The patient profile included 15 monitoring tasks, 11 analysis tasks, 9 planning tasks, and 3 execution tasks. This configuration led to the inclusion of 144 new instances and the configuration of 18 rules. The details of this profile and its evaluation for configuring other types of profiles can be found in previous work. The technical management profile was designed to monitor the state of the MDs used by the COPD patient (a weighing scale, a blood pressure monitor, a pulse-oximeter, and a glucometer) and the consumption of resources of the corresponding HG. In addition, rules were configured and 83 new instances were required in the technical management profile; additional information on the application of the HOTMES ontology to technical tasks can be found in previous work.
DATA AND COMMUNICATION LAYER:
In the data layer, the communication between the end sites is established using WS technologies. Consequently, a WS has been designed to be placed in the TS, and a web client has been designed to be installed in the HG (to establish communication with the TS). This communication allows the HG to request its associated management profile from the TS and to transmit acquired information from the HG to the TS.
A REST WS was developed in order to enhance the scalability and flexibility of the architecture and improve its performance (efficiency). This WS comprises and defines a set of operations over the following resources: an OWL ontology, the rules (transferred by means of an XML file), OWL individuals (sent in the IndividualWS structure), datatype property values corresponding to an individual (identified by the URI of the individual and the URI of the property, sent as a generic string type), and inform messages that provide some control functions for the communication between the web pair.
Each one of these resources was identified by a URI, and a set of operations was defined for each particular resource using HTTP methods (e.g., GET or PUT). This WS interface allows information described in the ontology to be exchanged in a generic manner. This is one key aspect that contributes to the reusability and easy extension of the architecture. The described communication methods do not depend on the knowledge itself described in the ontology (related to the service) but on the fact of using an ontology to represent such knowledge. A summary of the resources and defined operations is depicted in Table I. As mentioned in the description of the converter module, individuals are exchanged by using a developed structure designated as IndividualWS. Using the OWL language, an individual of the ontology can be described as a member of a class with individual axioms or facts, such as individual property values (datatype and object properties).
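The exact definition of IndividualWS is not reproduced in this report, so the following is a purely hypothetical sketch of the kind of wrapper such a structure could be: it carries the URI of the individual, the URI of its class, and its property values, which is enough to rebuild the individual at the receiving end.
// Hypothetical wrapper for exchanging an OWL individual over the WS (not the actual IndividualWS).
import java.util.HashMap;
import java.util.Map;

public class IndividualDto {
    private final String individualUri;                  // URI identifying the individual
    private final String classUri;                       // URI of the OWL class it belongs to
    private final Map<String, String> propertyValues = new HashMap<String, String>();  // property URI -> value

    public IndividualDto(String individualUri, String classUri) {
        this.individualUri = individualUri;
        this.classUri = classUri;
    }

    public void addPropertyValue(String propertyUri, String value) {
        propertyValues.put(propertyUri, value);
    }

    public String getIndividualUri() { return individualUri; }
    public String getClassUri() { return classUri; }
    public Map<String, String> getPropertyValues() { return propertyValues; }
}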
HG AND TS MANAGEMENT MODULES:
Two management modules and web technology modules inside the HG and the TS constitute the main parts of the telemedicine system (see Fig. 1). The modules that comprise the architecture have been developed using .NET technologies. Specifically, the .NET framework (version 3.5) has been used to process the ontology and create new instances, and for data acquisition and manipulation when the rules are applied. Regarding the web modules, the components of the remote management module installed in the TS are depicted in Fig. 1. This management module includes the following three components:
1) Ontology knowledge base module: This module contains the ontology knowledge models and the instances of the registered management profiles. The TDB triple-store has been used to store the ontology model and new instances in this knowledge base module (a short sketch of this is given after this list).
2) Converter module: The communication module of this architecture is mainly based on OWL instances exchanged generically by means of a developed object structure named IndividualWS. The converter module is used to wrap and unwrap the individuals structure used to exchange information with web clients. Furthermore, this module incorporates some reasoning tasks. Ontology-based reasoning is used in order to check instances before including new information in the model and to ensure the consistency of the model.
3) Rules module: This module is used to store the rules associated with each management profile. These rules are subsequently transferred by means of an XML file. As shown in Fig. 1, an additional GUI is required in order to make it easier for the EM, technical or clinical (physician), to define the profiles and the rules. We are currently working on the development of this GUI, combining ontology visualization techniques and usability methods. The methodology used to design this interface has been described in previous work. The components of the management module installed in the HG are likewise depicted in Fig. 1. This last management module has been designated the “Semantic Autonomic Agent.” This module plays a key role in the architecture. It is in charge of integrating incoming data and executing the management tasks described in the management profile.
The communication between this agent and the management module installed at the remote site is established through a web client connection to the WS installed in the remote TS. The architecture of the agent comprises the ontology knowledge base module, the rules module, the converter module, and the following modules.
1) MAPE module: This module constitutes the computing core of the agent. It will be used to run the tasks specified in each management profile, hence executing the closed loop of the MAPE process.
2) Integrator module: Information transferred by MDs and also contextual data provided by patients will be acquired in this module, which integrates data coming from different data sources.
3) Reminders and alarms module: This module includes clock functionalities to ask patients about data (reminders) or to collect information from a specific software resource.
4) Actions module: This last module is used to execute actions described within the execution tasks of the management profile if an abnormal finding occurs.
COMMUNICATION FLOW AND WORKFLOW:
Fig. 3 summarizes all the modules and sources involved in the management procedure. The first step (see Fig. 3) consists of downloading the management profile (patient profile or technical profile). First of all, an instance of the management profile should be configured by an EM placed at a remote site. Furthermore, a set of individual rules should be configured for each particular management purpose. As shown in Fig. 3, the designed GUI helps the physician with the ontology instantiation process and the rules definition. The outputs of this interface (which uses selected classes of the ontology as a navigation tool) are a personalized management profile and a set of rules gathered in an XML file. Other functionalities, such as queries over acquired data or cross-referencing data among patients to support decisions, could be of interest for inclusion in this tool.
The communication is always initiated by the user (the web client at the HG). Through a connection to the web service, the user (the patient in the telemonitoring scenario) situated at the home site will acquire the required management profile. As shown in Fig. 3, if the user requests an update of his/her management profile, the version of the profile available at the TS will be requested for evaluation (GET property value). When the user requests a new management profile, it is first checked whether the ontology required to download it is available (GET ontology). After that, the rules and the management profile will be downloaded when required.
The methods involved are 1) GET (rules) and 2) GET (individual). Note that the TLS authentication phase is not depicted in Fig. 3, but it is initially carried out in order to allow the web client connection to the web service. As depicted in Fig. 3, the associated management profile is extracted from the ontology and the instances of the ontology managed by Jena are wrapped into the IndividualWS structure through the converter module. Once the management profile is in the HG, it will be processed in the converter module, unwrapped, and inserted as individuals managed by Jena in the ontology. Once the management profile has been included in the ontology knowledge base module of the HG, it will be evaluated in the MAPE module and the management procedure will be performed by running the tasks specified in the profile.
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis, the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, as only minimal or no changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
The aspect of study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, instead must accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system is working according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the goal will be successfully achieved. Inadequate testing, or not testing at all, leads to errors that may not appear until many months later.
This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of the system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best program is worthless if it does not produce the correct outputs.
5.2.1 UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile and test data correctly and tie in properly with other programs. Achieving an error-free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logical. A syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error messages generated by the computer. For logic errors, the programmer must examine the output carefully.
5.2.2 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that the personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peer in a distributed network framework as it display all users available in the group. | The result after execution should give the accurate result. |
5.2.3 NON-FUNCTIONAL TESTING:
The Non Functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.2.4 LOAD TESTING:
An important tool for implementing system tests is a Load generator. A Load generator is essential for testing quality requirements such as performance and stress. A load can be a real load, that is, the system can be put under test to real usage by having actual telephone users connected to it. They will generate test input data for system test.
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.2.5 PERFORMANCE TESTING:
Performance tests are utilized in order to determine the widely defined performance of the software system such as execution time associated with various parts of the code, response time and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
Description | Expected result |
This is required to assure that an application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values, and produce accurate results in the expected time. |
5.2.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and it is ensured in this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. It forms a part of the software quality control effort.
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.2.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity and confidentiality of the system data and services. Users/Clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
Description | Expected result |
Checking that the user identification is authenticated. | In case failure it should not be connected in the framework. |
Check whether group keys in a tree are shared by all peers. | The peers should know group key in the same group. |
5.2.8 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases. White box testing focuses on the inner structure of the software to be tested.
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.2.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or code. The contents of the box are hidden, and the stimulated software should produce the desired results.
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in data structures or external database access. | Database updates and retrieval must be performed correctly. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All the above system testing strategies are carried out during development, as the documentation and institutionalization of the proposed goals and related policies are essential.
CHAPTER 6
6.0 SOFTWARE DESCRIPTION:
6.1 JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
- Simple
- Architecture neutral
- Object oriented
- Portable
- Distributed
- High performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
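As a trivial illustration (the class name is arbitrary), the same compiled class file runs unchanged on any platform that provides a Java VM:
// HelloWorld.java -- compile once with "javac HelloWorld.java" to produce HelloWorld.class,
// then run the same byte codes anywhere with "java HelloWorld".
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello from the Java platform");
    }
}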
6.2 THE JAVA PLATFORM:
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
- The Java Virtual Machine (Java VM)
- The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights what functionality some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that, after compilation, runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
6.3 WHAT CAN JAVA TECHNOLOGY DO?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet.
A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:
- The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
- Applets: The set of conventions used by applets.
- Networking: URLs, TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
- Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
- Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
- Software components: Known as JavaBeans™, they can plug into existing component architectures.
- Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
- Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
6.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
- Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
- Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
- Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
- Develop programs more quickly: Your development time may be as much as twice as fast as when writing the same program in C++. Why? You write fewer lines of code, and Java is a simpler programming language than C++.
- Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure Java™ Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
- Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
- Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
6.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you set up a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program, and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn’t change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn’t as efficient as talking directly to the native database interface. ODBC has had many detractors who charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to the claim that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.
6.6 JDBC:
In an effort to set an independent database standard API for Java, Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.
6.7 JDBC Goals:
Few software packages are designed without goals in mind, and JDBC is no exception: its many goals drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The design goals for JDBC are as follows:
SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This goal allows JDBC to use existing ODBC level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.
Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.
Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
Use strong, static typing wherever possible
Strong typing allows more error checking to be done at compile time; fewer errors appear at runtime.
Keep the common cases simple
Because, more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs, and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
Finally, we decided to proceed with the implementation using Java Networking. For dynamically updating the cache table, we use an MS Access database.
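As a brief, hedged illustration of how such a cache table might be read and updated from Java, the following sketch uses the JDBC-ODBC bridge driver that ships with JDK 1.7 and earlier; the data source name CacheDSN and the table CACHE(ID, VALUE) are hypothetical placeholders, not names taken from the project.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CacheTableDemo {
    public static void main(String[] args) throws Exception {
        // Load the JDBC-ODBC bridge driver (bundled with JDK 1.7 and earlier).
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
        // "CacheDSN" is a hypothetical ODBC data source configured in the ODBC Administrator.
        try (Connection con = DriverManager.getConnection("jdbc:odbc:CacheDSN")) {
            // Update one row of the (hypothetical) CACHE table.
            try (PreparedStatement ps =
                     con.prepareStatement("UPDATE CACHE SET VALUE = ? WHERE ID = ?")) {
                ps.setString(1, "updated-entry");
                ps.setInt(2, 1);
                ps.executeUpdate();
            }
            // Read the table back to verify the update.
            try (PreparedStatement ps = con.prepareStatement("SELECT ID, VALUE FROM CACHE");
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getInt("ID") + " -> " + rs.getString("VALUE"));
                }
            }
        }
    }
}

Because only the connection URL names the DSN, the same code would work against any other ODBC data source registered through the ODBC Administrator.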
Java has two things: a programming language and a platform.
Java is a high-level programming language that is all of the following:
- Simple
- Architecture-neutral
- Object-oriented
- Portable
- Distributed
- High-performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
Java is also unusual in that each Java program is both compiled and interpreted. With the compiler, you translate a Java program into an intermediate language called Java byte codes, platform-independent code instructions that are then passed to and run on the computer.
Compilation happens just once; interpretation occurs each time the program is executed. The figure illustrates how this works.
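For example, a minimal program such as the following (the class name is purely illustrative) is compiled once with javac and then run by the interpreter on any platform that provides a Java VM.

// Compile once:   javac HelloWorld.java   (produces HelloWorld.class byte codes)
// Interpret/run:  java HelloWorld         (on any platform with a Java VM)
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Compiled once, interpreted on every platform.");
    }
}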
6.7 NETWORKING TCP/IP STACK:
The TCP/IP stack is shorter than the OSI one:
TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.
IP datagrams:
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model – see later.
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32 bit integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class D uses all 32.
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is currently on one sub network and uses 10-bit addressing, allowing 1024 different hosts.
Host address:
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.
Total address:
The 32 bit address is usually written as 4 integers separated by dots.
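As a small illustration, the standard java.net.InetAddress class can be used to look up a host and print its address in this dotted notation; the host name below is only an example.

import java.net.InetAddress;

public class AddressDemo {
    public static void main(String[] args) throws Exception {
        // Resolve an example host name to an IP address.
        InetAddress addr = InetAddress.getByName("www.example.com");
        // Print the 32-bit address as four integers separated by dots.
        System.out.println(addr.getHostAddress());
    }
}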
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server, you send it to the port for that service of the host that it is running on. This is not location transparency! Certain of these ports are “well known”.
Sockets:
A socket is a data structure maintained by the system to handle network connections. A socket is created using the call socket(). It returns an integer that is like a file descriptor. In fact, under Windows, this handle can be used with the ReadFile and WriteFile functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here “family” will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe – but the actual pipe does not yet exist.
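Since the implementation here uses Java Networking rather than the C interface shown above, the same two-ends-of-a-pipe idea can be sketched with java.net sockets. The sketch below runs a tiny echo server and a client in one process; the port is chosen automatically and the messages are illustrative.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoDemo {
    public static void main(String[] args) throws Exception {
        final ServerSocket listener = new ServerSocket(0); // 0 = pick any free port
        // Server end: accept one connection and echo one line back.
        Thread server = new Thread(new Runnable() {
            public void run() {
                try (Socket s = listener.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
        server.start();
        // Client end: connect to the server's port, send a line, print the reply.
        try (Socket s = new Socket("localhost", listener.getLocalPort());
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine()); // prints "echo: hello"
        }
        server.join();
        listener.close();
    }
}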
6.8 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart’s extensive feature set includes:
A consistent and well-documented API, supporting a wide range of chart types;
A flexible design that is easy to extend, and targets both server-side and client-side applications;
Support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG);
JFreeChart is “open source” or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
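As a short, hedged example of the API (assuming a JFreeChart 1.0.x jar and its JCommon dependency on the classpath; the dataset values are made up), a chart can be built from a dataset and written to a PNG file:

import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.data.general.DefaultPieDataset;

public class ChartDemo {
    public static void main(String[] args) throws Exception {
        // Build a simple dataset (values are illustrative only).
        DefaultPieDataset dataset = new DefaultPieDataset();
        dataset.setValue("Authenticated", 70);
        dataset.setValue("Dropped", 30);
        // Create the chart and save it as a PNG image.
        JFreeChart chart = ChartFactory.createPieChart(
                "Message handling", dataset, true, true, false);
        ChartUtilities.saveChartAsPNG(new File("chart.png"), chart, 500, 300);
    }
}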
6.8.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: Sourcing freely redistributable vector outlines for the countries of the world, states/provinces in particular countries (USA in particular, but also other areas);
Creating an appropriate dataset interface (plus default implementation), a renderer, and integrating this with the existing XYPlot class in JFreeChart; testing, documenting, testing some more, documenting some more.
6.8.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts — to display a separate control that shows a small version of ALL the time series data, with a sliding “view” rectangle that allows you to select the subset of the time series data to display in the main chart.
6.8.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) that can be delivered easily via both Java Web Start and an applet.
6.8.4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of the properties that can be set for charts. Extend (or reimplement) this mechanism to provide greater end-user control over the appearance of the charts.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
8.1 CONCLUSION:
This study describes architecture to enable data integration and its management in an ontology-driven telemonitoring solution implemented in home-based scenarios. This is an innovative architecture that facilitates the integration of several management services at home sites using the same software engine. The architecture has been specifically studied to support both technical and clinical services in the telemonitoring scenario, thus avoiding installing additional software for technical purposes.
The HOTMES ontology is used at the conceptual layer to describe a management profile. On the one hand, our ontology contributes to integrating data and its management, offering benefits in terms of knowledge representation, workflow organization, and self-management capabilities to the system. Its combination with rules allows personalized services to be provided.
This application ontology could in future be improved by introducing concepts from a domain ontology. On the other hand, the data and communication layer of the architecture, based on the REST WS, was oriented to minimizing the consumption of resources and providing reusable key ideas for future ontology-based architecture developments.
8.2 FUTURE ENHANCEMENT
This solution represents a further step toward the possibility of establishing more effective home-based telemonitoring systems and thus improving the remote care of patients with chronic diseases. As it was reported in, good telemedicine implementations are developed after a process in which the dynamic interaction among a combination of socio-technical and clinical factors is optimized. This means that additional work should be done (e.g., to measure the patient–doctor interaction with the system and also the truthfulness of the system over a long period of time) before adopting this solution in a real scenario. For its complete development, first, a concordance study should be conducted in order to determine its clinical efficiency. Then, a social impact study should be conducted in order to determine how the system improves patients’ quality of life. Regarding these last studies, the results presented in evidence the benefits of telemonitoring systems while linking their success to usability design issues and features.
CHAPTER 9
9.1 REFERENCES
[1] I. Martinez et al., “Seamless integration of ISO/IEEE11073 personal health devices and ISO/EN13606 electronic health records into an endto- end interoperable solution,” Telemed. J. E. Health, vol. 16, no. 10, pp. 993–1004, 2010.
[2] M. Figueredo and J. Dias, “Service oriented architecture to support realtime implementation of artifact detection in critical care monitoring,” in Proc. IEEE. Annu. Int. Conf. Eng. Med. Biol. Soc., 2011, pp. 4925–4928.
[3] J. D. Trigo, I. Martínez, A. Alesanco, A. Kollmann, J. Escayola, D. Hayn, G. Schreier, and J. García, “An integrated healthcare information system for end-to-end standardized exchange and homogeneous management of digital ECG formats,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 4, pp. 518–529, Jul. 2012.
[4] F. Paganelli and D. Giuli, “An ontology-based system for context-aware and configurable services to support home-based continuous care,” IEEE Trans. Inf. Technol. Biomed., vol. 15, no. 2, pp. 324–333, 2011.
[5] D. Riaño, F. Real, J. A. López-Vallverdú, F. Campana, S. Ercolani, P. Mecocci, R. Annicchiarico, and C. Caltagirone, “An ontology-based personalization of health-care knowledge to support clinical decisions for chronically ill patients,” J. Biomed. Informat., vol. 45, no. 3, pp. 429–446, 2012.
[6] I. Berges, J. Bermudez, and A. Illarramendi, “Towards semantic interoperability of electronic health records,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 3, pp. 424–431, May 2012.
[7] G. Mulligan and D. Gracanin, “A comparison of SOAP and REST implementations of a service based interaction independence middleware framework,” in Proc. Winter Simul. Conf., 2009, pp. 1423–1432.
Behavioral Malware Detection in Delay Tolerant Networks
BEHAVIORAL MALWARE DETECTION IN DELAY TOLERANT NETWORKS
By
A
PROJECT REPORT
Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree
Of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
APRIL 2015
BONAFIDE CERTIFICATE
Certified that this project report titled “BEHAVIORAL MALWARE DETECTION IN DELAY TOLERANT NETWORKS” is the bonafide work of Mr. _____________ who carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
Signature of the Guide Signature of the H.O.D
Name Name
CHAPTER 1
1.0 ABSTRACT:
The delay-tolerant-network (DTN) model is becoming a viable communication alternative to the traditional infrastructural model for modern mobile consumer electronics equipped with short-range communication technologies such as Bluetooth, NFC, and Wi-Fi Direct. Proximity malware is a class of malware that exploits the opportunistic contacts and distributed nature of DTNs for propagation. Behavioral characterization of malware is an effective alternative to pattern matching in detecting malware, especially when dealing with polymorphic or obfuscated malware.
In this paper, we first propose a general behavioral characterization of proximity malware based on a naive Bayesian model, which has been successfully applied in non-DTN settings such as filtering email spam and detecting botnets. We identify two unique challenges for extending Bayesian malware detection to DTNs (“insufficient evidence versus evidence collection risk” and “filtering false evidence sequentially and distributedly”), and propose a simple yet effective method, look ahead, to address the challenges. Furthermore, we propose two extensions to look ahead, dogmatic filtering and adaptive look ahead, to address the challenge of “malicious nodes sharing false evidence.” Real mobile network traces are used to verify the effectiveness of the proposed methods.
1.1 INTRODUCTION
MALWARE:
Malware, short for malicious software, is any software used to disrupt computer operation, gather sensitive information, or gain access to private computer systems. It can appear in the form of executable code, scripts, active content, and other software. ‘Malware’ is a general term used to refer to a variety of forms of hostile or intrusive software. The term badware is sometimes used, and applied to both true (malicious) malware and unintentionally harmful software. Malware includes viruses, worms, trojan horses, ransomware, spyware, adware, scareware, and other malicious programs; the majority of active malware threats are worms or trojans rather than viruses. In law, malware is sometimes known as a computer contaminant, as in the legal codes of several U.S. states. Malware is often disguised as, or embedded in, non-malicious files.
Spyware or other malware is sometimes found embedded in programs supplied officially by companies, e.g., downloadable from websites, that appear useful or attractive, but may have, for example, additional hidden tracking functionality that gathers marketing statistics. An example of such software, which was described as illegitimate, is the Sony rootkit, a Trojan embedded into CDs sold by Sony, which silently installed and concealed itself on purchasers’ computers with the intention of preventing illicit copying; it also reported on users’ listening habits, and created vulnerabilities that were exploited by unrelated malware.
PURPOSES:
Many early infectious programs, including the first Internet Worm, were written as experiments or pranks. Today, malware is used by both black hat hackers and governments, to steal personal, financial, or business information and sometimes for sabotage (e.g., Stuxnet). Malware is sometimes used broadly against government or corporate websites to gather guarded information, or to disrupt their operation in general. However, malware is often used against individuals to gain information such as personal identification numbers or details, bank or credit card numbers, and passwords. Left unguarded, personal and networked computers can be at considerable risk against these threats. (These are most frequently defended against by various types of firewall, anti-virus software, and network hardware).
Since the rise of widespread broadband Internet access, malicious software has more frequently been designed for profit. Since 2003, the majority of widespread viruses and worms have been designed to take control of users’ computers for illicit purposes. Infected “zombie computers” are used to send email spam, to host contraband data such as child pornography, or to engage in distributed denial-of-service attacks as a form of extortion.
Programs designed to monitor users’ web browsing, display unsolicited advertisements, or redirect affiliate marketing revenues are called spyware. Spyware programs do not spread like viruses; instead they are generally installed by exploiting security holes. They can also be packaged together with user-installed software, such as peer-to-peer applications.
Ransomware affects an infected computer in some way, and demands payment to reverse the damage. For example, programs such as CryptoLocker encrypt files securely, and only decrypt them on payment of a substantial sum of money.
INFECTIOUS MALWARE: VIRUSES AND WORMS:
The best-known types of malware, viruses and worms, are known for the manner in which they spread, rather than any specific types of behavior. The term computer virus is used for a program that embeds itself in some other executable software (including the operating system itself) on the target system without the user’s consent; when that software is run, the virus spreads to other executables. On the other hand, a worm is a stand-alone malware program that actively transmits itself over a network to infect other computers. These definitions lead to the observation that a virus requires the user to run an infected program or operating system for the virus to spread, whereas a worm spreads itself.
CONCEALMENT: Viruses, trojan horses, rootkits, and backdoors
TROJAN HORSES
For a malicious program to accomplish its goals, it must be able to run without being detected, shut down, or deleted. When a malicious program is disguised as something normal or desirable, users may unwittingly install it. This is the technique of the Trojan horse or trojan. In broad terms, a Trojan horse is any program that invites the user to run it, concealing harmful or malicious executable code of any description. The code may take effect immediately and can lead to many undesirable effects, such as encrypting the user’s files or downloading and implementing further malicious functionality.
In the case of some spyware, adware, etc. the supplier may require the user to acknowledge or accept its installation, describing its behavior in loose terms that may easily be misunderstood or ignored, with the intention of deceiving the user into installing it without the supplier technically in breach of the law.
ROOTKITS
Once a malicious program is installed on a system, it is essential that it stays concealed, to avoid detection. Software packages known as rootkits allow this concealment, by modifying the host’s operating system so that the malware is hidden from the user. Rootkits can prevent a malicious process from being visible in the system’s list of processes, or keep its files from being read.
Some malicious programs contain routines to defend against removal, not merely to hide them. An early example of this behavior is recorded in the Jargon File tale of a pair of programs infesting a Xerox CP-V time sharing system:
Each ghost-job would detect the fact that the other had been killed, and would start a new copy of the recently-stopped program within a few milliseconds. The only way to kill both ghosts was to kill them simultaneously (very difficult) or to deliberately crash the system.
BACKDOORS
A backdoor is a method of bypassing normal authentication procedures, usually over a connection to a network such as the Internet. Once a system has been compromised, one or more backdoors may be installed in order to allow access in the future, invisibly to the user.
The idea has often been suggested that computer manufacturers preinstall backdoors on their systems to provide technical support for customers, but this has never been reliably verified. It was reported in 2014 that US government agencies had been diverting computers purchased by those considered “targets” to secret workshops where software or hardware permitting remote access by the agency was installed, considered to be among the most productive operations to obtain access to networks around the world. Backdoors may be installed by Trojan horses, worms, implants, or other methods.
Malware authors target bugs, or loopholes, to exploit. A common method is exploitation of a buffer overflow vulnerability, where software designed to store data in a specified region of memory does not prevent more data than the buffer can accommodate from being supplied. Malware may provide data that overflows the buffer, with malicious executable code or data after the end; when this payload is accessed, it does what the attacker, not the legitimate software, determines.
INSECURE DESIGN OR USER ERROR
Early PCs had to be booted from floppy disks; when built-in hard drives became common the operating system was normally started from them, but it was possible to boot from another boot device if available, such as a floppy disk, CD-ROM, DVD-ROM, or USB flash drive. It was common to configure the computer to boot from one of these devices when available. Normally none would be available; the user would intentionally insert, say, a CD into the optical drive to boot the computer in some special way, for example to install an operating system. Even without booting, computers can be configured to execute software on some media as soon as they become available, e.g. to autorun a CD or USB device when inserted.
Malicious software distributors would trick the user into booting or running from an infected device or medium; for example, a virus could make an infected computer add autorunnable code to any USB stick plugged into it; anyone who then attached the stick to another computer set to autorun from USB would in turn become infected, and also pass on the infection in the same way. More generally, any device that plugs into a USB port, “including gadgets like lights, fans, speakers, toys, even a digital microscope,” can be used to spread malware. Devices can be infected during manufacturing or supply if quality control is inadequate.
This form of infection can largely be avoided by setting up computers by default to boot from the internal hard drive, if available, and not to autorun from devices. Intentional booting from another device is always possible by pressing certain keys during boot.
Older email software would automatically open HTML email containing potentially malicious JavaScript code; users may also execute disguised malicious email attachments and infected executable files supplied in other ways.
1.2 SCOPE OF THE PROJECT:
We present a general behavioral characterization of proximity malware, which captures the functional but imperfect nature of detecting proximity malware. Based on this characterization, and with a simple cut-off malware containment strategy, we formulate the malware detection process as a distributed decision problem. We analyze the risk associated with the decision, and design a simple yet effective strategy, look ahead, which naturally reflects individual nodes’ intrinsic risk inclinations against malware infection. Look ahead extends the naive Bayesian model, and addresses the DTN-specific, malware-related “insufficient evidence versus evidence collection risk” problem.
We consider the benefits of sharing assessments among nodes, and address challenges derived from the DTN model: liars (i.e., bad-mouthing and false-praising malicious nodes) and defectors (i.e., good nodes that have turned rogue due to malware infections). We present two alternative techniques, dogmatic filtering and adaptive look ahead, that naturally extend look ahead to consolidate evidence provided by others, while containing the negative effect of false evidence. A nice property of the proposed evidence consolidation methods is that the results will not worsen even if liars are the majority in the neighborhood. Real mobile network traces are used to verify the effectiveness of the methods.
CHAPTER 2
2.0 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM:
Existing worms, spam, and phishing exploit gaps in traditional threat models that usually revolve around preventing unauthorized access and information disclosure. The new threat landscape requires security researchers to consider a wider range of attacks: opportunistic attacks in addition to targeted ones; attacks coming not just from malicious users, but also from subverted (yet otherwise benign) hosts; coordinated/distributed attacks in addition to isolated, single-source methods; and attacks blending flaws across layers, rather than exploiting a single vulnerability. Some of the largest security lapses in the last decade are due to designers ignoring the complexity of the threat landscape.
The increasing penetration of wireless networking, and more specifically wifi, may soon reach critical mass, making it necessary to examine whether the current state of wireless security is adequate for fending off likely attacks.
Three types of threats seem insufficiently addressed by existing technology and deployment techniques. The first threat is wildfire worms, a class of worms that spreads contagiously between hosts on neighboring APs. We show that such worms can spread to a large fraction of hosts in a dense urban setting, and that the propagation speed can be such that most existing defenses cannot react in a timely fashion. Worse, such worms can penetrate networks protected by WEP and other security mechanisms. The second threat we discuss is large-scale spoofing attacks that can be used for massive phishing and spam campaigns. We show how an attacker can easily build a botnet by acquiring access to wifi-capable zombie hosts, and can use these zombies to target not just the local wireless LAN, but any LAN within range, greatly increasing his reach across heterogeneous networks.
2.2 DISADVANTAGES:
- Viruses can cause many problems on your computer. Usually, they display pop-up ads on your desktop or steal your information. Some of the more nasty ones can even crash your computer or delete your files.
- Your computer gets slowed down. Many “hackers” get jobs with software firms by finding and exploiting problems with software.
- Some applications won’t start (e.g., the “I hate Mozilla” virus won’t let you start Mozilla), and you cannot see some of the settings in your OS (e.g., one kind of virus disables the hidden-folder option so you can never set it).
To quantify these threats, we rely on real-world data extracted from wifi maps of large metropolitan areas in the country. Existing results suggest that a carefully crafted wireless worm can infect up to 80% of all wifi-connected hosts in some metropolitan areas within 20 minutes, and that an attacker can launch phishing attacks or build a tracking system to monitor the location of 10-50% of wireless users in these metropolitan areas with just 1,000 zombies under his control.
2.3 PROPOSED SYSTEM:
In this paper, we present a simple yet effective solution, look ahead, which naturally reflects individual nodes’ intrinsic risk inclinations against malware infection, to balance between these two extremes. Essentially, we extend the naive Bayesian model, which has been applied in filtering email spam, detecting botnets, and designing IDSs.
We analyze the risk associated with the decision, and design a simple yet effective strategy, look ahead, which naturally reflects individual nodes’ intrinsic risk inclinations against malware infection. Look ahead extends the naive Bayesian model, and addresses the DTN-specific, malware-related “insufficient evidence versus evidence collection risk” problem. Proximity malware is a malicious program that disrupts the host node’s normal function and has a chance of duplicating itself to other nodes during (opportunistic) contact opportunities between nodes in the DTN.
We consider the benefits of sharing assessments among nodes, and address challenges derived from the DTN model: liars (i.e., bad-mouthing and false-praising malicious nodes) and defectors (i.e., good nodes that have turned rogue due to malware infections). We present two alternative techniques, dogmatic filtering and adaptive look ahead, that naturally extend look ahead to consolidate evidence provided by others, while containing the negative effect of false evidence. A nice property of the proposed evidence consolidation methods is that the results will not worsen even if liars are the majority in the neighborhood. Real mobile network traces are used to verify the effectiveness of the methods.
2.4 ADVANTAGES:
Two DTN-specific, malware-related challenges are addressed:
1. Insufficient evidence versus evidence collection risk. In DTNs, evidence (such as Bluetooth connection or SSH session requests) is collected only when nodes come into contact. But contacting malware-infected nodes carries the risk of being infected. Thus, nodes must make decisions (such as whether to cut off other nodes and, if yes, when) online based on potentially insufficient evidence.
2. Filtering false evidence sequentially and distributedly. Sharing evidence among opportunistic acquaintances helps alleviate the aforementioned insufficient-evidence problem; however, false evidence shared by malicious nodes (the liars) may negate the benefits of sharing. In DTNs, nodes must decide whether to accept received evidence sequentially and distributedly.
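The look-ahead strategy itself is not reproduced here, but the flavor of challenge 1, deciding online from a growing body of binary assessments, can be sketched as follows. The sketch keeps counts of suspicious and non-suspicious assessments for a neighbor, forms a smoothed estimate of its suspiciousness, and cuts the neighbor off once the estimate exceeds a threshold; the threshold, the prior, and the sample evidence are illustrative assumptions rather than values from the paper.

// Minimal sketch (not the paper's algorithm): a node accumulates binary
// assessments of a neighbor and applies a simple cut-off rule online.
public class CutoffDecision {
    private int suspicious = 0;      // assessments judged "suspicious"
    private int nonSuspicious = 0;   // assessments judged "non-suspicious"
    private final double threshold;  // cut-off threshold, chosen by the node

    public CutoffDecision(double threshold) {
        this.threshold = threshold;
    }

    // Record one assessment collected during an opportunistic contact.
    public void observe(boolean wasSuspicious) {
        if (wasSuspicious) suspicious++; else nonSuspicious++;
    }

    // Posterior-mean style estimate of the neighbor's suspiciousness,
    // using Laplace smoothing (a Beta(1,1) prior) over the binary evidence.
    public double suspiciousness() {
        return (suspicious + 1.0) / (suspicious + nonSuspicious + 2.0);
    }

    // Cut the neighbor off once the estimate exceeds the threshold.
    public boolean shouldCutOff() {
        return suspiciousness() > threshold;
    }

    public static void main(String[] args) {
        CutoffDecision d = new CutoffDecision(0.7);            // illustrative threshold
        boolean[] evidence = {true, true, false, true, true};  // illustrative assessments
        for (boolean e : evidence) {
            d.observe(e);
            System.out.printf("estimate=%.2f cutOff=%b%n", d.suspiciousness(), d.shouldCutOff());
        }
    }
}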
2.5 HARDWARE & SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
SOFTWARE REQUIREMENTS:
- Operating System : Windows XP
- Front End : JAVA JDK 1.7
- Document : MS-Office 2007
CHAPTER 3
3.0 SYSTEM DESIGN
ARCHITECTURE DIAGRAM / DATA FLOW DIAGRAM / UML DIAGRAM:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on these data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- DFD is also known as bubble chart. A DFD may be used to represent a system at any level of abstraction. DFD may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures or devices that produce data. The physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
MODELING RULES:
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
3.0 ARCHITECTURE DIAGRAM
DATAFLOW DIAGRAM:
LEVEL 1
Server Client
LEVEL 2
UML DIAGRAMS:
USE CASE DIAGRAM:
CLASS DIAGRAM:
SEQUENCE DIAGRAM:
ACTIVITY DIAGRAMS:
CHAPTER 4
4.0 IMPLEMENTATION
4.1 MODULES:
DELAY TOLERANT NETWORKS:
PROXIMITY MALWARE:
MALWARE PROPAGATION:
BAYESIAN FILTERING:
4.1 MODULE DESCRIPTION:
DELAY TOLERANT NETWORKS:
Delay-tolerant networking (DTN) is an approach to computer network architecture that seeks to address the technical issues in heterogeneous networks that may lack continuous network connectivity. Examples of such networks are those operating in mobile or extreme terrestrial environments, or planned networks in space.
Recently, the term disruption-tolerant networking has gained currency in the United States due to support from DARPA, which has funded many DTN projects. Disruption may occur because of the limits of wireless radio range, sparsity of mobile nodes, energy resources, attack, and noise.
Interest in DTNs has grown, including a growing number of academic conferences on delay- and disruption-tolerant networking, and growing interest in combining work from sensor networks and MANETs with work on DTNs. This field has seen many optimizations of classic ad hoc and delay-tolerant networking algorithms and has begun to examine factors such as security, reliability, verifiability, and other areas of research that are well understood in traditional computer networking.
PROXIMITY MALWARE:
Proximity malware is a malicious program that disrupts the host node’s normal function and has a chance of duplicating itself to other nodes during (opportunistic) contact opportunities between nodes in the DTN. When duplication occurs, the other node is infected with the malware. In our model, we assume that each node is capable of assessing the other party for suspicious actions after each encounter, resulting in a binary assessment. For example, a node can assess a Bluetooth connection or an SSH session for potential Cabir or Ikee infection.
The watchdog components in previous works on malicious behavior detection in MANETs and distributed reputation systems are other examples. A node is either evil or good, based on whether or not it is infected by the malware. The suspicious-action assessment is assumed to be an imperfect but functional indicator of malware infections: It may occasionally assess an evil node’s actions as “nonsuspicious” or a good node’s actions as “suspicious,” but most suspicious actions are correctly attributed to evil nodes. A previous work on distributed IDS presents an example of such an imperfect but functional binary classifier of nodes’ behaviors.
MALWARE PROPAGATION:
We analyzed malware propagation through proximity channels in social networks. Akritidis et al. quantified the threat of proximity malware in wide-area wireless networks; other work considered optimal malware signature distribution in heterogeneous, resource-constrained mobile networks. In traditional, non-DTN networks, Bayer et al. proposed to detect malware with learned behavioral models, in terms of system calls and program flow.
We extend the naive Bayesian model, which has been applied in filtering email spam, detecting botnets, and designing IDSs, and address DTN-specific, malware-related problems. In the context of detecting slowly propagating Internet worms, Dash et al. presented a distributed IDS architecture with local/global detectors that resembles the neighborhood-watch model, under the assumption of attested/honest evidence, i.e., without liars.
BAYESIAN FILTERING:
Naive Bayes classifiers can be trained very efficiently in a supervised learning setting. In many practical applications, parameter estimation for naive Bayes models uses the method of maximum likelihood; in other words, one can work with the naive Bayes model without accepting Bayesian probability or using any Bayesian methods.
The implications are as follows:
- Given enough assessments, honest nodes are likely to obtain a close estimation of a node’s suspiciousness (suppose they have not cut the node off yet), even if they only use their own assessments.
- The liars have to share a significant amount of false evidence to sway the public’s opinion on a node’s suspiciousness.
- The most susceptible victims of liars are the nodes that have little evidence.
Dogmatic filtering: Dogmatic filtering is based on the observation that one’s own assessments are truthful and, therefore, can be used to bootstrap the evidence consolidation process. A node shall only accept evidence that will not sway its current opinion too much. We call this observation the dogmatic principle.
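Following the dogmatic principle described above, a hedged sketch of evidence consolidation might accept shared assessment counts only when they do not move the node's own estimate by more than a fixed margin; the count-based representation and the margin are assumptions made for illustration, not the paper's exact formulation.

// Sketch of dogmatic filtering: shared evidence is accepted only if it does
// not sway the node's current opinion (based on its own assessments) too much.
public class DogmaticFilter {
    private int ownSuspicious;        // this node's own "suspicious" assessments
    private int ownTotal;             // this node's own total assessments
    private final double maxSway;     // maximum allowed change of opinion

    public DogmaticFilter(int ownSuspicious, int ownTotal, double maxSway) {
        this.ownSuspicious = ownSuspicious;
        this.ownTotal = ownTotal;
        this.maxSway = maxSway;
    }

    private static double estimate(int suspicious, int total) {
        // Laplace-smoothed fraction of suspicious assessments.
        return (suspicious + 1.0) / (total + 2.0);
    }

    // Returns true (and merges the evidence) only if the shared counts would
    // not move the current estimate by more than maxSway.
    public boolean consolidate(int sharedSuspicious, int sharedTotal) {
        double current = estimate(ownSuspicious, ownTotal);
        double merged = estimate(ownSuspicious + sharedSuspicious, ownTotal + sharedTotal);
        if (Math.abs(merged - current) > maxSway) {
            return false;   // dogmatic principle: reject evidence that sways opinion too much
        }
        ownSuspicious += sharedSuspicious;
        ownTotal += sharedTotal;
        return true;
    }

    public static void main(String[] args) {
        DogmaticFilter f = new DogmaticFilter(2, 10, 0.15);  // illustrative own evidence and margin
        System.out.println(f.consolidate(1, 5));             // modest shared evidence: accepted
        System.out.println(f.consolidate(40, 40));           // extreme claim: rejected
    }
}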
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase and business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not have a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, as only minimal or null changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
This aspect of the study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but instead must accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system is working according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the goal will be successfully achieved. Inadequate testing or non-testing leads to errors that may not appear until many months later. This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best program is worthless if it does not produce the correct outputs.
5.2.1 UNIT TESTING:
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile and test data correctly and tie in properly with other programs. Achieving an error-free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logical. A syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error messages generated by the computer. For logic errors, the programmer must examine the output carefully.
UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
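To make the unit-testing step concrete, a hedged JUnit 4 sketch is shown below; the Calculator class and its add method are hypothetical stand-ins for the project's own units, and the junit-4.x jar is assumed to be on the classpath.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CalculatorTest {

    // Hypothetical unit under test; the real project would test its own classes.
    static class Calculator {
        int add(int a, int b) {
            return a + b;
        }
    }

    @Test
    public void addReturnsSumOfOperands() {
        Calculator c = new Calculator();
        // A logic error (e.g. returning a - b) would make this assertion fail.
        assertEquals(5, c.add(2, 3));
    }
}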
5.1.3 FUNCTIONAL TESTING:
Functional testing of an application is used to prove the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
FUNCTIONAL TESTING:
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework as it displays all users available in the group. | The result after execution should give the accurate result. |
5.1. 4 NON-FUNCTIONAL TESTING:
The Non Functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.1.5 LOAD TESTING:
An important tool for implementing system tests is a Load generator. A Load generator is essential for testing quality requirements such as performance and stress. A load can be a real load, that is, the system can be put under test to real usage by having actual telephone users connected to it. They will generate test input data for system test.
Load Testing
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.1.5 PERFORMANCE TESTING:
Performance tests are utilized in order to determine the widely defined performance of the software system, such as execution time associated with various parts of the code, response time, and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
PERFORMANCE TESTING:
Description | Expected result |
This is required to assure that the application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values, and produce accurate results in the expected time. |
5.1.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and it is ensured in this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. It forms a part of software quality control.
RELIABILITY TESTING:
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.1.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity, and confidentiality of the system data and services. Users/clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
SECURITY TESTING:
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure, it should not be connected to the framework. |
Check whether group keys in a tree are shared by all peers. | The peers should know group key in the same group. |
5.1.7 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases. White box testing focuses on the inner structure of the software to be tested.
5.1.8 WHITE BOX TESTING:
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.1.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or code. The contents of the box are hidden, and the stimulated software should produce the desired results.
5.1.10 BLACK BOX TESTING:
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in data structures or external database access. | The database update and retrieval must be done correctly. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All the above system testing strategies are carried out, as the development, documentation, and institutionalization of the proposed goals and related policies are essential.
CHAPTER 6
7.0 SOFTWARE DESCRIPTION:
7.1 JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
- Simple
- Architecture neutral
- Object oriented
- Portable
- Distributed
- High performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
7.2 THE JAVA PLATFORM:
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
- The Java Virtual Machine (Java VM)
- The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights what functionality some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that after you compile it, the compiled code runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
6.3 WHAT CAN JAVA TECHNOLOGY DO?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet.
A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
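For illustration only, a minimal servlet might look like the sketch below; it assumes a servlet container such as Tomcat and a URL mapping supplied by the deployer (via web.xml or an annotation), neither of which is prescribed here.

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A tiny server-side "applet": the container calls doGet for each matching request.
public class HelloServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body><h1>Hello from a servlet</h1></body></html>");
    }
}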
How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:
- The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
- Applets: The set of conventions used by applets.
- Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
- Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
- Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
- Software components: Known as JavaBeansTM, can plug into existing component architectures.
- Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
- Java Database Connectivity (JDBCTM): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
6.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
- Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
- Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
- Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
- Develop programs more quickly: Your development time may be as much as twice as fast as writing the same program in C++. Why? You write fewer lines of code, and Java is a simpler programming language than C++.
- Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure JavaTM Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
- Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
- Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
6.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you setup a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn’t change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.
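As a hedged sketch of how a Java program can reach such an ODBC data source, the JDBC-ODBC bridge that shipped with the JDK up to Java 7 accepts a data source name configured in the ODBC Administrator; the DSN SalesFigures and the Orders table below are hypothetical.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class OdbcBridgeExample {
    public static void main(String[] args) throws Exception {
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");          // load the bridge driver
        Connection con = DriverManager.getConnection("jdbc:odbc:SalesFigures");
        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery("SELECT * FROM Orders"); // hypothetical table
        while (rs.next()) {
            System.out.println(rs.getString(1));                // first column of each row
        }
        rs.close();
        st.close();
        con.close();
    }
}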
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it is not as efficient as talking directly to the native database interface. ODBC has had many detractors who charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. In any case, the criticism about performance is somewhat analogous to the claim that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.
6.6 JDBC:
In an effort to set an independent database standard API for Java; Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.
6.7 JDBC Goals:
Few software packages are designed without goals in mind. JDBC is no exception: its many goals drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The design goals for JDBC are as follows:
SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This goal allows JDBC to use existing ODBC level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.
- Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.
- Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
- Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time; also, fewer errors appear at runtime.
- Keep the common cases simple
Because more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs, and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
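A minimal sketch of such a common case follows; the MySQL driver class, connection URL, credentials, and the accounts table are placeholder assumptions rather than details taken from this report.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SimpleSelect {
    public static void main(String[] args) throws Exception {
        Class.forName("com.mysql.jdbc.Driver");                       // assumed driver class
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/testdb", "user", "password")) {
            PreparedStatement ps =
                    con.prepareStatement("SELECT id, name FROM accounts WHERE id = ?");
            ps.setInt(1, 1);                                          // bind the query parameter
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
            rs.close();
            ps.close();
        }
    }
}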
Finally, we decided to proceed with the implementation using Java Networking.
And for dynamically updating the cache table, we use an MS Access database.
Java has two things: a programming language and a platform.
Java is a high-level programming language that is all of the following:
- Simple
- Architecture-neutral
- Object-oriented
- Portable
- Distributed
- High-performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
Java is also unusual in that each Java program is both compiled and interpreted. With a compiler, you translate a Java program into an intermediate language called Java byte codes, the platform-independent code that is passed to and run on the computer.
Compilation happens just once; interpretation occurs each time the program is executed. The figure illustrates how this works.
6.8 NETWORKING TCP/IP STACK:
The TCP/IP stack is shorter than the OSI one:
TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.
IP datagrams:
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model – see later.
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32 bit integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class D uses all 32.
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is currently on one sub network and uses 10-bit addressing, allowing 1024 different hosts.
Host address:
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.
Total address:
The 32 bit address is usually written as 4 integers separated by dots.
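As a small illustrative sketch (not part of the original text), the dotted form can be derived from the 32 bit integer as follows:

public class DottedQuad {
    static String toDotted(int address) {
        return ((address >>> 24) & 0xFF) + "." + ((address >>> 16) & 0xFF) + "."
                + ((address >>> 8) & 0xFF) + "." + (address & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(toDotted(0xC0A80001));   // prints 192.168.0.1
    }
}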
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server, you send it to the port for that service of the host that it is running on. This is not location transparency! Certain of these ports are “well known”.
Sockets:
A socket is a data structure maintained by the system to handle network connections. A socket is created using the call socket. It returns an integer that is like a file descriptor. In fact, under Windows, this handle can be used with the ReadFile and WriteFile functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here “family” will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe – but the actual pipe does not yet exist.
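For comparison, a minimal TCP client using Java's java.net package is sketched below; it assumes some echo-style service is already listening on port 7 of the local host, which is purely an assumption for illustration.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class TcpClientSketch {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("localhost", 7)) {            // connect to the service
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));
            out.println("hello");                                     // send one line
            System.out.println("server replied: " + in.readLine());   // read one line back
        }
    }
}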
6.9 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart’s extensive feature set includes:
- A consistent and well-documented API, supporting a wide range of chart types;
- A flexible design that is easy to extend, and targets both server-side and client-side applications;
- Support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG);
JFreeChart is “open source” or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
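A small sketch against the JFreeChart 1.0.x API (version assumed) is shown below; the data values and the output file name are invented for illustration.

import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.data.general.DefaultPieDataset;

public class PieChartSketch {
    public static void main(String[] args) throws Exception {
        DefaultPieDataset dataset = new DefaultPieDataset();
        dataset.setValue("Region A", 45);      // illustrative values only
        dataset.setValue("Region B", 30);
        dataset.setValue("Region C", 25);
        JFreeChart chart = ChartFactory.createPieChart(
                "Sales by region", dataset, true, true, false);   // legend, tooltips, no URLs
        ChartUtilities.saveChartAsPNG(new File("sales.png"), chart, 500, 400);
    }
}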
6.9.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: Sourcing freely redistributable vector outlines for the countries of the world, states/provinces in particular countries (USA in particular, but also other areas);
Creating an appropriate dataset interface (plus default implementation), a renderer, and integrating this with the existing XYPlot class in JFreeChart; testing, documenting, testing some more, documenting some more.
6.9.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts — to display a separate control that shows a small version of ALL the time series data, with a sliding “view” rectangle that allows you to select the subset of the time series data to display in the main chart.
6.9.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) that can be delivered easily via both Java Web Start and an applet.
6.9.4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of the properties that can be set for charts. Extend (or reimplement) this mechanism to provide greater end-user control over the appearance of the charts.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
8.1 CONCLUSION
Behavioral characterization of malware is an effective alternative to pattern matching in detecting malware, especially when dealing with polymorphic or obfuscated malware. The naive Bayesian model has been successfully applied in non-DTN settings, such as filtering email spam and detecting botnets.
We propose a general behavioral characterization of DTN-based proximity malware. We present look ahead, along with dogmatic filtering and adaptive look ahead, to address two unique challenges in extending Bayesian filtering to DTNs: “insufficient evidence versus evidence collection risk” and “filtering false evidence sequentially and distributedly.” In prospect, extending the behavioral characterization of proximity malware to account for strategic malware detection evasion with game theory is a challenging yet interesting direction for future work.
CHAPTER 9
9.0 REFERENCES:
[1] C. Kolbitsch, P. Comparetti, C. Kruegel, E. Kirda, X. Zhou, and X. Wang, “Effective and Efficient Malware Detection at the End Host,” Proc. 18th Conf. USENIX Security Symp., 2009.
[2] U. Bayer, P. Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda, “Scalable, Behavior-Based Malware Clustering,” Proc. 16th Ann. Network and Distributed System Security Symp. (NDSS), 2009.
[3] G. Zyba, G. Voelker, M. Liljenstam, A. Méhes, and P. Johansson, “Defending Mobile Phones from Proximity Malware,” Proc. IEEE INFOCOM, 2009.
[4] F. Li, Y. Yang, and J. Wu, “CPMC: An Efficient Proximity Malware Coping Scheme in Smartphone-Based Mobile Networks,” Proc. IEEE INFOCOM, 2010.
[6] J. Zdziarski, Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification. No Starch Press, 2005.
[7] R. Villamarín-Salomón and J. Brustoloni, “Bayesian Bot Detection Based on DNS Traffic Similarity,” Proc. ACM Symp. Applied Computing (SAC), 2013.
Automatic Scaling of Internet Applications for Cloud Computing Services
AUTOMATIC SCALING OF INTERNET APPLICATIONS FOR CLOUD
COMPUTING SERVICES
By
A
PROJECT REPORT
Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree
Of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
APRIL 2015
CERTIFICATE
Certified that this project report titled “Automatic Scaling of Internet Applications for Cloud Computing Services” is the bonafide work of Mr. _____________ who carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
Signature of the Guide Signature of the H.O.D
Name Name
DECLARATION
I hereby declare that the project work entitled “Automatic Scaling of Internet Applications for Cloud Computing Services” submitted to BHARATHIDASAN UNIVERSITY in partial fulfillment of the requirements for the award of the Degree of MASTER OF SCIENCE IN COMPUTER SCIENCE is a record of original work done by me under the guidance of Prof. A. Vinayagam M.Sc., M.Phil., M.E., and that, to the best of my knowledge, the work reported here is not part of any other thesis or work on the basis of which a degree or award was conferred on an earlier occasion to me or any other candidate.
(Student Name)
(Reg.No)
Place:
Date:
ACKNOWLEDGEMENT
I am extremely glad to present my project “Automatic Scaling of Internet Applications for Cloud Computing Services” which is a part of my curriculum of third semester Master of Science in Computer science. I take this opportunity to express my sincere gratitude to those who helped me in bringing out this project work.
I would like to express my sincere thanks to our Director, Dr. K. ANANDAN, M.A.(Eco.), M.Ed., M.Phil.,(Edn.), PGDCA., CGT., M.A.(Psy.), who had given me an opportunity to undertake this project.
I am highly indebted to the Co-ordinator, Prof. Muniappan, Department of Physics, and thank her from the depth of my heart for the valuable comments I received throughout my project.
I wish to express my deep sense of gratitude to my guide, Prof. A.Vinayagam M.Sc., M.Phil., M.E., for her immense help and encouragement towards the successful completion of this project.
I also express my sincere thanks to all the staff members of the Computer Science department for their kind advice.
And last, but not the least, I express my deep gratitude to my parents and friends for their encouragement and support throughout the project.
CHAPTER 1
1.1 ABSTRACT:
Many Internet applications can benefit from an automatic scaling property where their resource usage can be scaled up and down automatically by the cloud service provider. We present a system that provides automatic scaling for Internet applications in the cloud environment. We encapsulate each application instance inside a virtual machine (VM) and use virtualization technology to provide fault isolation. We model it as the Class Constrained Bin Packing (CCBP) problem where each server is a bin and each class represents an application. The class constraint reflects the practical limit on the number of applications a server can run simultaneously.
We develop an efficient semi-online color set algorithm that achieves a good demand satisfaction ratio and saves energy by reducing the number of servers used when the load is low. Experiment results demonstrate that our system can improve the throughput over an open source implementation of the Amazon EC2 auto scaling system and restore the normal QoS five times as fast during flash crowds. Large scale simulations demonstrate that our algorithm is extremely scalable and can support a large number of applications, an order of magnitude improvement over traditional application placement algorithms in enterprise environments.
1.2 INTRODUCTION
One of the often cited benefits of cloud computing service is the resource elasticity: a business customer can scale up and down its resource usage as needed without upfront capital investment or long term commitment. The Amazon EC2 service, for example, allows users to buy as many virtual machine (VM) instances as they want and operate them much like physical hardware. However, the users still need to decide how many resources are necessary and for how long. We believe many Internet applications can benefit from an auto scaling property where their resource usage can be scaled up and down automatically by the cloud service provider. A user only needs to upload the application onto a single server in the cloud, and the cloud service will replicate the application onto more or fewer servers as its demand comes and goes. The users are charged only for what they actually use, the so-called “pay as you go” model.
The typical architecture of data center servers for Internet applications consists of a load balancing switch, a set of application servers, and a set of backend storage servers. The front end switch is typically a Layer 7 switch which parses application level information in Web requests and forwards them to the servers with the corresponding applications running. The switch sometimes runs in a redundant pair for fault tolerance. Each application can run on multiple server machines and the set of their running instances are often managed by some clustering software such as WebLogic.
Each server machine can host multiple applications. The applications store their state information in the backend storage servers. It is important that the applications themselves are stateless so that they can be replicated safely. The storage servers may also become overloaded, but the focus of this work is on the application tier. The Google AppEngine service, for example, requires that the applications be structured in such a two tier architecture and uses the BigTable as its scalable storage solution. A detailed comparison with AppEngine is deferred to Section 7 so that sufficient background can be established. Some distributed data processing applications cannot be mapped into such a tiered architecture easily and thus are not the target of this work. We believe our architecture is representative of a large set of Internet services hosted in the cloud computing environment.
Even though the cloud computing model is sometimes advocated as providing infinite capacity on demand, the capacity of data centers in the real world is finite. The illusion of infinite capacity in the cloud is provided through statistical multiplexing. When a large number of applications experience their peak demand around the same time, the available resources in the cloud can become constrained and some of the demand may not be satisfied. We define the demand satisfaction ratio as the percentage of application demand that is satisfied successfully. The amount of computing capacity available to an application is limited by the placement of its running instances on the servers.
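Restating the definition above as a formula (the wording of the symbols is chosen here only for readability):

\[
\text{demand satisfaction ratio} = \frac{\text{application demand that is satisfied}}{\text{total application demand}} \times 100\%
\]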
The more instances an application has and the more powerful the underlying servers are, the higher the potential capacity for satisfying the application demand. On the other hand, when the demand of the applications is low, it is important to conserve energy by reducing the number of servers used. Various studies have found that the cost of electricity is a major portion of the operation cost of large data centers. At the same time, the average server utilization in many Internet data centers is very low: real world estimates range from 5% to 20%. Moreover, work has found that the most effective way to conserve energy is to turn the whole server off. The application placement problem is essential to achieving a high demand satisfaction ratio without wasting energy.
In this paper, we present a system that provides automatic scaling for Internet applications in the cloud environment. Our contributions include the following.
- We summarize the automatic scaling problem in the cloud environment, and model it as a modified Class Constrained Bin Packing (CCBP) problem where each server is a bin and each class represents an application. We develop an innovative auto scaling algorithm to solve the problem and present a rigorous analysis of its quality with provable bounds. Compared to the existing Bin Packing solutions, we creatively support item departure which can effectively avoid the frequent placement changes caused by repacking.
- We support green computing by adjusting the placement of application instances adaptively and putting idle machines into the standby mode. Experiments and simulations show that our algorithm is highly efficient and scalable which can achieve high demand satisfaction ratio, low placement change frequency, short request response time, and good energy saving.
- We build a real cloud computing system which supports our auto scaling algorithm. We compare the performance of our system with an open source implementation of the Amazon EC2 auto scaling system in a testbed of 30 Dell Power Edge blade servers. Experiments show that our system can restore the normal QoS five times as fast when a flash crowd happens.
- We use a fast restart technique based on virtual machine (VM) suspend and resume that reduces the application start up time dramatically for Internet services.
1.3 LITERATURE SURVEY
AUTHOR AND PUBLICATION: C. Tang, M. Steinder, M. Spreitzer, and G. Pacifici, “A SCALABLE APPLICATION PLACEMENT CONTROLLER FOR ENTERPRISE DATA CENTERS,” in Proc. Int. World Wide Web Conf. (WWW’07), May 2007, pp. 331–340.
EXPLANATION:
Given a set of machines and a set of Web applications with dynamically changing demands, an online application placement controller decides how many instances to run for each application and where to put them, while observing all kinds of resource constraints. This NP hard problem has real usage in commercial middleware products. Existing approximation algorithms for this problem can scale to at most a few hundred machines, and may produce placement solutions that are far from optimal when system resources are tight. In this paper, we propose a new algorithm that can produce within 30 seconds high-quality solutions for hard placement problems with thousands of machines and thousands of applications. This scalability is crucial for dynamic resource provisioning in large-scale enterprise data centers. Our algorithm allows multiple applications to share a single machine, and strives to maximize the total satisfied application demand, to minimize the number of application starts and stops, and to balance the load across machines. Compared with existing state-of-the-art algorithms, for systems with 100 machines or less, our algorithm is up to 134 times faster, reduces application starts and stops by up to 97%, and produces placement solutions that satisfy up to 25% more application demands. Our algorithm has been implemented and adopted in a leading commercial middleware product for managing the performance of Web applications.
AUTHOR AND PUBLICATION: C. Adam and R. Stadler, “SERVICE MIDDLEWARE FOR SELF-MANAGING LARGE-SCALE SYSTEMS,” IEEE Trans. Netw. Serv. Manage., vol. 4, no. 3, pp. 50–64, Dec. 2007.
EXPLANATION:
Resource management poses particular challenges in large-scale systems, such as server clusters that simultaneously process requests from a large number of clients. A resource management scheme for such systems must scale both in the number of cluster nodes and in the number of applications the cluster supports. Current solutions do not exhibit both of these properties at the same time. Many are centralized, which limits their scalability in terms of the number of nodes, or they are decentralized but rely on replicated directories, which also reduces their ability to scale. In this paper, we propose novel solutions to request routing and application placement, two key mechanisms in a scalable resource management scheme.
Our solution to request routing is based on selective update propagation, which ensures that the control load on a cluster node is independent of the system size. Application placement is approached in a decentralized manner, by using a distributed algorithm that maximizes resource utilization and allows for service differentiation under overload. The paper demonstrates how the above solutions can be integrated into an overall design for a peer-to-peer management middleware that exhibits properties of self-organization. Through complexity analysis and simulation, we show to which extent the system design is scalable. We have built a prototype using accepted technologies and have evaluated it using a standard benchmark. The testbed measurements show that the implementation, within the parameter range tested, operates efficiently, quickly adapts to a changing environment and allows for effective service differentiation by a system administrator.
AUTHOR AND PUBLICATION: J. Famaey, W. D. Cock, T. Wauters, F. D. Turck, B. Dhoedt, and P. Demeester, “A LATENCY-AWARE ALGORITHM FOR DYNAMIC SERVICE PLACEMENT IN LARGE-SCALE OVERLAYS,” in Proc. IFIP/IEEE Int. Conf. Symp. Integrat. Netw. Manage. (IM’09), 2009, pp. 414–421.
EXPLANATION:
A generic and self-managing service hosting infrastructure provides a means to offer a large variety of services to users across the Internet. Such an infrastructure provides mechanisms to automatically allocate resources to services, discover the location of these services, and route client requests to a suitable service instance. In this paper we propose a dynamic and latency-aware algorithm for assigning resources to services. Additionally, the proposed service hosting architecture and its protocols to support the service placement algorithm are described in detail. Extensive simulations were performed to compare the solution of our latency-aware algorithm to the latency-unaware variant, in terms of system efficiency and scalability.
AUTHOR AND PUBLICATION: E. Caron, L. Rodero-Merino, F. Desprez, and A.Muresan, “AUTOSCALING, LOAD BALANCING AND MONITORING IN COMMERCIAL AND OPENSOURCE CLOUDS,” INRIA, Rapport de recherche RR-7857, Feb. 2012.
EXPLANATION:
Nowadays, because of the increased use of the Internet, the associated resources are increasing rapidly, resulting in the generation of high workloads. To provide reliable service to clients with QoS, a load balancing mechanism is necessary in the cloud environment; to prevent the system from overloading and crashing, an autoscaling mechanism must also be provided according to the application and the incoming user traffic. The load balancing mechanism distributes the load among one or more nodes of the cloud system; for an efficient service model, an autoscaling feature is also enabled with the load balancer to handle the excess load. Autoscaling scales the platform up and down dynamically according to the clients' incoming traffic, which saves money and physical resources. Latency based routing is a new concept in cloud computing which provides load balancing to global clients based on DNS latency, by mapping the domain name system (DNS) through different hosted zones, and which provides load balancing based on the geographical service region. To achieve the above, we use public cloud services such as Amazon's EC2 and ELB. This research is divided into four parts: i) load balancing, ii) auto scaling, iii) latency based routing, and iv) resource monitoring. While discussing each topic in detail, we implement and test the individual services while providing load from the external software tool PuTTY, and we produce results for efficient load balancing.
CHAPTER 2
2.0 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM:
Existing algorithms for the online and offline class-constrained bin packing problem are motivated by applications in the data-placement problem for video-on-demand servers and by applications in the cutting and packing area. For the online problem, lower bounds are provided for any bounded space algorithm, together with an algorithm for the unbounded version with a low approximation factor.
For the offline problem, practical approximation algorithms exist for two special cases of the problem, with conditions already considered in the literature: when all items have the same size, and the parameterized version of the problem. Several tests have been performed with these practical algorithms. For the instances considered, which represent practical ones, the algorithms produced optimal solutions for CCBP in the special case where the number of different classes of the input instance is bounded by a constant.
Therefore, in order to solve our problem, we modified the CCBP model to support the “minimize the placement change frequency” goal and provide a new enhanced semi-online approximation algorithm to solve it in the next section. Note that the equations above are just a formal presentation of the goals and constraints of our problem.
2.1.1 DISADVANTAGES:
The automatic scaling problem in the cloud environment is modeled as a modified Class Constrained Bin Packing (CCBP) problem where each server is a bin and each class represents an application.
In the traditional bin packing problem, a series of items of different sizes need to be packed into a minimum number of bins. The class constrained version of this problem divides the items into classes or colors.
Each bin has capacity v and can accommodate items from at most c distinct classes. It is “class constrained” because the class diversity of items packed into the same bin is constrained. The goal is to pack the items into a minimum number of bins.
A limitation of the existing algorithms is the lack of support for item departure, which is essential to maintaining good performance in a cloud computing environment where the resource demands of Internet applications can vary dynamically.
2.2 PROPOSED SYSTEM:
We develop an efficient semi-online color set algorithm that achieves a good demand satisfaction ratio and saves energy by reducing the number of servers used. We label each class of items with a color and organize them into color sets as they arrive in the input sequence. The number of distinct colors in a color set is at most c (i.e., the maximum number of distinct classes in a bin). This ensures that items in a color set can always be packed into the same bin without violating the class constraint. The packing is still subject to the capacity constraint of the bin. All color sets contain exactly c colors except the last one which may contain fewer colors. Items from different color sets are packed independently.
A greedy algorithm is used to pack items within each color set: the items are packed into the current bin until the capacity is reached. Then the next bin is opened for packing. Thus each color set has at most one unfilled (i.e., non-full) bin. Note that a full bin may contain fewer than c colors. When a new item from a specific color set arrives, it is packed into the corresponding unfilled bin. If all bins of that color set are full, then a new bin is opened to accommodate the item. The load increase of an application is modeled as the arrival of items with the corresponding color. A naive algorithm is to always pack the item into the unfilled bin if there is one. If the unfilled bin does not contain that color already, then a new color is added into the bin.
We allocate the new colors to the unfilled sets first using the following add_new_colors procedure.
Procedure add_new_colors:
Sort the list of unfilled color sets in descending order of their cardinality. Use a greedy algorithm to add the new colors into those sets according to their positions in the list.
If we run out of the new colors before filling up all but the last unfilled sets, use the consolidate_unfilled_sets procedure below to consolidate the remaining unfilled sets until there is only one left.
If there are still new colors left after filling up all unfilled sets in the system, we partition the remaining new colors into additional color sets using a greedy algorithm.
The consolidate_unfilled_sets procedure below consolidates unfilled sets in the system until there is only one left.
Procedure consolidate_unfilled_sets:
Sort the list of unfilled color sets in descending order of their cardinality. Use the last set in the list (with the fewest colors) to fill the first set in the list (with the most colors) through the fill procedure below. Remove the resulting full set or empty set from the list.
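To make the color set idea concrete, the following is a minimal, self-contained sketch; the capacity and class-constraint values, the unit-size items, and all names are assumptions for illustration only, and departure and consolidation are deliberately omitted.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ColorSetPackingSketch {
    static final int CAPACITY = 10;    // assumed bin (server) capacity v, in unit-size items
    static final int MAX_COLORS = 3;   // assumed class constraint c per bin

    // A bin (server) holds unit-size items of at most MAX_COLORS distinct colors.
    static class Bin {
        int load;
        Set<String> colors = new HashSet<String>();
    }

    private final List<Bin> bins = new ArrayList<Bin>();

    // Greedy packing inside one color set: fill the current bin, then open a new one.
    void pack(String color) {
        Bin current = bins.isEmpty() ? null : bins.get(bins.size() - 1);
        if (current == null || current.load == CAPACITY) {
            current = new Bin();                      // all existing bins are full
            bins.add(current);
        }
        current.colors.add(color);                    // the set has at most c colors, so the
        current.load++;                               // class constraint cannot be violated
    }

    public static void main(String[] args) {
        ColorSetPackingSketch colorSet = new ColorSetPackingSketch();
        String[] apps = { "app1", "app2", "app3" };   // one color per application
        for (int i = 0; i < 25; i++) {                // 25 load units arriving
            colorSet.pack(apps[i % apps.length]);
        }
        System.out.println("bins (servers) used: " + colorSet.bins.size());   // prints 3
    }
}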
2.2.1 ADVANTAGES:
We support green computing by adjusting the placement of application instances adaptively and putting idle machines into the standby mode. Experiments and simulations show that our algorithm is highly efficient and scalable which can achieve high demand satisfaction ratio, low placement change frequency, short request response time, and good energy saving.
We build a real cloud computing system which supports our auto scaling algorithm. We compare the performance of our system with an open source implementation of the Amazon EC2 auto scaling system in a testbed of 30 Dell PowerEdge blade servers.
Experiments show that our system can restore the normal QoS five times as fast when a flash crowd happens. We use a fast restart technique based on virtual machine (VM) suspend and resume that reduces the application start up time dramatically for Internet services.
2.3 HARDWARE & SOFTWARE REQUIREMENTS:
2.3.1 HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
2.3.2 SOFTWARE REQUIREMENTS:
- Operating System : Windows XP or Win7
- Front End : JAVA JDK 1.7
- Back End : MYSQL Server
- Server : Apache Tomcat Server
- Script : JSP Script
- Document : MS-Office 2007
CHAPTER 3
3.0 SYSTEM DESIGN:
Data Flow Diagram / Use Case Diagram / Flow Diagram:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on these data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- DFD is also known as bubble chart. A DFD may be used to represent a system at any level of abstraction. DFD may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures, or devices that produce data. The physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
3.1 ARCHITECTURE DIAGRAM
3.2 DATAFLOW DIAGRAM
ADMIN:
USER:
UML DIAGRAMS:
3.2 USE CASE DIAGRAM:
ADMIN:
USER:
3.3 CLASS DIAGRAM:
3.4 SEQUENCE DIAGRAM:
ADMIN:
USER:
3.5 ACTIVITY DIAGRAM:
ADMIN:
USER:
CHAPTER 4
4.0 IMPLEMENTATION:
4.1 ALGORITHM:
Our algorithm belongs to the family of color set algorithms with significant modification to adapt to our problem. A detailed comparison with the existing algorithm is deferred to Section 7 so that sufficient background can be established. We label each class of items with a color and organize them into color sets as they arrive in the input sequence. The number of distinct colors in a color set is at most c (i.e., the maximum number of distinct classes in a bin). This ensures that items in a color set can always be packed into the same bin without violating the class constraint. The packing is still subject to the capacity constraint of the bin. All color sets contain exactly c colors except the last one which may contain fewer colors.
Items from different color sets are packed independently. A greedy algorithm is used to pack items within each color set: the items are packed into the current bin until the capacity is reached. Then the next bin is opened for packing. Thus each color set has at most one unfilled (i.e., non-full) bin. Note that a full bin may contain fewer than c colors. When a new item from a specific color set arrives, it is packed into the corresponding unfilled bin. If all bins of that color set are full, then a new bin is opened to accommodate the item.
Our basic idea is to fill up the unfilled sets (except the last one) while minimizing its impact on the existing color assignment. We first check if there are any pending requests to add new colors into the system. If there are, we allocate the new colors to the unfilled sets first using the following add_new_colors procedure.
Procedure add_new_colors:
Sort the list of unfilled color sets in descending order of their cardinality. Use a greedy algorithm to add the new colors into those sets according to their positions in the list. If we run out of the new colors before filling up all but the last unfilled sets, use the consolidate_unfilled_sets procedure below to consolidate the remaining unfilled sets until there is only one left.
If there are still new colors left after filling up all unfilled sets in the system, we partition the remaining new colors into additional color sets using a greedy algorithm. The consolidate_unfilled_sets procedure below consolidates unfilled sets in the system until there is only one left.
Procedure consolidate_unfilled_sets:
Sort the list of unfilled color sets in descending order of their cardinality. Use the last set in the list (with the fewest colors) to fill the first set in the list (with the most colors) through the fill procedure below.
Remove the resulting full set or empty set from the list.
Repeat the previous step until there is only one unfilled set left in the list. The fill procedure below uses the colors in a source set to fill a destination set.
Procedure fill(source, destination):
Sort the list of colors in the source set in ascending order of their numbers of items.
Add the first color in the list (with the fewest items) into the destination set. Use the “item departure” operation in the source set and the “item arrival” operation in the destination set to move all items of that color from the source to the destination. Then remove that color from the list. Repeat the above step until either the source set becomes empty or the destination set becomes full.
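The fill step can be illustrated with the small sketch below; the constant c, the map representation of a color set, and the application names are assumptions, and the underlying bin repacking is ignored.

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FillSketch {
    static final int C = 4;   // assumed class constraint: max distinct colors per color set

    // A color set is modeled as a map from color (application) to its item count.
    static void fill(final Map<String, Integer> source, Map<String, Integer> destination) {
        List<String> order = new ArrayList<String>(source.keySet());
        Collections.sort(order, new Comparator<String>() {      // fewest items first
            public int compare(String a, String b) {
                return source.get(a) - source.get(b);
            }
        });
        for (String color : order) {
            if (destination.size() >= C) {
                break;                                          // destination is now full
            }
            int items = source.remove(color);                   // "item departure" at the source
            destination.put(color, items);                      // "item arrival" at the destination
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new HashMap<String, Integer>();
        source.put("app1", 2);
        source.put("app2", 5);
        Map<String, Integer> destination = new HashMap<String, Integer>();
        destination.put("app3", 1);
        destination.put("app4", 3);
        fill(source, destination);
        System.out.println("source=" + source + " destination=" + destination);
    }
}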
4.2 MODULES:
USER MODULES:
LOCAL NODE MANAGER (LNM):
APPLICATION LOAD INCREASE:
APPLICATION LOAD DECREASE:
AUTOMATIC SCALING RESOURCES:
APPROXIMATION RATIO:
4.3 MODULE DESCRIPTION:
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase and business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of fund that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not have a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, as only minimal or null changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
The aspect of study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, instead must accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system is working according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the goal will be successfully achieved. Inadequate testing or non-testing leads to errors that may not appear until many months later. This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of the system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best programs are worthless if they do not produce the correct outputs.
5.2.1 UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile and test data correctly and tie in properly with other programs. Achieving an error free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logical. A syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error messages generated by the computer. For logic errors, the programmer must examine the output carefully.
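Purely as an illustration of exercising one logical unit in isolation (JUnit is not mandated anywhere in this report), such a test might look like:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CalculatorTest {

    // Trivial class under test, defined inline only to keep the example self-contained.
    static class Calculator {
        int add(int a, int b) {
            return a + b;
        }
    }

    @Test
    public void addReturnsSumOfOperands() {
        assertEquals(5, new Calculator().add(2, 3));   // a logic check, not just a syntax check
    }
}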
5.2.2 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework as it displays all users available in the group. | The result after execution should give the accurate result. |
5.2.3 NON-FUNCTIONAL TESTING:
The non-functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing is used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.2.4 LOAD TESTING:
An important tool for implementing system tests is a Load generator. A Load generator is essential for testing quality requirements such as performance and stress. A load can be a real load, that is, the system can be put under test to real usage by having actual telephone users connected to it. They will generate test input data for system test.
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.2.5 PERFORMANCE TESTING:
Performance tests are utilized in order to determine the widely defined performance of the software system such as execution time associated with various parts of the code, response time and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
Description | Expected result |
This is required to assure that an application performs adequately, having the capability to handle many peers, delivering its results in expected time and using an acceptable level of resources, and it is an aspect of operational management. | Should handle large input values, and produce accurate results in an expected time. |
5.2.6 RELIABILITY TESTING:
The software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and it is being ensured in this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. It forms a part of the software quality control team.
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server an alternate server should take over the job. |
5.2.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity and confidentiality of the system data and services. Users/Clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure it should not be connected in the framework. |
Check whether group keys in a tree are shared by all peers. | The peers should know group key in the same group. |
5.2.8 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases that exercise the inner structure of the software to be tested.
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.2.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or code. The contents of the box are hidden, and the stimulated software should produce the desired results.
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in data structures or external database access. | Database updates and retrievals must be performed correctly. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All of the above system testing strategies are carried out during development, as documentation and institutionalization of the proposed goals and related policies are essential.
CHAPTER 6
6.0 SOFTWARE DESCRIPTION:
6.1 JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
- Simple
- Architecture neutral
- Object oriented
- Portable
- Distributed
- High performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
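As a small illustration of the compile-then-interpret cycle described above, the class below (the file name Hello.java and the printed text are only examples) is compiled once into byte codes with the javac compiler and then interpreted by the Java VM on any platform:

// Compiling this file with "javac Hello.java" produces Hello.class, a
// platform-independent byte code file; "java Hello" then lets the Java VM
// interpret those byte codes on any operating system.
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from the Java VM");
    }
}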
6.2 THE JAVA PLATFORM:
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
- The Java Virtual Machine (Java VM)
- The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights the functionality that some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that, once compiled, runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
6.3 WHAT CAN JAVA TECHNOLOGY DO?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet.
A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
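The following is a minimal sketch of such a servlet, assuming the standard Servlet API (javax.servlet) is available on the web server's classpath; the class name and the response text are placeholders, not part of this project:

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HelloServlet extends HttpServlet {
    // Called by the Java Web server for each HTTP GET request mapped to this servlet.
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body>Hello from a servlet</body></html>");
    }
}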
How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:
- The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
- Applets: The set of conventions used by applets.
- Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
- Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
- Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
- Software components: Known as JavaBeans™, these can plug into existing component architectures.
- Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
- Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
6.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
- Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
- Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
- Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
- Develop programs more quickly: Development can be as much as twice as fast as writing the same program in C++. Why? You write fewer lines of code, and Java is a simpler programming language than C++.
- Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure Java™ Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
- Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
- Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
6.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you setup a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn’t change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn’t as efficient as talking directly to the native database interface. ODBC has had many detractors make the charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to those who said that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.
6.6 JDBC:
In an effort to set an independent database standard API for Java; Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.
6.7 JDBC Goals:
Few software packages are designed without goals in mind, and JDBC is no exception: its goals drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The design goals for JDBC are as follows:
SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This goal allows JDBC to use existing ODBC level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.
- Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.
- Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
- Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time; also, fewer errors appear at runtime.
- Keep the common cases simple
Because more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
Finally, we decided to proceed with the implementation using Java networking. For dynamically updating the cache table, we use an MS Access database.
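A minimal sketch of this arrangement is shown below, assuming the JDBC-ODBC bridge driver that ships with JDKs up to 1.7; the DSN name cacheDSN, the table cache_table and its columns are hypothetical placeholders for the project's actual names:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class CacheTableDemo {
    public static void main(String[] args) throws Exception {
        // Load the JDBC-ODBC bridge driver and connect to the MS Access DSN.
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
        Connection con = DriverManager.getConnection("jdbc:odbc:cacheDSN");
        try {
            // Update an entry in the (hypothetical) cache table.
            PreparedStatement ps =
                con.prepareStatement("UPDATE cache_table SET value = ? WHERE id = ?");
            ps.setString(1, "new value");
            ps.setInt(2, 1);
            ps.executeUpdate();

            // Read the table back to verify the change.
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery("SELECT id, value FROM cache_table");
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " -> " + rs.getString("value"));
            }
        } finally {
            con.close();
        }
    }
}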
6.8 NETWORKING TCP/IP STACK:
The TCP/IP stack is shorter than the OSI one:
TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.
IP datagrams:
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model – see later.
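The sketch below illustrates this connectionless model in Java, the language used elsewhere in this report; the host name and port number are arbitrary examples:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class UdpSendSketch {
    public static void main(String[] args) throws Exception {
        byte[] data = "hello".getBytes();
        DatagramSocket socket = new DatagramSocket();      // no connection is set up
        DatagramPacket packet = new DatagramPacket(
                data, data.length, InetAddress.getByName("localhost"), 5000);
        socket.send(packet);                               // delivery is not guaranteed
        socket.close();
    }
}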
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32 bit integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class D uses all 32.
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is currently on one sub network and uses 10-bit addressing, allowing 1024 different hosts.
Host address:
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.
Total address:
The 32 bit address is usually written as 4 integers separated by dots.
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server, you send it to the port for that service of the host that it is running on. This is not location transparency! Certain of these ports are “well known”.
Sockets:
A socket is a data structure maintained by the system
to handle network connections. A socket is created using the call socket
. It returns an integer that is like a file
descriptor. In fact, under Windows, this handle can be used with Read File
and Write File
functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here “family” will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe – but the actual pipe does not yet exist.
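For comparison, the Java counterparts of this call are sketched below: a Socket object gives a TCP (connection-oriented) endpoint and a DatagramSocket gives a UDP one. Only standard java.net classes are used; no particular host or port is assumed.

import java.net.DatagramSocket;
import java.net.Socket;

public class SocketSketch {
    public static void main(String[] args) throws Exception {
        Socket tcp = new Socket();                  // unconnected TCP endpoint, like socket(AF_INET, SOCK_STREAM, 0)
        DatagramSocket udp = new DatagramSocket();  // UDP endpoint, like socket(AF_INET, SOCK_DGRAM, 0)
        tcp.close();
        udp.close();
    }
}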
6.9 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart’s extensive feature set includes:
A consistent and well-documented API, supporting a wide range of chart types;
A flexible design that is easy to extend, and targets both server-side and client-side applications;
Support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG);
JFreeChart is “open source” or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
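As a small, hedged example of the API (written against the JFreeChart 1.0.x class names, which is an assumption about the version used), the sketch below builds a pie chart from made-up values and saves it as a PNG file:

import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.data.general.DefaultPieDataset;

public class ChartSketch {
    public static void main(String[] args) throws Exception {
        DefaultPieDataset dataset = new DefaultPieDataset();
        dataset.setValue("Category A", 70);    // illustrative values only
        dataset.setValue("Category B", 30);
        JFreeChart chart = ChartFactory.createPieChart(
                "Sample chart", dataset, true, true, false);
        // Render the chart off-screen and write it as a 500x300 PNG image.
        ChartUtilities.saveChartAsPNG(new File("chart.png"), chart, 500, 300);
    }
}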
6.9.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: Sourcing freely redistributable vector outlines for the countries of the world, states/provinces in particular countries (USA in particular, but also other areas);
Creating an appropriate dataset interface (plus a default implementation) and a renderer, and integrating these with the existing XYPlot class in JFreeChart; testing, documenting, testing some more, documenting some more.
6.9.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts — to display a separate control that shows a small version of ALL the time series data, with a sliding “view” rectangle that allows you to select the subset of the time series data to display in the main chart.
6.9.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) that can be delivered easily via both Java Web Start and an applet.
6.9.4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of the properties that can be set for charts. Extend (or reimplement) this mechanism to provide greater end-user control over the appearance of the charts.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
8.1 CONCLUSION
We presented the design and implementation of a system that can scale the number of application instances up and down automatically based on demand. We developed a color set algorithm to decide the application placement and the load distribution. Our system achieves a high satisfaction ratio of application demand even when the load is very high. It saves energy by reducing the number of running instances when the load is low.
There are several directions for future work. Some cloud service providers may provide multiple levels of services to their customers. When resources become tight, they may want to give their premium customers a higher demand satisfaction ratio than other customers. In the future, we plan to extend our system to support such differentiated services while also considering fairness when allocating resources across the applications. We mentioned in the paper that we can divide multiple generations of hardware in a data center into “equivalence classes” and run our algorithm within each class.
Our future work is to develop an efficient algorithm to distribute incoming requests among the set of equivalence classes and to balance the load across those server clusters adaptively. As analyzed in the paper, CCBP works well when the aggregate load of applications in a color set is high. Another direction for future work is to extend the algorithm to pack applications with complementary bottleneck resources together, e.g., to co-locate a CPU-intensive application with a memory-intensive one so that different dimensions of server resources can be adequately utilized.
CHAPTER 9
9.1 REFERENCES
- C. Tang, M. Steinder, M. Spreitzer, and G. Pacifici, “A scalable application placement controller for enterprise data centers,” in Proc. Int. World Wide Web Conf. (WWW’07), May 2007, pp. 331–340.
- C. Adam and R. Stadler, “Service middleware for self-managing large-scale systems,” IEEE Trans. Netw. Serv. Manage., vol. 4, no. 3, pp. 50–64, Dec. 2007.
- J. Famaey, W. D. Cock, T. Wauters, F. D. Turck, B. Dhoedt, and P. Demeester, “A latency-aware algorithm for dynamic service placement in large-scale overlays,” in Proc. IFIP/IEEE Int. Conf. Symp. Integrat. Netw. Manage. (IM’09), 2009, pp. 414–421.
- E. Caron, L. Rodero-Merino, F. Desprez, and A. Muresan, “Auto-scaling, load balancing and monitoring in commercial and open-source clouds,” INRIA, Rapport de recherche RR-7857, Feb. 2012.
- A. Karve, T. Kimbrel, G. Pacifici, M. Spreitzer, M. Steinder, M. Sviridenko, and A. Tantawi, “Dynamic placement for clustered web applications,” in Proc. Int. World Wide Web Conf. (WWW’06), May 2006, pp. 595–604.
- D. Magenheimer, “Transcendent memory: A new approach to managing RAM in a virtualized environment,” in Proc. Linux Symp., 2009, pp. 191–200.
- E. C. Xavier and F. K. Miyazawa, “The class constrained bin packing problem with applications to video-on-demand,” Theor. Comput. Sci., vol. 393, no. 1–3, pp. 240–259, 2008.
A Study on False Channel Condition Reporting Attacks in Wireless Networks
A STUDY ON FALSE CHANNEL CONDITION REPORTING ATTACKS IN WIRELESS NETWORKS
By
A
PROJECT REPORT
Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree
Of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
APRIL 2015
BONAFIDE CERTIFICATE
Certified that this project report titled “A STUDY ON FALSE CHANNEL CONDITION REPORTING ATTACKS IN WIRELESS NETWORKS” is the bonafide work of Mr. _____________ who carried out the research under my supervision. Certified further, that to the best of my knowledge the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
Signature of the Guide Signature of the H.O.D
Name Name
CHAPTER 1
- ABSTRACT:
Wireless networking protocols are increasingly being designed to exploit a user’s measured channel condition; we call such protocols channel-aware. Each user reports the measured channel condition to a manager of wireless resources and a channel-aware protocol uses these reports to determine how resources are allocated to users. In a channel-aware protocol, each user’s reported channel condition affects the performance of every other user. The deployment of channel-aware protocols increases the risks posed by false channel-condition feedback.
We study what happens in the presence of an attacker that falsely reports its channel condition. We perform case studies on channel-aware network protocols to understand how an attack can use false feedback and how much the attack can affect network performance. The results of the case studies show that we need a secure channel condition estimation algorithm to fundamentally defend against the channel-condition misreporting attack. We design such an algorithm and evaluate our algorithm through analysis and simulation. Our evaluation quantifies the effect of our algorithm on system performance as well as the security and the performance of our algorithm.
- INTRODUCTION
Many protocols in modern wireless networks treat a link’s channel condition information as a protocol input parameter; we call such protocols channel-aware. Examples include cooperative relaying network architectures, efficient ad hoc network routing metrics, and opportunistic schedulers. While work on channel-aware protocols has mainly focused on how channel condition information can be used to more efficiently utilize wireless resources, security aspects of channel-aware protocols have only recently been studied. These works on security of channel-aware protocols revealed new threats in specific network environments by simulation or measurement.
However, understanding the effect of possible attacks across varied network environments is still an open area for study. In particular, we consider the effect of a user equipment’s reporting false channel condition. This issue is partially addressed in the work of Racic et al. in a limited network setting. They consider a particular scheduler in a cellular network with handover process and propose a secure handover algorithm. In contrast, we reveal the possible effects of false channel condition reporting in various channel-aware network protocols and propose a primitive defense mechanism that provides secure channel condition estimation.
Our contributions are:
• We analyze specific attack mechanisms and evaluate the effects of misreporting channel condition on various channel-aware wireless network protocols including cooperative relaying protocols, routing metrics in wireless ad-hoc network and opportunistic schedulers.
• We propose a secure channel condition estimation algorithm that can be used to construct a secure channel-aware protocol in single-hop settings.
• We analyze our algorithm in the respects of performance and security, and we perform a simulation study to understand the impact of our algorithm on system performance.
The false channel condition reporting attack that we introduce in this paper is difficult to identify by existing mechanisms, since our attack is mostly protocol compliant; only the channel-condition measurement mechanism needs to be modified. Our attack can thus be performed using modified user equipment legitimately registered to a network.
1.3 LITERATURE SURVEY
EXPLOITING AND DEFENDING OPPORTUNISTIC SCHEDULING IN CELLULAR DATA NETWORKS
PUBLICATION: R. Racic, D. Ma, H. Chen, and X. Liu, IEEE Trans. Mobile Comput., vol. 9, no. 5, pp. 609–620, May 2010.
Third Generation (3G) cellular networks take advantage of time-varying and location-dependent channel conditions of mobile users to provide broadband services. Under fairness and QoS constraints, they use opportunistic scheduling to efficiently utilize the available spectrum. Opportunistic scheduling algorithms rely on the collaboration among all mobile users to achieve their design objectives. However, we demonstrate that rogue cellular devices can exploit vulnerabilities in popular opportunistic scheduling algorithms, such as Proportional Fair (PF) and Temporal Fair (TF), to usurp the majority of time slots in 3G networks. Our simulations show that under realistic conditions, only five rogue devices per 50-user cell can capture up to 95 percent of the time slots, and can cause 2-second end-to-end inter-packet transmission delay on VoIP applications for every user in the same cell, rendering VoIP applications useless. To defend against this attack, we propose strengthening the PF and TF schedulers and a robust handoff scheme.
ON THE VULNERABILITY OF THE PROPORTIONAL FAIRNESS SCHEDULER TO RETRANSMISSION ATTACKS
PUBLICATION: U. Ben-Porat, A. Bremler-Barr, H. Levy, and B. Plattner, in Proc. IEEE INFOCOM, Shanghai, China, Apr. 2011, pp. 1431–1439.
Channel-aware schedulers of modern wireless networks – such as the popular Proportional Fairness Scheduler (PFS) – improve throughput performance by exploiting channel fluctuations while maintaining fairness among the users. In order to simplify the analysis, PFS was introduced and vastly investigated in a model where frame losses do not occur, which is of course not the case in practical wireless networks. Recent studies focused on the efficiency of various implementations of PFS in a realistic model where frame losses can occur. In this work we show that the common straightforward adaptation of PFS to frame losses exposes the system to a malicious attack (which can alternatively be caused by malfunctioning user equipment) that can drastically degrade the performance of innocent users. We analyze the factors behind the vulnerability of the system and propose a modification of PFS designed for the frame loss model which is resilient to such a malicious attack while maintaining the fairness properties of the original PFS.
A MEASUREMENT STUDY OF SCHEDULER-BASED ATTACKS IN 3G WIRELESS NETWORKS
PUBLICATION: S. Bali, S. Machiraju, H. Zang, and V. Frost, in Proc. PAM, Berlin, Germany, 2007.
Though high-speed (3G) wide-area wireless networks have been rapidly proliferating, little is known about the robustness and security properties of these networks. In this paper, we make initial steps towards understanding these properties by studying Proportional Fair (PF), the scheduling algorithm used on the downlinks of these networks. We find that the fairness-ensuring mechanism of PF can be easily corrupted by a malicious user to monopolize the wireless channel, thereby starving other users. Using extensive experiments on commercial and laboratory-based CDMA networks, we demonstrate this vulnerability and quantify the resulting performance impact. We find that delay jitter can be increased by up to 1 second and TCP throughput can be reduced by as much as 25–30% by a single malicious user. Based on our results, we argue for the need to use a more robust scheduling algorithm and outline one such algorithm.
CHAPTER 2
2.0 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM:
Many protocols in modern wireless networks treat a link’s channel condition information as a protocol input parameter; we call such protocols channel-aware. Examples include cooperative relaying network architectures, efficient ad hoc network routing metrics, and opportunistic schedulers. While work on channel-aware protocols has mainly focused on how channel condition information can be used to more efficiently utilize wireless resources, security aspects of channel-aware protocols have only recently been studied. These works on security of channel-aware protocols revealed new threats in specific network environments by simulation or measurement. However, understanding the effect of possible attacks across varied network environments is still an open area for study.
2.1.1 DISADVANTAGES:
- Difficult to guarantee QoS in MANETs due to their unique features including user mobility, channel variance errors, and limited bandwidth.
- Although these protocols can increase the QoS of the MANETs to a certain extent, they suffer from invalid reservation and race condition problems.
2.2 PROPOSED SYSTEM:
We introduce our attack concept and perform case studies to quantify the attack effects on specific channel-aware network protocols. Depending on the deployed PHY-layer technologies (e.g., OFDM), a system can utilize conditions for subchannels to perform more efficient frequency-selective scheduling. Our work can apply to this case by handling each subchannel’s condition information separately. However, for clarity of presentation, we consider a single channel between network participants in this paper.
We can easily implement the false channel condition reporting attack by modifying only the subcomponent that reports channel condition. This subcomponent of user equipment can be implemented in hardware or software. One recent trend in user equipment implementation is to increasingly move functionality from hardware to software for adaptable configuration of general hardware. The increasing software control of user equipment makes the false channel condition reporting attack increasingly practical.
2.2.1 ADVANTAGES:
- We analyze specific attack mechanisms and evaluate the effects of misreporting channel condition on various channel-aware wireless network protocols including cooperative relaying protocols, routing metrics in wireless ad-hoc network and opportunistic schedulers.
- We propose a secure channel condition estimation algorithm that can be used to construct a secure channel-aware protocol in single-hop settings.
- We analyze our algorithm in the respects of performance and security, and we perform a simulation study to understand the impact of our algorithm on system performance.
2.3 HARDWARE & SOFTWARE REQUIREMENTS:
2.3.1 HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
2.3.2 SOFTWARE REQUIREMENTS:
- Operating System : Windows XP or Win 7
- Front End : Java JDK 1.7
- Back End : MS-ACCESS
- Tools : Netbeans IDE 7
- Document : MS-Office 2007
CHAPTER 3
3.0 SYSTEM DESIGN:
Data Flow Diagram / Use Case Diagram / Flow Diagram:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on these data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- DFD is also known as bubble chart. A DFD may be used to represent a system at any level of abstraction. DFD may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures or devices that produce data. The physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
MODELING RULES:
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
3.1 ARCHITECTURE DIAGRAM:
3.2 DATAFLOW DIAGRAM:
UML DIAGRAMS:
3.3 USE CASE DIAGRAM:
3.4 CLASS DIAGRAM:
3.5 SEQUENCE DIAGRAM:
3.6 ACTIVITY DIAGRAM:
CHAPTER 4
4.0 IMPLEMENTATION:
We performed a simulation study to evaluate the overclaiming attack’s effect on the normal users’ performance in a cooperative relaying environment. We quantify this effect using the ns-2 simulator patched with the EURANE UMTS system simulator. Our simulated network consists of one base station serving four users. The base station sends 11 Mbps of Constant Bit Rate (CBR) traffic to each user. There is one attacker who may falsely report its channel condition to the base station. We represent channel condition using the Channel Quality Indicator (CQI) defined in the 3GPP standard.
The same equation and block size are used for a case study of opportunistic scheduler presented in Section 2.2.3. We assume that one victim is close to the attacker so that when the victim experiences poor channel condition, the victim can use the attacker as a relaying node. The other two normal users get packets directly from the base station and do not participate in the cooperative relaying protocol. The victim and the two normal users honestly report their channel condition. The channel for the attacker and two normal users uses a shadowing plus Rayleigh model of a moving node 100m away from the base station with velocity of 3km/h. We vary the victim’s distance to the base station from 100m to 500m to see the effect of the victim’s channel condition on the performance degradation.
The attacker’s goal in this simulation is to reduce the victim’s throughput. The attacker can adopt two approaches. In the conservative approach, the attacker does not forward packets for the victim without falsely reporting its channel condition. In the aggressive approach, the attacker overclaims its channel condition so that the attacker can increase its probability of relaying packets for the victim.
Our simulations do not consider the overhead that an actual relaying protocol might incur in finding a new relaying node due to channel condition variation since such overhead is not related to the effect of attack. We assume that each transmission uses an orthogonal carrier so that transmissions do not interfere with each other.
Our simulations do not implement a relay discovery protocol; rather, we compare the attacker and victim CQI, and use the link with better CQI value to transmit to the victim. This relaying scheme is an example; a system operator may choose a different scheme. However, the scheme that we chose is good for exploiting increased diversity to optimize throughput. We ran each of our simulations for 100 simulated seconds.
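The following sketch captures only the selection rule just described (not the full relaying protocol): the base station uses the direct link unless the candidate relay reports a strictly better CQI. The CQI values are illustrative and the method names are our own.

public class RelaySelection {
    // Returns true when the packet should be routed through the relay node.
    static boolean useRelay(int victimCqi, int relayCqi) {
        return relayCqi > victimCqi;
    }

    public static void main(String[] args) {
        int victimCqi = 7;    // CQI honestly reported by the victim
        int relayCqi = 22;    // CQI reported by the candidate relay (may be overclaimed)
        System.out.println(useRelay(victimCqi, relayCqi)
                ? "Route via relay" : "Use direct link");
    }
}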
4.1 ALGORITHM
In this section, we evaluate the performance and the security of our algorithm. Firstly, we analyze the performance of our algorithm according to algorithm parameters. This analysis can be used for parameter design guidelines. This analysis result is compared to simulation results. Secondly, we analyze the security of our algorithm. In this analysis, we show how much a brute-force attacker can be successful in guessing the value included in a challenge according to algorithm parameters. This analysis can be used for understanding tradeoff between security and system overhead. Thirdly, we integrate our algorithm into a network simulator and evaluate the effect of our algorithm on the system performance. We show that our algorithm securely and effectively estimates channel condition through most of its parameter space.
We used identical parameters, except we replaced the static channel with various variable channel models. We measure the throughput of normal users under the scheduling policies of MAX-SINR, PF and MAX-SINR with our algorithm. In MAX-SINR with our algorithm, a base station does not use the reported CQI level to determine the user with the best channel condition in a given time slot. Instead, the base station uses the CQI level estimated by our algorithm. In the case study of the opportunistic scheduler, our observation was that the PF scheduler prevented attackers from stealing throughput. Hence, we concluded that PF was a good candidate for defending against the false channel condition reporting attack. However, our simulation results for the system performance show that MAX-SINR with our algorithm can achieve higher throughput than the PF scheduler in most cases. Because the performance of our algorithm depends on the channel characteristics, the throughput of normal users under MAX-SINR with our algorithm also depends on the channel.
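The contrast between the two schedulers can be sketched as below: MAX-SINR picks the user with the best reported instantaneous rate, while PF picks the user with the largest ratio of instantaneous rate to its own average throughput, which limits how much an overclaiming user can gain. The rates and averages are illustrative values, not simulation output.

public class SchedulerSketch {
    // MAX-SINR: schedule the user with the highest reported instantaneous rate.
    static int maxSinr(double[] instantRate) {
        int best = 0;
        for (int i = 1; i < instantRate.length; i++)
            if (instantRate[i] > instantRate[best]) best = i;
        return best;
    }

    // Proportional Fair: schedule the user with the highest rate-to-average ratio.
    static int proportionalFair(double[] instantRate, double[] avgThroughput) {
        int best = 0;
        for (int i = 1; i < instantRate.length; i++)
            if (instantRate[i] / avgThroughput[i]
                    > instantRate[best] / avgThroughput[best]) best = i;
        return best;
    }

    public static void main(String[] args) {
        double[] rate = {1.2, 0.8, 5.0};   // user 2 reports (perhaps falsely) a very high rate
        double[] avg  = {1.0, 1.0, 4.5};   // but user 2 already has high average throughput
        System.out.println("MAX-SINR schedules user " + maxSinr(rate));
        System.out.println("PF schedules user " + proportionalFair(rate, avg));
    }
}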
4.2 MODULE DESCRIPTION:
SERVER CLIENT MODULE:
NETWORK SECURITY:
ATTACK MODEL:
COOPERATIVE RELAYING:
EFFICIENT ROUTING METRICS:
PERFORMANCE ANALYSIS
4.3 MODULES DESCRIPTION:
SERVER CLIENT MODULE:
Client-server computing or networking is a distributed application architecture that partitions tasks or workloads between service providers (servers) and service requesters, called clients. Often clients and servers operate over a computer network on separate hardware. A server machine is a high-performance host that runs one or more server programs which share their resources with clients. A client does not share any of its resources; clients therefore initiate communication sessions with servers, which await (listen for) incoming requests.
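A minimal sketch of the server side of this module is shown below: the server listens on a port and serves each accepted client on its own thread. The port number and reply text are placeholders rather than the project's actual values.

import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class SimpleServer {
    public static void main(String[] args) throws Exception {
        ServerSocket listener = new ServerSocket(9090);    // await incoming requests
        while (true) {
            final Socket client = listener.accept();       // one session per client
            new Thread(new Runnable() {
                public void run() {
                    try {
                        PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                        out.println("Hello from the server");   // placeholder reply
                        client.close();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}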
NETWORK SECURITY:
Network-accessible decoy resources may be deployed in a network as surveillance and early-warning tools, since they are not normally accessed for legitimate purposes. Techniques used by attackers attempting to compromise these decoy resources are studied during and after an attack in order to keep an eye on new exploitation techniques. Such analysis may be used to further tighten the security of the actual network being protected. Data forwarding can also direct an attacker’s attention away from legitimate servers: the decoy encourages attackers to spend their time and energy on it while distracting their attention from the data on the real server. Like the decoy server, such a resource is set up with intentional vulnerabilities; its purpose is to invite attacks so that the attacker’s methods can be studied and that information can be used to increase network security.
ATTACK MODEL:
We introduce our attack concept and perform case studies to quantify the attack effects on specific channel-aware network protocols. We evaluate the effect of falsely reported channel condition under three types of channel-aware protocols: cooperative relaying protocols in hybrid networks, efficient routing metrics in wireless ad hoc networks, and opportunistic schedulers in high-speed wireless networks. For each protocol, we suggest possible attack mechanisms and quantify the effectiveness of the attack. We show that we can defend against some attacks using existing algorithms, and that other attacks require new security mechanisms. Each following case study has the same presentation format. First, we briefly explain the protocols. Then, we discuss effective attack scenarios for each protocol. Finally, we use simulations to evaluate the effect of each attack scenario.
COOPERATIVE RELAYING:
In a mobile wireless network, mobile nodes can experience different channel conditions depending on their different locations. When a node experiences a channel condition that is too poor to receive packets from a source node, a third node may have a good channel condition to both the source and the intended destination. Cooperative relaying network architectures help a node that has poor channel condition to route its packet through a node with a good channel condition, thus improving system throughput.
A cooperative relaying protocol must distribute channel condition information for each candidate path, find the most appropriate relay path, and provide incentives to motivate nodes to forward packets for other nodes. Specifically, in UCAN user equipment has two wireless adaptors, one High Data Rate (HDR) cellular interface and one IEEE 802.11 interface. The HDR interface is used for communication with a base station and the IEEE 802.11 interface is used for peer-to-peer communication with other user equipment in a network.
EFFICIENT ROUTING METRICS:
Routing protocols in ad hoc networks: a wireless ad hoc network supports communication between nodes without the need for centralized infrastructure such as base stations or access points. To deliver packets to destinations outside a source node’s transmission range, the source employs the help of intermediate nodes to forward each packet to its destination. Routing protocols in a wireless ad hoc network discover routes between nodes. When there are multiple valid routes from a source to a destination, a routing protocol needs to choose among the valid routes. A routing metric is a value associated with a route that represents the desirability of the route. A typical metric in the seminal routing protocols is minimum hop count. The rationale behind the metric of minimum hop count is that a route with fewer hops allows a packet to be delivered with a smaller number of transmissions.
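The minimum-hop-count metric can be sketched as follows; routes are represented simply as lists of node identifiers, and the node names are made up for illustration.

import java.util.Arrays;
import java.util.List;

public class MinHopMetric {
    // Each route is the list of nodes traversed between source and destination.
    static List<String> selectRoute(List<List<String>> validRoutes) {
        List<String> best = validRoutes.get(0);
        for (List<String> route : validRoutes) {
            if (route.size() < best.size()) best = route;   // fewer hops wins
        }
        return best;
    }

    public static void main(String[] args) {
        List<List<String>> routes = Arrays.asList(
                Arrays.asList("A", "B", "C"),
                Arrays.asList("A", "D"));
        System.out.println("Selected route: " + selectRoute(routes));
    }
}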
PERFORMANCE ANALYSIS:
Our analysis assumes that the channel condition does not change. Though this assumption does not hold in a mobile environment, the purpose of our analysis is not to capture every detail of the real world but to verify our simulator. For the evaluation in a realistic environment, we perform simulations with channel models that consider variable channel conditions. In the analysis, the challenge size and Psref(i) are assumed to be the same for different challenges for easy comparison of performance. The equations in our analysis do not assume the same values of challenge size and Psref(i); however, with different values of challenge size and Psref(i), it is not easy to understand the parameters’ effect on the performance. In this analysis, we assume that the challenges are authenticated.
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase and business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, as only minimal or no changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
This aspect of the study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but must instead accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system is working according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the goal will be successfully achieved. Inadequate testing, or no testing at all, leads to errors that may not appear until many months later.
This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best programs are worthless if they do not produce the correct outputs.
5.2.1 UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile and test data correctly and tie in properly with other programs. Achieving an error-free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logical. A syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error messages generated by the computer. For logic errors, the programmer must examine the output carefully.
5.1.2 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that the personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems; a small sketch of this comparison is given after the table below.
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework as it displays all users available in the group. | The result after execution should give the accurate result. |
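The sketch below illustrates the expected-versus-actual comparison described above; the method under test (add) and the expected value are placeholders for a real application function and its desk-calculated result.

public class FunctionalTestSketch {
    static int add(int a, int b) {
        return a + b;        // stands in for a real application function
    }

    public static void main(String[] args) {
        int expected = 5;    // desk-calculated value
        int actual = add(2, 3);
        if (actual == expected) {
            System.out.println("PASS: output matches the expected value");
        } else {
            System.out.println("FAIL: expected " + expected + " but got " + actual);
        }
    }
}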
5.1. 3 NON-FUNCTIONAL TESTING:
Non-functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing is used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.1.4 LOAD TESTING:
An important tool for implementing system tests is a Load generator. A Load generator is essential for testing quality requirements such as performance and stress. A load can be a real load, that is, the system can be put under test to real usage by having actual telephone users connected to it. They will generate test input data for system test.
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.1.5 PERFORMANCE TESTING:
Performance tests are utilized in order to determine the widely defined performance of the software system such as execution time associated with various parts of the code, response time and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
Description | Expected result |
This is required to assure that an application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values and produce accurate results in the expected time. |
5.1.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and it is ensured in this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. This activity forms a part of the work of the software quality control team.
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.1.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity and confidentiality of the system data and services. Users/Clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure, it should not be connected to the framework. |
Check whether group keys in a tree are shared by all peers. | The peers should know group key in the same group. |
5.1.8 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases. White box testing focuses on the inner structure of the software under test.
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.1.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or code. The contents of the box are hidden, and the stimulated software should produce the desired results.
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in data structures or external database access. | Database updates and retrievals must be performed correctly. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All the above system testing strategies are carried out, as the development, documentation, and institutionalization of the proposed goals and related policies are essential.
CHAPTER 6
6.0 SOFTWARE DESCRIPTION:
6.1 JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
- Simple
- Architecture neutral
- Object oriented
- Portable
- Distributed
- High performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
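To make the compile-then-interpret cycle concrete, here is a minimal sketch (the class name HelloWorld is ours, chosen only for illustration): the program is compiled once into byte codes with javac and then run by any Java VM with java.
// Compile once:  javac HelloWorld.java   (produces HelloWorld.class, the platform-independent byte codes)
// Run anywhere:  java HelloWorld         (the Java VM interprets the byte codes on the local platform)
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello from the Java platform");
    }
}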
6.2 THE JAVA PLATFORM:
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
- The Java Virtual Machine (Java VM)
- The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights the functionality that some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that, once compiled, runs only on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
6.3 WHAT CAN JAVA TECHNOLOGY DO?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet.
A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
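As a rough illustration of this idea (a sketch only; the class name HelloServlet is hypothetical, and the javax.servlet API must be supplied by the Java web server that hosts the servlet), a minimal servlet looks like this:
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HelloServlet extends HttpServlet {
    // Called by the Java web server for each GET request mapped to this servlet.
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body>Hello from a servlet</body></html>");
    }
}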
How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:
- The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
- Applets: The set of conventions used by applets.
- Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
- Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
- Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
- Software components: Known as JavaBeans™, can plug into existing component architectures.
- Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
- Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
6.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
- Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
- Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
- Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
- Develop programs more quickly: Your development time may be as much as twice as fast versus writing the same program in C++. Why? You write fewer lines of code and it is a simpler programming language than C++.
- Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure Java™ Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
- Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
- Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
6.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you setup a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn’t change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn’t as efficient as talking directly to the native database interface. ODBC has had many detractors make the charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to those who said that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.
6.6 JDBC:
In an effort to set an independent database standard API for Java; Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.
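As a brief, hedged illustration of typical JDBC usage (the JDBC URL, user name, password, and the table and column names below are placeholders, and a matching JDBC driver must be on the classpath), a query generally follows the Connection/Statement/ResultSet pattern:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcQueryExample {
    public static void main(String[] args) throws Exception {
        // DriverManager picks the registered driver that understands this URL.
        // "jdbc:odbc:SalesFigures" is a placeholder using the JDBC-ODBC bridge style of URL.
        try (Connection con = DriverManager.getConnection("jdbc:odbc:SalesFigures", "user", "password");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM customers")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        }
    }
}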
6.7 JDBC Goals:
Few software packages are designed without goals in mind, and JDBC is no exception; its many goals drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The eight design goals for JDBC are as follows:
SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This goal allows JDBC to use existing ODBC level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.
- Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.
- Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
- Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time; also, fewer errors appear at runtime.
- Keep the common cases simple
Because more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
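For these common cases, a hedged sketch using PreparedStatement (the table and column names are placeholders, and con is assumed to be an open java.sql.Connection):
import java.sql.Connection;
import java.sql.PreparedStatement;

public class SimpleSqlCalls {
    static void insertCustomer(Connection con, int id, String name) throws Exception {
        // Parameter markers (?) keep the SQL simple and avoid manual string building.
        try (PreparedStatement ps =
                 con.prepareStatement("INSERT INTO customers (id, name) VALUES (?, ?)")) {
            ps.setInt(1, id);
            ps.setString(2, name);
            ps.executeUpdate();   // returns the number of affected rows
        }
    }
}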
Finally, we decided to proceed with the implementation using Java Networking.
And for dynamically updating the cache table we go for MS Access database.
Java has two things: a programming language and a platform.
Java is a high-level programming language that is all of the following:
- Simple
- Architecture-neutral
- Object-oriented
- Portable
- Distributed
- High-performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
Java is also unusual in that each Java program is both compiled and interpreted. With a compiler, you translate a Java program into an intermediate language called Java byte codes; this platform-independent code is then passed to and run on the computer by the interpreter.
Compilation happens just once; interpretation occurs each time the program is executed. The figure illustrates how this works.
6.7 NETWORKING TCP/IP STACK:
The TCP/IP stack is shorter than the OSI one:
TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.
IP datagram’s:
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model – see later.
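In Java, this connectionless service is exposed through DatagramSocket and DatagramPacket. A minimal send-only sketch (the host name and port number are placeholders):
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class UdpSendExample {
    public static void main(String[] args) throws Exception {
        byte[] data = "hello".getBytes("UTF-8");
        // Address the packet to a host and port; no connection is set up first.
        DatagramPacket packet = new DatagramPacket(
                data, data.length, InetAddress.getByName("localhost"), 9876);
        DatagramSocket socket = new DatagramSocket();
        socket.send(packet);   // connectionless: no handshake, no delivery guarantee
        socket.close();
    }
}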
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32 bit integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class D uses all 32.
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is currently on one sub network and uses 10-bit addressing, allowing 1024 different hosts.
Host address:
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.
Total address:
The 32 bit address is usually written as 4 integers separated by dots.
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server, you send it to the port for that service of the host that it is running on. This is not location transparency! Certain of these ports are “well known”.
Sockets:
A socket is a data structure maintained by the system to handle network connections. A socket is created using the call socket. It returns an integer that is like a file descriptor. In fact, under Windows, this handle can be used with the ReadFile and WriteFile functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here “family” will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe – but the actual pipe does not yet exist.
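In the Java programming language used for this project, the same idea is wrapped by java.net.Socket, which creates the TCP connection in a single step. A minimal client sketch (the host name and port number are placeholders):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class TcpClientExample {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket("localhost", 7);   // connect to a server listening on a known port
        PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
        BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        out.println("hello");               // write to the connection much like a file
        System.out.println(in.readLine());  // read the server's reply
        socket.close();
    }
}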
6.8 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart’s extensive feature set includes:
A consistent and well-documented API, supporting a wide range of chart types;
A flexible design that is easy to extend, and targets both server-side and client-side applications;
Support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG);
JFreeChart is “open source” or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
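A short, hedged example of using the library (the chart title and data values are made up for illustration, and the JFreeChart and JCommon jars are assumed to be on the classpath):
import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.data.general.DefaultPieDataset;

public class PieChartExample {
    public static void main(String[] args) throws Exception {
        DefaultPieDataset dataset = new DefaultPieDataset();
        dataset.setValue("Java", 60);    // illustrative values only
        dataset.setValue("Other", 40);
        JFreeChart chart = ChartFactory.createPieChart(
                "Sample Chart", dataset, true, true, false);   // legend, tooltips, no URLs
        ChartUtilities.saveChartAsPNG(new File("chart.png"), chart, 500, 300);
    }
}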
6.8.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: Sourcing freely redistributable vector outlines for the countries of the world, states/provinces in particular countries (USA in particular, but also other areas);
Creating an appropriate dataset interface (plus default implementation), a renderer, and integrating this with the existing XYPlot class in JFreeChart; Testing, documenting, testing some more, documenting some more.
6.8.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts — to display a separate control that shows a small version of ALL the time series data, with a sliding “view” rectangle that allows you to select the subset of the time series data to display in the main chart.
6.8.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) that can be delivered easily via both Java Web Start and an applet.
6.8.4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of the properties that can be set for charts. Extend (or reimplement) this mechanism to provide greater end-user control over the appearance of the charts.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
8.1 CONCLUSION
In this paper, we have studied the threat imposed by falsely reporting users’ channel condition. Through case studies for three different types of wireless network protocols, we show that in a cooperative relaying network and a network using ETX, a false reporting attack can significantly reduce the performance of other users. Our false channel-feedback attack can arise in any channel-aware protocol where a user reports its own channel condition. To counter such attacks, we propose a secure channel condition estimation algorithm to prevent the overclaiming attack. Through analysis and simulations, we show that with proper parameters, we can prevent the overclaiming attack.
CHAPTER 9
9.1 REFERENCES
- Dongho Kim and Yih-Chun Hu, “A Study on False Channel Condition Reporting Attacks in Wireless Networks”, IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 13, NO. 5, MAY 2014.
- H. Luo, R. Ramjee, P. Sinha, L. E. Li, and S. Lu, “Ucan: A unified cellular and ad-hoc network architecture,” in Proc. ACM MobiCom, San Diego, CA, USA, 2003, pp. 353–367.
- D. S. J. De Couto, D. Aguayo, J. Bicket, and R. Morris, “A highthroughput path metric for multi-hop wireless routing,” in Proc. ACM MobiCom, San Diego, CA, USA, 2003, pp. 134–146.
- R. Draves, J. Padhye, and B. Zill, “Comparison of routing metrics for static multi-hop wireless networks,” in Proc. ACM SIGCOMM, Portland, OR, USA, 2004, pp. 133–144.
- A. Jalali, R. Padovani, and R. Pankaj, “Data throughput of CDMA-HDR a high efficiency-high data rate personal communication wireless system,” in Proc. IEEE VTC, Tokyo, Japan, 2000, pp. 1854–1858.
- P. Viswanath, D. N. C. Tse, and R. Laroia, “Opportunistic beamforming using dumb antennas,” IEEE Trans. Inf. Theory, vol. 48, no. 6, pp. 1277–1294, Jun. 2002.
- R. Racic, D. Ma, H. Chen, and X. Liu, “Exploiting and defending opportunistic scheduling in cellular data networks,” IEEE Trans. Mobile Comput., vol. 9, no. 5, pp. 609–620, May 2010.
- U. Ben-Porat, A. Bremler-Barr, H. Levy, and B. Plattner, “On the vulnerability of the proportional fairness scheduler to retransmission attacks,” in Proc. IEEE INFOCOM, Shanghai, China, Apr. 2011, pp. 1431–1439.
- S. Bali, S. Machiraju, H. Zang, and V. Frost, “A measurement study of scheduler-based attacks in 3G wireless networks,” in Proc. PAM, Berlin, Germany, 2007.
A QoS-Oriented Distributed Routing Protocol for Hybrid Wireless Networks
A QOS-ORIENTED DISTRIBUTED ROUTING PROTOCOL FOR HYBRID
WIRELESS NETWORKS
By
A
PROJECT REPORT
Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree
Of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
APRIL 2015
BONAFIDE CERTIFICATE
Certified that this project report titled “A QOS-ORIENTED DISTRIBUTED ROUTING PROTOCOL FOR HYBRID WIRELESS NETWORKS” is the bonafide work of Mr. _____________Who carried out the research under my supervision Certified further, that to the best of my knowledge the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
Signature of the Guide Signature of the H.O.D
Name Name
CHAPTER 1
ABSTRACT:
As wireless communication gains popularity, significant research has been devoted to supporting real-time transmission with stringent Quality of Service (QoS) requirements for wireless applications. At the same time, a wireless hybrid network that integrates a mobile wireless ad hoc network (MANET) and a wireless infrastructure network has been proven to be a better alternative for the next generation wireless networks. By directly adopting resource reservation-based QoS routing for MANETs, hybrid networks inherit invalid reservation and race condition problems in MANETs. How to guarantee the QoS in hybrid networks remains an open problem.
In this paper, we propose a QoS-Oriented Distributed routing protocol (QOD) to enhance the QoS support capability of hybrid networks. Taking advantage of fewer transmission hops and anycast transmission features of the hybrid networks, QOD transforms the packet routing problem to a resource scheduling problem.
QOD incorporates five algorithms:
1) A QoS-guaranteed neighbor selection algorithm to meet the transmission delay requirement,
2) A distributed packet scheduling algorithm to further reduce transmission delay,
3) A mobility-based segment resizing algorithm that adaptively adjusts segment size according to node mobility in order to reduce transmission time,
4) A traffic redundant elimination algorithm to increase the transmission throughput, and
5) A data redundancy elimination-based transmission algorithm to eliminate the redundant data to further improve the transmission QoS.
Analytical and simulation results based on the random way-point model and the real human mobility model show that QOD can provide high QoS performance in terms of overhead, transmission delay, mobility-resilience, and scalability.
1.2 INTRODUCTION
The rapid development of wireless networks has stimulated numerous wireless applications that have been used in wide areas such as commerce, emergency services, military, education, and entertainment. The number of WiFi capable mobile devices including laptops and handheld devices (e.g., smartphone and tablet PC) has been increasing rapidly. For example, the number of wireless Internet users has tripled world-wide in the last three years, and the number of smartphone users in the US has increased from 92.8 million in 2011 to 121.4 million in 2012, and will reach around 207 million by 2017. Nowadays, people wish to watch videos, play games, watch TV, and hold long-distance conferences via wireless mobile devices “on the go.” Therefore, video streaming applications such as Qik, Flixwagon, and FaceTime on the infrastructure wireless networks have received increasing attention recently. These applications use an infrastructure to directly connect mobile users for video watching or interaction in real time. The widespread use of wireless and mobile devices and the increasing demand for mobile multimedia streaming services are leading to a promising near future where wireless multimedia services (e.g., mobile gaming, online TV, and online conferences) are widely deployed.
The emergence and the envisioned future of real-time and multimedia applications have stimulated the need for high Quality of Service (QoS) support in wireless and mobile networking environments. The QoS support reduces end-to-end transmission delay and enhances throughput to guarantee the seamless communication between mobile devices and wireless infrastructures.
At the same time, hybrid wireless networks (i.e., multihop cellular networks) have been proven to be a better network structure for the next generation wireless networks and can help to tackle the stringent end-to-end QoS requirements of different applications. Hybrid networks synergistically combine infrastructure networks and MANETs to leverage each other. Specifically, infrastructure networks improve the scalability of MANETs, while MANETs automatically establish self-organizing networks, extending the coverage of the infrastructure networks. In a vehicle opportunistic access network (an instance of hybrid networks), people in vehicles need to upload or download videos from remote Internet servers through access points (APs) (i.e., base stations) spreading out in a city.
Since it is unlikely that the base stations cover the entire city to maintain sufficiently strong signal everywhere to support an application requiring high link rates, the vehicles themselves can form a MANET to extend the coverage of the base stations, providing continuous network connections. How to guarantee the QoS in hybrid wireless networks with high mobility and fluctuating bandwidth still remains an open question. In the infrastructure wireless networks, QoS provision (e.g., IntServ, RSVP) has been proposed for QoS routing, which often requires node negotiation, admission control, resource reservation, and priority scheduling of packets.
However, it is more difficult to guarantee QoS in MANETs due to their unique features including user mobility, channel variance errors, and limited bandwidth. Thus, attempts to directly adapt the QoS solutions for infrastructure networks to MANETs generally do not have great success. Numerous reservation-based QoS routing protocols have been proposed for MANETs that create routes formed by nodes and links that reserve their resources to fulfill QoS requirements. Although these protocols can increase the QoS of the MANETs to a certain extent, they suffer from invalid reservation and race condition problems. The invalid reservation problem means that the reserved resources become useless if the data transmission path between a source node and a destination node breaks. The race condition problem means a double allocation of the same resource to two different QoS paths. However, little effort has been devoted to supporting QoS routing in hybrid networks. Most of the current works in hybrid networks focus on increasing network capacity or routing reliability but cannot provide QoS-guaranteed services. Direct adoption of the reservation-based QoS routing protocols of MANETs into hybrid networks inherits the invalid reservation and race condition problems.
In order to enhance the QoS support capability of hybrid networks, in this paper, we propose a QoS-Oriented Distributed routing protocol (QOD). Usually, a hybrid network has widespread base stations. The data transmission in hybrid networks has two features. First, an AP can be a source or a destination to any mobile node. Second, the number of transmission hops between a mobile node and an AP is small. The first feature allows a stream to have anycast transmission along multiple transmission paths to its destination through base stations, and the second feature enables a source node to connect to an AP through an intermediate node. Taking full advantage of the two features, QOD transforms the packet routing problem into a dynamic resource scheduling problem. Specifically, in QOD, if a source node is not within the transmission range of the AP, the source node selects nearby neighbors that can provide QoS services to forward its packets to base stations in a distributed manner. The source node schedules the packet streams to neighbors based on their queuing condition, channel condition, and mobility, aiming to reduce transmission time and increase network capacity. The neighbors then forward packets to base stations, which further forward packets to the destination. In this paper, we focus on the neighbor node selection for QoS-guaranteed transmission. QOD is the first work for QoS routing in hybrid networks.
1.3 LITERATURE SURVEY
QOS MULTICAST ROUTING BY USING MULTIPLE PATHS/TREES IN WIRELESSAD HOC NETWORKS
AUTHOR: H. Wu and X. Jia
PUBLISH: Ad Hoc Networks, vol. 5, pp. 600-612, 2009.
In this paper, we investigate the issues of QoS multicast routing in wireless ad hoc networks. Due to limited bandwidth of a wireless node, a QoS multicast call could often be blocked if there does not exist a single multicast tree that has the requested bandwidth, even though there is enough bandwidth in the system to support the call. In this paper, we propose a new multicast routing scheme by using multiple paths or multiple trees to meet the bandwidth requirement of a call. Three multicast routing strategies are studied, SPT (shortest path tree) based multiple-paths (SPTM), least cost tree based multiple-paths (LCTM) and multiple least cost trees (MLCT). The final routing tree(s) can meet the user’s QoS requirements such that the delay from the source to any destination node shall not exceed the required bound and the aggregate bandwidth of the paths or trees shall meet the bandwidth requirement of the call. Extensive simulations have been conducted to evaluate the performance of our three multicast routing strategies. The simulation results show that the new scheme improves the call success ratio and makes a better use of network resources.
QUALITY OF SERVICE PROVISIONING IN AD HOC WIRELESS NETWORKS: A SURVEY OF ISSUES AND SOLUTIONS
AUTHOR: T. Reddy, I. Karthigeyan, B. Manoj, and C. Murthy
PUBLISH: Ad Hoc Networks, vol. 4, no. 1, pp. 83-124, 2006.
An ad hoc wireless network (AWN) is a collection of mobile hosts forming a temporary network on the fly, without using any fixed infrastructure. Characteristics of AWNs such as lack of central coordination, mobility of hosts, dynamically varying network topology, and limited availability of resources make QoS provisioning very challenging in such networks. In this paper, we describe the issues and challenges in providing QoS for AWNs and review some of the QoS solutions proposed. We first provide a layer-wise classification of the existing QoS solutions, and then discuss each of these solutions.
QOS ROUTING BASED ON MULTI-CLASS NODES FOR MOBILE AD HOC NETWORKS
AUTHOR: X. Du, Ad Hoc Networks
PUBLISH: vol. 2, pp. 241-254, 2004.
Efficient routing is very important for Mobile Ad hoc Networks (MANETs). Most existing routing protocols consider homogeneous ad hoc networks, in which all nodes are identical, i.e., they have the same communication capabilities and characteristics. Although a homogeneous network model is simple and easy to analyze, it misses important characteristics of many realistic MANETs such as military battlefield networks. In addition, a homogeneous ad hoc network suffers from poor performance limits and scalability. In many ad hoc networks, multiple types of nodes do co-exist; and some nodes have larger transmission power, higher transmission data rate, better processing capability, and are more robust against bit errors and congestion than other nodes. Hence, a heterogeneous network model is more realistic and provides many advantages (e.g., leading to more efficient routing protocol design). In this paper, we present a new routing protocol called Multi-Class (MC) routing, which is specifically designed for heterogeneous MANETs. Moreover, we also design a new Medium Access Control (MAC) protocol for heterogeneous MANETs, which is more efficient than IEEE 802.11b. Extensive simulation results demonstrate that MC routing has very good performance and outperforms a popular routing protocol, the Zone Routing Protocol, in terms of reliability, scalability, route discovery latency, overhead, as well as packet delay and throughput.
PROVISIONING OF ADAPTABILITY TO VARIABLE TOPOLOGIES FOR ROUTING SCHEMES IN MANETS
AUTHOR: S. Jiang, Y. Liu, Y. Jiang, and Q. Yin,
PUBLISH: IEEE J. Selected Areas in Comm., vol. 22, no. 7, pp. 1347-1356, Sept. 2004.
Frequent changes in network topologies caused by mobility in mobile ad hoc networks (MANETs) impose great challenges to designing routing schemes for such networks. Various routing schemes each aiming at particular type of MANET (e.g., flat or clustered MANETs) with different mobility degrees (e.g., low, medium, and high mobility) have been proposed in the literature. However, since a mobile node should not be limited to operate in a particular MANET assumed by a routing scheme, an important issue is how to enable a mobile node to achieve routing performance as high as possible when it roams across different types of MANETs. To handle this issue, a quantity that can predict the link status for a time period in the future with the consideration of mobility is required. In this paper, we discuss such a quantity and investigate how well this quantity can be used by the link caching scheme in the dynamic source routing protocol to provide the adaptability to variable topologies caused by mobility through computer simulation in NS-2.
CHAPTER 2
2.0 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM:
Existing approaches for providing guaranteed services in the infrastructure networks are based on two models: integrated services (IntServ) and differentiated services (DiffServ) [42]. IntServ is a stateful model that uses resource reservation for individual flows, and uses admission control and a scheduler to maintain the QoS of traffic flows. In contrast, DiffServ is a stateless model that uses a coarse-grained, class-based mechanism for traffic management with a number of queuing and scheduling algorithms. Reservation-based QoS routing protocols have been proposed for MANETs that create routes formed by nodes and links that reserve their resources to fulfill QoS requirements, although these protocols can increase the QoS of MANETs only to a certain extent.
2.2 DISADVANTAGES:
- Cannot provide QoS-guaranteed services.
- Suffer from invalid reservation and race condition problems.
- The invalid reservation problem means that the reserved resources become useless when the data transmission path breaks, and the race condition problem means a double allocation of the same resource to two different QoS paths.
2.3 PROPOSED SYSTEM:
We propose a QoS-Oriented Distributed routing protocol (QOD). Usually, a hybrid network has widespread base stations.
The data transmission in hybrid networks has two features.
First, an AP can be a source or a destination to any mobile node. Second, the number of transmission hops between a mobile node and an AP is small. The first feature allows a stream to have anycast transmission along multiple transmission paths to its destination through base stations, and the second feature enables a source node to connect to an AP through an intermediate node. Taking full advantage of the two features, QOD transforms the packet routing problem into a dynamic resource scheduling problem. Specifically, in QOD, if a source node is not within the transmission range of the AP, a source node selects nearby neighbors that can provide QoS services to forward its packets to base stations in a distributed manner. The source node schedules the packet streams to neighbors based on their queuing condition, channel condition, and mobility, aiming to reduce transmission time and increase network capacity. The neighbors then forward packets to base stations, which further forward packets to the destination.
2.4 ADVANTAGES:
- QoS-guaranteed neighbor selection algorithm. The algorithm selects qualified neighbors and employs deadline-driven scheduling mechanism to guarantee QoS routing.
- Distributed packet scheduling algorithm. After qualified neighbors are identified, this algorithm schedules packet routing. It assigns earlier generated packets to forwarders with higher queuing delays, while assigns more recently generated packets to forwarders with lower queuing delays to reduce total transmission delay.
- Mobility-based segment resizing algorithm. The source node adaptively resizes each packet in its packet stream for each neighbor node according to the neighbor’s mobility in order to increase the scheduling feasibility of the packets from the source node.
- Soft-deadline based forwarding scheduling algorithm. In this algorithm, an intermediate node first forwards the packet with the least time allowed to wait before being forwarded out to achieve fairness in packet forwarding.
- Data redundancy elimination based transmission. Due to the broadcasting feature of the wireless networks, the APs and mobile nodes can overhear and cache packets. This algorithm eliminates the redundant data to improve the QoS of the packet transmission.
HARDWARE & SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
SOFTWARE REQUIREMENTS:
- Operating System : Windows XP
- Front End : Java JDK 1.7
- Document : MS-Office 2007
CHAPTER 3
SYSTEM DESIGN:
Data Flow Diagram / Use Case Diagram / Flow Diagram:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on these data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- DFD is also known as bubble chart. A DFD may be used to represent a system at any level of abstraction. DFD may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures or devices that produce data. The physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
3.1 SYSTEM ARCHITECTURE:
3.2 DATAFLOW DIAGRAM
SENSOR NODE:
MOBILE RELAY NODE:
SINK:
UML DIAGRAMS:
3.2 USE CASE DIAGRAM:
3.3 CLASS DIAGRAM:
3.4 SEQUENCE DIAGRAM:
3.5 ACTIVITY DIAGRAM:
CHAPTER 4
4.0 IMPLEMENTATION:
QOD ROUTING PROTOCOL:
Scheduling feasibility is the ability of a node to guarantee a packet to arrive at its destination within QoS requirements. As mentioned, when the QoS of the direct transmission between a source node and an AP cannot be guaranteed, the source node sends a request message to its neighbor nodes. After receiving a forward request from a source node, a neighbor node ni with space utility less than a threshold replies to the source node. The reply message contains information about available resources for checking packet scheduling feasibility (Section 2.4), the packet arrival interval Ta, the transmission delay TI→D, and the packet deadline Dp of the packets in each flow being forwarded by the neighbor for queuing delay estimation and distributed packet scheduling, and the node’s mobility speed for determining packet size. Based on this information, the source node chooses the replying neighbors that can guarantee the delay QoS of packet transmission to APs.
The selected neighbor nodes periodically report their statuses to the source node, which ensures their scheduling feasibility and locally schedules the packet stream to them. The individual packets are forwarded to the neighbor nodes that are scheduling feasible in a round-robin fashion from a longer delayed node to a shorter delayed node, aiming to reduce the entire packet transmission delay. Algorithm 1 shows the pseudocode for the QOD routing protocol executed by each node. The QOD distributed routing algorithm is developed based on the assumption that the neighboring nodes in the network have different channel utilities and workloads under the IEEE 802.11 protocol. Otherwise, there is no need for packet scheduling in routing, since all neighbors produce comparable delays for packet forwarding. Therefore, we analyze the difference in node channel utilities and workloads in a network with the IEEE 802.11 protocol in order to see whether the assumption holds true in practice.
4.1 ALGORITHM
The packets travel from different APs, which may lead to different packet transmission delays, resulting in jitter at the receiver side. The jitter problem can be solved by using a token bucket mechanism at the destination APs to shape the traffic flows. This technique is orthogonal to our study and its details are beyond the scope of this paper.
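A minimal token-bucket sketch of the kind of traffic shaping mentioned above (the rate and bucket capacity are illustrative parameters of our own choosing, not values from the paper):
public class TokenBucket {
    private final double ratePerMs;   // tokens added per millisecond (illustrative)
    private final double capacity;    // maximum burst size in tokens (illustrative)
    private double tokens;
    private long lastRefill;

    public TokenBucket(double ratePerMs, double capacity) {
        this.ratePerMs = ratePerMs;
        this.capacity = capacity;
        this.tokens = capacity;
        this.lastRefill = System.currentTimeMillis();
    }

    // Returns true if a packet "costing" packetSize tokens may be released now.
    public synchronized boolean tryConsume(double packetSize) {
        long now = System.currentTimeMillis();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * ratePerMs);   // refill
        lastRefill = now;
        if (tokens >= packetSize) {
            tokens -= packetSize;
            return true;
        }
        return false;   // hold the packet until enough tokens accumulate
    }
}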
4.2 MODULES:
HYBRID WIRELESS NETWORKS:
DISTRIBUTED PACKET SCHEDULING:
NEIGHBOR SELECTION ALGORITHM:
MOBILITY-BASED PACKET RESIZING:
4.3 MODULE DESCRIPTION:
HYBRID WIRELESS NETWORKS:
Hybrid wireless networks (i.e., multihop cellular networks) have been proven to be a better network structure for the next generation wireless networks and can help to tackle the stringent end-to end QoS requirements of different applications. Hybrid networks synergistically combine infrastructure networks and MANETs to leverage each other. Specifically, infrastructure networks improve the scalability of MANETs, while MANETs automatically establish self-organizing networks, extending the coverage of the infrastructure networks.
In a vehicle opportunistic access network (an instance of hybrid networks), people in vehicles need to upload or download videos from remote Internet servers through access points (APs) (i.e., base stations) spreading out in a city. Since it is unlikely that the base stations cover the entire city to maintain sufficiently strong signal everywhere to support an application requiring high link rates, the vehicles themselves can form a MANET to extend the coverage of the base stations, providing continuous network connections.
The QoS requirements mainly include end-to-end delay bound, which is essential for many applications with stringent real-time requirement. While throughput guarantee is also important, it is automatically guaranteed by bounding the transmission delay for a certain amount of packets. The source node conducts admission control to check whether there are enough resources to satisfy the requirements of QoS of the packet stream in the network model of a hybrid network. For example, when a source node n1 wants to upload files to an Internet server through APs, it can choose to send packets to the APs directly by itself or require its neighbor nodes n2, n3, or n4 to assist the packet transmission.
DISTRIBUTED PACKET SCHEDULING:
To ensure the QoS of the packet transmission and to determine how a source node assigns traffic to the intermediate nodes while preserving their scheduling feasibility, a distributed packet scheduling algorithm is proposed for packet routing in order to further reduce the stream transmission time. This algorithm assigns earlier generated packets to forwarders with higher queuing delays and scheduling feasibility, while it assigns more recently generated packets to forwarders with lower queuing delays and scheduling feasibility, so that the transmission delay of an entire packet stream can be reduced.
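A hedged sketch of this pairing idea (the class and field names are ours, not from the paper): the oldest packets are matched with the forwarders reporting the highest queuing delay, round-robin when there are more packets than forwarders.
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class PacketScheduler {
    static class Packet { long generatedAt; Packet(long t) { generatedAt = t; } }
    static class Forwarder { String id; double queuingDelayMs;
        Forwarder(String id, double d) { this.id = id; this.queuingDelayMs = d; } }

    // Returns, for each packet (oldest first), the id of the forwarder it is assigned to.
    static List<String> assign(List<Packet> packets, List<Forwarder> forwarders) {
        Collections.sort(packets, new Comparator<Packet>() {        // oldest packets first
            public int compare(Packet a, Packet b) { return Long.compare(a.generatedAt, b.generatedAt); }
        });
        Collections.sort(forwarders, new Comparator<Forwarder>() {  // highest queuing delay first
            public int compare(Forwarder a, Forwarder b) { return Double.compare(b.queuingDelayMs, a.queuingDelayMs); }
        });
        List<String> assignment = new ArrayList<String>();
        for (int i = 0; i < packets.size(); i++) {
            assignment.add(forwarders.get(i % forwarders.size()).id);   // round-robin pairing
        }
        return assignment;
    }
}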
NEIGHBOR SELECTION ALGORITHM:
Since short delay is the major real-time QoS requirement for traffic transmission, QOD incorporates the Earliest Deadline First (EDF) scheduling algorithm, a deadline-driven scheduling algorithm, for data traffic scheduling in intermediate nodes. In this algorithm, an intermediate node assigns the highest priority to the packet with the closest deadline and forwards the packet with the highest priority first. Let us use Sp(i) to denote the size of the packet stream from node ni, Wi to denote the bandwidth of node i, and Ta(i) to denote the packet arrival interval from node ni.
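A hedged sketch of EDF forwarding at an intermediate node (the names are ours): a priority queue ordered by deadline always releases the packet with the closest deadline first.
import java.util.Comparator;
import java.util.PriorityQueue;

public class EdfForwardingQueue {
    static class QueuedPacket {
        long deadlineMillis;
        byte[] payload;
        QueuedPacket(long deadlineMillis, byte[] payload) { this.deadlineMillis = deadlineMillis; this.payload = payload; }
    }

    // Orders packets so that the earliest (closest) deadline has the highest priority.
    private final PriorityQueue<QueuedPacket> queue = new PriorityQueue<QueuedPacket>(
            16, new Comparator<QueuedPacket>() {
                public int compare(QueuedPacket a, QueuedPacket b) {
                    return Long.compare(a.deadlineMillis, b.deadlineMillis);
                }
            });

    public void enqueue(QueuedPacket p) { queue.add(p); }

    // Forward the packet whose deadline is closest, or null if the queue is empty.
    public QueuedPacket nextToForward() { return queue.poll(); }
}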
MOBILITY-BASED PACKET RESIZING:
In a highly dynamic mobile wireless network, the transmission link between two nodes is frequently broken. The delay generated by packet retransmission degrades the QoS of the transmission of a packet flow. On the other hand, a node in a highly dynamic network has a higher probability of meeting different mobile nodes and APs, which is beneficial to resource scheduling. As (2) shows, the space utility of an intermediate node used for forwarding a packet p depends on the packet size, so reducing the packet size can increase the scheduling feasibility of an intermediate node and reduce the packet dropping probability. However, we cannot make the packet size too small, because that generates more packets to be transmitted, producing higher packet overhead. Based on this rationale, and taking advantage of the benefits of node mobility, we propose a mobility-based packet resizing algorithm for QOD in this section. The basic idea is that larger packets are assigned to lower-mobility intermediate nodes and smaller packets are assigned to higher-mobility intermediate nodes, which increases the number of QoS-guaranteed packet transmissions. Specifically, in QOD, as the mobility of a node increases, the size of a packet Sp sent from that node to its neighbor node decreases correspondingly.
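For illustration only, the sketch below applies a hypothetical resizing rule (it is not the paper's equation): packet size shrinks as the neighbor's mobility speed grows, but never below a minimum size that bounds the per-packet overhead.
public class MobilityPacketResizer {
    private final int maxPacketBytes;     // size used for a static (non-moving) neighbor
    private final int minPacketBytes;     // lower bound to limit per-packet overhead
    private final double mobilityFactor;  // hypothetical tuning constant, not from the paper

    public MobilityPacketResizer(int maxPacketBytes, int minPacketBytes, double mobilityFactor) {
        this.maxPacketBytes = maxPacketBytes;
        this.minPacketBytes = minPacketBytes;
        this.mobilityFactor = mobilityFactor;
    }

    // Illustrative rule: higher neighbor speed -> smaller packet, clamped to a minimum size.
    public int packetSizeFor(double neighborSpeedMetersPerSec) {
        int size = (int) (maxPacketBytes / (1.0 + mobilityFactor * neighborSpeedMetersPerSec));
        return Math.max(minPacketBytes, size);
    }
}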
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase and business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements, since only minimal or no changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
The aspect of this study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but must instead accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system works according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the goal will be successfully achieved. Inadequate or missing testing leads to errors that may not appear for many months. This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best programs are worthless if they do not produce the correct outputs.
5.2.1 UNIT TESTING:
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile, handle test data correctly, and tie in properly with other programs. Achieving an error-free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logic. A syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error messages generated by the computer. For logic errors, the programmer must examine the output carefully.
UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
5.1.3 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that the personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
FUNCTIONAL TESTING:
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework, as it displays all users available in the group. | The result after execution should give the accurate result. |
5.1.4 NON-FUNCTIONAL TESTING:
Non-functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing is used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.1.5 LOAD TESTING:
An important tool for implementing system tests is a load generator. A load generator is essential for testing quality requirements such as performance and stress. A load can be a real load; that is, the system can be put under real usage by having actual users connected to it, who will generate the test input data for the system test.
Load Testing
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.1.5 PERFORMANCE TESTING:
Performance tests are used to determine the broadly defined performance of the software system, such as the execution time associated with various parts of the code, response time, and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
PERFORMANCE TESTING:
Description | Expected result |
This is required to assure that the application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values and produce accurate results in the expected time. |
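To make the performance criterion above concrete, the following minimal Java sketch times a batch of concurrent requests. The number of simulated peers and the placeholder sleep standing in for a real request are assumptions for illustration only.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Minimal sketch of a timing harness for performance testing. The timed
// task is a placeholder sleep standing in for a real request.
public class PerformanceProbe {
    public static void main(String[] args) throws InterruptedException {
        final int clients = 50;                          // simulated concurrent peers
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        final CountDownLatch done = new CountDownLatch(clients);

        long start = System.nanoTime();
        for (int i = 0; i < clients; i++) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        Thread.sleep(20);                // placeholder for the real request
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                }
            });
        }
        done.await();                                    // wait for all requests to finish
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println(clients + " requests completed in " + elapsedMs + " ms");

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}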
5.2.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and this is what is ensured by this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. This activity forms part of the work of the software quality control team.
RELIABILITY TESTING:
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.2.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity and confidentiality of the system data and services. Users/Clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
SECURITY TESTING:
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure, it should not be connected to the framework. |
Check whether group keys in a tree are shared by all peers. | All peers in the same group should know the group key. |
5.2.8 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases. White box testing focuses on the inner structure of the software to be tested.
WHITE BOX TESTING:
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.2.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or the code. The contents of the box are hidden, and the software, when stimulated, should produce the desired results.
BLACK BOX TESTING:
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in data structures or external database access. | Database updates and retrievals must be performed correctly. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All the above system testing strategies are carried out during development, since documentation and institutionalization of the proposed goals and related policies are essential.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
CONCLUSION:
We propose a QoS oriented distributed routing protocol (QOD) for hybrid networks to provide QoS services in a highly dynamic scenario. Taking advantage of the unique features of hybrid networks, i.e., anycast transmission and short transmission hops, QOD transforms the packet routing problem to a packet scheduling problem. In QOD, a source node directly transmits packets to an AP if the direct transmission can guarantee the QoS of the traffic. Otherwise, the source node schedules the packets to a number of qualified neighbor nodes.
Specifically, QOD incorporates five algorithms. The QoS-guaranteed neighbor selection algorithm chooses qualified neighbors for packet forwarding. The distributed packet scheduling algorithm schedules the packet transmission to further reduce the packet transmission time. The mobility-based packet resizing algorithm resizes packets and assigns smaller packets to nodes with faster mobility to guarantee the routing QoS in a highly mobile environment.
The traffic redundant elimination-based transmission algorithm can further increase the transmission throughput. The soft-deadline-based forwarding scheduling achieves fairness in packet forwarding scheduling when some packets cannot be feasibly scheduled. Experimental results show that QOD can achieve high mobility-resilience, scalability, and contention reduction. In the future, we plan to evaluate the performance of QOD on a real testbed.
CHAPTER 9
REFERENCES:
[1] H. Wu and X. Jia, “QoS Multicast Routing by Using Multiple Paths/Trees in Wireless Ad Hoc Networks,” Ad Hoc Networks, vol. 5, pp. 600-612, 2009.
[2] T. Reddy, I. Karthigeyan, B. Manoj, and C. Murthy, “Quality of Service Provisioning in Ad Hoc Wireless Networks: A Survey of Issues and Solutions,” Ad Hoc Networks, vol. 4, no. 1, pp. 83-124, 2006.
[3] X. Du, “QoS Routing Based on Multi-Class Nodes for Mobile Ad Hoc Networks,” Ad Hoc Networks, vol. 2, pp. 241-254, 2004.
[4] S. Jiang, Y. Liu, Y. Jiang, and Q. Yin, “Provisioning of Adaptability to Variable Topologies for Routing Schemes in MANETs,” IEEE J. Selected Areas in Comm., vol. 22, no. 7, pp. 1347-1356, Sept. 2004.
[5] M. Conti, E. Gregori, and G. Maselli, “Reliable and Efficient Forwarding in Ad Hoc Networks,” Ad Hoc Networks, vol. 4, pp. 398-415, 2006.
[6] G. Chakrabarti and S. Kulkarni, “Load Balancing and Resource Reservation in Mobile Ad Hoc Networks,” Ad Hoc Networks, vol. 4, pp. 186-203, 2006.
[7] Z. Shen and J.P. Thomas, “Security and QoS Self-Optimization in Mobile Ad Hoc Networks,” IEEE Trans. Mobile Computing, vol. 7, pp. 1138-1151, Sept. 2008.
[8] S. Ibrahim, K. Sadek, W. Su, and R. Liu, “Cooperative Communications with Relay-Selection: When to Cooperate and Whom to Cooperate With?” IEEE Trans. Wireless Comm., vol. 7, no. 7, pp. 2814-2827, July 2008.
A Cocktail Approach for Travel Package Recommendation
In this paper, we propose a cocktail approach to generate the lists for personalized travel package recommendation. Furthermore, we extend the TAST model to the tourist-relation-area-season topic (TRAST) model for capturing the latent relationships among the tourists in each travel group. Finally, we evaluate the TAST model, the TRAST model, and the cocktail recommendation approach on real-world travel package data.
The TRAST model can be easily extended for computing relationships among more tourists; however, the computation cost will also go up. To simplify the problem, in this paper we only consider, at a time, two tourists in a travel group as a tourist pair for mining their relationships. With the TRAST model, all the tourists' travel preferences are represented by relationship distributions.
We can use their relationship distributions as features to cluster them, so as to put them into different travel groups. Thus, in this scenario, many clustering methods can be adopted. Since the choice of clustering algorithm is beyond the scope of this paper, in the experiments we use K-means, one of the most popular clustering algorithms.
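As a rough illustration of this grouping step, the sketch below runs a plain K-means over toy relationship distributions. It is a minimal sketch under assumed data (the relationship categories and vectors are invented) and is not the implementation used in the experiments.

import java.util.Arrays;
import java.util.Random;

// Minimal K-means sketch over tourists' relationship distributions.
// Each tourist is represented by a probability vector; k is the number
// of travel groups to form.
public class TouristKMeans {

    static int[] cluster(double[][] points, int k, int iterations) {
        int n = points.length, d = points[0].length;
        double[][] centers = new double[k][d];
        Random rnd = new Random(42);
        for (int c = 0; c < k; c++)                 // random initial centers (may coincide in this toy sketch)
            centers[c] = points[rnd.nextInt(n)].clone();

        int[] assign = new int[n];
        for (int it = 0; it < iterations; it++) {
            // assignment step: nearest center by squared Euclidean distance
            for (int i = 0; i < n; i++) {
                double best = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dist = 0;
                    for (int j = 0; j < d; j++) {
                        double diff = points[i][j] - centers[c][j];
                        dist += diff * diff;
                    }
                    if (dist < best) { best = dist; assign[i] = c; }
                }
            }
            // update step: recompute each center as the mean of its cluster
            double[][] sum = new double[k][d];
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {
                count[assign[i]]++;
                for (int j = 0; j < d; j++) sum[assign[i]][j] += points[i][j];
            }
            for (int c = 0; c < k; c++)
                if (count[c] > 0)
                    for (int j = 0; j < d; j++) centers[c][j] = sum[c][j] / count[c];
        }
        return assign;
    }

    public static void main(String[] args) {
        // toy relationship distributions (e.g., friend, family, colleague, lover)
        double[][] tourists = {
            {0.7, 0.1, 0.1, 0.1}, {0.6, 0.2, 0.1, 0.1},
            {0.1, 0.1, 0.1, 0.7}, {0.1, 0.2, 0.1, 0.6}
        };
        System.out.println(Arrays.toString(cluster(tourists, 2, 10)));
    }
}

Each entry of the printed assignment array gives the travel group index of the corresponding tourist.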
INTRODUCTION:
As an emerging trend, more and more travel companies provide online services. However, the rapid growth of online travel information imposes an increasing challenge for tourists, who have to choose from a large number of available travel packages to satisfy their personalized needs. Moreover, to increase profit, the travel companies have to understand the preferences of different tourists and offer more attractive packages. Therefore, the demand for intelligent travel services is expected to increase dramatically. Since recommender systems have been successfully applied to enhance the quality of service in a number of fields, it is a natural choice to provide travel package recommendations. Actually, recommendations for tourists have been studied before, and to the best of our knowledge, the first operative tourism recommender system was introduced by Delgado and Davidson. Despite the increasing interest in this field, the problem of leveraging unique features to distinguish personalized travel package recommendations from traditional recommender systems remains largely open.
Indeed, there are many technical and domain challenges inherent in designing and implementing an effective recommender system for personalized travel package recommendation. First, travel data are much fewer and sparser than traditional items, such as movies for recommendation, because the cost of a trip is much higher than that of watching a movie. Second, every travel package consists of many landscapes (places of interest and attractions) and thus has intrinsic, complex spatio-temporal relationships. For example, a travel package only includes the landscapes that are geographically co-located. Also, different travel packages are usually developed for different travel seasons.
Therefore, the landscapes in a travel package usually have spatial-temporal autocorrelations. Third, traditional recommender systems usually rely on explicit user ratings. However, for travel data, user ratings are usually not conveniently available. Finally, traditional items for recommendation usually have a long period of stable value, while the value of a travel package can easily depreciate over time, and a package usually only lasts for a certain period of time. The travel companies need to actively create new tour packages to replace the old ones based on the interests of the tourists. To address these challenges, in our preliminary work we proposed a cocktail approach to personalized travel package recommendation. Specifically, we first analyze the key characteristics of the existing travel packages. Along this line, travel time and travel destinations are divided into different seasons and areas. Then, we develop a tourist-area-season topic (TAST) model, which can represent travel packages and tourists by different topic distributions. In the TAST model, the extraction of topics is conditioned on both the tourists and the intrinsic features (i.e., locations, travel seasons) of the landscapes. As a result, the TAST model can well represent the content of the travel packages and the interests of the tourists. Based on this TAST model, a cocktail approach is developed for personalized travel package recommendation by considering some additional factors, including the seasonal behaviors of tourists, the prices of travel packages, and the cold-start problem of new packages.
Finally, the experimental results on real-world travel data show that the TAST model can effectively capture the unique characteristics of travel data and the cocktail recommendation approach performs much better than traditional techniques. In this paper, we further study some related topic models of the TAST model, and explain the corresponding travel package recommendation strategies based on them. Also, we propose the tourist-relation-area-season topic (TRAST) model, which helps understand the reasons why tourists form a travel group. This goes beyond personalized package recommendations and is helpful for capturing the latent relationships among the tourists in each travel group.
In addition, we conduct systematic experiments on the real-world data. These experiments not only demonstrate that the TRAST model can be used as an assessment for automatic travel group formation but also provide more insights into the TAST model and the cocktail recommendation approach. In summary, the contributions of the TAST model, the cocktail approaches, and the TRAST model for travel package recommendations are shown in Fig. 1, where each dashed rectangular box in the dashed circle identifies a travel group and the tourists in the same travel group are represented by the same icons.
LITERATURE SURVEY
COST-AWARE TRAVEL TOUR RECOMMENDATION
AUTHOR: Y. Ge et al.,
PUBLISH: Proc. 17th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining (SIGKDD ’11), pp. 983-991, 2011.
Recent years have witnessed an increased interest in recommender systems. Despite significant progress in this field, there still remain numerous avenues to explore. Indeed, this paper provides a study of exploiting online travel information for personalized travel package recommendation. A critical challenge along this line is to address the unique characteristics of travel data, which distinguish travel packages from traditional items for recommendation. To that end, in this paper, we first analyze the characteristics of the existing travel packages and develop a tourist-area-season topic (TAST) model. This TAST model can represent travel packages and tourists by different topic distributions, where the topic extraction is conditioned on both the tourists and the intrinsic features (i.e., locations, travel seasons) of the landscapes. Then, based on this topic model representation, we propose a cocktail approach to generate the lists for personalized travel package recommendation. Furthermore, we extend the TAST model to the tourist-relation-area-season topic (TRAST) model for capturing the latent relationships among the tourists in each travel group. Finally, we evaluate the TAST model, the TRAST model, and the cocktail recommendation approach on the real-world travel package data. Experimental results show that the TAST model can effectively capture the unique characteristics of the travel data and the cocktail approach is, thus, much more effective than traditional recommendation techniques for travel package recommendation. Also, by considering tourist relationships, the TRAST model can be used as an effective assessment for travel group formation.
FLDA: MATRIX FACTORIZATION THROUGH LATENT DIRICHLET ALLOCATION
AUTHOR: D. Agarwal and B. Chen
PUBLISH: Proc. Third ACM Int’l Conf. Web Search and Data Mining (WSDM ’10), pp. 91-100, 2010.
We propose fLDA, a novel matrix factorization method to predict ratings in recommender system applications where a “bag-of-words” representation for item meta-data is natural. Such scenarios are commonplace in web applications like content recommendation, ad targeting and web search where items are articles, ads and web pages respectively. Because of data sparseness, regularization is key to good predictive accuracy. Our method works by regularizing both user and item factors simultaneously through user features and the bag of words associated with each item. Specifically, each word in an item is associated with a discrete latent factor often referred to as the topic of the word; item topics are obtained by averaging topics across all words in an item. Then, user rating on an item is modeled as user’s affinity to the item’s topics where user affinity to topics (user factors) and topic assignments to words in items (item factors) are learned jointly in a supervised fashion. To avoid overfitting, user and item factors are regularized through Gaussian linear regression and Latent Dirichlet Allocation (LDA) priors respectively. We show our model is accurate, interpretable and handles both cold-start and warm-start scenarios seamlessly through a single model. The efficacy of our method is illustrated on benchmark datasets and a new dataset from Yahoo! Buzz where fLDA provides superior predictive accuracy in cold-start scenarios and is comparable to state-of-the-art methods in warm-start scenarios. As a by-product, fLDA also identifies interesting topics that explains user-item interactions. Our method also generalizes a recently proposed technique called supervised LDA (sLDA) to collaborative filtering applications. While sLDA estimates item topic vectors in a supervised fashion for a single regression, fLDA incorporates multiple regressions (one for each user) in estimating the item factors.
MAP-BASED INTERACTION WITH A CONVERSATIONAL MOBILE RECOMMENDER SYSTEM,
AUTHOR: D. Agarwal and B. Chen
PUBLISH: Proc. Second Int’l Conf. Mobile Ubiquitous Computing, Systems, Services and Technologies (UBICOMM ’08), pp. 212-218, 2008.
Recommender systems are information search and decision support tools used when there is an overwhelming set of options to consider or when the user lacks the domain-specific knowledge necessary to take autonomous decisions. They provide users with personalized recommendations adapted to their needs and preferences in a particular usage context. In this paper, we present an approach for integrating recommendation and electronic map technologies to build a map-based conversational mobile recommender system that can effectively and intuitively support users in finding their desired products and services. The results of our real-user study show that integrating map-based visualization and interaction in mobile recommender systems improves the system recommendation effectiveness and increases the user satisfaction.
GENERATING COMPARATIVE DESCRIPTIONS OF PLACES OF INTEREST IN THE TOURISM DOMAIN
AUTHOR: B.D. Carolis, N. Novielli, V.L. Plantamura, and E. Gentile
PUBLISH: Proc. Third ACM Conf. Recommender Systems (RecSys ’09), pp. 277-280, 2009.
When visiting cities as tourists, most of the time people do not make very detailed plans and, when choosing where to go and what to see, they tend to select the area with the largest number of interesting facilities. Therefore, it would be useful to support the user's choice with contextual information presentation, information clustering, and comparative explanations of places of potential interest in a given area. In this paper we illustrate how My Map, a mobile recommender system in the tourism domain, generates comparative descriptions to support users in making decisions about what to see among relevant objects of interest.
CHAPTER 2
EXISTING SYSTEM:
We first analyze the characteristics of the existing travel packages and develop a tourist-area-season topic (TAST) model, which can represent travel packages and tourists by different topic distributions. In the TAST model, the extraction of topics is conditioned on both the tourists and the intrinsic features (i.e., locations, travel seasons) of the landscapes.
As a result, the TAST model can well represent the content of the travel packages and the interests of the tourists. Based on this TAST model, a cocktail approach is developed for personalized travel package recommendation by considering some additional factors including the seasonal behaviors of tourists, the prices of travel packages, and the cold start problem of new packages. Finally, the experimental results on real-world travel data show that the TAST model can effectively capture the unique characteristics of travel data and the cocktail recommendation approach performs much better than traditional techniques.
DISADVANTAGES:
- The previous cocktail recommendation approach (Cocktail) is mainly based on the TAST model and the collaborative filtering method.
- Indeed, another possible cocktail approach is the content-based cocktail, and in the following, we call this method TAST Content.
- The main difference between TAST Content and Cocktail is that, in TAST Content, the content similarity between packages and tourists is used for ranking packages instead of the collaborative filtering interests of the tourists; thus, it may also suffer from the over-specialization problem.
PROPOSED SYSTEM:
We propose the tourist-relation-area-season topic (TRAST) model, which helps understand the reasons why tourists form a travel group. This goes beyond personalized package recommendations and is helpful for capturing the latent relationships among the tourists in each travel group. In addition, we conduct systematic experiments on the real-world data. These experiments not only demonstrate that the TRAST model can be used as an assessment for automatic travel group formation but also provide more insights into the TAST model and the cocktail recommendation approach.
Our contributions cover the cocktail approaches and the TRAST model for travel package recommendations. In the TRAST model, all the tourists' travel preferences are represented by relationship distributions. For a set of tourists who want to travel with the same package, we can use their relationship distributions as features to cluster them, so as to put them into different travel groups. Thus, in this scenario, many clustering methods can be adopted. Since the choice of clustering algorithm is beyond the scope of this paper, in the experiments we use K-means, one of the most popular clustering algorithms.
ADVANTAGES:
- The results of the season splitting and price segmentation,
- The understanding of the extracted topics,
- A recommendation performance comparison between Cocktail and benchmark methods,
- The evaluation of the TRAST model, and
- A brief discussion on recommendations for travel groups.
2.3 HARDWARE & SOFTWARE REQUIREMENTS:
2.3.1 HARDWARE REQUIREMENTS:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
2.3.2 SOFTWARE REQUIREMENTS:
- Operating System : Windows XP
- Front End : JAVA JDK 1.7/JSP
- Back End : MYSQL Server
CHAPTER 3
3.0 SYSTEM DESIGN:
SYSTEM ARCHITECTURE:
ARCHITECTURE DIAGRAM:
CHAPTER 4
DATA FLOW DIAGRAM:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components: the system processes, the data used by the processes, the external entities that interact with the system, and the information flows in the system.
- A DFD shows how information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- A DFD may be used to represent a system at any level of abstraction and may be partitioned into levels that represent increasing information flow and functional detail.
UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created, by the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form, UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing, constructing, and documenting the artifacts of software systems, as well as for business modeling and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software development process. The UML uses mostly graphical notations to express the design of software projects.
GOALS:
The Primary goals in the design of the UML are as follows:
- Provide users a ready-to-use, expressive visual modeling language so that they can develop and exchange meaningful models.
- Provide extensibility and specialization mechanisms to extend the core concepts.
- Be independent of particular programming languages and development processes.
- Provide a formal basis for understanding the modeling language.
- Encourage the growth of the OO tools market.
- Support higher-level development concepts such as collaborations, frameworks, patterns and components.
- Integrate best practices.
USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases. The main purpose of a use case diagram is to show what system functions are performed for which actor. Roles of the actors in the system can be depicted.
CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system’s classes, their attributes, operations (or methods), and the relationships among the classes. It explains which class contains information.
SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.
ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise activities and actions with support for choice, iteration and concurrency. In the Unified Modeling Language, activity diagrams can be used to describe the business and operational step-by-step workflows of components in a system. An activity diagram shows the overall flow of control.
MODULES:
COCKTAIL APPROACH
TRAVEL PACKAGE
TRAST MODEL
RECOMMENDATIONS
MODULES DESCRIPTION:
COCKTAIL APPROACH:
Our cocktail approach addresses the specific preferences of the tourist while he/she is planning a trip, such as the transportation preference. In other words, we focus on designing the recommendation algorithm to attract tourists before they make a travel decision rather than providing travel support in the on-tour stage. Thus, our approach may only be useful in some situations (e.g., email marketing). Also, when deploying this work for real-world services, it should be noted that the relevant (desirable) travel packages in the test set may be just a small fraction of the packages that are actually of interest to each tourist.
TRAVEL PACKAGE:
We aim to make personalized travel package recommendations for the tourists. Thus, the users are the tourists and the items are the existing packages, and we exploit a real-world travel data set provided by a travel company, selecting packages in such a way that each tourist has traveled at least two different packages.
First, the data are very sparse, and each tourist has only a few travel records. The extreme sparseness of the data leads to difficulties in using traditional recommendation techniques, such as collaborative filtering. For example, it is hard to find credible nearest neighbors for the tourists because there are very few co-traveled packages.
Second, the travel data has strong time dependence. The travel packages often have a life cycle along with the change to the business demand, i.e., they only last for a certain period. In contrast, most of the landscapes will still be active after the original package has been discarded. These landscapes can be used to form new packages together with some other landscapes. Thus, we can observe that the landscapes are more sustainable and important than the package itself.
Third, each landscape has some intrinsic features, such as its geographic location and the right travel seasons. Only the landscapes with similar spatial-temporal features are suitable for the same package; that is, the landscapes in one package have spatial-temporal autocorrelations and follow the first law of geography: everything is related to everything else, but nearby things are more related than distant things.
Fourth, the tourists will consider both time and financial costs before they accept a package. This is quite different from the traditional recommendations where the cost of an item is usually not a concern. Thus, it is very important to profile the tourists based on their interests as well as the time and the money they can afford. Since the package with a higher price often tends to have more time and vice versa, in this paper we only take the price factor into consideration.
Fifth, people often travel with their friends, family, or colleagues. Even when two tourists in the same travel group are totally strangers, there must be some reasons for the travel company to put them together. For instance, they may be of the same age or have the same travel schedule. Hence, it is also very important to understand the relationships among the tourists in the same travel group. This understanding can help to form the travel group. Last but not least, few tourist ratings are available for travel packages. However, we can see that every choice of a travel package indicates the strong interest of the tourist in the content provided in the package.
TRAST MODEL:
In the TRAST model, all the tourists' travel preferences are represented by relationship distributions. For a set of tourists who want to travel with the same package, we can use their relationship distributions as features to cluster them, so as to put them into different travel groups. Thus, in this scenario, many clustering methods can be adopted. Since the choice of clustering algorithm is beyond the scope of this paper, in the experiments we use K-means, one of the most popular clustering algorithms. Thus, the TRAST model can be used as an assessment for automatic travel group formation. Indeed, in real applications, when generating a travel group, some external constraints, such as the tourists' travel date requirements and the travel company's travel group schedule, should also be considered.
In the TRAST model, the purchases of the tourists in each travel group are summed up as one single expense record and, thus, it has a more complex generative process. We can understand this process through a simple example. Assume that the two selected tourists in a travel group (U_d) are u1 and u2, who are young and dating each other. Now, they decide to travel in winter (S_d) and the destination is North America (A_d). To generate a travel landscape (l), we first draw a relationship (r, e.g., lover), and then find a topic (t) for lovers traveling in the winter (e.g., skiing). Finally, based on this skiing topic and the selected travel area (e.g., Northeast America), we draw a landscape (e.g., Stowe, Vermont).
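The generative story above can be mimicked with a toy sampler, sketched below. All probability tables, relationship names, topics and landscapes are invented for illustration and are not parameters of the TRAST model.

import java.util.Random;

// Toy sketch of the generative story described above: draw a relationship,
// then a topic conditioned on it, then a landscape conditioned on the topic.
// Every table and label here is made up for illustration.
public class TrastToyGenerator {
    static final Random RND = new Random();

    // draw an index from a discrete probability distribution
    static int sample(double[] probs) {
        double u = RND.nextDouble(), cum = 0;
        for (int i = 0; i < probs.length; i++) {
            cum += probs[i];
            if (u <= cum) return i;
        }
        return probs.length - 1;
    }

    public static void main(String[] args) {
        String[] relations = {"lover", "family", "colleague"};
        String[] topics = {"skiing", "museum", "beach"};
        String[] landscapes = {"Stowe", "Boston", "Miami"};

        double[] pRelation = {0.5, 0.3, 0.2};
        double[][] pTopicGivenRelation = {
            {0.7, 0.2, 0.1},   // lover     -> mostly skiing (winter)
            {0.2, 0.6, 0.2},   // family    -> mostly museums
            {0.2, 0.3, 0.5}    // colleague -> mostly beach
        };
        double[][] pLandscapeGivenTopic = {
            {0.8, 0.1, 0.1},   // skiing -> Stowe
            {0.1, 0.8, 0.1},   // museum -> Boston
            {0.1, 0.1, 0.8}    // beach  -> Miami
        };

        int r = sample(pRelation);
        int t = sample(pTopicGivenRelation[r]);
        int l = sample(pLandscapeGivenTopic[t]);
        System.out.println(relations[r] + " -> " + topics[t] + " -> " + landscapes[l]);
    }
}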
RECOMMENDATIONS:
The evaluations in previous sections are mainly focused on the individual (personalized) recommendations. Since there are tourists who frequently travel together, it is interesting to know whether the latent variables (e.g., the topics of each individual tourist and the relationships of a travel group) as well as the cocktail approaches are useful for making recommendations to a group of tourists. To this end, we performed an experimental study on group recommendations.
We adopt the widely used degree of agreement (DOA) and Top-K [23] as the evaluation metrics. Also, a simple user study was conducted, and volunteers were invited to rate the recommendations. For comparison, we recorded the best performance of each algorithm by tuning its parameters, and we also set some general rules for fair comparison. For instance, for collaborative filtering-based methods, we usually consider the contribution of the nearest neighbors with similarity values larger than 0.
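For concreteness, the following sketch computes a Top-K hit rate in the usual sense (the fraction of relevant packages that appear in the first K positions of a ranked list). It may differ in detail from the metric of [23], and the package identifiers are toy data.

import java.util.List;

// Sketch of a Top-K hit-rate computation over a ranked recommendation list.
public class TopKHitRate {
    static double hitRate(List<String> ranked, List<String> relevant, int k) {
        int hits = 0;
        List<String> topK = ranked.subList(0, Math.min(k, ranked.size()));
        for (String item : relevant) {
            if (topK.contains(item)) {
                hits++;
            }
        }
        return relevant.isEmpty() ? 0.0 : (double) hits / relevant.size();
    }

    public static void main(String[] args) {
        List<String> ranked = java.util.Arrays.asList("P3", "P1", "P7", "P2", "P5");
        List<String> relevant = java.util.Arrays.asList("P1", "P5");
        System.out.println(hitRate(ranked, relevant, 3));  // 0.5: only P1 is in the top 3
    }
}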
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis, the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited, so the expenditures must be justified. The developed system is well within the budget, and this was achieved because most of the technologies used are freely available; only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system has modest requirements, as only minimal or no changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
This aspect of the study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but must instead accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the users about the system and to make them familiar with it. Their level of confidence must be raised so that they are also able to offer constructive criticism, which is welcomed, as they are the final users of the system.
5.2 SYSTEM TESTING:
Testing is the process of checking whether the developed system works according to its original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes the logical assumption that if all parts of the system are correct, the overall goal will be successfully achieved. Inadequate testing, or no testing at all, leads to errors that may not surface until months later. This creates two problems: the time lag between the cause and the appearance of a problem, and the effect of system errors on the files and records within the system. A small system error can conceivably grow into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best program is worthless if it does not produce correct output.
5.2.1 UNIT TESTING:
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile, process test data correctly, and tie in properly with other programs. Achieving an error-free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logic. A syntax error is a program statement that violates one or more rules of the language in which it is written; an improperly defined field dimension or an omitted keyword are common syntax errors, and they are reported through error messages generated by the compiler. For logic errors, the programmer must examine the output carefully.
UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
5.2.2 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that the personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
FUNCTIONAL TESTING:
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework, as it displays all users available in the group. | The result after execution should be accurate. |
5.2.3 NON-FUNCTIONAL TESTING:
Non-functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques and is used to check that the application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.2.4 LOAD TESTING:
An important tool for implementing system tests is a load generator. A load generator is essential for testing quality requirements such as performance and stress. A load can be a real load; that is, the system can be put under real usage by having actual users connected to it, who generate the test input data for the system test.
LOAD TESTING:
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.2.5 PERFORMANCE TESTING:
Performance tests are utilized in order to determine the widely defined performance of the software system, such as the execution time associated with various parts of the code, response time, and device utilization. The intent of this testing is to identify weak points of the software system and quantify its shortcomings.
PERFORMANCE TESTING:
Description | Expected result |
This is required to assure that the application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values and produce accurate results in the expected time. |
5.2.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and this is what is ensured by this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. This activity forms part of the work of the software quality control team.
RELIABILITY TESTING:
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.2.7 SECURITY TESTING:
Security testing evaluates system characteristics that relate to the availability, integrity and confidentiality of the system data and services. Users/Clients should be encouraged to make sure their security needs are very clearly known at requirements time, so that the security issues can be addressed by the designers and testers.
SECURITY TESTING:
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure, it should not be connected to the framework. |
Check whether group keys in a tree are shared by all peers. | All peers in the same group should know the group key. |
5.2.8 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using the white box testing method, the software engineer can derive test cases. White box testing focuses on the inner structure of the software to be tested.
WHITE BOX TESTING:
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.2.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or the code. The contents of the box are hidden, and the software, when stimulated, should produce the desired results.
BLACK BOX TESTING:
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in data structures or external database access. | Database updates and retrievals must be performed correctly. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All the above system testing strategies are carried out during development, since documentation and institutionalization of the proposed goals and related policies are essential.
CHAPTER 6
6.0 SOFTWARE DESCRIPTION:
6.1 JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
- Simple
- Architecture neutral
- Object oriented
- Portable
- Distributed
- High performance
- Interpreted
- Multithreaded
- Robust
- Dynamic
- Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
6.2 THE JAVA PLATFORM:
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
- The Java Virtual Machine (Java VM)
- The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do? Highlights what functionality some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that after you compile it, the compiled code runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
6.3 WHAT CAN JAVA TECHNOLOGY DO?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet.
A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
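A minimal servlet sketch is given below; the class name and the greeting it returns are illustrative, and the URL mapping is assumed to be supplied by the container (e.g., in web.xml).

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal servlet sketch: answers GET requests with a plain-text greeting.
public class HelloServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/plain");
        PrintWriter out = response.getWriter();
        out.println("Hello from a Java servlet");
    }
}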
How does the API support all these kinds of programs? It does so with packages of software components that provides a wide range of functionality. Every full implementation of the Java platform gives you the following features:
- The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
- Applets: The set of conventions used by applets.
- Networking: URLs, TCP (Transmission Control Protocol), UDP (User Data gram Protocol) sockets, and IP (Internet Protocol) addresses.
- Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
- Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
- Software components: Known as JavaBeansTM, can plug into existing component architectures.
- Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
- Java Database Connectivity (JDBCTM): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
6.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
- Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
- Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
- Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
- Develop programs more quickly: Your development time may be as much as twice as fast versus writing the same program in C++. Why? You write fewer lines of code and it is a simpler programming language than C++.
- Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure JavaTM Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
- Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
- Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
6.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you setup a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn’t change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn't as efficient as talking directly to the native database interface. ODBC has had many detractors make the charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to those who said that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.
6.6 JDBC:
In an effort to set an independent database standard API for Java; Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.
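As a small example of the JDBC API in use, the sketch below queries a MySQL back end with a prepared statement. The database URL, credentials, table and column names are assumptions for illustration, and the MySQL Connector/J driver is assumed to be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Minimal JDBC sketch against a MySQL back end. The schema, table,
// credentials and column used here are assumptions for illustration only.
public class JdbcDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/traveldb";   // assumed database
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT package_name FROM packages WHERE season = ?")) {
            ps.setString(1, "winter");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("package_name"));
                }
            }
        }
    }
}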
6.7 JDBC Goals:
Few software packages are designed without goals in mind, and JDBC is no exception: its many goals drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The eight design goals for JDBC are as follows:
- SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL-level APIs. This goal allows JDBC to use existing ODBC-level drivers through a software interface (the JDBC-ODBC bridge) that translates JDBC calls to ODBC calls and vice versa.
- Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.
- Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
- Use strong, static typing wherever possible
Strong typing allows more error checking to be done at compile time, so fewer errors appear at runtime.
- Keep the common cases simple
Because, more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs, and UPDATEs, these queries should be simple to perform with JDBC, as the sketch below illustrates. However, more complex SQL statements should also be possible.
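As a hedged illustration of two of these goals, riding on an existing ODBC data source through the classic JDBC-ODBC bridge and issuing a simple SELECT, a minimal sketch might look like the following. The bridge driver class sun.jdbc.odbc.JdbcOdbcDriver shipped with older JDKs and was removed in Java 8, and the data source name, table, and column names here are hypothetical.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SimpleSelect {
    public static void main(String[] args) throws Exception {
        // Load the classic JDBC-ODBC bridge driver (older JDKs only; removed in Java 8).
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");

        // "SensorDSN" is a hypothetical ODBC data source name registered with the ODBC Administrator.
        try (Connection con = DriverManager.getConnection("jdbc:odbc:SensorDSN");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT node_id, reading FROM readings")) {
            while (rs.next()) {
                System.out.println(rs.getInt("node_id") + " -> " + rs.getDouble("reading"));
            }
        }
    }
}

The same code would run unchanged against any other data source for which an ODBC or native JDBC driver is installed; only the connection URL changes.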
Finally, we decided to proceed with the implementation using Java networking, and for dynamically updating the cache table we use an MS Access database.
Java has two things: a programming language and a platform.
Java is a high-level programming language that is all of the following: simple, architecture-neutral, object-oriented, portable, distributed, high-performance, interpreted, multithreaded, robust, dynamic, and secure.
Java is also unusual in that each Java program is both compiled and interpreted. With the compiler, you translate a Java program into an intermediate language called Java bytecodes: platform-independent codes that are later interpreted and run on the computer by the Java virtual machine.
Compilation happens just once; interpretation occurs each time the program is executed. The figure illustrates how this works.
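As a minimal illustration of this compile-once, interpret-everywhere cycle (the file and class names are arbitrary):

// Compile once:  javac Hello.java  (produces platform-independent bytecode in Hello.class)
// Run anywhere:  java Hello        (the Java virtual machine interprets the bytecode)
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from the Java platform");
    }
}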
7.7 NETWORKING TCP/IP STACK:
The TCP/IP stack is shorter than the OSI one:
TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.
IP datagrams:
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model – see later.
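A minimal Java sketch of sending a single UDP datagram, assuming a hypothetical receiver on the local machine listening on port 9876; it simply shows that each datagram carries its own destination address and port:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpSendDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical receiver address and port.
        InetAddress receiver = InetAddress.getByName("127.0.0.1");
        byte[] payload = "sensor reading".getBytes(StandardCharsets.UTF_8);

        try (DatagramSocket socket = new DatagramSocket()) {
            // UDP is connectionless: the destination travels with every packet.
            DatagramPacket packet = new DatagramPacket(payload, payload.length, receiver, 9876);
            socket.send(packet);
        }
    }
}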
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32 bit integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class D uses all 32.
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is currently on one sub network and uses 10-bit addressing, allowing 1024 different hosts.
Host address:
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.
Total address:
The 32 bit address is usually written as 4 integers separated by dots.
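For illustration, a small Java helper that prints a 32-bit address as four integers separated by dots; the example value 0xC0A8010A, i.e., 192.168.1.10, is an arbitrary host address:

public class DottedQuad {
    // Formats a 32-bit IP address as four integers separated by dots.
    public static String format(int address) {
        return ((address >>> 24) & 0xFF) + "." +
               ((address >>> 16) & 0xFF) + "." +
               ((address >>> 8) & 0xFF) + "." +
               (address & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(format(0xC0A8010A)); // prints 192.168.1.10
    }
}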
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server, you send it to the port for that service of the host that it is running on. This is not location transparency! Certain of these ports are “well known”.
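A short Java sketch of using a well-known port: it resolves a host name to its IP address and connects to port 80 (HTTP) on that host. The host name is only an example.

import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;

public class PortProbe {
    public static void main(String[] args) throws IOException {
        // Resolve a host name to its 32-bit IP address (printed in dotted form).
        InetAddress host = InetAddress.getByName("www.example.com");
        System.out.println("IP address: " + host.getHostAddress());

        // Connect to the well-known HTTP port 80 on that host.
        try (Socket socket = new Socket(host, 80)) {
            System.out.println("Connected to port " + socket.getPort());
        }
    }
}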
Sockets:
A socket is a data structure maintained by the system to handle network connections. A socket is created using the socket call, which returns an integer that is like a file descriptor. In fact, under Windows, this handle can be used with the ReadFile and WriteFile functions.
#include <sys/types.h>
#include <sys/socket.h>

/* family selects the protocol family, type selects a stream (TCP) or datagram (UDP)
   socket, and protocol is normally zero; the call returns a socket descriptor. */
int socket(int family, int type, int protocol);
Here “family” will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe, but the actual pipe does not yet exist.
6.8 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart’s extensive feature set includes:
A consistent and well-documented API, supporting a wide range of chart types;
A flexible design that is easy to extend, and targets both server-side and client-side applications;
Support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG);
JFreeChart is “open source” or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
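As a minimal sketch of using the library, assuming the JFreeChart 1.0.x API (where the utility class is named ChartUtilities; in later releases it is ChartUtils), with made-up dataset values and file name:

import java.io.File;
import java.io.IOException;

import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.data.general.DefaultPieDataset;

public class PieChartDemo {
    public static void main(String[] args) throws IOException {
        // Build a small dataset; the category names and values are illustrative.
        DefaultPieDataset dataset = new DefaultPieDataset();
        dataset.setValue("Land", 45.0);
        dataset.setValue("Sea", 40.0);
        dataset.setValue("Ice", 15.0);

        // Create the chart and write it to a PNG file (one of the supported output types).
        JFreeChart chart = ChartFactory.createPieChart("Coverage", dataset, true, true, false);
        ChartUtilities.saveChartAsPNG(new File("coverage.png"), chart, 500, 400);
    }
}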
6.8.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: Sourcing freely redistributable vector outlines for the countries of the world, states/provinces in particular countries (USA in particular, but also other areas);
Creating an appropriate dataset interface (plus default implementation) and a renderer, and integrating these with the existing XYPlot class in JFreeChart; testing, documenting, testing some more, documenting some more.
6.8.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts — to display a separate control that shows a small version of ALL the time series data, with a sliding “view” rectangle that allows you to select the subset of the time series data to display in the main chart.
6.8.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) that can be delivered easily via both Java Web Start and an applet.
6.8.4. Property Editors
The
property editor mechanism in JFreeChart only handles a small subset of the
properties that can be set for charts. Extend (or reimplement) this mechanism
to provide greater end-user control over the appearance of the charts.
CHAPTER 7
APPENDIX
7.1 SAMPLE SOURCE CODE
7.2 SAMPLE OUTPUT
CHAPTER 8
CONCLUSION:
We presented a study on personalized travel package recommendation. Specifically, we first analyzed the unique characteristics of travel packages and developed the TAST model, a Bayesian network for travel package and tourist representation. The TAST model can discover the interests of the tourists and extract the spatial-temporal correlations among landscapes. Then, we exploited the TAST model for developing a cocktail approach to personalized travel package recommendation. This cocktail approach follows a hybrid recommendation strategy and has the ability to combine several constraints existing in the real-world scenario.
Furthermore, we extended the TAST model to the TRAST model, which can capture the relationships among tourists in each travel group. Finally, an empirical study was conducted on real-world travel data. Experimental results demonstrate that the TAST model can capture the unique characteristics of the travel packages, the cocktail approach can lead to better performance of travel package recommendation, and the TRAST model can be used as an effective assessment for automatic travel group formation. We hope these encouraging results will lead to much future work.
Real-Time Big Data Analytical Architecture for Remote Sensing Application
In today’s era, there is a great deal more to real-time remote sensing Big Data than it seems at first, and extracting the useful information in an efficient manner leads a system toward major computational challenges, such as how to analyze, aggregate, and store data that are collected remotely. Keeping the above-mentioned factors in view, there is a need to design a system architecture that supports both real-time and offline data processing. In this paper, we propose a real-time Big Data analytical architecture for remote sensing satellite applications.
The proposed architecture comprises three main units:
1) Remote sensing Big Data acquisition unit (RSDU);
2) Data processing unit (DPU); and
3) Data analysis decision unit (DADU).
First, RSDU acquires data from the satellite and sends these data to the Base Station, where initial processing takes place. Second, DPU plays a vital role in the architecture for efficient processing of real-time Big Data by providing filtration, load balancing, and parallel processing. Third, DADU is the upper-layer unit of the proposed architecture, which is responsible for compilation, storage of the results, and generation of decisions based on the results received from DPU.
1.2 INTRODUCTION:
Recently, a great deal of interest in the field of Big Data and its analysis has arisen, driven mainly by the extensive number of research challenges closely related to real applications, such as modeling, processing, querying, mining, and distributing large-scale repositories. The term “Big Data” classifies specific kinds of data sets comprising unstructured data, which dwell in the data layer of technical computing applications and the Web. The data stored in the underlying layer of all these technical computing application scenarios have some precise characteristics in common, such as 1) large-scale data, which refers to the size of the data and the data warehouse; 2) scalability issues, which refer to the application's need to run at large scale (e.g., on Big Data); 3) support for the extraction-transformation-loading (ETL) process from low-level, raw data to structured data up to a certain extent; and 4) development of simple, interpretable analytics over Big Data warehouses with a view to delivering intelligent and meaningful knowledge from them.
Big Data are usually generated by online transactions, video/audio, email, click streams, logs, posts, social network data, scientific data, remote-access sensory data, mobile phones, and their applications. These data accumulate in databases that grow extraordinarily and become complicated to capture, form, store, manage, share, process, analyze, and visualize with typical database software tools. Advancements in Big Data sensing and computer technology are revolutionizing the way remote data are collected, processed, analyzed, and managed. In particular, most recently designed sensors used in earth and planetary observatory systems generate continuous streams of data. Moreover, a majority of work has been done in various fields of remote sensing satellite image data, such as change detection, gradient-based edge detection, region-similarity-based edge detection, and intensity-gradient techniques for efficient intra-prediction.
In this paper, we refer to the high-speed continuous stream of data, or high-volume offline data, as “Big Data,” which is leading us to a new world of challenges. The transformation of remotely sensed data into scientific understanding is consequently a critical task. Given the rate at which the volume of remote-access data is increasing, a number of individual users as well as organizations now demand an efficient mechanism to collect, process, analyze, and store these data and their resources. Big Data analysis is somewhat more challenging than merely locating, identifying, understanding, and citing data. With large-scale data, all of this has to happen in a mechanized manner, since it requires diverse data structures as well as semantics to be articulated in a computer-readable format.
However, even when analyzing simple data consisting of one data set, a mechanism is still required for how to design the database, since there may be alternative ways to store the same information. In such conditions, a given design may have advantages over others for certain purposes and possible drawbacks for other purposes. To address these needs, various analytical platforms have been provided by relational database vendors. These platforms come in various shapes, from software-only products to analytical services that run in third-party hosted environments. In remote-access networks, data sources such as sensors can produce an overwhelming amount of raw data.
We refer to the first step as data acquisition, in which much of the data are of no interest and can be filtered or compressed by orders of magnitude. With a view to using such filters, they must not discard useful information. For instance, in considering news reports, is it adequate to keep only the information that mentions the company name? Alternatively, do we need the entire report, or only a small piece around the mentioned name? The second challenge is the automatic generation of accurate metadata that describe the composition of the data and the way they were collected and analyzed. Such metadata are hard to analyze, since we may need to know the source of each data item in remote access.
1.3 LITERATURE SURVEY:
BIG DATA AND CLOUD COMPUTING: CURRENT STATE AND FUTURE OPPORTUNITIES
AUTHOR: D. Agrawal, S. Das, and A. E. Abbadi
PUBLISH: Proc. Int. Conf. Extending Database Technol. (EDBT), 2011, pp. 530–533.
EXPLANATION:
Scalable database management systems (DBMSs), both for update-intensive application workloads and for decision support systems for descriptive and deep analytics, are a critical part of the cloud infrastructure and play an important role in ensuring the smooth transition of applications from traditional enterprise infrastructures to next-generation cloud infrastructures. Though scalable data management has been a vision for more than three decades and much research has focused on large-scale data management in traditional enterprise settings, cloud computing brings its own set of novel challenges that must be addressed to ensure the success of data management solutions in the cloud environment. This tutorial presents an organized picture of the challenges faced by application developers and DBMS designers in developing and deploying internet-scale applications. Our background study encompasses both classes of systems: (i) systems for supporting update-heavy applications, and (ii) systems for ad hoc analytics and decision support. We then focus on providing an in-depth analysis of systems for supporting update-intensive web applications and provide a survey of the state of the art in this domain. We crystallize the design choices made by some successful large-scale database management systems, analyze the application demands and access patterns, and enumerate the desiderata for a cloud-bound DBMS.
CHANGE DETECTION IN SYNTHETIC APERTURE RADAR IMAGE BASED ON FUZZY ACTIVE CONTOUR MODELS AND GENETIC ALGORITHMS
AUTHOR: J. Shi, J. Wu, A. Paul, L. Jiao, and M. Gong
PUBLISH: Math. Prob. Eng., vol. 2014, 15 pp., Apr. 2014.
EXPLANATION:
This paper presents an unsupervised change detection approach for synthetic aperture radar images based on a fuzzy active contour model and a genetic algorithm. The aim is to partition the difference image, which is generated from multitemporal satellite images, into changed and unchanged regions. A fuzzy technique is an appropriate approach to analyze the difference image, where regions are not always statistically homogeneous. Since interval type-2 fuzzy sets are well suited for modeling various uncertainties in comparison to traditional fuzzy sets, they are combined with the active contour methodology to properly model uncertainties in the difference image. The interval type-2 fuzzy active contour model is designed to provide a preliminary analysis of the difference image by generating intermediate change detection masks. Each intermediate change detection mask has a cost value. A genetic algorithm is employed to find the final change detection mask with the minimum cost value by evolving the realization of intermediate change detection masks. Experimental results on real synthetic aperture radar images demonstrate that the change detection results obtained by the improved fuzzy active contour model exhibit less error than previous approaches.
A BIG DATA ARCHITECTURE FOR LARGE SCALE SECURITY MONITORING
AUTHOR: S. Marchal, X. Jiang, R. State, and T. Engel
PUBLISH: Proc. IEEE Int. Congr. Big Data, 2014, pp. 56–63.
EXPLANATION:
Network traffic is a rich source of information for security monitoring. However, the increasing volume of data to treat raises issues, rendering holistic analysis of network traffic difficult. In this paper we propose a solution to cope with the tremendous amount of data to analyse from a security monitoring perspective. We introduce an architecture dedicated to security monitoring of local enterprise networks. The application domain of such a system is mainly network intrusion detection and prevention, but it can be used as well for forensic analysis. This architecture integrates two systems, one dedicated to scalable distributed data storage and management and the other dedicated to data exploitation. DNS data, NetFlow records, HTTP traffic, and honeypot data are mined and correlated in a distributed system that leverages state-of-the-art big data solutions. Data correlation schemes are proposed and their performance is evaluated against several well-known big data frameworks, including Hadoop and Spark.
CHAPTER 2
2.0 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM:
Existing methods are inapplicable on standard computers because it is not desirable, or even possible, to load the entire image into memory before doing any processing. In this situation, it is necessary to load only part of the image and process it before saving the result to disk and proceeding to the next part. This corresponds to the concept of on-the-flow processing. Remote sensing processing can be seen as a chain of events or steps, where each step is generally independent from the following ones and generally focuses on a particular domain. For example, the image can be radiometrically corrected to compensate for atmospheric effects and indices computed, before an object extraction based on these indices takes place.
The typical processing chain will process the whole image for each step, returning the final result after everything is done. For some processing chains, iterations between the different steps are required to find the correct set of parameters. Due to the variability of satellite images and the variety of the tasks that need to be performed, fully automated tasks are rare. Humans are still an important part of the loop. These concepts are linked in the sense that both rely on the ability to process only one part of the data.
In the case of simple algorithms, this is quite easy: the input is just split into different non-overlapping pieces that are processed one by one. But most algorithms do consider the neighborhood of each pixel. As a consequence, in most cases, the data will have to be split into partially overlapping pieces. The objective is to obtain the same result as if the original algorithm had processed the data in one go. Depending on the algorithm, this is unfortunately not always possible.
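As a toy illustration of splitting data into partially overlapping pieces, the Java sketch below tiles a single row of pixels; the tile size and overlap are arbitrary, and a real remote sensing pipeline would of course work on full 2-D rasters.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OverlappingTiles {
    // Splits a row of pixels into tiles of tileSize pixels, with 'overlap' extra pixels
    // taken on each side, so that a neighborhood filter applied tile by tile matches
    // the result of processing the whole row at once. All sizes are illustrative.
    public static List<int[]> split(int[] row, int tileSize, int overlap) {
        List<int[]> tiles = new ArrayList<>();
        for (int start = 0; start < row.length; start += tileSize) {
            int from = Math.max(0, start - overlap);
            int to = Math.min(row.length, start + tileSize + overlap);
            tiles.add(Arrays.copyOfRange(row, from, to));
        }
        return tiles;
    }

    public static void main(String[] args) {
        int[] row = new int[1000];              // dummy row of pixels
        List<int[]> tiles = split(row, 256, 2); // 256-pixel tiles, 2-pixel overlap
        System.out.println("Number of tiles: " + tiles.size());
    }
}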
2.1.1 DISADVANTAGES:
- A reader that loads the image, or part of the image in memory from the file on disk;
- A filter which carries out a local processing that does not require access to neighboring pixels (a simple threshold for example), the processing can happen on CPU or GPU;
- A filter that requires the value of neighboring pixels to compute the value of a given pixel (a convolution filter is a typical example), the processing can happen on CPU or GPU;
- A writer to output the resulting in-memory image into a file on disk; note that the file could be written in several steps. We will illustrate in this example how it is possible to compute part of the image through the whole pipeline, incurring only minimal computation overhead.
2.2 PROPOSED SYSTEM:
We present a remote sensing Big Data analytical architecture, which is used to analyze real-time as well as offline data. At first, the data are remotely preprocessed so that they become readable by machines. Afterward, this useful information is transmitted to the Earth Base Station for further data processing. The Earth Base Station performs two types of processing: processing of real-time data and processing of offline data. In the case of offline data, the data are transmitted to an offline data-storage device. The incorporation of the offline data-storage device helps in later usage of the data, whereas the real-time data are directly transmitted to the filtration and load balancer server, where a filtration algorithm is employed that extracts the useful information from the Big Data.
On the other hand, the load balancer balances the processing power by equally distributing the real-time data to the servers. The filtration and load-balancing server not only filters and balances the load, but is also used to enhance the system's efficiency. Furthermore, the filtered data are then processed by the parallel servers and are sent to the data aggregation unit (if required, the processed data can be stored in the result storage device) for comparison purposes by the decision and analyzing server. The proposed architecture accepts remote-access sensory data as well as direct-access network data (e.g., GPRS, 3G, xDSL, or WAN). The proposed architecture and algorithms are implemented and applied to remote sensing earth observatory data.
The proposed architecture has the capability of dividing, load balancing, and parallel processing of only useful data. Thus, it results in efficiently analyzing real-time remote sensing Big Data using an earth observatory system. Furthermore, the proposed architecture has the capability of storing incoming raw data to perform offline analysis on large stored dumps when required. Finally, a detailed analysis of remotely sensed earth observatory Big Data for land and sea areas is provided using .NET. In addition, various algorithms are proposed for each level of RSDU, DPU, and DADU to detect land as well as sea areas and to elaborate the working of the architecture.
2.2.1 ADVANTAGES:
The proposed architecture processes high-speed, large amounts of real-time remote sensory image data. It works on both DPU and DADU by taking data from the satellite as input.
Using our architecture for offline as well as online traffic, we perform a simple analysis on remote sensing earth observatory data. We assume that the data are big in nature and difficult for a single server to handle.
The data are continuously coming from a satellite with high speed. Hence, special algorithms are needed to process, analyze, and make a decision from that Big Data. Here, in this section, we analyze remote sensing data for finding land, sea, or ice area.
We have used the proposed architecture to perform the analysis and proposed an algorithm for handling, processing, analyzing, and decision-making for remote sensing Big Data images.
2.3 HARDWARE & SOFTWARE REQUIREMENTS:
2.3.1 HARDWARE REQUIREMENT:
- Processor – Pentium IV
- Speed – 1.1 GHz
- RAM – 256 MB (min)
- Hard Disk – 20 GB
- Floppy Drive – 1.44 MB
- Keyboard – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVGA
2.3.2 SOFTWARE REQUIREMENTS:
- Operating System : Windows XP or Win7
- Front End : Microsoft Visual Studio .NET 2008
- Script : C# Script
- Back End : MS-SQL Server 2005
- Document : MS-Office 2007
CHAPTER 3
3.0 SYSTEM DESIGN:
Data Flow Diagram / Use Case Diagram / Flow Diagram:
- The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on these data, and the output data generated by the system.
- The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components. These components are the system process, the data used by the process, an external entity that interacts with the system and the information flows in the system.
- DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
- A DFD may be used to represent a system at any level of abstraction and may be partitioned into levels that represent increasing information flow and functional detail.
NOTATION:
SOURCE OR DESTINATION OF DATA:
External sources or destinations, which may be people or organizations or other entities
DATA SOURCE:
Here the data referenced by a process is stored and retrieved.
PROCESS:
People, procedures, or devices that produce data; the physical component is not identified.
DATA FLOW:
Data moves in a specific direction from an origin to a destination. The data flow is a “packet” of data.
There are several common modeling rules when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
3.1 ARCHITECTURE DIAGRAM:
3.2 DATAFLOW DIAGRAM:
UML DIAGRAMS:
3.3 USE CASE DIAGRAM:
3.4 CLASS DIAGRAM:
3.5 SEQUENCE DIAGRAM:
3.6 ACTIVITY DIAGRAM:
CHAPTER 4
4.0 IMPLEMENTATION:
Big Data covers diverse technologies, much like cloud computing. The input to Big Data comes from social networks (Facebook, Twitter, LinkedIn, etc.), Web servers, satellite imagery, sensory data, banking transactions, etc. Despite the very recent emergence of Big Data architecture in scientific applications, numerous efforts toward Big Data analytics architectures can already be found in the literature. Among numerous others, we propose a remote sensing Big Data architecture to analyze Big Data in an efficient manner, as shown in Fig. 1. Fig. 1 delineates n satellites that obtain earth observatory Big Data images with sensors or conventional cameras, through which sceneries are recorded using radiation. Special techniques are applied to process and interpret remote sensing imagery for the purpose of producing conventional maps, thematic maps, resource surveys, etc. We have divided the remote sensing Big Data architecture into three main units, described below.
In healthcare scenarios, medical practitioners gather a massive volume of data about patients, medical history, medications, and other details. The above-mentioned data are accumulated in drug-manufacturing companies. The nature of these data is very complex, and sometimes practitioners are unable to relate them to other information, which results in important information being missed. Employing advanced analytic techniques for organizing and extracting useful information from such Big Data enables personalized medication, and advanced Big Data analytic techniques give insight into the hereditary causes of disease.
4.1 ALGORITHM:
This algorithm takes satellite data or products, filters and divides them into segments, and performs load balancing.
The processing algorithm calculates results for different parameters against each incoming block and sends them to the next level. In step 1, the mean, SD, absolute difference, and the number of values greater than the maximum threshold are calculated. In the next step, the results are transmitted to the aggregation server.
ACA collects the results from each processing server against each block Bi and then combines, organizes, and stores these results in an RDBMS database.
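The paper's own implementation is in .NET; purely for illustration, a minimal Java sketch of the per-block statistics described in step 1 (mean, standard deviation, absolute difference, and the count of values above a maximum threshold) might look like this, with all names and values hypothetical:

public class BlockStatistics {
    // Computes, for one incoming block of pixel values, the parameters described in step 1:
    // mean, standard deviation, sum of absolute differences from the mean, and the number
    // of values greater than the maximum threshold.
    public static double[] process(double[] block, double maxThreshold) {
        double sum = 0.0;
        int aboveThreshold = 0;
        for (double v : block) {
            sum += v;
            if (v > maxThreshold) {
                aboveThreshold++;
            }
        }
        double mean = sum / block.length;

        double squaredDiff = 0.0;
        double absoluteDiff = 0.0;
        for (double v : block) {
            squaredDiff += (v - mean) * (v - mean);
            absoluteDiff += Math.abs(v - mean);
        }
        double sd = Math.sqrt(squaredDiff / block.length);

        return new double[] { mean, sd, absoluteDiff, aboveThreshold };
    }

    public static void main(String[] args) {
        double[] results = process(new double[] { 0.2, 0.7, 0.9, 0.4 }, 0.8);
        System.out.println("mean=" + results[0] + " sd=" + results[1]
                + " absDiff=" + results[2] + " aboveThreshold=" + (int) results[3]);
    }
}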
4.2 MODULES:
DATA ANALYSIS DECISION UNIT (DADU):
DATA PROCESSING UNIT (DPU):
REMOTE SENSING APPLICATION RSDU:
FINDINGS AND DISCUSSION:
ALGORITHM
DESIGN AND TESTING:
4.3 MODULE DESCRIPTION:
DATA PROCESSING UNIT (DPU):
In the data processing unit (DPU), the filtration and load balancer server has two basic responsibilities: filtration of data and load balancing of processing power. Filtration identifies the data useful for analysis, since it only allows useful information through, whereas the rest of the data are blocked and discarded. Hence, it enhances the performance of the whole proposed system. The load-balancing part of the server provides the facility of dividing the whole filtered data set into parts and assigning them to various processing servers. The filtration and load-balancing algorithm varies from analysis to analysis; e.g., if there is only a need for analysis of sea wave and temperature data, the measurements of these data are filtered out and segmented into parts.
Each processing server has its own algorithm implementation for processing the incoming segment of data from the FLBS. Each processing server makes statistical calculations and measurements, and performs other mathematical or logical tasks to generate intermediate results against each segment of data. Since these servers perform tasks independently and in parallel, the performance of the proposed system is dramatically enhanced, and the results against each segment are generated in real time. The results generated by each server are then sent to the aggregation server for compilation, organization, and storage for further processing.
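Again purely as a hedged illustration (the paper's implementation is in .NET, and the block size and number of servers are not specified in the text), dividing filtered data into fixed-size blocks and assigning them round-robin to processing servers could be sketched in Java as:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterAndBalance {
    // Splits filtered data into fixed-size blocks and assigns them to processing servers
    // in round-robin fashion. Block size and server count are made-up values.
    public static List<List<double[]>> distribute(double[] filtered, int blockSize, int servers) {
        List<List<double[]>> assignments = new ArrayList<>();
        for (int s = 0; s < servers; s++) {
            assignments.add(new ArrayList<double[]>());
        }
        int blockIndex = 0;
        for (int start = 0; start < filtered.length; start += blockSize) {
            int end = Math.min(start + blockSize, filtered.length);
            assignments.get(blockIndex % servers).add(Arrays.copyOfRange(filtered, start, end));
            blockIndex++;
        }
        return assignments;
    }

    public static void main(String[] args) {
        List<List<double[]>> plan = distribute(new double[10000], 1024, 4);
        for (int s = 0; s < plan.size(); s++) {
            System.out.println("server " + s + " gets " + plan.get(s).size() + " blocks");
        }
    }
}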
DATA ANALYSIS DECISION UNIT (DADU):
DADU contains three major portions: the aggregation and compilation server, the result storage server(s), and the decision-making server. When the results are ready for compilation, the processing servers in the DPU send the partial results to the aggregation and compilation server; since these results are not yet in an organized and compiled form, there is a need to aggregate the related results, organize them into a proper form for further processing, and store them. In the proposed architecture, the aggregation and compilation server is supported by various algorithms that compile, organize, store, and transmit the results. Again, the algorithm varies from requirement to requirement and depends on the analysis needs. The aggregation server stores the compiled and organized results in the result storage so that any server can use them for processing at any time.
The aggregation server also sends a copy of the results to the decision-making server so that it can process them to make decisions. The decision-making server is supported by decision algorithms, which query different things from the results and then make various decisions (e.g., in our analysis, we analyze land, sea, and ice, whereas other findings such as fire, storms, tsunamis, and earthquakes can also be detected). The decision algorithm must be strong and correct enough to efficiently produce results, discover hidden things, and make decisions. The decision part of the architecture is significant, since any small error in decision-making can degrade the efficiency of the whole analysis. DADU finally displays or broadcasts the decisions, so that any application can utilize those decisions in real time for its own development. The applications can be any business software, general-purpose community software, or other social networks that need those findings (i.e., decision-making).
REMOTE SENSING APPLICATION RSDU:
Remote sensing promotes the expansion of the earth observatory system as a cost-effective parallel data acquisition system to satisfy specific computational requirements. The Earth and Space Science Society originally approved this solution as the standard for parallel processing for improved Big Data acquisition, but it was soon recognized that traditional data processing technologies could not provide sufficient power for processing such data. Therefore, parallel processing of the massive volume of data was required to efficiently analyze the Big Data. For that reason, RSDU is introduced in the remote sensing Big Data architecture; it gathers data from various satellites around the globe, where it is possible that the received raw data are distorted by scattering and absorption by various atmospheric gases and dust particles. We assume that the satellite can correct the erroneous data.
However, to turn the raw data into image format, the remote sensing satellite uses effective data analysis; the satellite preprocesses data under many situations to integrate data from different sources, which not only decreases storage cost but also improves analysis accuracy. The data must be corrected by different methods to remove distortions caused by the motion of the platform relative to the earth, platform attitude, earth curvature, non-uniformity of illumination, variations in sensor characteristics, etc. The data are then transmitted to the Earth Base Station for further processing using a direct communication link. We divided the data processing procedure into two steps: real-time Big Data processing and offline Big Data processing. In the case of offline data processing, the Earth Base Station transmits the data to the data center for storage, and these data are then used for future analyses. However, in real-time data processing, the data are directly transmitted to the filtration and load balancer server (FLBS), since storing the incoming real-time data would degrade the performance of real-time processing.
FINDINGS AND DISCUSSION:
Preprocessed and formatted data from satellite contains all or some of the following parts depending on the product.
1) Main product header (MPH): It includes the product's basic information, i.e., ID, measurement and sensing time, orbit information, etc.
2) Specific product header (SPH): It contains information specific to each product or product group, i.e., the number of data set descriptors (DSD), the directory of remaining data sets in the file, etc.
3) Annotation data sets (ADS): They contain information on quality, time-tagged processing parameters, geolocation tie points, solar angles, etc.
4) Global annotation data sets (GADS): They contain scaling factors, offsets, calibration information, etc.
5) Measurement data set (MDS): It contains measurements or graphical parameters calculated from the measurements, including the quality flag and the time-tag measurement as well. The image data are also stored in this part and are the main element of our analysis.
The MPH and SPH data are in ASCII format, whereas all the other data sets are in binary format. MDS, ADS, and GADS consist of a sequence of records and one or more fields of data for each record. In our case, the MDS contains a number of records, and each record contains a number of fields. Each record of the MDS corresponds to one row of the satellite image, which is our main focus during analysis.
ALGORITHM DESIGN AND TESTING:
Our algorithms are proposed to process high-speed, large amounts of real-time remote sensory image data using our proposed architecture. They work on both DPU and DADU by taking data from the satellite as input to identify land and sea areas in the data set. The set of algorithms contains four simple algorithms, i.e., algorithm I, algorithm II, algorithm III, and algorithm IV, which work on the filtration and load balancer, the processing servers, the aggregation server, and the decision-making server, respectively. Algorithm I, i.e., the filtration and load balancer algorithm (FLBA), works on the filtration and load balancer to filter only the required data by discarding all other information. It also provides load balancing by dividing the data into fixed-size blocks and sending them to the processing servers, i.e., one or more distinct blocks to each server. This filtration, dividing, and load-balancing task speeds up performance by neglecting unnecessary data and by providing parallel processing. Algorithm II, i.e., the processing and calculation algorithm (PCA), processes filtered data and is implemented on each processing server. It provides the various parameter calculations that are used in the decision-making process. The parameter calculation results are then sent to the aggregation server for further processing. Algorithm III, i.e., the aggregation and compilation algorithm (ACA), stores, compiles, and organizes the results, which can be used by the decision-making server for land and sea area detection. Algorithm IV, i.e., the decision-making algorithm (DMA), identifies land and sea areas by comparing the parameter results from the aggregation servers with threshold values.
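The text does not give the actual thresholds or parameters that DMA uses; the following Java fragment is only a hedged sketch of the kind of threshold comparison described above, with made-up threshold values and classes:

public class DecisionMaker {
    // Hypothetical threshold values; the paper does not specify them.
    private static final double LAND_MEAN_THRESHOLD = 0.6;
    private static final double SEA_MEAN_THRESHOLD = 0.3;

    // Classifies one aggregated block result by comparing its mean against the thresholds.
    public static String classify(double mean) {
        if (mean >= LAND_MEAN_THRESHOLD) {
            return "LAND";
        } else if (mean >= SEA_MEAN_THRESHOLD) {
            return "SEA";
        } else {
            return "ICE";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(0.72)); // LAND
        System.out.println(classify(0.41)); // SEA
        System.out.println(classify(0.12)); // ICE
    }
}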
CHAPTER 5
5.0 SYSTEM STUDY:
5.1 FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis, the feasibility study of the proposed system is carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
- ECONOMICAL FEASIBILITY
- TECHNICAL FEASIBILITY
- SOCIAL FEASIBILITY
5.1.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. Thus the developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.
5.1.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would in turn place high demands on the client. The developed system must have modest requirements, as only minimal or no changes are required for implementing this system.
5.1.3 SOCIAL FEASIBILITY:
The aspect of study is to check the level of
acceptance of the system by the user. This includes the process of training the
user to use the system efficiently. The user must not feel threatened by the
system, instead must accept it as a necessity. The level of acceptance by the
users solely depends on the methods that are employed to educate the user about
the system and to make him familiar with it. His level of confidence must be
raised so that he is also able to make some constructive criticism, which is
welcomed, as he is the final user of the system.
5.2 SYSTEM TESTING:
Testing is a process of checking whether the developed system is working according to the original objectives and requirements. It is a set of activities that can be planned in advance and conducted systematically. Testing is vital to the success of the system. System testing makes a logical assumption that if all the parts of the system are correct, the overall goal will be successfully achieved. Inadequate testing, or no testing at all, leads to errors that may not appear until many months later. This creates two problems: the time lag between the cause and the appearance of the problem, and the effect of the system errors on the files and records within the system. A small system error can conceivably explode into a much larger problem. Effective testing early in the process translates directly into long-term cost savings from a reduced number of errors. Another reason for system testing is its utility as a user-oriented vehicle before implementation. The best program is worthless if it does not produce the correct outputs.
5.2.1 UNIT TESTING:
A program represents the logical elements of a system. For a program to run satisfactorily, it must compile, process test data correctly, and tie in properly with other programs. Achieving an error-free program is the responsibility of the programmer. Program testing checks for two types of errors: syntax and logical. A syntax error is a program statement that violates one or more rules of the language in which it is written. An improperly defined field dimension or omitted keywords are common syntax errors. These errors are shown through error messages generated by the computer. For logic errors, the programmer must examine the output carefully.
UNIT TESTING:
Description | Expected result |
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed. |
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions. |
5.2.2 FUNCTIONAL TESTING:
Functional testing of an application is used to prove that the application delivers correct results, using enough inputs to give an adequate level of confidence that it will work correctly for all sets of inputs. The functional testing will need to prove that the application works for each client type and that the personalization functions work correctly. When a program is tested, the actual output is compared with the expected output. When there is a discrepancy, the sequence of instructions must be traced to determine the problem. The process is facilitated by breaking the program into self-contained portions, each of which can be checked at certain key points. The idea is to compare program values against desk-calculated values to isolate the problems.
FUNCTIONAL TESTING:
Description | Expected result |
Test for all modules. | All peers should communicate in the group. |
Test for various peers in a distributed network framework as it displays all users available in the group. | The result after execution should give the accurate result. |
5.2.3 NON-FUNCTIONAL TESTING:
Non-functional software testing encompasses a rich spectrum of testing strategies, describing the expected results for every test case. It uses symbolic analysis techniques. This testing is used to check that an application will work in the operational environment. Non-functional testing includes:
- Load testing
- Performance testing
- Usability testing
- Reliability testing
- Security testing
5.2.4 LOAD TESTING:
An important tool for implementing system tests is a load generator. A load generator is essential for testing quality requirements such as performance and stress. A load can be a real load, that is, the system can be subjected to real usage by having actual users connected to it; they will generate the test input data for the system test.
Load Testing
Description | Expected result |
It is necessary to ascertain that the application behaves correctly under loads when ‘Server busy’ response is received. | Should designate another active node as a Server. |
5.2.5 PERFORMANCE TESTING:
Performance tests are
utilized in order to determine the widely defined performance of the software
system such as execution time associated with various parts of the code,
response time and device utilization. The intent of this testing is to identify
weak points of the software system and quantify its shortcomings.
PERFORMANCE TESTING:
Description | Expected result |
This is required to assure that the application performs adequately, having the capability to handle many peers, delivering its results in the expected time and using an acceptable level of resources; it is an aspect of operational management. | Should handle large input values and produce accurate results in the expected time. |
5.2.6 RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time, and it is ensured in this testing. Reliability can be expressed as the ability of the software to reveal defects under testing conditions, according to the specified requirements. It is the probability that a software system will operate without failure under given conditions for a given time interval, and it focuses on the behavior of the software element. It forms a part of software quality control.
RELIABILITY TESTING:
Description | Expected result |
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server, an alternate server should take over the job. |
5.2.7 SECURITY TESTING:
Security testing evaluates
system characteristics that relate to the availability, integrity and
confidentiality of the system data and services. Users/Clients should be
encouraged to make sure their security needs are very clearly known at
requirements time, so that the security issues can be addressed by the
designers and testers.
SECURITY TESTING:
Description | Expected result |
Checking that the user identification is authenticated. | In case of failure, it should not be connected to the framework. |
Check whether group keys in a tree are shared by all peers. | The peers should know group key in the same group. |
5.2.8 WHITE BOX TESTING:
White box testing, sometimes called glass-box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. White box testing focuses on the inner structure of the software to be tested.
WHITE BOX TESTING:
Description | Expected result |
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid. |
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite. |
Exercise internal data structures to ensure their validity. | All the data structures must be valid. |
5.2.9 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements for a program. Black box testing is not an alternative to white box techniques. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods. Black box testing attempts to find errors by focusing on the inputs, outputs, and principal functions of a software module. The starting point of black box testing is either a specification or code. The contents of the box are hidden, and the stimulated software should produce the desired results.
BLACK BOX TESTING:
Description | Expected result |
To check for incorrect or missing functions. | All the functions must be valid. |
To check for interface errors. | The entire interface must function normally. |
To check for errors in a data structures or external data base access. | The database updation and retrieval must be done. |
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally. |
All of the above system testing strategies are carried out, as the development, documentation, and institutionalization of the proposed goals and related policies are essential.
CHAPTER 7
7.0 SOFTWARE SPECIFICATION:
7.1 FEATURES OF .NET:
Microsoft .NET is a set of Microsoft software technologies for rapidly building and integrating XML Web services, Microsoft Windows-based applications, and Web solutions. The .NET Framework is a language-neutral platform for writing programs that can easily and securely interoperate. There’s no language barrier with .NET: there are numerous languages available to the developer including Managed C++, C#, Visual Basic and Java Script.
The .NET framework provides the foundation for components to interact seamlessly, whether locally or remotely on different platforms. It standardizes common data types and communications protocols so that components created in different languages can easily interoperate.
“.NET” is
also the collective name given to various software components built upon the
.NET platform. These will be both products (Visual Studio.NET and Windows.NET
Server, for instance) and services (like Passport, .NET My Services, and so
on).
7.2 THE .NET FRAMEWORK
The .NET Framework has two main parts:
1. The Common Language Runtime (CLR).
2. A hierarchical set of class libraries.
The CLR is described as the “execution engine” of .NET. It provides the environment within which programs run. The most important features are
- Conversion from a low-level assembler-style language, called Intermediate Language (IL), into code native to the platform being executed on.
- Memory management, notably including garbage collection.
- Checking and enforcing security restrictions on the running code.
- Loading and executing programs, with version control and other such features.
- The following features of the .NET framework are also worth description:
Managed Code
The code that targets .NET, and which contains certain extra information (“metadata”) to describe itself. While both managed and unmanaged code can run in the runtime, only managed code contains the information that allows the CLR to guarantee, for instance, safe execution and interoperability.
Managed Data
With Managed Code comes Managed Data. The CLR provides memory allocation and deallocation facilities, and garbage collection. Some .NET languages use Managed Data by default, such as C#, Visual Basic.NET and JScript.NET, whereas others, namely C++, do not. Targeting the CLR can, depending on the language you’re using, impose certain constraints on the features available. As with managed and unmanaged code, one can have both managed and unmanaged data in .NET applications: data that doesn’t get garbage collected but instead is looked after by unmanaged code.
Common Type System
The CLR uses something called the Common Type System (CTS) to strictly enforce type-safety. This ensures that all classes are compatible with each other, by describing types in a common way. The CTS defines how types work within the runtime, which enables types in one language to interoperate with types in another language, including cross-language exception handling. As well as ensuring that types are only used in appropriate ways, the runtime also ensures that code doesn’t attempt to access memory that hasn’t been allocated to it.
Common Language Specification
The CLR provides built-in support for language interoperability. To ensure that you can develop managed code that can be fully used by developers using any programming language, a set of language features and rules for using them called the Common Language Specification (CLS) has been defined. Components that follow these rules and expose only CLS features are considered CLS-compliant.
7.3 THE CLASS LIBRARY
.NET provides a single-rooted hierarchy of classes, containing over 7000 types. The root of the namespace is called System; this contains basic types like Byte, Double, Boolean, and String, as well as Object. All objects derive from System.Object. As well as objects, there are value types. Value types can be allocated on the stack, which can provide useful flexibility. There are also efficient means of converting value types to object types if and when necessary.
The set of classes is pretty comprehensive, providing collections, file, screen, and network I/O, threading, and so on, as well as XML and database connectivity.
The class library is subdivided into a number of sets (or namespaces), each providing distinct areas of functionality, with dependencies between the namespaces kept to a minimum.
7.4 LANGUAGES SUPPORTED BY .NET
The multi-language capability of the .NET Framework and Visual Studio .NET enables developers to use their existing programming skills to build all types of applications and XML Web services. The .NET framework supports new versions of Microsoft’s old favorites Visual Basic and C++ (as VB.NET and Managed C++), but there are also a number of new additions to the family.
Visual Basic .NET has been updated to include many new and improved language features that make it a powerful object-oriented programming language. These features include inheritance, interfaces, and overloading, among others. Visual Basic also now supports structured exception handling, custom attributes and also supports multi-threading.
Visual Basic .NET is also CLS compliant, which means that any CLS-compliant language can use the classes, objects, and components you create in Visual Basic .NET.
Managed Extensions for C++ and attributed programming are just some of the enhancements made to the C++ language. Managed Extensions simplify the task of migrating existing C++ applications to the new .NET Framework.
C# is Microsoft’s new language. It’s a C-style language that is essentially “C++ for Rapid Application Development”. Unlike other languages, its specification is just the grammar of the language. It has no standard library of its own, and instead has been designed with the intention of using the .NET libraries as its own.
Microsoft Visual J# .NET provides the easiest transition for Java-language developers into the world of XML Web Services and dramatically improves the interoperability of Java-language programs with existing software written in a variety of other programming languages.
Active State has created Visual Perl and Visual Python, which enable .NET-aware applications to be built in either Perl or Python. Both products can be integrated into the Visual Studio .NET environment. Visual Perl includes support for Active State’s Perl Dev Kit.
Other languages for which .NET compilers are available include
- FORTRAN
- COBOL
- Eiffel
ASP.NET XML Web Services | Windows Forms
Base Class Libraries
Common Language Runtime
Operating System
Fig. 1: .NET Framework
C#.NET is also compliant with CLS (Common Language Specification) and supports structured exception handling. CLS is set of rules and constructs that are supported by the CLR (Common Language Runtime). CLR is the runtime environment provided by the .NET Framework; it manages the execution of the code and also makes the development process easier by providing services.
C#.NET is a CLS-compliant language. Any objects, classes, or components created in C#.NET can be used in any other CLS-compliant language. In addition, we can use objects, classes, and components created in other CLS-compliant languages in C#.NET. The use of CLS ensures complete interoperability among applications, regardless of the languages used to create them.
CONSTRUCTORS AND DESTRUCTORS:
Constructors are used to initialize objects, whereas destructors are used to destroy them. In other words, destructors are used to release the resources allocated to the object. In C#.NET the Finalize procedure is available. The Finalize procedure is used to complete the tasks that must be performed when an object is destroyed; it is called automatically when an object is destroyed, and it can be called only from the class it belongs to or from derived classes.
GARBAGE COLLECTION
Garbage Collection is another new feature in C#.NET. The .NET Framework monitors allocated resources, such as objects and variables. In addition, the .NET Framework automatically releases memory for reuse by destroying objects that are no longer in use.
In C#.NET, the garbage collector checks for the objects that are not currently in use by applications. When the garbage collector comes across an object that is marked for garbage collection, it releases the memory occupied by the object.
OVERLOADING
Overloading is another feature in C#. Overloading enables us to define multiple methods with the same name, where each method has a different set of arguments. Besides using overloading for methods, we can also use it for constructors and properties in a class, as the sketch below shows.
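The hedged sketch below, using a hypothetical Logger class, shows overloaded constructors and overloaded Write methods that differ only in their parameter lists:

using System;

public class Logger
{
    private readonly string _prefix;

    public Logger() : this("INFO") { }          // overloaded constructors
    public Logger(string prefix) { _prefix = prefix; }

    public void Write(string message)           // overloaded methods
    {
        Console.WriteLine($"[{_prefix}] {message}");
    }

    public void Write(string message, int code)
    {
        Console.WriteLine($"[{_prefix}] {code}: {message}");
    }
}

public static class Program
{
    public static void Main()
    {
        var log = new Logger("WARN");
        log.Write("Disk almost full");          // resolves to Write(string)
        log.Write("Disk almost full", 507);     // resolves to Write(string, int)
    }
}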
MULTITHREADING:
C#.NET also supports multithreading. An application that supports multithreading can handle multiple tasks simultaneously; we can use multithreading to decrease the time an application takes to respond to user interaction.
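A minimal sketch of multithreading with System.Threading.Thread is shown below; the worker method and the simulated work are illustrative only:

using System;
using System.Threading;

public static class Program
{
    public static void Main()
    {
        var worker = new Thread(DoWork) { IsBackground = true };
        worker.Start();

        Console.WriteLine("Main thread stays responsive while the worker runs.");
        worker.Join();                       // wait for the worker before exiting
    }

    private static void DoWork()
    {
        for (int i = 1; i <= 3; i++)
        {
            Console.WriteLine($"Worker step {i}");
            Thread.Sleep(100);               // simulate a slow operation
        }
    }
}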
STRUCTURED EXCEPTION HANDLING
C#.NET supports structured exception handling, which enables us to detect and handle errors at runtime. In C#.NET, we use try…catch…finally statements to create exception handlers. Using try…catch…finally statements, we can create robust and effective exception handlers that improve the reliability of our application.
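The following hedged example, assuming a hypothetical settings.txt file, shows a try…catch…finally exception handler in C#:

using System;
using System.IO;

public static class Program
{
    public static void Main()
    {
        StreamReader reader = null;
        try
        {
            reader = new StreamReader("settings.txt");   // hypothetical file name
            Console.WriteLine(reader.ReadLine());
        }
        catch (FileNotFoundException ex)
        {
            // Handle the specific error detected at runtime.
            Console.WriteLine($"Configuration file missing: {ex.FileName}");
        }
        finally
        {
            // Runs whether or not an exception was thrown.
            if (reader != null)
            {
                reader.Dispose();
            }
        }
    }
}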
7.5 THE .NET FRAMEWORK
The .NET Framework is a new computing platform that simplifies application development in the highly distributed environment of the Internet.
OBJECTIVES OF .NET FRAMEWORK
1. To provide a consistent object-oriented programming environment, whether object code is stored and executed locally, executed locally but distributed over the Internet, or executed remotely.
2. To provide a code-execution environment that minimizes software deployment and versioning conflicts and guarantees the safe execution of code.
3. To eliminate the performance problems of scripted or interpreted environments.
There are different types of applications, such as Windows-based applications and Web-based applications.
7.6 FEATURES OF SQL-SERVER
The OLAP Services feature available in SQL Server version 7.0 is now called SQL Server 2000 Analysis Services. The term OLAP Services has been replaced with the term Analysis Services. Analysis Services also includes a new data mining component. The Repository component available in SQL Server version 7.0 is now called Microsoft SQL Server 2000 Meta Data Services. References to the component now use the term Meta Data Services. The term repository is used only in reference to the repository engine within Meta Data Services.
A SQL Server database consists of the following types of objects:
1. TABLE
2. QUERY
3. FORM
4. REPORT
5. MACRO
7.7 TABLE:
A database is a collection of data about a specific topic.
VIEWS OF TABLE:
We can work with a table in two views:
1. Design View
2. Datasheet View
Design View
To build or modify the structure of a table, we work in the table's Design view. Here we can specify what kind of data each field will hold.
Datasheet View
To add, edit, or analyze the data itself, we work in the table's Datasheet view.
QUERY:
A query is a question asked of the data. Access gathers the data that answers the question from one or more tables. The data that makes up the answer is either a dynaset (which can be edited) or a snapshot (which cannot be edited). Each time we run the query, we get the latest information in the dynaset. Access either displays the dynaset or snapshot for us to view or performs an action on it, such as deleting or updating.
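To connect this to the C#.NET discussion above, the hedged ADO.NET sketch below runs a parameterized query against SQL Server; the connection string and the Products table are hypothetical placeholders:

using System;
using System.Data.SqlClient;

public static class Program
{
    public static void Main()
    {
        // Hypothetical connection string; adjust server and database names as needed.
        const string connectionString =
            "Server=localhost;Database=SampleDb;Integrated Security=true;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT Name, Price FROM Products WHERE Price > @minPrice", connection))
        {
            command.Parameters.AddWithValue("@minPrice", 10.0m);
            connection.Open();

            using (SqlDataReader reader = command.ExecuteReader())
            {
                // Each run returns the latest data, much as re-running a query refreshes a dynaset.
                while (reader.Read())
                {
                    Console.WriteLine($"{reader.GetString(0)}: {reader.GetDecimal(1)}");
                }
            }
        }
    }
}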
CHAPTER 7
7.0 APPENDIX
7.1 SAMPLE SCREEN SHOTS:
7.2 SAMPLE SOURCE CODE:
CHAPTER 8
8.1 CONCLUSION AND FUTURE WORK:
In this paper, we proposed an architecture for real-time Big Data analysis for remote sensing applications; the architecture efficiently processes and analyzes both real-time and offline remote sensing Big Data for decision making. The proposed architecture is composed of three major units: 1) RSDU; 2) DPU; and 3) DADU. These units implement algorithms for each level of the architecture, depending on the required analysis. The real-time Big Data architecture is generic (application independent) and can be used for any type of remote sensing Big Data analysis. Furthermore, only the useful information is filtered, divided, and processed in parallel, while all other extra data are discarded. These capabilities make the architecture a better choice for real-time remote sensing Big Data analysis.
The algorithms proposed in this paper for each unit and subunit are used to analyze remote sensing data sets, which helps in better understanding of land and sea areas. The proposed architecture allows researchers and organizations to perform any type of remote sensing Big Data analysis by developing algorithms for each level of the architecture, depending on their analysis requirements. For future work, we plan to extend the proposed architecture to make it compatible with Big Data analysis for all applications, e.g., sensors and social networking. We also plan to use the proposed architecture to perform complex analysis on Earth observatory data for real-time decision making, such as earthquake prediction, tsunami prediction, and fire detection.