ENABLING EFFICIENT MULTI-KEYWORD RANKED SEARCH

ABSTRACT:

In mobile cloud computing, a fundamental application is to outsource the mobile data to external cloud servers for scalable data storage. The outsourced data, however, need to be encrypted due to the privacy and confidentiality concerns of their owner. This results in the distinguished difficulties on the accurate search over the encrypted mobile cloud data.

In this paper, we develop the searchable encryption for multi-keyword ranked search over the storage data. Specifically, by considering the large number of outsourced documents (data) in the cloud, we utilize the relevance score and k-nearest neighbor techniques to develop an efficient multi-keyword search scheme that can return the ranked search results based on the accuracy.

This framework, we leverage an efficient index to further improve the search efficiency, and adopt the blind storage system to conceal access pattern of the search user. Security analysis demonstrates that our scheme can achieve confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealing access pattern of the search user. Finally, using extensive simulations, we show that our proposal can achieve much improved efficiency in terms of search functionality and search time compared with the existing proposals.

GOAL OF THE PROJECT:

Efficient and privacy-preserving multi-keyword ranked search over encrypted mobile cloud data via blind storage system, the EMRS has following design goals:

  • Multi-Keyword Ranked Search: To meet the requirements for practical uses and provide better user experience, the EMRS should not only support multi-keyword search over encrypted mobile cloud data, but also achieve relevance-based result ranking.
  • Search Efficiency: Since the number of the total documents may be very large in a practical situation, the EMRS should achieve sublinear search with better search efficiency.
  • Confidentiality and Privacy Preservation: To prevent the cloud server from learning any additional information about the documents and the index, and to keep search users’ trapdoors secret, the EMRS should cover all the security requirements that we introduced above.

INTRODUCTION

Mobile cloud computing gets rid of the hardware limitation of mobile devices by exploring the scalable and virtualized cloud storage and computing resources, and accordingly is able to provide much more powerful and scalable mobile services to users. In mobile cloud computing, mobile users typically outsource their data to external cloud servers, e.g., iCloud, to enjoy a stable, low-cost and scalable way for data storage and access. However, as outsourced data typically contain sensitive privacy information, such as personal photos, emails, etc., which would lead to severe confidentiality and privacy violations, if without efficient protections. It is therefore necessary to encrypt the sensitive data before outsourcing them to the cloud. The data encryption, however, would result in salient difficulties when other users need to access interested data with search, due to the difficulties of search over encrypted data.

This fundamental issue in mobile cloud computing accordingly motivates an extensive body of research in the recent years on the investigation of searchable encryption technique to achieve efficient searching over outsourced encrypted data. A collection of research works have recently been developed on the topic of multi-keyword search over encrypted data. Propose a symmetric searchable encryption scheme which achieves high efficiency for large databases with modest scarification on security guarantees. Propose a multi-keyword search scheme supporting result ranking by adopting k-nearest neighbors (kNN) technique. Propose a dynamic searchable encryption scheme through blind storage to conceal access pattern of the search user.

In order to meet the practical search requirements, search over encrypted data should support the following three functions.

First, the searchable encryption schemes should support multi-keyword search, and provide the same user experience as searching in Google search with different keywords; single-keyword search is far from satisfactory by only returning very limited and inaccurate search results. Second, to quickly identify most relevant results, the search user would typically prefer cloud servers to sort the returned search results in a relevance-based order ranked by the relevance of the search request to the documents. In addition, showing the ranked search to users can also eliminate the unnecessary network traffic by only sending back the most relevant results from cloud to search users.

Third, as for the search efficiency, since the number of the documents contained in a database could be extraordinarily large, searchable encryption schemes should be efficient to quickly respond to the search requests with minimum delays.

In contrast to the theoretical benefits, most of the existing proposals, however, fail to offer sufficient insights towards the construction of full functioned searchable encryption as described above. As an effort towards the issue, in this paper, we propose an efficient multi-keyword ranked search (EMRS) scheme over encrypted mobile cloud data through blind storage.

Our main contributions can be summarized as follows:

  • We introduce a relevance score in searchable encryption to achieve multi-keyword ranked search over the encrypted mobile cloud data. In addition to that, we construct an efficient index to improve the search efficiency.
  • By modifying the blind storage system in the EMRS, we solve the trapdoor unlinkability problem and conceal access pattern of the search user from the cloud server.
  • We give thorough security analysis to demonstrate that the EMRS can reach a high security level including confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealing access pattern of the search user. Moreover, we implement extensive experiments, which show that the EMRS can achieve enhanced efficiency in the terms of functionality and search efficiency compared with existing proposals.

LITRATURE SURVEY

SYSTEM ANALYSIS

EXISTING SYSTEM:

Existing works built various types of secure index and corresponding index-based keyword matching algorithms to improve search efficiency. All these works only support the search of single keyword. Subsequent works extended the search capability to multiple, conjunctive or disjunctive, keywords search. However, they support only exact keyword matching. Misspelled keywords in the query will result in wrong or no matching. Very recently, a few works extended the search capability to approximate keyword matching (also known as fuzzy search). These are all for single keyword search, with a common approach involving expanding the index file by covering possible combinations of keyword misspelling so that a certain degree of spelling error, measured by edit distance, can be tolerated. Although a wild-card approach is adopted to minimize the expansion of the resulting index file, for a l-letter long keyword to tolerate an error up to an edit distance of d, the index has to be expanded times.

Thus, it is not scalable as the storage complexity increases exponentially with the increase of the error tolerance. To support multi-keyword search, the search algorithm will have to run multiple rounds To date, efficient multi-keyword fuzzy search over encrypted data remains a challenging problem. We want to point out that the efforts on search over encrypted data involve not only information retrieval techniques such as advanced data structures used to represent the searchable index, and efficient search algorithms that run over the corresponding data structure, but also the proper design of cryptographic protocols to ensure the security and privacy of the overall system. Although single keyword search and fuzzy search have been implemented separately, a combination of the two does not lead to a secure and efficient single keyword fuzzy search scheme.

DISADVANTAGES:

The large number of data users and documents in cloud, it is crucial for the search service to allow multi-keyword query and provide result similarity ranking to meet the effective data retrieval need. The searchable encryption focuses on single keyword search or Boolean keyword search, and rarely differentiates the search results.

  • Single-keyword search without ranking
  • Boolean- keyword search without ranking
  • Single-keyword similarity search with ranking

PROPOSED SYSTEM:

Propose a symmetric searchable encryption scheme which achieves high efficiency for large databases with modest scarification on security guarantees. Propose a multi-keyword search scheme supporting result ranking by adopting k-nearest neighbors (kNN) technique. Propose a dynamic searchable encryption scheme through blind storage to conceal access pattern of the search user.

We propose the detailed EMRS. Since the encrypted documents and index z are both stored in the blind storage system, we would provide the general construction of the blind storage system. Moreover, since the EMRS aims to eliminate the risk of sharing the key that is used to encrypt the documents with all search users and solve the trapdoor unlinkability problem in Naveed’s scheme.

We modify the construction of blind storage and leverage ciphertext policy attribute-based encryption (CP-ABE) technique in the EMRS. However, specific construction of CP-ABE is out of scope of this paper and we only give a simple indication here. The notations of this paper are shown in Table 1. The EMRS consists of the following phases: System Setup, Construction of Blind Storage, Encrypted Database Setup, Trapdoor Generation, Efficient and Secure Search, and Retrieve Documents from Blind Storage.

ADVANTAGES:

In this paper, we propose an efficient multi-keyword ranked search (EMRS) scheme over encrypted mobile cloud data through blind storage.

Our main contributions can be summarized as follows:

  • We introduce a relevance score in searchable encryption to achieve multi-keyword ranked search over the encrypted mobile cloud data. In addition to that, we construct an efficient index to improve the search efficiency.
  • By modifying the blind storage system in the EMRS, we solve the trapdoor unlinkability problem and conceal access pattern of the search user from the cloud server.
  • We give thorough security analysis to demonstrate that the EMRS can reach a high security level including confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealing access pattern of the search user. Moreover, we implement extensive experiments, which show that the EMRS can achieve enhanced efficiency in the terms of functionality and search efficiency compared with existing proposals

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

v    Processor                                 –    Pentium –IV

  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor      –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Back End :           MYSQL Server
  • Server :           Apache Tomact Server
  • Script :           JSP Script
  • Document :           MS-Office 2007

EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval

Abstract—Graph-based ranking models have been widely applied in information retrieval area. In this paper, we focus on a well known graph-based model – the Ranking on Data Manifold model, or Manifold Ranking (MR). Particularly, it has been successfully applied to content-based image retrieval, because of its outstanding ability to discover underlying geometrical structure of the given image database. However, manifold ranking is computationally very expensive, which significantly limits its applicability to large databases especially for the cases that the queries are out of the database (new samples). We propose a novel scalable graph-based ranking model called Efficient Manifold Ranking (EMR), trying to address the shortcomings of MR from two main perspectives: scalable graph construction and efficient ranking computation. Specifically, we build an anchor graph on the database instead of a traditional k-nearest neighbor graph, and design a new form of adjacency matrix utilized to speed up the ranking. An approximate method is adopted for efficient out-of-sample retrieval. Experimental results on some large scale image databases demonstrate that EMR is a promising method for real world retrieval  applications.

INTRODUCTION
GRAPH-BASED ranking models have been deeply studied and widely applied in information retrieval area. In this paper, we focus on the problem of applying a novel and efficient graph-based model for contentbased image retrieval (CBIR), especially for out-of-sample retrieval on large scale databases.
Traditional image retrieval systems are based on keyword search, such as Google and Yahoo image search. In these systems, a user keyword (query) is matched with the context around an image including the title, manual annotation, web document, etc. These systems don’t utilize information from images. However these systems suffer many problems, such as shortage of the text information and inconsistency of the meaning of the text and image. Content-based image retrieval is a considerable choice to overcome these difficulties. CBIR has drawn a great attention in the past two decades [1]–[3]. Different from traditional keyword search systems, CBIR systems utilize the low-level features, including global features (e.g., color moment, edge histogram, LBP [4]) and local features (e.g., SIFT [5]), automatically extracted from images. A great amount of researches have been performed for designing more informative low-level features to represent images, or better metrics (e.g., DPF [6]) to measure the perceptual similarity, but their performance is restricted by many conditions and is sensitive to the data. Relevance feedback [7] is a useful tool for interactive CBIR. User’s high level perception is captured by dynamically updated weights based on the user’s feedback.
Most traditional methods focus on the data features too much but they ignore the underlying structure information, which is of great importance for semantic discovery, especially when the label information is unknown. Many databases have underlying cluster or manifold structure.
Under such circumstances, the assumption of label consistency is reasonable [8], [9]. It means that those nearby data points, or points belong to the same cluster or manifold, are very likely to share the same semantic label. This phenomenon is extremely important to explore the semantic relevance when the label information is unknown. In our opinion, a good CBIR system should consider images’ lowlevel features as well as the intrinsic structure of the image database.
Manifold Ranking (MR) [9], [10], a famous graph-based ranking model, ranks data samples with respect to the intrinsic geometrical structure collectively revealed by a large number of data. It is exactly in line with our consideration. MR has been widely applied in many applications, and shown to have excellent performance and feasibility on a variety of data types, such as the text [11], image [12], [13], and video[14]. By taking the underlying structure into account, manifold ranking assigns each data sample a relative ranking score, instead of an absolute pairwise similarity as traditional ways. The score is treated as a similarity metric defined on the manifold, which is more meaningful to capturing the semantic relevance degree. He et al. [12] firstly applied MR to CBIR, and significantly improved image retrieval performance compared with state-of-the-art algorithms.
However, manifold ranking has its own drawbacks to handle large scale databases – it has expensive computational cost, both in graph construction and ranking computation stages. Particularly, it is unknown how to handle an out-of-sample query (a new sample) efficiently under the existing framework. It is unacceptable to recompute the model for a new query. That means, original manifold ranking is inadequate for a real world CBIR system, in which the user provided query is always an out-of-sample.
In this paper, we extend the original manifold ranking and propose a novel framework named Efficient Manifold Ranking (EMR). We try to address the shortcomings of manifold ranking from two perspectives: the first is scalable graph construction; and the second is efficient computation, especially for out-of-sample retrieval. Specifically, we build an anchor graph on the database instead of the traditional k-nearest neighbor graph, and design a new form of adjacency matrix utilized to speed up the ranking computation. The model has two separate stages: an offline stage for building (or learning) the ranking model and an online stage for handling a new query. With EMR, we can handle a database with 1 million images and do the online retrieval in a short time. To the best of our knowledge, no previous manifold ranking based algorithm has run out-of-sample retrieval on a database in this scale.

Effective Key Management in Dynamic Wireless Sensor Networks

Recently, wireless sensor networks (WSNs) have been deployed for a wide variety of applications, including military sensing and tracking, patient status monitoring, traffic flow monitoring, where sensory devices often move between different locations. Securing data and communications requires suitable encryption key protocols. In this paper, we propose a certificateless-effective key management (CL-EKM) protocol for secure communication in dynamic WSNs characterized by node mobility. The CL-EKM supports efficient key updates when a node leaves or joins a cluster and ensures forward and backward key secrecy. The protocol also supports efficient key revocation for compromised nodes and minimizes the impact of a node compromise on the security of other communication links.

A security analysis of our scheme shows that our protocol is effective in defending against various  attacks. We implement CL-EKM in Contiki OS and simulate it using Cooja simulator to assess its time, energy, communication, and memory performance

DISTORTION-AWARE CONCURRENT MULTIPATH TRANSFER FOR MOBILE VIDEO STREAMING IN HETEROGENEOUS WIRELESS NETWORKS

ABSTRACT:

The massive proliferation of wireless infrastructures with complementary characteristics prompts the bandwidth aggregation for Concurrent Multipath Transfer (CMT) over heterogeneous access networks. Stream Control Transmission Protocol (SCTP) is the standard transport-layer solution to enable CMT in multihomed communication environments. However, delivering high-quality streaming video with the existing CMT solutions still remains problematic due to the stringent quality of service (QoS) requirements and path asymmetry in heterogeneous wireless networks.

In this paper, we advance the state of the art by introducing video distortion into the decision process of multipath data transfer. The proposed distortion-aware concurrent multipath transfer (CMT-DA) solution includes three phases: 1) per-path status estimation and congestion control; 2) quality-optimal video flow rate allocation; 3) delay and loss controlled data retransmission. The term ‘flow rate allocation’ indicates dynamically picking appropriate access networks and assigning the transmission rates.

We analytically formulate the data distribution over multiple communication paths to minimize the end-to-end video distortion and derive the solution based on the utility maximization theory. The performance of the proposed CMT-DA is evaluated through extensive semi-physical emulations in Exata involving H.264 video streaming. Experimental results show that CMT-DA outperforms the reference schemes in terms of video peak signal-to-noise ratio (PSNR), good put, and inter-packet delay.

INTRODUCTION:

During the past few years, mobile video streaming service online gaming, etc. has become one of the “killer applications” and the video traffic headed for hand-held devices has experienced explosive growth. The latest market research conducted by Cisco Company indicates that video streaming accounts for 53 percent of the mobile Internet traffic in parallel, global mobile data is expected to increase 11-fold in the next five years. Another ongoing trend feeding this tremendous growth is the popularity of powerful mobile terminals (e.g., smart phones and iPad), which facilitates individual users to access the Internet and watch videos from everywhere [4].

Despite the rapid advancements in network infrastructures, it is still challenging to deliver high-quality streaming video over wireless platforms. On one hand, the Wi-Fi networks are limited in radio coverage and mobility support for individual users; On the other hand, the cellular networks can well sustain the user mobility but their bandwidth is often inadequate to support the throughput-demanding video applications. Although the 4 G LTE and WiMAX can provide higher peak data rate and extended coverage, the available capacity will still be insufficient compared to the ever-growing video data traffic.

The complementary characteristics of heterogeneous access networks prompt the bandwidth aggregation for concurrent multipath transfer (CMT) to enhance transmission throughput and reliability (see Fig. 1). With the emergency of multihomed/multinetwork terminals CMT is considered to be a promising solution for supporting video streaming in future wireless networking. The key research issue in multihomed video delivery over heterogeneous wireless networks must be effective integration of the limited channel resources available for providing adequate quality of service (QoS). Stream control transmission protocol (SCTP) is the standard transport-layer solution that exploits the multihoming feature to concurrently distribute data across multiple independent end-to-end paths.

Therefore, many CMT solutions have been proposed to optimize the delay, throughput, or reliability performance for efficient data delivery. However, due to the special characteristics of streaming video, these network-level criteria cannot always improve the perceived media quality. For instance, a real-time video application encoded in constant bit rate (CBR) may not effectively leverage the throughput gains since its streaming rate is typically fixed or bounded by the encoding schemes. In addition, involving a communication path with available bandwidth but long delay in the multipath video delivery may degrade the streaming video quality as the end-to-end distortion increases. Consequently, leveraging the CMT for high-quality streaming video over heterogeneous wireless networks is largely unexplored.

In this paper, we investigate the problem by introducing video distortion into the decision process of multipath data transfer over heterogeneous wireless networks. The proposed Distortion-Aware Concurrent Multipath Transfer (CMT-DA) solution is a transport-layer protocol and includes three phases: 1) per-path status estimation and congestion control to exploit the available channel resources; 2) data flow rate allocation to minimize the end-to-end video distortion; 3) delay and loss constrained data retransmission for bandwidth conservation. The detailed descriptions of the proposed solution will be presented in Section 4. Specifically, the contributions of this paper can be summarized in the following.

_ An effective CMT solution that uses path status estimation, flow rate allocation, and retransmission control to optimize the real-time video quality in integrated heterogeneous wireless networks.

_ A mathematical formulation of video data distribution over parallel communication paths to minimize the end-to-end distortion. The utility maximization theory is employed to derive the solution for optimal transmission rate assignment extensive semi-physical emulations in Exata involving real-time H.264 video streaming.

LITRATURE SURVEY:

CMT-QA: QUALITY-AWARE ADAPTIVE CONCURRENT MULTIPATH DATA TRANSFER IN HETEROGENEOUS WIRELESS NETWORKS

AUTHOR: C. Xu, T. Liu, J. Guan, H. Zhang, and G. M. Muntean,

PUBLICATION: IEEE Trans. Mobile Comput., vol. 12, no. 11, pp. 2193–2205, Nov. 2013.

EXPLANATION:

Mobile devices equipped with multiple network interfaces can increase their throughput by making use of parallel transmissions over multiple paths and bandwidth aggregation, enabled by the stream control transport protocol (SCTP). However, the different bandwidth and delay of the multiple paths will determine data to be received out of order and in the absence of related mechanisms to correct this, serious application-level performance degradations will occur. This paper proposes a novel quality-aware adaptive concurrent multipath transfer solution (CMT-QA) that utilizes SCTP for FTP-like data transmission and real-time video delivery in wireless heterogeneous networks. CMT-QA monitors and analyses regularly each path’s data handling capability and makes data delivery adaptation decisions to select the qualified paths for concurrent data transfer. CMT-QA includes a series of mechanisms to distribute data chunks over multiple paths intelligently and control the data traffic rate of each path independently. CMT-QA’s goal is to mitigate the out-of-order data reception by reducing the reordering delay and unnecessary fast retransmissions. CMT-QA can effectively differentiate between different types of packet loss to avoid unreasonable congestion window adjustments for retransmissions. Simulations show how CMT-QA outperforms existing solutions in terms of performance and quality of service.

PERFORMANCE ANALYSIS OF PROBABILISTIC MULTIPATH TRANSMISSION OF VIDEO STREAMING TRAFFIC OVER MULTI-RADIO WIRELESS DEVICES

AUTHOR: W. Song and W. Zhuang

PUBLICATION: IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1554–1564, 2012.

EXPLANATION:

Popular smart wireless devices become equipped with multiple radio interfaces. Multihoming support can be enabled to allow for multiple simultaneous associations with heterogeneous networks. In this study, we focus on video streaming traffic and propose analytical approaches to evaluate the packet-level and call-level performance of a multipath transmission scheme, which sends video traffic bursts over multiple available channels in a probabilistic manner. A probability generation function (PGF) and z-transform method is applied to derive the PGF of packet delay and any arbitrary moment in general. Particularly, we can obtain the average delay, delay jitter, and delay outage probability. The essential characteristics of video traffic are taken into account, such as deterministic burst intervals, highly dynamic burst length, and batch arrivals of transmission packets. The video substream traffic resulting from the probabilistic flow splitting is characterized by means of zero-inflated models. Further, the call-level performance, in terms of flow blocking probability and system throughput, is evaluated with a three-dimensional Markov process and compared with that of an always-best access selection. The numerical and simulations results demonstrate the effectiveness of our analysis framework and the performance gain of multipath transmission.

AN END-TO-END VIRTUAL PATH CONSTRUCTION SYSTEM FOR STABLE LIVE VIDEO STREAMING OVER HETEROGENEOUS WIRELESS NETWORKS

AUTHOR: S. Han, H. Joo, D. Lee, and H. Song

PUBLICATION: IEEE J. Sel. Areas Commun., vol. 29, no. 5, pp. 1032–1041, May 2011.

EXPLANATION:

In this paper, we propose an effective end-to-end virtual path construction system, which exploits path diversity over heterogeneous wireless networks. The goal of the proposed system is to provide a high quality live video streaming service over heterogeneous wireless networks. First, we propose a packetization-aware fountain code to integrate multiple physical paths efficiently and increase the fountain decoding probability over wireless packet switching networks. Second, we present a simple but effective physical path selection algorithm to maximize the effective video encoding rate while satisfying delay and fountain decoding failure rate constraints. The proposed system is fully implemented in software and examined over real WLAN and HSDPA networks.

SYSTEM ANALYSIS

EXISTING SYSTEM:

Existing method an effective approach in designing error-resilient wireless video broadcasting systems in recent years, Joint source-channel coding (JSCC) attracts increasing interests in both research community and industry because it shows better results in robust layered video transmission over error-prone channels of various techniques available during these years may be found. However, there are still many open problems in terms of how to serve heterogeneous users with diverse screen features and variable reception performances in wireless video broadcast system. One particular challenging problem of this heterogeneous quality-of-service (QoS) video provision is: the users would prefer flexible video with low quality to match their screens, at the same time; the video stream could be reliable received.

The main technical difficulties are as follows:

  • A distinctive characteristic in current wireless broadcast system is that the receivers are highly heterogeneous in terms of their terminal processing capabilities and available bandwidths. In source side, scalable video coding (SVC) has been proposed to provide an attractive solution to this problem.
  • However, in order to support flexible video broadcasting, the scalable video sources need to provide adaptation ability through a variety of schemes, such as scalable video stream extraction layer generation with different priority and summarization before they can be transmitted over the error-prone networks.

DISADVANTAGES:

  • Existing layered video data is very sensitive to transmission failures, the transmission must be more reliable, have low overhead and support large numbers of devices with heterogeneous characteristics. In broadcast and multicast networks, conventional schemes such as adaptive retransmission have their limitations, for example, retransmission may lead to implosion problem.
  • Forward error correction (FEC) and unequal error protection (UEP) are employed to provide the QoS support for video transmission. However, in order to obtain as minimum investment as possible in broadcasting system deployment, server-side must be designed more scalable, reliable, independent, and support vast number of autonomous receivers. Suitable FEC approaches are expected such that can eliminate the retransmission and lower the unnecessary receptions overhead at each receiver-side.
  • Conventionally, the joint source and channel coding are designed with seldom consideration in heterogeneous characteristics, and most of the above challenges are ignored in practical video broadcasting system. This leads to the need for heterogeneous QoS video provision in broadcasting network. This paper presents the point of view to study the hybrid-scalable video from new quality metric so as to support users’ heterogeneous requirements.

PROPOSED SYSTEM:

We proposed Distortion-Aware Concurrent Multipath Transfer (CMT-DA) solution is a transport-layer protocol and includes three phases: 1) per-path status estimation and congestion control to exploit the available channel resources; 2) data flow rate allocation to minimize the end-to-end video distortion; 3) delay and loss constrained data retransmission for bandwidth conservation an effective CMT solution that uses path status estimation, flow rate allocation, and retransmission control to optimize the real-time video quality in integrated heterogeneous wireless networks.

We propose a quality-aware adaptive concurrent multipath transfer (CMT-QA) scheme that distributes the data based on estimated path quality. Although the path status is an important factor that affects the scheduling policy, the application requirements should also be considered to guarantee the QoS. Basically, the proposed CMT-DA is different from the CMT-QA as we take the video distortion as the benchmark. Still, the proposed solutions (path status estimation, flow rate allocation, and retransmission control) in CMT-DA are significantly different from those in CMTQA. In another research conducted by a realistic evaluation tool-set is proposed to analyze and optimize the performance of multimedia distribution when taking advantage of CMT-based multihoming SCTP solutions.

ADVANTAGES:

  • We propose a novel out-of-order scheduling approach for in-order arriving of the data chunks in CMT-DA based on the progressive water-filling algorithm. Heterogeneous wireless networks based on fountain code. The encoded multipath streaming model proposed by Chow et al. is a joint multipath and FEC approach for real time live streaming applications.
  • We propose an end-to-end virtual path construction system that exploits the path diversity in heterogeneous wireless networks based on fountain code. The encoded multipath streaming model proposed by Chow et al. is a joint multipath and FEC approach for real time live streaming applications. The authors provide asymptotic analysis and derive closed-form solution for the FEC packets allocation.
  • The major components at the sender side are the parameter control unit, flow rate allocator, and retransmission controller. The parameter control unit is responsible for processing the acknowledgements (ACKs) feedback from the receiver, estimating the path status and adapting the congestion window size. The delay and loss requirements are imposed by the video applications to achieve the target video quality.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

v    Processor                                 –    Pentium –IV

  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor              –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Tools                                     :           Netbeans or Eclipse
  • Script :           Java Script
  • Document :           MS-Office 2007

Defeating Jamming With the Power of Silence A Game-Theoretic Analysis

Abstract:

The timing channel is a logical communication channel in which information is encoded in the timing between events. Recently, the use of the timing channel has been proposed as a countermeasure to reactive jamming attacks performed by an energy- constrained malicious node. In fact, while a jammer is able to disrupt the information contained in the attacked packets, timing information cannot be jammed, and therefore, timing channels can be exploited to deliver information to the receiver even on a jammed channel. Since the nodes under attack and the jammer have conflicting interests, their interactions can be modeled by means of game theory. Accordingly, in this paper, a game-theoretic model of the interactions between nodes exploiting the timing channel to achieve resilience to jamming attacks and a jammer is derived and analyzed. More specifically, the Nash equilibrium is studied in terms of existence, uniqueness, and convergence under best response dynamics. Furthermore, the case in which the communication nodes set their strategy and the jammer reacts accordingly is modeled and analyzed as a Stackelberg game, by considering both perfect and imperfect knowledge of the jammer’s utility function. Extensive numerical results are presented, showing the impact of network parameters on the system performance.

Introduction:

A timing channel is a communication channel which exploits silence intervals between consecutive transmissions to encode information. Recently, use of timing channels has been proposed in the wireless domain to support low rate, energy efficient communications  as well as covert and resilient communications Timing channels are more although not totally  immune from reactive jamming attacks. In fact, the interfering signal begins its disturbing action against the communication only after identifying an ongoing transmission, and thus after the timing information has been decoded by the receiver.

Timing channel-based communication scheme has been proposed to counteract jamming by establishing a low rate physical layer on top of the traditional physical/link layers using detection and timing of failed packet receptions at the receiver.

The energy cost of jamming the timing channel and the resulting trade-offs have been analyzed. The interactions between the jammer and the node whose transmissions are under attack, which we call target node.

Specifically, assume that the target node wants to maximize the amount of information that can be transmitted per unit of time by means of the timing channel, whereas, the jammer wants to minimize such amount of information while reducing the energy expenditure.

The target node and the jammer have conflicting interests; we develop a game theoretical framework that models their interactions. We investigate both the case in which these two adversaries play their strategies.

 The situation when the target node (the leader) anticipates the actions of the jammer (the follower). To this purpose, we study both the Nash Equilibria (NEs) and Stackelberg Equilibria (SEs) of our proposed games.

Existing system:

Recently, use of timing channels has been proposed in the wireless domain to support low rate, energy efficient communications as well as covert and resilient communications. In existing system methodologies to detect jamming attacks are illustrated; it is also shown that it is possible to identify which kind of jamming attack is ongoing by looking at the signal strength and other relevant network parameters, such as bit and packet errors. Several solutions against reactive jamming have been proposed that exploit different techniques, such as frequency hopping, power control and UN jammed bits.

Disadvantages:

  • Continuous jamming is very costly in terms of energy consumption for the jammer
  • Existing solutions usually rely on users’ cooperation and coordination, which might not be guaranteed in a jammed environment. In fact, the reactive jammer can totally disrupt each transmitted packet and, consequently, no information can be decoded and then used to this purpose.

Proposed system:

Our proposed system implementation focus on the resilience of timing channels to jamming attacks. In general, these attacks can completely disrupt communications when the jammer continuously emits a high power disturbing signal, i.e., when continuous jammingis performed.

Analyze the interactions between the jammer and the node whose transmissions are under attack, which we call target node. Specifically, we assume that the target node wants to maximize the amount of information that can be transmitted per unit of time by means of the timing channel, whereas, the jammer wants to minimize such amount of information while reducing the energy expenditure.

As the target node and the jammer have conflicting interests, we develop a game theoretical framework that models their interactions. We investigate both the case in which these two adversaries play their strategies simultaneously and the situation when the target node (the leader) anticipates the actions of the jammer (the follower). To this purpose, we study both the Nash Equilibria (NEs) and Stackelberg Equilibria (SEs) of our proposed games.

Advantages:

  • System model the interactions between a jammer and a target node as a jamming game
  • We prove the existence, uniqueness and convergence to the Nash equilibrium (NE) under best response dynamics
  • We prove the existence and uniqueness of the equilibrium of the Stackelberg game where the target node plays as a leader and the jammer reacts consequently
  • We investigate in this latter Stackelberg scenario the impact on the achievable performance of imperfect knowledge of the jammer’s utility function;
  • We conduct an extensive numerical analysis which shows that our proposed models well capture the main factors behind the utilization of timing channels, thus representing a promising framework for the design and understanding of such systems.

Modules:

NASH Equilibrium Analysis:

The Nash Equilibrium points (NEs), in which both players achieve their highest utility given the strategy profile of the opponent. In the following we also provide proofs of the existence, uniqueness and convergence to the Nash Equilibrium under best response dynamics.

Existence of the Nash Equilibrium:

 It is well known that the intersection points between bT(y) and bJ(x)are the NEs of the game. Therefore, to demonstrate the existence of at least one NE, it suffices to prove that bT(y) and bJ(x) have one or more intersection points. In other words, it is sufficient to find one or more pairs.

Uniqueness of the Nash Equilibrium:

After proving the NE existence in Theorem, let us prove the uniqueness of the NE, that is, there is only one strategy profile such that no player has incentive to deviate unilaterally.

Convergence to the Nash Equilibrium:

Analyze the convergence of the game to the NE when players follow Best Response Dynamics (BRD). In BRD the game starts from any initial point(x(0),y(0))∈Sand, at each successive step, each player plays its strategy by following its best response function.

Performance Analysis

The game allows the leader to achieve a utility which is atleast equal to the utility achieved in the ordinary game at the NE, if we assume perfect knowledge, that is, the target node is completely aware of the utility function of the jammer and its parameters, and thus it is able to evaluate bJ(x). Whereas, if some parameters in the utility function of the jammer are unknown at the target node

Conclusion:

Our system implementation proposed a game-theoretic model of the interactions between a jammer and a communication node that exploits a timing channel to improve resilience to jamming attacks. Structural properties of the utility functions of the two players have been analyzed and exploited to prove the existence and uniqueness of the Nash Equilibrium. The convergence of the game to the Nash Equilibrium has been studied and proved by analyzing the best response dynamics. Furthermore, as the reactive jammer is assumed to start transmitting its interference signal only after detecting activity of the node under attack, a Stackelberg game has been properly investigated, and proofs on the existence and uniqueness of the Stackelberg Equilibrium has been provided.

DATA-STREAM-BASED INTRUSION DETECTION SYSTEM FOR ADVANCED METERING INFRASTRUCTURE IN SMART GRID: A FEASIBILITY STUDY

ABSTRACT:

In this paper, we will focus on the security of advanced metering infrastructure (AMI), which is one of the most crucial components of SG. AMI serves as a bridge for providing bidirectional information flow between user domain and utility domain. AMI’s main functionalities encompass power measurement facilities, assisting adaptive power pricing and demand side management, providing self-healing ability, and interfaces for other systems.

AMI is usually composed of three major types of components, namely, smart meter, data concentrator, and central system (a.k.a. AMI headend) and bidirectional communication networks among those components. AMI is exposed to various security threats such as privacy breach, energy theft, illegal monetary gain, and other malicious activities. As AMI is directly related to revenue earning, customer power consumption, and privacy, of utmost importance is securing its infrastructure. In order to protect AMI from malicious attacks, we look into the intrusion detection system (IDS) aspect of security solution.

We can define IDS as a monitoring system for detecting any unwanted entity into a targeted system (such as AMI in our context). We treat IDS as a second line security measure after the first line of primary AMI security techniques such as encryption, authorization, and authentication, Hence, changing specifications in all key IDS sensors would be expensive and cumbersome. In this paper, we choose to employ anomaly-based IDS using data mining approaches.

INTRODUCTION

Smart grid (SG) is a set of technologies that integrate modern information technologies with present power grid system. Along with many other benefits, two-way communication, updating users about their consuming behavior, controlling home appliances and other smart components remotely, and monitoring power grid’s stability are unique features of SG. To facilitate such kinds of novel features, SG needs to incorporate many new devices and services. For communicating, monitoring, and controlling of these devices/services, there may also be a need for many new protocols and standards. However, the combination of all these new devices, services, protocols, and standards make SG a very complex system that is vulnerable to increased security threats—like any other complex systems are. In particular, because of its bidirectional, interoperable, and software-oriented nature, SG is very prone to cyber attacks. If proper security measures are not taken, a cyber attack on SG can potentially bring about a huge catastrophic impact on the whole grid and, thus, to the society. Thus, cyber security in SG is treated as one of the vital issues by the National Institute of Standards and Technology and the Federal Energy Regulatory Commission.

In this paper, we will focus on the security of advanced metering infrastructure (AMI), which is one of the most crucial components of SG. AMI serves as a bridge for providing bidirectional information flow between user domain and utility domain [2]. AMI’s main functionalities encompass power measurement facilities, assisting adaptive power pricing and demand side management, providing self-healing ability, and interfaces for other systems. AMI is usually composed of three major types of components, namely, smart meter, data concentrator, and central system (a.k.a. AMI headend) and bidirectional communication networks among those components. Being a complex system in itself, AMI is exposed to various security threats such as privacy breach, energy theft, illegal monetary gain, and other malicious activities. As AMI is directly related to revenue earning, customer power consumption, and privacy, of utmost importance is securing its infrastructure.

LITRATURE SURVEY

EFFICIENT AUTHENTICATION SCHEME FOR DATA AGGREGATION IN SMART GRID WITH FAULT TOLERANCE AND FAULT DIAGNOSIS

PUBLISH: IEEE Power Energy Soc. Conf. ISGT, 2012, pp. 1–8.

AUTOHR: D. Li, Z. Aung, J. R. Williams, and A. Sanchez

EXPLANATION:

Authentication schemes relying on per-packet signature and per-signature verification introduce heavy cost for computation and communication. Due to its constraint resources, smart grid’s authentication requirement cannot be satisfied by this scheme. Most importantly, it is a must to underscore smart grid’s demand for high availability. In this paper, we present an efficient and robust approach to authenticate data aggregation in smart grid via deploying signature aggregation, batch verification and signature amortization schemes to less communication overhead, reduce numbers of signing and verification operations, and provide fault tolerance. Corresponding fault diagnosis algorithms are contributed to pinpoint forged or error signatures. Both experimental result and performance evaluation demonstrate our computational and communication gains.

CYBER SECURITY ISSUES FOR ADVANCED METERING INFRASTRUCTURE (AMI)

PUBLISH: IEEE Power Energy Soc. Gen. Meet. – Convers. Del. Electr. Energy 21st Century, 2008, pp. 1–5.

AUTOHR: F. M. Cleveland

EXPLANATION:

Advanced Metering Infrastructure (AMI) is becoming of increasing interest to many stakeholders, including utilities, regulators, energy markets, and a society concerned about conserving energy and responding to global warming. AMI technologies, rapidly overtaking the earlier Automated Meter Reading (AMR) technologies, are being developed by many vendors, with portions being developed by metering manufacturers, communications providers, and back-office Meter Data Management (MDM) IT vendors. In this flurry of excitement, very little effort has yet been focused on the cyber security of AMI systems. The comment usually is “Oh yes, we will encrypt everything – that will make everything secure.” That comment indicates unawareness of possible security threats of AMI – a technology that will reach into a large majority of residences and virtually all commercial and industrial customers. What if, for instance, remote connect/disconnect were included as one AMI capability – a function of great interest to many utilities as it avoids truck rolls. What if a smart kid hacker in his basement cracked the security of his AMI system, and sent out 5 million disconnect commands to all customer meters on the AMI system.

INTRUSION DETECTION FOR ADVANCED METERING INFRASTRUCTURES: REQUIREMENTS AND ARCHITECTURAL DIRECTIONS

PUBLISH: IEEE Int. Conf. SmartGridComm, 2010, pp. 350–355.

AUTOHR: R. Berthier, W. H. Sanders, and H. Khurana,

EXPLANATION:

The security of Advanced Metering Infrastructures (AMIs) is of critical importance. The use of secure protocols and the enforcement of strong security properties have the potential to prevent vulnerabilities from being exploited and from having costly consequences. However, as learned from experiences in IT security, prevention is one aspect of a comprehensive approach that must also include the development of a complete monitoring solution. In this paper, we explore the practical needs for monitoring and intrusion detection through a thorough analysis of the different threats targeting an AMI. In order to protect AMI from malicious attacks, we look into the intrusion detection system (IDS) aspect of security solution. We can define IDS as a monitoring system for detecting any unwanted entity into a targeted system (such as AMI in our context). We treat IDS as a second line security measure after the first line of primary AMI security techniques such as encryption, authorization, and authentication, such as [3]. However, Cleveland [4] stressed that these first line security solutions alone are not sufficient for securing AMI.

MOA: MASSIVE ONLINE ANALYSIS, A FRAMEWORK FOR STREAM CLASSIFICATION AND CLUSTERING

PUBLISH: JMLR Workshop Conf. Proc., Workshop Appl. Pattern Anal., 2010, vol. 11, pp. 44–50.

AUTOHR: A. Bifet, G. Holmes, B. Pfahringer, P. Kranen, H. Kremer, T. Jansen, and T. Seidl

EXPLANATION:

In today’s applications, massive, evolving data streams are ubiquitous. Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA is designed to deal with the challenging problems of scaling up the implementation of state of the art algorithms to real world dataset sizes and of making algorithms comparable in benchmark streaming settings. It contains a collection of offline and online algorithms for both classification and clustering as well as tools for evaluation. Researchers benefit from MOA by getting insights into workings and problems of different approaches, practitioners can easily compare several algorithms and apply them to real world data sets and settings. MOA supports bi-directional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and is released under the GNU GPL license. Besides providing algorithms and measures for evaluation and comparison, MOA is easily extensible with new contributions and allows the creation of benchmark scenarios through storing and sharing setting files.

SECURING ADVANCED METERING INFRASTRUCTURE USING INTRUSION DETECTION SYSTEM WITH DATA STREAM MINING

PUBLISH: Proc. PAISI, 2012, vol. 7299, pp. 96–111

AUTOHR: M. A. Faisal, Z. Aung, J. Williams, and A. Sanchez

EXPLANATION:

Advanced metering infrastructure (AMI) is an imperative component of the smart grid, as it is responsible for collecting, measuring, analyzing energy usage data, and transmitting these data to the data concentrator and then to a central system in the utility side. Therefore, the security of AMI is one of the most demanding issues in the smart grid implementation. In this paper, we propose an intrusion detection system (IDS) architecture for AMI which will act as a complimentary with other security measures. This IDS architecture consists of three local IDSs placed in smart meters, data concentrators, and central system (AMI headend). For detecting anomaly, we use data stream mining approach on the public KDD CUP 1999 data set for analysis the requirement of the three components in AMI. From our result and analysis, it shows stream data mining technique shows promising potential for solving security issues in AMI.

DATA STREAM MINING ARCHITECTURE FOR NETWORK INTRUSION DETECTION

PUBLISH: IEEE Int. Conf. IRI, 2004, pp. 363–368

AUTOHR: N. C. N. Chu, A. Williams, R. Alhajj, and K. Barker

EXPLANATION:

In this paper, we propose a stream mining architecture which is based on a single-pass approach. Our approach can be used to develop efficient, effective, and active intrusion detection mechanisms which satisfy the near real-time requirements of processing data streams on a network with minimal overhead. The key idea is that new patterns can now be detected on-the-fly. They are flagged as network attacks or labeled as normal traffic, based on the current network trend, thus reducing the false alarm rates prevalent in active network intrusion systems and increasing the low detection rate which characterizes passive approaches.

RESEARCH ON DATA MINING TECHNOLOGIES APPLYING INTRUSION DETECTION

PUBLISH: Proc. IEEE ICEMMS, 2010, pp. 230–233

AUTOHR: Z. Qun and H. Wen-Jie

EXPLANATION:

Intrusion detection is one of network security area of technology main research directions. Data mining technology was applied to network intrusion detection system (NIDS), may automatically discover the new pattern from the massive network data, to reduce the workload of the manual compilation intrusion behavior patterns and normal behavior patterns. This article reviewed the current intrusion detection technology and the data mining technology briefly. Focus on data mining algorithm in anomaly detection and misuse detection of specific applications. For misuse detection, the main study the classification algorithm; for anomaly detection, the main study the pattern comparison and the cluster algorithm. In pattern comparison to analysis deeply the association rules and sequence rules . Finally, has analysed the difficulties which the current data mining algorithm in intrusion detection applications faced at present, and has indicated the next research direction.

AN EMBEDDED INTRUSION DETECTION SYSTEM MODEL FOR APPLICATION PROGRAM

PUBLISH: IEEE PACIIA, 2008, vol. 2, pp. 910–912.

AUTOHR: S. Wu and Y. Chen

EXPLANATION:

Intrusion detection is an effective security mechanism developed in the recent decade. Because of its wide applicability, intrusion detection becomes the key part of the security mechanism. The modern technologies and models in intrusion detection field are categorized and studied. The characters of current practical IDS are introduced. The theories and realization of IDS based on applications are presented. The basic ideas concerned with how to design and realize the embedded IDS for application are proposed.

ACCURACY UPDATED ENSEMBLE FOR DATA STREAMS WITH CONCEPT DRIFT

PUBLISH: Proc. 6th Int. Conf. HAIS Part II, 2011, pp. 155–163.

AUTOHR: D. Brzeziñski and J. Stefanowski

EXPLANATION:

In this paper we study the problem of constructing accurate block-based ensemble classifiers from time evolving data streams. AWE is the best-known representative of these ensembles. We propose a new algorithm called Accuracy Updated Ensemble (AUE), which extends AWE by using online component classifiers and updating them according to the current distribution. Additional modifications of weighting functions solve problems with undesired classifier excluding seen in AWE. Experiments with several evolving data sets show that, while still requiring constant processing time and memory, AUE is more accurate than AWE.

ACTIVE LEARNING WITH EVOLVING STREAMING DATA

PUBLISH: Proc. ECML-PKDD Part III, 2011, pp. 597–612.

AUTOHR: I. liobaitë, A. Bifet, B. Pfahringer, and G. Holmes

EXPLANATION:

In learning to classify streaming data, obtaining the true labels may require major effort and may incur excessive cost. Active learning focuses on learning an accurate model with as few labels as possible. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and classifiers need to adapt. Conventional active learning strategies concentrate on querying the most uncertain instances, which are typically concentrated around the decision boundary. If changes do not occur close to the boundary, they will be missed and classifiers will fail to adapt. In this paper we develop two active learning strategies for streaming data that explicitly handle concept drift. They are based on uncertainty, dynamic allocation of labeling efforts over time and randomization of the search space. We empirically demonstrate that these strategies react well to changes that can occur anywhere in the instance space and unexpectedly.

LEARNING FROM TIME-CHANGING DATA WITH ADAPTIVE WINDOWING

PUBLISH: Proc. SIAM Int. Conf. SDM, 2007, pp. 443–448.

AUTOHR: A. Bifet and R. Gavaldà,

EXPLANATION:

We present a new approach for dealing with distribution change and concept drift when learning from data sequences that may vary with time. We use sliding windows whose size, instead of being fixed a priori, is recomputed online according to the rate of change observed from the data in the window itself. This delivers the user or programmer from having to guess a time-scale for change. Contrary to many related works, we provide rigorous guarantees of performance, as bounds on the rates of false positives and false negatives. Using ideas from data stream algorithmics, we develop a time- and memory-efficient version of this algorithm, called ADWIN2. We show how to combine ADWIN2 with the Naïve Bayes (NB) predictor, in two ways: one, using it to monitor the error rate of the current model and declare when revision is necessary and, two, putting it inside the NB predictor to maintain up-to-date estimations of conditional probabilities in the data. We test our approach using synthetic and real data streams and compare them to both fixed-size and variable-size window strategies with good results.

DATA-DRIVEN COMPOSITION FOR SERVICE-ORIENTED SITUATIONAL WEB APPLICATIONS

ABSTRACT:

This paper presents a systematic data-driven approach to assisting situational application development. We first propose a technique to extract useful information from multiple sources to abstract service capabilities with set tags. This supports intuitive expression of user’s desired composition goals by simple queries, without having to know underlying technical details. A planning technique then exploits composition solutions which can constitute the desired goals, even with some potential new interesting composition opportunities. A browser-based tool facilitates visual and iterative refinement of composition solutions, to finally come up with the satisfying outputs. A series of experiments demonstrate the efficiency and effectiveness of our approach. Data-driven composition technique for situational web applications by using tag-based semantics in to illustrate the overall life-cycle of our “compose as-you-search” composition approach, to propose the clustering technique for deriving tag-based composition semantics, and to evaluate the composition planning effectiveness, respectively.

Compared with previous work, this paper is significantly updated by introducing a semi-supervised technique for clustering hierarchical tag based semantics from service documentations and human-annotated annotations. The derived semantics link service capabilities and developers’ processing goals, so that the composition is processed by planning the “Tag HyperLinks” from initialquery to the goals. The planning algorithm is also further evaluated in terms of recommendation quality, performance, and scalability over data sets from real-world service repositories. Results show that our approach reaches satisfying precision and high-quality composition recommendations. We also demonstrate that our approach can accommodate even larger size of services than real world repositories so as to promise performance. Besides, more details of our interactive development prototyping are presented. We particularly demonstrate how the composition UI can help developers intuitively compose situational applications, and iteratively refine their goals until requirements are finally satisfied.

 INTRODUCTION:

We develop and deliver software systems more quickly, and these systems must provide increasingly ambitious functionality to adapt ever-changing requirements and environments. Particularly, in recent a few years, the emergence and wide adoption of Web 2.0 have enlarged the body of service computing research. Web 2.0 not only focuses on the resource sharing and utilization from user and social perspective, but also exhibits the notion of “Web as a Platform” paradigm. A very important trend is that, more and more service consumers (including programmers, business analysts or even endusers) are capable of participating and collaborating for their own requirements and interests by means of developing situational software applications (also noted as “situated software”).

Software engineering perspective, situational software applications usually follow the opportunistic development fashion, where small subsets of users create applications to fulfill a specific purpose. Currently, composing available web-delivered services (including SOAP based web services, REST (RE presentational State Transfer) web services and RSS/Atom feeds) into a single web applications, or so called “service mashups” (or “mashups” for short) has been popular. They are supposed to be flexible response for new needs or demands and quick roll-out of some potentially unanticipated functionality. To support situational application development, a number of tools from both academia and industry have emerged.

However, we argue that, the large number of available services and the complexity of composition constraints make manual composition difficult. For the situational applications developers, who might be non-professional programmers, the key challenge remained is that they intend to represent their desired goals simply and intuitively, and be quickly navigated to proper service that can response their requests. They usually do not care about (or understand) the underlying technical details (e.g., syntactics, semantics, message mediation, etc). They just want to figure out all intermediate steps needed to generate desired outputs.

Moreover, many end-users may have a general wish to know what they are trying to achieve, but not know the specifics of what they want or what is possible. It means that the process of designing and developing the situational application requires not only the abstraction of individual services, but also much broader perspective on the evolving collections of services that can potentially incorporate with current onesWe first present a data-driven composition technique for situational web applications by using tag-based semantics in ICWS 2011 work.

The main contributions in this paper are to illustrate the overall life-cycle of our “composeas-you-search” composition approach, to propose the clustering technique for deriving tag-based composition semantics, and to evaluate the composition planning effectiveness, respectively. Compared with previous work, this paper is significantly updated by introducing a semi-supervised technique for clustering hierarchical tag-based semantics from service documentations and human-annotated annotations. The derived semantics link service capabilities and developers’ processing goals, so that the composition is processed by planning the “Tag HyperLinks” from initialquery to the goals.

The planning algorithm is also further evaluated in terms of recommendation quality, performance, and scalability over data sets from real-world service repositories. Results show that our approach reaches satisfying precision and high-quality composition recommendations. We also demonstrate that our approach can accommodate even larger size of services than real world repositories so as to promise performance. Besides, more details of our interactive development prototyping are presented. We particularly demonstrate how the composition UI can help developers intuitively compose situational applications, and iteratively refine their goals until requirements are finally satisfied.

SCOPE OF THE PROJECT

User-oriented abstraction: The tourist uses tags to represent their desired goals and find relevant services. Tags provide a uniform abstraction of user requirements and service capabilities, and lower the entry barrier to perform development. 

Data-driven development: In the whole development process, the tourist selects or inputs some tags, while some relevant services are recommended. This reflects a “Compose-as-you-Search” development process. Recommended services either process these tags as inputs, or produce these tags as outputs. As shown in Fig. 1, each service has some inputs and outputs, which are associated with tagged data. In this way, services can be connected to build data flows. Developers can search their goals by means of tags, and compose recommended services in a data driven fashion. 

Potential composition navigation: The developer is always assisted with possible composition suggestions, based on the tags in the current goals. The composition engine interprets the user queries and automatically generates some appropriate compositions alternatives by a planning algorithm (Section 4). The recommendations not only contain the desired outputs from the developers’ goals, but also suggest some interesting or relevant suggestions leading to potential new composition possibilities.

For example, the tag “Italian” introduced the Google Translation service, which tourist was not aware of such composition possibility. In this way, the composition process is not like traditional semantic web services techniques which might need specific goals, but leads to some emergent opportunities according to current application situations.

LITRATURE SURVEY:

COMPOSING DATA-DRIVEN SERVICE MASHUPS WITH TAG-BASED SEMANTIC ANNOTATIONS

AUTHOR: X. Liu, Q. Zhao, G. Huang, H. Mei, and T. Teng

PUBLISH: Proc. IEEE Int’l Conf. Web Services (ICWS ’11), pp. 243-250, 2011.

EXPLANATION:

Spurred by Web 2.0 paradigm, there emerge large numbers of service mashups by composing readily accessible data and services. Mashups usually address solving situational problems and require quick and iterative development lifecyle. In this paper, we propose an approach to composing data driven mashups, based on tag-based semantics. The core principle is deriving semantic annotations from popular tags, and associating them with programmatic inputs and outputs data. Tag-based semantics promise a quick and simple comprehension of data capabilities. Mashup developers including end-users can intuitively search desired services with tags, and combine several services by means of data flows. Our approach takes a planning technique to retrieving the potentially relevant composition opportunities. With our graphical composition user interfaces, developers can iteratively modify, adjust and refine their mashups to be more satisfying.

TOWARDS AUTOMATIC TAGGING FOR WEB SERVICES

AUTHOR: L. Fang, L. Wang, M. Li, J. Zhao, Y. Zou, and L. Shao

PUBLISH: Proc. IEEE 19th Int’l Conf. Web Services, pp. 528-535, 2012.

EXPLANATION:

Tagging technique is widely used to annotate objects in Web 2.0 applications. Tags can support web service understanding, categorizing and discovering, which are important tasks in a service-oriented software system. However, most of existing web services’ tags are annotated manually. Manual tagging is time-consuming. In this paper, we propose a novel approach to tag web services automatically. Our approach consists of two tagging strategies, tag enriching and tag extraction. In the first strategy, we cluster web services using WSDL documents, and then we enrich tags for a service with the tags of other services in the same cluster. Considering our approach may not generate enough tags by tag enriching, we also extract tags from WSDL documents and related descriptions in the second step. To validate the effectiveness of our approach, a series of experiments are carried out based on web-scale web services. The experimental results show that our tagging method is effective, ensuring the number and quality of generated tags. We also show how to use tagging results to improve the performance of a web service search engine, which can prove that our work in this paper is useful and meaningful.

A TAG-BASED APPROACH FOR THE DESIGN AND COMPOSITION OF INFORMATION PROCESSING APPLICATIONS

AUTHOR: E. Bouillet, M. Feblowitz, Z. Liu, A. Ranganathan, and A. Riabov

PUBLISH: ACM SIGPLAN Notices, vol. 43, no. 10, pp. 585-602, Sept. 2008.

EXPLANATION:

In the realm of component-based software systems, pursuers of the holy grail of automated application composition face many significant challenges. In this paper we argue that, while the general problem of automated composition in response to high-level goal statements is indeed very difficult to solve, we can realize composition in a restricted context, supporting varying degrees of manual to automated assembly for specific types of applications. We propose a novel paradigm for composition in flow-based information processing systems, where application design and component development are facilitated by the pervasive use of faceted, tag-based descriptions of processing goals, of component capabilities, and of structural patterns of families of application. The facets and tags represent different dimensions of both data and processing, where each facet is modeled as a finite set of tags that are defined in a controlled folksonomy. All data flowing through the system, as well as the functional capabilities of components are described using tags. A customized AI planner is used to automatically build an application, in the form of a flow of components, given a high-level goal specification in the form of a set of tags. End-users use an automatically populated faceted search and navigation mechanism to construct these high-level goals. We also propose a novel software engineering methodology to design and develop a set of reusable, well-described components that can be assembled into a variety of applications. With examples from a case study in the Financial Services domain, we demonstrate that composition using a faceted, tag-based application design is not only possible, but also extremely useful in helping end-users create situational applications from a wide variety of available components.

Data Collection in Multi-Application Sharing Wireless Sensor Networks

Data sharing for data collection among multiple applications is an efficient way to reduce communication cost for Wireless Sensor Networks (WSNs). This paper is the first work to introduce the interval data sharing problem which is to investigate how to transmit as less data as possible over the network, and meanwhile the transmitted data satisfies the requirements of all the applications. Different from current studies where each application requires a single data sampling during each task, we study the problem where each application requires a continuous interval of data sampling in each task. The proposed problem is a nonlinear nonconvex optimization problem. In order to lower the high complexity for solving a nonlinear nonconvex optimization problem in resource restricted WSNs, a 2-factor approximation algorithm whose time complexity is Oðn2Þ and memory complexity is OðnÞ is provided. A special instance of this problem is also analyzed. This special instance can be solved with a dynamic programming algorithm in polynomial time, which gives an optimal result in Oðn2Þ time complexity and OðnÞ memory complexity.
Three online algorithms are provided to process the continually coming tasks. Both the theoretical analysis and simulation results demonstrate the effectiveness of the proposed algorithms

COST-AWARE SECURE ROUTING (CASER) PROTOCOL DESIGN FOR WIRELESS SENSOR NETWORKS

ABSTRACT:

Lifetime optimization and security are two conflicting design issues for multi-hop wireless sensor networks (WSNs) with non-replenishable energy resources. In this paper, we first propose a novel secure and efficient Cost-Aware SEcure Routing (CASER) protocol to address these two conflicting issues through two adjustable parameters: energy balance control (EBC) and probabilistic based random walking. We then discover that the energy consumption is severely disproportional to the uniform energy deployment for the given network topology, which greatly reduces the lifetime of the sensor networks. We propose an efficient non-uniform energy deployment strategy to optimize the lifetime and message delivery ratio under the same energy resource and security requirement. We also provide a quantitative security analysis on the proposed routing protocol.

Our theoretical analysis and java simulation results demonstrate that the proposed CASER protocol can provide an excellent tradeoff between routing efficiency and energy balance, and can significantly extend the lifetime of the sensor networks in all scenarios. For the non-uniform energy deployment, our analysis shows that we can increase the lifetime and the total number of messages that can be delivered by more than four times under the same assumption. We also demonstrate that the proposed CASER protocol can achieve a high message delivery ratio while preventing routing traceback attacks.

INTRODUCTION:

The recent technological advances make wireless sensor networks (WSNs) technically and economically feasible to be widely used in both military and civilian applications, such as monitoring of ambient conditions related to the environment, precious species and critical infrastructures. A key feature of such networks is that each network consists of a large number of untethered and unattended sensor nodes. These nodes often have very limited and non-replenishable energy resources, which makes energy an important design issue for these networks. Routing is another very challenging design issue for WSNs. A properly designed routing protocol should not only ensure high message delivery ratio and low energy consumption for message delivery, but also balance the entire sensor network energy consumption, and thereby extend the sensor network lifetime.

WSNs rely on wireless communications, which is by nature a broadcast medium. It is more vulnerable to security attacks than its wired counterpart due to lack of a physical boundary. In particular, in the wireless sensor domain, anybody with an appropriate wireless receiver can monitor and intercept the sensor network communications. The adversaries may use expensive radio transceivers, powerful workstations and interact with the network from a distance since they are not restricted to using sensor network hardware. It is possible for the adversaries to perform jamming and routing traceback attacks. Motivated by the fact that WSNs routing is often geography-based, we propose a geography-based secure and effi- cient Cost-Aware SEcure routing (CASER) protocol for WSNs without relying on flooding.

CASER allows messages to be transmitted using two routing strategies, random walking and deterministic routing, in the same framework. The distribution of these two strategies is determined by the specific security requirements. This scenario is analogous to delivering US Mail through USPS: express mails cost more than regular mails; however, mails can be delivered faster. The protocol also provides a secure message delivery option to maximize the message delivery ratio under adversarial attacks. In addition, we also give quantitative secure analysis on the proposed routing protocol based on the criteria proposed in CASER protocol has two major advantages: (i) It ensures balanced energy consumption of the entire sensor network so that the lifetime of the WSNs can be maximized. (ii) CASER protocol supports multiple routing strategies based on the routing requirements, including fast/slow message delivery and secure message delivery to prevent routing traceback attacks and malicious traffic jamming attacks in WSNs.

Our contributions of this paper can be summarized as follows:

1) We propose a secure and efficient Cost-Aware SEcure Routing (CASER) protocol for WSNs. In this protocol, cost-aware based routing strategies can be applied to address the message delivery requirements.

2) We devise a quantitative scheme to balance the energy consumption so that both the sensor network lifetime and the total number of messages that can be delivered are maximized under the same energy deployment (ED).

3) We develop theoretical formulas to estimate the number of routing hops in CASER under varying routing energy balance control (EBC) and security requirements.

4) We quantitatively analyze security of the proposed routing algorithm.

5) We provide an optimal non-uniform energy deployment (noED) strategy for the given sensor networks based on the energy consumption ratio. Our theoretical and simulation results both show that under the same total energy deployment, we can increase the lifetime and the number of messages that can be delivered more than four times in the non-uniform energy deployment scenario.

LITRATURE SURVEY:

QUANTITATIVE MEASUREMENT AND DESIGN OF SOURCE-LOCATION PRIVACY SCHEMES FOR WIRELESS SENSOR NETWORKS

AUTHOR: Y. Li, J. Ren, and J. Wu

PUBLISH: IEEE Trans. Parallel Distrib. Syst., vol. 23, no. 7, pp. 1302–1311, Jul. 2012.

EXPLANATION:

Wireless sensor networks (WSNs) have been widely used in many areas for critical infrastructure monitoring and information collection. While confidentiality of the message can be ensured through content encryption, it is much more difficult to adequately address source-location privacy (SLP). For WSNs, SLP service is further complicated by the nature that the sensor nodes generally consist of low-cost and low-power radio devices. Computationally intensive cryptographic algorithms (such as public-key cryptosystems), and large scale broadcasting-based protocols may not be suitable. In this paper, we first propose criteria to quantitatively measure source-location information leakage in routing-based SLP protection schemes for WSNs. Through this model, we identify vulnerabilities of some well-known SLP protection schemes. We then propose a scheme to provide SLP through routing to a randomly selected intermediate node (RSIN) and a network mixing ring (NMR). Our security analysis, based on the proposed criteria, shows that the proposed scheme can provide excellent SLP. The comprehensive simulation results demonstrate that the proposed scheme is very efficient and can achieve a high message delivery ratio. We believe it can be used in many practical applications.

PROVIDING HOP-BY-HOP AUTHENTICATION AND SOURCE PRIVACY IN WIRELESS SENSOR NETWORKS

AUTHOR: Y. Li, J. Li, J. Ren, and J. Wu

PUBLISH: IEEE Conf. Comput. Commun. Mini-Conf., Orlando, FL, USA, Mar. 2012, pp. 3071–3075.

EXPLANATION:

Message authentication is one of the most effective ways to thwart unauthorized and corrupted traffic from being forwarded in wireless sensor networks (WSNs). To provide this service, a polynomial-based scheme was recently introduced. However, this scheme and its extensions all have the weakness of a built-in threshold determined by the degree of the polynomial: when the number of messages transmitted is larger than this threshold, the adversary can fully recover the polynomial. In this paper, we propose a scalable authentication scheme based on elliptic curve cryptography (ECC). While enabling intermediate node authentication, our proposed scheme allows any node to transmit an unlimited number of messages without suffering the threshold problem. In addition, our scheme can also provide message source privacy. Both theoretical analysis and simulation results demonstrate that our proposed scheme is more efficient than the polynomial-based approach in terms of communication and computational overhead under comparable security levels while providing message source privacy.

SOURCE-LOCATION PRIVACY THROUGH DYNAMIC ROUTING IN WIRELESS SENSOR NETWORKS

AUTHOR: Y. Li and J. Ren

PUBLISH: IEEE INFOCOM 2010, San Diego, CA, USA., Mar. 15–19, 2010. pp. 1–9.

EXPLANATION:

Wireless sensor networks (WSNs) have the potential to be widely used in many areas for unattended event monitoring. Mainly due to lack of a protected physical boundary, wireless communications are vulnerable to unauthorized interception and detection. Privacy is becoming one of the major issues that jeopardize the successful deployment of wireless sensor networks. While confidentiality of the message can be ensured through content encryption, it is much more difficult to adequately address the source-location privacy. For WSNs, source-location privacy service is further complicated by the fact that the sensor nodes consist of low-cost and low-power radio devices, computationally intensive cryptographic algorithms and large scale broadcasting-based protocols are not suitable for WSNs. In this paper, we propose source-location privacy schemes through routing to randomly selected intermediate node(s) before the message is transmitted to the SINK node. We first describe routing through a single a single randomly selected intermediate node away from the source node. Our analysis shows that this scheme can provide great local source-location privacy. We also present routing through multiple randomly selected intermediate nodes based on angle and quadrant to further improve the global source location privacy. While providing source-location privacy for WSNs, our simulation results also demonstrate that the proposed schemes are very efficient in energy consumption, and have very low transmission latency and high message delivery ratio. Our protocols can be used for many practical applications.

SYSTEM ANALYSIS:

EXISTING SYSTEM:

In Geographic and energy aware routing (GEAR), the sink node disseminates requests with geographic attributes to the target region instead of using flooding. Each node forwards messages to its neighboring nodes based on estimated cost and learning cost. Source-location privacy is provided through broadcasting that mixes valid messages with dummy messages. The transmission of dummy messages not only consumes significant amount of sensor energy, but also increases the network collisions and decreases the packet delivery ratio. In phantom routing protocol, each message is routed from the actual source to a phantom source along a designed directed walk through either sector based approach or hop-based approach. The direction/sector information is stored in the header of the message. In this way, the phantom source can be away from the actual source. Unfortunately, once the message is captured on the random walk path, the adversaries are able to get the direction/sector information stored in the header of the message.

DISADVANTAGES:

  • More energy consumption
  • Increase the network collision
  • Reduce the packet delivery ratio
  • Cannot provide the full secure for packets

PROPOSED SYSTEM:

We propose a secure and efficient Cost Aware Secure Routing (CASER) protocol that can address energy balance and routing security concurrently in WSNs. In CASER routing protocol, each sensor node needs to maintain the energy levels of its immediate adjacent neighboring grids in addition to their relative locations. Using this information, each sensor node can create varying filters based on the expected design tradeoff between security and efficiency. The quantitative security analysis demonstrates the proposed algorithm can protect the source location information from the adversaries. In this project, we will focus on two routing strategies for message forwarding: shortest path message forwarding, and secure message forwarding through random walking to create routing path unpredictability for source privacy and jamming prevention.

  • We propose a secure and efficient Cost-Aware SEcure Routing (CASER) protocol for WSNs. In this protocol, cost-aware based routing strategies can be applied to address the message delivery requirements.
  • We devise a quantitative scheme to balance the energy consumption so that both the sensor network lifetime and the total number of messages that can be delivered are maximized under the same energy deployment (ED).
  • We develop theoretical formulas to estimate the number of routing hops in CASER under varying routing energy balance control (EBC) and security requirements.
  • We quantitatively analyze security of the proposed routing algorithm. We provide an optimal non-uniform energy deployment (noED) strategy for the given sensor networks based on the energy consumption ratio.
  • Our theoretical and simulation results both show that under the same total energy deployment, we can increase the lifetime and the number of messages that can be delivered more than four times in the non-uniform energy deployment scenario.

ADVANTAGES:

  • Reduce the energy consumption
  • Provide the more secure for packet and also routing
  • Increase the message delivery ratio
  • Reduce the time delay

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT

v    Processor                                 –    Pentium –IV

  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor      –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Tools :           Netbeans 7
  • Document :           MS-Office 2007

CONTENT-BASED IMAGE RETRIEVAL USING ERROR DIFFUSION BLOCK TRUNCATION CODING FEATURES

ABSTRACT:

This paper presents a new approach to index color images using the features extracted from the error diffusion block truncation coding (EDBTC). The EDBTC produces two color quantizers and bitmap images, which are further, processed using vector quantization (VQ) to generate the image feature descriptor. Herein two features are introduced, namely, color histogram feature (CHF) and bit pattern histogram feature (BHF), to measure the similarity between a query image and the target image in database.

The CHF and BHF are computed from the VQ-indexed color quantizer and VQ-indexed bitmap image, respectively. The distance computed from CHF and BHF can be utilized to measure the similarity between two images. As documented in the experimental result, the proposed indexing method outperforms the former block truncation coding based image indexing and the other existing image retrieval schemes with natural and textural data sets. Thus, the proposed EDBTC is not only examined with good capability for image compression but also offers an effective way to index images for the content based image retrieval system.

INTRODUCTION

Many former schemes have been developed to improve the retrieval accuracy in the content-based image retrieval (CBIR) system. One type of them is to employ image features derived from the compressed data stream as opposite to the classical approach that extracts an image descriptor from the original image; this retrieval scheme directly generates image features from the compressed stream without first performing the decoding process. This type of retrieval aims to reduce the time computation for feature extraction/generation since most of the multimedia images are already converted to compressed domain before they are recorded in any storage devices. In the image features are directly constructed from the typical block truncation coding (BTC) or halftoning-based BTC compressed data stream without performing the decoding procedure.

These image retrieval schemes involve two phases, indexing and searching, to retrieve a set of similar images from the database.

The indexing phase extracts the image features from all of the images in the database which is later stored in database as feature vector. In the searching phase, the retrieval system derives the image features from an image submitted by a user (as query image), which are later utilized for performing similarity matching on the feature vectors stored in the database. The image retrieval system finally returns a set of images to the user with a specific similarity criterion, such as color similarity and texture similarity. The concept of the BTC is to look for a simple set of representative vectors to replace the original images. Specifically, the BTC compresses an image into a new domain by dividing the original image into multiple nonoverlapped image blocks, and each block is then represented with two extreme quantizers (i.e., high and low mean values) and bitmap image. Two subimages constructed by the two quantizers and the corresponding bitmap image are produced at the end of BTC encoding stage, which are later transmitted into the decoder module through the transmitter. To generate the bitmap image, the BTC scheme performs thresholding operation using the mean value of each image block such that a pixel value greater than the mean value is regarded as 1 (white pixel) and vice versa.

The traditional BTC method does not improve the image quality or compression ratio compared with JPEG or JPEG 2000. However, the BTC schemes achieve much lower computational complexity compared with that of these techniques. Some attempts have been addressed to improve the BTC reconstructed image quality and compression ratio, and also to reduce the time computation. Even though the BTC scheme needs low computational complexity, it often suffers from blocking effect and false contour problems, making it less satisfactory for human perception. The halftoning-based BTC, namely, error diffusion BTC (EDBTC) is proposed to overcome the two above disadvantages of the BTC. Similar to the BTC scheme, EDBTC looks for a new representation (i.e., two quantizers and bitmap image) for reducing the storage requirement. The EDBTC bitmap image is constructed by considering the quantized error which diffuses to the nearby pixels to compensate the overall brightness, and thus, this error difussion strategy effectively removes the annoying blocking effect and false contour, while maintaining the low computational complexity.

The low-pass nature of human visual system is employed in to access the reconstructed image quality, in which the continuous image and its halftone version are perceived similarly by human vision when these two images viewed from a distance. The EDBTC method divides a given image into multiple nonoverlapped image blocks and each block is processed independently to obtain two extreme quantizers. This unique feature of independent processing enables the parallelism scenario. In bitmap image generation step, the pixel values in each block are thresholded by a fixed average value in the block with employing error kernel to diffuse the quantization error to the neighboring pixels during the encoding stage. A new image retrieval system has been proposed for the color image.

Three feature descriptors, namely, structure element correlation (SEC), gradient value correlation (GVC), and gradient direction correlation (GDC) are utilized to measure the similarity between the query and the target images in database. This indexing scheme provides a promising result in big database and outperforms the former existing approaches, as reported in the method in compresses a grayscale image by combining the effectiveness of fractal encoding, discrete cosine transform (DCT), and standard deviation of an image block. An auxiliary encoding algorithm has also been proposed to improve the image quality and to reduce the blocking effect. As reported in this new encoding system achieves a good coding gain as well as the promising image quality with very efficient computation. In a new method for tamper detection and recovery is proposed utilizing the DCT coefficient, fractal coding scheme, and the matched block technique. This new scheme yields a higher tampering detection rate and achieves good restored image quality, as demonstrated in combines the fractal image compression and wavelet transform to reduce the time computation in image encoding stage.

This method produces a good image quality with a fast encoding speed, as reported in the fast and efficient image coding with the no-search fractal coding strategies have been proposed methods employ the modified graylevel transform to improve the successful matching probability between the range and domain block in the fractal coding. Two gray-level transforms on quadtree partition are used in to achieve a fast image coding and to improve the decoded image quality. The method in exploits a fitting plane method and a modified gray-level transform to speedup the encoding process. The fractal image coding presented in accelerates the image encoding stage, reduces the compression ratio, and simultaneously improves the reconstructed image quality. A fast fractal coding is also proposed in which utilizes the matching error threshold. This method first reduces the codebook capacity and takes advantage of matching error threshold to shorten the encoding runtime. The method in can achieve a similar or better decoded image with the fast compression process compared with the conventional fractal encoding system with full search strategy.

The contributions can be summarized as follows: 1) extending the EDBTC image compression technique for the color image; 2) proposing two feature descriptors, namely, color histogram feature (CHF) and bit pattern histogram feature (BHF), which can be directly derived from the EDBTC compressed data stream without performing decoding process; and 3) presenting a new low complexity joint CBIR system and color image compression by exploiting the superiority of EDBTC scheme. The rest of this paper is organized as follows. A brief introduction of EDBTC is provided in Section II. Section III presents the proposed EDBTC image retrieval including the image feature generation and accuracy computation. Extensive experimental results are reported at Section IV. Finally, the conclusion is drawn at the end of this paper.

AUTHENTICATION HANDOVER AND PRIVACY PROTECTION IN 5G HETNETS USING SOFTWARE-DEFINED NETWORKING

ABSTRACT:

Recently, densified small cell deployment with overlay coverage through coexisting heterogeneous networks has emerged as a viable solution for 5G mobile networks. However, this multi-tier architecture along with stringent latency requirements in 5G brings new challenges in security provisioning due to the potential frequent handovers and authentications in 5G small cells and HetNets. In this article, we review related studies and introduce SDN into 5G as a platform to enable efficient authentication hand – over and privacy protection. Our objective is to simplify authentication handover by global management of 5G HetNets through sharing of userdependent security context information among related access points. We demonstrate that SDN-enabled security solutions are highly efficient through its centralized control capability, which is essential for delay-constrained 5G communications.

However, the specific key designed for handover and different handover procedures for various scenarios will increase handover complexity when applied to 5G HetNets. As the authentication server is often located remotely, the delay due to frequent enquiries between small cell APs and the authentication server for user verification may be up to hundreds of milliseconds, which is unacceptable for 5G communications. The authors of have proposed simplified hand – over authentication schemes involving direct authentication between UE and APs based on public cryptography. These schemes realize mutual authentication and key agreements with new networks through a three-way handshake without contacting any third party, like an authentication, authorization, and accounting (AAA) server. Although the handover authentication procedure is simplified, computation cost and delay are increased due to the overhead for exchanging more cryptographic messages through a wireless interface. For the same reason, carrying a digital signature is secure but not efficient for dynamic 5G wireless communications.

INTRODUCTION:

Over the past few years, anywhere, anytime wireless connectivity has gradually become a reality and has resulted in remarkably increased mobile traffic. Mobile data traffic from prevailing smart terminals, multimedia-intensive social applications, video streaming, and cloud services is predicted to grow at a compound annual growth rate of 61 percent before 2018, and is expected to outgrow the capabilities of the current fourth generation (4G) and Long Term Evolution (LTE) infrastructure by 2020 [1]. This explosive growth of data traffic and shortage of spectrum have necessitated intensive research and development efforts on 5G mobile networks. However, the relatively narrow usable frequency bands between several hundred megahertz and a few gigahertzes have been almost fully occupied by a variety of licensed or unlicensed networks, including 2G, 3G, LTE, LTE-Advanced (LTEA), and Wi-Fi. Although dynamic spectrum allocation could provide some improvement, the only way to find enough new bandwidth for 5G is to explore idle spectrum in the millimeterwave range of 30~300 GHz.

Authenticated Key Exchange Protocols for Parallel Network File Systems

We study the problem of key establishment for secure many-to-many communications. The problem is inspired by the proliferation of large-scale distributed file systems supporting parallel access to multiple storage devices. Our work focuses on the current Internet standard for such file systems, i.e., parallel
Network File System (pNFS), which makes use of Kerberos to establish parallel session keys between clients and storage devices.
Our review of the existing Kerberos-based protocol shows that it has a number of limitations:

(i) a metadata server facilitating key exchange between the clients and the storage devices has heavy workload that restricts the scalability of the protocol;

(ii) the protocol does not provide forward secrecy;

(iii) the metadata server generates itself all the session keys that are used between the clients and storage devices, and this inherently leads to key escrow. In this paper, we propose a variety of authenticated key exchange protocols that are designed to address the above issues. We show that our protocols are capable of reducing up to approximately 54% of the workload of the metadata server and concurrently supporting forward secrecy and escrow-freeness. All this requires only a small fraction of increased computation overhead at the client.

AGGREGATED-PROOF BASED HIERARCHICAL AUTHENTICATION SCHEME FOR THE INTERNET OF THINGS

ABSTRACT:

The Internet of Things (IoT) is becoming an attractive system paradigm to realize interconnections through the physical, cyber, and social spaces. During the interactions among the ubiquitous things, security issues become noteworthy, and it is significant to establish enhanced solutions for security protection. In this work, we focus on an existing U2IoT architecture (i.e., unit IoT and ubiquitous IoT), to design an aggregated-proof based hierarchical authentication scheme (APHA) for the layered networks. Concretely, 1) the aggregated-proofs are established for multiple targets to achieve backward and forward anonymous data transmission; 2) the directed path descriptors, homomorphism functions, and Chebyshev chaotic maps are jointly applied for mutual authentication; 3) different access authorities are assigned to achieve hierarchical access control. Meanwhile, the BAN logic formal analysis is performed to prove that the proposed APHA has no obvious security defects, and it is potentially available for the U2IoT architecture and other IoT applications.

INTRODUCTION:

The Internet of Things (IoT) is emerging as an attractive system paradigm to integrate physical perceptions, cyber interactions, and social correlations, in which the physical objects, cyber entities, and social attributes are required to achieve interconnections with the embedded intelligence. During the interconnections, the IoT is suffering from severe security challenges, and there are potential vulnerabilities due to the complicated networks referring to heterogeneous targets, sensors, and backend management systems. It becomes noteworthy to address the security issues for the ubiquitous things in the IoT.

Recent studies have been worked on the general IoT, including system models, service platforms, infrastructure architectures, and standardization. Particularly, a human-society inspired U2IoT architecture (i.e., unit IoT and ubiquitous IoT) is proposed to achieve the physical cyber- social convergence in the U2IoT architecture, mankind neural system and social organization framework are introduced to establish the single-application and multi-application IoT frameworks.

Multiple unit IoTs compose a local IoT within a region, or an industrial IoT for an industry. The local IoTs and industrial IoTs are covered within a national IoT, and jointly form the ubiquitous IoT. Towards the IoT security, related works mainly refer to the security architectures and recommended countermeasures secure communication and networking mechanisms cryptography algorithms and application security solutions.

Current researches mainly refer to three aspects: system security, network security, and application security.

_ System security mainly considers a whole IoT system to identify the unique security and privacy challenges, to design systemic security frameworks, and to provide security measures and guidelines.

_ Network security mainly focuses on wireless communication networks (e.g., wireless sensor networks (WSN), radio frequency identification (RFID), and the Internet) to design key distribution algorithms, authentication protocols, advanced signature algorithms, access control mechanisms, and secure routing protocols. Particularly, authentication protocols are popular to address security and privacy issues in the IoT, and should be designed considering the things’ heterogeneity and hierarchy.

_ Application security serves for IoT applications (e.g.., multimedia, smart home, and smart grid), and resolves practical problems with particular scenario requirements.

Towards the U2IoT architecture, a reasonable authentication scheme should satisfy the following requirements. 1) Data CIA (i.e., confidentiality, integrity, and availability): The exchanged messages between any two legal entities should be protected against illegal access and modification. The communication channels should be reliable for the legal entities. 2) Hierarchical access control: Diverse access authorities are assigned to different entities to provide hierarchical interactions.

An unauthorised entity cannot access data exceeding its permission. 3) Forward security: Attackers cannot correlate any two communication sessions, and also cannot derive the previous interrogations according to the ongoing session. 4) Mutual authentication: The untrusted entities should pass each other’s verification so that only the legal entity can access the networks for data acquisition. 5) Privacy preservation: The sensors cannot correlate or disclose an individual target’s private information (e.g., location). Considering above security requirements, we design an aggregated proof based hierarchical authentication scheme (APHA) for the unit IoT.

EXISTING SYSTEM:

Existing WSN network is to be completely integrated into the Internet as part of the Internet of Things (IoT), it is necessary to consider various security challenges, such as the creation of a secure channel between an Internet host and a sensor node. In order to create such a channel, it is necessary to provide key management mechanisms that allow two remote devices to negotiate certain security credentials (e.g. secret keys) that will be used to protect the information flow analyze not only the applicability.

Existing mechanisms such as public key cryptography and pre-shared keys for sensor nodes in the IoT context, but also the applicability of those link-layer oriented key management systems (KMS) whose original purpose is to provide shared keys for sensor nodes belonging to the same WSNs to provide key management mechanisms to allow that two remote devices can negotiate certain security certificates (e.g., shared keys, Blom key pairs, and polynomial shares). The authors analyzed the applicability of existing mechanisms, including public key infrastructure (PKI) and pre-shared keys for sensor nodes in IoT contexts.

DISADVANTAGES:

Smart community model for IoT applications, and a cyber-physical system with the networked smart homes was introduced with security considerations. Filtering false network traffic and avoiding unreliable home gateways are suggested for safeguard. Meanwhile, the security challenges are discussed, including the cooperative authentication, unreliable node detection, target tracking, and intrusion detection group of individuals that hacked into federal sites and released confidential information to the public in the government is supposed to have the highest level of security, yet their system was easily breached.   Therefore, if all of our information is stored on the internet, people could hack into it, finding out everything about individuals lives. Also, companies could misuse the information that they are given access to.  This is a common mishap that occurs within companies all the time.  

PROPOSED SYSTEM:

We proposed scheme realizes data confidentiality and data integrity by the directed path descriptor and homomorphism based Chebyshev chaotic maps, establishes trust relationships via the lightweight mechanisms, and applies dynamically hashed values to achieve session freshness. It indicates that the APHA is suitable for the U2IoT architecture.

In this work, the main purpose is to provide bottom-up safeguard for the U2IoT architecture to realize secure interactions. Towards the U2IoT architecture, a reasonable authentication scheme should satisfy the following requirements.

1) Data CIA (i.e., confidentiality, integrity, and availability): The exchanged messages between any two legal entities should be protected against illegal access and modification. The communication channels should be reliable for the legal entities.

2) Hierarchical access control: Diverse access authorities are assigned to different entities to provide hierarchical interactions. An unauthorised entity cannot access data exceeding its permission.

3) Forward security: Attackers cannot correlate any two communication sessions, and also cannot derive the previous interrogations according to the ongoing session.

4) Mutual authentication: The untrusted entities should pass each other’s verification so that only the legal entity can access the networks for data acquisition.

5) Privacy preservation: The sensors cannot correlate or disclose an individual target’s private information (e.g., location). Considering above security requirements, we design an aggregated proof based hierarchical authentication scheme (APHA) for the ubiquitous IoT.

ADVANTAGES:

Aggregated-proofs are established by wrapping multiple targets’ messages for anonymous data transmission, which realizes that individual information cannot be revealed during both backward and forward communication channels.

Directed path descriptors are defined based on homomorphism functions to establish correlation during the cross-layer interactions. Chebyshev chaotic maps are applied to describe the mapping relationships between the shared secrets and the path descriptors for mutual authentication.

Diverse access authorities on the group identifiers and pseudonyms are assigned to different entities for achieving the hierarchical access control through the layered networks.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

v    Processor                                 –    Pentium –IV

  • Speed                                      –    1.1 GHz
    • RAM                                       –    256 MB (min)
    • Hard Disk                               –   20 GB
    • Floppy Drive                           –    1.44 MB
    • Key Board                              –    Standard Windows Keyboard
    • Mouse                                     –    Two or Three Button Mouse
    • Monitor                                   –    SVGA

 

SOFTWARE REQUIREMENTS:

  • Operating System                   :           Windows XP or Win7
  • Front End                                :           JAVA JDK 1.7
  • Back End                                :           MYSQL Server
  • Server                                      :           Apache Tomact Server
  • Script                                       :           JSP Script
  • Document                               :           MS-Office 2007

A TIME EFFICIENT APPROACH FOR DETECTING ERRORS IN BIG SENSOR DATA ON CLOUD

ABSTRACT:

Big sensor data is prevalent in both industry and scientific research applications where the data is generated with high volume and velocity it is difficult to process using on-hand database management tools or traditional data processing applications. Cloud computing provides a promising platform to support the addressing of this challenge as it provides a flexible stack of massive computing, storage, and software services in a scalable manner at low cost. Some techniques have been developed in recent years for processing sensor data on cloud, such as sensor-cloud. However, these techniques do not provide efficient support on fast detection and locating of errors in big sensor data sets.

We develop a novel data error detection approach which exploits the full computation potential of cloud platform and the network feature of WSN. Firstly, a set of sensor data error types are classified and defined. Based on that classification, the network feature of a clustered WSN is introduced and analyzed to support fast error detection and location. Specifically, in our proposed approach, the error detection is based on the scale-free network topology and most of detection operations can be conducted in limited temporal or spatial data blocks instead of a whole big data set. Hence the detection and location process can be dramatically accelerated.

Furthermore, the detection and location tasks can be distributed to cloud platform to fully exploit the computation power and massive storage. Through the experiment on our cloud computing platform of U-Cloud, it is demonstrated that our proposed approach can significantly reduce the time for error detection and location in big data sets generated by large scale sensor network systems with acceptable error detecting accuracy.

INTRODUCTION:

Recently, we enter a new era of data explosion which brings about new challenges for big data processing. In general, big data is a collection of data sets so large and complex that it becomes difficult to process with onhand database management systems or traditional data processing applications. It represents the progress of the human cognitive processes, usually includes data sets with sizes beyond the ability of current technology, method and theory to capture, manage, and process the data within a tolerable elapsed time. Big data has typical characteristics of five ‘V’s, volume, variety, velocity, veracity and value. Big data sets come from many areas, including meteorology, connectomics, complex physics simulations, genomics, biological study, gene analysis and environmental research. According to literature since 1980s, generated data doubles its size in every 40 months all over the world. In the year of 2012, there were 2.5 quintillion (2.5  1018) bytes of data being generated every day.

Hence, how to process big data has become a fundamental and critical challenge for modern society. Cloud computing provides apromising platform for big data processing with powerful computation capability, storage, scalability, resource reuse and low cost, and has attracted significant attention in alignment with big data. One of important source for scientific big data is the data sets collected by wireless sensor networks (WSN). Wireless sensor networks have potential of significantly enhancing people’s ability to monitor and interact with their physical environment. Big data set from sensors is often subject to corruption and losses due to wireless medium of communication and presence of hardware inaccuracies in the nodes. For a WSN application to deduce an appropriate result, it is necessary that the data received is clean, accurate, and lossless. However, effective detection and cleaning of sensor big data errors is a challenging issue demanding innovative solutions. WSN with cloud can be categorized as a kind of complex network systems. In these complex network systems such as WSN and social network, data abnormality and error become an annoying issue for the real network applications.

Therefore, the question of how to find data errors in complex network systems for improving and debugging the network has attracted the interests of researchers. Some work has been done for big data analysis and error detection in complex networks including intelligence sensors networks. There are also some works related to complex network systems data error detection and debugging with online data processing techniques. Since these techniques were not designed and developed to deal with big data on cloud, they were unable to cope with current dramatic increase of data size. For example, when big data sets are encountered, previous offline methods for error detectionand debugging on a single computer may take a long time and lose real time feedback. Because those offline methods are normally based on learning or mining, they often introduce high time cost during the process of data set training and pattern matching. WSN big data error detection commonly requires powerful real-time processing and storing of the massive sensor data as well as analysis in the context of using inherently complex error models to identify and locate events of abnormalities.

In this paper, we aim to develop a novel error detection approach by exploiting the massive storage, scalability and computation power of cloud to detect errors in big data sets from sensor networks. Some work has been done about processing sensor data on cloud. However, fast detection of data errors in big data with cloud remains challenging. Especially, how to use the computation power of cloud to quickly find and locate errors of nodes in WSN needs to be explored. Cloud computing, a disruptive trend at present, poses a significant impact on current IT industry and research communities. Cloud computing infrastructure is becoming popular because it provides an open, flexible, scalable and reconfigurable platform. The proposed error detection approach in this paper will be based on the classification of error types. Specifically, nine types of numerical data abnormalities/errors are listed and introduced in our cloud error detection approach. The defined error model will trigger the error detection process. Compared to previous error detection of sensor network systems, our approach on cloud will be designed and developed by utilizing the massive data processing capability of cloud to enhance error detection speed and real time reaction. In addition, the architecture feature of complex networks will also be analyzed to combine with the cloud computing with a more efficient way. Based on current research literature review, we divide complex network systems into scale-free type and non scale-free type. Sensor network is a kind of scale-free complex network system which matches cloud scalability feature.

A SCALABLE AND RELIABLE MATCHING SERVICE FOR CONTENT-BASED PUBLISH/SUBSCRIBE SYSTEMS

ABSTRACT:

Characterized by the increasing arrival rate of live content, the emergency applications pose a great challenge: how to disseminate large-scale live content to interested users in a scalable and reliable manner. The publish/subscribe (pub/sub) model is widely used for data dissemination because of its capacity of seamlessly expanding the system to massive size. However, most event matching services of existing pub/sub systems either lead to low matching throughput when matching a large number of skewed subscriptions, or interrupt dissemination when a large number of servers fail. The cloud computing provides great opportunities for the requirements of complex computing and reliable communication.

In this paper, we propose SREM, a scalable and reliable event matching service for content-based pub/sub systems in cloud computing environment. To achieve low routing latency and reliable links among servers, we propose a distributed overlay Skip Cloud to organize servers of SREM. Through a hybrid space partitioning technique HPartition, large-scale skewed subscriptions are mapped into multiple subspaces, which ensures high matching throughput and provides multiple candidate servers for each event.

Moreover, a series of dynamics maintenance mechanisms are extensively studied. To evaluate the performance of SREM, 64 servers are deployed and millions of live content items are tested in a Cloud Stack testbed. Under various parameter settings, the experimental results demonstrate that the traffic overhead of routing events in SkipCloud is at least 60 percent smaller than in Chord overlay, the matching rate in SREM is at least 3.7 times and at most 40.4 times larger than the single-dimensional partitioning technique of BlueDove. Besides, SREM enables the event loss rate to drop back to 0 in tens of seconds even if a large number of servers fail simultaneously.

INTRODUCTION

Because of the importance in helping users to make realtime decisions, data dissemination has become dramatically significant in many large-scale emergency applications, such as earthquake monitoring, disaster weather warning and status update in social networks. Recently, data dissemination in these emergency applications presents a number of fresh trends. One is the rapid growth of live content. For instance, Facebook users publish over 600,000 pieces of content and Twitter users send over 100,000 tweets on average per minute. The other is the highly dynamic network environment. For instance, the measurement studies indicate that most users’ sessions in social networks only last several minutes. In emergency scenarios, the sudden disasters like earthquake or bad weather may lead to the failure of a large number of users instantaneously.

These characteristics require the data dissemination system to be scalable and reliable. Firstly, the system must be scalable to support the large amount of live content. The key is to offer a scalable event matching service to filter out irrelevant users. Otherwise, the content may have to traverse a large number of uninterested users before they reach interested users. Secondly, with the dynamic network environment, it’s quite necessary to provide reliable schemes to keep continuous data dissemination capacity. Otherwise, the system interruption may cause the live content becomes obsolete content. Driven by these requirements, publish/subscribe (pub/ sub) pattern is widely used to disseminate data due to its flexibility, scalability, and efficient support of complex event processing. In pub/sub systems (pub/subs), a receiver (subscriber) registers its interest in the form of a subscription. Events are published by senders to the pub/ sub system.

The system matches events against subscriptions and disseminates them to interested subscribers.

In traditional data dissemination applications, the live content are generated by publishers at a low speed, which makes many pub/subs adopt the multi-hop routing techniques to disseminate events. A large body of broker-based pub/subs forward events and subscriptions through organizing nodes into diverse distributed overlays, such as treebased design cluster-based design and DHT-based design. However, the multihop routing techniques in these broker-based systems lead to a low matching throughput, which is inadequate to apply to current high arrival rate of live content.

Recently, cloud computing provides great opportunities for the applications of complex computing and high speed communication where the servers are connected by high speed networks, and have powerful computing and storage capacities. A number of pub/sub services based on the cloud computing environment have been proposed, such as Move BlueDove and SEMAS. However, most of them can not completely meet the requirements of both scalability and reliability when matching large-scale live content under highly dynamic environments.

This mainly stems from the following facts:

1) Most of them are inappropriate to the matching of live content with high data dimensionality due to the limitation of their subscription space partitioning techniques, which bring either low matching throughput or high memory overhead.

2) These systems adopt the one-hop lookup technique among servers to reduce routing latency. In spite of its high efficiency, it requires each dispatching server to have the same view of matching servers. Otherwise, the subscriptions or events may be assigned to the wrong matching server, which brings the availability problem in the face of current joining or crash of matching servers. A number of schemes can be used to keep the consistent view, like periodically sending heartbeat messages to dispatching servers or exchanging messages among matching servers. However, these extra schemes may bring a large traffic overhead or the interruption of event matching service.

LITRATURE SURVEY

RELIABLE AND HIGHLY AVAILABLE DISTRIBUTED PUBLISH/SUBSCRIBE SERVICE

PUBLICATION: Proc. 28th IEEE Int. Symp. Reliable Distrib. Syst., 2009, pp. 41–50.

AUTHORS: R. S. Kazemzadeh and H.-A Jacobsen

EXPLANATION:

This paper develops reliable distributed publish/subscribe algorithms with service availability in the face of concurrent crash failure of up to delta brokers. The reliability of service in our context refers to per-source in-order and exactly-once delivery of publications to matching subscribers. To handle failures, brokers maintain data structures that enable them to reconnect the topology and compute new forwarding paths on the fly. This enables fast reaction to failures and improves the system’s availability. Moreover, we present a recovery procedure that recovering brokers execute in order to re-enter the system, and synchronize their routing information.

BUILDING A RELIABLE AND HIGH-PERFORMANCE CONTENT-BASED PUBLISH/SUBSCRIBE SYSTEM

PUBLICATION: J. Parallel Distrib. Comput., vol. 73, no. 4, pp. 371–382, 2013.

AUTHORS: Y. Zhao and J. Wu

EXPLANATION:

Provisioning reliability in a high-performance content-based publish/subscribe system is a challenging problem. The inherent complexity of content-based routing makes message loss detection and recovery, and network state recovery extremely complicated. Existing proposals either try to reduce the complexity of handling failures in a traditional network architecture, which only partially address the problem, or rely on robust network architectures that can gracefully tolerate failures, but perform less efficiently than the traditional architectures. In this paper, we present a hybrid network architecture for reliable and high-performance content-based publish/subscribe. Two overlay networks, a high-performance one with moderate fault tolerance and a highly-robust one with sufficient performance, work together to guarantee the performance of normal operations and reliability in the presence of failures. Our design exploits the fact that, in a high-performance content-based publish/subscribe system, subscriptions are broadcast to all brokers, to facilitate efficient backup routing when failures occur, which incurs a minimal overhead. Per-hop reliability is used to gracefully detect and recover lost messages that are caused by transit errors. Two backup routing methods based on DHT routing are proposed. Extensive simulation experiments are conducted. The results demonstrate the superior performance of our system compared to other state-of-the-art proposals.

SCALABLE AND ELASTIC EVENT MATCHING FOR ATTRIBUTE-BASED PUBLISH/SUBSCRIBE SYSTEMS

PUBLICATION: Future Gener. Comput. Syst., vol. 36, pp. 102–119, 2013.

AUTHORS: X. Ma, Y. Wang, Q. Qiu, W. Sun, and X. Pei

EXPLANATION:

Due to the sudden change of the arrival live content rate and the skewness of the large-scale subscriptions, the rapid growth of emergency applications presents a new challenge to the current publish/subscribe systems: providing a scalable and elastic event matching service. However, most existing event matching services cannot adapt to the sudden change of the arrival live content rate, and generate a non-uniform distribution of load on the servers because of the skewness of the large-scale subscriptions. To this end, we propose SEMAS, a scalable and elastic event matching service for attribute-based pub/sub systems in the cloud computing environment. SEMAS uses one-hop lookup overlay to reduce the routing latency. Through ahierarchical multi-attribute space partition technique, SEMAS adaptively partitions the skewed subscriptions and maps them into balanced clusters to achieve high matching throughput. The performance-aware detection scheme in SEMAS adaptively adjusts the scale of servers according to the churn of workloads, leading to high performance–price ratio. A prototype system on an OpenStack-based platform demonstrates that SEMAS has a linear increasing matching capacity as the number of servers and the partitioning granularity increase. It is able to elastically adjust the scale of servers and tolerate a large number of server failures with low latency and traffic overhead. Compared with existing cloud based pub/sub systems, SEMAS achieves higher throughput in various workloads.

SYSTEM ANALYSIS

EXISTING SYSTEM:

Characterized by the increasing arrival rate of live content, the emergency applications pose a great challenge: how to disseminate large-scale live content to interested users in a scalable and reliable manner. The publish/subscribe (pub/sub) model is widely used for data dissemination because of its capacity of seamlessly expanding the system to massive size. However, most event matching services of existing pub/sub systems either lead to low matching throughput when matching a large number of skewed subscriptions, or interrupt dissemination when a large number of servers fail.

However, most existing event matching services cannot adapt to the sudden change of the arrival live content rate, and generate a non-uniform distribution of load on the servers because of the skewness of the large-scale subscriptions. To this end SEMAS, a scalable and elastic event matching service for attribute-based pub/sub systems in the cloud computing environment. SEMAS uses one-hop lookup overlay to reduce the routing latency. Through ahierarchical multi-attribute space partition technique, SEMAS adaptively partitions the skewed subscriptions and maps them into balanced clusters to achieve high matching throughput.

The performance-aware detection scheme in SEMAS adaptively adjusts the scale of servers according to the churn of workloads, leading to high performance–price ratio. A prototype system on an OpenStack-based platform demonstrates that SEMAS has a linear increasing matching capacity as the number of servers and the partitioning granularity increase. It is able to elastically adjust the scale of servers and tolerate a large number of server failures with low latency and traffic overhead.

DISADVANTAGES:

Publish/Subscribe (pub/sub) is a commonly used asynchronous communication pattern among application components. Senders and receivers of messages are decoupled from each other and interact with an intermediary— a pub/sub system.

A receiver registers its interest in certain kinds of messages with the pub/sub system in the form of a subscription. Messages are published by senders to the pub/sub system. The system matches messages (i.e., publications) to subscriptions and delivers messages to interested subscribers using a notification mechanism.

There are several ways for subscriptions to specify messages of interest. In its simplest form messages are associated with topic strings and subscriptions are defined as patterns of the topic string. A more expressive form is attribute-based pub/sub where messages are further annotated with various attributes.

Subscriptions are expressed as predicates on the message topic and attributes. An even more general form is content based pub/sub where subscriptions can be arbitrary Boolean functions on the entire content of messages (e.g., XML documents), limited to attributes1.

Attribute based pub/sub strikes a balance between the simplicity and performance of topic-based pub/sub and the expressiveness of content-based pub/sub. Many large-scale and loosely coupled applications including stock quote distribution, network management, and environmental monitoring can be structured around a pub/sub messaging paradigm.

PROPOSED SYSTEM:

We propose a scalable and reliable matching service for content-based pub/sub service in cloud computing environments, called SREM. Specifically, we mainly focus on two problems: one is how to organize servers in the cloud computing environment to achieve scalable and reliable routing. The other is how to manage subscriptions and events to achieve parallel matching among these servers. Generally speaking, we provide the following contributions:

We propose a distributed overlay protocol, called SkipCloud, to organize servers in the cloud computing environment. SkipCloud enables subscriptions and events to be forwarded among brokers in a scalable and reliable manner. Also it is easy to implement and maintain.

  • To achieve scalable and reliable event matching among multiple servers, we propose a hybrid multidimensional space partitioning technique, called HPartition. It allows similar subscriptions to be divided into the same server and provides multiple candidate matching servers for each event. Moreover, it adaptively alleviates hot spots and keeps workload balance among all servers.
  • We implement extensive experiments based on a CloudStack testbed to verify the performance of SREM under various parameter settings.
  • In order to take advantage of multiple distributed brokers, SREM divides the entire content space among the top clusters of SkipCloud, so that each top cluster only handles a subset of the entire space and searches a small number of candidate subscriptions. SREM employs a hybrid multidimensional space partitioning technique, called HPartition, to achieve scalable and reliable event matching.

ADVANTAGES:

To achieve reliable connectivity and low routing latency, these brokers are connected through a distributed overlay, called SkipCloud. The entire content space is partitioned into disjoint subspaces, each of which is managed by a number of brokers. Subscriptions and events are dispatched to the subspaces that are overlapping with them through SkipCloud.

Since the pub/sub system needs to find all the matched subscribers, it requires each event to be matched in all datacenters, which leads to large traffic overhead with the increasing number of datacenters and the increasing arrival rate of live content.

Besides, it’s hard to achieve workload balance among the servers of all datacenters due to the various skewed distributions of users’ interests. Another question is that why we need a distributed overlay like SkipCloud to ensure reliable logical connectivity in datacenter environment where servers are more stable than the peers in P2P networks.

This is because as the number of servers increases in datacenters, the node failure becomes normal, but not rare exception. The node failure may lead to unreliable and inefficient routing among servers. To this end, we try to organize servers into SkipCloud to reduce the routing latency in a scalable and reliable manner.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

v    Processor                                 –    Pentium –IV

  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor      –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Back End :           MYSQL Server
  • Server :           Apache Tomact Server
  • Script :           JSP Script
  • Document :           MS-Office 2007