IDENTITY-BASED ENCRYPTION WITH OUTSOURCED REVOCATION IN CLOUD COMPUTING

ABSTRACT:

Identity-Based Encryption (IBE), which simplifies public key and certificate management in a Public Key Infrastructure (PKI), is an important alternative to public key encryption. However, one of the main efficiency drawbacks of IBE is the computational overhead at the Private Key Generator (PKG) during user revocation. Efficient revocation has been well studied in the traditional PKI setting, but the cumbersome management of certificates is precisely the burden that IBE strives to alleviate. In this paper, aiming to tackle the critical issue of identity revocation, we introduce outsourced computation into IBE for the first time and propose a revocable IBE scheme in the server-aided setting.

Our scheme offloads most of the key-generation-related operations during the key-issuing and key-update processes to a Key Update Cloud Service Provider (KU-CSP), leaving only a constant number of simple operations for the PKG and users to perform locally. This goal is achieved by a novel collusion-resistant technique: we employ a hybrid private key for each user, in which an AND gate connects and binds the identity component and the time component. Furthermore, we propose another construction that is provably secure under the recently formalized Refereed Delegation of Computation model. Finally, we provide extensive experimental results to demonstrate the efficiency of our proposed construction.

INTRODUCTION:

Identity-Based Encryption (IBE) is an interesting alternative to public key encryption. It was proposed to simplify key management in a certificate-based Public Key Infrastructure (PKI) by using human-intelligible identities (e.g., unique name, email address, IP address, etc.) as public keys. Therefore, a sender using IBE does not need to look up a public key and certificate, but directly encrypts the message with the receiver’s identity.

Accordingly, a receiver who obtains the private key associated with the corresponding identity from the Private Key Generator (PKG) is able to decrypt such a ciphertext. Though IBE allows an arbitrary string as the public key, which is considered an appealing advantage over PKI, it demands an efficient revocation mechanism. Specifically, if the private keys of some users get compromised, we must provide a means to revoke such users from the system. In the PKI setting, revocation is realized by appending validity periods to certificates or by using involved combinations of techniques.

Nevertheless, the cumbersome management of certificates is precisely the burden that IBE strives to alleviate. As far as we know, though revocation has been thoroughly studied in PKI, few revocation mechanisms are known in the IBE setting. Boneh and Franklin suggested that users renew their private keys periodically and that senders use the receivers’ identities concatenated with the current time period. But this mechanism would result in a heavy load at the PKG. In other words, all users, regardless of whether their keys have been revoked or not, have to contact the PKG periodically to prove their identities and obtain new private keys. It requires that the PKG be online and that a secure channel be maintained for all transactions, which will become a bottleneck for the IBE system as the number of users grows.
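As a minimal sketch of the Boneh and Franklin suggestion described above, the snippet below builds the string a sender would use as the receiver’s public key for a given period. The monthly period format, the "||" separator, and the function name period_identity are illustrative assumptions, not details taken from the original proposal.

```python
from datetime import date

def period_identity(identity: str, day: date) -> str:
    """Return the identity concatenated with the current time period.

    Assumption: periods are calendar months and "||" is the separator.
    """
    period = f"{day.year:04d}-{day.month:02d}"
    return f"{identity}||{period}"

# The sender encrypts under this string. The receiver needs a fresh
# private key from the PKG for every new period, revoked or not.
jan = period_identity("alice@example.com", date(2015, 1, 20))
feb = period_identity("alice@example.com", date(2015, 2, 3))
```

Because the public key changes every period, every user must return to the PKG every period, which is the load the paragraph above describes.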

A revocable IBE scheme was subsequently presented. That scheme is built on the idea of the fuzzy IBE primitive but uses a binary tree data structure to record users’ identities at leaf nodes. As a result, the key-update cost at the PKG is reduced from linear to the height of the binary tree (i.e., logarithmic in the number of users). Nevertheless, we point out that though the binary tree achieves relatively high performance, it results in other problems:

1) The PKG has to generate a key pair for each node on the path from the identity leaf node to the root node, so issuing a single private key has complexity logarithmic in the number of users in the system.

2) The size of the private key grows logarithmically in the number of users in the system, which makes private key storage difficult for users.

3) As the number of users in the system grows, the PKG has to maintain a binary tree with a large number of nodes, which introduces another bottleneck for the whole system.

In tandem with the development of cloud computing, users have gained the ability to buy on-demand computing from cloud-based services such as Amazon’s EC2 and Microsoft’s Windows Azure. This calls for a new working paradigm that introduces such cloud services into IBE revocation to fix the efficiency and storage overhead issues described above.

A naive approach would be to simply hand over the PKG’s master key to the Cloud Service Providers (CSPs). The CSPs could then update all the private keys by using the traditional key update technique [4] and transmit the private keys back to unrevoked users. However, the naive approach is based on the unrealistic assumption that the CSPs are fully trusted and are allowed to access the master key of the IBE system. On the contrary, in practice the public clouds are likely outside the trusted domain of users and are curious about users’ individual privacy. This raises the challenge of how to design a secure revocable IBE scheme that reduces the computational overhead at the PKG with an untrusted CSP.
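To make the logarithmic costs of the binary-tree approach criticized in items 1) to 3) concrete, the sketch below lists the nodes on the path from a user’s leaf to the root of a complete binary tree. The heap-style node numbering (root = 1, children 2i and 2i + 1) and the function name are assumptions for illustration only.

```python
def path_to_root(leaf_index: int, num_leaves: int) -> list:
    """Nodes on the path from a leaf to the root (heap numbering, root = 1).

    Assumption: num_leaves is a power of two and leaves are numbered
    0 .. num_leaves - 1 from left to right.
    """
    if num_leaves <= 0 or num_leaves & (num_leaves - 1):
        raise ValueError("num_leaves must be a power of two")
    if not 0 <= leaf_index < num_leaves:
        raise ValueError("leaf_index out of range")
    node = num_leaves + leaf_index  # heap index of the leaf
    path = []
    while node >= 1:
        path.append(node)
        node //= 2  # move to the parent
    return path

# With 1024 users, the path for one user covers log2(1024) + 1 nodes.
path = path_to_root(5, 1024)
```

Each node on this path needs its own key material, so both the issuing cost and the private key size grow with the height of the tree.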

In this paper, we introduce outsourced computation into IBE revocation and, to the best of our knowledge, formalize the security definition of outsourced revocable IBE for the first time. We propose a scheme that offloads all the key-generation-related operations during key-issuing and key-update, leaving only a constant number of simple operations for the PKG and eligible users to perform locally. In our scheme, as with the suggestion of Boneh and Franklin, we realize revocation through updating the private keys of the unrevoked users.

But unlike that work, which trivially concatenates the time period with the identity for key generation/update and requires re-issuing the whole private key for unrevoked users, we propose a novel collusion-resistant key-issuing technique: we employ a hybrid private key for each user, in which an AND gate connects and binds two sub-components, namely the identity component and the time component. At first, the user obtains the identity component and a default time component (i.e., for the current time period) from the PKG as his/her private key in key-issuing. Afterwards, in order to maintain decryptability, unrevoked users need to periodically request a key-update for the time component from a newly introduced entity named the Key Update Cloud Service Provider (KU-CSP).

Our scheme does not have to re-issue whole private keys, but only needs to update a lightweight component of each key at a specialized entity, the KU-CSP. We also specify that 1) with the aid of the KU-CSP, the user need not contact the PKG in key-update; in other words, the PKG is allowed to be offline after sending the revocation list to the KU-CSP; and 2) no secure channel or user authentication is required during key-update between the user and the KU-CSP. Furthermore, we consider realizing revocable IBE with a semi-honest KU-CSP. To achieve this goal, we present a security-enhanced construction under the recently formalized Refereed Delegation of Computation (RDoC) model. Finally, we provide extensive experimental results to demonstrate the efficiency of our proposed construction.
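This summary does not spell out how the AND gate is realized. One common way to bind two components, shown here purely as an assumption and not as the construction itself, is additive secret splitting: a secret is split into two random-looking shares that only reconstruct it together. The modulus P and the function names are hypothetical, and no pairing-based key material is modelled.

```python
import secrets

P = 2**127 - 1  # toy prime modulus (assumption, not taken from the source)

def split(master: int) -> tuple:
    """Split master into (identity share, time share) with x1 + x2 = master mod P."""
    x1 = secrets.randbelow(P)      # fresh randomness for every user
    x2 = (master - x1) % P
    return x1, x2

def combine(identity_share: int, time_share: int) -> int:
    """AND gate: both shares are required to recover the secret."""
    return (identity_share + time_share) % P

master = secrets.randbelow(P)
alice = split(master)
bob = split(master)

own_shares_work = combine(*alice) == master
# Mixing Alice's identity share with Bob's time share only works if
# their random x1 values happen to collide, which is negligible here.
mixed_shares_work = combine(alice[0], bob[1]) == master
```

The per-user randomness is what gives the collusion-resistance intuition: a revoked user cannot borrow another user’s updated time share.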

EXISTING SYSTEM:

  • Identity-Based Encryption (IBE) is an interesting alternative to public key encryption, proposed to simplify key management in a certificate-based Public Key Infrastructure (PKI) by using human-intelligible identities (e.g., unique name, email address, IP address, etc.) as public keys.
  • Boneh and Franklin suggested that users renew their private keys periodically and senders use the receivers’ identities concatenated with current time period.
  • Hanaoka et al. proposed a way for users to periodically renew their private keys without interacting with PKG.
  • Lin et al. proposed a space-efficient revocable IBE mechanism from non-monotonic Attribute-Based Encryption (ABE), but their construction requires a number of bilinear pairing operations for a single decryption that grows with the number of revoked users.

DISADVANTAGES:

The Boneh and Franklin mechanism would result in a heavy load at the PKG. In other words, all users, regardless of whether their keys have been revoked or not, have to contact the PKG periodically to prove their identities and obtain new private keys. It requires that the PKG be online and that a secure channel be maintained for all transactions, which will become a bottleneck for the IBE system as the number of users grows.

  • Boneh and Franklin’s suggestion is viable in principle but impractical.
  • The Hanaoka et al. system assumes that each user possesses a tamper-resistant hardware device.
  • In mediator-based revocation, if an identity is revoked then the mediator is instructed to stop helping the user. This is impractical, since users are unable to decrypt on their own and need to communicate with the mediator for each decryption.

PROPOSED SYSTEM:

  • In this paper, we introduce outsourced computation into IBE revocation and, to the best of our knowledge, formalize the security definition of outsourced revocable IBE for the first time. We propose a scheme that offloads all the key-generation-related operations during key-issuing and key-update, leaving only a constant number of simple operations for the PKG and eligible users to perform locally.
  • In our scheme, as with Boneh and Franklin’s suggestion, we realize revocation through updating the private keys of the unrevoked users. But unlike that work, which trivially concatenates the time period with the identity for key generation/update and requires re-issuing the whole private key for unrevoked users, we propose a novel collusion-resistant key-issuing technique: we employ a hybrid private key for each user, in which an AND gate connects and binds two sub-components, namely the identity component and the time component.
  • At first, the user obtains the identity component and a default time component (i.e., for the current time period) from the PKG as his/her private key in key-issuing. Afterwards, in order to maintain decryptability, unrevoked users need to periodically request a key-update for the time component from a newly introduced entity named the Key Update Cloud Service Provider (KU-CSP).
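The key-issuing and key-update flow in the list above can be modelled, without any real cryptography, as follows. All class and function names are hypothetical. The only point is that the KU-CSP refreshes the time component for unrevoked identities and that decryption needs both components to match.

```python
from dataclasses import dataclass, field

@dataclass
class PrivateKey:
    identity: str  # identity component, issued once by the PKG
    period: int    # time component, refreshed by the KU-CSP

@dataclass
class KuCsp:
    revoked: set = field(default_factory=set)  # revocation list sent by the PKG

    def key_update(self, key: PrivateKey, period: int) -> PrivateKey:
        """Refresh only the time component; revoked users keep a stale key."""
        if key.identity in self.revoked:
            return key
        return PrivateKey(key.identity, period)

def can_decrypt(key: PrivateKey, identity: str, period: int) -> bool:
    # AND gate: the identity AND the time component must both match.
    return key.identity == identity and key.period == period

ku_csp = KuCsp(revoked={"bob"})
alice = ku_csp.key_update(PrivateKey("alice", 1), period=2)
bob = ku_csp.key_update(PrivateKey("bob", 1), period=2)
```

In this toy model the PKG never appears after key-issuing, which mirrors the claim that it can stay offline once the revocation list has been handed over.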

ADVANTAGES:

  • Compared with previous work, our scheme does not have to re-issue whole private keys, but only needs to update a lightweight component of each key at a specialized entity, the KU-CSP.
  • With the aid of the KU-CSP, the user need not contact the PKG in key-update; in other words, the PKG is allowed to be offline after sending the revocation list to the KU-CSP.
  • No secure channel or user authentication is required during key-update between the user and the KU-CSP.
  • Furthermore, we consider realizing revocable IBE with a semi-honest KU-CSP. To achieve this goal, we present a security-enhanced construction under the recently formalized Refereed Delegation of Computation (RDoC) model.
  • Finally, we provide extensive experimental results to demonstrate the efficiency of our proposed construction.

HARDWARE REQUIREMENT:

  • Processor      –    Pentium IV
  • Speed          –    1 GHz
  • RAM            –    256 MB (min)
  • Hard Disk      –    20 GB
  • Floppy Drive   –    1.44 MB
  • Keyboard       –    Standard Windows Keyboard
  • Mouse          –    Two or Three Button Mouse
  • Monitor        –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System   :    Windows XP or Windows 7
  • Front End         :    Java JDK 1.7
  • Back End          :    MySQL Server
  • Server            :    Apache Tomcat Server
  • Script            :    JSP Script
  • Document          :    MS-Office 2007

IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTI-CLOUD STORAGE

ABSTRACT:

Remote data integrity checking is of crucial importance in cloud storage. It enables clients to verify whether their outsourced data is kept intact without downloading the whole data. In some application scenarios, the clients have to store their data on multi-cloud servers. At the same time, the integrity checking protocol must be efficient in order to save the verifier’s cost. Based on these two points, we propose a novel remote data integrity checking model: ID-DPDP (identity-based distributed provable data possession) in multi-cloud storage. The formal system model and security model are given. Based on bilinear pairings, a concrete ID-DPDP protocol is designed. The proposed ID-DPDP protocol is provably secure under the hardness assumption of the standard CDH (computational Diffie-Hellman) problem. In addition to the structural advantage of eliminating certificate management, our ID-DPDP protocol is also efficient and flexible. Based on the client’s authorization, the proposed ID-DPDP protocol can realize private verification, delegated verification and public verification.

SYSTEM ANALYSIS

EXISTING SYSTEM:

The foundations of cloud computing lie in the outsourcing of computing tasks to a third party. Remote data integrity checking is a primitive to address this issue. In the general case, when the client stores his data on multi-cloud servers, distributed storage and integrity checking are at risk. On the other hand, the integrity checking protocol must be efficient in order to make it suitable for capacity-limited end devices. Thus, based on distributed computation, we study a distributed remote data integrity checking model and present the corresponding concrete protocol in multi-cloud storage. In addition, the existing system cannot ensure the integrity of user data.

DISADVANTAGES:

  • The existing system cannot ensure the integrity of data.
  • In the existing system, the public verifier does not check data stored across multiple clouds.

PROPOSED SYSTEM:

  • First, we analyze the performance of our proposed ID-DPDP protocol in terms of computation and communication overhead, and compare it with other up-to-date PDP protocols. Second, we analyze the flexibility and verification properties of our proposed ID-DPDP protocol. Third, we give a prototype implementation of the proposed ID-DPDP protocol.
  • The signature relates the client’s identity to his private key. Distributed computing is used to store the client’s data on multi-cloud servers. At the same time, distributed computing is also used to combine the multi-cloud servers’ responses in order to respond to the verifier’s challenge.
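The store, challenge, combine and verify steps described above can be sketched with a much simpler stand-in technique. The code below uses HMAC-tagged blocks and a spot check instead of the pairing-based, identity-based tags of ID-DPDP, so it only supports private verification and returns the challenged blocks in full. Block size, round-robin placement and all function names are assumptions.

```python
import hashlib
import hmac
import secrets

BLOCK = 4  # toy block size in bytes

def tag(key: bytes, index: int, block: bytes) -> bytes:
    """Per-block tag bound to the block position."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()

def distribute(key: bytes, data: bytes, num_servers: int) -> list:
    """Store (block, tag) pairs round-robin on num_servers toy servers."""
    servers = [dict() for _ in range(num_servers)]
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    for index, block in enumerate(blocks):
        servers[index % num_servers][index] = (block, tag(key, index, block))
    return servers

def combine_responses(servers: list, challenge: list) -> dict:
    """Combiner: gather each challenged block from the server that holds it."""
    return {i: servers[i % len(servers)][i] for i in challenge}

def verify(key: bytes, response: dict) -> bool:
    return all(hmac.compare_digest(tag(key, i, blk), t)
               for i, (blk, t) in response.items())

key = secrets.token_bytes(32)
servers = distribute(key, b"outsourced data kept in clouds!", 3)
challenge = [0, 3, 5]
intact = verify(key, combine_responses(servers, challenge))

# One server silently corrupts block 3; the same challenge now fails.
_, old_tag = servers[3 % 3][3]
servers[3 % 3][3] = (b"XXXX", old_tag)
corrupted = verify(key, combine_responses(servers, challenge))
```

A real PDP protocol would return a compact aggregated proof rather than the blocks themselves; the sketch only illustrates the division of roles between servers, combiner and verifier.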

ADVANTAGES:

  • In our proposed system, each client has a private key corresponding to his identity (e.g., name, ID or any…).
  • The public verifier can relate the user’s data to the user’s identity, i.e., the identity that corresponds to the private key.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

  • Processor      –    Pentium IV
  • Speed          –    1 GHz
  • RAM            –    256 MB (min)
  • Hard Disk      –    20 GB
  • Floppy Drive   –    1.44 MB
  • Keyboard       –    Standard Windows Keyboard
  • Mouse          –    Two or Three Button Mouse
  • Monitor        –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System   :    Windows XP or Windows 7
  • Front End         :    Java JDK 1.7
  • Back End          :    MySQL Server
  • Server            :    Apache Tomcat Server
  • Script            :    JSP Script
  • Document :           MS-Office 2007

GENERATING SEARCHABLE PUBLIC-KEY CIPHERTEXTS WITH HIDDEN STRUCTURES FOR FAST KEYWORD SEARCH

ABSTRACT:

This paper proposes Searchable Public-Key Ciphertexts with Hidden Structures (SPCHS) for keyword search that is as fast as possible without sacrificing the semantic security of the encrypted keywords. In SPCHS, all keyword-searchable ciphertexts are structured by hidden relations, and with the search trapdoor corresponding to a keyword, the minimum information about the relations is disclosed to a search algorithm as guidance to find all matching ciphertexts efficiently.

We construct a SPCHS scheme from scratch in which the ciphertexts have a hidden star-like structure. We prove our scheme to be semantically secure in the Random Oracle (RO) model. The search complexity of our scheme is dependent on the actual number of the ciphertexts containing the queried keyword, rather than the number of all ciphertexts.

Finally, we present a generic SPCHS construction from anonymous identity-based encryption and collision-free full-identity malleable Identity-Based Key Encapsulation Mechanism (IBKEM) with anonymity. We illustrate two collision-free full-identity malleable IBKEM instances, which are semantically secure and anonymous, respectively, in the RO and standard models. The latter instance enables us to construct an SPCHS scheme with semantic security in the standard model.

INTRODUCTION:

We start by formally defining the concept of Searchable Public-key Ciphertexts with Hidden Structures (SPCHS) and its semantic security. In this new concept, keyword-searchable ciphertexts with their hidden structures can be generated in the public key setting; with a keyword search trapdoor, partial relations can be disclosed to guide the discovery of all matching ciphertexts. Semantic security is defined for both the keywords and the hidden structures. It is worth noting that this new concept and its semantic security are suitable for keyword-searchable ciphertexts with any kind of hidden structures. In contrast, the concept of traditional PEKS does not contain any hidden structure among the PEKS ciphertexts; correspondingly, its semantic security is only defined for the keywords. Following the SPCHS definition, we construct a simple SPCHS from scratch in the random oracle (RO) model. The scheme generates keyword-searchable ciphertexts with a hidden star-like structure. The search performance mainly depends on the actual number of the ciphertexts containing the queried keyword. For security, the scheme is proven semantically secure based on the Decisional Bilinear Diffie-Hellman (DBDH) assumption in the RO model.
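The idea that a trapdoor reveals just enough hidden structure to walk from one matching ciphertext to the next can be illustrated with a symmetric-key toy analogy. This is not the SPCHS scheme: real SPCHS ciphertexts are produced with the receiver’s public key, whereas here a shared secret and HMAC stand in for it, and every name below is hypothetical. The sketch only shows why search cost follows the number of matches rather than the size of the store.

```python
import hashlib
import hmac

def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

class ToyIndex:
    """Server-side store: opaque label -> document id."""

    def __init__(self):
        self.store = {}
        self.lookups = 0

    def search(self, trapdoor: bytes) -> list:
        """Walk the hidden chain for one keyword, one lookup per position."""
        results, position = [], 0
        while True:
            self.lookups += 1
            label = prf(trapdoor, position.to_bytes(8, "big"))
            if label not in self.store:
                return results
            results.append(self.store[label])
            position += 1

def trapdoor_for(secret: bytes, keyword: str) -> bytes:
    return prf(secret, keyword.encode())

def add(index: ToyIndex, secret: bytes, counters: dict, keyword: str, doc_id: int) -> None:
    """Append doc_id to the hidden chain of its keyword."""
    position = counters.get(keyword, 0)
    label = prf(trapdoor_for(secret, keyword), position.to_bytes(8, "big"))
    index.store[label] = doc_id
    counters[keyword] = position + 1

secret, counters, index = b"demo-secret", {}, ToyIndex()
for doc_id in range(1000):
    keyword = "cloud" if doc_id % 250 == 0 else f"kw{doc_id}"
    add(index, secret, counters, keyword, doc_id)

matches = index.search(trapdoor_for(secret, "cloud"))
```

Without the trapdoor the labels look unrelated; with it, the server performs one lookup per match plus one final miss, regardless of how many other entries are stored.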

We build a generic SPCHS construction with Identity-Based Encryption (IBE) and collision-free full-identity malleable IBKEM. The resulting SPCHS can generate keyword-searchable ciphertexts with a hidden star-like structure. Moreover, if both the underlying IBKEM and IBE have semantic security and anonymity (i.e., the privacy of receivers’ identities), the resulting SPCHS is semantically secure. As there are known IBE schemes [4], [5], [6], [7] in both the RO model and the standard model, an SPCHS construction is reduced to collision-free full-identity malleable IBKEM with anonymity. Several IBKEM schemes have been proposed to construct Verifiable Random Functions (VRF) [8]. We show that one of these IBKEM schemes is anonymous and collision-free full-identity malleable in the RO model. Prior work utilized the “approximation” of multilinear maps to construct a standard-model version of the Boneh-and-Franklin (BF) IBE scheme. We transform this IBE scheme into a collision-free full-identity malleable IBKEM scheme with semantic security and anonymity in the standard model. Hence, this new IBKEM scheme allows us to build SPCHS schemes secure in the standard model with the same search performance as the previous SPCHS construction from scratch in the RO model.

LITERATURE SURVEY

TITLE: FUZZY KEYWORD SEARCH OVER ENCRYPTED DATA IN CLOUD COMPUTING

AUTHOR: Li J., Wang Q., Wang C., Cao N., Ren K., Lou W

PUBLISH:  IEEE INFOCOM 2010, pp. 1-5. (2010)

EXPLANATION:

As Cloud Computing becomes prevalent, more and more sensitive information is being centralized into the cloud. For the protection of data privacy, sensitive data usually has to be encrypted before outsourcing, which makes effective data utilization a very challenging task. Although traditional searchable encryption schemes allow a user to securely search over encrypted data through keywords and selectively retrieve files of interest, these techniques support only exact keyword search. That is, there is no tolerance of minor typos and format inconsistencies which, on the other hand, are typical of user searching behavior and happen very frequently. This significant drawback makes existing techniques unsuitable in Cloud Computing, as it greatly affects system usability, rendering user searching experiences very frustrating and system efficacy very low. In this paper, for the first time we formalize and solve the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files when users’ searching inputs exactly match the predefined keywords, or the closest possible matching files based on keyword similarity semantics when exact match fails. In our solution, we exploit edit distance to quantify keyword similarity and develop an advanced technique for constructing fuzzy keyword sets, which greatly reduces the storage and representation overheads. Through rigorous security analysis, we show that our proposed solution is secure and privacy-preserving, while correctly realizing the goal of fuzzy keyword search.
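The matching rule described above (exact match first, otherwise the closest keywords by edit distance) can be sketched on plaintext keywords as follows. The surveyed work operates over encrypted data with precomputed fuzzy keyword sets, which this sketch does not attempt; the function names and the default threshold are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fuzzy_match(query: str, keywords: list, max_distance: int = 1) -> list:
    """Exact match if present, else keywords within max_distance edits."""
    if query in keywords:
        return [query]
    return [k for k in keywords if edit_distance(query, k) <= max_distance]

keywords = ["castle", "cattle", "cloud"]
typo_hits = fuzzy_match("clowd", keywords)
```

A query with a single typo such as "clowd" still resolves to "cloud", which is the usability gain the paragraph describes.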

TITLE: ANONYMOUS FUZZY IDENTITY-BASED ENCRYPTION FOR SIMILARITY SEARCH

AUTHOR: Cheung D. W., Mamoulis N., Wong W. K., Yiu S. M., Zhang

PUBLISH: ISAAC 2010. LNCS, vol. 6505, pp. 61-72. Springer, Heidelberg (2010)

EXPLANATION:

The predicate studied at the very beginning is “exact keyword matching”, that is, whether the value hidden by the token is equal to the attribute value hidden in the ciphertext. Schemes that only provide data item security are basically “Identity-Based Encryption”. Schemes protecting both the data item and the attributes were initiated in the private-key setting and in the public-key setting. The relationship between such schemes and “Anonymous Identity-Based Encryption” was later revisited. Range query as the predicate was also considered. Boneh et al. devised an Augmented Broadcast Encryption which allows checking whether the attribute value falls within a range on encrypted data. Their scheme also provides attribute protection. Then, Boneh and Waters extended it to multi-dimensional range query.

However, there is no practical scheme supporting this predicate with attribute protection in public-key settings. One prior scheme investigated this problem in the private-key setting and is IND2-CKA secure. Another scheme is in a public-key setting; however, it requires the threshold value t to be fixed at setup time. Our work uses, as a framework, earlier schemes for handling predicates represented as inner products. Their formulation of using inner products with bounded disjunction is powerful. We show how to reduce inner products to the Hamming distance similarity comparison predicate, and then derive a slightly different encryption scheme for better performance when considering the inequality case. In our work, we consider the problem of attribute protection in the public-key setting. In some applications, people may also want to protect the predicate (“the token”), which is inherently unachievable in the public-key setting. Note that a predicate encryption scheme supporting inner products in the private-key setting, which can provide predicate privacy, has also been devised.
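The reduction itself is not given in this summary. A standard identity that links the two notions, shown here as an assumption about the general idea rather than the authors’ exact construction, is that for bit vectors of length n mapped to signs in {+1, -1}, the inner product equals n minus twice the Hamming distance.

```python
def to_signs(bits: list) -> list:
    """Map bits {0, 1} to signs {+1, -1}."""
    return [1 - 2 * b for b in bits]

def hamming(x: list, y: list) -> int:
    """Number of positions where the two bit vectors differ."""
    return sum(a != b for a, b in zip(x, y))

def hamming_from_inner_product(x: list, y: list) -> int:
    """Recover the Hamming distance from the inner product of sign vectors."""
    dot = sum(a * b for a, b in zip(to_signs(x), to_signs(y)))
    return (len(x) - dot) // 2

x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 1, 1, 0, 0, 0, 1, 1]
direct = hamming(x, y)
via_dot = hamming_from_inner_product(x, y)
```

Under this identity, a similarity test "Hamming distance at most t" becomes a test on the inner product, namely "inner product at least n - 2t".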

TITLE: TRAPDOOR PRIVACY IN ASYMMETRIC SEARCHABLE ENCRYPTION SCHEMES

AUTHOR: Arriaga A., Tang Q., Ryan P

PUBLISH: AFRICACRYPT 2014. LNCS, vol. 8469, pp. 31-50. Springer, Heidelberg (2014)

EXPLANATION:

Asymmetric searchable encryption allows searches to be carried out over ciphertexts, through delegation, and by means of trapdoors issued by the owner of the data. Public Key Encryption with Keyword Search (PEKS) is a primitive with such functionality that provides delegation of exact-match searches. Just as it is important that ciphertexts preserve data privacy, it is also important that trapdoors do not expose the user’s search criteria. The difficulty of formalizing a security model for trapdoor privacy lies in the verification functionality, which gives the adversary the power of verifying whether a trapdoor encodes a particular keyword. In this paper, we provide a broader view on what can be achieved regarding trapdoor privacy in asymmetric searchable encryption schemes, and bridge the gap between previous definitions, which give limited privacy guarantees in practice against search patterns. We propose the notion of Strong Search Pattern Privacy for PEKS and construct a scheme that achieves this security notion.

SYSTEM ANALYSIS

EXISTING SYSTEM:

Existing semantically secure PEKS schemes take search time linear in the total number of ciphertexts. This makes retrieval from large-scale databases prohibitive. Therefore, more efficient search performance is crucial for practically deploying PEKS schemes. One of the prominent works to accelerate the search over encrypted keywords in the public-key setting is deterministic encryption, which enables search over encrypted keywords to be as efficient as the search for unencrypted keywords, such that a ciphertext containing a given keyword can be retrieved in time logarithmic in the total number of ciphertexts.

This is reasonable because the encrypted keywords can form a tree-like structure when stored according to their binary values. However, deterministic encryption has two inherent limitations. First, keyword privacy can be guaranteed only for keywords that are a priori hard-to-guess by the adversary (i.e., keywords with high min-entropy to the adversary); second, certain information of a message leaks inevitably via the ciphertext of the keywords since the encryption is deterministic. Hence, deterministic encryption is only applicable in special scenarios.
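The trade-off described above can be seen in a small sketch. A keyed HMAC stands in for a deterministic public-key encryption of the keyword, which is an assumption made only to keep the example self-contained: the essential property is that equal keywords map to equal tags, so the server can keep the tags sorted and use binary search.

```python
import bisect
import hashlib
import hmac

def det_tag(key: bytes, keyword: str) -> bytes:
    """Deterministic tag: the same keyword always yields the same bytes."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()

key = b"demo-key"
keywords = ["alpha", "bravo", "cloud", "delta", "cloud"]

# The server stores (tag, doc_id) pairs sorted by tag value.
table = sorted((det_tag(key, w), doc_id) for doc_id, w in enumerate(keywords))

def search(table: list, tag: bytes) -> list:
    """Binary search for the first matching tag, then collect its run."""
    start = bisect.bisect_left(table, (tag,))
    hits = []
    while start < len(table) and table[start][0] == tag:
        hits.append(table[start][1])
        start += 1
    return hits

hits = search(table, det_tag(key, "cloud"))
# Determinism is also the leak: equal keywords are visibly equal.
equal_tags = det_tag(key, "cloud") == det_tag(key, "cloud")
```

In a true public-key deterministic scheme anyone can encrypt a guessed keyword and compare the result with stored ciphertexts, which is why only high min-entropy keywords are protected.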

Observe that in many scenarios the keyword space does not have high min-entropy. Semantic security is crucial to guarantee keyword privacy in such applications. Thus the linear search complexity of existing schemes is the major obstacle to their adoption. Unfortunately, the linear complexity seems to be inevitable because the server has to scan and test each ciphertext, due to the fact that these ciphertexts (corresponding to the same keyword or not) are indistinguishable to the server.

DISADVANTAGES:

Each sender should be able to generate the keyword-searchable ciphertexts with the hidden star-like structure using the receiver’s public key; the server holding a keyword search trapdoor should be able to disclose partial relations, which are related to all matching ciphertexts. Semantic security is preserved if 1) when no keyword search trapdoor is known, all ciphertexts are indistinguishable and no information is leaked about the structure, and 2) given a keyword search trapdoor, only the corresponding relations can be disclosed, and the matching ciphertexts leak no information about the rest of the ciphertexts, except the fact that the rest do not contain the queried keyword.


PROPOSED SYSTEM:

We propose Searchable Public-key Ciphertexts with Hidden Structures (SPCHS) and define its semantic security. In this new concept, keyword-searchable ciphertexts with their hidden structures can be generated in the public key setting; with a keyword search trapdoor, partial relations can be disclosed to guide the discovery of all matching ciphertexts. Semantic security is defined for both the keywords and the hidden structures. Following the SPCHS definition, we construct a simple SPCHS from scratch in the random oracle (RO) model. The scheme generates keyword-searchable ciphertexts with a hidden star-like structure. The search performance mainly depends on the actual number of the ciphertexts containing the queried keyword.

We are also interested in providing a generic SPCHS construction to generate keyword-searchable ciphertexts with a hidden star-like structure. Our generic SPCHS is inspired by several interesting observations on Identity-Based Key Encapsulation Mechanism (IBKEM). We build a generic SPCHS construction with Identity-Based Encryption (IBE) and collision-free full-identity malleable IBKEM. The resulting SPCHS can generate keyword-searchable ciphertexts with a hidden star-like structure. Moreover, if both the underlying IBKEM and IBE have semantic security and anonymity (i.e., the privacy of receivers’ identities), the resulting SPCHS is semantically secure. As there are known IBE schemes in both the RO model and the standard model, an SPCHS construction is reduced to collision-free full-identity malleable IBKEM.

ADVANTAGES:

Several IBKEM schemes have been proposed to construct Verifiable Random Functions (VRF) [8]. We show that one of these IBKEM schemes is anonymous and collision-free full-identity malleable in the RO model. Prior work utilized the “approximation” of multilinear maps to construct a standard-model version of the Boneh-and-Franklin (BF) IBE scheme.

We transform this IBE scheme into a collision-free full-identity malleable IBKEM scheme with semantic security and anonymity in the standard model. Hence, this new IBKEM scheme allows us to build SPCHS schemes secure in the standard model with the same search performance as the previous SPCHS construction from scratch in the RO model.


HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

  • Processor      –    Pentium IV
  • Speed          –    1 GHz
  • RAM            –    256 MB (min)
  • Hard Disk      –    20 GB
  • Floppy Drive   –    1.44 MB
  • Keyboard       –    Standard Windows Keyboard
  • Mouse          –    Two or Three Button Mouse
  • Monitor        –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System   :    Windows XP or Windows 7
  • Front End         :    Java JDK 1.7
  • Back End          :    MySQL Server
  • Server            :    Apache Tomcat Server
  • Script            :    JSP Script
  • Document          :    MS-Office 2007

FRIENDBOOK: A SEMANTIC-BASED FRIEND RECOMMENDATION SYSTEM FOR SOCIAL NETWORKS

ABSTRACT:

Existing social networking services recommend friends to users based on their social graphs, which may not be the most appropriate to reflect a user’s preferences on friend selection in real life. In this paper, we present Friendbook, a novel semantic-based friend recommendation system for social networks, which recommends friends to users based on their life styles instead of social graphs. By taking advantage of sensor-rich smartphones, Friendbook discovers life styles of users from user-centric sensor data, measures the similarity of life styles between users, and recommends friends to users if their life styles have high similarity. Inspired by text mining, we model a user’s daily life as life documents, from which his/her life styles are extracted by using the Latent Dirichlet Allocation algorithm.

We further propose a similarity metric to measure the similarity of life styles between users, and calculate users’ impact in terms of life styles with a friend-matching graph. Upon receiving a request, Friendbook returns a list of people with the highest recommendation scores to the query user. Finally, Friendbook integrates a feedback mechanism to further improve the recommendation accuracy. We have implemented Friendbook on Android-based smartphones, and evaluated its performance in both small-scale experiments and large-scale simulations. The results show that the recommendations accurately reflect the preferences of users in choosing friends.
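The ranking step can be sketched as follows, assuming each user’s life styles are already available as a topic distribution (for example, produced by Latent Dirichlet Allocation). Cosine similarity is used here only as a stand-in; Friendbook’s own similarity metric, impact ranking and feedback mechanism are not specified in this summary, and the user names and vectors are made up.

```python
import math

def cosine(u: list, v: list) -> float:
    """Cosine similarity between two life-style vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(query: str, life_styles: dict, top_k: int = 2) -> list:
    """Rank other users by life-style similarity to the query user."""
    scores = [(cosine(life_styles[query], vec), user)
              for user, vec in life_styles.items() if user != query]
    return [user for _, user in sorted(scores, reverse=True)[:top_k]]

# Hypothetical topic distributions (e.g. office work, sport, shopping).
life_styles = {
    "ann": [0.7, 0.2, 0.1],
    "ben": [0.6, 0.3, 0.1],
    "cat": [0.1, 0.1, 0.8],
    "dan": [0.2, 0.7, 0.1],
}
suggestions = recommend("ann", life_styles)
```

Users whose daily-life topic mix resembles the query user’s rise to the top of the list, independent of any social graph.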

INTRODUCTION:

What Is A Social Network?

Wikipedia defines a social network service as a service which “focuses on the building and verifying of online social networks for communities of people who share interests and activities, or who are interested in exploring the interests and activities of others, and which necessitates the use of software.”

A report published by OCLC provides the following definition of social networking sites: “Web sites primarily designed to facilitate interaction between users who share interests, attitudes and activities, such as Facebook, Mixi and MySpace.”

What Can Social Networks Be Used For?

Social networks can provide a range of benefits to members of an organization:

Support for learning: Social networks can enhance informal learning and support social connections within groups of learners and with those involved in the support of learning.

Support for members of an organisation: Social networks can potentially be used by all members of an organisation, and not just those involved in working with students. Social networks can help the development of communities of practice.

Engaging with others: Passive use of social networks can provide valuable business intelligence and feedback on institutional services (although this may give rise to ethical concerns).

Ease of access to information and applications: The ease of use of many social networking services can provide benefits to users by simplifying access to other tools and applications. The Facebook Platform provides an example of how a social networking service can be used as an environment for other tools.

Common interface: A possible benefit of social networks may be the common interface which spans work / social boundaries. Since such services are often used in a personal capacity the interface and the way the service works may be familiar, thus minimising training and support needed to exploit the services in a professional context.  This can, however, also be a barrier to those who wish to have strict boundaries between work and social activities.

Examples of popular social networking services include:

Facebook: Facebook is a social networking Web site that allows people to communicate with their friends and exchange information. In May 2007 Facebook launched the Facebook Platform, which provides a framework for developers to create applications that interact with core Facebook features.

MySpace: MySpace is a social networking Web site offering an interactive, user-submitted network of friends, personal profiles, blogs and groups, commonly used for sharing photos, music and videos.

Ning: An online platform for creating social Web sites and social networks aimed at users who want to create networks around specific interests or have limited technical skills.

Twitter: Twitter is an example of a micro-blogging service. Twitter can be used in a variety of ways including sharing brief information with users and providing support for one’s peers.

Note that this brief list of popular social networking services omits popular social sharing services such as Flickr and YouTube.

Opportunities and Challenges

The popularity and ease of use of social networking services have excited institutions with their potential in a variety of areas. However, effective use of social networking services poses a number of challenges for institutions, including long-term sustainability of the services; user concerns over use of social tools in a work or study context; a variety of technical issues; and legal issues such as copyright, privacy and accessibility.

Institutions would be advised to consider carefully the implications before promoting significant use of such services.

Twenty years ago, people typically made friends with others who lived or worked close to them, such as neighbors or colleagues. We call friends made in this traditional fashion G-friends, which stands for geographical location-based friends, because they are influenced by the geographical distances between each other. With the rapid advances in social networks, services such as Facebook, Twitter and Google+ have provided us with revolutionary ways of making friends. According to Facebook statistics, a user has an average of 130 friends, perhaps more than at any other time in history. One challenge with existing social networking services is how to recommend a good friend to a user. Most of them rely on pre-existing user relationships to pick friend candidates.

For example, Facebook relies on a social link analysis among those who already share common friends and recommends symmetrical users as potential friends. Unfortunately, this approach may not be the most appropriate based on recent sociology findings. According to these studies, the rules to group people together include: 1) habits or life style; 2) attitudes; 3) tastes; 4) moral standards; 5) economic level; and 6) people they already know. Life styles, in turn, are usually closely correlated with daily routines and activities. Therefore, if we can gather information on users’ daily routines and activities, we can exploit rule #1 and recommend friends to people based on their similar life styles. This recommendation mechanism can be deployed as a standalone app on smartphones or as an add-on to existing social network frameworks. In both cases, Friendbook can help mobile phone users find friends either among strangers or within a certain group, as long as they share similar life styles.

LITERATURE SURVEY:

1) “Probabilistic mining of socio geographic routines from mobile phone data”

AUTHORS:  K. Farrahi and D. Gatica-Perez

There is relatively little work on the investigation of large-scale human data in terms of multimodality for human activity discovery. In this paper, we suggest that human interaction data, or human proximity, obtained by mobile phone Bluetooth sensor data, can be integrated with human location data, obtained by mobile cell tower connections, to mine meaningful details about human activities from large and noisy datasets. We propose a model, called bag of multimodal behavior that integrates the modeling of variations of location over multiple time-scales, and the modeling of interaction types from proximity. Our representation is simple yet robust to characterize real-life human behavior sensed from mobile phones, which are devices capable of capturing large-scale data known to be noisy and incomplete. We use an unsupervised approach, based on probabilistic topic models, to discover latent human activities in terms of the joint interaction and location behaviors of 97 individuals over the course of approximately a 10-month period using data from MIT’s Reality Mining project. Some of the human activities discovered with our multimodal data representation include “going out from 7 pm-midnight alone” and “working from 11 am-5 pm with 3-5 other people,” further finding that this activity dominantly occurs on specific days of the week. Our methodology also finds dominant work patterns occurring on other days of the week. We further demonstrate the feasibility of the topic modeling framework for human routine discovery by predicting missing multimodal phone data at specific times of the day.

2) “Collaborative and structural recommendation of friends using weblog-based social network analysis”

AUTHORS:  W. H. Hsu, A. King, M. Paradesi, T. Pydimarri, and T. Weninger

In this paper, we address the problem of link recommendation in weblogs and similar social networks. First, we present an approach based on collaborative recommendation using the link structure of a social network and content-based recommendation using mutual declared interests. Next, we describe the application of this approach to a small representative subset of a large real-world social network: the user/community network of the blog service Live Journal. We then discuss the ground features available in Live Journal’s public user information pages and describe some graph algorithms for analysis of the social network. These are used to identify candidates, provide ground truth for recommendations, and construct features for learning the concept of a recommended link. Finally, we compare the performance of this machine learning approach to that of the rudimentary recommender system provided by Live Journal.

3) “Understanding Transportation Modes Based on GPS Data for Web Applications”

AUTHORS:  Y. Zheng, Y. Chen, Q. Li, X. Xie, and W.-Y. Ma.

User mobility has given rise to a variety of Web applications, in which the global positioning system (GPS) plays many important roles in bridging between these applications and end users. As a kind of human behavior, people’s transportation modes, such as walking and driving, can provide pervasive computing systems with more contextual information and enrich a user’s mobility with informative knowledge. In this article, we report on an approach based on supervised learning to automatically infer users’ transportation modes, including driving, walking, taking a bus and riding a bike, from raw GPS logs. Our approach consists of three parts: a change point-based segmentation method, an inference model and a graph-based post-processing algorithm. First, we propose a change point-based segmentation method to partition each GPS trajectory into separate segments of different transportation modes. Second, from each segment, we identify a set of sophisticated features, which are not affected by differing traffic conditions (e.g., a person’s direction when in a car is constrained more by the road than any change in traffic conditions). Later, these features are fed to a generative inference model to classify the segments of different modes. Third, we conduct graph-based post-processing to further improve the inference performance. This post-processing algorithm considers both the commonsense constraints of the real world and typical user behaviors based on locations in a probabilistic manner. The advantages of our method over the related works include three aspects. 1) Our approach can effectively segment trajectories containing multiple transportation modes. 2) Our work mined the location constraints from user-generated GPS logs, while being independent of additional sensor data and map information like road networks and bus stops. 3) The model learned from the dataset of some users can be applied to infer GPS data from others. 
Using the GPS logs collected by 65 people over a period of 10 months, we evaluated our approach via a set of experiments. As a result, based on the change-point-based segmentation method and Decision Tree-based inference model, we achieved prediction accuracy greater than 71 percent. Further, using the graph-based post-processing algorithm, the performance attained a 4-percent enhancement.
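
The change point idea in the abstract above can be illustrated with a much simplified sketch. It assumes the trajectory has already been reduced to a series of speeds in m/s and that a single speed threshold separates walking from everything else. The function name `segment_by_change_points`, the threshold value and the sample data are hypothetical and are not taken from the surveyed paper.

```python
def segment_by_change_points(speeds, walk_threshold=2.0):
    """Split a speed series (m/s) into runs of 'walk' / 'non-walk' samples.

    Returns a list of (label, start_index, end_index), end index exclusive.
    The threshold is an illustrative assumption, not a published value.
    """
    segments = []
    start = 0
    current = None
    for i, v in enumerate(speeds):
        label = "walk" if v <= walk_threshold else "non-walk"
        if current is None:
            current = label
        elif label != current:
            # The label flipped here, so index i is a change point.
            segments.append((current, start, i))
            start = i
            current = label
    if current is not None:
        segments.append((current, start, len(speeds)))
    return segments


# Hypothetical trajectory: walking, then a vehicle ride, then walking again.
speeds = [1.2, 1.4, 1.1, 9.5, 11.0, 10.2, 1.3, 1.0]
segments = segment_by_change_points(speeds)
print(segments)
```

Each resulting segment could then be handed to a classifier to decide the concrete transportation mode. That inference step is outside the scope of this sketch.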

4) “Online friend recommendation through personality matching and collaborative filtering”

AUTHORS: L. Bian and H. Holtzman

Most social network websites rely on people’s proximity on the social graph for friend recommendation. In this paper, we present Matchmaker, a collaborative filtering friend recommendation system based on personality matching. The goal of Matchmaker is to leverage the social information and mutual understanding among people in existing social network connections, and produce friend recommendations based on rich contextual data from people’s physical world interactions. Matchmaker allows users’ network to match them with similar TV characters, and uses relationships in the TV programs as parallel comparison matrix to suggest to the users friends that have been voted to suit their personality the best. The system’s ranking schema allows progressive improvement on the personality matching consensus and more diverse branching of users’ social network connections. Lastly, our user study shows that the application can also induce more TV content consumption by driving users’ curiosity in the ranking process.

SYSTEM ANALYSIS:

EXISTING SYSTEM:

Most friend suggestion mechanisms rely on pre-existing user relationships to pick friend candidates. For example, Facebook relies on a social link analysis among those who already share common friends and recommends symmetrical users as potential friends. The rules to group people together include:

  1. Habits or life style
  2. Attitudes
  3. Tastes
  4. Moral standards
  5. Economic level
  6. People they already know

Apparently, rule #3 and rule #6 are the mainstream factors considered by existing recommendation systems.

DISADVANTAGES:

  • Existing social networking services recommend friends to users based on their social graphs, which may not be the most appropriate way to reflect a user’s preferences on friend selection in real life.

PROPOSED SYSTEM:

  • A novel semantic-based friend recommendation system for social networks, which recommends friends to users based on their life styles instead of social graphs.
  • By taking advantage of sensor-rich smartphones, Friendbook discovers life styles of users from user-centric sensor data, measures the similarity of life styles between users, and recommends friends to users if their life styles have high similarity.
  • We model a user’s daily life as life documents, from which his/her life styles are extracted by using the Latent Dirichlet Allocation algorithm.
  • We propose a similarity metric to measure the similarity of life styles between users, and calculate users’ impact in terms of life styles with a friend-matching graph.
  • We integrate a linear feedback mechanism that exploits the user’s feedback to improve recommendation accuracy.
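
The life style comparison and ranking described in the list above can be sketched as follows. The sketch assumes that each user is already represented by a probability vector over life styles (for example, topic proportions produced by an LDA model), and it uses cosine similarity as a stand-in metric. The source does not spell out Friendbook's actual metric, so `cosine_similarity`, `recommend` and the sample profiles are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length life style vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty profile is treated as similar to nobody.
    return dot / (norm_a * norm_b)


def recommend(query_user, profiles, top_k=2):
    """Return the top_k other users ranked by similarity to query_user."""
    scores = [
        (other, cosine_similarity(profiles[query_user], vec))
        for other, vec in profiles.items()
        if other != query_user
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical life style distributions (e.g. office work, sports, shopping).
profiles = {
    "alice": [0.7, 0.2, 0.1],
    "bob": [0.6, 0.3, 0.1],
    "carol": [0.1, 0.1, 0.8],
    "dave": [0.2, 0.7, 0.1],
}

ranked = recommend("alice", profiles)
print(ranked)
```

With these sample profiles, the user whose distribution is closest to the query user's is ranked first. A feedback step could then adjust the scores, which this sketch leaves out.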

ADVANTAGES:

  • Recommend potential friends to users if they share similar life styles.
  • The feedback mechanism allows us to measure the satisfaction of users by providing a user interface that allows the user to rate the friend list.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

  • Processor       –    Pentium IV
  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    1.44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor                       –    SVGA

SOFTWARE REQUIREMENTS:

JAVA

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Back End :           MYSQL Server
  • Server :           Apache Tomcat Server
  • Script :           JSP Script
  • Document :           MS-Office 2007

ENERGY EFFICIENT VIRTUAL NETWORK EMBEDDING FOR CLOUD NETWORKS

ABSTRACT:

In this paper, we propose an energy efficient virtual network embedding (EEVNE) approach for cloud computing networks, where power savings are introduced by consolidating resources in the network and data centers. We model our approach in an IP over WDM network using mixed integer linear programming (MILP). The performance of the EEVNE approach is compared with two approaches from the literature: the bandwidth cost approach (CostVNE) and the energy aware approach (VNE-EA). The CostVNE approach optimizes the use of available bandwidth, while the VNE-EA approach minimizes the power consumption by reducing the number of activated nodes and links without taking into account the granular power consumption of the data centers and the different network devices.

The results show that the EEVNE model achieves a maximum power saving of 60% (average 20%) compared to the CostVNE model under an energy inefficient data center power profile. We develop a heuristic, real-time energy optimized VNE (REOViNE), with power savings approaching those of the EEVNE model. We also compare the different approaches adopting energy efficient data center power profile. Furthermore, we study the impact of delay and node location constraints on the energy efficiency of virtual network embedding. We also show how VNE can impact the design of optimally located data centers for minimal power consumption in cloud networks. Finally, we examine the power savings and spectral efficiency benefits that VNE offers in optical orthogonal division multiplexing networks.

INTRODUCTION:

The ever-growing uptake of cloud computing as a widely accepted computing paradigm calls for novel architectures to support QoS and energy efficiency in networks and data centers. Estimates indicate that in the long term, if current trends continue, the annual energy bill paid by data center operators will exceed the cost of equipment. Given the ecological and economic impact, both academia and industry are focusing efforts on developing energy efficient paradigms for cloud computing. It has been stated that the success of future cloud networks, where clients are expected to be able to specify the data rate and processing requirements for hosted applications and services, will greatly depend on network virtualization. The form of cloud computing service offering under study here is Infrastructure as a Service (IaaS). IaaS is the delivery of virtualized and dynamically scalable computing power, storage and networking on demand to clients on a pay-as-you-go basis.

Network virtualization allows multiple heterogeneous virtual network architectures (comprising virtual nodes and links) to coexist on a shared physical platform, known as the substrate network, which is owned and operated by an infrastructure provider (InP) or cloud service provider whose aim is to earn a profit from leasing network resources to its customers (Service Providers (SPs)). It provides scalability, customised and on-demand allocation of resources and the promise of efficient use of network resources. Network virtualization is therefore a strong proponent for the realization of an efficient IaaS framework in cloud networks. InPs should have a resource allocation framework that reserves and allocates physical resources to elements such as virtual nodes and virtual links. Resource allocation is done using a class of algorithms commonly known as “virtual network embedding (VNE)” algorithms. The dynamic mapping of virtual resources onto the physical hardware maximizes the benefits gained from existing hardware. The VNE problem can be either offline or online. In the offline problem, all the virtual network requests (VNRs) are known and scheduled in advance, while in the online problem, VNRs arrive dynamically and can stay in the network for an arbitrary duration.

Both online and offline problems are known to be NP-hard. With constraints on virtual nodes and links, the offline VNE problem can be reduced to the NP-hard multiway separator problem. As a result, most of the work done in this area has focused on the design of heuristic algorithms and the use of networks with minimal complexity when solving mixed integer linear programming (MILP) models. Network virtualization has been proposed as an enabler of energy savings by means of resource consolidation. In all these proposals, the VNE models and/or algorithms do not address the link embedding problem as a multi-layer problem spanning from the virtualization layer through the IP layer and all the way to the optical layer. With one exception, they also do not consider the power consumption of network ports/links as being related to the actual traffic passing through them.

In contrast, we take a very generic, detailed and accurate approach towards energy efficient VNE (EEVNE), where we allow the model to decide the optimum approach to minimize the total network and data center server power consumption. We consider the granular power consumption of various network elements that form the network engine in backbone networks as well as the power consumption in data centers. We develop a MILP model and a real-time heuristic to represent the EEVNE approach for clouds in IP over WDM networks with data centers. We study the energy efficiency considering two different power consumption profiles for servers in data centers: an energy inefficient power profile and an energy efficient power profile. Our work also investigates the impact of location and delay constraints in a practical enterprise solution of VNE in clouds. Furthermore, we show how VNE can impact the design problem of optimally locating data centers for minimal power consumption in cloud networks.
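
To make the idea of saving power through resource consolidation concrete, the following is a minimal sketch of a greedy node embedding. It is neither the MILP model nor the REOViNE heuristic from the paper. It assumes a simple linear power model (an idle power plus a per-unit load power for each active substrate node) and ignores links and the optical layer entirely. The function `embed_nodes`, the field names and the numbers are hypothetical.

```python
def embed_nodes(demands, substrate):
    """Greedily map virtual node CPU demands onto substrate nodes.

    substrate: name -> {"capacity": int, "idle_w": float, "per_unit_w": float}
    Returns (mapping, total_power_w). Illustrative power model only.
    """
    residual = {name: spec["capacity"] for name, spec in substrate.items()}
    active = set()
    mapping = {}
    # Place the largest demands first so they are not stranded later.
    for vnode, demand in sorted(demands.items(), key=lambda kv: -kv[1]):
        candidates = [n for n in substrate if residual[n] >= demand]
        if not candidates:
            raise ValueError(f"no substrate node can host {vnode}")
        # Prefer nodes that are already switched on (consolidation),
        # then nodes with lower idle power, then name for determinism.
        candidates.sort(
            key=lambda n: (n not in active, substrate[n]["idle_w"], n)
        )
        chosen = candidates[0]
        residual[chosen] -= demand
        active.add(chosen)
        mapping[vnode] = chosen
    total = 0.0
    for n in active:
        used = substrate[n]["capacity"] - residual[n]
        total += substrate[n]["idle_w"] + used * substrate[n]["per_unit_w"]
    return mapping, total


# Hypothetical substrate with two data center nodes and one request.
substrate = {
    "dc1": {"capacity": 10, "idle_w": 200.0, "per_unit_w": 10.0},
    "dc2": {"capacity": 10, "idle_w": 150.0, "per_unit_w": 12.0},
}
demands = {"v1": 4, "v2": 3, "v3": 2}
mapping, total_power = embed_nodes(demands, substrate)
print(mapping, total_power)
```

In this example all three virtual nodes are consolidated on `dc2`, so only one idle power term is paid and the total is 258 W under the assumed model. Spreading the same load over both nodes would add the second node's idle power.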

LITERATURE SURVEY:

RESOURCE ALLOCATION IN A NETWORK-BASED CLOUD COMPUTING ENVIRONMENT: DESIGN CHALLENGES

AUTHOR: M. A. Sharkh, M. Jammal, A. Shami, and A. Ouda

PUBLISH: IEEE Commun. Mag., vol. 51, no. 11, pp. 46–52, 2013.

EXPLANATION:

Cloud computing is a utility computing paradigm that has become a solid base for a wide array of enterprise and end-user applications. Providers offer varying service portfolios that differ in resource configurations and provided services. A comprehensive solution for resource allocation is fundamental to any cloud computing service provider. Any resource allocation model has to consider computational resources as well as network resources to accurately reflect practical demands. Another aspect that should be considered while provisioning resources is energy consumption. This aspect is getting more attention from industrial and government parties. Calls for the support of green clouds are gaining momentum. With that in mind, resource allocation algorithms aim to accomplish the task of scheduling virtual machines on the servers residing in data centers and consequently scheduling network resources while complying with the problem constraints. Several external and internal factors that affect the performance of resource allocation models are introduced in this article. These factors are discussed in detail, and research gaps are pointed out. Design challenges are discussed with the aim of providing a reference to be used when designing a comprehensive energy-aware resource allocation model for cloud computing data centers.

DISTRIBUTED ENERGY EFFICIENT CLOUDS OVER CORE NETWORKS

AUTHOR: A. Q. Lawey, T. E. H. El-Gorashi, and J. M. H. Elmirghani

PUBLISH: IEEE J. Lightw. Technol., vol. 32, no. 7, pp. 1261–1281, Jan. 2014.

EXPLANATION:

In this paper, we introduce a framework for designing energy efficient cloud computing services over non-bypass IP/WDM core networks. We investigate network related factors including the centralization versus distribution of clouds and the impact of demand, content popularity and access frequency on the clouds placement, and cloud capability factors including the number of servers, switches and routers and amount of storage required in each cloud. We study the optimization of three cloud services: cloud content delivery, storage as a service (StaaS), and virtual machines (VMs) placement for processing applications. First, we develop a mixed integer linear programming (MILP) model to optimize cloud content delivery services. Our results indicate that replicating content into multiple clouds based on content popularity yields 43% total saving in power consumption compared to power un-aware centralized content delivery. Based on the model insights, we develop an energy efficient cloud content delivery heuristic, DEER-CD, with comparable power efficiency to the MILP results. Second, we extend the content delivery model to optimize StaaS applications. The results show that migrating content according to its access frequency yields up to 48% network power savings compared to serving content from a single central location. Third, we optimize the placement of VMs to minimize the total power consumption. Our results show that slicing the VMs into smaller VMs and placing them in proximity to their users saves 25% of the total power compared to a single virtualized cloud scenario. We also develop a heuristic for real time VM placement (DEER-VM) that achieves comparable power savings.

Reducing power consumption in embedding virtual infrastructures

AUTHOR: B. Wang, X. Chang, J. Liu, and J. K. Muppala

PUBLISH: Proc. IEEE Globecom Workshops, Dec. 3–7, 2012, pp. 714–718.

EXPLANATION:

Network virtualization is considered to be not only an enabler to overcome the inflexibility of the current Internet infrastructure but also an enabler to achieve an energy-efficient Future Internet. Virtual network embedding (VNE) is a critical issue in network virtualization technology. This paper explores a joint power-aware node and link resource allocation approach to handle the VNE problem with the objective of minimizing energy consumption. We first present a generalized power consumption model of embedding a VN. Then we formulate the problem as a mixed integer program and propose embedding algorithms. Simulation results demonstrate that the proposed algorithms perform better than the existing algorithms in terms of the power consumption in the overprovisioned scenarios.

SYSTEM ANALYSIS

EXISTING SYSTEM:

Existing work on disaster-resilient optical datacenter networks used integer linear programming (ILP) and heuristics to address content placement, routing, and protection of network and content for geographically distributed cloud services delivered by optical networks. Models and heuristics have also been developed to minimize delay and power consumption of clouds over IP/WDM networks. Other authors exploited anycast routing by intelligently selecting destinations and routes for user traffic served by clouds over optical networks, as opposed to unicast traffic, while switching off unused network elements. A unified, online, and weighted routing and scheduling algorithm has been presented for a typical optical cloud infrastructure, considering the energy consumption of the network and IT resources.

Another study provided an optimization-based framework, where the objective functions range from minimizing the energy and bandwidth cost to minimizing the total carbon footprint subject to QoS constraints. Its model decides where to build a data center, how many servers are needed in each data center and how to route requests. A MILP model has also been built to study the energy efficiency of public clouds for content delivery over non-bypass IP/WDM core networks. That model optimizes cloud external factors, including the location of the cloud in the IP/WDM network and whether the cloud should be centralized or distributed, and cloud internal capability factors, including the number of servers, internal LAN switches, routers, and amount of storage required in each cloud.

DISADVANTAGES:

(i) Studying the impact of small content (storage) size on the energy efficiency of cloud content delivery

(ii) Developing a real time heuristic for energy aware content delivery based on the content delivery model insights,

(iii) Extending the content delivery model to study the Storage as a Service (StaaS) application,

(iv) ILP model for energy aware cloud VM placement and designing a heuristic to mimic the model behaviour in real time.

PROPOSED SYSTEM:

We developed a MILP model which attempts to minimize the bandwidth cost of embedding a VNR. The virtual network embedding energy aware (VNE-EA) model minimizes the energy consumption on the assumption that power consumption is minimized by switching off substrate links and nodes. Its authors also assume that the power saved in switching off a substrate link is the same as the power saved by switching off a substrate node.

Other authors assumed that the power consumption in the network is insensitive to the number of ports used. They also seek to minimize the number of active working nodes and links. Botero and Hesselbach have proposed a model for energy efficiency using load balancing and have also developed a dynamic heuristic that reconfigures the embedding for energy efficiency once it is performed. They have implemented and evaluated their MILP models and heuristic algorithms using the ALEVIN Framework. The ALEVIN Framework is a good tool for developing, comparing and analyzing VNE algorithms.

The performance of the EEVNE approach is compared with two approaches from the literature: the bandwidth cost approach (CostVNE) and the energy aware approach (VNE-EA). The CostVNE approach optimizes the use of available bandwidth, while the VNE-EA approach minimizes the power consumption by reducing the number of activated nodes and links without taking into account the granular power consumption of the data centers and the different network devices.

The results show that the EEVNE model achieves a maximum power saving of 60% (average 20%) compared to the CostVNE model under an energy inefficient data center power profile. We develop a heuristic, real-time energy optimized VNE (REOViNE), with power savings approaching those of the EEVNE model.

ADVANTAGES:

We are however unable to compare our model and heuristic to the implemented algorithms on the platform for the following reasons:

  1. Our input parameters are not compatible with the existing models and algorithms on the platform. Extensive extensions to the algorithms and models would be needed for them to include the optical layer. Our parameters include, among others: the distance in km between links, from which we determine the number of EDFAs or regenerators needed on a link; the wavelength rate; the number of wavelengths in a fiber; and the power consumption of EDFAs, transponders, regenerators, router ports, optical cross connects, multiplexers, de-multiplexers, etc.
  2. The assumptions made in the calculation of power in our model and the models on the platform are different. We define the power consumption to its fine granularity to include power consumed due to traffic on each element that forms the network engine. One of our main contributions in this work is the inclusion of the optical layer in link embedding which is currently not supported by any of the algorithms on the ALEVIN platform.

A generalized power consumption model of embedding a VNR has been developed and formulated as a MILP model; however, it also assumes that the power consumption of the network ports is independent of traffic. Other authors propose a trade-off between maximizing the number of VNRs that can be accommodated by the InP and minimizing the energy cost of the whole system. They propose embedding requests in regions with the lowest electricity cost.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

  • Processor       –    Pentium IV
  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    1.44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor              –    SVGA

SOFTWARE REQUIREMENTS:

JAVA

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Document :           MS-Office 2007

Enabling Fine-grained Multi-keyword Search

Abstract: Using cloud computing, individuals can store their data on remote servers and allow data access to public users through the cloud servers. As the outsourced data are likely to contain sensitive privacy information, they are typically encrypted before being uploaded to the cloud. This, however, significantly limits the usability of outsourced data due to the difficulty of searching over the encrypted data. In this paper, we address this issue by developing fine-grained multi-keyword search schemes over encrypted cloud data. Our original contributions are three-fold. First, we introduce relevance scores and preference factors upon keywords, which enable precise keyword search and a personalized user experience. Second, we develop a practical and very efficient multi-keyword search scheme. The proposed scheme can support complicated logic search with mixed “AND”, “OR” and “NO” operations of keywords. Third, we further employ the classified sub-dictionaries technique to achieve better efficiency on index building, trapdoor generating and query. Lastly, we analyze the security of the proposed schemes in terms of confidentiality of documents, privacy protection of index and trapdoor, and unlinkability of trapdoor. Through extensive experiments using a real-world dataset, we validate the performance of the proposed schemes. Both the security analysis and experimental results demonstrate that the proposed schemes can achieve the same security level as the existing ones and better performance in terms of functionality, query complexity and efficiency.
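
The search functionality named in the abstract (mixed “AND”, “OR” and “NO” keyword logic, relevance scores and preference factors) can be illustrated on plaintext data. The sketch below deliberately involves no encryption, index protection or trapdoors, so it shows only the query semantics and not the proposed scheme. The function `search`, its parameters and the sample index are hypothetical.

```python
def search(index, must=(), any_of=(), exclude=(), preference=None):
    """Plaintext multi-keyword search with AND / OR / NO logic.

    index: doc_id -> {keyword: relevance_score}
    preference: keyword -> weight set by the search user (default 1.0)
    Returns (doc_id, score) pairs, highest score first.
    """
    preference = preference or {}
    query_terms = sorted(set(must) | set(any_of))
    results = []
    for doc_id, scores in index.items():
        if any(k in scores for k in exclude):
            continue  # "NO": an excluded keyword is present
        if not all(k in scores for k in must):
            continue  # "AND": every required keyword must be present
        if any_of and not any(k in scores for k in any_of):
            continue  # "OR": at least one optional keyword must be present
        score = sum(
            scores.get(k, 0.0) * preference.get(k, 1.0) for k in query_terms
        )
        results.append((doc_id, score))
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results


# Hypothetical index of keyword relevance scores per document.
index = {
    "doc1": {"cloud": 0.9, "security": 0.4},
    "doc2": {"cloud": 0.5, "search": 0.8},
    "doc3": {"cloud": 0.7, "search": 0.6, "spam": 0.9},
}
hits = search(
    index,
    must=["cloud"],
    any_of=["search", "security"],
    exclude=["spam"],
    preference={"search": 2.0},
)
print(hits)
```

Here `doc3` is filtered out by the “NO” condition, and the preference factor on `search` moves `doc2` ahead of `doc1` even though `doc1` has the higher relevance score for `cloud`.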

INTRODUCTION
Cloud computing treats computing as a utility and leases out computing and storage capacities to the public [1], [2], [3]. In such a framework, an individual can remotely store her data on the cloud server, namely data outsourcing, and then make the cloud data open for public access through the cloud server. This represents a more scalable, low-cost and stable way for public data access because of the scalability and high efficiency of cloud servers, and is therefore favorable to small enterprises.

Note that the outsourced data may contain sensitive privacy information. It is often necessary to encrypt the private data before transmitting the data to the cloud servers [4], [5]. The data encryption, however, would significantly lower the usability of data due to the difficulty of searching over the encrypted data [6]. Simply encrypting the data may still cause other security concerns. For instance, Google Search uses SSL (Secure Sockets Layer) to encrypt the connection between search user and Google server when private data, such as documents and emails, appear in the search results [7].
However, if the search user clicks into another website from the search results page, that website may be able to identify the search terms that the user has used.
To address the above issues, searchable encryption (e.g., [8], [9], [10]) has recently been developed as a fundamental approach to enable searching over encrypted cloud data. It proceeds through the following operations. First, the data owner generates several keywords according to the outsourced data. These keywords are then encrypted and stored at the cloud server. When a search user needs to access the outsourced data, it selects some relevant keywords and sends the ciphertext of the selected keywords to the cloud server. The cloud server then uses the ciphertext to match the outsourced encrypted keywords, and lastly returns the matching results to the search user. To achieve search efficiency and precision over encrypted data similar to that of plaintext keyword search, an extensive body of research has been developed in the literature. Wang et al. [11] propose a ranked keyword search scheme which considers the relevance scores of keywords. Unfortunately, because it uses order-preserving encryption (OPE) [12] to achieve the ranking property, the proposed scheme cannot achieve unlinkability of trapdoor.
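The keyword workflow described above (owner encrypts keywords, user sends keyword ciphertexts, server matches them) can be illustrated with a minimal sketch. It assumes a keyed HMAC as the keyword "encryption"; the function names `keyword_token`, `build_index`, and `server_search` are hypothetical and are not taken from any of the cited schemes. Because the tokens here are deterministic, repeated queries for the same keyword are linkable, which is exactly the kind of weakness that trapdoor unlinkability is meant to rule out.

```python
import hashlib
import hmac
import secrets


def keyword_token(key: bytes, keyword: str) -> bytes:
    """Deterministic keyword token: HMAC-SHA256 over the lower-cased keyword."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, documents: dict) -> dict:
    """Data owner side: map each keyword token to the ids of documents containing it."""
    index = {}
    for doc_id, keywords in documents.items():
        for kw in keywords:
            index.setdefault(keyword_token(key, kw), set()).add(doc_id)
    return index


def server_search(index: dict, trapdoors: list) -> set:
    """Cloud server side: match tokens without ever seeing plaintext keywords."""
    hits = set()
    for token in trapdoors:
        hits |= index.get(token, set())
    return hits


# Illustrative data (assumed, not from the source).
key = secrets.token_bytes(32)
documents = {"d1": ["cloud", "privacy"], "d2": ["cloud", "search"], "d3": ["ranking"]}
index = build_index(key, documents)

# Search user side: tokens for the selected keywords are sent to the server.
trapdoors = [keyword_token(key, "privacy"), keyword_token(key, "ranking")]
result = server_search(index, trapdoors)
```

In this sketch the server returns every document matching at least one token; ranking and richer logic are left out on purpose.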
Later, Sun et al. [13] propose a multi-keyword text search scheme which considers the relevance scores of keywords and utilizes a multidimensional tree technique to achieve efficient search query. Yu et al. [14] propose a multi-keyword top-k retrieval scheme which uses fully homomorphic encryption to encrypt the index/trapdoor and guarantees high security. Cao et al. [6] propose a multi-keyword ranked search (MRSE), which applies coordinate matching as the keyword matching rule, i.e., returns the data with the most matching keywords.
Although many search functionalities have been developed in previous literature towards precise and efficient searchable encryption, it is still difficult for searchable encryption to achieve the same user experience as that of plaintext search, like Google search. This is mainly attributable to the following two issues. Firstly, query with user preferences is very popular in plaintext search [15], [16]. It enables personalized search and can more accurately represent the user’s requirements, but has not been thoroughly studied and supported in the encrypted data domain. Secondly, to further improve the user’s experience on searching, an important and fundamental function is to enable multi-keyword search with comprehensive logic operations, i.e., the “AND”, “OR” and “NO” operations of keywords. This is fundamental for search users to prune the searching space and quickly identify the desired data.
Cao et al. [6] propose the coordinate matching search scheme (MRSE), which can be regarded as a searchable encryption scheme with the “OR” operation. Zhang et al. [17] propose a conjunctive keyword search scheme, which can be regarded as a searchable encryption scheme with the “AND” operation, with the returned documents matching all keywords. However, most existing proposals can only enable search with a single logic operation, rather than a mixture of multiple logic operations on keywords, which motivates our work. In this work, we address the above two issues by developing two Fine-grained Multi-keyword Search (FMS) schemes over encrypted cloud data. Our original contributions can be summarized in three aspects as follows:
• We introduce the relevance scores and the preference factors of keywords for searchable encryption. The relevance scores of keywords can enable more precise returned results, and the preference factors of keywords represent the importance of keywords in the search keyword set specified by search users and correspondingly enable personalized search to cater to specific user preferences. They thus further improve the search functionalities and user experience.
• We realize the “AND”, “OR” and “NO” operations in the multi-keyword search for searchable encryption. Compared with schemes in [6], [13] and [14], the proposed scheme can achieve more comprehensive functionality and lower query complexity.
• We employ the classified sub-dictionaries technique to enhance the efficiency of the above two schemes. Extensive experiments demonstrate that the enhanced schemes can achieve better efficiency in terms of index building, trapdoor generating and query in the comparison with schemes in [6], [13] and [14].
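To make the combination of logic operations, relevance scores and preference factors concrete, here is a plaintext analogue, given only as a rough sketch. It does not perform any encryption and is not the FMS construction itself; the function `search`, the scoring rule (relevance multiplied by preference, summed over queried keywords) and the sample data are all assumptions made for illustration.

```python
def search(docs, and_kw=(), or_kw=(), no_kw=(), preference=None):
    """Filter documents by AND / OR / NO keywords, then rank by weighted relevance.

    docs maps a document id to {keyword: relevance_score}.
    preference maps a keyword to a user-chosen weight (default 1.0).
    """
    preference = preference or {}
    ranked = []
    for doc_id, scores in docs.items():
        if any(k in scores for k in no_kw):        # "NO": none may appear
            continue
        if not all(k in scores for k in and_kw):   # "AND": all must appear
            continue
        if or_kw and not any(k in scores for k in or_kw):  # "OR": at least one
            continue
        total = sum(scores[k] * preference.get(k, 1.0)
                    for k in (*and_kw, *or_kw) if k in scores)
        ranked.append((doc_id, total))
    # Highest score first; ties broken by document id for determinism.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Illustrative data (assumed, not from the source).
docs = {
    "d1": {"cloud": 0.9, "search": 0.4},
    "d2": {"cloud": 0.5, "search": 0.8, "draft": 0.3},
    "d3": {"cloud": 0.2, "privacy": 0.7},
    "d4": {"search": 0.6},
}
results = search(docs, and_kw=["cloud"], or_kw=["search", "privacy"],
                 no_kw=["draft"], preference={"privacy": 2.0})
```

Here `d2` is dropped by the "NO" keyword and `d4` by the "AND" keyword, while the preference factor on `privacy` lifts `d3` above `d1`.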

ENABLING EFFICIENT MULTI-KEYWORD RANKED SEARCH

ABSTRACT:

In mobile cloud computing, a fundamental application is to outsource the mobile data to external cloud servers for scalable data storage. The outsourced data, however, need to be encrypted due to the privacy and confidentiality concerns of their owner. This results in distinct difficulties for accurate search over the encrypted mobile cloud data.

In this paper, we develop the searchable encryption for multi-keyword ranked search over the storage data. Specifically, by considering the large number of outsourced documents (data) in the cloud, we utilize the relevance score and k-nearest neighbor techniques to develop an efficient multi-keyword search scheme that can return the ranked search results based on the accuracy.

In this framework, we leverage an efficient index to further improve the search efficiency, and adopt the blind storage system to conceal the access pattern of the search user. Security analysis demonstrates that our scheme can achieve confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealment of the access pattern of the search user. Finally, using extensive simulations, we show that our proposal can achieve much improved efficiency in terms of search functionality and search time compared with the existing proposals.

GOAL OF THE PROJECT:

To enable efficient and privacy-preserving multi-keyword ranked search over encrypted mobile cloud data via a blind storage system, the EMRS has the following design goals:

  • Multi-Keyword Ranked Search: To meet the requirements for practical uses and provide better user experience, the EMRS should not only support multi-keyword search over encrypted mobile cloud data, but also achieve relevance-based result ranking.
  • Search Efficiency: Since the number of the total documents may be very large in a practical situation, the EMRS should achieve sublinear search with better search efficiency.
  • Confidentiality and Privacy Preservation: To prevent the cloud server from learning any additional information about the documents and the index, and to keep search users’ trapdoors secret, the EMRS should cover all the security requirements that we introduced above.

INTRODUCTION

Mobile cloud computing removes the hardware limitations of mobile devices by exploiting scalable and virtualized cloud storage and computing resources, and is accordingly able to provide much more powerful and scalable mobile services to users. In mobile cloud computing, mobile users typically outsource their data to external cloud servers, e.g., iCloud, to enjoy a stable, low-cost and scalable way of storing and accessing data. However, outsourced data typically contain sensitive private information, such as personal photos and emails, whose exposure would lead to severe confidentiality and privacy violations without efficient protection. It is therefore necessary to encrypt the sensitive data before outsourcing them to the cloud. The data encryption, however, results in salient difficulties when other users need to find data of interest by search, due to the difficulty of searching over encrypted data.

This fundamental issue in mobile cloud computing has motivated an extensive body of research in recent years on searchable encryption techniques that achieve efficient searching over outsourced encrypted data. A collection of research works has recently been developed on the topic of multi-keyword search over encrypted data. One work proposes a symmetric searchable encryption scheme which achieves high efficiency for large databases with a modest sacrifice in security guarantees. Another proposes a multi-keyword search scheme supporting result ranking by adopting the k-nearest neighbors (kNN) technique. A third proposes a dynamic searchable encryption scheme through blind storage to conceal the access pattern of the search user.
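The kNN-based ranking idea mentioned above relies on encrypting index and query vectors so that their inner product, used as the relevance score, is preserved. The following is a heavily simplified sketch of that principle: an index vector p is stored as M^T p and a query q is sent as M^{-1} q, so the server can compute p·q without seeing either vector. The secure kNN constructions used in the literature add further steps (such as vector splitting and dimension extension) that are omitted here, and all names and data below are assumptions.

```python
import random
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        a[col], a[pivot] = a[pivot], a[col]
        scale = 1 / a[col][col]
        a[col] = [x * scale for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def random_invertible(n, rng):
    """Draw small integer matrices until one is invertible (the secret key)."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(7)
M, M_inv = random_invertible(3, rng)

# Illustrative keyword-weight vectors (assumed, not from the source).
index_vectors = {"d1": [3, 0, 1], "d2": [1, 3, 0], "d3": [0, 0, 4]}
query = [1, 1, 0]

enc_index = {d: mat_vec(transpose(M), p) for d, p in index_vectors.items()}  # owner
enc_query = mat_vec(M_inv, query)                                            # user
scores = {d: dot(v, enc_query) for d, v in enc_index.items()}                # server
top = sorted(scores, key=lambda d: (-scores[d], d))
```

Exact fractions are used only so that the preserved inner products can be compared without floating-point noise.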

In order to meet the practical search requirements, search over encrypted data should support the following three functions.

First, the searchable encryption schemes should support multi-keyword search, and provide the same user experience as searching in Google search with different keywords; single-keyword search is far from satisfactory by only returning very limited and inaccurate search results. Second, to quickly identify most relevant results, the search user would typically prefer cloud servers to sort the returned search results in a relevance-based order ranked by the relevance of the search request to the documents. In addition, showing the ranked search to users can also eliminate the unnecessary network traffic by only sending back the most relevant results from cloud to search users.

Third, as for the search efficiency, since the number of the documents contained in a database could be extraordinarily large, searchable encryption schemes should be efficient to quickly respond to the search requests with minimum delays.

In contrast to the theoretical benefits, most of the existing proposals, however, fail to offer sufficient insights towards the construction of fully functional searchable encryption as described above. As an effort towards this issue, in this paper, we propose an efficient multi-keyword ranked search (EMRS) scheme over encrypted mobile cloud data through blind storage.

Our main contributions can be summarized as follows:

  • We introduce a relevance score in searchable encryption to achieve multi-keyword ranked search over the encrypted mobile cloud data. In addition to that, we construct an efficient index to improve the search efficiency.
  • By modifying the blind storage system in the EMRS, we solve the trapdoor unlinkability problem and conceal access pattern of the search user from the cloud server.
  • We give thorough security analysis to demonstrate that the EMRS can reach a high security level including confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealing access pattern of the search user. Moreover, we implement extensive experiments, which show that the EMRS can achieve enhanced efficiency in the terms of functionality and search efficiency compared with existing proposals.

LITERATURE SURVEY

SYSTEM ANALYSIS

EXISTING SYSTEM:

Existing works built various types of secure index and corresponding index-based keyword matching algorithms to improve search efficiency. All these works only support the search of a single keyword. Subsequent works extended the search capability to multiple, conjunctive or disjunctive, keywords search. However, they support only exact keyword matching. Misspelled keywords in the query will result in wrong or no matching. Very recently, a few works extended the search capability to approximate keyword matching (also known as fuzzy search). These are all for single keyword search, with a common approach involving expanding the index file by covering possible combinations of keyword misspelling so that a certain degree of spelling error, measured by edit distance, can be tolerated. Although a wild-card approach is adopted to minimize the expansion of the resulting index file, for an l-letter long keyword to tolerate an error up to an edit distance of d, the index has to be expanded times.

Thus, it is not scalable, as the storage complexity increases exponentially with the increase of the error tolerance. To support multi-keyword search, the search algorithm will have to run multiple rounds. To date, efficient multi-keyword fuzzy search over encrypted data remains a challenging problem. We want to point out that the efforts on search over encrypted data involve not only information retrieval techniques such as advanced data structures used to represent the searchable index, and efficient search algorithms that run over the corresponding data structure, but also the proper design of cryptographic protocols to ensure the security and privacy of the overall system. Although multi-keyword search and fuzzy search have been implemented separately, a combination of the two does not lead to a secure and efficient multi-keyword fuzzy search scheme.
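The wild-card expansion discussed above can be sketched for the simplest case of edit distance 1. The snippet below is an illustration under stated assumptions rather than the construction of any particular cited work: `*` stands for one arbitrary character, and the function name `wildcard_fuzzy_set` is hypothetical. It only shows how each keyword grows into a set of index entries, which is the source of the index expansion.

```python
def wildcard_fuzzy_set(word: str) -> set:
    """Wildcard variants of `word` covering a single edit (edit distance 1)."""
    variants = {word}
    # A wildcard inserted at each gap covers one inserted character.
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])
    # A wildcard replacing each letter covers one substituted or deleted character.
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])
    return variants


fuzzy = wildcard_fuzzy_set("castle")

# A misspelled query matches if its own wildcard set overlaps the indexed one.
matches_typo = bool(fuzzy & wildcard_fuzzy_set("cassle"))
matches_other = bool(fuzzy & wildcard_fuzzy_set("ranking"))
```

For the six-letter keyword in this example the set holds 14 entries instead of one, and tolerating larger edit distances means applying the expansion repeatedly.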

DISADVANTAGES:

Given the large number of data users and documents in the cloud, it is crucial for the search service to allow multi-keyword queries and provide result similarity ranking to meet the need for effective data retrieval. Existing searchable encryption focuses on single-keyword search or Boolean keyword search, and rarely differentiates the search results.

  • Single-keyword search without ranking
  • Boolean-keyword search without ranking
  • Single-keyword similarity search with ranking

PROPOSED SYSTEM:

Prior works propose a symmetric searchable encryption scheme which achieves high efficiency for large databases with a modest sacrifice in security guarantees, a multi-keyword search scheme supporting result ranking by adopting the k-nearest neighbors (kNN) technique, and a dynamic searchable encryption scheme through blind storage to conceal the access pattern of the search user.

We now present the EMRS in detail. Since the encrypted documents and index z are both stored in the blind storage system, we provide the general construction of the blind storage system. Moreover, since the EMRS aims to eliminate the risk of sharing the key that is used to encrypt the documents with all search users and to solve the trapdoor unlinkability problem in Naveed’s scheme, we modify the construction of blind storage and leverage the ciphertext-policy attribute-based encryption (CP-ABE) technique in the EMRS. However, the specific construction of CP-ABE is out of the scope of this paper and we only give a simple indication here. The notations of this paper are shown in Table 1. The EMRS consists of the following phases: System Setup, Construction of Blind Storage, Encrypted Database Setup, Trapdoor Generation, Efficient and Secure Search, and Retrieve Documents from Blind Storage.

ADVANTAGES:

In this paper, we propose an efficient multi-keyword ranked search (EMRS) scheme over encrypted mobile cloud data through blind storage.

Our main contributions can be summarized as follows:

  • We introduce a relevance score in searchable encryption to achieve multi-keyword ranked search over the encrypted mobile cloud data. In addition to that, we construct an efficient index to improve the search efficiency.
  • By modifying the blind storage system in the EMRS, we solve the trapdoor unlinkability problem and conceal access pattern of the search user from the cloud server.
  • We give thorough security analysis to demonstrate that the EMRS can reach a high security level including confidentiality of documents and index, trapdoor privacy, trapdoor unlinkability, and concealing access pattern of the search user. Moreover, we implement extensive experiments, which show that the EMRS can achieve enhanced efficiency in the terms of functionality and search efficiency compared with existing proposals.

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

  • Processor       –    Pentium IV
  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    1.44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor      –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Back End :           MYSQL Server
  • Server :           Apache Tomcat Server
  • Script :           JSP Script
  • Document :           MS-Office 2007

EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval

Abstract—Graph-based ranking models have been widely applied in the information retrieval area. In this paper, we focus on a well-known graph-based model – the Ranking on Data Manifold model, or Manifold Ranking (MR). In particular, it has been successfully applied to content-based image retrieval, because of its outstanding ability to discover the underlying geometrical structure of the given image database. However, manifold ranking is computationally very expensive, which significantly limits its applicability to large databases, especially for cases where the queries are out of the database (new samples). We propose a novel scalable graph-based ranking model called Efficient Manifold Ranking (EMR), trying to address the shortcomings of MR from two main perspectives: scalable graph construction and efficient ranking computation. Specifically, we build an anchor graph on the database instead of a traditional k-nearest neighbor graph, and design a new form of adjacency matrix utilized to speed up the ranking. An approximate method is adopted for efficient out-of-sample retrieval. Experimental results on some large scale image databases demonstrate that EMR is a promising method for real world retrieval applications.

INTRODUCTION
Graph-based ranking models have been deeply studied and widely applied in the information retrieval area. In this paper, we focus on the problem of applying a novel and efficient graph-based model for content-based image retrieval (CBIR), especially for out-of-sample retrieval on large scale databases.
Traditional image retrieval systems are based on keyword search, such as Google and Yahoo image search. In these systems, a user keyword (query) is matched with the context around an image including the title, manual annotation, web document, etc. These systems do not utilize information from the images themselves, and they suffer from many problems, such as shortage of text information and inconsistency between the meaning of the text and the image. Content-based image retrieval is a considerable choice to overcome these difficulties. CBIR has drawn great attention in the past two decades [1]–[3]. Different from traditional keyword search systems, CBIR systems utilize low-level features, including global features (e.g., color moment, edge histogram, LBP [4]) and local features (e.g., SIFT [5]), automatically extracted from images. A great deal of research has been performed on designing more informative low-level features to represent images, or better metrics (e.g., DPF [6]) to measure the perceptual similarity, but their performance is restricted by many conditions and is sensitive to the data. Relevance feedback [7] is a useful tool for interactive CBIR. The user’s high-level perception is captured by dynamically updated weights based on the user’s feedback.
Most traditional methods focus on the data features too much but they ignore the underlying structure information, which is of great importance for semantic discovery, especially when the label information is unknown. Many databases have underlying cluster or manifold structure.
Under such circumstances, the assumption of label consistency is reasonable [8], [9]. It means that nearby data points, or points belonging to the same cluster or manifold, are very likely to share the same semantic label. This phenomenon is extremely important for exploring the semantic relevance when the label information is unknown. In our opinion, a good CBIR system should consider images’ low-level features as well as the intrinsic structure of the image database.
Manifold Ranking (MR) [9], [10], a famous graph-based ranking model, ranks data samples with respect to the intrinsic geometrical structure collectively revealed by a large number of data. It is exactly in line with our consideration. MR has been widely applied in many applications, and shown to have excellent performance and feasibility on a variety of data types, such as text [11], image [12], [13], and video [14]. By taking the underlying structure into account, manifold ranking assigns each data sample a relative ranking score, instead of an absolute pairwise similarity as in traditional approaches. The score is treated as a similarity metric defined on the manifold, which is more meaningful for capturing the semantic relevance degree. He et al. [12] first applied MR to CBIR, and significantly improved image retrieval performance compared with state-of-the-art algorithms.
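As a rough illustration of how manifold ranking spreads a query's score over the graph, the sketch below uses the standard iterative form f ← αSf + (1 − α)y, where S is the symmetrically normalized affinity matrix and y marks the query. It is a toy version on one-dimensional points with a fully connected Gaussian graph; the parameter values, the function name `manifold_ranking` and the data are assumptions, and nothing here reflects the scalable EMR variant.

```python
import math


def manifold_ranking(points, query_index, alpha=0.8, sigma=1.0, iterations=200):
    """Iterative manifold ranking on 1-D points; returns one score per point."""
    n = len(points)
    # Gaussian affinity with zero diagonal (fully connected for simplicity).
    w = [[0.0 if i == j
          else math.exp(-((points[i] - points[j]) ** 2) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in w]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    s = [[w[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        # Each point takes scores from its neighbours and keeps part of its label.
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Two well-separated clusters (illustrative data); the query is point 0.
points = [0.0, 0.1, 0.2, 5.0, 5.1]
scores = manifold_ranking(points, query_index=0)
order = sorted(range(len(points)), key=lambda i: -scores[i])
```

The points sharing the query's cluster end up ranked ahead of the distant cluster, which is the structural effect described in the text. Building the full affinity matrix is also what makes this form costly on large databases.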
However, manifold ranking has its own drawbacks in handling large scale databases: it has expensive computational cost, both in the graph construction and ranking computation stages. In particular, it is unknown how to handle an out-of-sample query (a new sample) efficiently under the existing framework. It is unacceptable to recompute the model for a new query. That is, the original manifold ranking is inadequate for a real world CBIR system, in which the user-provided query is always an out-of-sample.
In this paper, we extend the original manifold ranking and propose a novel framework named Efficient Manifold Ranking (EMR). We try to address the shortcomings of manifold ranking from two perspectives: the first is scalable graph construction; and the second is efficient computation, especially for out-of-sample retrieval. Specifically, we build an anchor graph on the database instead of the traditional k-nearest neighbor graph, and design a new form of adjacency matrix utilized to speed up the ranking computation. The model has two separate stages: an offline stage for building (or learning) the ranking model and an online stage for handling a new query. With EMR, we can handle a database with 1 million images and do the online retrieval in a short time. To the best of our knowledge, no previous manifold ranking based algorithm has run out-of-sample retrieval on a database at this scale.
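The anchor graph idea can be sketched as follows: every data point is described by weights over its few nearest anchors, and two points are treated as adjacent when they share anchors. This sketch assumes a Gaussian kernel, hand-picked anchors, and adjacency computed as the inner product of the two weight vectors; these choices, the function names and the data are illustrative assumptions, and the kernel and anchor selection used by EMR may differ.

```python
import math


def anchor_weights(points, anchors, s=2, bandwidth=1.0):
    """For each point, weights over its s nearest anchors (each row sums to 1)."""
    z = []
    for x in points:
        nearest = sorted(range(len(anchors)), key=lambda k: abs(x - anchors[k]))[:s]
        raw = {k: math.exp(-((x - anchors[k]) ** 2) / (2 * bandwidth ** 2))
               for k in nearest}
        total = sum(raw.values())
        # Anchors outside the s nearest get weight zero, keeping rows sparse.
        z.append([raw.get(k, 0.0) / total for k in range(len(anchors))])
    return z


def adjacency(z, i, j):
    """Affinity of points i and j through shared anchors (inner product of weights)."""
    return sum(a * b for a, b in zip(z[i], z[j]))


# Illustrative 1-D data and anchors (assumed, not from the source).
points = [0.0, 0.1, 0.2, 5.0, 5.1]
anchors = [0.1, 2.5, 5.05]
z = anchor_weights(points, anchors)

same_cluster = adjacency(z, 0, 1)
cross_cluster = adjacency(z, 0, 3)
```

Because each point only stores a handful of anchor weights, the graph never needs a full point-by-point neighbor search, which is the scalability argument made above.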

Effective Key Management in Dynamic Wireless Sensor Networks

Recently, wireless sensor networks (WSNs) have been deployed for a wide variety of applications, including military sensing and tracking, patient status monitoring, traffic flow monitoring, where sensory devices often move between different locations. Securing data and communications requires suitable encryption key protocols. In this paper, we propose a certificateless-effective key management (CL-EKM) protocol for secure communication in dynamic WSNs characterized by node mobility. The CL-EKM supports efficient key updates when a node leaves or joins a cluster and ensures forward and backward key secrecy. The protocol also supports efficient key revocation for compromised nodes and minimizes the impact of a node compromise on the security of other communication links.
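The forward and backward secrecy property can be pictured with a toy model in which the cluster key is refreshed on every membership change. This is only a sketch of the property, not the CL-EKM protocol: it ignores certificateless keys, pairwise keys and the secure delivery of the new cluster key, and the class `Cluster` with its bookkeeping is a hypothetical construct for illustration.

```python
import secrets


class Cluster:
    """Toy cluster head that refreshes the cluster key on every membership change."""

    def __init__(self):
        self.members = set()
        self.epoch = 0
        self.cluster_key = secrets.token_bytes(16)
        self.known_keys = {}  # node id -> {epoch: key} that the node has received

    def _rekey(self):
        # A fresh random key is delivered only to the current members.
        self.epoch += 1
        self.cluster_key = secrets.token_bytes(16)
        for node in self.members:
            self.known_keys.setdefault(node, {})[self.epoch] = self.cluster_key

    def join(self, node):
        # Backward secrecy: the newcomer only ever receives keys from now on.
        self.members.add(node)
        self._rekey()

    def leave(self, node):
        # Forward secrecy: a departed or revoked node misses all later keys.
        self.members.discard(node)
        self._rekey()


cluster = Cluster()
cluster.join("n1")   # epoch 1
cluster.join("n2")   # epoch 2
cluster.leave("n1")  # epoch 3, also models revoking a compromised node
cluster.join("n3")   # epoch 4
```

After these events, `n1` holds only the keys of epochs 1 and 2, and `n3` holds only the key of epoch 4.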

A security analysis of our scheme shows that our protocol is effective in defending against various attacks. We implement CL-EKM in Contiki OS and simulate it using the Cooja simulator to assess its time, energy, communication, and memory performance.

DISTORTION-AWARE CONCURRENT MULTIPATH TRANSFER FOR MOBILE VIDEO STREAMING IN HETEROGENEOUS WIRELESS NETWORKS

ABSTRACT:

The massive proliferation of wireless infrastructures with complementary characteristics prompts the bandwidth aggregation for Concurrent Multipath Transfer (CMT) over heterogeneous access networks. Stream Control Transmission Protocol (SCTP) is the standard transport-layer solution to enable CMT in multihomed communication environments. However, delivering high-quality streaming video with the existing CMT solutions still remains problematic due to the stringent quality of service (QoS) requirements and path asymmetry in heterogeneous wireless networks.

In this paper, we advance the state of the art by introducing video distortion into the decision process of multipath data transfer. The proposed distortion-aware concurrent multipath transfer (CMT-DA) solution includes three phases: 1) per-path status estimation and congestion control; 2) quality-optimal video flow rate allocation; 3) delay and loss controlled data retransmission. The term ‘flow rate allocation’ indicates dynamically picking appropriate access networks and assigning the transmission rates.

We analytically formulate the data distribution over multiple communication paths to minimize the end-to-end video distortion and derive the solution based on the utility maximization theory. The performance of the proposed CMT-DA is evaluated through extensive semi-physical emulations in Exata involving H.264 video streaming. Experimental results show that CMT-DA outperforms the reference schemes in terms of video peak signal-to-noise ratio (PSNR), goodput, and inter-packet delay.
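For readers unfamiliar with the PSNR metric used in the evaluation, a small sketch of the usual definition follows: PSNR = 10·log10(peak² / MSE), with peak = 255 for 8-bit samples. The function name `psnr` and the sample values are illustrative assumptions; how the paper aggregates PSNR over frames is not stated in this section.

```python
import math


def psnr(reference, received, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10 * math.log10(peak ** 2 / mse)


# Illustrative 8-bit pixel values (assumed, not from the source).
reference = [100, 100, 100, 100]
lightly_distorted = [101, 99, 100, 100]
heavily_distorted = [110, 90, 100, 100]

light = psnr(reference, lightly_distorted)
heavy = psnr(reference, heavily_distorted)
```

Lower distortion gives a higher PSNR, so `light` exceeds `heavy` here.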

INTRODUCTION:

During the past few years, mobile video streaming service (online gaming, etc.) has become one of the “killer applications”, and the video traffic headed for hand-held devices has experienced explosive growth. The latest market research conducted by Cisco indicates that video streaming accounts for 53 percent of the mobile Internet traffic. In parallel, global mobile data is expected to increase 11-fold in the next five years. Another ongoing trend feeding this tremendous growth is the popularity of powerful mobile terminals (e.g., smart phones and iPad), which enables individual users to access the Internet and watch videos from anywhere [4].

Despite the rapid advancements in network infrastructures, it is still challenging to deliver high-quality streaming video over wireless platforms. On one hand, Wi-Fi networks are limited in radio coverage and mobility support for individual users; on the other hand, cellular networks can well sustain user mobility but their bandwidth is often inadequate to support throughput-demanding video applications. Although 4G LTE and WiMAX can provide higher peak data rates and extended coverage, the available capacity will still be insufficient compared to the ever-growing video data traffic.

The complementary characteristics of heterogeneous access networks prompt the bandwidth aggregation for concurrent multipath transfer (CMT) to enhance transmission throughput and reliability (see Fig. 1). With the emergence of multihomed/multinetwork terminals, CMT is considered to be a promising solution for supporting video streaming in future wireless networking. The key research issue in multihomed video delivery over heterogeneous wireless networks is the effective integration of the limited channel resources available to provide adequate quality of service (QoS). Stream control transmission protocol (SCTP) is the standard transport-layer solution that exploits the multihoming feature to concurrently distribute data across multiple independent end-to-end paths.

Therefore, many CMT solutions have been proposed to optimize the delay, throughput, or reliability performance for efficient data delivery. However, due to the special characteristics of streaming video, these network-level criteria cannot always improve the perceived media quality. For instance, a real-time video application encoded in constant bit rate (CBR) may not effectively leverage the throughput gains since its streaming rate is typically fixed or bounded by the encoding schemes. In addition, involving a communication path with available bandwidth but long delay in the multipath video delivery may degrade the streaming video quality as the end-to-end distortion increases. Consequently, leveraging the CMT for high-quality streaming video over heterogeneous wireless networks is largely unexplored.

In this paper, we investigate the problem by introducing video distortion into the decision process of multipath data transfer over heterogeneous wireless networks. The proposed Distortion-Aware Concurrent Multipath Transfer (CMT-DA) solution is a transport-layer protocol and includes three phases: 1) per-path status estimation and congestion control to exploit the available channel resources; 2) data flow rate allocation to minimize the end-to-end video distortion; 3) delay and loss constrained data retransmission for bandwidth conservation. The detailed descriptions of the proposed solution will be presented in Section 4. Specifically, the contributions of this paper can be summarized in the following.

  • An effective CMT solution that uses path status estimation, flow rate allocation, and retransmission control to optimize the real-time video quality in integrated heterogeneous wireless networks.
  • A mathematical formulation of video data distribution over parallel communication paths to minimize the end-to-end distortion. The utility maximization theory is employed to derive the solution for optimal transmission rate assignment.
  • Extensive semi-physical emulations in Exata involving real-time H.264 video streaming.
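The notion of flow rate allocation, that is, picking access networks and assigning each a transmission rate, can be illustrated with a deliberately simple greedy heuristic. It is not the utility-maximization solution of CMT-DA: it merely skips paths whose delay exceeds a playout deadline and fills the remaining paths in order of increasing loss rate. The `Path` fields, the function `allocate` and all numbers are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    bandwidth_kbps: float  # estimated available bandwidth
    delay_ms: float        # estimated one-way delay
    loss_rate: float       # estimated packet loss probability


def allocate(paths, video_rate_kbps, deadline_ms):
    """Greedy split of the video rate; returns (allocation, unserved rate)."""
    # Paths that cannot meet the deadline would only add late, useless packets.
    usable = [p for p in paths if p.delay_ms <= deadline_ms]
    usable.sort(key=lambda p: p.loss_rate)
    remaining = video_rate_kbps
    allocation = {}
    for p in usable:
        if remaining <= 0:
            break
        # Discount the bandwidth by the expected loss on this path.
        share = min(remaining, p.bandwidth_kbps * (1 - p.loss_rate))
        allocation[p.name] = share
        remaining -= share
    return allocation, remaining


# Illustrative path estimates (assumed, not from the source).
paths = [
    Path("wifi", 2000, 40, 0.02),
    Path("cellular", 1000, 80, 0.01),
    Path("wimax", 1500, 300, 0.05),
]
allocation, unserved = allocate(paths, video_rate_kbps=2400, deadline_ms=250)
```

In this example the high-delay path is left out even though it has spare bandwidth, echoing the observation that a long-delay path can degrade video quality.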

LITERATURE SURVEY:

CMT-QA: QUALITY-AWARE ADAPTIVE CONCURRENT MULTIPATH DATA TRANSFER IN HETEROGENEOUS WIRELESS NETWORKS

AUTHOR: C. Xu, T. Liu, J. Guan, H. Zhang, and G. M. Muntean

PUBLICATION: IEEE Trans. Mobile Comput., vol. 12, no. 11, pp. 2193–2205, Nov. 2013.

EXPLANATION:

Mobile devices equipped with multiple network interfaces can increase their throughput by making use of parallel transmissions over multiple paths and bandwidth aggregation, enabled by the Stream Control Transmission Protocol (SCTP). However, the different bandwidths and delays of the multiple paths will cause data to be received out of order, and in the absence of related mechanisms to correct this, serious application-level performance degradation will occur. This paper proposes a novel quality-aware adaptive concurrent multipath transfer solution (CMT-QA) that utilizes SCTP for FTP-like data transmission and real-time video delivery in wireless heterogeneous networks. CMT-QA regularly monitors and analyses each path’s data handling capability and makes data delivery adaptation decisions to select the qualified paths for concurrent data transfer. CMT-QA includes a series of mechanisms to distribute data chunks over multiple paths intelligently and control the data traffic rate of each path independently. CMT-QA’s goal is to mitigate out-of-order data reception by reducing the reordering delay and unnecessary fast retransmissions. CMT-QA can effectively differentiate between different types of packet loss to avoid unreasonable congestion window adjustments for retransmissions. Simulations show how CMT-QA outperforms existing solutions in terms of performance and quality of service.

PERFORMANCE ANALYSIS OF PROBABILISTIC MULTIPATH TRANSMISSION OF VIDEO STREAMING TRAFFIC OVER MULTI-RADIO WIRELESS DEVICES

AUTHOR: W. Song and W. Zhuang

PUBLICATION: IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1554–1564, 2012.

EXPLANATION:

Popular smart wireless devices are increasingly equipped with multiple radio interfaces. Multihoming support can be enabled to allow for multiple simultaneous associations with heterogeneous networks. In this study, we focus on video streaming traffic and propose analytical approaches to evaluate the packet-level and call-level performance of a multipath transmission scheme, which sends video traffic bursts over multiple available channels in a probabilistic manner. A probability generating function (PGF) and z-transform method is applied to derive the PGF of packet delay and, in general, any arbitrary moment. In particular, we can obtain the average delay, delay jitter, and delay outage probability. The essential characteristics of video traffic are taken into account, such as deterministic burst intervals, highly dynamic burst length, and batch arrivals of transmission packets. The video substream traffic resulting from the probabilistic flow splitting is characterized by means of zero-inflated models. Further, the call-level performance, in terms of flow blocking probability and system throughput, is evaluated with a three-dimensional Markov process and compared with that of an always-best access selection. The numerical and simulation results demonstrate the effectiveness of our analysis framework and the performance gain of multipath transmission.
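
As a rough illustration of how moments follow from a PGF, the sketch below evaluates the first two derivatives of G(z) = Σ p_k z^k at z = 1 for a discrete delay distribution. The distribution `toy_pmf`, the function names, and the use of variance as a stand-in for delay jitter are assumptions made for this example; they are not taken from the surveyed paper.

```python
def pgf_moments(pmf):
    """Return (mean, variance) of a discrete delay from PGF derivatives at z = 1.

    pmf[k] is the probability that the delay equals k time slots.
    """
    # G'(1) = sum_k k * p_k, which is the mean delay.
    g1 = sum(k * p for k, p in enumerate(pmf))
    # G''(1) = sum_k k * (k - 1) * p_k, the second factorial moment.
    g2 = sum(k * (k - 1) * p for k, p in enumerate(pmf))
    # Var = G''(1) + G'(1) - G'(1)^2.
    return g1, g2 + g1 - g1 ** 2


def outage_probability(pmf, bound):
    """Probability that the delay exceeds an integer bound."""
    return sum(p for k, p in enumerate(pmf) if k > bound)


# Hypothetical delay distribution over 0, 1, or 2 slots (illustration only).
toy_pmf = [0.2, 0.5, 0.3]
mean_delay, delay_variance = pgf_moments(toy_pmf)
```

The same two derivatives give any higher moment by extension, which is the sense in which a closed-form PGF yields "any arbitrary moment".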

AN END-TO-END VIRTUAL PATH CONSTRUCTION SYSTEM FOR STABLE LIVE VIDEO STREAMING OVER HETEROGENEOUS WIRELESS NETWORKS

AUTHOR: S. Han, H. Joo, D. Lee, and H. Song

PUBLICATION: IEEE J. Sel. Areas Commun., vol. 29, no. 5, pp. 1032–1041, May 2011.

EXPLANATION:

In this paper, we propose an effective end-to-end virtual path construction system, which exploits path diversity over heterogeneous wireless networks. The goal of the proposed system is to provide a high quality live video streaming service over heterogeneous wireless networks. First, we propose a packetization-aware fountain code to integrate multiple physical paths efficiently and increase the fountain decoding probability over wireless packet switching networks. Second, we present a simple but effective physical path selection algorithm to maximize the effective video encoding rate while satisfying delay and fountain decoding failure rate constraints. The proposed system is fully implemented in software and examined over real WLAN and HSDPA networks.

SYSTEM ANALYSIS

EXISTING SYSTEM:

Joint source-channel coding (JSCC) is an effective approach to designing error-resilient wireless video broadcasting systems. In recent years, JSCC has attracted increasing interest in both the research community and industry because it shows better results in robust layered video transmission over error-prone channels, and various techniques have been proposed during these years. However, there are still many open problems in terms of how to serve heterogeneous users with diverse screen features and variable reception performance in a wireless video broadcast system. One particularly challenging problem of this heterogeneous quality-of-service (QoS) video provision is that users may prefer flexible, lower-quality video that matches their screens while, at the same time, the video stream must be received reliably.

The main technical difficulties are as follows:

  • A distinctive characteristic of current wireless broadcast systems is that the receivers are highly heterogeneous in terms of their terminal processing capabilities and available bandwidths. On the source side, scalable video coding (SVC) has been proposed as an attractive solution to this problem.
  • However, in order to support flexible video broadcasting, the scalable video sources need to provide adaptation ability through a variety of schemes, such as scalable video stream extraction, layer generation with different priorities, and summarization, before they can be transmitted over error-prone networks.

DISADVANTAGES:

  • Existing layered video data is very sensitive to transmission failures, so the transmission must be more reliable, have low overhead, and support large numbers of devices with heterogeneous characteristics. In broadcast and multicast networks, conventional schemes such as adaptive retransmission have their limitations; for example, retransmission may lead to an implosion problem.
  • Forward error correction (FEC) and unequal error protection (UEP) are employed to provide QoS support for video transmission. However, to keep the investment in broadcasting system deployment as low as possible, the server side must be designed to be more scalable, reliable, and independent, and to support a vast number of autonomous receivers. Suitable FEC approaches are needed that can eliminate retransmission and lower the unnecessary reception overhead at each receiver.
  • Conventionally, joint source and channel coding schemes are seldom designed with heterogeneous characteristics in mind, and most of the above challenges are ignored in practical video broadcasting systems. This leads to the need for heterogeneous QoS video provision in broadcasting networks. This paper studies hybrid-scalable video from the viewpoint of a new quality metric so as to support users’ heterogeneous requirements.

PROPOSED SYSTEM:

The proposed Distortion-Aware Concurrent Multipath Transfer (CMT-DA) solution is a transport-layer protocol that includes three phases: 1) per-path status estimation and congestion control to exploit the available channel resources; 2) data flow rate allocation to minimize the end-to-end video distortion; and 3) delay- and loss-constrained data retransmission for bandwidth conservation. It is an effective CMT solution that uses path status estimation, flow rate allocation, and retransmission control to optimize real-time video quality in integrated heterogeneous wireless networks.

Xu et al. propose a quality-aware adaptive concurrent multipath transfer (CMT-QA) scheme that distributes the data based on estimated path quality. Although the path status is an important factor that affects the scheduling policy, the application requirements should also be considered to guarantee the QoS. Basically, the proposed CMT-DA differs from CMT-QA in that we take the video distortion as the benchmark. Moreover, the proposed solutions (path status estimation, flow rate allocation, and retransmission control) in CMT-DA are significantly different from those in CMT-QA. In other research, a realistic evaluation tool-set is proposed to analyze and optimize the performance of multimedia distribution when taking advantage of CMT-based multihoming SCTP solutions.

ADVANTAGES:

  • We propose a novel out-of-order scheduling approach for in-order arrival of the data chunks in CMT-DA, based on the progressive water-filling algorithm.
  • We propose an end-to-end virtual path construction system that exploits the path diversity in heterogeneous wireless networks based on fountain code. The encoded multipath streaming model proposed by Chow et al. is a joint multipath and FEC approach for real time live streaming applications. The authors provide asymptotic analysis and derive closed-form solution for the FEC packets allocation.
  • The major components at the sender side are the parameter control unit, flow rate allocator, and retransmission controller. The parameter control unit is responsible for processing the acknowledgements (ACKs) feedback from the receiver, estimating the path status and adapting the congestion window size. The delay and loss requirements are imposed by the video applications to achieve the target video quality.
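
The flow rate allocator and the water-filling idea mentioned above can be pictured with a generic progressive-filling sketch: the rates of all unsaturated paths are raised together until the demand is met or a path reaches its capacity. This is only a textbook-style illustration under assumed inputs (`capacities`, `demand`); it is not the distortion-minimizing allocation of CMT-DA, whose objective and constraints are not given in this text.

```python
def progressive_fill(capacities, demand):
    """Split `demand` across paths by raising all unsaturated paths equally.

    capacities: per-path available bandwidth (hypothetical units, e.g. Mbps).
    demand: total rate to place on the paths.
    Returns the per-path rates; any demand beyond total capacity is left unserved.
    """
    rates = [0.0] * len(capacities)
    active = [i for i, c in enumerate(capacities) if c > 0]
    remaining = float(demand)
    while active and remaining > 1e-12:
        # Equal share of what is left for every path that still has headroom.
        share = remaining / len(active)
        # The tightest active path limits how far this filling step can go.
        step = min(share, min(capacities[i] - rates[i] for i in active))
        for i in active:
            rates[i] += step
        remaining -= step * len(active)
        # Paths that reached their capacity drop out of later steps.
        active = [i for i in active if capacities[i] - rates[i] > 1e-12]
    return rates


# Three hypothetical paths and a 6-unit video stream.
example_rates = progressive_fill([1.0, 2.0, 4.0], 6.0)
```

In this example the two smaller paths saturate first and the remainder goes to the largest path; a distortion-aware allocator would replace the equal-share rule with one driven by per-path loss and delay.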

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT:

  • Processor – Pentium IV
  • Speed – 1 GHz
  • RAM – 256 MB (min)
  • Hard Disk – 20 GB
  • Floppy Drive – 1.44 MB
  • Keyboard – Standard Windows Keyboard
  • Mouse – Two- or Three-Button Mouse
  • Monitor – SVGA

SOFTWARE REQUIREMENTS:

  • Operating System : Windows XP or Windows 7
  • Front End : Java JDK 1.7
  • Tools : NetBeans or Eclipse
  • Script : JavaScript
  • Document : MS Office 2007

Defeating Jamming With the Power of Silence: A Game-Theoretic Analysis

Abstract:

The timing channel is a logical communication channel in which information is encoded in the timing between events. Recently, the use of the timing channel has been proposed as a countermeasure to reactive jamming attacks performed by an energy-constrained malicious node. In fact, while a jammer is able to disrupt the information contained in the attacked packets, timing information cannot be jammed, and therefore, timing channels can be exploited to deliver information to the receiver even on a jammed channel. Since the nodes under attack and the jammer have conflicting interests, their interactions can be modeled by means of game theory. Accordingly, in this paper, a game-theoretic model of the interactions between nodes exploiting the timing channel to achieve resilience to jamming attacks and a jammer is derived and analyzed. More specifically, the Nash equilibrium is studied in terms of existence, uniqueness, and convergence under best response dynamics. Furthermore, the case in which the communication nodes set their strategy and the jammer reacts accordingly is modeled and analyzed as a Stackelberg game, by considering both perfect and imperfect knowledge of the jammer’s utility function. Extensive numerical results are presented, showing the impact of network parameters on the system performance.

Introduction:

A timing channel is a communication channel which exploits silence intervals between consecutive transmissions to encode information. Recently, use of timing channels has been proposed in the wireless domain to support low rate, energy efficient communications as well as covert and resilient communications. Timing channels are more, although not totally, immune from reactive jamming attacks. In fact, the interfering signal begins its disturbing action against the communication only after identifying an ongoing transmission, and thus after the timing information has been decoded by the receiver.

A timing channel-based communication scheme has been proposed to counteract jamming by establishing a low rate physical layer on top of the traditional physical/link layers using detection and timing of failed packet receptions at the receiver.

The energy cost of jamming the timing channel and the resulting trade-offs have been analyzed. We analyze the interactions between the jammer and the node whose transmissions are under attack, which we call the target node.

Specifically, we assume that the target node wants to maximize the amount of information that can be transmitted per unit of time by means of the timing channel, whereas the jammer wants to minimize such amount of information while reducing the energy expenditure.

As the target node and the jammer have conflicting interests, we develop a game theoretical framework that models their interactions. We investigate both the case in which these two adversaries play their strategies simultaneously and the situation when the target node (the leader) anticipates the actions of the jammer (the follower). To this purpose, we study both the Nash Equilibria (NEs) and Stackelberg Equilibria (SEs) of our proposed games.

Existing system:

Recently, use of timing channels has been proposed in the wireless domain to support low rate, energy efficient communications as well as covert and resilient communications. In the existing system, methodologies to detect jamming attacks are illustrated; it is also shown that it is possible to identify which kind of jamming attack is ongoing by looking at the signal strength and other relevant network parameters, such as bit and packet errors. Several solutions against reactive jamming have been proposed that exploit different techniques, such as frequency hopping, power control, and unjammed bits.

Disadvantages:

  • Continuous jamming is very costly in terms of energy consumption for the jammer.
  • Existing solutions usually rely on users’ cooperation and coordination, which might not be guaranteed in a jammed environment. In fact, the reactive jammer can totally disrupt each transmitted packet and, consequently, no information can be decoded and then used to this purpose.

Proposed system:

Our proposed system implementation focuses on the resilience of timing channels to jamming attacks. In general, these attacks can completely disrupt communications when the jammer continuously emits a high power disturbing signal, i.e., when continuous jamming is performed.

We analyze the interactions between the jammer and the node whose transmissions are under attack, which we call the target node. Specifically, we assume that the target node wants to maximize the amount of information that can be transmitted per unit of time by means of the timing channel, whereas the jammer wants to minimize such amount of information while reducing the energy expenditure.

As the target node and the jammer have conflicting interests, we develop a game theoretical framework that models their interactions. We investigate both the case in which these two adversaries play their strategies simultaneously and the situation when the target node (the leader) anticipates the actions of the jammer (the follower). To this purpose, we study both the Nash Equilibria (NEs) and Stackelberg Equilibria (SEs) of our proposed games.

Advantages:

  • We model the interactions between a jammer and a target node as a jamming game.
  • We prove the existence, uniqueness, and convergence to the Nash equilibrium (NE) under best response dynamics.
  • We prove the existence and uniqueness of the equilibrium of the Stackelberg game where the target node plays as a leader and the jammer reacts consequently.
  • We investigate, in this latter Stackelberg scenario, the impact of imperfect knowledge of the jammer’s utility function on the achievable performance.
  • We conduct an extensive numerical analysis which shows that our proposed models capture well the main factors behind the utilization of timing channels, thus representing a promising framework for the design and understanding of such systems.

Modules:

Nash Equilibrium Analysis:

We study the Nash Equilibrium points (NEs), in which both players achieve their highest utility given the strategy profile of the opponent. In the following we also provide proofs of the existence, uniqueness, and convergence to the Nash Equilibrium under best response dynamics.

Existence of the Nash Equilibrium:

It is well known that the intersection points between the best response functions bT(y) and bJ(x) are the NEs of the game. Therefore, to demonstrate the existence of at least one NE, it suffices to prove that bT(y) and bJ(x) have one or more intersection points. In other words, it is sufficient to find one or more strategy pairs at which the two best response functions intersect.

Uniqueness of the Nash Equilibrium:

After proving the existence of the NE, we prove its uniqueness, that is, that there is only one strategy profile such that no player has an incentive to deviate unilaterally.

Convergence to the Nash Equilibrium:

We analyze the convergence of the game to the NE when players follow Best Response Dynamics (BRD). In BRD the game starts from any initial point (x(0), y(0)) ∈ S and, at each successive step, each player plays its strategy by following its best response function.
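
The BRD iteration described above can be sketched in a few lines. The best response functions `toy_b_t` and `toy_b_j` below are hypothetical linear maps chosen only so that the iteration visibly converges; the actual bT(y) and bJ(x) of the jamming game depend on utility functions that are not reproduced in this text.

```python
def best_response_dynamics(b_t, b_j, x0, y0, tol=1e-9, max_iter=1000):
    """Iterate x <- b_t(y), y <- b_j(x) from (x0, y0) until both stop moving."""
    x, y = x0, y0
    for _ in range(max_iter):
        # Each player answers the opponent's previous strategy.
        x_next, y_next = b_t(y), b_j(x)
        if abs(x_next - x) < tol and abs(y_next - y) < tol:
            return x_next, y_next
        x, y = x_next, y_next
    raise RuntimeError("best response dynamics did not converge")


# Hypothetical best responses (illustration only, not the paper's functions).
def toy_b_t(y):
    return 0.5 + 0.25 * y  # target node's reply to jammer strategy y


def toy_b_j(x):
    return 0.8 - 0.5 * x   # jammer's reply to target strategy x


x_star, y_star = best_response_dynamics(toy_b_t, toy_b_j, 0.0, 0.0)
```

The returned pair is a fixed point of both maps, i.e. an intersection of the two best response curves, which is exactly the NE characterization used in the existence argument.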

Performance Analysis:

The Stackelberg game allows the leader to achieve a utility which is at least equal to the utility achieved in the ordinary game at the NE, if we assume perfect knowledge, that is, the target node is completely aware of the utility function of the jammer and its parameters, and thus it is able to evaluate bJ(x). If, instead, some parameters in the utility function of the jammer are unknown at the target node, the target node has only imperfect knowledge of that function, and the impact of this on the achievable performance is investigated as well.
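
A small numerical sketch can make the leader's advantage concrete. Everything below is a toy model: the utility `toy_leader_utility`, the follower reply `toy_follower_reply`, the unit strategy interval, and the grid search are assumptions made for illustration and are not the utility functions analyzed in the paper.

```python
def toy_follower_reply(x):
    """Hypothetical jammer best response b_J(x), clipped to [0, 1]."""
    return min(1.0, max(0.0, x / 2.0))


def toy_leader_utility(x, y):
    """Hypothetical target-node utility: gain x * (1 - y) minus effort x^2 / 2."""
    return x * (1.0 - y) - 0.5 * x * x


def stackelberg_leader(utility, follower_reply, grid_size=1001):
    """Grid search for the leader strategy that anticipates the follower's reply."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    best_x = max(grid, key=lambda x: utility(x, follower_reply(x)))
    return best_x, utility(best_x, follower_reply(best_x))


leader_x, leader_utility = stackelberg_leader(toy_leader_utility, toy_follower_reply)

# In this toy game the simultaneous-move NE is x = 2/3, y = 1/3
# (x = 1 - y from the leader's first-order condition, y = x / 2).
nash_utility = toy_leader_utility(2.0 / 3.0, 1.0 / 3.0)
```

With perfect knowledge of the follower's reply, the leader here obtains 0.25 against roughly 0.222 at the NE. Replacing `toy_follower_reply` with a mis-specified version is one way to probe the imperfect-knowledge case.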

Conclusion:

Our system implementation proposes a game-theoretic model of the interactions between a jammer and a communication node that exploits a timing channel to improve resilience to jamming attacks. Structural properties of the utility functions of the two players have been analyzed and exploited to prove the existence and uniqueness of the Nash Equilibrium. The convergence of the game to the Nash Equilibrium has been studied and proved by analyzing the best response dynamics. Furthermore, as the reactive jammer is assumed to start transmitting its interference signal only after detecting activity of the node under attack, a Stackelberg game has been investigated, and proofs of the existence and uniqueness of the Stackelberg Equilibrium have been provided.

DATA-STREAM-BASED INTRUSION DETECTION SYSTEM FOR ADVANCED METERING INFRASTRUCTURE IN SMART GRID: A FEASIBILITY STUDY

ABSTRACT:

In this paper, we focus on the security of advanced metering infrastructure (AMI), which is one of the most crucial components of the smart grid (SG). AMI serves as a bridge for providing bidirectional information flow between user domain and utility domain. AMI’s main functionalities encompass power measurement facilities, assisting adaptive power pricing and demand side management, providing self-healing ability, and interfaces for other systems.

AMI is usually composed of three major types of components, namely, smart meter, data concentrator, and central system (a.k.a. AMI headend) and bidirectional communication networks among those components. AMI is exposed to various security threats such as privacy breach, energy theft, illegal monetary gain, and other malicious activities. As AMI is directly related to revenue earning, customer power consumption, and privacy, of utmost importance is securing its infrastructure. In order to protect AMI from malicious attacks, we look into the intrusion detection system (IDS) aspect of security solution.

We can define IDS as a monitoring system for detecting any unwanted entity in a targeted system (such as AMI in our context). We treat IDS as a second-line security measure after the first line of primary AMI security techniques such as encryption, authorization, and authentication. Because changing specifications in all key IDS sensors would be expensive and cumbersome, in this paper we choose to employ anomaly-based IDS using data mining approaches.

INTRODUCTION

Smart grid (SG) is a set of technologies that integrate modern information technologies with the present power grid system. Along with many other benefits, two-way communication, updating users about their consumption behavior, controlling home appliances and other smart components remotely, and monitoring the power grid’s stability are unique features of SG. To facilitate such novel features, SG needs to incorporate many new devices and services. For communicating with, monitoring, and controlling these devices and services, there may also be a need for many new protocols and standards. However, the combination of all these new devices, services, protocols, and standards makes SG a very complex system that is vulnerable to increased security threats, as any other complex system is. In particular, because of its bidirectional, interoperable, and software-oriented nature, SG is very prone to cyber attacks. If proper security measures are not taken, a cyber attack on SG can potentially have a catastrophic impact on the whole grid and, thus, on society. Thus, cyber security in SG is treated as one of the vital issues by the National Institute of Standards and Technology and the Federal Energy Regulatory Commission.

In this paper, we will focus on the security of advanced metering infrastructure (AMI), which is one of the most crucial components of SG. AMI serves as a bridge for providing bidirectional information flow between user domain and utility domain [2]. AMI’s main functionalities encompass power measurement facilities, assisting adaptive power pricing and demand side management, providing self-healing ability, and interfaces for other systems. AMI is usually composed of three major types of components, namely, smart meter, data concentrator, and central system (a.k.a. AMI headend) and bidirectional communication networks among those components. Being a complex system in itself, AMI is exposed to various security threats such as privacy breach, energy theft, illegal monetary gain, and other malicious activities. As AMI is directly related to revenue earning, customer power consumption, and privacy, of utmost importance is securing its infrastructure.

LITERATURE SURVEY

EFFICIENT AUTHENTICATION SCHEME FOR DATA AGGREGATION IN SMART GRID WITH FAULT TOLERANCE AND FAULT DIAGNOSIS

PUBLISH: IEEE Power Energy Soc. Conf. ISGT, 2012, pp. 1–8.

AUTHOR: D. Li, Z. Aung, J. R. Williams, and A. Sanchez

EXPLANATION:

Authentication schemes relying on per-packet signature and per-signature verification introduce heavy costs for computation and communication. Because of the smart grid’s constrained resources, its authentication requirements cannot be satisfied by such schemes. Most importantly, the smart grid’s demand for high availability must be underscored. In this paper, we present an efficient and robust approach to authenticate data aggregation in smart grid by deploying signature aggregation, batch verification, and signature amortization schemes to lessen communication overhead, reduce the number of signing and verification operations, and provide fault tolerance. Corresponding fault diagnosis algorithms are contributed to pinpoint forged or erroneous signatures. Both experimental results and performance evaluation demonstrate our computational and communication gains.

CYBER SECURITY ISSUES FOR ADVANCED METERING INFRASTRUCTURE (AMI)

PUBLISH: IEEE Power Energy Soc. Gen. Meet. – Convers. Del. Electr. Energy 21st Century, 2008, pp. 1–5.

AUTHOR: F. M. Cleveland

EXPLANATION:

Advanced Metering Infrastructure (AMI) is becoming of increasing interest to many stakeholders, including utilities, regulators, energy markets, and a society concerned about conserving energy and responding to global warming. AMI technologies, rapidly overtaking the earlier Automated Meter Reading (AMR) technologies, are being developed by many vendors, with portions being developed by metering manufacturers, communications providers, and back-office Meter Data Management (MDM) IT vendors. In this flurry of excitement, very little effort has yet been focused on the cyber security of AMI systems. The comment usually is “Oh yes, we will encrypt everything – that will make everything secure.” That comment indicates unawareness of the possible security threats to AMI – a technology that will reach into a large majority of residences and virtually all commercial and industrial customers. What if, for instance, remote connect/disconnect were included as one AMI capability – a function of great interest to many utilities as it avoids truck rolls? What if a smart kid hacker in his basement cracked the security of his AMI system, and sent out 5 million disconnect commands to all customer meters on the AMI system?

INTRUSION DETECTION FOR ADVANCED METERING INFRASTRUCTURES: REQUIREMENTS AND ARCHITECTURAL DIRECTIONS

PUBLISH: IEEE Int. Conf. SmartGridComm, 2010, pp. 350–355.

AUTHOR: R. Berthier, W. H. Sanders, and H. Khurana

EXPLANATION:

The security of Advanced Metering Infrastructures (AMIs) is of critical importance. The use of secure protocols and the enforcement of strong security properties have the potential to prevent vulnerabilities from being exploited and from having costly consequences. However, as learned from experiences in IT security, prevention is one aspect of a comprehensive approach that must also include the development of a complete monitoring solution. In this paper, we explore the practical needs for monitoring and intrusion detection through a thorough analysis of the different threats targeting an AMI. In order to protect AMI from malicious attacks, we look into the intrusion detection system (IDS) aspect of security solution. We can define IDS as a monitoring system for detecting any unwanted entity into a targeted system (such as AMI in our context). We treat IDS as a second line security measure after the first line of primary AMI security techniques such as encryption, authorization, and authentication, such as [3]. However, Cleveland [4] stressed that these first line security solutions alone are not sufficient for securing AMI.

MOA: MASSIVE ONLINE ANALYSIS, A FRAMEWORK FOR STREAM CLASSIFICATION AND CLUSTERING

PUBLISH: JMLR Workshop Conf. Proc., Workshop Appl. Pattern Anal., 2010, vol. 11, pp. 44–50.

AUTHOR: A. Bifet, G. Holmes, B. Pfahringer, P. Kranen, H. Kremer, T. Jansen, and T. Seidl

EXPLANATION:

In today’s applications, massive, evolving data streams are ubiquitous. Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA is designed to deal with the challenging problems of scaling up the implementation of state of the art algorithms to real world dataset sizes and of making algorithms comparable in benchmark streaming settings. It contains a collection of offline and online algorithms for both classification and clustering as well as tools for evaluation. Researchers benefit from MOA by getting insights into workings and problems of different approaches, practitioners can easily compare several algorithms and apply them to real world data sets and settings. MOA supports bi-directional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and is released under the GNU GPL license. Besides providing algorithms and measures for evaluation and comparison, MOA is easily extensible with new contributions and allows the creation of benchmark scenarios through storing and sharing setting files.

SECURING ADVANCED METERING INFRASTRUCTURE USING INTRUSION DETECTION SYSTEM WITH DATA STREAM MINING

PUBLISH: Proc. PAISI, 2012, vol. 7299, pp. 96–111

AUTHOR: M. A. Faisal, Z. Aung, J. Williams, and A. Sanchez

EXPLANATION:

Advanced metering infrastructure (AMI) is an imperative component of the smart grid, as it is responsible for collecting, measuring, and analyzing energy usage data, and transmitting these data to the data concentrator and then to a central system on the utility side. Therefore, the security of AMI is one of the most demanding issues in smart grid implementation. In this paper, we propose an intrusion detection system (IDS) architecture for AMI which will complement other security measures. This IDS architecture consists of three local IDSs placed in smart meters, data concentrators, and the central system (AMI headend). For anomaly detection, we use a data stream mining approach on the public KDD CUP 1999 data set to analyze the requirements of the three components in AMI. Our results and analysis show that the stream data mining technique has promising potential for solving security issues in AMI.

DATA STREAM MINING ARCHITECTURE FOR NETWORK INTRUSION DETECTION

PUBLISH: IEEE Int. Conf. IRI, 2004, pp. 363–368

AUTHOR: N. C. N. Chu, A. Williams, R. Alhajj, and K. Barker

EXPLANATION:

In this paper, we propose a stream mining architecture which is based on a single-pass approach. Our approach can be used to develop efficient, effective, and active intrusion detection mechanisms which satisfy the near real-time requirements of processing data streams on a network with minimal overhead. The key idea is that new patterns can now be detected on-the-fly. They are flagged as network attacks or labeled as normal traffic, based on the current network trend, thus reducing the false alarm rates prevalent in active network intrusion systems and increasing the low detection rate which characterizes passive approaches.

RESEARCH ON DATA MINING TECHNOLOGIES APPLYING INTRUSION DETECTION

PUBLISH: Proc. IEEE ICEMMS, 2010, pp. 230–233

AUTHOR: Z. Qun and H. Wen-Jie

EXPLANATION:

Intrusion detection is one of the main research directions in network security technology. Data mining technology, applied to a network intrusion detection system (NIDS), can automatically discover new patterns from massive network data, reducing the workload of manually compiling intrusion behavior patterns and normal behavior patterns. This article briefly reviews current intrusion detection technology and data mining technology. It focuses on specific applications of data mining algorithms in anomaly detection and misuse detection. For misuse detection, it mainly studies classification algorithms; for anomaly detection, it mainly studies pattern comparison and clustering algorithms. Within pattern comparison, it analyzes association rules and sequence rules in depth. Finally, it analyzes the difficulties that current data mining algorithms face in intrusion detection applications and indicates the next research direction.

AN EMBEDDED INTRUSION DETECTION SYSTEM MODEL FOR APPLICATION PROGRAM

PUBLISH: IEEE PACIIA, 2008, vol. 2, pp. 910–912.

AUTHOR: S. Wu and Y. Chen

EXPLANATION:

Intrusion detection is an effective security mechanism developed in the recent decade. Because of its wide applicability, intrusion detection has become a key part of the security mechanism. The modern technologies and models in the intrusion detection field are categorized and studied. The characteristics of current practical IDSs are introduced. The theories and realization of application-based IDS are presented. The basic ideas of how to design and realize an embedded IDS for applications are proposed.

ACCURACY UPDATED ENSEMBLE FOR DATA STREAMS WITH CONCEPT DRIFT

PUBLISH: Proc. 6th Int. Conf. HAIS Part II, 2011, pp. 155–163.

AUTHOR: D. Brzeziński and J. Stefanowski

EXPLANATION:

In this paper we study the problem of constructing accurate block-based ensemble classifiers from time evolving data streams. AWE is the best-known representative of these ensembles. We propose a new algorithm called Accuracy Updated Ensemble (AUE), which extends AWE by using online component classifiers and updating them according to the current distribution. Additional modifications of weighting functions solve problems with the undesired exclusion of classifiers seen in AWE. Experiments with several evolving data sets show that, while still requiring constant processing time and memory, AUE is more accurate than AWE.

ACTIVE LEARNING WITH EVOLVING STREAMING DATA

PUBLISH: Proc. ECML-PKDD Part III, 2011, pp. 597–612.

AUTHOR: I. Žliobaitė, A. Bifet, B. Pfahringer, and G. Holmes

EXPLANATION:

In learning to classify streaming data, obtaining the true labels may require major effort and may incur excessive cost. Active learning focuses on learning an accurate model with as few labels as possible. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and classifiers need to adapt. Conventional active learning strategies concentrate on querying the most uncertain instances, which are typically concentrated around the decision boundary. If changes do not occur close to the boundary, they will be missed and classifiers will fail to adapt. In this paper we develop two active learning strategies for streaming data that explicitly handle concept drift. They are based on uncertainty, dynamic allocation of labeling efforts over time and randomization of the search space. We empirically demonstrate that these strategies react well to changes that can occur anywhere in the instance space and unexpectedly.

LEARNING FROM TIME-CHANGING DATA WITH ADAPTIVE WINDOWING

PUBLISH: Proc. SIAM Int. Conf. SDM, 2007, pp. 443–448.

AUTHOR: A. Bifet and R. Gavaldà

EXPLANATION:

We present a new approach for dealing with distribution change and concept drift when learning from data sequences that may vary with time. We use sliding windows whose size, instead of being fixed a priori, is recomputed online according to the rate of change observed from the data in the window itself. This delivers the user or programmer from having to guess a time-scale for change. Contrary to many related works, we provide rigorous guarantees of performance, as bounds on the rates of false positives and false negatives. Using ideas from data stream algorithmics, we develop a time- and memory-efficient version of this algorithm, called ADWIN2. We show how to combine ADWIN2 with the Naïve Bayes (NB) predictor, in two ways: one, using it to monitor the error rate of the current model and declare when revision is necessary and, two, putting it inside the NB predictor to maintain up-to-date estimations of conditional probabilities in the data. We test our approach using synthetic and real data streams and compare them to both fixed-size and variable-size window strategies with good results.
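The adaptive-window idea described above can be illustrated with a short sketch. It is a simplified illustration only, not the memory-efficient ADWIN2 variant: it stores every value, assumes inputs lie in [0, 1], and uses a Hoeffding-style cut threshold. The class name AdaptiveWindow, the method names, and the default delta are assumptions made for this example.

```python
import math
from collections import deque


class AdaptiveWindow:
    """Simplified adaptive sliding window over values assumed to lie in [0, 1]."""

    def __init__(self, delta=0.002):
        self.delta = delta      # confidence parameter (assumed default)
        self.window = deque()   # oldest value sits on the left

    def _has_cut(self):
        """Return True if some split of the window has significantly different means."""
        n = len(self.window)
        if n < 2:
            return False
        values = list(self.window)
        total = sum(values)
        # ln(4 / delta') with delta' = delta / n
        log_term = math.log(4.0 * n / self.delta)
        head_sum = 0.0
        for i in range(1, n):
            head_sum += values[i - 1]
            n0, n1 = i, n - i
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(log_term / (2.0 * m))
            if abs(head_sum / n0 - (total - head_sum) / n1) > eps_cut:
                return True
        return False

    def add(self, value):
        """Append a value, then drop the oldest values while a change is detected."""
        self.window.append(value)
        shrunk = False
        while self._has_cut():
            self.window.popleft()
            shrunk = True
        return shrunk

    def mean(self):
        """Average of the values currently kept in the window."""
        return sum(self.window) / len(self.window)
```

Feeding the window a stream whose mean jumps abruptly makes it discard most of the pre-change values, so the window size follows the observed rate of change rather than a fixed, pre-selected length.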

DATA-DRIVEN COMPOSITION FOR SERVICE-ORIENTED SITUATIONAL WEB APPLICATIONS

ABSTRACT:

This paper presents a systematic data-driven approach to assisting situational application development. We first propose a technique that extracts useful information from multiple sources to abstract service capabilities as sets of tags. This supports intuitive expression of users’ desired composition goals through simple queries, without requiring knowledge of the underlying technical details. A planning technique then explores composition solutions that can satisfy the desired goals, and may also surface new and interesting composition opportunities. A browser-based tool facilitates visual and iterative refinement of composition solutions until the outputs are satisfactory. A series of experiments demonstrates the efficiency and effectiveness of our approach. We first presented a data-driven composition technique for situational web applications using tag-based semantics in our ICWS 2011 work. The main contributions of this paper are to illustrate the overall life-cycle of our “compose-as-you-search” composition approach, to propose the clustering technique for deriving tag-based composition semantics, and to evaluate the composition planning effectiveness.

Compared with previous work, this paper is significantly updated by introducing a semi-supervised technique for clustering hierarchical tag-based semantics from service documentation and human annotations. The derived semantics link service capabilities and developers’ processing goals, so that the composition is processed by planning the “Tag HyperLinks” from the initial query to the goals. The planning algorithm is also further evaluated in terms of recommendation quality, performance, and scalability over data sets from real-world service repositories. Results show that our approach achieves satisfactory precision and high-quality composition recommendations. We also demonstrate that our approach can accommodate even larger numbers of services than real-world repositories contain, with promising performance. In addition, more details of our interactive development prototype are presented. We particularly demonstrate how the composition UI can help developers intuitively compose situational applications and iteratively refine their goals until requirements are finally satisfied.

INTRODUCTION:

We develop and deliver software systems more quickly, and these systems must provide increasingly ambitious functionality to adapt to ever-changing requirements and environments. In particular, in the past few years, the emergence and wide adoption of Web 2.0 have enlarged the body of service computing research. Web 2.0 not only focuses on resource sharing and utilization from the user and social perspective, but also embodies the “Web as a Platform” paradigm. A very important trend is that more and more service consumers (including programmers, business analysts, and even end-users) are capable of participating and collaborating to meet their own requirements and interests by developing situational software applications (also known as “situated software”).

From a software engineering perspective, situational software applications usually follow an opportunistic development style, where small subsets of users create applications to fulfill a specific purpose. Currently, composing available web-delivered services (including SOAP-based web services, REST (REpresentational State Transfer) web services, and RSS/Atom feeds) into a single web application, a so-called “service mashup” (or “mashup” for short), has become popular. Mashups are supposed to respond flexibly to new needs or demands and to allow quick roll-out of potentially unanticipated functionality. To support situational application development, a number of tools from both academia and industry have emerged.

However, we argue that the large number of available services and the complexity of composition constraints make manual composition difficult. For situational application developers, who might be non-professional programmers, the key remaining challenge is that they want to represent their desired goals simply and intuitively, and to be quickly navigated to proper services that can respond to their requests. They usually do not care about (or understand) the underlying technical details (e.g., syntax, semantics, message mediation, etc.). They just want to figure out all the intermediate steps needed to generate the desired outputs.

Moreover, many end-users may have a general sense of what they are trying to achieve, but not know the specifics of what they want or what is possible. This means that the process of designing and developing a situational application requires not only the abstraction of individual services, but also a much broader perspective on the evolving collections of services that can potentially be incorporated with current ones. We first presented a data-driven composition technique for situational web applications using tag-based semantics in our ICWS 2011 work.

The main contributions of this paper are to illustrate the overall life-cycle of our “compose-as-you-search” composition approach, to propose the clustering technique for deriving tag-based composition semantics, and to evaluate the composition planning effectiveness. Compared with previous work, this paper is significantly updated by introducing a semi-supervised technique for clustering hierarchical tag-based semantics from service documentation and human annotations. The derived semantics link service capabilities and developers’ processing goals, so that the composition is processed by planning the “Tag HyperLinks” from the initial query to the goals.

The planning algorithm is also further evaluated in terms of recommendation quality, performance, and scalability over data sets from real-world service repositories. Results show that our approach achieves satisfactory precision and high-quality composition recommendations. We also demonstrate that our approach can accommodate even larger numbers of services than real-world repositories contain, with promising performance. In addition, more details of our interactive development prototype are presented. We particularly demonstrate how the composition UI can help developers intuitively compose situational applications and iteratively refine their goals until requirements are finally satisfied.

SCOPE OF THE PROJECT

User-oriented abstraction: The tourist uses tags to represent their desired goals and find relevant services. Tags provide a uniform abstraction of user requirements and service capabilities, and lower the entry barrier to perform development. 

Data-driven development: In the whole development process, the tourist selects or inputs some tags, while some relevant services are recommended. This reflects a “Compose-as-you-Search” development process. Recommended services either process these tags as inputs, or produce these tags as outputs. As shown in Fig. 1, each service has some inputs and outputs, which are associated with tagged data. In this way, services can be connected to build data flows. Developers can search their goals by means of tags, and compose recommended services in a data driven fashion. 
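The way tagged inputs and outputs connect services into a data flow can be sketched as below. This is a minimal illustration that assumes a simple greedy forward-chaining search, which is not necessarily the planning algorithm used in the paper; the Service tuple, the function plan_composition, and any service or tag names used with it are hypothetical.

```python
from collections import namedtuple

# A service consumes data carrying its input tags and produces data carrying its output tags.
Service = namedtuple("Service", ["name", "inputs", "outputs"])


def plan_composition(services, query_tags, goal_tags):
    """Greedily chain services from the query tags until all goal tags are produced.

    Returns the ordered list of service names, or None if the goals cannot be reached.
    The plan may contain services that are not strictly needed for the goals.
    """
    available = set(query_tags)
    goals = set(goal_tags)
    remaining = list(services)
    plan = []
    while not goals <= available:
        # Pick the first service whose inputs are all available and that adds a new tag.
        usable = next(
            (s for s in remaining
             if set(s.inputs) <= available and not set(s.outputs) <= available),
            None,
        )
        if usable is None:
            return None  # no service can extend the data flow any further
        plan.append(usable.name)
        available |= set(usable.outputs)
        remaining.remove(usable)
    return plan
```

For instance, given a hypothetical hotel-search service that maps the tag "city" to "address" and a map service that maps "address" to "map", a query tagged "city" with the goal "map" would chain the two services in that order.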

Potential composition navigation: The developer is always assisted with possible composition suggestions, based on the tags in the current goals. The composition engine interprets the user queries and automatically generates appropriate composition alternatives by a planning algorithm (Section 4). The recommendations not only contain the desired outputs from the developers’ goals, but also offer interesting or relevant suggestions leading to potential new composition possibilities.

For example, the tag “Italian” introduced the Google Translation service, a composition possibility that the tourist was not aware of. In this way, the composition process differs from traditional semantic web service techniques, which might need specific goals, and instead leads to emergent opportunities according to the current application situation.

LITERATURE SURVEY:

COMPOSING DATA-DRIVEN SERVICE MASHUPS WITH TAG-BASED SEMANTIC ANNOTATIONS

AUTHOR: X. Liu, Q. Zhao, G. Huang, H. Mei, and T. Teng

PUBLISH: Proc. IEEE Int’l Conf. Web Services (ICWS ’11), pp. 243-250, 2011.

EXPLANATION:

Spurred by the Web 2.0 paradigm, large numbers of service mashups have emerged that compose readily accessible data and services. Mashups usually address situational problems and require a quick and iterative development lifecycle. In this paper, we propose an approach to composing data-driven mashups, based on tag-based semantics. The core principle is deriving semantic annotations from popular tags and associating them with programmatic input and output data. Tag-based semantics promise a quick and simple comprehension of data capabilities. Mashup developers, including end-users, can intuitively search for desired services with tags and combine several services by means of data flows. Our approach uses a planning technique to retrieve the potentially relevant composition opportunities. With our graphical composition user interfaces, developers can iteratively modify, adjust, and refine their mashups to be more satisfying.

TOWARDS AUTOMATIC TAGGING FOR WEB SERVICES

AUTHOR: L. Fang, L. Wang, M. Li, J. Zhao, Y. Zou, and L. Shao

PUBLISH: Proc. IEEE 19th Int’l Conf. Web Services, pp. 528-535, 2012.

EXPLANATION:

Tagging technique is widely used to annotate objects in Web 2.0 applications. Tags can support web service understanding, categorizing and discovering, which are important tasks in a service-oriented software system. However, most of existing web services’ tags are annotated manually. Manual tagging is time-consuming. In this paper, we propose a novel approach to tag web services automatically. Our approach consists of two tagging strategies, tag enriching and tag extraction. In the first strategy, we cluster web services using WSDL documents, and then we enrich tags for a service with the tags of other services in the same cluster. Considering our approach may not generate enough tags by tag enriching, we also extract tags from WSDL documents and related descriptions in the second step. To validate the effectiveness of our approach, a series of experiments are carried out based on web-scale web services. The experimental results show that our tagging method is effective, ensuring the number and quality of generated tags. We also show how to use tagging results to improve the performance of a web service search engine, which can prove that our work in this paper is useful and meaningful.
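The tag-enriching strategy described above can be sketched as follows. This is a minimal illustration that assumes cluster assignments are already available (the paper derives them from WSDL documents, which is not reproduced here); the function name enrich_tags and the min_votes threshold are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def enrich_tags(service_tags, service_cluster, min_votes=1):
    """Add to each service the tags used by other services in the same cluster.

    service_tags:    dict mapping service name -> iterable of its existing tags
    service_cluster: dict mapping service name -> cluster identifier
    min_votes:       how many cluster mates must carry a tag before it is borrowed
    """
    members = defaultdict(list)
    for service, cluster in service_cluster.items():
        members[cluster].append(service)

    enriched = {}
    for service, cluster in service_cluster.items():
        votes = Counter()
        for other in members[cluster]:
            if other != service:
                votes.update(set(service_tags.get(other, ())))
        borrowed = {tag for tag, count in votes.items() if count >= min_votes}
        enriched[service] = set(service_tags.get(service, ())) | borrowed
    return enriched
```

Raising min_votes makes the enrichment more conservative, since a tag is then borrowed only when several services in the cluster agree on it.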

A TAG-BASED APPROACH FOR THE DESIGN AND COMPOSITION OF INFORMATION PROCESSING APPLICATIONS

AUTHOR: E. Bouillet, M. Feblowitz, Z. Liu, A. Ranganathan, and A. Riabov

PUBLISH: ACM SIGPLAN Notices, vol. 43, no. 10, pp. 585-602, Sept. 2008.

EXPLANATION:

In the realm of component-based software systems, pursuers of the holy grail of automated application composition face many significant challenges. In this paper we argue that, while the general problem of automated composition in response to high-level goal statements is indeed very difficult to solve, we can realize composition in a restricted context, supporting varying degrees of manual to automated assembly for specific types of applications. We propose a novel paradigm for composition in flow-based information processing systems, where application design and component development are facilitated by the pervasive use of faceted, tag-based descriptions of processing goals, of component capabilities, and of structural patterns of families of application. The facets and tags represent different dimensions of both data and processing, where each facet is modeled as a finite set of tags that are defined in a controlled folksonomy. All data flowing through the system, as well as the functional capabilities of components are described using tags. A customized AI planner is used to automatically build an application, in the form of a flow of components, given a high-level goal specification in the form of a set of tags. End-users use an automatically populated faceted search and navigation mechanism to construct these high-level goals. We also propose a novel software engineering methodology to design and develop a set of reusable, well-described components that can be assembled into a variety of applications. With examples from a case study in the Financial Services domain, we demonstrate that composition using a faceted, tag-based application design is not only possible, but also extremely useful in helping end-users create situational applications from a wide variety of available components.

DATA COLLECTION IN MULTI-APPLICATION SHARING WIRELESS SENSOR NETWORKS

ABSTRACT:

Data sharing for data collection among multiple applications is an efficient way to reduce communication cost in Wireless Sensor Networks (WSNs). This paper is the first work to introduce the interval data sharing problem, which investigates how to transmit as little data as possible over the network while ensuring that the transmitted data satisfies the requirements of all the applications. Unlike current studies, where each application requires a single data sample during each task, we study the problem where each application requires a continuous interval of data sampling in each task. The proposed problem is a nonlinear nonconvex optimization problem. To lower the high complexity of solving such a problem in resource-restricted WSNs, a 2-factor approximation algorithm with O(n^2) time complexity and O(n) memory complexity is provided. A special instance of this problem is also analyzed. This special instance can be solved with a dynamic programming algorithm in polynomial time, which gives an optimal result in O(n^2) time complexity and O(n) memory complexity. Three online algorithms are provided to process the continually arriving tasks. Both the theoretical analysis and simulation results demonstrate the effectiveness of the proposed algorithms.

COST-AWARE SECURE ROUTING (CASER) PROTOCOL DESIGN FOR WIRELESS SENSOR NETWORKS

ABSTRACT:

Lifetime optimization and security are two conflicting design issues for multi-hop wireless sensor networks (WSNs) with non-replenishable energy resources. In this paper, we first propose a novel secure and efficient Cost-Aware SEcure Routing (CASER) protocol to address these two conflicting issues through two adjustable parameters: energy balance control (EBC) and probabilistic based random walking. We then discover that the energy consumption is severely disproportional to the uniform energy deployment for the given network topology, which greatly reduces the lifetime of the sensor networks. We propose an efficient non-uniform energy deployment strategy to optimize the lifetime and message delivery ratio under the same energy resource and security requirement. We also provide a quantitative security analysis on the proposed routing protocol.

Our theoretical analysis and Java simulation results demonstrate that the proposed CASER protocol can provide an excellent tradeoff between routing efficiency and energy balance, and can significantly extend the lifetime of the sensor networks in all scenarios. For the non-uniform energy deployment, our analysis shows that we can increase the lifetime and the total number of messages that can be delivered by more than four times under the same assumption. We also demonstrate that the proposed CASER protocol can achieve a high message delivery ratio while preventing routing traceback attacks.

INTRODUCTION:

The recent technological advances make wireless sensor networks (WSNs) technically and economically feasible to be widely used in both military and civilian applications, such as monitoring of ambient conditions related to the environment, precious species and critical infrastructures. A key feature of such networks is that each network consists of a large number of untethered and unattended sensor nodes. These nodes often have very limited and non-replenishable energy resources, which makes energy an important design issue for these networks. Routing is another very challenging design issue for WSNs. A properly designed routing protocol should not only ensure high message delivery ratio and low energy consumption for message delivery, but also balance the entire sensor network energy consumption, and thereby extend the sensor network lifetime.

WSNs rely on wireless communications, which is by nature a broadcast medium. It is more vulnerable to security attacks than its wired counterpart due to lack of a physical boundary. In particular, in the wireless sensor domain, anybody with an appropriate wireless receiver can monitor and intercept the sensor network communications. The adversaries may use expensive radio transceivers, powerful workstations and interact with the network from a distance since they are not restricted to using sensor network hardware. It is possible for the adversaries to perform jamming and routing traceback attacks. Motivated by the fact that WSN routing is often geography-based, we propose a geography-based secure and efficient Cost-Aware SEcure routing (CASER) protocol for WSNs without relying on flooding.

CASER allows messages to be transmitted using two routing strategies, random walking and deterministic routing, in the same framework. The distribution of these two strategies is determined by the specific security requirements. This scenario is analogous to delivering US Mail through USPS: express mail costs more than regular mail, but it is delivered faster. The protocol also provides a secure message delivery option to maximize the message delivery ratio under adversarial attacks. In addition, we give a quantitative security analysis of the proposed routing protocol based on previously proposed criteria. The CASER protocol has two major advantages: (i) It ensures balanced energy consumption across the entire sensor network so that the lifetime of the WSN can be maximized. (ii) It supports multiple routing strategies based on the routing requirements, including fast/slow message delivery and secure message delivery to prevent routing traceback attacks and malicious traffic jamming attacks in WSNs.

The contributions of this paper can be summarized as follows:

1) We propose a secure and efficient Cost-Aware SEcure Routing (CASER) protocol for WSNs. In this protocol, cost-aware based routing strategies can be applied to address the message delivery requirements.

2) We devise a quantitative scheme to balance the energy consumption so that both the sensor network lifetime and the total number of messages that can be delivered are maximized under the same energy deployment (ED).

3) We develop theoretical formulas to estimate the number of routing hops in CASER under varying routing energy balance control (EBC) and security requirements.

4) We quantitatively analyze security of the proposed routing algorithm.

5) We provide an optimal non-uniform energy deployment (noED) strategy for the given sensor networks based on the energy consumption ratio. Our theoretical and simulation results both show that under the same total energy deployment, we can increase the lifetime and the number of messages that can be delivered more than four times in the non-uniform energy deployment scenario.

LITERATURE SURVEY:

QUANTITATIVE MEASUREMENT AND DESIGN OF SOURCE-LOCATION PRIVACY SCHEMES FOR WIRELESS SENSOR NETWORKS

AUTHOR: Y. Li, J. Ren, and J. Wu

PUBLISH: IEEE Trans. Parallel Distrib. Syst., vol. 23, no. 7, pp. 1302–1311, Jul. 2012.

EXPLANATION:

Wireless sensor networks (WSNs) have been widely used in many areas for critical infrastructure monitoring and information collection. While confidentiality of the message can be ensured through content encryption, it is much more difficult to adequately address source-location privacy (SLP). For WSNs, SLP service is further complicated by the nature that the sensor nodes generally consist of low-cost and low-power radio devices. Computationally intensive cryptographic algorithms (such as public-key cryptosystems), and large scale broadcasting-based protocols may not be suitable. In this paper, we first propose criteria to quantitatively measure source-location information leakage in routing-based SLP protection schemes for WSNs. Through this model, we identify vulnerabilities of some well-known SLP protection schemes. We then propose a scheme to provide SLP through routing to a randomly selected intermediate node (RSIN) and a network mixing ring (NMR). Our security analysis, based on the proposed criteria, shows that the proposed scheme can provide excellent SLP. The comprehensive simulation results demonstrate that the proposed scheme is very efficient and can achieve a high message delivery ratio. We believe it can be used in many practical applications.

PROVIDING HOP-BY-HOP AUTHENTICATION AND SOURCE PRIVACY IN WIRELESS SENSOR NETWORKS

AUTHOR: Y. Li, J. Li, J. Ren, and J. Wu

PUBLISH: IEEE Conf. Comput. Commun. Mini-Conf., Orlando, FL, USA, Mar. 2012, pp. 3071–3075.

EXPLANATION:

Message authentication is one of the most effective ways to thwart unauthorized and corrupted traffic from being forwarded in wireless sensor networks (WSNs). To provide this service, a polynomial-based scheme was recently introduced. However, this scheme and its extensions all have the weakness of a built-in threshold determined by the degree of the polynomial: when the number of messages transmitted is larger than this threshold, the adversary can fully recover the polynomial. In this paper, we propose a scalable authentication scheme based on elliptic curve cryptography (ECC). While enabling intermediate node authentication, our proposed scheme allows any node to transmit an unlimited number of messages without suffering the threshold problem. In addition, our scheme can also provide message source privacy. Both theoretical analysis and simulation results demonstrate that our proposed scheme is more efficient than the polynomial-based approach in terms of communication and computational overhead under comparable security levels while providing message source privacy.

SOURCE-LOCATION PRIVACY THROUGH DYNAMIC ROUTING IN WIRELESS SENSOR NETWORKS

AUTHOR: Y. Li and J. Ren

PUBLISH: IEEE INFOCOM 2010, San Diego, CA, USA, Mar. 15–19, 2010, pp. 1–9.

EXPLANATION:

Wireless sensor networks (WSNs) have the potential to be widely used in many areas for unattended event monitoring. Mainly due to lack of a protected physical boundary, wireless communications are vulnerable to unauthorized interception and detection. Privacy is becoming one of the major issues that jeopardize the successful deployment of wireless sensor networks. While confidentiality of the message can be ensured through content encryption, it is much more difficult to adequately address source-location privacy. For WSNs, source-location privacy service is further complicated by the fact that the sensor nodes consist of low-cost and low-power radio devices, so computationally intensive cryptographic algorithms and large-scale broadcasting-based protocols are not suitable for WSNs. In this paper, we propose source-location privacy schemes through routing to randomly selected intermediate node(s) before the message is transmitted to the SINK node. We first describe routing through a single randomly selected intermediate node away from the source node. Our analysis shows that this scheme can provide great local source-location privacy. We also present routing through multiple randomly selected intermediate nodes based on angle and quadrant to further improve the global source-location privacy. While providing source-location privacy for WSNs, our simulation results also demonstrate that the proposed schemes are very efficient in energy consumption, and have very low transmission latency and a high message delivery ratio. Our protocols can be used for many practical applications.

SYSTEM ANALYSIS:

EXISTING SYSTEM:

In Geographic and Energy Aware Routing (GEAR), the sink node disseminates requests with geographic attributes to the target region instead of using flooding. Each node forwards messages to its neighboring nodes based on estimated cost and learning cost. In other existing schemes, source-location privacy is provided through broadcasting that mixes valid messages with dummy messages. The transmission of dummy messages not only consumes a significant amount of sensor energy, but also increases network collisions and decreases the packet delivery ratio. In the phantom routing protocol, each message is routed from the actual source to a phantom source along a designed directed walk through either a sector-based approach or a hop-based approach. The direction/sector information is stored in the header of the message. In this way, the phantom source can be away from the actual source. Unfortunately, once the message is captured on the random walk path, the adversaries are able to get the direction/sector information stored in the header of the message.

DISADVANTAGES:

  • Higher energy consumption
  • Increased network collisions
  • Reduced packet delivery ratio
  • Cannot provide full security for packets

PROPOSED SYSTEM:

We propose a secure and efficient Cost Aware Secure Routing (CASER) protocol that can address energy balance and routing security concurrently in WSNs. In CASER routing protocol, each sensor node needs to maintain the energy levels of its immediate adjacent neighboring grids in addition to their relative locations. Using this information, each sensor node can create varying filters based on the expected design tradeoff between security and efficiency. The quantitative security analysis demonstrates the proposed algorithm can protect the source location information from the adversaries. In this project, we will focus on two routing strategies for message forwarding: shortest path message forwarding, and secure message forwarding through random walking to create routing path unpredictability for source privacy and jamming prevention.

  • We propose a secure and efficient Cost-Aware SEcure Routing (CASER) protocol for WSNs. In this protocol, cost-aware based routing strategies can be applied to address the message delivery requirements.
  • We devise a quantitative scheme to balance the energy consumption so that both the sensor network lifetime and the total number of messages that can be delivered are maximized under the same energy deployment (ED).
  • We develop theoretical formulas to estimate the number of routing hops in CASER under varying routing energy balance control (EBC) and security requirements.
  • We quantitatively analyze security of the proposed routing algorithm. We provide an optimal non-uniform energy deployment (noED) strategy for the given sensor networks based on the energy consumption ratio.
  • Our theoretical and simulation results both show that under the same total energy deployment, we can increase the lifetime and the number of messages that can be delivered more than four times in the non-uniform energy deployment scenario.
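The forwarding decision described in the proposed system (an energy-based filter over neighboring grids, followed by either shortest-path or random-walk forwarding) can be sketched as below. This is a minimal illustration under stated assumptions, not the protocol's actual implementation: the function select_next_grid, the parameter names alpha (energy balance control) and beta (probability of random walking), and the neighbor data layout are all hypothetical.

```python
import math
import random


def select_next_grid(neighbors, sink, alpha, beta, rng=None):
    """Choose the next-hop grid for a message.

    neighbors: dict mapping grid id -> (x, y, remaining_energy)
    sink:      (x, y) position of the sink node
    alpha:     assumed EBC parameter; grids below alpha * average energy are filtered out
    beta:      assumed probability of taking a random-walk step instead of the shortest path
    rng:       optional random.Random instance (useful for reproducible runs)
    """
    if rng is None:
        rng = random
    average = sum(energy for _, _, energy in neighbors.values()) / len(neighbors)
    # Energy filter: keep only grids with enough remaining energy.
    candidates = [g for g, (_, _, energy) in neighbors.items()
                  if energy >= alpha * average]
    if not candidates:
        candidates = list(neighbors)  # fall back to all neighbors if the filter is too strict

    if rng.random() < beta:
        # Random walking: an unpredictable hop among the filtered grids.
        return rng.choice(candidates)

    def distance_to_sink(grid):
        x, y, _ = neighbors[grid]
        return math.hypot(x - sink[0], y - sink[1])

    # Deterministic forwarding: the filtered grid closest to the sink.
    return min(candidates, key=distance_to_sink)
```

With beta set to 0 every hop takes the shortest available path, while larger values of beta trade delivery speed for path unpredictability; raising alpha steers traffic away from grids that are low on energy.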

ADVANTAGES:

  • Reduces energy consumption
  • Provides stronger security for packets and routing
  • Increases the message delivery ratio
  • Reduces the time delay

HARDWARE & SOFTWARE REQUIREMENTS:

HARDWARE REQUIREMENT

  • Processor       –    Pentium IV
  • Speed       –    1 GHz
  • RAM       –    256 MB (min)
  • Hard Disk      –   20 GB
  • Floppy Drive       –    1.44 MB
  • Key Board      –    Standard Windows Keyboard
  • Mouse       –    Two or Three Button Mouse
  • Monitor      –    SVGA

SOFTWARE REQUIREMENTS:

  • Operating System        :           Windows XP or Win7
  • Front End       :           JAVA JDK 1.7
  • Tools :           Netbeans 7
  • Document :           MS-Office 2007