PASSIVE IP TRACEBACK: DISCLOSING THE LOCATIONS OF IP SPOOFERS FROM PATH BACKSCATTER

 ABSTRACT:

It has long been known that attackers may use forged source IP addresses to conceal their real locations. To capture the spoofers, a number of IP traceback mechanisms have been proposed. However, due to the challenges of deployment, there has been no widely adopted IP traceback solution, at least at the Internet level. As a result, the mystery of the spoofers' locations has never been dispelled.

This paper proposes Passive IP Traceback (PIT), which bypasses the deployment difficulties of IP traceback techniques. PIT investigates Internet Control Message Protocol (ICMP) error messages (named path backscatter) triggered by spoofing traffic, and tracks the spoofers based on publicly available information (e.g., topology). In this way, PIT can find the spoofers without any deployment requirement.

This paper illustrates the causes, collection, and statistical results of path backscatter, demonstrates the processes and effectiveness of PIT, and shows the captured locations of spoofers obtained by applying PIT to the path backscatter dataset.

These results can help further our understanding of IP spoofing, which has been studied for a long time but never well understood. Though PIT cannot handle all spoofing attacks, it may be the most useful mechanism for tracing spoofers until an Internet-level traceback system is actually deployed.

 INTRODUCTION

IP spoofing, in which attackers launch attacks with forged source IP addresses, has long been recognized as a serious security problem on the Internet. By using addresses that are assigned to others or not assigned at all, attackers can avoid exposing their real locations, amplify the effect of their attacks, or launch reflection-based attacks. A number of notorious attacks rely on IP spoofing, including SYN flooding, Smurf, and DNS amplification attacks. A DNS amplification attack which severely degraded the service of a Top-Level Domain (TLD) name server has been reported. Though there is a popular conventional wisdom that DoS attacks are launched from botnets and spoofing is no longer critical, the ARBOR report at the 50th NANOG meeting shows that spoofing is still significant in observed DoS attacks. Indeed, based on the backscatter messages captured by the UCSD Network Telescope, spoofing activities are still frequently observed.

Capturing the origins of IP spoofing traffic is of great importance. As long as the real locations of spoofers are not disclosed, they cannot be deterred from launching further attacks. Even just approximating the spoofers, for example by determining the ASes or networks they reside in, narrows the area in which attackers can be located and allows filters to be placed closer to the attacker before the attacking traffic gets aggregated. Last but not least, identifying the origins of spoofing traffic can help build a reputation system for ASes, which would be helpful in pushing the corresponding ISPs to verify IP source addresses.

Instead of proposing another IP traceback mechanism with improved tracking capability, we propose a novel solution, named Passive IP Traceback (PIT), that bypasses the challenges in deployment. Routers may fail to forward an IP spoofing packet for various reasons, e.g., the TTL expiring. In such cases, a router may generate an ICMP error message (named path backscatter) and send it to the spoofed source address. Because these routers can be close to the spoofers, the path backscatter messages may potentially disclose the spoofers' locations. PIT exploits these path backscatter messages to find the location of the spoofers. With the locations of the spoofers known, the victim can seek help from the corresponding ISP to filter out the attacking packets, or take other countermeasures. PIT is especially useful for the victims in reflection-based spoofing attacks, e.g., DNS amplification attacks, because the victims can find the locations of the spoofers directly from the attacking traffic.
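
The basic inference behind PIT can be sketched as follows: a path backscatter message reveals the reporting router and, from the embedded original packet, the destination the spoofed packet was addressed to, so any node whose route toward that destination traverses the reporting router is a candidate origin. The fragment below is a minimal illustration of this idea, assuming shortest-path routing on a known topology (via the networkx library); it is not the paper's algorithm, which also exploits real routing information.

    # Minimal sketch of the PIT inference (hypothetical names, not the authors' code).
    import networkx as nx

    def candidate_origins(topology: nx.Graph, reporting_router, original_dest):
        """Return nodes whose (assumed shortest-path) route to original_dest passes
        through reporting_router; these are candidate locations of the spoofer."""
        candidates = set()
        for node in topology.nodes:
            if node in (reporting_router, original_dest):
                continue
            try:
                path = nx.shortest_path(topology, node, original_dest)
            except nx.NetworkXNoPath:
                continue
            if reporting_router in path:
                candidates.add(node)
        return candidates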

In this article, we first illustrate the generation, types, collection, and security issues of path backscatter messages in Section III. Then, in Section IV, we present PIT, which tracks the location of the spoofers based on path backscatter messages together with topology and routing information. We discuss how to apply PIT when both topology and routing are known, when only topology is known, and when neither is known. We also present two effective algorithms for applying PIT in large-scale networks. In the following section, we first show the statistical results on path backscatter messages. Then we evaluate the two key mechanisms of PIT which work without routing information. Finally, we give the tracking results of applying PIT to the path backscatter message dataset: a number of ASes in which spoofers are found.

Our work has the following contributions:

1) To our knowledge, this is the first article that deeply investigates path backscatter messages. These messages are valuable for understanding spoofing activities. Though Moore et al. [8] have exploited backscatter messages, which are generated by the targets of spoofing traffic, to study Denial of Service (DoS) attacks, path backscatter messages, which are sent by intermediate devices rather than the targets, have not been used in traceback.

2) A practical and effective IP traceback solution based on path backscatter messages, i.e., PIT, is proposed. PIT bypasses the deployment difficulties of existing IP traceback mechanisms and is already workable in practice. Given the limitation that path backscatter messages are not generated with stable probability, PIT cannot work in all attacks, but it does work in a number of spoofing activities. At the least, it may be the most useful traceback mechanism until an AS-level traceback system is actually deployed.

3) By applying PIT to the path backscatter dataset, a number of locations of spoofers are captured and presented. Though this is not a complete list, it is the first known list disclosing the locations of spoofers.

PANDA: PUBLIC AUDITING FOR SHARED DATA WITH EFFICIENT USER REVOCATION IN THE CLOUD

ABSTRACT:

With data storage and sharing services in the cloud, users can easily modify and share data as a group. To ensure that the integrity of shared data can be verified publicly, users in the group need to compute signatures on all the blocks in the shared data. Different blocks in shared data are generally signed by different users due to data modifications performed by different users. For security reasons, once a user is revoked from the group, the blocks which were previously signed by this revoked user must be re-signed by an existing user. The straightforward method, which allows an existing user to download the corresponding part of shared data and re-sign it during user revocation, is inefficient due to the large size of shared data in the cloud. In this paper, we propose a novel public auditing mechanism for the integrity of shared data with efficient user revocation in mind. By utilizing the idea of proxy re-signatures, we allow the cloud to re-sign blocks on behalf of existing users during user revocation, so that existing users do not need to download and re-sign blocks by themselves. In addition, a public verifier is always able to audit the integrity of shared data without retrieving the entire data from the cloud, even if some part of the shared data has been re-signed by the cloud. Moreover, our mechanism is able to support batch auditing by verifying multiple auditing tasks simultaneously. Experimental results show that our mechanism can significantly improve the efficiency of user revocation.

INTRODUCTION

With data storage and sharing services (such as Dropbox and Google Drive) provided by the cloud, people can easily work together as a group by sharing data with each other. More specifically, once a user creates shared data in the cloud, every user in the group is able to not only access and modify shared data, but also share the latest version of the shared data with the rest of the group. Although cloud providers promise a more secure and reliable environment to the users, the integrity of data in the cloud may still be compromised, due to the existence of hardware/software failures and human errors.

To protect the integrity of data in the cloud, a number of mechanisms have been proposed. In these mechanisms, a signature is attached to each block in the data, and the integrity of the data relies on the correctness of all the signatures. One of the most significant and common features of these mechanisms is to allow a public verifier to efficiently check data integrity in the cloud without downloading the entire data, referred to as public auditing (or Provable Data Possession). This public verifier could be a client who would like to utilize cloud data for particular purposes (e.g., search, computation, data mining, etc.) or a third-party auditor (TPA) who is able to provide verification services on data integrity to users. Most previous works focus on auditing the integrity of personal data. Different from these works, several recent works focus on how to preserve identity privacy from public verifiers when auditing the integrity of shared data. Unfortunately, none of the above mechanisms considers the efficiency of user revocation when auditing the correctness of shared data in the cloud.

With shared data, once a user modifies a block, she also needs to compute a new signature for the modified block. Due to the modifications from different users, different blocks are signed by different users. For security reasons, when a user leaves the group or misbehaves, this user must be revoked from the group. As a result, this revoked user should no longer be able to access and modify shared data, and the signatures generated by this revoked user are no longer valid to the group. Therefore, although the content of shared data is not changed during user revocation, the blocks which were previously signed by the revoked user still need to be re-signed by an existing user in the group, so that the integrity of the entire data can still be verified with the public keys of existing users only.

Since shared data is outsourced to the cloud and users no longer store it on local devices, a straightforward method to re-compute these signatures during user revocation is to ask an existing user to first download the blocks previously signed by the revoked user, verify the correctness of these blocks, then re-sign them, and finally upload the new signatures to the cloud. However, this straightforward method may cost the existing user a huge amount of communication and computation resources for downloading and verifying blocks and for re-computing and uploading signatures, especially when the number of re-signed blocks is quite large or the membership of the group changes frequently. To make matters worse, existing users may access their data sharing services through resource-limited devices, such as mobile phones, which further prevents them from maintaining the correctness of shared data efficiently during user revocation.

Clearly, if the cloud possessed each user's private key, it could easily finish the re-signing task for existing users without asking them to download and re-sign blocks. However, since the cloud is not in the same trusted domain as the users in the group, outsourcing every user's private key to the cloud would introduce significant security issues. Another important problem we need to consider is that the re-computation of any signature during user revocation should not affect the most attractive property of public auditing: auditing data integrity publicly without retrieving the entire data. Therefore, how to efficiently reduce the significant burden on existing users introduced by user revocation, while still allowing a public verifier to check the integrity of shared data without downloading the entire data from the cloud, is a challenging task.

In this paper, we propose Panda, a novel public auditing mechanism for the integrity of shared data with efficient user revocation in the cloud. In our mechanism, by utilizing the idea of proxy re-signatures, once a user in the group is revoked, the cloud is able to re-sign the blocks which were signed by the revoked user with a re-signing key. As a result, the efficiency of user revocation is significantly improved, and computation and communication resources of existing users are saved. Meanwhile, the cloud, which is not in the same trusted domain as the users, is only able to convert a signature of the revoked user into a signature of an existing user on the same block; it cannot sign arbitrary blocks on behalf of either the revoked user or an existing user. By designing a new proxy re-signature scheme with nice properties that traditional proxy re-signatures do not have, our mechanism is always able to check the integrity of shared data without retrieving the entire data from the cloud.
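
The proxy re-signature idea can be illustrated with a toy discrete-log construction (small parameters and function names are purely illustrative; Panda's actual scheme is a dedicated proxy re-signature design): a re-signing key derived from the revoked and existing users' keys lets the cloud convert one signature into the other without learning either private key.

    # Toy sketch of proxy re-signing: rk = sk_new * sk_revoked^{-1} converts
    # H(m)^sk_revoked into H(m)^sk_new. Parameters are illustrative only.
    import hashlib, math, secrets

    P = 2**61 - 1          # small Mersenne prime; real schemes use pairing groups
    ORDER = P - 1

    def keygen():
        while True:
            sk = secrets.randbelow(ORDER - 2) + 2
            if math.gcd(sk, ORDER) == 1:     # invertible modulo the group order
                return sk

    def hash_to_group(block: bytes) -> int:
        return pow(3, int.from_bytes(hashlib.sha256(block).digest(), "big") % ORDER, P)

    def sign(sk, block):           return pow(hash_to_group(block), sk, P)
    def rekey(sk_revoked, sk_new): return (sk_new * pow(sk_revoked, -1, ORDER)) % ORDER
    def resign(sig, rk):           return pow(sig, rk, P)        # done by the cloud
    def verify(sig, sk, block):    return sig == sign(sk, block) # toy check; real schemes use public keys

    sk_a, sk_b = keygen(), keygen()
    block = b"shared data block"
    assert verify(resign(sign(sk_a, block), rekey(sk_a, sk_b)), sk_b, block)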

NEIGHBOR SIMILARITY TRUST AGAINST SYBIL ATTACK IN P2P E-COMMERCE

In this paper, we present a distributed, structured approach to the Sybil attack. Our approach is based on the neighbor similarity trust relationship among neighboring peers. Given a P2P e-commerce trust relationship based on interest, the transactions among peers are flexible, as each peer can decide to trade with another peer at any time. A peer does not have to consult others in a group unless a recommendation is needed. This approach has the advantage of exploiting the similarity trust relationship among peers, which enables the peers to monitor each other.

Our contribution in this paper is threefold:

1) We propose SybilTrust, which can identify and protect honest peers from Sybil attacks. Sybil peers can have their trust revoked and be dismissed from a group.

2) Based on the group infrastructure in P2P e-commerce, each neighbor is connected to the peers by the success of the transactions it makes or by its trust evaluation level. A peer is recognized as a neighbor only if its trust level is sustained above a threshold value (a minimal sketch of such a similarity-trust check appears after this list).

3) SybilTrust enables neighbor peers to carry recommendation identifiers among the peers in a group. This ensures that the group detection algorithms that identify Sybil peers remain efficient and scalable in large P2P e-commerce networks.
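
A minimal sketch of the threshold-based neighbor-similarity check referenced above is given below; the similarity metric, weighting, and threshold are illustrative assumptions rather than the exact formulation of SybilTrust.

    # Hypothetical neighbor-similarity trust check: a neighbor is kept only while
    # its similarity-weighted trust stays above a threshold; peers that fall below
    # are reported as suspected Sybil identities.

    def jaccard(a: set, b: set) -> float:
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def neighbor_trust(my_partners, peer_partners, peer_success, peer_total) -> float:
        similarity = jaccard(set(my_partners), set(peer_partners))
        success_rate = peer_success / peer_total if peer_total else 0.0
        return 0.5 * similarity + 0.5 * success_rate    # illustrative weighting

    def suspected_sybils(me, neighbors, threshold=0.4):
        return [p for p in neighbors
                if neighbor_trust(me["partners"], p["partners"],
                                  p["successes"], p["transactions"]) < threshold]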

Malware Propagation in Large-Scale Networks

Malware is pervasive in networks and poses a critical threat to network security. However, we have very limited understanding of malware behavior in networks to date. In this paper, we investigate how malware propagates in networks from a global perspective. We formulate the problem and establish a rigorous two-layer epidemic model for malware propagation from network to network. Based on the proposed model, our analysis indicates that the distribution of a given malware follows an exponential distribution at its early stage, a power-law distribution with a short exponential tail at its late stage, and a power-law distribution at its final stage. Extensive experiments have been performed on two real-world global-scale malware datasets, and the results confirm our theoretical findings.
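
To make the two-layer structure concrete, the following toy simulation spreads an SI-style infection inside networks and occasionally across networks; the parameters and spreading rules are illustrative only and do not reproduce the paper's epidemic equations.

    # Minimal two-layer susceptible-infected simulation: infection spreads within a
    # network with probability beta_in and jumps between networks with beta_out.
    import random

    def simulate(num_networks=50, nodes_per_network=100,
                 beta_in=0.05, beta_out=0.005, steps=30, seed=1):
        random.seed(seed)
        infected = [set() for _ in range(num_networks)]
        infected[0].add(0)                               # patient zero
        history = []
        for _ in range(steps):
            new = [set(s) for s in infected]
            for n in range(num_networks):
                for _ in infected[n]:
                    if random.random() < beta_in:        # spread inside this network
                        new[n].add(random.randrange(nodes_per_network))
                    if random.random() < beta_out:       # jump to another network
                        m = random.randrange(num_networks)
                        new[m].add(random.randrange(nodes_per_network))
            infected = new
            history.append(sum(len(s) for s in infected))
        return history                                   # total infected per step

    print(simulate()[-5:])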

LOSSLESS AND REVERSIBLE DATA HIDING IN ENCRYPTED IMAGES WITH PUBLIC KEY CRYPTOGRAPHY

This paper proposes lossless, reversible, and combined data hiding schemes for ciphertext images encrypted by public-key cryptosystems with probabilistic and homomorphic properties. In the lossless scheme, the ciphertext pixels are replaced with new values to embed the additional data into several LSB-planes of the ciphertext pixels by multi-layer wet paper coding. The embedded data can then be directly extracted in the encrypted domain, and the data embedding operation does not affect the decryption of the original plaintext image. In the reversible scheme, preprocessing is employed to shrink the image histogram before image encryption, so that the modification of encrypted images for data embedding will not cause any pixel oversaturation in the plaintext domain. Although a slight distortion is introduced, the embedded data can be extracted and the original image can be recovered from the directly decrypted image. Due to the compatibility between the lossless and reversible schemes, the two kinds of data embedding can be performed simultaneously in the same encrypted image. With the combined technique, a receiver may extract a part of the embedded data before decryption, and extract the other part of the embedded data and recover the original plaintext image after decryption.
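
One simple way to realize the histogram-shrink preprocessing is sketched below for an 8-bit grayscale image (a linear remapping; the paper's scheme shrinks the histogram differently, so this is only an assumption-laden illustration): pulling pixel values away from 0 and 255 ensures that the modifications made for embedding cannot push a decrypted pixel out of the valid range.

    # Sketch of histogram shrinking before encryption, assuming a NumPy uint8 image.
    import numpy as np

    def shrink_histogram(image: np.ndarray, delta: int = 8) -> np.ndarray:
        """Map [0, 255] linearly into [delta, 255 - delta] before encryption."""
        img = image.astype(np.float64)
        shrunk = delta + img * (255.0 - 2 * delta) / 255.0
        return np.round(shrunk).astype(np.uint8)

    def expand_histogram(image: np.ndarray, delta: int = 8) -> np.ndarray:
        """Approximate inverse applied after decryption (a slight distortion remains)."""
        img = image.astype(np.float64)
        expanded = (img - delta) * 255.0 / (255.0 - 2 * delta)
        return np.clip(np.round(expanded), 0, 255).astype(np.uint8)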

Joint Beamforming, Power and Channel Allocation in Multi-User and Multi-Channel Underlay MISO Cognitive Radio Networks

In this paper, we consider joint beamforming, power, and channel allocation in a multi-user and multi-channel underlay multiple-input single-output (MISO) cognitive radio network (CRN). In this system, the primary users' (PUs') spectrum can be reused by the secondary-user transmitters (SU-TXs) to maximize spectrum utilization, while the intra-user interference is minimized by implementing beamforming at each SU-TX. After formulating the joint optimization problem as a non-convex mixed-integer nonlinear programming (MINLP) problem, we propose a solution which consists of two stages.

In the first stage, a feasible solution for the power allocation and beamforming vectors is derived under a given channel allocation by converting the original problem into a convex form with an introduced optimal auxiliary variable and a semidefinite relaxation (SDR) approach. In the second stage, two explicit search algorithms, i.e., a genetic algorithm (GA)-based and a simulated annealing (SA)-based algorithm, are proposed to determine suboptimal channel allocations. Simulation results show that the beamforming, power, and channel allocation with SA (BPCA-SA) algorithm can achieve a close-to-optimal sum-rate while having a lower computational complexity than the beamforming, power, and channel allocation with GA (BPCA-GA) algorithm.
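
A generic simulated-annealing search over channel allocations, of the kind the SA-based stage performs, can be sketched as follows; the sum_rate callable stands in for the stage-one power allocation and beamforming evaluation, which is not reproduced here.

    # Illustrative simulated annealing over channel assignments (not the paper's
    # exact BPCA-SA procedure): perturb one SU's channel, keep improvements, and
    # accept occasional worse moves with a temperature-controlled probability.
    import math, random

    def anneal_channels(num_sus, num_channels, sum_rate,
                        T0=10.0, cooling=0.95, iters=500, seed=0):
        random.seed(seed)
        cur = [random.randrange(num_channels) for _ in range(num_sus)]
        cur_rate = sum_rate(cur)
        best, best_rate, T = cur[:], cur_rate, T0
        for _ in range(iters):
            cand = cur[:]
            cand[random.randrange(num_sus)] = random.randrange(num_channels)
            rate = sum_rate(cand)
            if rate > cur_rate or random.random() < math.exp((rate - cur_rate) / T):
                cur, cur_rate = cand, rate
                if rate > best_rate:
                    best, best_rate = cand[:], rate
            T *= cooling
        return best, best_rate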

Furthermore, our proposed allocation scheme achieves a significant improvement in achievable sum-rate compared with existing zero-forcing beamforming (ZFBF).

GDCLUSTER: A GENERAL DECENTRALIZED CLUSTERING ALGORITHM

In many popular applications like peer-to-peer systems, large amounts of data are distributed among multiple sources. Analyzing this data and identifying clusters is challenging due to processing, storage, and transmission costs. In this paper, we propose GDCluster, a general, fully decentralized clustering method which is capable of clustering dynamic and distributed data sets. Nodes continuously cooperate through decentralized gossip-based communication to maintain summarized views of the data set. We customize GDCluster for executing partition-based and density-based clustering methods on the summarized views, and also offer enhancements to the basic algorithm. Coping with dynamic data is made possible by gradually adapting the clustering model. Our experimental evaluations show that GDCluster can discover the clusters efficiently with scalable transmission cost, and also demonstrate its superiority over the popular LSP2P method.
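
The gossip-plus-summary idea can be sketched roughly as follows (GDCluster's actual summarization, weighting, and aging rules are more involved): each node keeps a bounded set of weighted representative points and merges a random peer's summary into its own each round, after which standard weighted clustering can be run locally on the summary.

    # Rough gossip sketch: summaries are lists of (point, weight) pairs.
    import random

    def merge_summaries(mine, peers, capacity=64):
        """Merge two summaries and keep only the heaviest `capacity` entries."""
        merged = sorted(mine + peers, key=lambda pw: pw[1], reverse=True)
        return merged[:capacity]

    def gossip_round(nodes):
        """nodes: dict node_id -> summary list. Each node contacts one random peer."""
        ids = list(nodes)
        for nid in ids:
            peer = random.choice(ids)
            if peer != nid:
                nodes[nid] = merge_summaries(nodes[nid], nodes[peer])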

Efficient Top-k Retrieval on Massive Data

The top-k query is an important operation that returns a set of interesting points from a potentially huge data space. This paper shows that existing algorithms cannot process top-k queries on massive data efficiently, and proposes a novel table-scan-based algorithm, T2S, to compute top-k results on massive data efficiently. T2S first constructs a presorted table whose tuples are arranged in the order of the round-robin retrieval on the sorted lists. T2S maintains only a fixed number of tuples to compute the results. The early-termination check for T2S is presented in this paper, along with an analysis of the scan depth. Selective retrieval is devised to skip the tuples in the presorted table which cannot be top-k results. The theoretical analysis proves that selective retrieval can significantly reduce the number of retrieved tuples. Construction and incremental-update/batch-processing methods for the structures used are also proposed.
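
A simplified sketch of the presorting and early-termination idea is given below (the data structures and bounds are illustrative, not the paper's exact ones): tuples are visited in an order derived from the per-attribute sorted lists, and the scan stops once no unseen tuple can beat the current k-th best score.

    # Simplified table-scan top-k with early termination; tuples are dicts of
    # attribute -> value, and the score is the sum of the chosen attributes.
    import heapq

    def presort(tuples, score_attrs):
        """Order tuple ids by the largest single attribute value they hold."""
        return sorted(range(len(tuples)),
                      key=lambda i: max(tuples[i][a] for a in score_attrs),
                      reverse=True)

    def top_k(tuples, score_attrs, k):
        order = presort(tuples, score_attrs)
        heap = []                                 # min-heap of (score, id)
        for i in order:
            score = sum(tuples[i][a] for a in score_attrs)
            heapq.heappush(heap, (score, i))
            if len(heap) > k:
                heapq.heappop(heap)
            # any later tuple's score is at most |attrs| * its max attribute,
            # which is bounded by the current tuple's max attribute value
            bound = len(score_attrs) * max(tuples[i][a] for a in score_attrs)
            if len(heap) == k and heap[0][0] >= bound:
                break                             # early termination
        return sorted(heap, reverse=True)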

Effective Key Management in Dynamic Wireless Sensor Networks

Recently, wireless sensor networks (WSNs) have been deployed for a wide variety of applications, including military sensing and tracking, patient status monitoring, and traffic flow monitoring, in which sensor devices often move between different locations. Securing data and communications requires suitable encryption key protocols. In this paper, we propose a certificateless effective key management (CL-EKM) protocol for secure communication in dynamic WSNs characterized by node mobility. CL-EKM supports efficient key updates when a node leaves or joins a cluster and ensures forward and backward key secrecy. The protocol also supports efficient key revocation for compromised nodes and minimizes the impact of a node compromise on the security of other communication links. A security analysis of our scheme shows that our protocol is effective in defending against various attacks. We implement CL-EKM in Contiki OS and simulate it using the Cooja simulator to assess its time, energy, communication, and memory performance.
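
The key-update step on node departure can be illustrated as follows (a generic sketch, not the CL-EKM protocol itself, and with toy key wrapping): the cluster head picks a fresh cluster key and delivers it to each remaining member under that member's pairwise key, so the departed node cannot read future cluster traffic.

    # Illustrative cluster-key update on node leave; the HMAC-based key wrapping
    # below is a toy stand-in, real deployments would use a proper key-wrap cipher.
    import hmac, hashlib, secrets

    def wrap(pairwise_key: bytes, new_cluster_key: bytes) -> bytes:
        stream = hmac.new(pairwise_key, b"cluster-key-update", hashlib.sha256).digest()
        return bytes(a ^ b for a, b in zip(new_cluster_key, stream))

    def update_cluster_key(members: dict, leaving_node: str) -> dict:
        """members: node_id -> pairwise key shared with the cluster head.
        Returns the wrapped fresh cluster key for each remaining member."""
        remaining = {n: k for n, k in members.items() if n != leaving_node}
        new_key = secrets.token_bytes(32)
        return {n: wrap(k, new_key) for n, k in remaining.items()}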

Discovery of Ranking Fraud for Mobile Apps

Ranking fraud in the mobile App market refers to fraudulent or deceptive activities whose purpose is to bump Apps up the popularity list. Indeed, it has become increasingly common for App developers to use shady means, such as inflating their Apps' sales or posting phony App ratings, to commit ranking fraud. While the importance of preventing ranking fraud has been widely recognized, there is limited understanding and research in this area. To this end, in this paper, we provide a holistic view of ranking fraud and propose a ranking fraud detection system for mobile Apps. Specifically, we first propose to accurately locate the ranking fraud by mining the active periods, namely leading sessions, of mobile Apps. Such leading sessions can be leveraged for detecting local anomalies instead of global anomalies in App rankings. Furthermore, we investigate three types of evidence, i.e., ranking-based, rating-based, and review-based evidence, by modeling Apps' ranking, rating, and review behaviors through statistical hypothesis tests. In addition, we propose an optimization-based aggregation method to integrate all the evidence for fraud detection.
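
Mining leading sessions can be sketched as follows (the top-K cutoff and merge gap are assumed values, not the paper's): a leading event is a maximal run of days during which the App stays within the top K, and nearby leading events are merged into one leading session.

    # Sketch of leading-session extraction from a daily ranking history.
    def leading_events(ranks, K=300):
        events, start = [], None
        for day, r in enumerate(ranks):
            if r <= K and start is None:
                start = day
            elif r > K and start is not None:
                events.append((start, day - 1))
                start = None
        if start is not None:
            events.append((start, len(ranks) - 1))
        return events

    def leading_sessions(ranks, K=300, gap=3):
        sessions = []
        for ev in leading_events(ranks, K):
            if sessions and ev[0] - sessions[-1][-1][1] <= gap:
                sessions[-1].append(ev)       # close enough: same session
            else:
                sessions.append([ev])         # start a new session
        return sessions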

DETECTION AND RECTIFICATION OF DISTORTED FINGERPRINTS

Elastic distortion of fingerprints is one of the major causes of false non-matches. While this problem affects all fingerprint recognition applications, it is especially dangerous in negative recognition applications, such as watchlist and deduplication applications, in which malicious users may purposely distort their fingerprints to evade identification. In this paper, we propose novel algorithms to detect and rectify skin distortion based on a single fingerprint image. Distortion detection is viewed as a two-class classification problem, for which the registered ridge orientation map and period map of a fingerprint are used as the feature vector and an SVM classifier is trained to perform the classification task.

Distortion rectification (or equivalently distortion field estimation) is viewed as a regression problem, where the input is a distorted fingerprint and the output is the distortion field. To solve this problem, a database (called reference database) of various distorted reference fingerprints and corresponding distortion fields is built in the offline stage, and then in the online stage, the nearest neighbor of the input fingerprint is found in the reference database and the corresponding distortion field is used to transform the input fingerprint into a normal one. Promising results have been obtained on three databases containing many distorted fingerprints, namely FVC2004 DB1, Tsinghua Distorted Fingerprint database, and the NIST SD27 latent fingerprint database.
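
The online nearest-neighbor lookup can be sketched as follows (feature extraction and the distortion-field representation follow the paper and are not reproduced here; names are illustrative): the reference database stores feature vectors of distorted reference prints together with their known distortion fields, and the field of the closest reference is returned for rectification.

    # Minimal nearest-neighbor lookup over the reference database.
    import numpy as np

    def nearest_distortion_field(query_features: np.ndarray,
                                 ref_features: np.ndarray,
                                 ref_fields: np.ndarray) -> np.ndarray:
        """ref_features: (N, d) feature matrix; ref_fields: (N, ...) distortion fields."""
        distances = np.linalg.norm(ref_features - query_features, axis=1)
        return ref_fields[int(np.argmin(distances))]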

DATA-DRIVEN COMPOSITION FOR SERVICE-ORIENTED SITUATIONAL WEB APPLICATIONS

This paper presents a systematic data-driven approach to assisting situational application development. We first propose a technique to extract useful information from multiple sources to abstract service capabilities with a set of tags. This supports intuitive expression of users' desired composition goals through simple queries, without requiring knowledge of the underlying technical details. A planning technique then explores composition solutions that can achieve the desired goals, possibly revealing new and interesting composition opportunities. A browser-based tool facilitates visual and iterative refinement of composition solutions until satisfactory outputs are reached. A series of experiments demonstrate the efficiency and effectiveness of our approach.

We previously presented the data-driven composition technique for situational web applications using tag-based semantics, illustrating the overall life-cycle of our "compose-as-you-search" composition approach, proposing the clustering technique for deriving tag-based composition semantics, and evaluating the composition planning effectiveness. Compared with that previous work, this paper is significantly updated by introducing a semi-supervised technique for clustering hierarchical tag-based semantics from service documentation and human-annotated annotations. The derived semantics link service capabilities and developers' processing goals, so that composition is carried out by planning the "Tag HyperLinks" from the initial query to the goals (a minimal planning sketch is given below).
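
A minimal forward-chaining planner over tag links might look like the following (service and tag names are hypothetical; the actual planner ranks and recommends candidate compositions rather than returning a single chain): each service is modeled as consuming a set of input tags and producing output tags, and services are chained from the query tags until the goal tags are covered.

    # Hypothetical tag-based forward-chaining planner.
    def plan_composition(services, query_tags, goal_tags):
        """services: list of (name, input_tags, output_tags), tags given as sets."""
        known, plan = set(query_tags), []
        progress = True
        while progress and not set(goal_tags) <= known:
            progress = False
            for name, inputs, outputs in services:
                already_used = name in [s for s, _, _ in plan]
                if not already_used and set(inputs) <= known \
                        and not set(outputs) <= known:
                    plan.append((name, inputs, outputs))
                    known |= set(outputs)
                    progress = True
        return plan if set(goal_tags) <= known else None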

The planning algorithm is further evaluated in terms of recommendation quality, performance, and scalability over datasets from real-world service repositories. Results show that our approach achieves satisfactory precision and high-quality composition recommendations. We also demonstrate that our approach can accommodate an even larger number of services than real-world repositories while maintaining good performance. In addition, more details of our interactive development prototype are presented. We particularly demonstrate how the composition UI helps developers intuitively compose situational applications and iteratively refine their goals until their requirements are satisfied.

CONTINUOUS AND TRANSPARENT USER IDENTITY VERIFICATION FOR SECURE INTERNET SERVICES

Session management in distributed Internet services is traditionally based on username and password, explicit logouts, and mechanisms of user-session expiration using classic timeouts. Emerging biometric solutions allow substituting username and password with biometric data during session establishment, but in such approaches a single verification is still deemed sufficient, and the identity of a user is considered immutable during the entire session. Additionally, the length of the session timeout may impact the usability of the service and consequent client satisfaction.

This paper explores promising alternatives offered by applying biometrics to the management of sessions. A secure protocol is defined for perpetual authentication through continuous user verification. The protocol determines adaptive timeouts based on the quality, frequency, and type of biometric data transparently acquired from the user. The functional behavior of the protocol is illustrated through Matlab simulations, while model-based quantitative analysis is carried out to assess the ability of the protocol to counter security attacks mounted by different kinds of attackers. Finally, the current prototype for PCs and Android smartphones is discussed.
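
The adaptive-timeout behavior can be approximated by a simple decay model (the decay and boost functions below are assumptions for illustration, not the protocol's equations): trust decreases with the time elapsed since the last accepted biometric sample, is restored in proportion to the sample's quality, and the session is closed once trust falls below a floor.

    # Illustrative adaptive-session model driven by continuous biometric samples.
    import math, time

    class AdaptiveSession:
        def __init__(self, decay_rate=0.01, floor=0.3):
            self.trust, self.decay, self.floor = 1.0, decay_rate, floor
            self.last = time.time()

        def current_trust(self, now=None):
            now = now if now is not None else time.time()
            return self.trust * math.exp(-self.decay * (now - self.last))

        def on_biometric_sample(self, quality, now=None):
            """quality in [0, 1], e.g. a matcher score for the acquired trait."""
            now = now if now is not None else time.time()
            self.trust = min(1.0, self.current_trust(now) + 0.5 * quality)
            self.last = now

        def is_open(self, now=None):
            return self.current_trust(now) >= self.floor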

Collision Tolerant and Collision Free Packet Scheduling for Underwater Acoustic Localization

We consider the joint problem of packet scheduling and self-localization in an underwater acoustic sensor network with randomly distributed nodes. In terms of packet scheduling, our goal is to minimize the localization time, and to do so we consider two packet transmission schemes, namely a collision-free scheme (CFS) and a collision-tolerant scheme (CTS). The required localization time is formulated for these schemes, and through analytical results and numerical examples their performance is shown to depend on the circumstances. When the packet duration is short (as is the case for a localization packet), the operating area is large (above 3 km in at least one dimension), and the average probability of packet loss is not close to zero, the collision-tolerant scheme is found to require a shorter localization time.
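
A back-of-envelope calculation illustrates why collision-free scheduling becomes expensive in large areas (the numbers below are assumed, not taken from the paper): a collision-free slot must cover the worst-case acoustic propagation delay in addition to the packet itself, so a short localization packet contributes almost nothing to the slot length.

    # Worked arithmetic sketch with assumed values.
    SOUND_SPEED = 1500.0                  # m/s, nominal underwater sound speed
    area, packet = 3000.0, 0.05           # 3 km operating area, 50 ms packet
    slot = packet + area / SOUND_SPEED    # a collision-free slot covers both
    print(slot)                           # about 2.05 s per anchor, dominated by propagation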

CLOUD-BASED MULTIMEDIA CONTENT PROTECTION SYSTEM

We propose a new design for large-scale multimedia content protection systems. Our design leverages cloud infrastructures to provide cost efficiency, rapid deployment, scalability, and elasticity to accommodate varying workloads. The proposed system can be used to protect different multimedia content types, including 2-D videos, 3-D videos, images, audio clips, songs, and music clips. The system can be deployed on private and/or public clouds. Our system has two novel components: (i) a method to create signatures of 3-D videos, and (ii) a distributed matching engine for multimedia objects. The signature method creates robust and representative signatures of 3-D videos that capture the depth signals in these videos; the signatures are computationally efficient to compute and compare, and they require little storage. The distributed matching engine achieves high scalability and is designed to support different multimedia objects.
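
A coarse depth-based frame signature can be sketched as follows (illustrative only; the paper's signature method is more elaborate): each depth frame is reduced to a small grid of block averages, and two videos are compared by the distance between aligned frame signatures.

    # Toy depth-based signature: block-average each depth frame into a grid.
    import numpy as np

    def frame_signature(depth_frame: np.ndarray, grid=8) -> np.ndarray:
        h, w = depth_frame.shape
        crop = depth_frame[: h - h % grid, : w - w % grid].astype(np.float64)
        blocks = crop.reshape(grid, crop.shape[0] // grid,
                              grid, crop.shape[1] // grid).mean(axis=(1, 3))
        return blocks.ravel() / max(float(blocks.max()), 1e-9)

    def video_distance(sigs_a, sigs_b):
        """Mean distance between aligned frame signatures of two videos."""
        n = min(len(sigs_a), len(sigs_b))
        return float(np.mean([np.linalg.norm(sigs_a[i] - sigs_b[i]) for i in range(n)]))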

We implemented the proposed system and deployed it on two clouds: Amazon cloud and our private cloud. Our experiments with more than 11,000 3-D videos and 1 million images show the high accuracy and scalability of the proposed system. In addition, we compared our system to the protection system used by YouTube and our results show that the YouTube protection system fails to detect most copies of 3-D videos, while our system detects more than 98% of them. This comparison shows the need for the proposed 3-D signature method, since the state-of-the-art commercial system was not able to handle 3-D videos.