Secure and Distributed Data Discovery and Dissemination in Wireless Sensor Networks

—A data discovery and dissemination protocol for wireless sensor networks (WSNs) is responsible for updatingconfiguration parameters of, and distributing management commands to, the sensor nodes. All existing data discovery anddissemination protocols suffer from two drawbacks. First, they are based on the centralized approach; only the base station candistribute data items. Such an approach is not suitable for emergent multi-owner-multi-user WSNs. Second, those protocols werenot designed with security in mind and hence adversaries can easily launch attacks to harm the network. This paper proposes thefirst secure and distributed data discovery and dissemination protocol named DiDrip. It allows the network owners to authorizemultiple network users with different privileges to simultaneously and directly disseminate data items to the sensor nodes.Moreover, as demonstrated by our theoretical analysis, it addresses a number of possible security vulnerabilities that we haveidentified. Extensive security analysis show DiDrip is provably secure. We also implement DiDrip in an experimental network ofresource-limited sensor nodes to show its high efficiency in practice.Index Terms—Distributed data discovery and dissemination, security, wireless sensor networks, efficiencyÇ1 INTRODUCTIONAFTER a wireless sensor network (WSN) is deployed,there is usually a need to update buggy/old small programsor parameters stored in the sensor nodes. This can beachieved by the so-called data discovery and dissemination protocol,which facilitates a source to inject small programs,commands, queries, and configuration parameters to sensornodes. Note that it is different from the code disseminationprotocols (also referred to as data dissemination or reprogrammingprotocols) [1], [2], which distribute large binariesto reprogram the whole network of sensors. For example,efficiently disseminating a binary file of tens of kilobytesrequires a code dissemination protocol while disseminatingseveral 2-byte configuration parameters requires data discoveryand dissemination protocol. Considering the sensornodes could be distributed in a harsh environment,remotely disseminating such small data to the sensor nodesthrough the wireless channel is a more preferred and practicalapproach than manual intervention.In the literature, several data discovery and disseminationprotocols [3], [4], [5], [6] have been proposed for WSNs.Among them, DHV [3], DIP [5] and Drip [4] are regarded asthe state-of-the-art protocols and have been included in theTinyOS distributions. All proposed protocols assume thatthe operating environment of the WSN is trustworthy andhas no adversary. However, in reality, adversaries exist andimpose threats to the normal operation of WSNs [7], [8].This issue has only been addressed recently by [7] whichidentifies the security vulnerabilities of Drip and proposesan effective solution.More importantly, all existing data discovery and disseminationprotocols [3], [4], [5], [6], [7] employ the centralizedapproach in which, as shown in the top sub-figurein Fig. 1, data items can only be disseminated by the basestation. Unfortunately, this approach suffers from the singlepoint of failure as dissemination is impossible whenthe base station is not functioning or when the connectionbetween the base station and a node is broken. In addition,the centralized approach is inefficient, non-scalable, andvulnerable to security attacks that can be launched anywherealong the communication path [2]. Even worse,some WSNs do not have any base station at all. For example,for a WSN monitoring human trafficking in acountry’s border or a WSN deployed in a remote area tomonitor illicit crop cultivation, a base station becomes anattractive target to be attacked. For such networks, datadissemination is better to be carried out by authorized networkusers in a distributed manner.Additionally, distributed data discovery and disseminationis an increasingly relevant matter in WSNs, especiallyin the emergent context of shared sensor networks, wheresensing/communication infrastructures from multiple ownerswill be shared by applications from multiple users. Forexample, large scale sensor networks are built in recent_ D. He is with the School of Computer Science and Engineering, South ChinaUniversity of Technology, Guangzhou 510006, P.R. China, and also withthe College of Computer Science and Technology, Zhejiang University,Hangzhou 310027, P.R. China. E-mail: hedaojinghit@gmail.com._ S. Chan is with the Department of Electronic Engineering, City Universityof Hong Kong, Hong Kong SAR, P. R. China.E-mail: eeschan@cityu.edu.hk._ M. Guizani is with Qatar University, Qatar. E-mail: mguizani@ieee.org._ H. Yang is with the School of Computer Science and Engineering, Universityof Electronic Science and Technology of China, P. R. China.E-mail: haomyang@uestc.edu.cn._ B. Zhou is with the College of Computer Science, Zhejiang University,P. R. China. E-mail: zby@zju.edu.cn.Manuscript received 31 Dec. 2013; revised 29 Mar. 2014; accepted 31 Mar.2014. Date of publication 10 Apr. 2014; date of current version 6 Mar. 2015.recommended for acceptance by V. B. Misic.For information on obtaining reprints of this article, please send e-mail to:reprints@ieee.org, and reference the Digital Object Identifier below.Digital Object Identifier no. 10.1109/TPDS.2014.2316830IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 26, NO. 4, APRIL 2015 11291045-9219 _ 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.projects such as Geoss [9], NOPP [10] and ORION [11].These networks are owned by multiple owners and used byvarious authorized third-party users. Moreover, it isexpected that network owners and different users may havedifferent privileges of dissemination. In this context, distributedoperation by networks owners and users with differentprivileges will be a crucial issue, for which efficient solutionsare still missing.Motivated by the above observations, this paper has thefollowing main contributions:1) The need of distributed data discovery and disseminationprotocols is not completely new, but previouswork did not address this need. We study the functionalrequirements of such protocols, and set theirdesign objectives. Also, we identify the security vulnerabilitiesin previously proposed protocols.2) Based on the design objectives, we propose DiDrip.It is the first distributed data discovery and disseminationprotocol, which allows network owners andauthorized users to disseminate data items intoWSNs without relying on the base station. Moreover,our extensive analysis demonstrates thatDiDrip satisfies the security requirements of theprotocols of its kind. In particular, we apply theprovable security technique to formally provethe authenticity and integrity of the disseminateddata items in DiDrip.3) We demonstrate the efficiency of DiDrip in practiceby implementing it in an experimental WSN withresource-limited sensor nodes. This is also the firstimplementation of a secure and distributed data discoveryand dissemination protocol.The rest of this paper is structured as follows. In Section2, we first survey the existing data discovery anddissemination protocols, and then discuss their securityweaknesses. Section 3 describes the requirements for asecure and distributed extension of such protocols. Section4 presents the network, trust and adversary models.Section 5 describes DiDrip in details. Section 6 providestheoretical analysis of the security properties of DiDrip.Section 7 describes the implementation and experimentalresults of DiDrip via real sensor platforms. Finally,Section 8 concludes this paper.2 SECURITY VULNERABILITIES IN DATADISCOVERY AND DISSEMINATION2.1 Review of Existing ProtocolsThe underlying algorithm of both DIP and Drip is Trickle[12]. Initially, Trickle requires each node to periodicallybroadcast a summary of its stored data. When a node hasreceived an older summary, it sends an update to thatsource. Once all nodes have consistent data, the broadcastinterval is increased exponentially to save energy. However,if a node receives a new summary, it will broadcast thismore quickly. In other words, Trickle can disseminatenewly injected data very quickly. Among the existing protocols,Drip is the simplest one and it runs an independentinstance of Trickle for each data item.In practice, each data item is identified by a unique keyand its freshness is indicated by a version number. Forexample, for Drip, DIP and DHV, each data item is representedby a 3-tuple <key; version; data> , where key is usedto uniquely identify a data item, version indicates the freshnessof the data item (the larger the version, the fresher thedata), and data is the actual disseminated data (e.g., command,query or parameter).2.2 Security VulnerabilitiesAn adversary can first place some intruder nodes in the networkand then use them to alter the data being disseminatedor forge a data item. This may result in some importantparameters being erased or the entire network beingrebooted with wrong data. For example, consider a new dataitem (key, version, data) being disseminated. When anintruder node receives this new data item, it can broadcast amalicious data item (key, version_, data_), where version_ >version. If data_ is set to 0, the parameter identified by key willbe erased from all sensor nodes. Alternatively, if data_ is differentfrom data, all sensor nodes will update the parameteraccording to this forged data item. Note that the aboveattacks can also be launched if an adversary compromisessome nodes and has access to their key materials.In addition, since nodes executing Trickle are required toforward all new data items that they receive, an adversarycan launch denial-of-service (DoS) attacks to sensor nodesby injecting a large amount of bogus data items. As a result,the processing and energy resources of nodes are expendedFig. 1. System overview of centralized and distributed data discovery and dissemination approaches.1130 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 26, NO. 4, APRIL 2015to process and forward these bogus data items, rather thanon the intended functions. Any data discovery and disseminationprotocol based on Trickle or its variants is vulnerableto such a DoS attack.3 REQUIREMENTS AND DESIGN CONSIDERATIONA secure and distributed data discovery and disseminationprotocol should satisfy the following requirements:1) Distributed. Multiple authorized users should beallowed to simultaneously disseminate data itemsinto the WSN without relying on the base station.2) Supporting different user privileges. To provide flexibility,each user may be assigned a certain privilegelevel by the network owner. For example, a user canonly disseminate data items to a set of sensor nodeswith specific identities and/or in a specific localizedarea. Another example is that a user just has the privilegeto disseminate data items identified by somespecific keys.3) Authenticity and integrity of data items. A sensor nodeonly accepts data items disseminated by authorizedusers. Also, a sensor should be able to ensure thatreceived data items have not been modified duringthe dissemination process.4) User accountability. User accountability must be providedsince bad user behaviors and insider attacksshould be audited and pinpointed. That is, a sendershould not be able to deny the distribution of a dataitem. At the same time, an adversary cannot impersonateany legitimate user even if it has compromisedthe network owner or the other legitimateusers. In many applications, accountability is desirableas it enables collection of users’ activities. Forexample, from the dissemination record in sensornodes, the network owner can find out who disseminatesmost data. This requires the sensor nodes to beable to associate each disseminated data with thecorresponding user’s identity.5) Node compromise tolerance. The protocol should beresilient to node compromise attack no matter howmany nodes have been compromised, as long as thesubset of non-compromised nodes can still form aconnected graph with the trusted source.6) User collusion tolerance. Even if an adversary has compromisedsome users, a benign node should notgrant the adversary any privilege level beyond thatof the compromised users.7) DoS attacks resistance. The functions of the WSNshould not be disrupted by DoS attacks.8) Freshness. A node should be able to differentiatewhether an incoming data item is the newestversion.9) Low energy overhead. Most sensor nodes have limitedresources. Thus, it is very important that the securityfunctions incur low energy overhead, which can bedecomposed to communication and computationoverhead.10) Scalability. The protocol should be efficient even forlarge-scale WSNs with thousands of sensors andlarge user population.11) Dynamic participation. New sensor nodes and userscan be dynamically added to the network.In order to ensure security, each step of the existing datadiscovery and dissemination protocol runs should be identifiedand then protected. In other words, although code disseminationprotocols may share the same securityrequirements as listed above, their security solutions needto be designed in accordance with their characteristics. Consideringthe well known open-source code disseminationprotocol Deluge [1] as an example. Deluge uses an epidemicprotocol based on a page-by-page dissemination strategyfor efficient advertisement of metadata. A code image isdivided into fixed-size pages, and each page is further splitinto same-size packets. Due to such a way of decomposingcode images into packets, our proposed protocol is notapplicable for securing Deluge.The primary challenge of providing security functions inWSNs is the limited capabilities of sensor nodes in terms ofcomputation, energy and storage. For example, to provideauthentication function to disseminated data, a commonlyused solution is digital signature. That is, users digitallysign each packet individually and nodes need to verify thesignature before processing it. However, such an asymmetricmechanism incurs significant computational and communicationoverhead and is not applicable to sensor nodes.To address this problem, TESLA and its various extensionshave been proposed [13], [14], which are based on thedelayed disclosure of authentication keys, i.e., the key usedto authenticate a message is disclosed in the next message.Unfortunately, due to the authentication delay, these mechanismsare vulnerable to a flooding attack which causeseach sensor node to buffer all forged data items until thedisclosed key is received.Another possible approach to authentication is by symmetrickey cryptography. However, this approach is vulnerableto node compromise attack because once a node iscompromised, the globally shared secret keys are revealed.Here we choose digital signatures over other forms forupdate packet authentication. That is, the network ownerassigns to each network user a public/private key pair thatallows the user to digitally sign data items and thus authenticateshimself/herself to the sensor nodes. We propose twohybrid approaches to reduce the computation and communicationcost. These methods combine digital signature withefficient data Merkle hash tree and data hash chain, respectively.The main idea is that signature generation and verificationare carried out over multiple packets instead ofindividual packet. In this way, the computation cost perpacket is significantly reduced. Since elliptic curve cryptography(ECC) is computational and communication efficientcompared with the traditional public key cryptography,DiDrip is based on ECC.To prevent the network owner from impersonatingusers, user certificates are issued by a certificate authority ofa public key infrastructure (PKI), e.g., local police office.4 NETWORK, TRUST AND THREAT MODELS4.1 Network ModelAs shown in the bottom subfigure in Fig. 1, a generalWSN comprises a large number of sensor nodes. It isHE ET AL.: SECURE AND DISTRIBUTED DATA DISCOVERY AND DISSEMINATION IN WIRELESS SENSOR NETWORKS 1131administrated by the owner and accessible by many users.The sensor nodes are usually resource constrained withrespect to memory space, computation capability, bandwidth,and power supply. Thus, a sensor node can onlyperform a limited number of public key cryptographicoperations during the lifetime of its battery. The networkusers use some mobile devices to disseminate data itemsinto the network. The network owner is responsible for generatingkeying materials. It can be offline and is assumed tobe uncompromisable.4.2 Trust ModelNetworks users are assigned dissemination privileges bythe trusted authority in a PKI on behalf of the networkowner. However, the network owner may, for various reasons,impersonate network users to disseminate data items.4.3 Threat ModelThe adversary considered in this paper is assumed to becomputationally resourceful and can launch a wide rangeof attacks, which can be classified as external or insiderattacks. In external attacks, the adversary has no control ofany sensor node in the network. Instead, it would eavesdropfor sensitive information, inject forged messages,launch replay attack, wormhole attacks, DoS attacks andimpersonate valid sensor nodes. The communication channelmay also be jammed by the adversary, but this can onlylast for a certain period of time after which the adversarywill be detected and removed.By compromising either network users or sensor nodes,the adversary can launch insider attacks to the network. Thecompromised entities are regarded as insiders because theyare members of the network until they are identified. Theadversary controls these entities to attack the network in arbitraryways. For instance, they could be instructed to disseminatefalse or harmful data, launch attacks such as Sybil attacksor DoS attacks, and be non-cooperative with other nodes.5 DIDRIPReferring to the lower sub-figure in Fig. 1, DiDrip consists offour phases, system initialization, user joining, packet preprocessingand packet verification. For our basic protocol,in system initialization phase, the network owner creates itspublic and private keys, and then loads the public parameterson each node before the network deployment. In theuser joining phase, a user gets the dissemination privilegethrough registering to the network owner. In packet preprocessingphase, if a user enters the network and wants todisseminate some data items, he/she will need to constructthe data dissemination packets and then send them to thenodes. In the packet verification phase, a node verifies eachreceived packet. If the result is positive, it updates the dataaccording to the received packet. In the following, eachphase is described in detail. The notations used in thedescription are listed in Table 1. The information processingflow of DiDrip is illustrated in Fig. 2.5.1 System Initialization PhaseIn this phase, an ECC is set up. The network owner carriesout the following steps to derive a private key x and somepublic parameters fy; Q; p; q; hð:Þg. It selects an elliptic curveE over GFðpÞ, where p is a big prime number. Here Qdenotes the base point ofE while q is also a big prime numberand represents the order of Q. It then selects the private keyx 2 GFðqÞ and computes the public key y ¼ xQ. After that,the public parameters are preloaded in each node of the network.We consider 160-bit ECC as an example. In this case, yandQ are both 320 bits long while p and q are 160 bits long.5.2 User Joining PhaseThis phase is invoked when a user with the identity UIDj,say Uj, hopes to obtain privilege level. User Uj chooses theprivate key SKj 2 GFðqÞ and computes the public keyPKj ¼ SKj_Q. Here the length of UIDj is set to 2 bytes, inthis case, it can support 65,536 users. Similarly, assume that160-bit ECC is used, PKj and SKj are 320 bits and 160 bitslong, respectively. Then user Uj sends a 3-tuple <UIDj;Prij; PKj > to the network owner, where Prij denotes thedissemination privilege of user Uj. Upon receiving this message,the network owner generates the certificate Certj. Aform of the certificate consists of the following contents:Certj ¼ fUIDj; PKj; Prij; SIGxfhðUIDjkPKjkPrijÞg, wherethe length of Prij is set to 6 bytes, thus the length of Certj is88 bytes.TABLE 1NotationsFig. 2. Information processing flow in DiDrip.1132 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 26, NO. 4, APRIL 20155.3 Packet Pre-Processing PhaseAssume that a user, say Uj, enters the WSN and wants todisseminate n data items: di ¼ fkeyi; versioni; dataig, i ¼ 1;2; . . .; n. For the construction of the packets of the respectivedata, we have two methods, i.e., data hash chain and theMerkle hash tree [15].For data hash chain approach, a packet, say Pi is composedof packet header, di, and the hash value of packet Piþ1(i.e., Hiþ1 ¼ hðPiþ1Þ) which is used to verify the next packet,where i ¼ 1; . . .; n _ 1. Here each cryptographic hash Hi iscalculated over the full packet Pi, not just the data portion di,thereby establishing a chain of hashes. After that, user Ujuses his/her private key SKj to run an ECDSA sign operationto sign the hash value of the first data packet hðP1Þ and thencreates an advertisement packet P0, which consists of packetheader, user certificate Certj, hðP1Þ and the signatureSIGSKjfhðP1Þg. Similarly, the network owner assigns a predefinedkey to identify this advertisement packet.With the method of Merkle hash tree, user Uj builds aMerkle hash tree from the n data items in the following way.All the data items are treated as the leaves of the tree. A newset of internal nodes at the upper level is formed; each internalnode is computed as the hash value of the concatenationof two child nodes. This process is continued until the rootnode Hroot is formed, resulting in a Merkle hash tree withdepth D ¼ log2ðnÞ. Before disseminating the n data items,user Uj signs the root node with his/her private key SKj andthen transmits the advertisement packet P0 comprising usercertificate Certj,Hroot and SIGSKjfHrootg. Subsequently, userUj disseminates each data item along with the appropriateinternal nodes for verification purpose. Note that asdescribed above, user certificate Certj contains user identityinformation UIDj and dissemination privilege Prij. Beforethe network deployment, the network owner assigns a predefinedkey to identify this advertisement packet.5.4 Packet Verification PhaseWhen a sensor node, say Sj, receives a packet either from anauthorized user or from its one-hop neighbours, it firstchecks the packet’s key field1) If this is an advertisement packet (P0 fCertj; hðP1Þ;SIGSKjfhðP1Þgg for the data hash chain method whileP0 ¼ fCertj; root; SIGSKjfrootgg for the Merkle hashtree method), node Sj first pays attention to the legalityof the dissemination privilege Prij. For example,node Sj needs to check whether the identity of itself isincluded in the node identity set of Prij. If the resultis positive, node Sj uses the public key y of the networkowner to run an ECDSA verify operation toauthenticate the certificate. If the certificate Certj isvalid, node Sj authenticates the signature. If yes, forthe data hash chain method (respectively, the Merklehash tree method), node Sj stores <UIDj;H1 >(respectively, <UIDj; root>) included in the advertisementpacket; otherwise, node Sj simply discardsthe packet.2) Otherwise, it is a data packet Pi, where i ¼ 1; 2; . . . ;n. Node Sj executes the following procedure:For the data hash chain method, node Sj checks theauthenticity and integrity of Pi by comparing the hashvalue of Pi with Hi which has been received in thesame round and verified. If the result is positive andthe version number is new, node Sj then updates thedata identified by the key stored in Pi and replaces itsstored <round, Hi > by <round, Hiþ1 > (hereHiþ1 isincluded in packet Pi); otherwise, Pi is discarded.For Merkle hash tree method, node Sj checks theauthenticity and integrity of Pi through the alreadyverified root node received in the same round. If theresult is positive and the version number is new, nodeSj then updates the data identified by the key storedin Pi; otherwise, Pi is discarded.Remark: To prevent the network owner from impersonatingusers, system initialization and issue of user certificatescan be carried out by the certificate authority of aPKI rather than the network owner.Comparing the two methods, the data hash chainmethod incurs less communication overhead than the Merklehash tree method. In the data hash chain method, onlyone hash value of a packet is included in each packet. Onthe contrary, in the Merkle hash tree method, D (the treedepth) hash values are included in each packet. However, alimitation of the data hash tree method is that it just workswell in networks with in-sequence packet delivery. Such alimitation does not exist in the Merkle hash tree methodsince it allows each packet to be immediately authenticatedupon its arrival at a node. Therefore, the choice of eachmethod depends on this characteristic of the WSNs.5.5 EnhancementsWe can enhance the efficiency and security of DiDrip byadding additional mechanisms. Readers are referred to theAppendix of this paper for details.6 SECURITY ANALYSIS OF DIDRIPIn the following, we will analyze the security of DiDrip toverify that the security requirements mentioned in Section 3are satisfied.Distributed. As described in Section 5.2, in order to passthe signature verification of sensor nodes, each user has tosubmit his/her private key and dissemination privilege tothe network owner for registration. In addition, as describedabove, authorized users are able to carry out disseminationin a distributed manner.Supporting different user privileges. Activities of networkusers can be restricted by setting user privilege Prij, whichis contained in the user certificate. Since each user certificateis generated based on Prij, it will not pass the signature verificationat sensor nodes if Prij is modified. Thus, only thenetwork owner can modify Prij and then updates the certificateaccordingly.Authenticity and integrity of data items. With the Merklehash tree method (rsp. the data hash chain method), anauthorized user signs the root of the Merkle hash tree (rsp.the hash value of the first data packet hðP1Þ) with his/her privatekey. Using the network owner’s public key, each sensornode can authenticate the user certificate and obtains theuser’s public key. Then, using the user’s public key, eachnode can authenticate the root of the Merkle hash tree (rsp.HE ET AL.: SECURE AND DISTRIBUTED DATA DISCOVERY AND DISSEMINATION IN WIRELESS SENSOR NETWORKS 1133the hash value of the first data packet hðP1Þ). Subsequently,each node can authenticate other data packets based on theMerkle hash tree (rsp. the data hash chain). With the assumptionthat the network owner cannot be compromised, it isguaranteed that any forged or modified data items can beeasily detected by the authentication process.User accountability. Users’ identities and their disseminationactivities are exposed to sensor nodes. Thus, sensornodes can report such records to the network owner periodically.Since each user certificate is generated according tothe user identity, except the network owner, no one canmodify the user identity contained in the user certificatewhich passes the authentication. Therefore, users cannotrepudiate their activities.Node compromise and user collusion tolerance. As describedabove, for basic protocol, only the public parameters arepreloaded in each node. Even for the improved protocol,the public-key/dissemination-privilege pair of each networkuser is loaded into the nodes. Therefore, no matterhow many sensor nodes are compromised, the adversaryjust obtains the public parameters and the public-key/dissemination-privilege pair of each user. Clearly, the adversarycannot launch any attack by compromising sensornodes. As described in Section 5, even if some users collude,a benign node will not grant any dissemination privilegethat is beyond those of colluding users.Resistance to DoS attacks. There are DoS attacks against basicDiDrip by exploiting: (1) authentication delays, (2) the expensivesignature verifications, and (3) the Trickle algorithm.First, with the use of Merkle hash tree or data hash chain,each node can efficiently authenticate a data packet by afew hash operations. Second, using the message specificpuzzle approach, each node can efficiently verify a puzzlesolution to filter a fake signature message and to forward adata packet using Trickle without waiting for signature verification.Therefore, all the above DoS attacks are defended.DiDrip can successfully defeat all three types of DoSattacks even if there are compromised network users andsensor nodes. Indeed, without the private key and the unreleasedpuzzle keys of the network users, even an insideattacker cannot forge any signature/data packets.Ensurance of freshness. If the privilege of a user allows him/her to disseminate data items to his/her own set of nodes, theversion number in each item can ensure the freshness ofDiDrip. On the other hand, if a node receives data items frommultiple users, the version number can be replaced by a timestampto indicate the freshness of a data item. More specifically,a timestamp is attached into the root of the Merkle hashtree (or the hash value of the first data packet).Scalability. Different from centralized approaches, anauthorized user can enter the network and then disseminatesdata items into the targeted sensor nodes. Moreover,as to be demonstrated by our experiments in a testbed with24 TelosB motes in the next section, the security functions inour protocol have low impact on propagation delay. Notethat the increase in propagation delay is dominated by thesignature verification time incurred at the one-hop neighboringnodes of the authorized user. Thus, the proposedprotocol is efficient even in a large-scale WSN with thousandsof sensor nodes. Also, as shown in Section 5.2, ourprotocol can support a large number of users.Also, as described above, DiDrip can achieve dynamic participation.Moreover, in the next section, our implementationresults will demonstrate DiDrip has lowenergy overhead.In the following, we give the formal proof of the authenticityand integrity of the disseminated data items in DiDripbased on the three assumptions below:Assumption 1: There exist pseudo-random functionswhich are polynomially indistinguishable from truly randomfunctions.Assumption 2: There exist target collision-resistance(TCR) hash functions [16], where if for all probabilistic-polynomial-time (PPT) adversaries, say A, A have negligibleprobability in winning the following game: A first choose amessage m, and then A are given a random function hð:Þ. Towin, A must output m0 6¼m such that hðm0Þ ¼ hðmÞ. Notethat in our scheme, the TCR hash function can be implementedby the common hash functions, such as SHA-1.Assumption 3: ECC signature is existentially unforgeableunder adaptive chosen-message attacks. Note thatin our scheme, ECC signature can use the standardECDSA of 160 bits.Theorem 1. DiDrip achieves the authenticity and integrity of dataitems, assuming the indistinguishability between pseudo-randomnessand true randomness, and assuming that hð:Þ is aTCR hash function and ECC signature is existentially unforgeableunder adaptive chosen-message attacks.Proof. Our theorem follows from Theorem A.1 in [16], andthus here we only give a proof sketch briefly. To beginwith, we assume ECC signature generation and verificationguarantee authenticity and integrity of signed messages,and every receiver has obtained an authentic copyof the legitimate sender’s public key. And we also assumethat hð:Þ is a TCR hash function. Therefore, here the securityof DiDrip is proven based on the indistinguishabilitybetween pseudo-randomness and true randomness.First, there exists a PPT adversary A, which can defeatauthenticity of data items in DiDrip. This means that Acontrols the communication links and manages, withnon-negligible probability, to deliver a message m to areceiver R, such that the sender S has not sent m but Raccepts m as authentic and coming from S. Then there isa PPT adversary B that uses A to break the indistinguishabilitybetween pseudo-randomness and true randomnesswith non-negligible advantage. That is, B getsaccess to h (as an oracle) and can tell with non-negligibleprobability if h is a pseudorandom function (PRFð:Þ) orif h is a totally random function.To this end, B can query on inputs x of its choice and beanswered with hðxÞ. Hence first B simulates for A a networkwith a sender S and a receiver R. Then B works byrunning A in the way similar to that in [17]. Namely, Bchooses a number l 2 f1; . . .; ng at random, where n is thetotal number of packets to be sent in the data dissemination.Note that B hopes thatAwill forge the lth packet Pl.B grants access to the oracle h, which is either PRFð:Þor an ideal random function. B can adaptively query anarbitrarily chosen x to the oracle and get the outputwhich is either PRFðxÞ or a random value uniformlyselected from f0; 1g_. After performing polynomiallymany queries, B finally makes the decision of whether or1134 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 26, NO. 4, APRIL 2015not the oracle is PRFð:Þ or the ideal random function. Asa result, B wins the game if the decision is correct.We argue that B succeeds with non-negligible probabilitysketchily. If h is a truly random function then A hasonly negligible probability to successfully forge thepacket Pl in the data items. Therefore, if h is random thenB makes the wrong decision only with negligible probability.On the other hand, we have assumed that if theauthentication is done using PRFð:Þ then A forges somepacket with non-negligible probability _. It follows that ifh is PRFð:Þ then B makes the right decision with probabilityat least _=l (which is also non-negligible).In addition, if the adversary A is able to cause areceiver R to accept a forged packet Pl, it implies theadversary A is able to find a collision on hðP0lÞ ¼ hðPlÞ inthe packet Pl_1. However, according to Assumption 2,hð:Þ is a TCR hash function. Moreover, due to the unforgeabilityof ECC signature (Assumption 3), it is impossiblethat A hands R a forged initial packet from S (Fordata hash chain approach, A forges the signatureSIGSKj ðP1Þ; For Merkle hash tree method, A forges thesignature SIGSKj ðrootÞ. Therefore, the above contradictionsmeans also that the authenticity and integrity ofdata items in DiDrip. tu7 IMPLEMENTATION AND PERFORMANCEEVALUATIONWe evaluate DiDrip by implementing all components on anexperimental test-bed. Also, we choose Drip for performancecomparison.7.1 Implementation and Experimental SetupWe have written programs that execute the functions of thenetwork owner, user and sensor node. The network ownerand user side programs are C programs using OpenSSL [18]and running on laptop PCs (with 2 GB RAM) under Ubuntu11.04 environment with different computational power.Also, the sensor node side programs are written in nesCand run on resource-limited motes (MicaZ and TelosB). TheMicaZ mote features an 8-bit 8-MHz Atmel microcontrollerwith 4-kB RAM, 128-kB ROM, and 512 kB of flash memory.Also, the TelosB mote has an 8-MHz CPU, 10-kB RAM, 48-kB ROM, 1MB of flash memory, and an 802.15.4/ZigBeeradio. Our motes run TinyOS 2.x. Additionally, SHA-1 isused, and the key sizes of ECC are set to 128 bits, 160 and192 bits, respectively. Throughout this paper, unlessotherwise stated, all experiments on PCs (respectively, sensornodes) were repeated 100,000 times (respectively,1,000 times) for each measurement in order to obtain accurateaverage results.To implement DiDrip with the data hash chain method(rsp. the Merkle hash tree method), the following functionalitiesare added to the user side program of Drip: constructionof data hash chain (rsp. Merkle hash tree) of around of dissemination data, generation of the signaturepacket and all data packets. For obtaining version numberof each data item, the DisseminatorC and DisseminatorPmodules in the Drip nesC library has been modified toprovide an interface called DisseminatorVersion. Moreover,the proposed hash tree method is implemented withoutand with using the message specific puzzle approachpresented in Appendix, resulting in two implementationsof DiDrip; DiDrip1 and DiDrip2. In DiDrip1, when a nodereceives a signature/data packet with a new version number,it authenticates the packet before broadcasting it to itsnext-hop neighbours. On the other hand, in DiDrip2, anode only checks the puzzle solution in the packet beforebroadcasting the packets. We summarize the pros andcons of all related protocols in Table 2.Based on the design of DiDrip, we implement the verificationfunction for signature and data packets based onthe ECDSA verify function and SHA-1 hash function ofTinyECC 2.0 library [19] and add them to the Drip nesClibrary. Also, in our experiment, when a network user(i.e., a laptop computer) disseminates data items, it firstsends them to the serial port of a specific sensor node inthe network which is referred to as repeater. Then, therepeater carries out the dissemination on behalf of theuser using DiDrip.Similar to [7], we use a circuit to accurately measure thepower consumption of various cryptographic operationsexecuted in a mote. The Tektronix TDS 3034C digital oscilloscopeaccurately measures the voltage Vr across the resistor.Denoting the battery voltage as Vb (which is 3 volts in ourexperiments), the voltage across the mote Vm is then Vb _ Vr.Once Vr is measured, the current through the circuit I canbe obtained by using Ohm’s law. The power consumed bythe mote is then VmI. By also measuring the execution timeof the cryptographic operation, we can obtain the energyconsumption of the operation by multiplying the powerand execution time.7.2 Evaluation ResultsThe following metrics are used to evaluate DiDrip; memoryoverhead, execution time of cryptographic operations andpropagation delay, and energy overhead. The memoryoverhead measures the required data space in the implementation.The propagation delay is defined as the timefrom construction of a data hash chain until the parameterson all sensor nodes corresponding to a round of disseminateddata items are updated.TABLE 2The Pros and Cons of All Related ProtocolsHE ET AL.: SECURE AND DISTRIBUTED DATA DISCOVERY AND DISSEMINATION IN WIRELESS SENSOR NETWORKS 1135Table 3 shows the execution times of some importantoperations in DiDrip. For example, the execution times forthe system initialization phase and signing a random20-byte message (i.e., the output of SHA-1 function) are1.608 and 0.6348 ms on a 1.8-GHz Laptop PC, respectively.Thus, if SHA-1 is used, generating a user certificate or signinga message takes 0.6348 ms on a 1.8-GHz Laptop PC.Fig. 3 shows the execution times of SHA-1 hash function(extracted from TinyECC 2.0 [19]) on MicaZ and TelosBmotes. The inputs to the hash function are randomly generatednumbers with length varying from 24 to 156 bytes inincrements of 6 bytes. Note that, in our protocol, the hashfunction is applied to an entire packet. There are several reasonsthat, possibly, a packet contains a few tens of bytes.First, the advertisement packet has additional informationsuch as certificate and signature. Second, with the Merklehash tree method, each packet contains the disseminateddata item along with the related internal nodes of the treefor verification purpose. Third, although several bytes is atypical size of a data item, sometimes a disseminated dataitem may be a bit larger. Moreover, for sensors with IEEE802.15.4 compliant radios, the maximum payload size is 102bytes for each packet. Therefore, we have chosen a widerrange of input size to SHA-1 to provide readers a more completepicture of the performance. We perform the sameexperiment 10,000 times and take an average over them. Forexample, the execution times on a MicaZ mote for inputs of54 bytes, 114 bytes, and 156 bytes are 9.6788, 18.947, and28.0515 ms, respectively. Also, the execution times on aTelosB mote for inputs of 54, 114, and 156 bytes are 5.7263,10.7529, and 15.629 ms, respectively.To measure the execution time of public key cryptography,as shown in Table 41, we have implemented the ECCverification operation (with a random 20-byte number asthe output) of TinyECC 2.0 library [19] on MicaZ and TelosBmotes. For example, it is measured that the signature verificationtimes are 2.436 and 3.955 seconds, which are 252 and691 times longer than SHA-1 hash operation with a 54-byterandom number as input on MicaZ and TelosB motes,respectively. It can be seen packet authentication based onthe Merkle hash tree (or data hash chain) is much more efficient.Therefore, it is confirmed that DiDrip is suitable forsensor nodes with limited resources.Next, we compare the energy consumption of SHA-1 hashfunction and ECC verification under the condition that theradio of the mote is turned off. When a MicaZ mote is used inthe circuit, Vr ¼ 138 mV, I ¼ 6.7779 mA, Vm ¼ 2.8620 V,P ¼ 19.3983 mW. When a TelosB mote is used, Vr ¼ 38 mV,I ¼ 1.8664 mA, Vm ¼ 2.9620 V, P ¼ 5.5283 mW. With theexecution time obtained from Fig. 3, the energy consumptionon the motes due to the SHA-1 operation can be determined.For example, the energy consumption of SHA-1 operationwith a random 54-byte number as input on MicaZ andTelosB motes are 0.18775 and 0.03166 mJ, respectively. Also,the energy consumption of ECC signature verificationoperation on MicaZ and TelosB motes are 2835.2555 and1316.1777 mJ, respectively.Next, the impact of security functions on the propagationdelay is investigated in an experimental network as shownin Fig. 4. The network has 24 TelosB nodes arranged in a4 _ 6 grid. The distance between each node is about 35 cm,TABLE 3Running Time for Each Phase of the Basic Protocol of DiDrip (Except the Sensor Node Verification Phase)Fig. 3. The execution times of SHA-1 hash function on MicaZ and TelosBmotes.TABLE 4Running Time for ECC Signature VerificationFig. 4. The 4 _ 6 grid network of TelosB motes for measuring propagationdelay.1. Note that ECC-160 is faster than ECC-128, because the columnwidth of ECC-160 is set to 5 for hybrid multiplication optimizationwhile that of ECC-128 is set to 4.1136 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 26, NO. 4, APRIL 2015and the transmission power is configured to be the lowestlevel so that only one-hop neighbours are covered in thetransmission range. The repeater is acted by the node locatingat the vertex of the grid.In the experiments, the packet delivery rate from the networkuser is 5 packets/s. The lengths of round and datafields in a data item are set to 4 bits and 2 bytes, respectively.A hash function with 8-byte truncated output is usedto construct data hash chains. An ECC-160 signature is 40bytes long. Each experiment is repeated 20 times to obtainan average measurement. Figs. 5 and 6 plot the averagepropagation delays of Drip, DiDrip1, and DiDrip2 when thedata hash chain and Merkle hash tree methods areemployed, respectively. It can be seen that the propagationdelay almost increases linearly with the number of dataitems per round for all three protocols. Moreover, the securityfunctions in DiDrip2 have low impact on propagationdelay. For these five experiments of the data hash chainmethod, DiDrip2 is just 3.448, 4.158, 3.222, 3.919 and 2.855 smore than that of Drip, respectively. Note that the increasein propagation delay is dominated by the signature verificationtime incurred at the one-hop neighboring nodes of thebase station. This is because each node carries out signatureverification only after forwarding data packets (with validpuzzle solutions).Table 5 shows the memory (ROM and RAM) usage ofDiDrip2 (with the data hash chain method) on MicaZ andTelosB motes for the case of four data items per round. Thecode size of Drip and a set of verification functions fromTinyECC (secp128r1, secp160r1 and secp192r1, which areimplementations based on various elliptic curves accordingto the Standards for Efficient Cryptography Group) areincluded for comparison. For example, the size of DiDripimplementation corresponds to 26.18 and 56.82 percent of theRAMand ROMcapacities of TelosB, respectively. Clearly, theROM and RAM consumption of DiDrip is more than that ofDrip because of the extra security functions. Moreover, it canbe seen thatmajority of the increasedROMis due to TinyECC.8 CONCLUSION AND FUTURE WORKIn this paper, we have identified the security vulnerabilitiesin data discovery and dissemination when used inWSNs, which have not been addressed in previousresearch. Also, none of those approaches support distributedoperation. Therefore, in this paper, a secure and distributeddata discovery and dissemination protocolnamed DiDrip has been proposed. Besides analyzing thesecurity of DiDrip, this paper has also reported the evaluationresults of DiDrip in an experimental network ofresource-limited sensor nodes, which shows that DiDripis feasible in practice. We have also given a formal proofof the authenticity and integrity of the disseminated dataitems in DiDrip. Also, due to the open nature of wirelesschannels, messages can be easily intercepted. Thus, in thefuture work, we will consider how to ensure data confidentialityin the design of secure and distributed datadiscovery and dissemination protocols.APPENDIXFURTHER IMPROVEMENT OF DIDRIP SECURITY ANDEFFICIENCYBy the basic DiDrip protocol, we can achieve secure and distributeddata discovery and dissemination. To furtherenhance the protocol, here we propose two modifications toFig. 5. Propagation delay comparison of three protocols when the datahash chain method is employed.Fig. 6. Propagation delay comparison of three protocols when the Merklehash tree method is employed.TABLE 5Code Sizes (Bytes) on MicaZ and TelosB MotesHE ET AL.: SECURE AND DISTRIBUTED DATA DISCOVERY AND DISSEMINATION IN WIRELESS SENSOR NETWORKS 1137improve the efficiency and security of DiDrip. For brevity,only those parts of the basic protocol that require changeswill be presented.Avoiding the Generation, Transmission andVerification of CertificatesThere are some efficiency problems caused by the generation,transmission, and verification of certificates. First, it isnot efficient in communication, as the certificate has to betransmitted along with the advertisement packet acrossevery hop as the message propagates in the WSN. A largeper-message overhead will result in more energy consumptionon each sensor node. Second, to authenticate each advertisementpacket, it always takes two expensive signatureverification operations because the certificate should alwaysbe authenticated first. To address these challenges, a feasibleapproach is that before the network deployment, the publickey/dissemination-privilege pair of each network user isloaded into the sensor nodes by the network owner. Once anew user joins the network after the network deployment,the network owner can notify the sensor nodes of the user’spublic key/dissemination privilege through using the privatekey of itself. The detailed description is as follows.User Joining PhaseAccording to the basic protocol of DiDrip, user Uj generatesits public and private keys and sends a 3-tuple<UIDj; Prij; PKj > to the network owner. When the networkowner receives the 3-tuple, it no longer generates thecertificate Certj. Instead, it signs the 3-tuple with its privatekey and sends it to the sensor nodes. Finally, each nodestores the 3-tuple.Packet Pre-Processing PhaseThe user certificate Certj stored in packet P0 is replaced byUIDj.Packet Verification PhaseIf this is an advertisement packet, according to the receivedidentity UIDj, node Sj first picks up the dissemination privilegePrij from its storage and then pays attention to thelegality of Prij. If the result is positive, node Sj uses the publickey PKj from its storage to run an ECDSA verify operationto authenticate the signature; otherwise, node Sjsimply discards the packet. Note that node Sj does not needto authenticate the certificate.As described above, the public-key/dissemination-privilegepair <UIDj; Prij; PKj > of each network user is just2 þ 6 þ 40 ¼ 48 bytes. Therefore, assuming the protocolsupports 500 network users, the code size is about 23 KB.We consider the resource-limited sensor nodes such asTelosB motes as examples. The 1-MB Flash memory isenough for storing these public parameters.Message Specific Puzzle Approach for Resistance toDoS attacksDiDrip uses a digital signature to bootstrap the authenticationof a round of data discovery and dissemination. Thisauthentication is vulnerable to DoS attacks. That is, anadversary may flood a lot of illegal signature message (i.e.,advertisement messages in this paper) to the sensor nodes toexhaust their resources and render them less capable of processingthe legitimate signature messages. Such an attack canbe defended by applying the message specific puzzleapproach [2]. This approach requires each signature messageto contain a puzzle solution. When a node receives a signaturemessage, it first checks that the puzzle solution is correctbefore verifying the signature. There are two characteristicsof the puzzles. First, the puzzles are difficult to be solved buttheir solutions are easy to be verified. Second, there is a tighttime limit to solve a puzzle. This discourages adversaries tolaunch the DoS attack even if they are computationally powerful.More details about this approach can be found in [2].Another advantage of applying the message specific puzzleis to reduce the dissemination delay, which is the time fora disseminated packet to reach all nodes in a WSN. Recallthat in step 1.a) of the packet verification phase, when a nodereceives the signature packet, it first carries out the signatureverification before using the Trickle algorithm to broadcastthe signature packet. This means that the disseminationdelay depends on the signature verification time tsv. On theother hand, when the message specific puzzle approach isapplied, a node can just verify the validity of puzzle solutionbefore broadcasting the signature packet. Then, the disseminationdelay only depends on the puzzle solution verificationtime tpv. Since tpv _ tsv, the dissemination delay issignificantly reduced. Moreover, the reduction in disseminationdelay is proportional to the network size. This is demonstratedby the experiments presented in Section 7.2.ACKNOWLEDGMENTSThis research is supported by a strategic research grant fromCity University of Hong Kong [Project No. 7004054], theFundamental Research Funds for the Central Universities,and the Specialized Research Fund for the Doctoral Programof Higher Education. D. He is the correspondingauthor of this article.and MEng degrees from the Harbin Institute ofTechnology, China, and the PhD degree fromZhejiang University, China, all in computerscience in 2007, 2009, and 2012, respectively.He is with the School of Computer Science andEngineering, South China University of Technology,P.R. China, and also with the College ofComputer Science and Technology, ZhejiangUniversity, P.R. China. His research interestsinclude network and systems security. He is anassociate editor or on the editorial board of some international journalssuch as IEEE Communications Magazine, Springer Journal of WirelessNetworks, Wiley’s Wireless Communications and Mobile ComputingJournal, Journal of Communications and Networks, Wiley’s Security andCommunication Networks Journal, and KSII Transactions on Internetand Information Systems. He is a member of the IEEE.Sammy Chan (S’87-M’89) received the BE andMEngSc degrees in electrical engineering fromthe University of Melbourne, Australia, in 1988 and1990, respectively, and the PhD degree in communicationengineering from the Royal MelbourneInstitute of Technology, Australia, in 1995. From1989 to 1994, he was with Telecom AustraliaResearch Laboratories, first as a research engineer,and between 1992 and 1994 as a seniorresearch engineer and project leader. SinceDecember 1994, he has been with the Departmentof Electronic Engineering, City University of Hong Kong, where he is currentlyan associate professor. He is a member of the IEEE.Mohsen Guizani (S’85-M’89-SM’99-F’09)received the BS (with distinction) and MSdegrees in electrical engineering, the MS andPhD degrees in computer engineering in 1984,1986, 1987, and 1990, respectively, from SyracuseUniversity, Syracuse, New York. He is currentlya professor and the associate vicepresident for Graduate Studies at Qatar University,Qatar. His research interests include computernetworks, wireless communications andmobile computing, and optical networking. Hecurrently serves on the editorial boards of six technical journals andthe founder and EIC of “Wireless Communications and MobileComputing” Journal published by John Wiley (http://www.interscience.wiley.com/jpages/1530-8669/). He is a fellow of the IEEE and a seniormember of ACM.Haomiao Yang (M’12) received the MS and PhDdegrees in computer applied technology from theUniversity of Electronic Science and Technologyof China (UESTC) in 2004 and 2008, respectively.From 2012 to 2013, he worked as a postdoctoralfellow at Kyungil University, Republic ofKorea. Currently, he is an associate professor atthe School of Computer Science and Engineering,UESTC, China. His research interestsinclude cryptography, cloud security, and bigdata security. He is a member of the IEEE.Boyang Zhou is currently working toward thePhD degree from the College of Computer Scienceat Zhejiang University. His research areasinclude software-defined networking, futureinternet architecture and flexible reconfigurablenetworks.” For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.HE ET AL.: SECURE AND DISTRIBUTED DATA DISCOVERY AND DISSEMINATION IN WIRELESS SENSOR NETWORKS 1139