Authenticated Key Exchange Protocols for ParallelNetwork File SystemsHoon Wei Lim Guomin YangAbstract—We study the problem of key establishment for securemany-to-many communications. The problem is inspired bythe proliferation of large-scale distributed file systems supportingparallel access to multiple storage devices. Our work focuses onthe current Internet standard for such file systems, i.e., parallelNetwork File System (pNFS), which makes use of Kerberos toestablish parallel session keys between clients and storage devices.Our review of the existing Kerberos-based protocol shows thatit has a number of limitations: (i) a metadata server facilitatingkey exchange between the clients and the storage devices hasheavy workload that restricts the scalability of the protocol; (ii)the protocol does not provide forward secrecy; (iii) the metadataserver generates itself all the session keys that are used betweenthe clients and storage devices, and this inherently leads to keyescrow. In this paper, we propose a variety of authenticatedkey exchange protocols that are designed to address the aboveissues. We show that our protocols are capable of reducing up toapproximately 54% of the workload of the metadata server andconcurrently supporting forward secrecy and escrow-freeness.All this requires only a small fraction of increased computationoverhead at the client.Keywords-Parallel sessions, authenticated key exchange, networkfile systems, forward secrecy, key escrow.I. INTRODUCTIONIn a parallel file system, file data is distributed acrossmultiple storage devices or nodes to allow concurrent accessby multiple tasks of a parallel application. This is typicallyused in large-scale cluster computing that focuses on highperformance and reliable access to large datasets. That is,higher I/O bandwidth is achieved through concurrent accessto multiple storage devices within large compute clusters;while data loss is protected through data mirroring usingfault-tolerant striping algorithms. Some examples of highperformanceparallel file systems that are in production useare the IBM General Parallel File System (GPFS) [48], GoogleFile System (GoogleFS) [21], Lustre [35], Parallel Virtual FileSystem (PVFS) [43], and Panasas File System [53]; whilethere also exist research projects on distributed object storagesystems such as Usra Minor [1], Ceph [52], XtreemFS [25],and Gfarm [50]. These are usually required for advancedscientific or data-intensive applications such as, seismic dataprocessing, digital animation studios, computational fluid dynamics,and semiconductor manufacturing. In these environments,hundreds or thousands of file system clients share dataand generate very high aggregate I/O load on the file systemsupporting petabyte- or terabyte-scale storage capacities.H.W. Lim is with National University of Singapore. Email:hoonwei@nus.edu.sg.G. Yang is with University of Wollongong, Australia. Email:gyang@uow.edu.au.Independent of the development of cluster and highperformancecomputing, the emergence of clouds [5], [37]and the MapReduce programming model [13] has resultedin file systems such as the Hadoop Distributed File System(HDFS) [26], Amazon S3 File System [6], and Cloud-Store [11]. This, in turn, has accelerated the wide-spreaduse of distributed and parallel computation on large datasetsin many organizations. Some notable users of the HDFSinclude AOL, Apple, eBay, Facebook, Hewlett-Packard, IBM,LinkedIn, Twitter, and Yahoo! [23].In this work, we investigate the problem of secure manyto-many communications in large-scale network file systemsthat support parallel access to multiple storage devices. Thatis, we consider a communication model where there are alarge number of clients (potentially hundreds or thousands)accessing multiple remote and distributed storage devices(which also may scale up to hundreds or thousands) in parallel.Particularly, we focus on how to exchange key materialsand establish parallel secure sessions between the clientsand the storage devices in the parallel Network File System(pNFS) [46]—the current Internet standard—in an efficientand scalable manner. The development of pNFS is driven byPanasas, Netapp, Sun, EMC, IBM, and UMich/CITI, and thusit shares many common features and is compatible with manyexisting commercial/proprietary network file systems.Our primary goal in this work is to design efficient andsecure authenticated key exchange protocols that meet specificrequirements of pNFS. Particularly, we attempt to meet thefollowing desirable properties, which either have not beensatisfactorily achieved or are not achievable by the currentKerberos-based solution (as described in Section II):• Scalability – the metadata server facilitating access requestsfrom a client to multiple storage devices shouldbear as little workload as possible such that the serverwill not become a performance bottleneck, but is capableof supporting a very large number of clients;• Forward secrecy – the protocol should guarantee thesecurity of past session keys when the long-term secretkey of a client or a storage device is compromised [39];and• Escrow-free – the metadata server should not learn anyinformation about any session key used by the client andthe storage device, provided there is no collusion amongthem.The main results of this paper are three new provablysecure authenticated key exchange protocols. Our protocols,progressively designed to achieve each of the above properties,1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems2demonstrate the trade-offs between efficiency and security.We show that our protocols can reduce the workload ofthe metadata server by approximately half compared to thecurrent Kerberos-based protocol, while achieving the desiredsecurity properties and keeping the computational overhead atthe clients and the storage devices at a reasonably low level.We define an appropriate security model and prove that ourprotocols are secure in the model.In the next section, we provide some background on pNFSand describe its existing security mechanisms associated withsecure communications between clients and distributed storagedevices. Moreover, we identify the limitations of the currentKerberos-based protocol in pNFS for establishing securechannels in parallel. In Section III, we describe the threatmodel for pNFS and the existing Kerberos-based protocol. InSection IV, we present our protocols that aim to address thecurrent limitations. We then provide formal security analysesof our protocols under an appropriate security model, as wellas performance evaluation in Sections VI and VII, respectively.In Section VIII, we describe related work, and finally inSection IX, we conclude and discuss some future work.II. INTERNET STANDARD — NFSNetwork File System (NFS) [46] is currently the sole filesystem standard supported by the Internet Engineering TaskForce (IETF). The NFS protocol is a distributed file systemprotocol originally developed by Sun Microsystems that allowsa user on a client computer, which may be diskless, to accessfiles over networks in a manner similar to how local storageis accessed [47]. It is designed to be portable across differentmachines, operating systems, network architectures, and transportprotocols. Such portability is achieved through the use ofRemote Procedure Call (RPC) [51] primitives built on top ofan eXternal Data Representation (XDR) [15]; with the formerproviding a procedure-oriented interface to remote services,while the latter providing a common way of representing a setof data types over a network. The NFS protocol has since thenevolved into an open standard defined by the IETF NetworkWorking Group [49], [9], [45]. Among the current key featuresare filesystem migration and replication, file locking, datacaching, delegation (from server to client), and crash recovery.In recent years, NFS is typically used in environments whereperformance is a major factor, for example, high-performanceLinux clusters. The NFS version 4.1 (NFSv4.1) [46] protocol,the most recent version, provides a feature called parallel NFS(pNFS) that allows direct, concurrent client access to multiplestorage devices to improve performance and scalability. Asdescribed in the NFSv4.1 specification:When file data for a single NFS server is storedon multiple and/or higher-throughput storage devices(by comparison to the server’s throughput capability),the result can be significantly better file accessperformance.pNFS separates the file system protocol processing into twoparts: metadata processing and data processing. Metadata is informationabout a file system object, such as its name, locationwithin the namespace, owner, permissions and other attributes.The entity that manages metadata is called a metadata server.On the other hand, regular files’ data is striped and storedacross storage devices or servers. Data striping occurs in atleast two ways: on a file-by-file basis and, within sufficientlylarge files, on a block-by-block basis. Unlike NFS, a read orwrite of data managed with pNFS is a direct operation betweena client node and the storage system itself. Figure 1 illustratesthe conceptual model of pNFS.Storage access protocol(direct, parallel data exchange)pNFS protocol(metadata exchange)Control protocol(state synchronization)Storage devices or servers(file, block, object storage)Metadata serverClients(heterogeneous OSes)Fig. 1. The conceptual model of pNFS.More specifically, pNFS comprises a collection of threeprotocols: (i) the pNFS protocol that transfers file metadata,also known as a layout,1 between the metadata server anda client node; (ii) the storage access protocol that specifieshow a client accesses data from the associated storage devicesaccording to the corresponding metadata; and (iii) the controlprotocol that synchronizes state between the metadata serverand the storage devices.2A. Security ConsiderationEarlier versions of NFS focused on simplicity and efficiency,and were designed to work well on intranets and local networks.Subsequently, the later versions aim to improve accessand performance within the Internet environment. However,security has then become a greater concern. Among manyother security issues, user and server authentication withinan open, distributed, and cross-domain environment are acomplicated matter. Key management can be tedious andexpensive, but an important aspect in ensuring security ofthe system. Moreover, data privacy may be critical in highperformanceand parallel applications, for example, those associatedwith biomedical information sharing [28], [44], financialdata processing & analysis [20], [34], and drug simulation &discovery [42]. Hence, distributed storage devices pose greaterrisks to various security threats, such as illegal modificationor stealing of data residing on the storage devices, as well asinterception of data in transit between different nodes within1A layout can be seen as a map, describing how a file is distributed acrossthe data storage system. When a client holds a layout, it is granted the abilityto directly access the byte-range at the storage location specified in the layout.2Note that the control protocol is not specified in NFSv4.1. It can take manyforms, allowing vendors the flexibility to compete on performance, cost, andfeatures.1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems3the system. NFS (since version 4), therefore, has been mandatingthat implementations support end-to-end authentication,where a user (through a client) mutually authenticates to anNFS server. Moreover, consideration should be given to theintegrity and privacy (confidentiality) of NFS requests andresponses [45].The RPCSEC GSS framework [17], [16] is currently thecore security component of NFS that provides basic securityservices. RPCSEC GSS allows RPC protocols to access theGeneric Security Services Application Programming Interface(GSS-API) [33]. The latter is used to facilitate exchange of credentialsbetween a local and a remote communicating parties,for example between a client and a server, in order to establisha security context. The GSS-API achieves these through aninterface and a set of generic functions that are independentof the underlying security mechanisms and communicationprotocols employed by the communicating parties. Hence,with RPCSEC GSS, various security mechanisms or protocolscan be employed to provide services such as, encrypting NFStraffic and performing integrity check on the entire body of anNFSv4 call.Similarly, in pNFS, communication between the client andthe metadata server are authenticated and protected throughRPCSEC GSS. The metadata server grants access permissions(to storage devices) to the client according to pre-definedaccess control lists (ACLs).3 The client’s I/O request to astorage device must include the corresponding valid layout.Otherwise, the I/O request is rejected. In an environment whereeavesdropping on the communication between the client andthe storage device is of sufficient concern, RPCSEC GSS isused to provide privacy protection [46].B. Kerberos & LIPKEYIn NFSv4, the Kerberos version 5 [32], [18] and the LowInfrastructure Public Key (LIPKEY) [14] GSS-API mechanismsare recommended, although other mechanisms may alsobe specified and used. Kerberos is used particularly for userauthentication and single sign-on, while LIPKEY provides anTLS/SSL-like model through the GSS-API, particularly forserver authentication in the Internet environment.User and Server Authentication. Kerberos, a widely deployednetwork authentication protocol supported by all majoroperating systems, allows nodes communicating over a nonsecurenetwork to perform mutual authentication. It works ina client-server model, in which each domain (also known asrealm) is governed by a Key Distribution Center (KDC), actingas a server that authenticates and provides ticket-grantingservices to its users (through their respective clients) withinthe domain. Each user shares a password with its KDC anda user is authenticated through a password-derived symmetrickey known only between the user and the KDC. However,one security weakness of such an authentication method isthat it may be susceptible to an off-line password guessingattack, particularly when a weak password is used to derive3Typically, operating system principles are matched to a set of user andgroup access control lists.a key that encrypts a protocol message transmitted betweenthe client and the KDC. Furthermore, Kerberos has strict timerequirements, implying that the clocks of the involved hostsmust be synchronized with that of the KDC within configuredlimits.Hence, LIPKEY is used instead to authenticate the clientwith a password and the metadata server with a public keycertificate, and to establish a secure channel between the clientand the server. LIPKEY leverages the existing Simple Public-Key Mechanism (SPKM) [2] and is specified as an GSSAPImechanism layered above SPKM, which in turn, allowsboth unilateral and mutual authentication to be accomplishedwithout the use of secure time-stamps. Through LIPKEY,analogous to a typical TLS deployment scenario that consistsof a client with no public key certificate accessing a serverwith a public key certificate, the client in NFS [14]:• obtains the metadata server’s certificate;• verifies that it was signed by a trusted CertificationAuthority (CA);• generates a random session symmetric key;• encrypts the session key with the metadata server’s publickey; and• sends the encrypted session key to the server.At this point, the client and the authenticated metadata serverhave set up a secure channel. The client can then provide auser name and a password to the server for user authentication.Single Sign-on. In NFS/pNFS that employs Kerberos, eachstorage device shares a (long-term) symmetric key with themetadata server (which acts as the KDC). Kerberos then allowsthe client to perform single sign-on, such that the client isauthenticated once to the KDC for a fixed period of time butmay be allowed access to multiple storage devices governed bythe KDC within that period. This can be summarized in threerounds of communication between the client, the metadataserver, and the storage devices as follows:1) the client and the metadata server perform mutual authenticationthrough LIPKEY (as described before), and theserver issues a ticket-granting ticket (TGT) to the clientupon successful authentication;2) the client forwards the TGT to a ticket-granting server(TGS), typically the same entity as the KDC, in orderto obtain one or more service tickets (each containinga session key for access to a storage device), and validlayouts (each presenting valid access permissions to astorage device according to the ACLs);3) the client finally presents the service tickets and layoutsto the corresponding storage devices to get access to thestored data objects or files.We describe the above Kerberos-based key establishmentprotocol in more detail in Section III-C.Secure storage access. The session key generated by theticket-granting server (metadata server) for a client and astorage device during single sign-on can then be used in thestorage access protocol. It protects the integrity and privacyof data transmitted between the client and the storage device.Clearly, the session key and the associated layout are validonly within the granted validity period.1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems4C. Current LimitationsThe current design of NFS/pNFS focuses on interoperability,instead of efficiency and scalability, of various mechanismsto provide basic security. Moreover, key establishment betweena client and multiple storage devices in pNFS are basedon those for NFS, that is, they are not designed specificallyfor parallel communications. Hence, the metadata server is notonly responsible for processing access requests to storage devices(by granting valid layouts to authenticated and authorizedclients), but also required to generate all the correspondingsession keys that the client needs to communicate securelywith the storage devices to which it has been granted access.Consequently, the metadata server may become a performancebottleneck for the file system. Moreover, such protocol designleads to key escrow. Hence, in principle, the server can learn allinformation transmitted between a client and a storage device.This, in turn, makes the server an attractive target for attackers.Another drawback of the current approach is that pastsession keys can be exposed if a storage device’s long-term keyshared with the metadata server is compromised. We believethat this is a realistic threat since a large-scale file system mayhave thousands of geographically distributed storage devices.It may not be feasible to provide strong physical security andnetwork protection for all the storage devices.III. PRELIMINARIESA. NotationWe let M denote a metadata server, C denote a client, andS denote a storage device. Let entity X; Y 2 fM;C; Sg, wethen use IDX to denote a unique identity of X, and KXto denote X’s secret (symmetric) key; while KXY denotes asecret key shared between X and Y , and sk denotes a sessionkey.Moreover, we let E(K;m) be a standard (encryption only)symmetric key encryption function and let E(K;m) be anauthenticated symmetric key encryption function, where bothfunctions take as input a key K and a message m. Finally, weuse t to represent a current time and _ to denote a layout. Wemay introduce other notation as required.B. Threat AssumptionsExisting proposals [19], [40], [29], [30], [31] on securelarge-scale distributed file systems typically assume that boththe metadata server and the storage device are trusted entities.On the other hand, no implicit trust is placed on the clients.The metadata server is trusted to act as a reference monitor,issue valid layouts containing access permissions, and sometimeseven generate session keys (for example, in the case ofKerberos-based pNFS) for secure communication between theclient and the storage devices. The storage devices are trustedto store data and only perform I/O operations upon authorizedrequests. However, we assume that the storage devices areat a much higher risk of being compromised compared tothe metadata server, which is typically easier to monitor andprotect in a centralized location. Furthermore, we assume thatthe storage devices may occasionally encounter hardware orsoftware failure, causing the data stored on them no longeraccessible.We note that this work focuses on communication security.Hence, we assume that data transmitted between the clientand the metadata server, or between the client and the storagedevice can be easily eavesdropped, modified or deleted by anadversary. However, we do not address storage related securityissues in this work. Security protection mechanisms for dataat rest are orthogonal to our protocols.C. Kerberos-based pNFS ProtocolFor the sake of completeness, we describe the key establishmentprotocol4 recommended for pNFS in RFC 5661 betweena client C and n storage devices Si, for 1 _ i _ n, through ametadata server M in Figure 2. We will compare the efficiencyof the pNFS protocol against ours in Section VII.During the setup phase, we assume that M establishes ashared secret key KMSi with each Si. Here, KC is a keyderived from C’s password, that is also known by M; whileT plays the role of a ticket-granting server (we simply assumethat it is part of M). Also, prior to executing the protocol inFigure 2, we assume that C and M have already setup a securechannel through LIPKEY (as described in Section II-B).Once C has been authenticated by M and granted accessto S1; : : : ; Sn, it receives a set of service ticketsE(KMSi ; IDC; t; ski), session keys ski, and layouts5 _i (forall i 2 [1; n]) from T, as illustrated in step (4) of the protocol.Clearly, we assume that C is able to uniquely extract eachsession key ski from E(KCT ; sk1; : : : ; skn). Since the sessionkeys are generated by M and transported to Si through C, nointeraction is required between C and Si (in terms of keyexchange) in order to agree on a session key. This keeps thecommunication overhead between the client and each storagedevice to a minimum in comparison with the case where keyexchange is required. Moreover, the computational overheadfor the client and each storage device is very low since theprotocol is mainly based on symmetric key encryption.The message in step (6) serves as key confirmation, that isto convince C that Si is in possession of the same session keythat C uses.IV. OVERVIEW OF OUR PROTOCOLSWe describe our design goals and give some intuition ofa variety of pNFS authenticated key exchange6 (pNFS-AKE)protocols that we consider in this work. In these protocols,we focus on parallel session key establishment between aclient and n different storage devices through a metadataserver. Nevertheless, they can be extended straightforwardlyto the multi-user setting, i.e., many-to-many communicationsbetween clients and storage devices.4For ease of exposition, we do not provide complete details of the protocolparameters.5We assume that a layout (containing the client’s identity, file objectmapping information, and access permissions) is typically integrity protectedand it can be in the form of a signature or MAC.6Without loss of generality, we use the term “key exchange” here, althoughkey establishment between two parties can be based on either key transportor key agreement [39].1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems5(1) C ! M : IDC(2) M ! C : E(KC;KCT ), E(KT ; IDC; t;KCT )(3) C ! T : IDS1 ; : : : ; IDSn, E(KT ; IDC; t;KCT ), E(KCT ; IDC; t)(4) T ! C : _1; : : : ; _n, E(KMS1 ; IDC; t; sk1); : : : ;E(KMSn; IDC; t; skn), E(KCT ; sk1; : : : ; skn)(5) C ! Si : _i;E(KMSi ; IDC; t; ski), E(ski; IDC; t)(6) Si ! C : E(ski; t + 1)Fig. 2. A simplified version of the Kerberos-based pNFS protocol.A. Design GoalsIn our solutions, we focus on efficiency and scalabilitywith respect to the metadata server. That is, our goal is toreduce the workload of the metadata server. On the otherhand, the computational and communication overhead for boththe client and the storage device should remain reasonablylow. More importantly, we would like to meet all these goalswhile ensuring at least roughly similar security as that of theKerberos-based protocol shown in Section III-C. In fact, weconsider a stronger security model with forward secrecy forthree of our protocols such that compromise of a long-termsecret key of a client C or a storage device Si will not exposethe associated past session keys shared between C and Si.Further, we would like an escrow-free solution, that is, themetadata server does not learn the session key shared betweena client and a storage device, unless the server colludes witheither one of them.B. Main IdeasRecall that in Kerberos-based pNFS, the metadata server isrequired to generate all service tickets E(KMSi ; IDC; t; ski)and session keys ski between C and Si for all 1 _ i _ n,and thus placing heavy workload on the server. In our solutions,intuitively, C first pre-computes some key materials andforward them to M, which in return, issues the corresponding“authentication tokens” (or service tickets). C can then, whenaccessing Si (for all i), derive session keys from the precomputedkey materials and present the corresponding authenticationtokens. Note here, C is not required to compute thekey materials before each access request to a storage device,but instead this is done at the beginning of a pre-definedvalidity period v, which may be, for example, a day or week ormonth. For each request to access one or more storage devicesat a specific time t, C then computes a session key from thepre-computed material. This way, the workload of generatingsession keys is amortized over v for all the clients within thefile system. Our three variants of pNFS-AKE protocols can besummarized as follows:• pNFS-AKE-I: Our first protocol can be regarded as amodified version of Kerberos that allows the client togenerate its own session keys. That is, the key materialused to derive a session key is pre-computed by theclient for each v and forwarded to the correspondingstorage device in the form of an authentication tokenat time t (within v). As with Kerberos, symmetric keyencryption is used to protect the confidentiality of secretinformation used in the protocol. However, the protocoldoes not provide any forward secrecy. Further, the keyescrow issue persists here since the authentication tokenscontaining key materials for computing session keys aregenerated by the server.• pNFS-AKE-II: To address key escrow while achievingforward secrecy simultaneously, we incorporate a Diffie-Hellman key agreement technique into Kerberos-likepNFS-AKE-I. Particularly, the client C and the storagedevice Si each now chooses a secret value (that is knownonly to itself) and pre-computes a Diffie-Hellman keycomponent. A session key is then generated from both theDiffie-Hellman components. Upon expiry of a time periodv, the secret values and Diffie-Hellman key componentsare permanently erased, such that in the event when eitherC or Si is compromised, the attacker will no longer haveaccess to the key values required to compute past sessionkeys. However, note that we achieve only partial forwardsecrecy (with respect to v), by trading efficiency oversecurity. This implies that compromise of a long-termkey can expose session keys generated within the currentv. However, past session keys in previous (expired) timeperiods v′ (for v′ < v) will not be affected.• pNFS-AKE-III: Our third protocol aims to achieve fullforward secrecy, that is, exposure of a long-term keyaffects only a current session key (with respect to t), butnot all the other past session keys. We would also liketo prevent key escrow. In a nutshell, we enhance pNFSAKE-II with a key update technique based on any efficientone-way function, such as a keyed hash function. In PhaseI, we require C and each Si to share some initial keymaterial in the form of a Diffie-Hellman key. In Phase II,the initial shared key is then used to derive session keysin the form of a keyed hash chain. Since a hash value inthe chain does not reveal information about its pre-image,the associated session key is forward secure.V. DESCRIPTION OF OUR PROTOCOLSWe first introduce some notation required for our protocols.Let F(k;m) denote a secure key derivation function that takesas input a secret key k and some auxiliary information m,and outputs another key. Let sid denote a session identifierwhich can be used to uniquely name the ensuing session. Letalso N be the total number of storage devices to which aclient is allowed to access. We are now ready to describe theconstruction of our protocols.A. pNFS-AKE-IOur first pNFS-AKE protocol is illustrated in Figure 3. Foreach validity period v, C must first pre-compute a set of key1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems6Phase I – For each validity period v:(1) C ! M : IDC, E(KCM;KCS1 ; : : : ;KCSN )(2) M ! C : E(KMS1 ; IDC; IDS1 ; v;KCS1 ); : : : ; E(KMSN ; IDC; IDSN ; v;KCSN )Phase II – For each access request at time t:(1) C ! M : IDC, IDS1 ; : : : ; IDSn(2) M ! C : _1; : : : ; _n(3) C ! Si : _i; E(KMSi ; IDC; IDSi ; v;KCSi ), E(sk0i ; IDC; t)(4) Si ! C : E(sk0i ; t + 1)Fig. 3. Specification of pNFS-AKE-I.materials KCS1 ; : : : ;KCSN before it can access any of theN storage device Si (for 1 _ i _ N). The key materials aretransmitted to M. We assume that the communication betweenC andM is authenticated and protected through a secure channelassociated with key KCM established using the existingmethods as described in Section II-B. M then issues an authenticationtoken of the form E(KMSi ; IDC; IDSi ; v;KCSi )for each key material if the associated storage device Si hasnot been revoked.7 This completes Phase I of the protocol.From this point onwards, any request from C to access Si isconsidered part of Phase II of the protocol until v expires.When C submits an access request to M, the request containsall the identities of storage devices Si for 1 _ i _ n _ Nthat C wishes to access. For each Si, M issues a layout_i. C then forwards the respective layouts, authenticationtokens (from Phase I), and encrypted messages of the formE(sk0i ; IDC; t) to all n storage devices.Upon receiving an I/O request for a file object from C, eachSi performs the following:1) check if the layout _i is valid;2) decrypt the authentication token and recover key KCSi ;3) compute keys skzi = F(KCSi ; IDC; IDSi ; v; sid; z) forz = 0; 1;4) decrypt the encrypted message, check if IDC matchesthe identity of C and if t is within the current validityperiod v;5) if all previous checks pass, Si replies C with a keyconfirmation message using key sk0i .At the end of the protocol, sk1i is set to be the session keyfor securing communication between C and Si. We note that,as suggested in [7], sid in our protocol is uniquely generatedfor each session at the application layer, for example throughthe GSS-API.B. pNFS-AKE-IIWe now employ a Diffie-Hellman key agreement techniqueto both provide forward secrecy and prevent key escrow. Inthis protocol, each Si is required to pre-distribute some keymaterial to M at Phase I of the protocol.Let gx 2 G denote a Diffie-Hellman component, where Gis an appropriate group generated by g, and x is a numberrandomly chosen by entity X 2 fC; Sg. Let _ (k;m) denote7Here KMSi is regarded as a long-term symmetric secret key shared betweenM and Si. Also, we use authenticated encryption instead of encryptiononly encryption for security reasons. This will be clear in our security analysis.a secure MAC scheme that takes as input a secret key k anda target message m, and output a MAC tag. Our partiallyforward secure protocol is specified in Figure 4.At the beginning of each v, each Si that is governed byM generates a Diffie-Hellman key component gsi . The keycomponent gsi is forwarded to and stored by M. Similarly, Cgenerates its Diffie-Hellman key component gc and sends it toM.8 At the end of Phase I, C receives all the key componentscorresponding to all N storage devices that it may accesswithin time period v, and a set of authentication tokens of theform _ (KMSi ; IDC; IDSi ; v; gc; gsi ). We note that for ease ofexposition, we use the same key KMSi for encryption in step(1) and MAC in step (2). In actual implementation, however,we assume that different keys are derived for encryption andMAC, respectively, with KMSi as the master key. For example,the encryption key can be set to be F(KMSi ; “enc”), whilethe MAC key can be set to be F(KMSi ; “mac”).Steps (1) & (2) of Phase II are identical to those in theprevious variants. In step (3), C submits its Diffie-Hellmancomponent gc in addition to other information required in step(3) of pNFS-AKE-I. Si must verify the authentication tokento ensure the integrity of gc. Here C and Si compute skzi forz = 0; 1 as follow:skzi = F(gcsi ; IDC; IDSi ; gc; gsi ; v; sid; z):At the end of the protocol, C and Si share a session keysk1i .Note that since C distributes its chosen Diffie-Hellmanvalue gc during each protocol run (in Phase II), each Si needsto store only its own secret value si and is not required tomaintain a list of gc values for different clients. Upon expiryof v, they erase their secret values c and si, respectively, fromtheir internal states (or memory).Clearly, M does not learn anything about skzi unless itcolludes with the associated C or Si, and thus achievingescrow-freeness.C. pNFS-AKE-IIIAs explained before, pNFS-AKE-II achieves only partialforward secrecy (with respect to v). In the third variant ofour pNFS-AKE, therefore, we attempt to design a protocol8For consistency with the existing design of the Kerberos protocol, weassume that the Diffie-Hellman components are “conveniently” transmittedthrough the already established secure channel between them, although theDiffie-Hellman components may not necessarily be encrypted from a securityview point.1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems7Phase I – For each validity period v:(1) Si ! M : IDSi , E(KMSi ; gsi )(2) C ! M : IDC, E(KCM; gc)(3) M ! C : E(KCM; gs1 ; : : : ; gsN ),_ (KMS1 ; IDC; IDS1 ; v; gc; gs1 ); : : : ; _ (KMSN ; IDC; IDSN ; v; gc; gsN )Phase II – For each access request at time t:(1) C ! M : IDC, IDS1 ; : : : ; IDSn(2) M ! C : _1; : : : ; _n(3) C ! Si : _i; gc; _ (KMSi ; IDC; IDSi ; v; gc; gsi ), E(sk0i ; IDC; t)(4) Si ! C : E(sk0i ; t + 1)Fig. 4. Specification of pNFS-AKE-II (with partial forward secrecy and escrow-free).that achieves full forward secrecy and escrow-freeness. Astraightforward and well-known technique to do this is throughrequiring both C and Si to generate and exchange freshDiffie-Hellman components for each access request at timet. However, this would drastically increase the computationaloverhead at the client and the storage devices. Hence, we adopta different approach here by combining the Diffie-Hellman keyexchange technique used in pNFS-AKE-II with a very efficientkey update mechanism. The latter allows session keys to bederived using only symmetric key operations based on a agreedDiffie-Hellman key. Our protocol is illustrated in Figure 5.Phase I – For each validity period v:(1) Si ! M : IDSi , E(KMSi ; gsi )(2) C ! M : IDC, E(KCM; gc)(3) M ! C : E(KCM; gs1 ; : : : ; gsN )(4) M ! Si : E(KMSi ; IDC; IDSi ; v; gc; gsi )Phase II – For each access request at time t:(1) C ! M : IDC, IDS1 ; : : : ; IDSn(2) M ! C : _1; : : : ; _n(3) C ! Si : _i, E(skj,0i ; IDC; t)(4) Si ! C : E(skj,0i ; t + 1)Fig. 5. Specification of pNFS-AKE-III (with full forward secrecy and escrowfree).Phase I of the protocol is similar to that of pNFS-AKEII.In addition, M also distributes C’s chosen Diffie-Hellmancomponent gc to each Si. Hence, at the end of Phase I, bothC and Si are able to agree on a Diffie-Hellman value gcsi .Moreover, C and Si set F1(gcsi ; IDC; IDSi ; v) to be theirinitial shared secret state K0CSi .9During each access request at time t in Phase II, steps (1)& (2) of the protocol are identical to those in pNFS-AKE-II.In step (3), however, C can directly establish a secure sessionwith Si by computing skj,zi as follows:skj,zi = F2(Kj−1CSi; IDC; IDSi ; j; sid; z)where j _ 1 is an increasing counter denoting the j-th sessionbetween C and Si with session key skj,1i . Both C and Si then9Unlike in pNFS-AKE-II where gc is distributed in Phase II, we need topre-distribute C’s chosen Diffie-Hellman component in Phase I because thesecret state K0C Sithat C and Si store will be updated after each request.This is essential to ensure forward secrecy.setKjCSi= F1(Kj−1CSi; j)and update their internal states. Note that here we use twodifferent key derivation functions F1 and F2 to compute KjCSiand skj,zi , respectively. Our design can enforce independenceamong different session keys. Even if the adversary hasobtained a session key skj,1i , the adversary cannot derive Kj−1CSior KjCSi . Therefore, the adversary cannot obtain skj+1,zi orany of the following session keys. It is worth noting that theshared state KjCSi should never be used as the session key inreal communications, and just like the long-term secret key, itshould be kept at a safe place, since otherwise, the adversarycan use it to derive all the subsequent session keys within thevalidity period (i.e., KjCSi can be regarded as a medium-termsecret key material). This is similar to the situation that oncethe adversary compromises the long-term secret key, it canlearn all the subsequence sessions.However, we stress that knowing the state information KjCSiallows the adversary to compute only the subsequence sessionkeys (i.e., skj+1,zi ; skj+2,zi ; _ _ _ ) within a validity period, butnot the previous session keys (i.e., sk1,zi ; sk2,zi ; _ _ _ ; skj,zi )within the same period. Our construction achieves thisby making use of one-way hash chains constructed usingthe pseudo-random function F1. Since knowing KjCSidoes not help the adversary in obtaining the previous states(Kj−1CSi;Kj−2CSi; :::;K0C Si ), we can prevent the adversary fromobtaining the corresponding session keys. Also, since compromiseof KMSi or KCM does not reveal the initial state K0CSiduring the Diffie-Hellman key exchange, we can achieve fullforward secrecy.VI. SECURITY ANALYSISWe work in a security model that allows us to showthat an adversary attacking our protocols will not able tolearn any information about a session key. Our model alsoimplies implicit authentication, that is, only the right protocolparticipant is able to learn or derive a session key.A. Security ModelWe now define a security model for pNFS-AKE. Let Mdenote the metadata server, SS = fS1; S2; _ _ _ ; SNg the setof storage devices, and CS = fC1;C2; _ _ _ ;Cℓg the set of1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems8clients. A party P 2 fMg[SS[CS may run many instancesconcurrently, and we denote instance i of party P by _iP .Our adversarial model is defined via a game between anadversary A and a game simulator SIM. SIM tosses arandom coin b at the beginning of the game and b willbe used later in the game. SIM then generates for eachSi 2 SS (Cj 2 CS, respectively) a secret key KMSi(KMCj , respectively) shared with M. A is allowed to makethe following queries to the simulator:• SEND(P; i;m): This query allows the adversary to senda message m to an instance _iP . If the message m issent by another instance _jP′ with the intended receiverP, then this query models a passive attack. Otherwise, itmodels an active attack by the adversary. The simulatorthen simulates the reaction of _iP upon receiving themessage m, and returns to A the response (if there isany) that _iP would generate.• CORRUPT(P): This query allows the adversary to corrupta party P 2 SS[CS. By making this query, the adversarylearns all the information held by P at the time of thecorruption, including all the long-term and ephemeralsecret keys. However, the adversary cannot corrupt M(but see Remark 1).• REVEAL(P; i): This query allows the adversary to learnthe session key that has been generated by the instance_iP (P 2 SS [ CS). If the instance _iP does not holdany session key, then a special symbol ? is returned tothe adversary.• TEST(P∗; i∗): This query can only be made to a freshinstance _i∗P∗ (as defined below) where P∗ 2 SS [ CS.If the instance _i∗P∗ holds a session key SKi∗P∗ , then SIMdoes the following– if the coin b = 1, SIM returns SKi∗P∗ to theadversary;– otherwise, a random session key is drawn from thesession key space and returned to the adversary.Otherwise, a special symbol ? is returned to the adversary.We define the partner id pidiP of an instance _iP as theidentity of the peer party recognized by _iP , and sidiP as theunique session id belonging to _iP . We say a client instance_iC and a storage device instance _jS are partners if pidiC = Sand pidjS = C and sidiC = sidjS.We say an instance _iP is fresh if• A has never made a CORRUPT query to P or pidiP ; and• A has never made a REVEAL query to _iP or its partner.At the end of the game, the adversary outputs a bit b′ as herguess for b. The adversary’s advantage in winning the gameis defined asAdvpNFSA (k) = j2Pr[b′= b] 1j:Definition 1: We say a pNFS-AKE protocol is secure if thefollowing conditions hold.1) If an honest client and an honest storage device completematching sessions, they compute the same session key.2) For any PPT adversary A, AdvpNFSA (k) is a negligiblefunction of k.Forward Secrecy. The above security model for pNFS-AKEdoes not consider forward secrecy (i.e., the corruption ofa party will not endanger his/her previous communicationsessions). Below we first define a weak form of forwardsecrecy we call partial forward secrecy (PFS). We follow theapproach of Canetti and Krawczyk [10] by introducing a newtype of query:• EXPIRE(P; v): After receiving this query, no instance ofP for time period v could be activated. In addition, thesimulator erases all the state information and session keysheld by the instances of party P which are activatedduring time period v.Then, we redefine the freshness of an instance _iP asfollows:• A makes a CORRUPT(P) query only after anEXPIRE(P; v) query where the instance _iP is activatedduring time period v;• A has never made a REVEAL(P; i) query; and• If _iP has a partner instance _jQ, then A also obeys theabove two rules with respect to _jQ; otherwise, A hasnever made a CORRUPT(pidiP ) query.The rest of the security game is the same. We define theadvantage of the adversary asAdvpNFS−PFSA (k) = j2Pr[b′= b] 1j:We can easily extend the above definition to define fullforward secrecy (FFS) by modifying the EXPIRE query asfollows:• EXPIRE(P; i): Upon receiving this query, the simulatorerases all the state information and the session key heldby the instance _iP .The rest of the security model is the same as in the PFS game.Remark 1. In our security model, we do not allow theadversary to corrupt the metadata server M which holds allthe long-term secret keys. However, in our Forward Secrecymodel, we actually do not really enforce such a requirement.It is easy to see that if the adversary corrupts all the partiesin SS [ CS, then the adversary has implicitly corrupted M.But we should also notice that there is no way to preventactive attacks once M is corrupted. Therefore, the adversarycan corrupt all the parties (or M) only after the Test sessionhas expired.Remark 2. Our above Forward Secrecy model has alsoescrow-freeness. One way to define escrow-freeness is todefine a new model which allows the adversary to corruptthe metadata server and learn all the long-term secret keys.However, as outlined in Remark 1, our Forward Secrecy modelallows the adversary to obtain all the long-term secret keysunder some necessary conditions. Hence, our Forward Secrecymodel has implicitly captured escrow-freeness.B. Security ProofsTheorem 1: The pNFS-AKE-I protocol is secure withoutPFS if the authenticated encryption scheme E is secure underchosen-ciphertext attacks and F is a family of pesudo-randomfunctions.1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems9Proof. We define a sequence of games Gi(i _ 0) where G0 isthe original game defined in our security model without PFS.We also define AdvpNFSi as the advantage of the adversaryin game Gi. Then we have AdvpNFS0 = AdvpNFSA (k).In game G1 we change the original game as follows: thesimulator randomly chooses an instance _i∗P∗ among all theinstances created in the game, if the TEST query is notperformed on _i∗P∗ , the simulator aborts and outputs a randombit. Then we haveAdvpNFS1 =1nIAdvpNFS0where nI denotes the number of instances created in the game.In the following games, we use C∗ and S∗ to denote the clientand the storage device involved in the test session, respectively,and v∗ to denote the time period when the test session isactivated.In game G2, we change game G1 as follows: let FORGEdenote the event that A successfully forges a valid ciphertextE(KMS∗ ; IDC∗ ; IDS∗ ; v∗;KC∗S∗ ). If the event FORGEhappens, then the simulator aborts the game and outputsa random bit. Since E is a secure authenticated encryptionscheme, we havePr[b′ = b in game G1jFORGE]= Pr[b′ = b in game G2jFORGE]andPr[b′= b in game G1] Pr[b′= b in game G2]_ Pr[FORGE] _ AdvUF−CMAE (k):Therefore, we haveAdvpNFS1_ AdvpNFS2 + 2AdvUF−CMAE (k):In game G3 we use a random key K′ instead of the decryptionof E(KMS∗ ; IDC∗ ; IDS∗ ; v∗;KC∗S∗ ) to simulate thegame. In the following, we show that jAdvpNFS2AdvpNFS1jis negligible if the authenticated encryption scheme E is secureunder adaptive chosen-ciphertext attacks (CCA).We construct an adversary B in the CCA game for theauthenticated encryption scheme E. B simulates the game G2for the adversary A as follows. B generates all the longtermkeys in the system except KMS∗ . B then randomlyselects two keys K0 and K1 and obtains a challenge ciphertextCH∗ = E(KMS∗ ; IDC∗ ; IDS∗ ; v∗;K∗) from its challengerwhere K∗ is either K0 or K1. B then uses CH∗ as theauthentication token used between C∗ and S∗ during thetime period v∗, and uses K1 as the decryption of CH∗ toperform any related computation. For other authenticationtokens related to KMS∗ , B generates them by querying itsencryption oracle. Also, for any authentication token intendedfor S∗ but not equal to CH∗, B performs the decryption byquerying its decryption oracle. Finally, if the adversary A winsin the game (denote this event by WIN), B outputs 1 (i.e., Bguesses K∗ = K1), otherwise, B outputs 0 (i.e., B guessesK∗ = K0).We can see that if K∗ = K1, then the game simulated byB is the same as game G2; otherwise, if K∗ = K0, then thegame simulated by B is the same as game G3. So we haveAdvCCAB (k) = j2(Pr[WINjK∗= K1]Pr[K∗= K1] +Pr[WINjK∗= K0]Pr[K∗= K0]) 1j= Pr[WINjK∗= K1] Pr[WINjK∗= K0]=12(AdvpNFS2 AdvpNFS3 )andAdvpNFS2_ AdvpNFS3 + 2AdvCCAE (k):In game G4 we then replace the function F(K′; _) with arandom function RF(_). Since F is a family of pseudo-randomfunctions, if the adversary’s advantage changes significantlyin game G4 , we can construct a distinguisher D against F.D simulates game G3 for A honestly except that wheneverD needs to compute F(K′; x), D queries its own oracle Owhich is either F(K′; _) or RF(_). At the end of the game, ifA wins the game, D outputs 1, otherwise, D outputs 0.We can see that if O = F(K′; _), A is in game G3,otherwise, if O = RF(_), then A is in game G4. Therefore,we haveAdvprfD (k) = Pr[D outputs 1jO = F(K′; _)] Pr[D outputs 1jO = RF(_)]= Pr[WINjO = F(K′; _)] Pr[WINjO = RF(_)]=12(AdvpNFS3 AdvpNFS4 )andAdvpNFS3_ AdvpNFS4 + 2AdvprfF (k):In game G4, we havesk0i = RF(IDC∗ ; IDS∗ ; v∗; sid∗; 0)sk1i = RF(IDC∗ ; IDS∗ ; v∗; sid∗; 1)where sid∗ is the unique session id for the test session. Nowsince RF is a random function, sk1i is just a random keyindependent of the game. Therefore, the adversary has noadvantage in winning the game, i.e.,AdvpNFS4 = 0:Combining all together, we haveAdvpNFSA (k)_2nI (AdvUF−CMAE (k) + AdvCCAE (k) + AdvprfF (k)):□Theorem 2: The pNFS-AKE-II protocol achieves partialforward secrecy if _ is a secure MAC scheme, the DDHassumption holds in the underlying group G, and F is a familyof pesudo-random functions.Proof. The proof is similar to that for Theorem 1. Below weonly elaborate on the differences between the two proofs. Wealso define a sequence of games Gi where G0 is the originalPFS security game.1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems10In game G1, we change game G0 in the same way as inthe previous proof, then we also haveAdvpNFS−PFS1 =1nIAdvpNFS−PFS0where nI denotes the number of instances created in thegame. Let C∗ and S∗ denote the client and the storage deviceinvolved in the test session, respectively, and v∗ the timeperiod when the test session is activated.In game G2, we further change game G1 as follows: letFORGE denote the event that A successfully forges a validMAC tag _ (KMSi ; IDC; IDSi ; v; gc; gsi ) before corruptingSi, if the event FORGE happens, then the simulator abortsthe game and outputs a random bit. Then we havePr[b′ = b in game G1jFORGE]= Pr[b′ = b in game G2jFORGE]andPr[b′= b in game G1] Pr[b′= b in game G2]_ Pr[FORGE] _ AdvUF−CMAτ (k):Therefore, we haveAdvpNFS−PFS1_ AdvpNFS−PFS2 + 2AdvUF−CMAτ (k):In game G3, we change game G2 by replacing the Diffie-Hellman key gc∗s∗ in the test session with a random elementK′ 2 G. Below we show that if the adversary’s advantagechanges significantly in game G3 , we can construct a distinguisherB to break the Decisional Diffie-Hellman (DDH)assumption.B is given a challenge (ga; gb;Z), in which with equalprobability, Z is either gab or a random element of G. Bsimulates game G2 honestly by generating all the long-termsecret keys for all the clients and storage devices. Then forthe time period v∗, B sets gc∗= ga and gs∗= gb. When thevalue of gc∗s∗ is needed, B uses the value of Z to performthe corresponding computation. Finally, if A wins the game,B outputs 1, otherwise, B outputs 0.Since the adversary cannot corrupt C∗ or S∗ before the timeperiod v∗ has expired, if a FORGE event did not happen,then the values of the Diffie-Hellman components in the testsession must be ga and gb. If Z = gab, then A is in game G2;otherwise, if Z is a random element of G, then A is in gameG3. Therefore we haveAdvDDHB (k) = Pr[B outputs 1jZ = gab] Pr[B outputs 1jZ = gr]= Pr[WINjZ = gab] Pr[WINjZ = gr]=12(AdvpNFS−PFS2 AdvpNFS−PFS3 )andAdvpNFS−PFS2_ AdvpNFS−PFS3 + 2AdvDDH(k):In game G4, we replace the pseudo-random functionF(K′; _) with a random function RF(_). By following thesame analysis as in the previous proof, we haveAdvpNFS−PFS3_ AdvpNFS−PFS4 + 2AdvprfF (k)andAdvpNFS−PFS4 = 0:Therefore, combining all together, we haveAdvpNFS−PFSA (k)_ 2nI (AdvUF−CMAτ (k) + AdvDDH(k) + AdvprfF (k)):□Theorem 3: The pNFS-AKE-III protocol achieves full forwardsecrecy if E is a secure authenticated encryption scheme,the DDH assumption holds in the underlying group G, and Fis a family of pesudo-random functions.Proof (Sketch). The proof is very similar to that for Theorem 2.Below we provide a sketch of the proof.Let C∗ and S∗ denote the client and the storage deviceinvolved in the test session, respectively, and v∗ the timeperiod when the test session is activated. Without loss ofgenerality, suppose the test session is the j-th session betweenC∗ and S∗ within the period v∗. Since the adversary is notallowed to corrupt C∗ or S∗ before the test session sessionhas expired, due to the unforgeability of E, and the DDHassumption, the simulator can replace gc∗s∗ in the time periodv∗ with a random element K 2 G. Then in the next augmentedgame, the simulator replaces K0C∗S∗ by a random key. SinceF1 is a secure pseudo-random function, such a replacementis indistinguishable from the adversary’s view point. Thesimulator then replaces ski,zi (for z = 0; 1) and KiC∗S∗ withindependent random keys for all 1 _ i _ j. Once again, sinceF1 and F2 are secure pseudo-random functions, the augmentedgames are indistinguishable by the adversary. Finally, in thelast augmented game, we can claim that the adversary has noadvantage in winning the game since a random key is returnedto the adversary no matter b = 0 or b = 1. This completes thesketch of the proof. □VII. PERFORMANCE EVALUATIONA. Computational OverheadWe consider the computational overhead for w accessrequests over time period v for a metadata server M, a clientC, and storage devices Si for i 2 [1;N]. We assume that alayout _ is of the form of a MAC, and the computational costfor authenticated symmetric encryption E is similar to that forthe non-authenticated version E.10 Table I gives a comparisonbetween Kerberos-based pNFS and our protocols in terms ofthe number of cryptographic operations required for executingthe protocols over time period v.To give a more concrete view, Table II provides someestimation of the total computation times in seconds (s) foreach protocol by using the Crypto++ benchmarks obtained onan Intel Core 2 1.83 GHz processor under Windows Vistain 32-bit mode [12]. We choose AES/CBC (128-bit key) forencryption, AES/GCM (128-bit, 64K tables) for authenticatedencryption, HMAC(SHA-1) for MAC, and SHA-1 for keyderivation. Also, Diffie-Hellman exponentiations are based on10For example, according to the Crypto++ 5.6.0 Benchmarks, AES/GCM(128-bit, 64K tables) has similar speed as AES/CBC (128-bit key) [12].1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems11TABLE ICOMPARISON IN TERMS OF CRYPTOGRAPHIC OPERATIONS FOR w ACCESS REQUESTS FROM C TO Si VIA M OVER TIME PERIOD v, FOR ALL 1 _ i _ nAND WHERE n _ N.Protocol M C all Si TotalKerberos-pNFS– Symmetric key encryption / decryption w(n + 5) w(2n + 3) 3wn w(6n + 8)– MAC generation / verification wn 0 wn 2wnpNFS-AKE-I– Symmetric key encryption / decryption N + 1 2wn + 1 3wn 5wn + N + 2– MAC generation / verification wn 0 wn 2wn– Key derivation 0 2wn 2wn 4wnpNFS-AKE-II– Symmetric key encryption / decryption N + 2 2wn + 2 2wn + 1 4wn + N + 5– MAC generation / verification wn + N 0 2wn 3wn + N– Key derivation 0 2wn 2wn 4wn– Diffie-Hellman exponentiation 0 N + 1 N + wn 2N + wn + 1pNFS-AKE-III– Symmetric key encryption / decryption 2N + 2 2wn + 2 2wn + 1 4wn + 2N + 5– MAC generation / verification wn 0 wn 2wn– Key derivation 0 3wn + N 3wn + N 6wn + 2N– Diffie-Hellman exponentiation 0 N + 1 2N 3N + 1DH 1024-bit key pair generation. Our estimation is basedon a fixed message size of 1024 bytes for all cryptographicoperations, and we consider the following case:• N = 2n and w = 50 (total access requests by C withinv);• C interacts with 103 storage devices concurrently for eachaccess request, i.e. n = 103;• M has interacted with 105 clients over time period v; and• each Si has interacted with 104 clients over time periodv.Table II shows that our protocols reduce the workload of Min the existing Kerberos-based protocol by up to approximately54%. This improves the scalability of the metadata serverconsiderably. The total estimated computational cost for Mfor serving 105 clients is 8:02 _ 104 s (_ 22.3 hours) inKerberos-based pNFS, compared with 3:68 _ 104 s (_ 10.2hours) in pNFS-AKE-I and 3:86 _ 104 s (_ 10.6 hours) inpNFS-AKE-III. In general, one can see from Table I that theworkload of M is always reduced by roughly half for anyvalues of (w; n;N). The scalability of our protocols from theserver’s perspective in terms of supporting a large number ofclients is further illustrated in the left graph of Figure 6 whenwe consider each client requesting access to an average ofn = 103 storage devices.Moreover, the additional overhead for C (and all Si) forachieving full forward secrecy and escrow-freeness using ourtechniques are minimal. The right graph of Figure 6 shows thatour pNFS-AKE-III protocol has roughly similar computationaloverhead in comparison with Kerberos-pNFS when the numberof accessed storage devices is small; and the increasedcomputational overhead for accessing 103 storage devicesin parallel is only roughly 1/500 of a second compared tothat of Kerberos-pNFS—a very reasonable trade-off betweenefficiency and security. The small increase in overhead is partlydue to the fact that some of our cryptographic cost is amortizedover a time period v (recall that and for each access requestat time t, the client runs only Phase II of the protocol).On the other hand, we note that the significantly highercomputational overhead incurred by Si in pNFS-AKE-II islargely due to the cost of Diffie-Hellman exponentiations. Thisis a space-computation trade-off as explained in Section V-B(see Section VII-C for further discussion on key storage).Nevertheless, 256 s is an average computation time for 103storage devices over time period v, and thus the averagecomputation time for a storage device is still reasonably small,i.e. less than 1/3 of a second over time period v. Moreover, wecan reduce the computational cost for Si to roughly similarto that of pNFS-AKE-III if C pre-distributes its gc value toall relevant Si so that they can pre-compute the gcsi value foreach time period v.TABLE IICOMPARISON IN TERMS OF COMPUTATION TIMES IN SECONDS (S) OVERTIME PERIOD v BETWEEN KERBEROS-PNFS AND OUR PROTOCOLS. HEREFFS DENOTES FULL FORWARD SECRECY, WHILE EF DENOTESESCROW-FREENESS.Protocol FFS EF M C SiKerberos-pNFS 8:02 _ 104 0.90 17.00pNFS-AKE-I 3:68 _ 104 1.50 23.00pNFS-AKE-II ✓ 3:82 _ 104 2.40 256.00pNFS-AKE-III ✓ ✓ 3:86 _ 104 2.71 39.60B. Communication OverheadAssuming fresh session keys are used to secure communicationsbetween the client and multiple storage devices, clearlyall our protocols have reduced bandwidth requirements. Thisis because during each access request, the client does not needto fetch the required authentication token set from M. Hence,the reduction in bandwidth consumption is approximately thesize of n authentication tokens.1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems12secondsnumber of clientsmillisecondsnumber of storage devicesFig. 6. Comparison in terms of computation times for M (on the left) and for C (on the right) at a specific time t.C. Key StorageWe note that the key storage requirements for KerberospNFSand all our described protocols are roughly similar fromthe client’s perspective. For each access request, the clientneeds to store N or N + 1 key materials (either in the formof symmetric keys or Diffie-Hellman components) in theirinternal states.However, the key storage requirements for each storagedevice is higher in pNFS-AKE-III since the storage devicehas to store some key material for each client in their internalstate. This is in contrast to Kerberos-pNFS, pNFS-AKE-I andpNFS-AKE-II that are not required to maintain any client keyinformation.VIII. OTHER RELATED WORKSome of the earliest work in securing large-scale distributedfile systems, for example [24], [22], have already employedKerberos for performing authentication and enforcing accesscontrol. Kerberos, being based on mostly symmetric keytechniques in its early deployment, was generally believed tobe more suitable for rather closed, well-connected distributedenvironments.On the other hand, data grids and file systems such as,OceanStore [27], LegionFS [54] and FARSITE [3], make useof public key cryptographic techniques and public key infrastructure(PKI) to perform cross-domain user authentication.Independently, SFS [36], also based on public key cryptographictechniques, was designed to enable inter-operabilityof different key management schemes. Each user of thesesystems is assumed to possess a certified public/private keypair. However, these systems were not designed specificallywith scalability and parallel access in mind.With the increasing deployment of highly distributed andnetwork-attached storage systems, subsequent work, suchas [4], [55], [19], focussed on scalable security. Nevertheless,these proposals assumed that a metadata server shares agroup secret key with each distributed storage device. Thegroup key is used to produce capabilities in the form ofmessage authentication codes. However, compromise of themetadata server or any storage device allows the adversaryto impersonate the server to any other entities in the filesystem. This issue can be alleviated by requiring that eachstorage device shares a different secret key with the metadataserver. Nevertheless, such an approach restricts a capabilityto authorising I/O on only a single device, rather than largergroups of blocks or objects which may reside on multiplestorage devices.More recent proposals, which adopted a hybrid symmetrickey and asymmetric key method, allow a capability to spanany number of storage devices, while maintaining a reasonableefficiency-security ratio [40], [29], [30], [31]. For example,Maat [30] encompasses a set of protocols that facilitate (i)authenticated key establishment between clients and storagedevices, (ii) capability issuance and renewal, and (iii) delegationbetween two clients. The authenticated key establishmentprotocol allows a client to establish and re-use a shared(session) key with a storage device. However, Maat and otherrecent proposals do not come with rigorous security analysis.As with NFS, authentication in Hadoop Distributed FileSystem (HDFS) is also based on Kerberos via GSS-API.Each HDFS client obtains a TGT that lasts for 10 hoursand renewable for 7 days by default; and access control isbased on the Unix-style ACLs. However, HDFS makes use ofthe Simple Authentication and Security Layer (SASL) [38],a framework for providing a structured interface betweenconnection-oriented protocols and replaceable mechanisms.11In order to improve the performance of the KDC, the developersof HDFS chose to use a number of tokens forcommunication secured with an RPC digest scheme. TheHadoop security design makes use of Delegation Tokens, JobTokens, and Block Access Tokens. Each of these tokens issimilar in structure and based on HMAC-SHA1. DelegationTokens are used for clients to communicate with the Name11SASL’s design is intended to allow new protocols to reuse existingmechanisms without requiring redesign of the mechanisms and allows existingprotocols to make use of new mechanisms without redesign of protocols [38].1045-9219 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TPDS.2015.2388447, IEEE Transactions on Parallel and Distributed Systems13Node in order to gain access to HDFS data; while BlockAccess Tokens are used to secure communication between theName Node and Data Nodes and to enforce HDFS filesystempermissions. On the other hand, the Job Token is used to securecommunication between the MapReduce engine Task Trackerand individual tasks. Note that the RPC digest scheme usessymmetric encryption and depending upon the token type, theshared key may be distributed to hundreds or even thousandsof hosts [41].IX. CONCLUSIONSWe proposed three authenticated key exchange protocols forparallel network file system (pNFS). Our protocols offer threeappealing advantages over the existing Kerberos-based pNFSprotocol. First, the metadata server executing our protocolshas much lower workload than that of the Kerberos-basedapproach. Second, two our protocols provide forward secrecy:one is partially forward secure (with respect to multiplesessions within a time period), while the other is fully forwardsecure (with respect to a session). Third, we have designed aprotocol which not only provides forward secrecy, but is alsoescrow-free.