A Trust Establishment and Key Management Architecture for Hospital-at-Home

Authors:

Paul Stankovski Wagner

ACM Transactions on Computing for Healthcare, Volume 6, Issue 1

Article No.: 3, Pages 1 - 28

https://doi.org/10.1145/3700144

Published: 08 January 2025

Abstract

The landscape of healthcare is experiencing a digitalization shift, transferring many medical activities to the patients’ homes, a phenomenon commonly referred to as Hospital-at-Home. While Internet of Things (IoT) devices facilitate the building of such systems, there is a need for powerful middleware that encapsulates device-to-device communication and enables the construction of user-friendly, secure, and robust Hospital-at-Home systems. A key challenge for such middleware is to build a trustworthy and lightweight key management system allowing different devices in the system to exchange messages securely. In this article, we present a simple, easily manageable, and scalable architecture for such key management that, in addition, supports long-term data protection using post-quantum cryptographic primitives. Our proposed solution utilizes a Merkle tree to enable the IoT devices to establish trust between each other automatically, even in the absence of an Internet connection. We have implemented the architecture and present performance figures as well as a security analysis of our approach.

1 Introduction

Hospital healthcare is currently being transformed by moving more and more activities to patients’ homes, a situation referred to as Hospital-at-Home [28]. Examples include remote patient monitoring for patients with chronic diseases like kidney failure and heart diseases, palliative home care, where mobile nurse teams regularly visit patients, and early discharge, where patients are sent home early after treatment, and mobile care teams continue care in the patients’ homes, freeing up beds at the ward. Expected advantages of this transformation include decreased hospital costs, improved care quality, higher comfort and quality-of-life for patients, and reduced risks with respect to hospital-spread infections.

Hospital-at-Home systems differ from ordinary hospital IT systems in being highly distributed, including mobile devices operating over external networks, potentially with weak or irregular connectivity. Similarly to ordinary hospital IT systems, any patient monitoring, treatment, and communication performed within these systems falls under a regulatory body with high requirements on reliability, safety, security, and privacy. This is in contrast to self-monitoring systems, where patients use e-health devices on their own, without any connection to a hospital. In self-monitoring systems, the patient is responsible for installation and updates. Furthermore, the patient may have to use commercial authentication methods, and the patient data may be under vendor control.

In this work, we consider how to build Hospital-at-Home systems that are highly secure, yet simple to administer and easy to use for both patients and medical staff. The kind of system we consider is illustrated in Figure 1. There are four kinds of users in the system: patients, mobile medical staff that visit patients in their homes, medical staff at the hospital, and system administrators. At the hospital, a healthcare server stores information about patients and staff. Patients have a tablet at home with which they can communicate with staff using chat, photos, and video. The patients can also submit medical information via the tablet, e.g., home dialysis reports. Medical devices like scales and blood pressure equipment can be connected to the tablet, and new measurements and reports are automatically relayed to the hospital healthcare server. Staff can communicate with each other and with patients via a web client or a tablet. They can also use the system for scheduling purposes to coordinate their work, e.g., to keep track of which mobile staff should visit which patients. Also, patients might move about, e.g., taking a holiday trip or visiting relatives. Through the tablet they can stay connected with medical staff and continue to take measurements. Tablets store information locally in order to be able to operate also when network connectivity is poor or lost. When connectivity is restored, any pending new information on a tablet is automatically relayed to the healthcare server.

Illustration of four kinds of users and a database server, communicating via network. — Fig. 1. The Hospital-at-Home system considered in this work.

A device needs to be able to authenticate other devices in the system, and set up secure communication channels with them, without human intervention. For example, a patient tablet should be able to automatically upload a new measurement without requiring the patient to interact with the tablet. Other examples are tablets and servers synchronizing information after network failures, a server sending a software update to a tablet, and staff and patient tablets exchanging information when connected locally. In all these situations, devices need to be able to set up connections to authorized devices and communicate securely with them, without human intervention.

For users interacting with a tablet, login procedures are needed, but they need to be very easy to use, both for medical staff and all kinds of patients. For this reason, commercial and national electronic identification systems cannot be relied on exclusively, even if they fulfill all the requirements mandated by the regulatory bodies, since this would rule out some patient groups like elderly patients and foreign visitors. Furthermore, such systems, typically requiring two-factor authentication, may be too difficult to use for a sick patient, and too cumbersome for a mobile staff member who has to log in numerous times during a work day.

For system administration tasks such as entering new equipment into the system or updating its software, it is important that they can be done completely by administrators at the hospital, without relying on steps performed by medical staff or patients. Once equipment has been deployed at a patient’s home, it is important that all software updates can be performed remotely, so that they can be applied quickly in order to keep the entire system up to date, for example when software vulnerabilities have been discovered and patched.

Hospital-at-Home systems are examples of Internet of Things (IoT) systems where smart devices communicate with each other over the Internet or local networks. This is a growing field with many technological advancements and applications [30], but also with challenges such as how to handle large amounts of data in numerous formats [32], how to scale the systems and limit their energy consumption as some IoT devices have limited computational power and memory space [39], and, not least, how to handle security and privacy challenges [45]. Such challenges can be addressed by building solutions into middleware systems for IoT. The middleware systems serve as an interface between applications and IoT devices, supporting communication between the devices [30].

In this article, we focus on how to support secure authentication and communication between IoT devices in the given Hospital-at-Home setting. We address this problem by formulating a solution for trust establishment and key management between IoT devices that can be implemented as part of a middleware. For our implementation, we use a particular middleware, PalCom [36], that supports peer-to-peer communication between devices.

Main Contributions.

The main contributions of this work are as follows:

–

Based on a motivating scenario, identification of challenges and technical requirements for secure communication in Hospital-at-Home systems (Section 3).

–

A threat model for Hospital-at-Home systems (Section 4).

–

Our proposed security architecture for trust establishment and key management (Section 5). This architecture includes a scheme for device-to-device authentication and a scheme for efficient trust establishment by using Merkle trees. We tailor the work of [44] such that it fits into our Hospital-at-Home use-case scenario. Our solution supports long-term security for the keys and data by utilizing post-quantum primitives.

–

A security evaluation of our proposed solution (Section 6).

–

A concrete example implementation of this architecture and a performance evaluation of it (Section 7). For reproducibility, we provide an artifact for this part.

In the following, we start by presenting related work on risks introduced when IoT devices are used in healthcare systems as well as IoT trust management techniques (Section 2). In Section 3, we introduce the middleware system and the Hospital-at-Home scenario that we use in this work, identifying the requirements on this system from a secure trust and key establishment perspective. We then present our threat model for such Hospital-at-Home systems in Section 4, and our proposed security architecture for trust establishment and key management in Section 5. In Sections 6 and 7, we present a security evaluation of our proposed solution, our implementation of the architecture, and a performance evaluation of it. Finally, we conclude the article in Section 8.

2 Related Work

In this section, we first give a more in-depth explanation of the role that IoT devices play in the healthcare ecosystem. Then we survey the potential risks and challenges that the e-health services might face as a result of implementing IoT devices. We then dive deeper into the problem of trust management between IoT devices, and we perform a literature review on the methods that are used to handle this issue.

2.1 IoT in the Healthcare Ecosystem

The IoT refers to smart devices that are equipped with sensors, software programs, and other technologies that enable them to generate and process data, and communicate with each other over a network (e.g., the Internet or a personal area network). IoT devices play a significant role in the healthcare industry, especially in the context of e-health and Hospital-at-Home. Several healthcare services, e.g., the monitoring of patients’ health conditions and ambient assisted living, can be done remotely by utilizing IoT devices. Several IoT devices such as depression/mood monitoring devices, Parkinson’s disease monitoring devices, connected inhalers, and ingestible sensors are designed to improve patient care. The employment of these devices in the field marks a shift toward more distributed systems, in which more medical devices are used to assist in gathering, preprocessing and possibly also in the analysis of medical data (locally and distributed). The Hospital-at-Home scenario also stretches the system perimeter beyond the comforts of a localized system, which implies increased demands on handling remote data transfers securely.

E-health services leverage the utilization of IoT devices to generate a data-driven and interconnected robust healthcare ecosystem to enhance patient care. However, deploying the IoT devices introduces several security and privacy risks and challenges for handling the patients’ data that is generated, transmitted, and stored by these devices. Some of these issues that are especially relevant to this work are listed below.

–

On-demand computing, such as the cloud, fog, and edge computing, has been used extensively to process, store, and share the data that is generated by IoT devices. However, on-demand computing has several security shortcomings that make the patients’ medical records vulnerable to exploitation by cyber attackers [41]. Medical data that is stored by third-party companies will most likely reveal sensitive information about the patients, and any usage of this data beyond the intended treatment may severely infringe on patients’ privacy [34]. In this context, it is pressing to recall that corporate interests incentivize (medical) data hoarding and mining for revenue and market positioning.

–

Access control and credential management are crucial aspects of a robust and secure healthcare system. An e-health system should be designed in such a way that it ensures only authorized personnel can access the relevant and necessary data. It is of utmost importance to regularly update the credentials (e.g., passwords and access tokens) to prevent insider attackers from accessing information that they are not authorized to view or handle. However, the complex nature of a healthcare ecosystem makes it challenging to design and perform secure access control and credential management [4].

–

All around the globe, healthcare service providers are required to follow a set of strict regulatory laws, standards, and requirements. The main goal of these mandatory rules is to provide security for patients and preserve their privacy [6]. Therefore, any technology that is used within the healthcare ecosystem must comply with regulations, e.g., healthcare systems in the US must follow the Health Insurance Portability and Accountability Act (HIPAA), and all healthcare systems within the European Union must follow the General Data Protection Regulation (GDPR). The utilization of IoT devices and the technologies that are used in connection with these devices should help the healthcare service providers meet regulatory compliance with their national/international mandates. Therefore, a secure and lightweight IoT-related protocol that otherwise functions well might not be suitable for e-health systems. Thus, regulatory compliance is one of the challenges that needs to be faced when designing e-health systems.

–

Cryptographic primitives are used to build secure and private systems. However, implementing and deploying cryptographic protocols in IoT healthcare systems can be challenging due to a combination of various technical and practical issues [4, 5, 40], namely resource constraints, legacy systems, and real-time requirements. As we mentioned in Section 1, most IoT devices suffer from severely limited computational power, memory, and energy resources. Moreover, most of these devices quickly become outdated and cause interoperability problems with newer devices. Therefore, executing most of the secure but relatively heavy cryptosystems such as public-key schemes might not be possible on all of these devices. Moreover, the nature of healthcare services often requires real-time data processing. However, deployment of most cryptographic primitives introduces some additional latency due to the resource constraints of the IoT devices. This imposed latency prevents the quick response times that some healthcare applications must have, for instance, real-time remote diagnosis of cardiovascular abnormalities. Therefore, it is not acceptable to use certain cryptographic primitives for critical healthcare scenarios.

–

With the emergence of quantum computers, many of today’s common classical cryptographic primitives and protocols will not remain secure for very long. Therefore, it is crucial to utilize post-quantum cryptography schemes in IoT devices [14] to enable long-term security guarantees. However, as mentioned before, the costs of using these new primitives need to be low enough for the peripheral IoT devices and for the healthcare system itself to operate well.

–

In many IoT use cases in the healthcare ecosystem, an IoT device will be given to the patient after an initial setup by the technical and medical experts, after which physical access to the device might be severely limited or nonexistent. Therefore, updating its software has to be done remotely, without any user intervention. Moreover, in some use cases, the devices should be able to communicate with other authorized IoT devices in their close proximity, even if these devices do not have access to the Internet [20].

–

In order for the IoT devices to communicate with each other, first the communication should be “allowed.” In other words, an IoT device should only communicate with devices that it trusts. Therefore, it is crucial to implement a robust trust management system for medical IoT devices [29]. Trust management refers to implementing protocols and measures which ensure that only authorized devices and/or individuals can access devices and their information inside the ecosystem, and other accesses are not possible.

As e-health systems’ vulnerabilities can directly impact the patients’ lives, it is even more crucial to find a comprehensive and holistic approach to the above challenges, that addresses the security and privacy risks with respect to regional regulatory compliance.

In our proposed system in Section 5, we do not rely on a cloud, fog, or any third-party computing or storage services for trust management and device discovery. We propose a trust management scheme for IoT devices in the healthcare ecosystem that is lightweight, post-quantum secure, can function securely with military-grade longevity (50 years) without key updates, and which complies with GDPR. Below, we present a literature review of IoT device trust management techniques.

2.2 Trust Management Schemes for IoT Devices

IoT trust management enables the ecosystem to prevent malicious devices from joining the system, and therefore guarantees secure access control [18]. The trust management can be automated, i.e., the entities in the ecosystem are enabled to exchange information that contains verified data to establish trust [25]. In this work, we focus mainly on automated trust establishment methods, as automation is a requirement in the Hospital-at-Home use-case scenario. For the purpose of this exposition, we assign the same level of device trust to all the devices used by any one specific patient. Although discussions about different levels of access rights and trust scores (see, e.g., [3, 27]) are left out of this section for readability, our proposed system is indeed designed to take multi-level device trust into account.

2.2.1 Trust Management via a Middleware System.

As we explained before, a middleware software can be used to facilitate communication between IoT devices, to improve the functionality of these devices, and to strengthen the security and privacy of the whole IoT ecosystem. Hereafter, we refer to middleware systems that are used in IoT ecosystems as IoT-middleware. Several IoT-middleware systems have been proposed, supporting various functionalities [7]. Among other tasks, the process of key management and trust establishment can be done via an IoT-middleware software. In this part we first present several IoT-middleware systems and the trust management methods they utilize and provide. Then we discuss some of the shortcomings and open problems of these IoT-middleware systems’ trust management schemes.

2.2.2 IoT Trust Management with Blockchains.

Amatista is an IoT-middleware that can be used to facilitate trust management in an environment with zero-trust [38]. Amatista utilizes a blockchain-based distributed system that with the help of the edge devices manages trust establishment between the IoT devices. In [1], Abbasi et al. proposed another IoT-middleware framework that is trust-based and supports interoperability across heterogeneous devices. Their proposed middleware also utilizes a blockchain-based trusted third party. The utilization of a blockchain for providing trust management in IoT-middleware is further discussed in several other works, see for example [13, 17, 43].

2.2.3 IoT Trust Management with Merkle Tree Techniques.

Trust management for IoT devices can also be done by utilizing Merkle tree techniques [44]. Merkle trees are cryptographic building blocks which are the core enablers of transparency logs. As an example usage, the reader may consider Certificate Transparency logs [24]. This is the approach that we adopt in this work for trust management.

Informally, using Merkle tree techniques, an unauthenticated device or entity needs to prove that it is in possession of a private key whose public key counterpart is linked to the root of a Merkle tree. The hierarchical structure of a Merkle tree enables fast and efficient verification of the access rights of different devices in a large IoT ecosystem. As a Merkle tree is a lightweight and cost efficient way of handling trust establishment between devices, there is a growing interest in utilizing them in the context of IoT ecosystems, see for example [33, 44, 46].
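
The verification idea described above can be illustrated with a minimal sketch of a Merkle inclusion proof. It is not the scheme of Section 5: the hash function (SHA-256), the domain-separation prefixes, the power-of-two leaf count, the helper names (`build_tree`, `inclusion_proof`, `verify`), and the placeholder key strings are all assumptions made for illustration, and the proof of possession of the private key is omitted.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption, not the article's choice."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all levels of the tree: levels[0] holds leaf hashes, levels[-1] == [root].

    Assumes len(leaves) is a power of two to keep the sketch short.
    """
    level = [h(b"\x00" + leaf) for leaf in leaves]  # 0x00 prefix marks leaf hashes
    levels = [level]
    while len(level) > 1:
        # 0x01 prefix marks interior nodes
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect the sibling hash at every level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling differs in the lowest bit
        index //= 2
    return proof


def verify(leaf, index, proof, root):
    """Recompute the root from a leaf and its sibling path; compare to the trusted root."""
    node = h(b"\x00" + leaf)
    for sibling in proof:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root


# Hypothetical public keys of four devices (placeholders, not real keys).
device_keys = [b"pk-tablet-1", b"pk-tablet-2", b"pk-server", b"pk-scale"]
levels = build_tree(device_keys)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
```

A verifier that holds only `root` can check `verify(b"pk-server", 2, proof, root)` with a number of hash evaluations that is logarithmic in the number of devices, and without any network access, which is the property that makes the approach attractive for offline trust establishment.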

2.2.4 IoT Trust Management with Biometric Methods.

In a healthcare ecosystem, biometric data that is collected by IoT devices can, in turn, be used directly or indirectly in the trust management schemes [9, 29]. For instance, heart beat profiles (e.g., utilizing time gaps between heart beats) may be seen as pseudo-random numbers that can be used as seeds for key material, or even used directly as private/secret keys for device authentication and trust management purposes [37]. However, the accuracy levels in biometric systems can be problematic when it comes to system design. To be more precise, any system that uses biometric data inherits (at least) two major fault sources: False Acceptance Rate (FAR) and False Rejection Rate (FRR). As an example of FRR, when using biometric data directly as key material, the system may read the biometric data differently at each session, failing to construct the desired and needed private key. This blocks a legitimate user from accessing the system, which is most often a usability concern. As an example of FAR, due to (the same) inaccuracy in reading the biometric data, it is also possible that a wrongfully authorized entity (e.g., relatives may have similar heart beat profiles) gains access to medical records they were not intended to have access to. This is a type of classification error that mostly affects the security guarantees of the system. Moreover, the associated error probabilities in the above-mentioned biometric systems are typically much higher (by several orders of magnitude) than the error probabilities that can be guaranteed using cryptographic constructs (cryptographic grade probabilities) in trust establishment systems.
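
To make the two error rates concrete, the following sketch computes FAR and FRR for a threshold-based matcher. The function name `far_frr`, the similarity scores, and the thresholds are made up for illustration and do not come from any of the cited systems.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a matcher that accepts when score >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


# Illustrative similarity scores (assumed data, not measurements).
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]   # legitimate user, repeated readings
impostor = [0.10, 0.35, 0.72, 0.20, 0.05]  # other people, e.g. a relative at 0.72

far, frr = far_frr(genuine, impostor, threshold=0.70)
far_strict, frr_strict = far_frr(genuine, impostor, threshold=0.80)
```

With these numbers, raising the threshold from 0.70 to 0.80 removes the false acceptance but rejects one more genuine reading, which illustrates the trade-off between the security concern (FAR) and the usability concern (FRR).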

2.2.5 Discussion.

The above-mentioned methods that are based on blockchain and/or utilization of on-demand computing cannot be applied in most of the healthcare ecosystems due to the fact that the patients do not always have access to the Internet and therefore, we require schemes that can also function off-line. More precisely, in most of the healthcare use-cases, data cannot leave the ecosystem, and there should be no third-party entity that can access the patients’ private information. Moreover, as health-related IoT devices need to function properly and accurately even in the absence of the Internet, the above-mentioned schemes are not always reliable or efficient.

On the other hand, utilization of Merkle tree techniques for trust management in IoT devices provides a scalable and cost-efficient way to ensure the trustworthiness of entities within a medical IoT ecosystem. It enhances the security of IoT systems by enabling real-time and reliable verification of the integrity of the devices, and enables the ecosystem to only provide access rights to the IoT devices that can prove their identities with cryptographic levels of assurance.

This provides us with sufficient motivation to utilize a Merkle tree-based trust management scheme in our system design, which is presented in Section 5.

We do disqualify biometric techniques as main trust management enablers due to their inherently low¹ assurance levels, which are well below our aim of cryptographic grade standards. However, we do recognize that biometric techniques can be successfully employed in subsystems for usability reasons, but then only in situations in which lower levels of security guarantees are needed.

3 Motivating Scenario

In this article, we use an existing Hospital-at-Home system, itACiH [21, 22], as an initial motivating scenario. The system is currently used for thousands of patients in Sweden, in a variety of medical care situations including palliative care, home treatment of dialysis patients, and support for early discharge [21]. It is also used in a current research project on e-health for children, addressing clinical areas like neonatal care, pediatric surgery of heart defects in newborns, and child oncology [23].

itACiH is implemented on top of an open source IoT middleware, PalCom² [2, 15, 26], that supports devices communicating over heterogeneous networks with weak connectivity. We will describe the PalCom device model (Section 3.1), the current distributed architecture of the itACiH Hospital-at-Home system (Section 3.2), and the method for enrolling and activating new mobile devices in the system (Section 3.3). We then use this scenario to identify challenges in the existing solution (Section 3.4) and end with formulating system requirements for a security architecture for middleware supporting Hospital-at-Home systems (Section 3.5).

3.1 The PalCom Device Model

The PalCom middleware provides a communication platform and software architecture for IoT systems. A PalCom device is a running instance of the middleware, typically representing the physical device it is hosted on, for instance a sensor in an IoT scenario, a mobile device like a tablet, or a server like a database. It is also possible to run several PalCom devices on the same physical device, which is useful for instance when testing and for separating functionality on a server.

PalCom is transparent to communication technologies and currently supports a number of protocols, including TCP/IP and UDP/IP. Different local networks can be connected with each other by communication over TCP/IP, using an encrypted PalCom tunnel, which multiplexes messages between the networks. If a device is connected to more than one physical network (like local networks or tunnels), the messages on one network can be automatically routed to the other networks by turning on routing for the device. This means that the resulting PalCom network can span many physical networks. A PalCom device is identified by a device ID that must be unique within this resulting network.

The functionality of a device is organized as PalCom services that communicate through messages with other services, on the same or other devices, similar to microservices [42]. A device discovery protocol enables devices to discover each other and exchange service interface descriptions. A device can set up a connection from one of its services to a service on another device, locating it with the device ID only, regardless of which physical networks the devices are located on. Messages can then flow in both directions between the two services.³

A limitation of the current implementation of PalCom is that it does not prevent spoofing of device IDs: any device can claim to have a particular ID. Current applications avoid this problem by authenticating the user rather than the device. This approach has limitations for the devices located in patients’ homes in Hospital-at-Home scenarios, where devices may need to operate for a long time, perhaps years, without human intervention, e.g., automatically relaying new measurements to the hospital server. As we will discuss in Section 3.4, a preferable solution would be to provide a device authentication mechanism at the middleware level.

Figure 2 illustrates a PalCom device with services, running on a physical device. Note that the device ID is independent from any IDs on the hosting hardware, such as serial numbers, and so forth.

Illustration of a PalCom device hosting three services, and running on a hardware device. — Fig. 2. A PalCom device runs as an application on a hardware device. Supported device ID schemes include assigned identities (A:...) and auto-generated UUIDs (C:...) and are independent of any hardware identities. Services expose the functionality of a device.

3.2 Distributed Architecture of the itACiH Hospital-at-Home System

The itACiH system follows the overall structure of Hospital-at-Home systems from Figure 1. Figure 3 shows its distributed architecture in terms of physical locations, PalCom devices, and PalCom tunnels.

Illustration of how users with tablets communicate with PalCom devices on the healthcare server frontend. — Fig. 3. Distributed architecture of the itACiH Hospital-at-Home system.

The Healthcare server is divided into a frontend that handles connections from tablets and other clients, and a backend that contains an encrypted database with patient and system data. For security reasons, the backend is isolated from clients, so any connections to it have to go via the frontend.

Administrators and medical staff at the hospital can access the system via web clients, using a two-factor authentication procedure with a personal smart card, issued by the hospital. This is handled in the frontend by two PalCom devices: Identity provider for the authentication and Web server for authenticated access.

What is interesting for this article is how the tablets communicate with the server. Tablets are represented by PalCom devices and access the server via two PalCom devices in the frontend: Tablet staging and Tablet frontend. Tablet staging does not have routing turned on. Therefore, a tablet that sets up a tunnel to Tablet staging cannot discover any of the other devices in the frontend or backend. Tablet frontend, on the other hand, has routing turned on, so a tablet that opens a tunnel to Tablet frontend can discover the Core System in the backend, and communicate with it. In the next subsection, we describe in more detail how tablets get access to the Core System.
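
The effect of the routing flag on discovery can be sketched as a reachability computation. This is a toy model, not PalCom's API: the function `discoverable`, the link list, and in particular the assumed internal links between the frontend devices and Core System are hypothetical and chosen only to mirror the description above.

```python
from collections import deque


def make_links(edges):
    """Build a symmetric adjacency map from a list of (device, device) links."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def discoverable(start, links, routing):
    """Return the devices that `start` can discover.

    A device other than `start` forwards messages to its other
    neighbours only if routing is turned on for it.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        device = queue.popleft()
        if device != start and not routing.get(device, False):
            continue  # reached, but does not relay traffic any further
        for neighbour in links.get(device, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


# Assumed toy topology: each tablet has one tunnel into the frontend.
links = make_links([
    ("tablet A", "Tablet staging"),
    ("tablet B", "Tablet frontend"),
    ("Tablet staging", "Core System"),   # assumed internal link
    ("Tablet frontend", "Core System"),  # assumed internal link
])
routing = {"Tablet staging": False, "Tablet frontend": True}

via_staging = discoverable("tablet A", links, routing)
via_frontend = discoverable("tablet B", links, routing)
```

In this model, `via_staging` contains only Tablet staging, whereas `via_frontend` also contains Core System, matching the behavior described for the two frontend devices.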

3.3 Mobile Device Enrollment and Activation in itACiH

A tablet device gets access to the core system via an enrollment and activation procedure, where cryptographic keys are installed on the devices. This procedure is performed in three steps. First, a system administrator enrolls the device, registering it as trusted by the system. Then a medical staff user can activate the device to be used for a session, whose duration can vary. For medical staff tablets, such an activation session is normally just a work day, whereas for patient tablets, the session can go on for years, until the session is explicitly revoked. As the third step, the tablet needs to be logged in by an end user (medical staff or patient) with a PIN code. Normally, a tablet is logged in only once for each activation session, but in case the device is powered down, it needs to be logged in again before communication can resume. For user interaction, there is also a screen lock on the tablet, which, for simplicity, uses the same PIN code as the login. We will now describe these steps in more detail, as illustrated in Figure 4.

Illustration of the states a new device goes through in order to log in. — Fig. 4. Device states during enrollment (admin user), activation (medical user), and logging in (end user).

Enrolling a new tablet is done by a system administrator that has physical access to the tablet. The administrator first starts the tablet, which automatically gets a generated unique PalCom device ID (if this is the first time it is started). At this point, the tablet cannot get into contact with the itACiH system, because it lacks the appropriate cryptographic keys (Figure 4(a)).

Next, the administrator accesses the itACiH system via a web client to generate a new, more suitable device ID for the tablet, and to record this device ID as trusted in the system. The itACiH tablet software is then installed on the tablet, along with the new device ID and a cryptographic “staging” key pair. The tablet can use this staging key pair (which is the same for all tablets) to set up an encrypted tunnel to the Tablet staging device in the server frontend (Figure 4(b)). It still cannot, however, discover other itACiH devices, since Tablet staging does not have routing turned on.

Activating an enrolled device is done by a medical staff user who has physical access to the tablet, and who uses a web client to access the healthcare server. The user starts by turning on the tablet, which will then connect to Tablet staging. On the web page, the user selects the desired tablet from a list of trusted devices, and provides a new password (PIN code) that will be used later for logging in on the tablet. Tablet frontend generates a new key pair⁴ with a matching self-signed certificate,⁵ and packages the key pair in a keystore that is locked with the provided PIN code. The matching certificate is stored in the truststore of Tablet frontend and is associated with the tablet’s device ID. This information represents the activation session for the tablet and can be removed in order to revoke the session. The PIN code itself is not stored anywhere. The locked keystore is then sent to the tablet via the Tablet staging device by addressing it to its registered device ID. The tablet is now in the activated state (Figure 4(c)).

Finally, the end user (staff or patient) enters the PIN code on the tablet, after which the tablet is logged in and can access the Core system via Tablet frontend (Figure 4(d)).
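The three steps above can be read as a small state machine. The following Python sketch is only an illustration of that reading: the state and event names are hypothetical labels, and the assumption that revoking a session returns a tablet to the enrolled state is ours rather than something the text states.

```python
from enum import Enum, auto


class DeviceState(Enum):
    STARTED = auto()    # Figure 4(a): has a device ID but no itACiH keys
    ENROLLED = auto()   # Figure 4(b): registered as trusted by an administrator
    ACTIVATED = auto()  # Figure 4(c): holds a PIN-locked keystore for a session
    LOGGED_IN = auto()  # Figure 4(d): keystore unlocked, can reach Core system


# Allowed transitions: (current state, event) -> next state.
# Event names are hypothetical; "revoke" -> ENROLLED is an assumption.
TRANSITIONS = {
    (DeviceState.STARTED, "enroll"): DeviceState.ENROLLED,
    (DeviceState.ENROLLED, "activate"): DeviceState.ACTIVATED,
    (DeviceState.ACTIVATED, "login"): DeviceState.LOGGED_IN,
    (DeviceState.LOGGED_IN, "power_down"): DeviceState.ACTIVATED,
    (DeviceState.ACTIVATED, "revoke"): DeviceState.ENROLLED,
    (DeviceState.LOGGED_IN, "revoke"): DeviceState.ENROLLED,
}


def step(state: DeviceState, event: str) -> DeviceState:
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not allowed in state {state.name}") from None
```

In this model, a tablet that is powered down falls back to the activated state and must be logged in again, which matches the behavior described for the PIN code.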

3.4 Challenges

The itACiH system fulfills basic requirements for a security architecture: Only trusted tablets can communicate with Core system, and all communication is secured by encryption over tunnels. However, the solution is unnecessarily complex with cryptographic keys being installed manually for each communication path and user session. This also makes it difficult to build more complex scenarios, e.g., where two tablets need to communicate directly with each other. For example, if a patient lives in a remote location with weak connectivity, it could be useful to relay data from a patient tablet directly over a local network to the tablet of a visiting nurse. This is currently not supported in the itACiH system.

Furthermore, the current solution does not support rotation of cryptographic keys without physical access to the device, and without network connection to Core system. For the medical staff tablets, this is no problem since they are activated and given new keys daily at the hospital. For patient tablets, however, such a manual rotation procedure is not scalable, even if key rotation only needs to be done monthly.

An additional challenge with the current solution is that data can be “stranded” on a medical staff tablet that runs out of battery when it is offline. This could happen, for example, if a nurse visits a patient in a remote location with weak connectivity, enters some notes on the tablet before driving back to the hospital, and the tablet then runs out of battery before connectivity is restored. When the tablet is powered up again, it needs to be logged in by the same nurse in order for the session to be resumed, allowing the stranded data to be automatically relayed to the server. However, if the nurse’s work shift is already over at this point, this may not be practical.

To overcome these challenges, we propose that the middleware should include support for trust and key management, as will be detailed in Section 5. In particular, this solution ensures that device IDs cannot be spoofed: each device can prove ownership of its device ID. With such support, encryption keys can be generated at the middleware level, a device can automatically rotate its keys without network access or human intervention, and stranded sessions can be allowed to complete without a specific user logging in.

With support for trust and key management at the middleware level, the enrollment and activation procedure of itACiH can be revised as shown in Figure 5. In the revised solution, all the phases are similar to the previous solution in Figure 4, but slightly simplified. In particular, there is no longer any need for the Tablet staging service, or for encrypted tunnels.

Fig. 5. Revised procedure for enrollment and activation, based on a middleware with trusted IDs.

When a device is started (Figure 5(a)), the device is automatically given a device ID that can be trusted (cannot be spoofed) and that it will keep over its lifetime. In the enrollment phase (Figure 5(b)), the itACiH software is installed, and the device connects to Tablet frontend over a non-encrypted tunnel. Because device IDs can no longer be spoofed, both Tablet frontend and Core system can now keep a whitelist of devices that they will accept communication from. If a tablet is lost or stolen, a medical user can remove it from the whitelist. For a tablet to actually be able to connect to Core system to access data, it still needs to be activated by a medical user (Figure 5(c)), and logged in using a PIN code (Figure 5(d)). Because a tablet can be securely identified using its trusted ID, any stranded sessions can be allowed to complete without login.

3.5 Technical Requirements

From our discussion above on the challenges of the presented Hospital-at-Home system, we extract a number of technical requirements for our proposed solution: \(R1\)–\(R6\). Furthermore, we consider a broader IoT perspective where thin (i.e., technically less capable) devices are prominent, and add two further requirements, \(R7\) and \(R8\).

\(R1\). Devices must be uniquely identifiable.

\(R2\). Identities must be permanent over the lifetime of devices.

\(R3\). Devices must be able to prove ownership of their identity.

\(R4\). Devices must support secure peer-to-peer communication.

\(R5\). Devices must operate on isolated networks, i.e., without Internet access.

\(R6\). Devices must be able to rotate cryptographic material autonomously.

\(R7\). Devices must operate on hardware with low computational power.

\(R8\). Devices must operate on low-bandwidth networks.

Devices need to identify themselves with an ID that is unique, so that no two devices are using the same device ID (\(R1\)). This device ID needs to be (practically) permanent, lasting for the entire lifetime of the device, thus requiring a long-lived ID (\(R2\)). In addition, a device needs to be able to show cryptographic ownership of its ID, such that it can be uniquely verifiable and not replicated or claimed by another device, in order to prevent spoofing (\(R3\)).

Devices also need support for secure communication with other devices (\(R4\)). A device may need to produce signatures (signing statements), so that the device can, e.g., demonstrate its ownership of its ID. A device must further support offline operation (\(R5\)). That is, two devices must be able to establish a connection between themselves without connecting to any central authority. This requirement is also relevant with regard to proof of ID ownership. Furthermore, a device must also be able to rotate (renew) its cryptographic material when needed, without involving any user and without connecting to any central authority (\(R6\)). This requirement can be relevant both for secure communication and for proving ownership of IDs.

We further require a solution that works for thin devices that are low on resources (\(R7\)), as is typical in the IoT domain. While many system devices used in the field will not be thin in this regard, such as tablets used by nurses, other system devices, such as sensors, will suffer from the typical IoT deficits of small memory and low computational power. Such devices may use low-power networks such as Zigbee or Bluetooth Low Energy, with low bandwidth capabilities (\(R8\)). The system design must be able to support these devices as well, allowing them to establish a connection in a reasonably short time, say around 1 second.

At protocol level, the security features must be cheap in practice. That is, the security extension of the system must not substantially increase the communication overhead, or lead to a significant increase in on-device processing requirements, or any other such undesired side-effects that render the system less useful or responsive in practical terms.

4 Threat Model

The scenario we work with assumes that Commercial off-the-Shelf (COTS) devices are used in the system. This in itself constitutes a considerable security threat to the system. However, threats do not only come from the use of COTS devices but also from local networks, the Internet, and other sources. We consider the following explicit threats in the system:

– Any device in the system can potentially run hostile software that tries to make illegal connections toward healthcare systems, other devices in the system, or end-user devices.

– An attacker can try to introduce any new device in the system pretending to be part of the local network.

– A legitimate user of an authorized device might try to gain privileges in the medical system that are not granted to that user or device, i.e., try a time privilege escalation attack.

– We adopt the Dolev–Yao model [12] and assume that the attacker can influence the system in all other aspects, including the following capabilities of the adversary:

– The attacker is able to intercept, modify, and replay all communication from and to any device.

– The attacker is able to launch input attacks by sending arbitrary messages to a device.

We explicitly exclude some specific threats to the system assuming the following:

– The healthcare backend system is assumed to be fully trusted.

– The healthcare personnel, i.e., the administrative users, in the system are assumed to be trusted (verified through explicit authentication functions, which we do not consider in this article).

– The desktop computer and similar units used by trusted administrative users are assumed to be fully trusted.

– The identities, keys, and so forth installed by a trusted administrative user on devices are assumed to be stored with integrity and confidentiality protection on the devices, such that an attacker cannot read, modify, or by other means gain access to them. That means that we also assume that a secure execution environment is available on the devices, realizing this protection of the identities and credentials when used in the system.

– The software on the devices is assumed to be integrity-protected. Furthermore, it is assumed that the software is executed in a protected execution environment such that it cannot be tampered with or eavesdropped on by any attacker in the system.

5 An IoT-Middleware Trust and Key Management Security Architecture

In this section, we outline our IoT-middleware trust and key management security architecture. We begin by summarizing the relevant design requirements that affect our design. Then we detail our security architecture in terms of device security, describing what we require in our system. As our design is also configurable in many aspects, we additionally explain what these optional alternatives allow a system implementer to accomplish, and at what cost. After this we further detail the two central functionalities supported by our IoT-middleware trust and key management security architecture. The first such functionality is device initialization, for entering new devices into the system, while the second is device discovery, for enabling trusted communication between devices.

5.1 Device Security

Devices may be in operation for a very long time. This longevity implicitly dictates a necessity to provide support for cryptographic primitives that are quantum-safe, so-called post-quantum primitives. Such primitives are under development and are currently being standardized, and the available algorithms include both asymmetric encryption for establishing secure communication channels, and digital signatures for applications requiring non-repudiation. A particular application may need only asymmetric encryption, or only digital signatures, but our design must (and indeed does) support both types of primitives.

Identification with a unique device ID that does not change over time, coupled with the property that the device needs to be able to prove ownership of the ID, implies that the ID must be cryptographically linked to the device. An adversary should not be able to falsely prove ownership of an ID. While a simple solution would be to equip each device with an asymmetric key pair, and use the public key as the device ID, this solution falls short when it comes to the requirement of support for rotating cryptographic material. With that requirement, we need to make every device able to replace a current set of cryptographic keys with a new set, without replacing the device ID. Such key rotation may, e.g., be enforced by administrators or be initiated by devices themselves, either periodically, after a key has been compromised, or at other times. As an alternative, devices could be instantiated with a large set of key material, from which the device ID is derived. A device could then rotate to a new key pair, from its set of keys, without replacing its device ID. This approach requires that the device can provide proof that the new key pair is indeed linked to the device ID, which therefore also touches on the protocol requirements. As the proof of ID ownership must be embedded into the communication protocol, such proofs need to be small in terms of communication overhead.

A technical tool that balances all of the design tradeoffs in a favorable way for all of the above requirements is a Merkle tree. Consider a device with a large set of cryptographic keys, say \(n\) of them. We can use a Merkle tree to tie all of the corresponding key pairs together and into a unique ID formed from the root of the Merkle tree. Referring to Figure 6 for a toy example of a Merkle tree for \(n=8\), the tree is constructed by employing a cryptographically secure hash function \(H\) in a very systematic way. The public keys of every key pair are hashed into digests which form the leaves of the Merkle tree.

Fig. 6.

The nodes in the Merkle tree are labeled \(h^{\ell}_{i}\), where \(\ell\) denotes the level of the node, counting from 0 (zero) at leaf level up to the Merkle root, which is at level \(\lceil\log(n)\rceil\) (here and below, \(\log\) denotes the base-2 logarithm). The index \(i\) denotes the position of the node, ordered sequentially from left to right, within that level. A parent node is calculated as the hash of its concatenated children, such that \(h^{\ell+1}_{i}=H(h^{\ell}_{2i}\|h^{\ell}_{2i+1})\). The Merkle root is the top node, which can also be denoted \(h^{\lceil\log(n)\rceil}_{0}\) in the general case, corresponding to \(h^{3}_{0}\) in Figure 6.
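To make the labeling concrete, the following Python sketch builds all levels of such a tree. It is a minimal illustration under two assumptions that are not taken from the text: SHA-256 stands in for \(H\), and the number of keys is a power of two, as in the \(n=8\) example.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def merkle_levels(public_keys: list[bytes]) -> list[list[bytes]]:
    """Return all levels of the tree, leaves (level 0) first, root last.

    Assumes len(public_keys) is a power of two, as in the n = 8 example.
    """
    n = len(public_keys)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two number of keys")
    levels = [[H(pk) for pk in public_keys]]  # level 0: the leaves h^0_i
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Parent rule: h^{l+1}_i = H(h^l_{2i} || h^l_{2i+1})
        levels.append([H(prev[2 * i] + prev[2 * i + 1])
                       for i in range(len(prev) // 2)])
    return levels
```

For eight keys, `merkle_levels` returns four levels of sizes 8, 4, 2, and 1, and `levels[3][0]` plays the role of \(h^{3}_{0}\).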

5.1.1 Intuition and Design Motivation.

Before we dive into the more detailed options of the solution space, let us first take a step back and clarify what the solution using the Merkle tree structure actually provides. The Merkle root can be viewed as a commitment to a (large) set of cryptographic keys. That is, a device commits to using a (any) specific set of cryptographic keys during its lifetime. Note that this set of keys can either be generated by the device itself (locally on the device), or an administrator can generate a set of keys and load these onto a device during device initialization.

Although we employ a finite set of cryptographic keys, the set can be very large in practice since the device does not need to store the entire set of keys in memory. This “storage magic” can be accomplished by generating the key set from a device-unique seed (which needs to be rigorously protected from leakage or misuse). Furthermore, our solution can also be adapted to allow dynamic expansion of the cryptographic key set when necessary.

The Merkle root—the commitment to the set of cryptographic keys—is used as the device ID that does not change over the lifetime of the device. How large the set of cryptographic keys needs to be is application dependent, but for intuition, a reasonable size for the set can be calculated by considering the maximum expected lifetime of a device together with the expected frequency of key rotation in the system. As a concrete example, consider a system instantiation with a life expectancy of 50 years, and where devices rotate keys once per month. This system instantiation would need to support devices that use in total \(n=12\times 50=600\) keys. Key sets of this size can be handled with ease using a Merkle tree of height \(\lceil\log(n)\rceil=\lceil\log(600)\rceil=10\).
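The sizing rule in this example can be written out as a short calculation. The Python sketch below uses hypothetical function names and computes \(\lceil\log(n)\rceil\) with integer arithmetic, taking the logarithm to base 2.

```python
def key_set_size(lifetime_years: int, rotations_per_year: int) -> int:
    """Number of keys n needed over the device lifetime."""
    return lifetime_years * rotations_per_year


def tree_height(n: int) -> int:
    """ceil(log2(n)) for n >= 1, computed without floating point."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1).bit_length()


n = key_set_size(lifetime_years=50, rotations_per_year=12)
print(n, tree_height(n))  # prints: 600 10
```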

To motivate the added communication complexity introduced by our design choice, consider the added amount of overhead in the resulting communication protocols. If we compare unauthenticated connections with authenticated ones, the devices need to show that they “own” the device ID that they present. A device can do this by showing that

(1) the public key that it is currently using is indeed in the committed set of keys, and

(2) it also has the corresponding private key (without revealing it).

Part (1) above amounts to showing that the public key is indeed in the Merkle tree. Showing this is a matter of providing evidence that the Merkle tree contains a path from the given public key to the Merkle root. Note that this proof contains only information that is publicly verifiable. In particular, this means that we must only include public key information in the Merkle tree. To realize that part (2) is also necessary, the reader may consider the procedure when setting up a Transport Layer Security (TLS) connection [35] between two parties. In a typical such connection, the parties first present certificates containing a proof of identity coupled with their public key. But the TLS protocol also includes a challenge-response part in which the device must indeed prove that it possesses the corresponding private key (without revealing it).

The Merkle tree structure allows for part (1) above to be added to a communication protocol by adding \(\mathcal{O}(\log(n))\) bytes of data, where \(n\) is the size of the set of cryptographic keys. Part (2) comes virtually for free, adding only \(\mathcal{O}(1)\) bytes of data to the communication protocol.
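Part (1) can be illustrated with a short Python sketch that builds a proof of inclusion and checks it against the Merkle root. This is a sketch under assumptions, not the implementation described in this article: SHA-256 stands in for \(H\), the tree has a power-of-two number of leaves, and all function names are hypothetical. Part (2), proving possession of the private key, requires a signature or key-exchange step and is not shown.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all tree levels, leaves first; assumes a power-of-two leaf count."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[j] + prev[j + 1]) for j in range(0, len(prev), 2)])
    return levels


def inclusion_proof(levels, index):
    """Sibling hashes on the path from leaf `index` up to, but excluding, the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # the sibling of the current node
        index //= 2
    return proof


def verify_inclusion(public_key, index, proof, device_id):
    """Recompute the root from the public key and the proof, then compare."""
    node = H(public_key)
    for sibling in proof:
        # Even index: node is a left child; odd index: node is a right child.
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == device_id
```

For a tree with eight leaves, the proof consists of three hashes, in line with the \(\mathcal{O}(\log(n))\) overhead stated above.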

While we primarily consider the Hospital-at-Home scenario in this work, our design is also suited for some IoT applications and devices which need to balance usage of their scarce resources in terms of memory, processing, and communication capabilities. For example, if a device is low on storage capabilities, it does not actually need to store the entire key set. It is sufficient to store only a (pseudo-random) seed that can be used to generate all the keys in the key set when they are actually needed. How this can be done is detailed in the subsequent section.

The reader may further note that it is not necessary to encrypt the Merkle tree for protection, since it contains only public information, but it does need integrity protection to prevent adversarial mutability and Denial of Service attacks. The seed and all private keys must be stored securely on the device, so that no external party can utilize or alter them.

5.1.2 Key Pair Chaining.

A single (pseudo-)random seed stored on a device can be used to generate a sequence of \(n\) asymmetric key pairs \((pub_{i},priv_{i}),i=0,\ldots,n-1\). While this can be done in several different ways, for ease of reading, we describe one simple solution here, referring the committed reader to [44] for further options.

A key generation algorithm (KGA) typically takes (pseudo-)random data as input and outputs one key pair \((pub,priv)\). To produce an entire set of \(n\) key pairs from the same seed, we can utilize a counter value as additional seed material. One way of doing this is to utilize a counter to produce each key pair \((pub_{i},priv_{i})\) directly from the seed using the KGA according to

\begin{align}(pub_{i},priv_{i})=KGA(seed\|i).\tag{1}\end{align}

Note that it is important to use secret data as input here, as it must be impossible for an adversary to compute the next key pair from public information only (the public key). This is one way of generating a set of keys from a single random seed. With the key generation method above, devices can use the keys in the key set out-of-order if needed. An advantage of this particular way of key chaining is that it can provide forward secrecy as a system property [44].

Note that the seed used to generate the key set needs to contain enough entropy to ensure that it is not easy for an adversary to guess the seed. This is important, because in a technical sense, the seed is the key set, and needs to be protected as such. So storage of the seed on the device needs to be as secure as storing a private key on the device.
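The counter-based derivation in Equation (1) can be sketched in Python as follows. Note the assumptions: `toy_kga` is a hypothetical stand-in that only mimics the interface of a KGA and is not a secure asymmetric scheme, and the 4-byte big-endian encoding of \(i\) is our choice. A real system would pass the same derived material to the key generation routine of a post-quantum scheme.

```python
import hashlib


def toy_kga(material: bytes) -> tuple[bytes, bytes]:
    """Stand-in for a real KGA (hypothetical, NOT a secure asymmetric scheme).

    Deterministically maps input material to a (pub, priv) pair so that the
    derivation structure can be shown; here pub is simply a hash of priv.
    """
    priv = hashlib.shake_256(b"priv" + material).digest(32)
    pub = hashlib.sha256(b"pub" + priv).digest()
    return pub, priv


def key_pair(seed: bytes, i: int) -> tuple[bytes, bytes]:
    """(pub_i, priv_i) = KGA(seed || i), with i encoded as 4 big-endian bytes."""
    return toy_kga(seed + i.to_bytes(4, "big"))
```

Because every pair depends only on the seed and the index, a device can regenerate `key_pair(seed, i)` for any `i` on demand, in any order, while storing nothing but the seed.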

5.1.3 Complexities Summary.

Computing a device ID (Merkle root) involves generating all the \(n\) key pairs in the key set from the seed. This takes \(\mathcal{O}(n)\) time. Counting the number of times we need to apply a hash function operation, we see that we first need \(n\) hash operations to produce the leaves \(h^{0}_{i}\) in the Merkle tree, and then an additional \(n-1\) hash operations to compute the Merkle root itself. So computing the device ID takes \(\mathcal{O}(n)\) time.

The amount of storage required for the device ID computation is merely \(\mathcal{O}(\log(n))\). To see this, note that it is not necessary to store the entire Merkle tree in memory at any given time. All that is needed is to store \(\lceil\log(n)\rceil\) hashes, corresponding to the currently processed hashes in the Merkle tree, as the key pairs \((pub_{i},priv_{i})\), \(i=0,\ldots,n-1\), are processed sequentially, deleting each key pair once it has been processed.
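One way to realize this logarithmic storage bound is a stack-based, single-pass computation, sketched below in Python. The sketch assumes SHA-256 for \(H\) and a power-of-two number of keys; the function name is hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def streaming_root(public_keys):
    """Compute the Merkle root from an iterable of public keys in one pass.

    The stack holds at most one pending (level, hash) pair per tree level,
    so memory use is O(log n). Assumes the number of keys is a power of two.
    """
    stack = []
    for pk in public_keys:
        level, node = 0, H(pk)
        # Merge with the pending left sibling whenever one exists at this level.
        while stack and stack[-1][0] == level:
            _, left = stack.pop()
            node = H(left + node)
            level += 1
        stack.append((level, node))
    if len(stack) != 1:
        raise ValueError("number of keys must be a power of two")
    return stack[0][1]
```

The input can be a generator that derives each public key from the seed on the fly, so that neither the key set nor the full tree is ever held in memory.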

Now consider the size of the proof of inclusion. Let \(b\) denote the size of the hash digest in bits. The proof of inclusion consists of the adjacent hashes in the Merkle tree path going from the current key pair leaf node to the Merkle root, but excluding the Merkle root itself, since it is passed in the protocol separately as the device ID. The size of the proof of inclusion is therefore exactly \(b\times\lceil\log(n)\rceil\) bits. Here, \(b\) depends on the cryptographic hash function that is used, but typical sizes in actual applications have \(b\in\{128,256,512\}\).
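As a worked example of this formula, the following Python sketch evaluates \(b\times\lceil\log(n)\rceil\) for the key set size \(n=600\) used earlier and the three typical digest sizes. The function name is hypothetical, and the logarithm is taken to base 2.

```python
def proof_size_bits(b: int, n: int) -> int:
    """Size of a proof of inclusion: b * ceil(log2(n)) bits, for n >= 1."""
    height = (n - 1).bit_length()  # ceil(log2(n)) in integer arithmetic
    return b * height


for b in (128, 256, 512):
    bits = proof_size_bits(b, 600)
    print(b, bits, bits // 8)
# prints:
# 128 1280 160
# 256 2560 320
# 512 5120 640
```

With a 256-bit hash, a proof of inclusion for a 600-key device therefore adds 320 bytes to a handshake.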

5.2 Initial Device Registration

Consider a new and unregistered device \(D_{1}\). Before deploying the device, we execute an initial registration procedure to create a unique device ID for \(D_{1}\) as follows. We shall assume that the system designer has decided for each device to have a simultaneous set of \(k\) quantum-safe key pairs. That is, each Merkle tree leaf node has \(k\) separate key pairs, which can be used concurrently for different purposes. The reader may, for example, consider the concrete case \(k=2\), where the first key pair is used for signing messages or statements, and the second is used for session establishment.

We also require the system administrator to specify the parameter \(n\), indicating how many such sets of key pairs are to be generated. As we remarked previously, a suitable number \(n\) can be determined by considering the life expectancy of the key pairs and of the system itself. Finally, the system administrator also needs to decide on a cryptographically secure hash function \(H\) to be used for the hash computations. The device ID generation procedure is as follows.

(1) Generate \(n\) sets of key pairs \(s_{i}=\left\{(pub_{i,0},priv_{i,0}),\ldots,(pub_{i,k-1},priv_{i,k-1})\right\}, i=0,\ldots,n-1\), where each key pair \((pub_{i,j},priv_{i,j})\) is a tuple of a public key \(pub_{i,j}\) and its corresponding private key \(priv_{i,j}\).

(2) Following the notation in Figure 6, first compute the leaf node hash digests \(h_{0}^{0},\ldots,h_{n-1}^{0}\) from the public keys in each set of key pairs according to

\begin{align*}h_{i}^{0}=H(s_{i}):=H(pub_{i,0}\|\ldots\|pub_{i,k-1}),i=0,\ldots,n-1.\end{align*}

(3) Compute the entire Merkle tree up to and including the Merkle root by successively applying the equation \(h^{\ell+1}_{i}=H(h^{\ell}_{2i}\|h^{\ell}_{2i+1})\). For the full Merkle tree computation, \(n-1\) such hashing computations need to be performed. The Merkle root is the final hash value \(h_{0}^{\lceil\log(n)\rceil}\).

Denoting the ID of device \(D_{1}\) as \(ID_{D_{1}}\), we have

\begin{align*}ID_{D_{1}}:=h_{0}^{\lceil\log(n)\rceil}.\end{align*}

A path from a Merkle tree leaf \(h_{i}^{0}\) up to the root \(ID_{D_{1}}\) can be used as a proof of inclusion, in the sense that it shows that the public keys of key set \(s_{i}\) have indeed been included in the computation of the device ID. Another way of phrasing this is that the Merkle tree computation serves as a commitment to a specific set of key pairs.
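Steps (2) and (3) of the procedure can be sketched in Python as follows, starting from the public keys produced in step (1). The sketch assumes SHA-256 for \(H\), a power-of-two \(n\), and public keys of fixed length, so that plain concatenation within a leaf is unambiguous; the function names are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def leaf_hash(public_keys_in_set):
    """Step (2): h^0_i = H(pub_{i,0} || ... || pub_{i,k-1})."""
    return H(b"".join(public_keys_in_set))


def device_id(key_sets):
    """Step (3): Merkle root over the n key sets.

    key_sets[i] is the list of the k public keys of set s_i.
    Assumes n = len(key_sets) is a power of two.
    """
    level = [leaf_hash(s) for s in key_sets]
    if len(level) == 0 or len(level) & (len(level) - 1):
        raise ValueError("this sketch needs a power-of-two number of key sets")
    while len(level) > 1:
        level = [H(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
    return level[0]
```

For \(k=2\), each entry of `key_sets` would hold the signing and session-establishment public keys of one rotation period. Changing any single public key changes the resulting device ID, which is what makes the root a commitment to the whole set.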

Initial device registration can, of course, be managed and performed centrally, with device ID generation being performed on separate or dedicated equipment, and the device ID and possibly the entire Merkle tree then being transferred to the device. However, it is equally possible to distribute the device ID computation to the devices themselves.

If authenticated registration is a system requirement, then device IDs are additionally signed by the system administrator to complete the initial registration. The system administrator’s signature of the device ID then serves as a proof of registration, which sets it apart from unregistered devices.

In many use cases, it is required to have a centralized system to manage trust and key rotations. From a technical perspective, this amounts to having devices that are able to do two specific things: verify signatures and support a notion of time. To support this, the following two steps can be performed in the registration phase. First, the public key of the backend system administrator is stored in every device. Second, the administrator may impose a key rotation policy for the device to adhere to. For example, a key rotation policy can be a set of instructions for when and how to rotate keys on a device, and the backend administrator signs this policy to avoid having devices follow malicious instructions. The instructions themselves can typically include time stamps for validity or key rotation instructions. These steps enable the devices to check whether other devices that want to connect with them are legitimately registered into the system, and to handle connections during the entire system life cycle.
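As an illustration of what a time-based key rotation policy might look like, the Python sketch below maps the current time to the index of the active key set. The policy format is entirely hypothetical, since no concrete format is prescribed here, and the verification of the administrator's signature on the policy is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RotationPolicy:
    """Hypothetical policy: rotate every `period_s` seconds, from `start_s`."""
    start_s: int   # Unix time at which key set s_0 becomes active
    period_s: int  # length of each rotation interval in seconds
    n: int         # number of key sets committed to in the Merkle tree

    def active_index(self, now_s: int) -> int:
        """Index i of the key set s_i that should be in use at time now_s."""
        if now_s < self.start_s:
            raise ValueError("policy is not yet valid")
        i = (now_s - self.start_s) // self.period_s
        if i >= self.n:
            raise ValueError("committed key sets are exhausted")
        return i
```

A device that follows such a policy needs only a local clock to decide when to move to the next key set, which is consistent with rotation without network access.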

5.3 New Device Discovery

Device discovery in PalCom is bi-directional and used by all devices to automatically discover devices on available networks. This is accomplished through periodic heartbeat messages, broadcast by devices to announce their presence. When two devices discover each other, they exchange IDs and public keys, in order to set up a secure connection between themselves, and for both devices to prove ownership of their respective device ID. While this discovery procedure is PalCom intrinsic, the remaining part of setting up a secure connection is principally the same as setting up a TLS connection. However, our architecture supports any suitable handshake protocol to be used in practice.

In Section 7.3, we provide one example of a concrete handshake protocol that we use in our implementation, mimicking a full TLS implementation in sufficient detail.

5.4 Device Re-Discovery with Key Rotation

We now explicitly consider the case when two devices \(D_{1}\) and \(D_{2}\) want to re-establish a connection after a key rotation. That is, we consider the case where \(D_{1}\) and \(D_{2}\) have met before (so they may know each other’s device ID), but device \(D_{1}\) has been off-grid or sleeping for a long time, and is unaware that device \(D_{2}\) has rotated its keys in the meantime (for whatever reason).

Device re-discovery in itself is equivalent to resuming a previously active communication session. If such a communication session has been previously established, and the device still recognizes the authentication of the communicating device as valid, then we can save some data transmission load in the handshake protocol to make it a little bit more efficient, as the authentication part with the proof of ID ownership is not necessary to transmit.

However, a previous session cannot be resumed if any one of the communicating entities has rotated its keys. This situation needs to be detectable, and a full handshake with a new session (key) establishment must be completed. In particular, to avoid spoofing (i.e., to prevent falsifying identification), this necessarily includes sending a new proof of inclusion for the new key pair that the key-rotated device is now using.

The utilization of a Merkle tree to store the keys makes device key rotation relatively easy because there is no need for any complex key-revocation procedure. While the root of the Merkle tree, and consequently the device ID, remains valid, keys in the tree can be rotated by the devices themselves at any time in a very simple fashion. If a device \(D_{1}\) with ID \(ID_{D_{1}}\) that is using key set \(s_{i}\) wants to rotate its keys, it is enough for this device to start utilizing key set \(s_{i+1}\), while the ID \(ID_{D_{1}}\) remains unchanged. In Section 6, we explain in more detail why the devices’ IDs are permanent and are not altered during the lifetime of the device.
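The decision a device faces on re-discovery can be sketched as follows in Python. This is a simplified model under stated assumptions rather than the handshake of our implementation: SHA-256 stands in for \(H\), the `PeerCache` class and its return values are hypothetical, the proof of private key possession is omitted, and the sketch does not prevent a peer from rotating back to an earlier key.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def verify_inclusion(public_key, index, proof, device_id):
    """Recompute the Merkle root from a leaf and its sibling hashes."""
    node = H(public_key)
    for sibling in proof:
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == device_id


class PeerCache:
    """Remembers, per device ID, the public key last authenticated for it."""

    def __init__(self):
        self._known = {}

    def on_rediscovery(self, device_id, public_key, index=None, proof=None):
        """Return 'resume', 'full', or 'reject' for a re-discovered peer."""
        if self._known.get(device_id) == public_key:
            return "resume"  # same key as before: no new proof is needed
        if proof is None or not verify_inclusion(public_key, index, proof, device_id):
            return "reject"  # unknown or rotated key without a valid proof
        self._known[device_id] = public_key  # accept the rotated key
        return "full"  # a full handshake with new session keys is required
```

The device ID passed to `on_rediscovery` stays the same across rotations; only the public key and its proof of inclusion change.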

6 Security Evaluation

In this part, we present a security analysis of our proposed security architecture for trust establishment and key management in Section 5. In particular, we evaluate the security under the threat model in Section 4, and we justify how we satisfy the technical requirements defined in Section 3. These requirements are assured as follows:

\(R1\) and \(R3\). A device must have a unique identification and must be able to prove ownership of its ID. In Section 5, we presented a mechanism to generate device IDs by utilizing an asymmetric cryptosystem, a secure hash function and a Merkle tree. All key material and the device ID used by a single device is generated from a secret seed that is known only to the device. So fulfillment of \(R1\) and \(R3\) depends on two components. The first is the secret seed, which must be generated with sufficient entropy, to avoid an adversary being able to correctly guess the seed. The second component regards the cryptographic primitives. The hash function used to compute the Merkle tree must sport a sufficiently large output size (in bits), so that it is difficult in practice for an adversary to cheat by creating collisions. These two components together guarantee the uniqueness and ownership provability of the device ID.

\(R2\). Permanent device identity. The construction with the Merkle tree, which incorporates the \(n\) key pairs into the device ID, provides flexibility to tailor the longevity of the device ID depending on the expected lifetime of the device, making it permanent in a very practical sense.

There is also another time aspect regarding the cryptographic primitives that are used in the design. In order to protect our architecture from quantum computers, we use quantum-safe cryptography, so that we can guarantee that the device IDs that have been generated by our system are functional even in the quantum era.

\(R4\). Devices must support secure peer-to-peer communication. This security requirement is satisfied by the fact that a device can establish an authenticated connection by proving that the public key it holds indeed is incorporated into the Merkle tree and, therefore, cryptographically tied to the device ID.

\(R5\). Devices must operate on isolated networks, i.e., without Internet access. This requirement is satisfied if every device stores the public key of the backend administrator. The reason for this is that a device must be able to check that the device it is connecting to is properly enrolled into the system, which amounts to verifying that the corresponding device ID has been signed by the system administrator. Even so, all operation in isolated networks is possible, since any two devices can set up a secure connection between them using only local information. That is, signed device IDs, key material and proofs of inclusion are all locally available and verifiable without Internet access.

\(R6\). Devices must be able to rotate cryptographic material autonomously. As described in Section 5, in the initial device registration phase, the device is primed with the backend system administrator’s public key and a key rotation policy defined by the administrator. This means that, when it is time for key rotation (according to the policy), the device can automatically discard old keys and start using the next set of keys in the Merkle tree.

\(R7\) and \(R8\). Devices must operate on hardware with low computational power, and on low-bandwidth networks. In our security architecture in Section 5, we utilize a lightweight scheme to achieve trust establishment and key management. The computational cost of our scheme is dominated by the computation of a moderate number of hash values, which makes the architecture suitable for devices with low computational power. Moreover, the proof of ID ownership is a set of hash values, which are comparatively small in size and do not significantly impact the bandwidth usage.

From the fulfillment of the above requirements, we can summarize a core system property in the following theorem.

Theorem 6.1.

No Dolev–Yao adversary is able to falsely prove ownership of a valid ID that they do not in fact own.

One direct consequence of the above theorem is that our proposed architecture provides privacy for devices and users by protecting all communication. An attacker cannot connect their device to any of the registered devices in the Hospital-at-Home ecosystem. Moreover, an attacker cannot obtain any private data by eavesdropping on the communication between legitimate devices, since all sensitive data is sent over secured channels. Therefore, the attacker cannot invade the privacy of patients by accessing their medical data. However, our architecture requires a permanent ID that uniquely identifies and authenticates each device. This ID might be tracked, which might, under some circumstances, constitute a privacy threat. In other words, although attackers cannot gain access to the patients’ private medical data, they might be able to link a device to a certain user. One simple way to tackle this privacy threat is to encrypt the device IDs before sending them to other system entities. Our threat model does not cover this privacy threat, and we leave it for future work. Note, however, that since the device IDs are signed with the backend administrator’s private key, it is impossible for an attacker to successfully spoof a registered device ID.

7 System Implementation and Experimental Evaluation

In this part, we detail our system implementation choices. Then, we present the performance evaluation of our security architecture. For reproducibility, we provide an artifact at https://zenodo.org/records/10999243.

7.1 Method

To evaluate the effectiveness and practical applicability of our security architecture and design, we developed a prototype implementation in Java using the Bouncy Castle cryptographic library. This choice was made with compatibility with PalCom in mind, which is also Java-based. Our choice of cryptographic functions and algorithms is detailed in Table 1.

Table 1.

Function	Algorithm	Configuration
Hashing	SHA3	SHA3-512
Key encapsulation	Kyber	kyber1024
Signatures	Dilithium	dilithium3
Pseudorandom number generator	SHA1	SHA1PRNG

Table 1. Algorithms Used in the Implementation

Our evaluation involved a series of three experiments performed on our implementation, targeting the three scenarios from Section 5, important to real-world deployment.

– Device Initialization: generating keys and Merkle trees.

– New Device Discovery: communicating public keys and proof of ownership of ID.

– Device Re-Discovery with Key Rotation: renewing keys and re-establishing a new session.

All experiments were conducted on a Raspberry Pi 3, utilizing Raspbian 10 as the operating system and OpenJDK 11.0.18 for Java execution. Time measurements were taken directly on the Raspberry Pi using Java’s built-in System.nanoTime() function. For the infrequent actions of device initialization and key rotation, we measured Java startup performance. For the more frequent activity of new device discovery, we instead measured Java steady-state performance [16]. To ensure a comprehensive evaluation, all measurements were repeated across Merkle trees of varying heights, ranging from 7 to 13.

7.2 Implementation of Initial Device Registration

On its initial startup, a device creates a 512-bit seed using Java’s function SecureRandom.getInstanceStrong(). Through Equation (1) from Section 5.1.2, i.e., \(H(seed\|i)\) where \(H\) is the hashing function (SHA3) and \(i\) is the key index, this initial seed is used to spawn new seeds to be used when creating the asymmetric keys. These new seeds are used as input to the pseudo-random number generator that generates the asymmetric keys.
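The derivation \(H(seed\|i)\) can be sketched with JDK classes only. The following is an illustration under stated assumptions, not the artifact's code: the byte encoding of \(i\) (here a 4-byte big-endian integer) is not specified in the text, and the names `SeedDerivation`, `childSeed` and `prngFor` are hypothetical. The deterministic behaviour of a SHA1PRNG instance that is seeded before its first use is that of the default OpenJDK provider.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Arrays;

class SeedDerivation {
    // Child seed for key index i, computed as SHA3-512(seed || i).
    // Assumption: i is appended as a 4-byte big-endian integer.
    public static byte[] childSeed(byte[] seed, int i) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA3-512");
            md.update(seed);
            md.update(ByteBuffer.allocate(4).putInt(i).array());
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA3-512 requires Java 9 or later", e);
        }
    }

    // Generator for key index i. With the default OpenJDK provider, seeding
    // SHA1PRNG before its first use makes its output depend only on the seed.
    public static SecureRandom prngFor(byte[] seed, int i) {
        try {
            SecureRandom prng = SecureRandom.getInstance("SHA1PRNG");
            prng.setSeed(childSeed(seed, i));
            return prng;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] seed = new byte[64]; // 512-bit device seed
        // The text uses SecureRandom.getInstanceStrong(); the default
        // SecureRandom is used here so the sketch never blocks on entropy.
        new SecureRandom().nextBytes(seed);

        System.out.println(childSeed(seed, 0).length); // prints 64
        // Different indices give unrelated child seeds.
        System.out.println(Arrays.equals(childSeed(seed, 0), childSeed(seed, 1)));
    }
}
```

Because every child seed is recomputable from the initial seed and the index, a device only needs to persist those two values to regenerate any key pair later.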

In our implementation, each seed is used to create two key pairs: one for key encapsulation (Kyber) and the other for signing (Dilithium). The public keys are used as leaves to construct a Merkle tree, as described in Section 5.2. The root hash of this tree, once base-64 encoded for brevity and readability, is used to represent the device ID. Lastly, the device securely stores the initial seed and the current key index. For more efficient key rotations, our implementation also caches all the leaf hashes of the Merkle tree.
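As a rough illustration of how a device ID follows from the leaves, the sketch below hashes placeholder byte arrays with SHA3-512, reduces them pairwise to a root, and base-64 encodes the result. How the Kyber and Dilithium public keys are serialized into a leaf is not stated here, so the leaves are stand-ins, and the names `DeviceIdSketch`, `merkleRoot` and `deviceId` are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class DeviceIdSketch {
    // SHA3-512 over the concatenation of the given parts.
    public static byte[] sha3(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA3-512");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA3-512 requires Java 9 or later", e);
        }
    }

    // Root of a complete binary Merkle tree over the given leaves.
    // The number of leaves must be a power of two (2^height).
    public static byte[] merkleRoot(byte[][] leaves) {
        byte[][] level = new byte[leaves.length][];
        for (int i = 0; i < leaves.length; i++) level[i] = sha3(leaves[i]);
        while (level.length > 1) {
            byte[][] next = new byte[level.length / 2][];
            for (int i = 0; i < next.length; i++) {
                next[i] = sha3(level[2 * i], level[2 * i + 1]);
            }
            level = next;
        }
        return level[0];
    }

    // Device ID as the base-64 encoding of the 64-byte root hash.
    public static String deviceId(byte[][] leaves) {
        return Base64.getEncoder().encodeToString(merkleRoot(leaves));
    }

    public static void main(String[] args) {
        // 2^7 = 128 placeholder leaves, standing in for serialized public keys.
        byte[][] leaves = new byte[128][];
        for (int i = 0; i < leaves.length; i++) leaves[i] = new byte[] {(byte) i};
        String id = deviceId(leaves);
        System.out.println(id.length()); // prints 88
        System.out.println(id);
    }
}
```

Changing any single leaf changes the root, which is what ties every key pair to the one device ID.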

The performance evaluation of device initialization was performed on a cold Java Virtual Machine (JVM) to reflect the one-time nature of this process at startup. To mitigate variability, the evaluation was repeated 50 times for each tree height, with a new JVM instance for each iteration.

The results are summarized in Table 2. They include the number of keys generated (Keys column) and an estimated lifespan for the device ID (Lifespan column), assuming monthly key rotations. The average time for the initialization process is broken down into the three columns Create Keys (time to generate keys), Build Tree (time to construct Merkle tree), and Other (all remaining tasks). Lastly, the column Total shows the total time for the process, along with a 95% confidence interval.

Table 2.

Height	Keys	Lifespan (years)	Create Keys (s)	Build Tree (s)	Other (s)	Total (s)
7	128	10.67	0.96	0.32	0.46	2.54 \(\pm\) 0.19
8	256	21.33	1.45	0.44	0.58	3.63 \(\pm\) 0.16
9	512	42.67	2.41	0.63	0.80	5.79 \(\pm\) 0.24
10	1,024	85.33	4.15	1.48	1.74	10.30 \(\pm\) 0.44
11	2,048	170.67	7.36	2.36	2.79	17.94 \(\pm\) 0.87
12	4,096	341.33	13.45	4.55	5.03	32.43 \(\pm\) 1.20
13	8,192	682.67	25.69	9.19	9.73	61.66 \(\pm\) 2.52

Table 2. Measured Time for Generating Keys and a Merkle Tree to Create a Device ID

The Lifespan column is the estimated number of years an ID will last, assuming monthly key rotations.

These results demonstrate practical feasibility. For example, generating 1,024 keys for a tree of height 10 takes roughly 10 s on a Raspberry Pi 3, and ensures an expected lifespan of roughly 85 years with monthly key rotations.
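The Lifespan column of Table 2 follows directly from the tree height \(h\), assuming one key per month as stated above:

\[
\text{lifespan (years)} = \frac{2^{h}}{12}, \qquad \text{e.g., } h = 10:\ \frac{1{,}024}{12} \approx 85.33 .
\]

Each additional level of the tree therefore doubles the lifespan, while Table 2 shows the initialization time also roughly doubling.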

7.3 Implementation of Secure Connections between Devices

Establishing a secure connection between two entities with mutual authentication is a two-part process. The first part is about device discovery, which is handled by PalCom’s heartbeat functionality. The second part is that of establishing the secure communication itself. Conceptually, this is precisely what a TLS connection provides.

While any standard implementation of TLS would suffice for a proof of concept, for the purpose of performance evaluation we provide a simple, lightweight, and explicit example of a handshake protocol tailored to integrate with PalCom. An advantage of this approach is that it makes the use of the techniques we have described explicit for the reader.

Our protocol enables two devices to discover each other and establish a secure connection. Specifically, it serves three primary functions as follows.

(1) Devices exchange IDs, public keys, and cryptographic proofs linking the keys to their ID.

(2) Each device demonstrates possession of its corresponding private key.

(3) Both devices agree on a symmetric session key for efficient bulk encryption.

Our example protocol is detailed below and depicted in the message sequence diagram shown in Figure 7.

Fig. 7.

The handshake begins when device \(D_{1}\) receives a heartbeat message from device \(D_{2}\). In response, \(D_{1}\) sends its ID (\(ID_{D_{1}}\)), its public keys together with the proof of inclusion (\(PK_{D_{1}}^{+}\)), and a nonce challenge (\(NONCE_{D_{1}}\)), i.e., a unique number used once to prevent replay attacks, to \(D_{2}\). \(D_{1}\) additionally signs the message (\(SIG_{D_{1}}\)) to allow \(D_{2}\) to verify its integrity. Upon receiving this message, \(D_{2}\) verifies the signature and checks that the public keys are cryptographically linked to \(D_{1}\)’s ID using the provided proof.

Device \(D_{2}\) responds with its ID (\(ID_{D_{2}}\)), public keys, and proof (\(PK_{D_{2}}^{+}\)). It also generates a (symmetric) session key \(k\) and encapsulates this key in a Key Encapsulation Message (KEM) \(KEM_{PK_{D_{2}}}\) using \(D_{1}\)’s public encryption key. This ensures that only \(D_{1}\) can decrypt it to retrieve \(k\). \(D_{1}\)’s challenge (\(NONCE_{D_{1}}\)) is encrypted with \(D_{2}\)’s private key and sent back, allowing \(D_{1}\) to confirm that \(D_{2}\) is in possession of its corresponding private key. Likewise, \(D_{2}\) sends its own nonce challenge (\(NONCE_{D_{2}}\)) to \(D_{1}\), and signs the message (\(SIG_{D_{2}}\)) before sending it.

Next, \(D_{1}\) similarly verifies \(D_{2}\)’s signature and ID, and decrypts the nonce (\(NONCE_{D_{1}}\)) to confirm that \(D_{2}\) has the private key to match its public one. With its own private key, \(D_{1}\) decrypts the KEM to retrieve the session key \(k\). Lastly, \(D_{1}\) encrypts \(D_{2}\)’s challenge (\(NONCE_{D_{2}}\)) with the session key, to demonstrate its possession of it, and thus also its possession of its private key needed to retrieve it.

At the end of the protocol execution, both devices have identified themselves and demonstrated ownership of their ID, and now have a shared session key for further encrypted communication.
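The proof-of-possession step of the handshake can be sketched as a nonce challenge that is answered with a signature. This sketch deliberately swaps in ECDSA from the JDK as a stand-in for Dilithium, so that it runs without Bouncy Castle, and it models the response to \(NONCE_{D_{1}}\) as a signature over the nonce, which is an assumption about how the private-key operation is realized. The class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

class NonceChallenge {
    // Stand-in algorithm (assumption): ECDSA from the JDK instead of Dilithium.
    static final String SIG_ALG = "SHA256withECDSA";

    public static KeyPair newKeyPair() {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
            kpg.initialize(256);
            return kpg.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Responder side: answer the challenge using the private key.
    public static byte[] respond(PrivateKey sk, byte[] nonce) {
        try {
            Signature s = Signature.getInstance(SIG_ALG);
            s.initSign(sk);
            s.update(nonce);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Challenger side: accept only a response made for this exact nonce
    // by the holder of the private key matching pk.
    public static boolean check(PublicKey pk, byte[] nonce, byte[] response) {
        try {
            Signature s = Signature.getInstance(SIG_ALG);
            s.initVerify(pk);
            s.update(nonce);
            return s.verify(response);
        } catch (GeneralSecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        KeyPair d2 = newKeyPair();
        byte[] nonce = new byte[4]; // nonce size as listed in Table 5
        new SecureRandom().nextBytes(nonce);

        byte[] response = respond(d2.getPrivate(), nonce);
        System.out.println(check(d2.getPublic(), nonce, response)); // prints true

        byte[] otherNonce = nonce.clone();
        otherNonce[0] ^= 1; // a replayed response does not match a fresh nonce
        System.out.println(check(d2.getPublic(), otherNonce, response)); // prints false
    }
}
```

In the full protocol the public key passed to `check` would first be validated against the device ID with the proof of inclusion.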

Now, let us consider a scenario where an attacker \(D_{t}\) tries to connect to a device \(D_{1}\), pretending to be a legitimate device. To do so, \(D_{t}\) follows the above handshake protocol and gives its ID to device \(D_{1}\). As we explained in Section 5.2, the device IDs are signed by the system administrator in the initial registration phase. Therefore, only authenticated devices have their IDs signed and stored. Moreover, still in the initial device registration phase, the system administrator stores its public key in each registered device. Thus, in the above attack scenario, device \(D_{1}\) can use the administrator’s public key to verify that \(D_{t}\) is not a registered device, and safely stop the communication with this malicious device. Moreover, the use of nonce challenges prevents \(D_{t}\) from performing a replay attack, i.e., sending the same messages it received from \(D_{1}\) to another benign device in the system and pretending to be \(D_{1}\). Therefore, the proposed handshake protocol is secure and prevents establishing a connection between an attacker’s device and a registered one.

Since the handshake procedure is integral to establishing secure connections and may be initiated multiple times throughout a device’s operational period, we evaluate its performance under steady-state conditions. We measured four key durations \(T_{1},T_{2},T_{3}\), and \(T_{4}\), as illustrated in Figure 7. Each such duration comprises the time a device takes to process a received message and send a response within the handshake sequence.

To ensure the accuracy of our results, we repeated the experiment on all 50 devices created during the initialization experiment, organized into 25 pairs. For each device within a pair, we regenerated the signing and encryption keys for a predefined key index and executed the handshake protocol 100 times. To mitigate the JVM warm-up effect, we excluded the first 10 iterations from our analysis. This procedure was repeated with a new JVM instance for each of the 25 device pairs, yielding a total of 2,250 measurements for each tree height.
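The warm-up exclusion described above can be sketched as a small timing harness. This is an illustration of the measurement idea only: the class `SteadyStateTimer`, its `measure` method and the dummy workload are hypothetical and do not come from the artifact.

```java
import java.util.Arrays;

class SteadyStateTimer {
    // Runs the task `iterations` times and returns the per-run durations in
    // nanoseconds, with the first `warmup` runs discarded.
    public static long[] measure(Runnable task, int iterations, int warmup) {
        long[] all = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            all[i] = System.nanoTime() - start;
        }
        return Arrays.copyOfRange(all, warmup, iterations);
    }

    public static void main(String[] args) {
        // Dummy workload standing in for one handshake execution.
        Runnable handshake = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) sb.append(i);
        };

        long[] kept = measure(handshake, 100, 10);
        System.out.println(kept.length);      // prints 90
        System.out.println(25 * kept.length); // prints 2250, as in the text

        double meanMs = Arrays.stream(kept).average().orElse(0) / 1e6;
        System.out.println("mean (ms): " + meanMs);
    }
}
```

Keeping 90 of 100 runs for each of the 25 pairs gives the 2,250 measurements per tree height reported above.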

The results are detailed in Table 3, with columns 2–5 presenting the average times for the four individual deltas, and the final column showing the total handshake duration alongside a 95% confidence interval.

Table 3.

Height	\(T_{1}\) (ms)	\(T_{2}\) (ms)	\(T_{3}\) (ms)	\(T_{4}\) (ms)	Total (ms)
7	19.36	35.29	15.99	1.06	71.70 \(\pm\) 29.36
8	19.45	35.06	16.06	1.06	71.62 \(\pm\) 30.37
9	19.66	35.82	16.27	1.05	72.81 \(\pm\) 30.15
10	19.84	36.51	16.46	1.05	73.86 \(\pm\) 31.21
11	19.79	36.23	16.50	1.04	73.57 \(\pm\) 29.25
12	19.92	36.42	16.31	1.03	73.68 \(\pm\) 30.13
13	20.02	35.95	15.94	0.98	72.90 \(\pm\) 30.56

Table 3. Measured Durations of the Four Deltas of the Handshake Protocol as Indicated by the Message Sequence Diagram in Figure 7

The analysis reveals that the tree height has negligible impact on the handshake execution time. Despite the confidence interval’s fairly wide range, the overall handshake duration remains reasonably small, with the worst case scenario being approximately 100 ms.

In addition to the time deltas, we evaluated the size of the messages exchanged during the handshake, the result of which is presented in Table 4. These messages, sent between each time delta, correspond to the transitions indicated by the column headers. Specifically, the column labeled \((T_{1})\rightarrow(T_{2})\) shows the size (in bytes) of messages sent following \(T_{1}\), which triggers \(T_{2}\).

Table 4.

Height	\((T_{1})\rightarrow(T_{2})\)	\((T_{2})\rightarrow(T_{3})\)	\((T_{3})\rightarrow(T_{4})\)	Total (bytes)
7	7,696	9,345	61	17,102
8	7,778	9,427	61	17,266
9	7,859	9,508	61	17,428
10	7,940	9,589	61	17,590
11	8,022	9,671	61	17,754
12	8,103	9,752	61	17,916
13	8,185	9,834	61	18,080

Table 4. Sizes (In Bytes) of Messages Sent during Handshake Protocol, by Tree Height

This analysis shows that although the message sizes for transitions \((T_{1})\rightarrow(T_{2})\) and \((T_{2})\rightarrow(T_{3})\) tend to grow with tree size, which is explained by the increased proof sizes, the increment is minor relative to the larger constant components of the message, namely the post-quantum asymmetric keys and the message signature. For detailed sizes of these message components, see Table 5.

Table 5.

Protocol parameter	Description	Size in bytes
\(ID\)	The device ID	64
\(PK^{+}\)	The public keys plus the proof	4,208
\(NONCE\)	A nonce	4
\(KEM\)	Key encapsulation mechanism	1,568
\(k\)	Symmetric key (including IV)	44
\(SIG\)	Digital signature of the message	3,293

Table 5. Description and Size of the Message Fields Used in the Handshake Protocol Depicted in Figure 7

Given the healthcare application scenario, it would be reasonable to require that the total data transfer time for the complete handshake is no more than one second, even if a low power network is used. From Table 4, we see that the total amount of data for all three messages is less than 20 Kbyte, i.e., 160 Kbit, regardless of tree height. A network bandwidth of about 200 Kbit/s would therefore suffice, which is compatible with what low power networks typically give. For example, for Zigbee, the bandwidth is 250 Kbit/s, and for Bluetooth Low Energy, it varies, but is typically 1 Mbit/s.
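As a worked check against Table 4, take the largest handshake (tree height 13, 18,080 bytes) and treat the nominal bandwidth as fully available, which is an idealizing assumption since real links lose some capacity to protocol overhead:

\[
18{,}080 \times 8 = 144{,}640 \text{ bits}, \qquad
\frac{144{,}640}{200{,}000 \text{ bit/s}} \approx 0.72 \text{ s}, \qquad
\frac{144{,}640}{250{,}000 \text{ bit/s}} \approx 0.58 \text{ s}.
\]

Both figures stay below the one-second target.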

7.4 Implementation of Device Re-Discovery with Key Rotation

In our implementation, we optimize key rotation performance by storing the hashes of the leaves on disk. To perform a key rotation, we retrieve all of these hashes, and construct a Merkle tree. Then, given the index of the next key to rotate to, we can generate a proof of inclusion by traversing the tree. With the initial seed and the key index we also recreate the set of asymmetric keys.
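The proof generation step can be sketched as follows, assuming the cached leaf hashes are already loaded into memory, SHA3-512 as the hash, and a complete binary tree. The names `ProofBuilder`, `prove` and `rootFrom` are hypothetical, and the on-disk format used by the implementation is not modelled.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

class ProofBuilder {
    // SHA3-512 over the concatenation of the given parts.
    public static byte[] sha3(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA3-512");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA3-512 requires Java 9 or later", e);
        }
    }

    // Sibling hashes on the path from leaf `index` to the root, leaf level
    // first. The number of leaf hashes must be a power of two.
    public static List<byte[]> prove(byte[][] leafHashes, int index) {
        List<byte[]> proof = new ArrayList<>();
        byte[][] level = leafHashes;
        while (level.length > 1) {
            proof.add(level[index ^ 1]); // sibling of the current node
            byte[][] next = new byte[level.length / 2][];
            for (int i = 0; i < next.length; i++) {
                next[i] = sha3(level[2 * i], level[2 * i + 1]);
            }
            level = next;
            index /= 2;
        }
        return proof;
    }

    // Root implied by a leaf hash, its index and its proof.
    public static byte[] rootFrom(byte[] leafHash, int index, List<byte[]> proof) {
        byte[] node = leafHash;
        for (byte[] sibling : proof) {
            node = (index % 2 == 0) ? sha3(node, sibling) : sha3(sibling, node);
            index /= 2;
        }
        return node;
    }

    public static void main(String[] args) {
        // Eight placeholder leaf hashes (tree height 3).
        byte[][] leaves = new byte[8][];
        for (int i = 0; i < leaves.length; i++) leaves[i] = sha3(new byte[] {(byte) i});

        List<byte[]> proof = prove(leaves, 5);
        System.out.println(proof.size()); // prints 3, one hash per tree level

        byte[] rootA = rootFrom(leaves[5], 5, proof);
        byte[] rootB = rootFrom(leaves[0], 0, prove(leaves, 0));
        System.out.println(MessageDigest.isEqual(rootA, rootB)); // prints true
    }
}
```

Rebuilding every level from the cached leaves is linear in the number of leaves, which matches the growth of the Tree column with tree height in Table 6.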

Our evaluation of key rotation efficiency is split into two parts. First, the focus is on constructing the tree and generating the proof of inclusion. We timed both of these steps, and included the process of reading the leaves from disk. Including this additional load time in the measurements is justified by the infrequent need of having the tree in memory—typically this is only required during key rotations. Second, we timed the recreation of the keys using the seed and the new key index.

As before, we reused the devices from the device initialization evaluation in Section 7.2, i.e., 50 devices per tree height. For every device, 10 key indices were randomly selected and recreated, resulting in 500 measurements per tree height. Each measurement was performed in a fresh JVM instance, so as to measure startup performance. This is again motivated by the extended time between key rotations, which implies that the JVM may not have fully optimized the key rotation code.

The measurement results are detailed in Table 6, showing the average time required to read leaves from disk and create a proof (Tree column), and the average time required to generate the signature and key encapsulation keys from the seed and index (Key column). The Total column is the sum of these averages, presenting the mean total alongside a 95% confidence interval.

Table 6.

Height	Tree (ms)	Key (ms)	Total (ms)
7	174.72	326.56	501.28 \(\pm\) 19.58
8	263.99	342.33	606.32 \(\pm\) 20.43
9	374.52	328.94	703.46 \(\pm\) 69.73
10	487.16	331.27	818.43 \(\pm\) 57.86
11	620.44	327.47	947.91 \(\pm\) 72.44
12	948.29	338.60	1,286.89 \(\pm\) 72.47
13	1,537.39	339.69	1,877.08 \(\pm\) 115.31

Table 6. Times for Performing Key Rotation, by Tree Height

This analysis indicates that the duration for creating a new proof depends on the size of the tree, while key recreation remains largely constant. Proof creation time therefore dominates for large trees. Nonetheless, given the infrequency of such operations, we consider the associated times to be manageable.

7.5 Real-World Deployment Challenges

Our implementation demonstrates the viability of the proposed approach, solving the identified technical challenges. To deploy the solution in a real-world healthcare application, such as the itACiH Hospital-at-Home system, there are several additional technical and non-technical challenges to address. One challenge is to get the system approved by the IT support organizations at hospitals. This is a process that can take quite some time and effort due to the complexity of the area. Using post-quantum cryptographic algorithms is not yet mainstream, but it is important not only for long-term usage of devices, but also because malicious actors are reported to already be collecting traditionally encrypted data in order to break it in the future with quantum computers.

Another challenge is to ensure that patients are able to trust the system with their data. Some data can be very sensitive, such as photographs of a child’s body. Research shows that to accept using an e-health system, it is very important for patients to know what happens to the information entered into it [19]. One suggestion to support this has been to specifically inform patients about security when introducing them to a new e-health system. For example, they can be informed and reassured that all communication is end-to-end encrypted between the device and the database [11].

Another security-related challenge is to ensure that the system application code can be updated in an agile way, and that updates can be done securely. This is important both for patient satisfaction, by improving the function of the application, and for security reasons, to be able to patch newly found vulnerabilities in the code. Such updates are challenging for IoT systems, as they require automatic remote management of the distributed IoT devices [10]. For PalCom, we have recently developed such an automatic update method, supporting minimal downtime for end users [31].

Apart from challenges directly related to security, there are general challenges in developing and introducing new IT systems for healthcare, in particular ensuring that the system helps not only patients but also the medical personnel, saving them time and effort. This can be accomplished by using participatory design [8, 11] and implies highly iterative development with users in the loop, accentuating the need for simple and secure mechanisms for system update.

8 Conclusions

To support secure Hospital-at-Home systems, we have proposed a security architecture based on trust and key management for devices, implemented at the middleware level. We took our starting point in a Hospital-at-Home scenario of a system currently in use at Swedish hospitals, identifying challenges in this scenario and formulating technical requirements for future similar systems. A particularly important requirement that calls for new solutions at the middleware level is the need for devices being able to operate and communicate securely for years, without physical access from administrator users.

Our proposed solution is based on using Merkle trees for trust and key management, and post-quantum encryption primitives. We have evaluated the security properties of this solution under an explicit threat model, showing that the technical requirements are fulfilled. We have also constructed an example implementation in Java for Raspberry Pi embedded computers and evaluated its performance for different heights of the Merkle tree. In this evaluation we consider the situations of practical importance for real-world deployment: device initialization, discovery of a new device, and device re-discovery with key rotation.

Our evaluation demonstrates the practical feasibility of the approach. For example, with a tree height of 10, device initialization takes around 10 s (startup Java performance), ensuring a lifespan of roughly 85 years with monthly key rotations. The discovery computation then takes at most 0.1 s (steady-state performance, and regardless of tree height) and key rotation takes less than 1 s (startup performance).

Footnotes

Cryptographic probabilities of failure are configurable, but they can typically be, say, \(2^{-128}\), \(2^{-256}\), or \(2^{-512}\).

https://bitbucket.org/palcom/palcom-middleware

Most PalCom applications additionally make use of scripted assemblies [15] or compositions [2], i.e., special kinds of services that mediate messages between ordinary services, and that handle the setup of connections. The distinction between these and ordinary services is, however, not of any importance to this article.

⁴

Key Generation Algorithm (KGA): RSA; key size: 2,048 bits.

⁵

Certificate signature algorithm: SHA256withRSA.

References

[1]

Mohammad Asad Abbasi, Zulfiqar A. Memon, Nouman M. Durrani, Waleej Haider, Kashif Laeeq, and Ghulam Ali Mallah. 2021. A multi-layer trust-based middleware framework for handling interoperability issues in heterogeneous IoTs. Cluster Computing 24 (2021), 2133–2160.
