Yiping Ma⋆ Jess Woods⋆ Sebastian Angel⋆† Antigoni Polychroniadou‡ Tal Rabin⋆
⋆University of Pennsylvania †Microsoft Research ‡J.P. Morgan AI Research & AlgoCRYPT CoE

arXiv:2308.09883v3 [cs.CR] 18 May 2024
Abstract—This paper introduces Flamingo, a system for secure aggregation of data across a large set of clients. In secure aggregation, a server sums up the private inputs of clients and obtains the result without learning anything about the individual inputs beyond what is implied by the final sum. Flamingo focuses on the multi-round setting found in federated learning in which many consecutive summations (averages) of model weights are performed to derive a good model. Previous protocols, such as Bell et al. (CCS ’20), have been designed for a single round and are adapted to the federated learning setting by repeating the protocol multiple times. Flamingo eliminates the need for the per-round setup of previous protocols, and has a new lightweight dropout-resilience protocol to ensure that if clients leave in the middle of a sum the server can still obtain a meaningful result. Furthermore, Flamingo introduces a new way to locally choose the so-called client neighborhood introduced by Bell et al. These techniques help Flamingo reduce the number of interactions between clients and the server, resulting in a significant reduction in the end-to-end runtime for a full training session over prior work.

We implement and evaluate Flamingo and show that it can securely train a neural network on the (Extended) MNIST and CIFAR-100 datasets, and the model converges without a loss in accuracy, compared to a non-private federated learning system.

1. Introduction

In federated learning, a server wants to train a model using data owned by many clients (e.g., millions of mobile devices). In each round of the training, the server randomly selects a subset of clients and sends them the current model’s weights. Each selected client updates the model’s weights by running a prescribed training algorithm on its data locally, and then sends the updated weights to the server. The server updates the model by averaging the collected weights. The training takes multiple such rounds until the model converges.

This distributed training pattern is introduced with the goal of providing a critical privacy guarantee in training—the raw data never leaves the clients’ devices. However, prior works [55,77] show that the individual weights still leak information about the raw data, which highlights the need for a mechanism that can securely aggregate the weights computed by client devices [52,74]. This is precisely an instance of secure aggregation.

Many protocols and systems for secure aggregation have been proposed, e.g., in the scenarios of private error reporting and statistics collection [5,12,20,24,63,75]. However, secure aggregation in federated learning, due to its specific model, faces unprecedented challenges: a large number of clients, high-dimensional input vectors (e.g., model weights), multiple rounds of aggregation prior to model convergence, and unstable devices (i.e., some devices might go offline prior to completing the protocol). It is therefore difficult to directly apply these protocols in a black-box way and still get good guarantees and performance.

Recent works [9,11,66] propose secure aggregation tailored to federated learning scenarios. In particular, a state-of-the-art protocol [9] (which we call BBGLR) can handle one aggregation with thousands of clients and high-dimensional input vectors, while tolerating devices dropping out at any point during the execution. The drawback of these protocols is that they only examine one round of aggregation in the full training process, i.e., a selection of a subset of the clients and a sum over their inputs.

Utilizing the BBGLR protocol (or its variants) multiple times to do summations in the full training of a model incurs high costs. Specifically, these protocols follow the pattern of having each client establish input-independent secrets with several other clients in a setup phase, and then computing a single sum in a collection phase using the secrets. These secrets cannot be reused for privacy reasons. Consequently, for each round of aggregation in the training process, one needs to perform an expensive fresh setup. Furthermore, in each step (client-server round trip) of the setup and the collection phases, the server has to interact with all of the clients. In the setting of federated learning, such interactions are especially costly as clients may be geographically distributed and may have limited compute and battery power or varying network conditions.

In this paper we propose Flamingo, the first single-server secure aggregation system that works well for multiple rounds of aggregation and that can support full sessions of training in the stringent setting of federated learning. At a high level, Flamingo introduces a one-time setup and a collection procedure for computing multiple sums, such that the secrets established in the setup can be reused throughout the collection procedure. For each summation in the collection procedure, clients mask their input values (e.g., updated weight vectors in federated learning) with a random mask derived from those secrets, and then send the masked inputs
to the server. The server sums up all the masked inputs and obtains the final aggregate value (what the server wants) masked with the sum of all masks. Flamingo then uses a lightweight protocol whereby a small number of randomly chosen clients (which we call decryptors) interact with the server to remove the masks.

The design of Flamingo significantly reduces the overall training time of a federated learning model compared to running BBGLR multiple times. First, Flamingo eliminates the need for a setup phase for each summation, which reduces the number of steps in the full training session. Second, for each summation, Flamingo has only one step that requires the server to contact all of the clients (asking for their inputs); the rest of the interactions are performed between the server and a few clients who serve as decryptors.

Besides training efficiency, the fact that a client needs to speak only once in a round reduces the model’s bias towards results that contain only data from powerful, stably connected devices. In Flamingo, the server contacts the clients once to collect inputs; in contrast, prior works have multiple client-server interaction steps in the setup phase (before input collection) and thus filter out weak devices for summation, as staying available longer is challenging. Seen from a different angle, if we fix the number of clients, the failure probability of a client in a given step, and the network conditions, Flamingo’s results are of higher quality, as they are computed over more inputs than in prior works.

In summary, Flamingo’s technical innovations are:
• Lightweight dropout resilience. A new mechanism to achieve dropout resilience in which the server only contacts a small number of clients to remove the masks. All other clients are free to leave after the one step in which they submit their inputs, without harming the results.
• Reusable secrets. A new way to generate masks that allows the reuse of secrets across rounds of aggregation.
• Per-round graphs. A new graph generation procedure that allows clients in Flamingo to unilaterally determine (without the help of the server as in prior work) which pairwise masks they should use in any given round.

These advancements translate into significant performance improvements (§8.3). For a 10-round pure summation task, Flamingo is 3× faster than BBGLR (this includes Flamingo’s one-time setup cost), and includes the contributions of more clients in its result. When training a neural network on the Extended MNIST dataset, Flamingo takes about 40 minutes to converge while BBGLR needs roughly 3.5 hours to reach the same training accuracy.

2. Problem Statement

Secure aggregation is useful in a variety of domains: collecting, in a privacy-preserving way, error reports [12,24], usage statistics [20,63], and ad conversions [5,75]; it has even been shown to be a key building block for computing private auctions [76]. But one key application is secure federated learning, whereby a server wishes to train a model on data that belongs to many clients, but the clients do not wish to share their data (or other intermediate information, such as weights, that might leak their data [77]). To accomplish this, each client receives the original model from the server and computes new private weights based on its own data. The server and the clients then engage in a secure aggregation protocol that helps the server obtain the sum of the clients’ private weights, without learning anything about individual weights beyond what is implied by their sum. The server then normalizes the sum to obtain the average weights, which represent the new state of the model. The server repeats this process until the training converges.

This process is formalized as follows. Let [z] denote the set of integers {1, 2, . . . , z}, and let ⃗x denote a vector; all operations on vectors are component-wise. A total number of N clients is fixed before the training starts. Each client is indexed by a number in [N]. The training process consists of T rounds. In each round t ∈ [T], a set of clients is randomly sampled from the N clients, denoted as S_t. Each client i ∈ S_t has an input vector ⃗x_{i,t} for round t. (A client may be selected in multiple rounds and may have different inputs in each round.) In each round, the server wants to securely compute the sum of the |S_t| input vectors, Σ_{i∈S_t} ⃗x_{i,t}.

In practical deployments of federated learning, a complete sum is hard to guarantee, as some clients may drop out in the middle of the aggregation process and the server must continue the protocol without waiting for them to come back (otherwise the server might be blocked for an unacceptable amount of time). So the real goal is to compute the sum of the input vectors from the largest possible subset of S_t; we elaborate on this in the next few sections.

2.1. Target deployment scenario

Based on a recent survey of federated learning deployments [43], common parameters are as follows. N is in the range of 100K–10M clients, of which |S_t| = 50–5,000 clients are chosen to participate in a given round t. The total number of rounds T for a full training session is 500–10,000. Input weights (⃗x_{i,t}) typically have on the order of 1K–500K entries for the datasets we surveyed [16,17,46,47].

Clients in these systems are heterogeneous devices with varying degrees of reliability (e.g., cellphones, servers) and can stop responding due to device or network failure.

2.2. Communication model

Each client communicates with the server through a private and authenticated channel. Messages sent from clients to other clients are forwarded via the server, and are end-to-end encrypted and authenticated.

2.3. Failure and threat model

We model failures of two kinds: (1) honest clients that disconnect or are too slow to respond as a result of unstable network conditions, power loss, etc.; and (2) arbitrary actions by an adversary that controls the server and a bounded fraction of the clients. We describe each of these below.
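To make the formalization in Section 2 concrete, here is a minimal sketch (our own illustration; `round_sum` and all other names are hypothetical, and real inputs would be high-dimensional weight vectors) of the per-round target computation: the server should obtain only the component-wise sum over whichever sampled clients in S_t remain online.

```python
import random

def round_sum(client_inputs, sample_size, seed, online):
    """Sketch of one aggregation round, in the notation of Section 2.

    client_inputs: maps a client index in [N] to its input vector x_{i,t}.
    online: the clients that did not drop out in round t.
    Returns the component-wise sum over the sampled clients that stayed
    online (the largest achievable subset of S_t), plus a contributor count.
    """
    rng = random.Random(seed)                       # stand-in for sampling S_t
    S_t = rng.sample(sorted(client_inputs), sample_size)
    contributors = [i for i in S_t if i in online]  # dropouts are skipped
    dim = len(next(iter(client_inputs.values())))
    total = [0] * dim
    for i in contributors:
        total = [a + b for a, b in zip(total, client_inputs[i])]
    return total, len(contributors)

# Toy round: 6 clients with 3-entry vectors; client 4 drops out.
inputs = {i: [i, 2 * i, 3 * i] for i in range(6)}
s, k = round_sum(inputs, sample_size=4, seed=7, online={0, 1, 2, 3, 5})
```

The point of secure aggregation is that the server learns `s` (and how many clients contributed) but nothing else about any individual input vector; the protocols discussed next achieve this by masking.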
Dropouts. In each round of summation, the server interacts with the clients (or a subset of them) several times (several steps) to compute a sum. If the server contacts a set of clients during a step and some of the clients are unable to respond in a timely manner, the server has no choice but to keep going; the stragglers are dropped and do not participate in the rest of the steps for this summation round. In practice, the fraction of dropouts in the set depends on the client response time distribution and a server timeout (e.g., one second); longer timeouts mean a lower fraction of dropouts.

There are two types of clients in the system: regular clients, who provide their inputs, and decryptors, special clients whose job is to help the server recover the final result. We upper bound the fraction of regular clients that drop out in any given aggregation round by δ, and upper bound the fraction of decryptors that drop out in any given aggregation round by δ_D.

Adversary. We assume a static, malicious adversary that corrupts the server and up to an η fraction of the total N clients. That is, the adversary compromises Nη clients independently of the protocol execution, and the corrupted set stays the same throughout the entire execution (i.e., all rounds). Note that malicious clients can obviously choose to drop out during protocol execution, but to make our security definition and analysis clear, we consider the dropout of malicious clients separate from, and in addition to, the dropout of honest clients.

Similarly to our dropout model, we distinguish between the fraction of corrupted regular clients (η_{S_t}) and corrupted decryptors (η_D). Both depend on η but also on how Flamingo samples regular clients and decryptors from the set of all N clients. We defer the details to Appendix A, but briefly, η_{S_t} ≈ η; and given a (statistical) security parameter κ, η_D is upper bounded by κη except with probability 2^{−Θ(κ)}.

Threshold requirement. The minimum requirement for Flamingo to work is δ_D + η_D < 1/3. For a target security parameter κ, we show in Appendix C how to select the other parameters of Flamingo to satisfy this requirement with minimal asymptotic costs.

Comparison to prior works. BBGLR and other works [9,11] also assume a static, malicious adversary, but only for a single round of aggregation. In fact, in Section 3.3 we show that their protocol cannot be naturally extended to multiple aggregations in a way that withstands a malicious adversary throughout.

2.4. Properties

Flamingo is a secure aggregation system that achieves the following properties under the above threat and failure model. We give informal definitions here, and defer the formal definitions to Section 5.3.
• Dropout resilience: when all parties follow the protocol, the server, in each round t, will get a sum of inputs from all the online clients in S_t. Note that this implicitly assumes that the protocol both completes all the rounds and outputs meaningful results.
• Security: for each round t summing over the inputs of clients in S_t, a malicious adversary learns the sum of inputs from at least (1 − δ − η)|S_t| clients.

Besides the above, we introduce a new notion that quantifies the quality of a single sum.
• Sum accuracy: a round of summation has sum accuracy τ if the final sum result contains the contributions of a τ fraction of the clients who are selected to contribute to that round (i.e., τ|S_t| clients).

Input correctness. In the context of federated learning, if malicious clients input bogus weights, then the server could derive a bad model (it may even contain “backdoors” that cause the model to misclassify certain inputs [7]). Ensuring correctness against this type of attack is out of the scope of this work; to our knowledge, providing strong guarantees against malicious inputs remains an open problem. Some works [6,10,23,60–62] use zero-knowledge proofs to bound how much a client can bias the final result, but they are unable to formally prove the absence of all possible attacks.

Comparison to prior work. Flamingo provides a stronger security guarantee than BBGLR. In Flamingo, an adversary who controls the server and some clients learns a sum that contains inputs from at least a 1 − δ − η fraction of clients. In contrast, the malicious protocol in BBGLR leaks several sums: consider a partition of the |S_t| clients, where each partition set has size at least α · |S_t|; a malicious adversary in BBGLR learns the sum of each of the partition sets. Concretely, for 5K clients, when both δ and η are 0.2, α < 0.5. This means that the adversary learns the sums of two subsets. This follows from Definition 4.1 and Theorem 4.9 in BBGLR [9].

3. Background

In this section, we discuss BBGLR [9], which is the state-of-the-art protocol for single-round secure aggregation in the federated learning setting. We borrow some ideas from this protocol, but design Flamingo quite differently in order to support a full training session.

3.1. Cryptographic building blocks

We start by reviewing some standard cryptographic primitives used by BBGLR and Flamingo.

Pseudorandom generators. A PRG is a deterministic function that takes a random seed in {0,1}^λ and outputs a longer string that is computationally indistinguishable from a random string (λ is the computational security parameter). For simplicity, whenever we use a PRG with a seed that is not in {0,1}^λ, we assume that there is a deterministic function that maps the seed to {0,1}^λ and preserves security. Such a mapping is discussed in detail in Section 6.

Pseudorandom functions. A PRF: K × X → Y is a family of deterministic functions indexed by a key in K that map an input in X to an output in Y in such a way that the
indexed function is computationally indistinguishable from a truly random function from X to Y. We assume a similar deterministic map for inputs as described for the PRG above.

Shamir’s secret sharing. An ℓ-out-of-L secret sharing scheme consists of the following two algorithms, Share and Recon. Share(s, ℓ, L) → (s_1, . . . , s_L) takes in a secret s, a threshold ℓ, and the number of desired shares L, and outputs L shares s_1, . . . , s_L. Recon takes in at least ℓ + 1 of the shares and outputs the secret s; i.e., for a set U ⊆ [L] with |U| ≥ ℓ + 1, Recon({s_u}_{u∈U}) → s. Security requires that fewer than ℓ + 1 shares reveal no information about s.

Diffie-Hellman key exchange. Let G be a group of order q in which the Decisional Diffie-Hellman (DDH) problem is hard, and let g be a generator of G. Alice and Bob can safely establish a shared secret (assuming a passive adversary) as follows. Alice samples a secret a ←$ Z_q and sets her public value to g^a ∈ G. Bob samples his secret b ←$ Z_q and sets his public value to g^b ∈ G. Alice and Bob exchange the public values, and each raises the other party’s value to their own secret, i.e., g^{ab} = (g^a)^b = (g^b)^a. If DDH is hard, the shared secret g^{ab} is known only to Alice and Bob.

3.2. The BBGLR protocol

BBGLR is designed for computing a single sum over the inputs of a set of clients. To apply it to the federated learning setting, we can simply assume that in a given round of the training process, there are n clients selected from a large population of size N. We can then run BBGLR on these n clients to compute a sum of their inputs.

The high-level idea of BBGLR is for clients to derive pairwise random masks and to add those masks to their input vectors in such a way that when the masked input vectors of all clients are added, the masks cancel out. It consists of a setup phase and a collection phase. We first describe a semi-honest version below.

Setup phase. The setup phase consists of three steps: (1) create a database containing the public keys of all of the n clients; (2) create an undirected graph whose vertices are the clients and where each vertex has enough edges to satisfy certain properties; (3) have each client send shares of two secrets to its neighbors in the graph. We discuss these below.

In the first step, each client i ∈ [n] generates a secret a_i and sends g^{a_i} to the server, where g^{a_i} represents client i’s public key. The server then stores these public keys in a database. Note that the malicious-secure version of BBGLR requires the server to be semi-honest for this particular step, or the existence of a trusted public key infrastructure (PKI).

In the second step, the graph is established as follows. Each client i ∈ [n] randomly chooses γ other clients in [n] as its neighbors, and tells the server about its choices. After the server collects all the clients’ choices, it notifies each client of its neighbors’ indexes in [n] and their public keys. The neighbors of client i, denoted A(i), are those corresponding to vertices that share an edge with i (i.e., i chose them or they chose i).

Finally, each client i uses Shamir’s secret sharing to share a_i and an additional random value m_i with its neighbors A(i) (let the threshold be ℓ < |A(i)|), where the shares are end-to-end encrypted with a secure authenticated encryption scheme and sent via the server (§2.2).

Collection phase. Client i sends the following masked vector to the server:

  Vec_i = ⃗x_i + Σ_{j∈A(i), i<j} PRG(r_{i,j}) − Σ_{j∈A(i), i>j} PRG(r_{i,j}) + PRG(m_i),

where the first two sums form the pairwise mask, PRG(m_i) is the individual mask, and r_{i,j} = g^{a_i a_j}, which can be computed by client i since it has the secret a_i and j’s public key g^{a_j}. (These are essentially Diffie-Hellman key exchanges between a client and its neighbors.) Here we view the output of the PRG as a vector of integers instead of a binary string. Also, we will write the pairwise mask term as Σ_{j∈A(i)} ±PRG(r_{i,j}) for ease of notation.

As we mentioned earlier (§2), clients may drop out due to unstable network conditions, power loss, etc. This means that the server may not receive some of the masked vectors within an acceptable time period. Once the server times out, it labels the clients whose vectors have been received as “online”; the rest are labeled “offline”. The server shares this information with all n clients. The server then sums up all of the received vectors, which yields a vector that contains the masked sum. To recover the correct sum, the server needs a way to remove the masks. It does so by requesting, for each offline client i, the shares of a_i from i’s neighbors; and for each online client j, the shares of m_j from j’s neighbors. These shares allow the server to reconstruct either the pairwise mask or the individual mask for each client. As long as more than ℓ neighbors send the requested shares, the server can successfully remove the masks and obtain the sum. This gives BBGLR its dropout resilience property.

One might wonder why the individual mask m_i is needed, since the pairwise mask already masks the input. To see its necessity, assume that it is not added, i.e., Vec_i = ⃗x_i + Σ ±PRG(r_{i,j}). Suppose client i’s message is sent but not received on time. Thus, the server reconstructs i’s pairwise mask Σ ±PRG(r_{i,j}). Then, i’s vector Vec_i arrives at the server. The server can then subtract the pairwise mask from Vec_i to learn ⃗x_i. The individual mask m_i prevents this.

Preventing attacks in fault recovery. The above protocol only works in the semi-honest setting. There are two major attacks that a malicious adversary can perform. First, a malicious server can give inconsistent dropout information to honest clients and recover both the pairwise and individual masks. For example, suppose client i has neighbors j_1, . . . , j_γ, and a malicious server lies to the neighbors of j_1, . . . , j_γ that j_1, . . . , j_γ have dropped out (when they actually have not). In response, their neighbors, including i, will provide the server with the information it needs to reconstruct a_{j_1}, . . . , a_{j_γ}, thereby deriving all the pairwise secrets r_{i,j_1}, . . . , r_{i,j_γ}. At the same time, the server can tell j_1, . . . , j_γ that i was online and request the shares of m_i. This gives
the server both the pairwise mask and the individual mask of client i, violating i’s privacy. To prevent this, BBGLR has a consistency check step performed among all neighbors of each client to reach agreement on which nodes actually dropped out. In this case, i would have learned that none of its neighbors dropped out and would have refused to give the shares of their pairwise mask.

Second, malicious clients can submit a share that is different than the share that they received from their neighbors. This could lead to reconstruction failure at the server, or to the server deriving a completely different secret. BBGLR fixes the latter issue by having the clients hash their secrets and send these hashes to the server when they send their input vectors; however, reconstruction could still fail because of an insufficient threshold in error correction.¹

¹To apply a standard error correction algorithm such as Berlekamp-Welch in this setting, the polynomial degree should be at most γ/3. Definition 4.2 in BBGLR implies that the polynomial degree may be larger than required for error correction.

In sum, the full protocol of BBGLR that withstands a malicious adversary (assuming a PKI or a trusted server during setup) has six steps in total: three steps for the setup and three steps for computing the sum.

3.3. Using BBGLR for federated learning

BBGLR works well for one round of training, but when many rounds are required, several issues arise. First, in federated learning the set of clients chosen to participate in a round changes, so a new graph needs to be derived and new secrets must be shared. Even if the graph stays the same, the server cannot reuse the secrets from the setup in previous rounds, as the masks are in fact one-time pads that cannot be applied again. This means that we must run the setup phase for each round, which incurs high latency since the setup contains three steps involving all the clients.

Moreover, BBGLR’s threat model does not naturally extend to multi-round aggregation. It needs either a semi-honest server or a PKI during the first step of the protocol. If we assume the former, then the adversary has to be semi-honest during the exact time of setup in each round, which is practically impossible to guarantee. If we use a PKI, none of the keys can be reused (for the above reasons); as a result, all of the keys in the PKI need to be updated for each round, which is costly.

4. Efficient Multi-Round Secure Aggregation

Flamingo supports multi-round aggregation without redoing the setup for each round, and withstands a malicious adversary throughout. The assumptions required are: (1) in the setup, all parties are provided with the same random seed from a trusted source (e.g., a distributed randomness beacon [25]); and (2) a PKI (e.g., a key transparency log [21,41,49,53,68,70,71]). Below we describe the high-level ideas underpinning Flamingo (§4.1) and then we give the full protocol (§4.3 and §4.4).

4.1. High-level ideas

Flamingo has three key ideas:

(1) Lightweight dropout resilience. Instead of asking clients to secret-share a_i and m_i for their masks with all of their neighbors, we let each client encrypt—in a special way—the PRG seeds of their pairwise and individual masks, append the resulting ciphertexts to their masked input vectors, and submit them to the server in a single step. Then, with the help of a special set of L clients that we call decryptors, the server can decrypt one of the two seeds associated with each masked input vector, but not both. In effect, this achieves a similar fault-tolerance property as BBGLR (§3.2), but with a different mechanism.

The crucial building block enabling this new protocol is threshold decryption [28,32,64], in which clients can encrypt data with a system-wide known public key PK, in such a way that the resulting ciphertexts can only be decrypted with a secret key SK that is secret-shared among the decryptors in Flamingo. Not only does this mechanism hide the full secret key from every party in the system, but the decryptors can decrypt a ciphertext without ever having to interact with each other. Specifically, the server in Flamingo sends the ciphertext (peeled from the submitted vector) to each of the decryptors, obtains back some threshold ℓ out of L responses, and locally combines the ℓ responses, which produces the corresponding plaintext. Our instantiation of threshold decryption is based on the ElGamal cryptosystem and Shamir’s secret sharing; we describe it in Section 6. Technically, one could also instantiate threshold decryption with other protocols, but we choose the ElGamal cryptosystem because it enables efficient proofs of decryption (for malicious security, §4.3) and simple distributed key generation.

One key technical challenge that we had to overcome when designing this protocol is figuring out how to secret-share the key SK among the decryptors. To our knowledge, existing efficient distributed key generation (DKG) protocols [18,33,57] assume a broadcast channel or reliable point-to-point channels, whereas our communication model is that of a star topology where all messages are proxied by a potential adversary (controlling the server) that can drop them. There are also asynchronous DKG protocols [3,26,44], but the standard asynchronous communication model assumes eventual delivery of messages, which is not the case in our setting. In fact, we can relax the guarantees of DKG, and Section 4.3 gives an extension of a discrete-log-based DKG protocol [33] (since we target ElGamal threshold decryption) that works in the star-topology communication model.

In sum, the above approach gives a dropout-resilient protocol for a single summation with two steps: first, each client sends its masked vector and the ciphertexts of the PRG seeds; second, the server uses distributed decryption to recover the seeds (and the masks) for dropout clients (we discuss how to ensure that decryptors agree on which of the two seeds to decrypt in §4.4). This design improves the run time over BBGLR by eliminating the need to involve all the clients to remove the masks—the server only needs to wait until it has collected enough shares from the decryptors,
instead of waiting for almost all the shares to arrive. Furthermore, the communication overhead of appending several small ciphertexts (64 bytes each) to a large input vector (hundreds of KBs) is minimal.

(2) Reusable secrets. Flamingo’s objective is to get rid of the setup phase for each round of aggregation. Before we discuss Flamingo’s approach, consider what would happen if we were to naively run the setup phase of BBGLR once, followed by running the collection procedure multiple times. First, we are immediately limited to performing all of the aggregation tasks on the same set of clients, since BBGLR establishes the graph of neighbors during the setup phase. This is problematic since federated learning often chooses different sets of clients for each round of aggregation (§2). Second, clients’ inputs are exposed. To see why, suppose that client i drops out in round 1 but not in round 2. In round 1, the server reconstructs r_{i,j} for j ∈ A(i) to unmask the sum. In round 2, client i sends ⃗x_i + Σ_{j∈A(i)} ±PRG(r_{i,j}) + PRG(m_i) and the server reconstructs m_i by asking i’s neighbors for their shares. Since all the r_{i,j} are reconstructed in round 1 and reused in round 2, the server can derive both masks.

The above example shows that the seeds should be new and independent in each round. We accomplish this with a simple solution that adds a level of indirection. Flamingo treats r_{i,j} as a long-term secret and lets the clients apply a PRF to generate a new seed for each pairwise mask. Specifically, client i computes the PRG seed for the pairwise mask in round t as h_{i,j,t} := PRF(r_{i,j}, t) for all j ∈ A(i). Note that client j will compute the same h_{i,j,t}, as it agrees with i on r_{i,j}. In addition, each client also generates a fresh seed m_{i,t} for the individual mask in round t. Consequently, combined with idea (1), each client uses PK to encrypt the per-round seeds, {h_{i,j,t}}_{j∈A(i)} and m_{i,t}. Then, the server recovers one of the two for each client. We later describe an optimization where clients do not encrypt m_{i,t} with PK (§4.4).

A nice property of r_{i,j} being a long-term secret is that Flamingo can avoid performing all the Diffie-Hellman key exchanges between graph neighbors (proxied through the server). Flamingo relies instead on an external PKI or a verifiable public key directory such as CONIKS [53] and its successors (which are a common building block for bootstrapping end-to-end encrypted systems).

We note that this simple technique cannot be applied

other client). In federated learning, however, establishing sparse graphs requires per-round generation, since the set S_t changes in each round (some clients are chosen again, some are new [48]). A naive way to address this is to let all clients in [N] establish a big graph G with N nodes in the setup phase: each client in [N] sends its choice of γ neighbors to the server, and the server sends each client the corresponding neighbors. Then, in each round t, the corresponding subgraph G_t consists of the clients in S_t and the edges among the clients in S_t.

However, this solution is unsatisfactory. If one uses a small γ (e.g., log N), G_t might not be connected and might even have isolated nodes (leaking a client’s input vector since it has no pairwise masks); if one uses a large γ (e.g., the extreme case being N), G_t will not be sparse and the communication cost for the server will be high (e.g., O(N²)).

Flamingo introduces a new approach for establishing the graph with a communication cost independent of γ. The graph for each round t is generated from a random string v ∈ {0,1}^λ known to all participants (obtained from a randomness beacon or a trusted setup). Figure 1 lists the

Parameters: ϵ. // the probability that an edge is added

function ChooseSet(v, t, n_t, N):
  S_t ← ∅
  v*_t := PRF(v, t)
  while |S_t| < n_t do
    parse log N bits from PRG(v*_t) as i; add i to S_t
  output S_t

function GenGraph(v, t, S_t):
  G_t ← n_t × n_t empty matrix; ρ ← log(1/ϵ)
  for i ∈ S_t, j ∈ S_t do
    let v′ be the first ρ bits of PRF(v, (i, j))
    if v′ = 0^ρ then set G_t(i, j) := 1
  output G_t

function FindNeighbors(v, S_t, i):
  A_t(i) ← ∅; ρ ← log(1/ϵ)
  for j ∈ S_t do
    let v′ be the first ρ bits of PRF(v, (i, j))
    if v′ = 0^ρ then add j to A_t(i)
  for j ∈ S_t do
    let v′ be the first ρ bits of PRF(v, (j, i))
    if v′ = 0^ρ then add j to A_t(i)
  output A_t(i)

Figure 1: Pseudocode for generating graph G_t in round t.
to BBGLR to obtain a secure multi-round protocol. It is procedure. C HOOSE S ET(v, t, nt , N ) determines the set of
possible in Flamingo precisely because clients encrypt their clients involved in round t, namely St ⊆ [N ] with size nt .
per-round seeds for pairwise masks directly so the server The server computes Gt ← G EN G RAPH(v, t, St ) as the graph
never needs to compute these seeds from the long-term pair- in round t among clients in St . A client i ∈ St can find its
wise secrets. In contrast, in BBGLR, clients derive pairwise neighbors in Gt without materializing the whole graph using
secrets (gai ,aj ) during the setup phase. When client i drops F IND N EIGHBORS(v, St , i). In this way, the neighbors of i
out, the server collects enough shares to reconstruct ai and can be locally generated. We choose a proper ϵ such that in
compute the pairwise secrets, gai ,aj , for all online neighbors each round, the graph is connected with high probability
j of client i. Even if we use a PRF here, the server already (details in §5). We note that this technique might be of
has the pairwise secret; so it can run the PRF for any round independent interest.
and conduct the attacks described earlier. The above ideas taken together eliminate the need for
(3) Per-round graphs. BBGLR uses a sparse graph instead per-round setup, which improves the overall run time of
of a fully-connected graph for efficiency reasons (otherwise multi-round aggregation over BBGLR. Figure 2 depicts the
each client would need to secret share its seeds with every overall protocol, and the next sections describe each part.
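The neighbor computation in Figure 1 can be sketched in Python (an illustrative sketch that uses HMAC-SHA256 as a stand-in for the PRF; the paper's implementation uses AES-based primitives):

```python
import hashlib
import hmac

def prf_first_bits(v: bytes, i: int, j: int, rho: int) -> int:
    # First rho bits of PRF(v, (i, j)); HMAC-SHA256 stands in for the PRF.
    digest = hmac.new(v, f"{i},{j}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") >> (256 - rho)

def find_neighbors(v: bytes, S_t: list, i: int, rho: int) -> set:
    # j is a neighbor of i if either PRF(v,(i,j)) or PRF(v,(j,i)) starts
    # with rho zero bits, so the edge set is symmetric and each edge
    # appears with probability roughly 2*eps for eps = 2^(-rho).
    neighbors = set()
    for j in S_t:
        if j != i and (prf_first_bits(v, i, j, rho) == 0
                       or prf_first_bits(v, j, i, rho) == 0):
            neighbors.add(j)
    return neighbors
```

Because every party evaluates the same PRF on the public string v, the server's GenGraph and a client's FindNeighbors agree on the edge set without any communication.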
[Figure 2: workflow diagram showing the server, the decryptors, and the clients reporting vectors across the setup phase and repeated rounds of report, cross-check, and reconstruction steps.]

Figure 2: Workflow of Flamingo. The server first does a setup for all clients in the system. In each round t of training, the server securely aggregates the masked input vectors in the report step; in the cross-check and reconstruction steps, the server communicates with a small set of randomly chosen clients who serve as decryptors. The decryptors are chosen independently from the set St that provides inputs in a given round. Every R rounds, the decryptors switch and the old decryptors transfer shares of SK to new decryptors.
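The share handoff mentioned in the caption can be illustrated with a toy resharing step over a prime field (a sketch of the standard proactive secret sharing idea the paper builds on; the field size, function names, and parameters are arbitrary choices for illustration):

```python
import random

Q = 2**61 - 1  # prime field for the toy Shamir shares (arbitrary choice)

def share(secret, ell, L):
    # Degree-ell Shamir sharing among parties 1..L; ell+1 shares reconstruct.
    coeffs = [secret] + [random.randrange(Q) for _ in range(ell)]
    return {u: sum(c * pow(u, k, Q) for k, c in enumerate(coeffs)) % Q
            for u in range(1, L + 1)}

def reconstruct(shares):
    # Lagrange interpolation at 0 from at least ell+1 shares {u: f(u)}.
    total = 0
    for u, su in shares.items():
        beta = 1
        for v in shares:
            if v != u:
                beta = beta * (-v) * pow(u - v, -1, Q) % Q
        total = (total + beta * su) % Q
    return total

def reshare(old_shares, ell, L_new):
    # Each old decryptor deals sub-shares of its own share to the new
    # committee; party w's new share is the Lagrange combination of the
    # sub-shares it received, so old and new shares never combine to
    # reveal the underlying secret.
    subs = {u: share(su, ell, L_new) for u, su in old_shares.items()}
    return {w: reconstruct({u: subs[u][w] for u in old_shares})
            for w in range(1, L_new + 1)}
```

The new committee ends up with fresh degree-ell shares of the same secret, so an adversary holding a few old shares and a few new shares still learns nothing.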
4.2. Types of keys at PKI

Before giving our protocol, we need to specify what types of keys the PKI needs to store. The keys depend on the cryptographic primitives that we use (signature schemes, symmetric encryption and ElGamal encryption); for ease of reading, we formally give these primitives in Appendix B.1. The PKI stores three types of keys for all clients in [N]:
• g^{ai} of client i for its secret ai; this is for client j to derive the pairwise secret ri,j with client i by computing (g^{aj})^{ai}.
• g^{bi} of client i for deriving a symmetric encryption key ki,j for an authenticated encryption scheme SymAuthEnc (Definition 3); this scheme is used when a client sends messages to another client via the server. Later, when we say client i sends a message to client j via the server in the protocol, we implicitly assume the messages are encrypted using ki,j.
• pki of client i for verifying i's signature on messages signed by ski.

4.3. Setup phase

The setup phase consists of two parts: (1) distributing a random seed v ∈ {0,1}^λ to all participants, and (2) selecting a random subset of clients as decryptors and distributing the shares of the secret key of an asymmetric encryption scheme AsymEnc. In our context, AsymEnc is the ElGamal cryptosystem's encryption function (Definition 2).

As we mentioned earlier, the first part can be done through a trusted source of randomness, or by leveraging a randomness beacon that is already deployed, such as Cloudflare's [1]. The second part can be done by selecting a set of L clients as decryptors, D, using the random seed v (ChooseSet), and then running a DKG protocol among them. We use a discrete-log based DKG protocol [33] (which we call GJKR-DKG) since it is compatible with the ElGamal cryptosystem. However, this DKG does not work under our communication model and requires some changes and relaxations, as we discuss next.

DKG with an untrusted proxy. The correctness and security of the GJKR-DKG protocol rely on a secure broadcast channel. Our communication model does not have such a channel, since the server can tamper with, replay or drop messages. Below we give the high-level ideas of how we modify GJKR-DKG; Appendix B.2 gives the full protocol.

We begin by briefly describing the GJKR-DKG protocol. It has a threshold of 1/2, which means that at most half of the participants can be dishonest; the remaining parties must perform the following steps correctly: (1) Each party i generates a random value si and acts as a dealer to distribute the shares of si (party j gets si,j). (2) Each party j verifies the received shares (we defer how the verification is done to Appendix B.2). If the share from party i fails the verification, j broadcasts a complaint against party i. (3) Party i broadcasts, for each complaint from party j, the si,j for verification. (4) Each party disqualifies those parties that fail the verification; the rest of the parties form a set QUAL. Then each party sums up the shares from QUAL to derive a share of the secret key.

Given our communication model, it appears hard to guarantee the standard DKG correctness property, which states that if there are enough honest parties, at the end of the protocol the honest parties hold valid shares of a unique secret key. Instead, we relax this correctness property by allowing honest parties to abort if the server who is proxying the messages acts maliciously.

We modify GJKR-DKG in the following ways. First, we assume the threshold of dishonest participants is 1/3. Second, all of the messages are signed; honest parties abort if they do not receive the prescribed messages. Third, we add another step before each client decides on the eventual set QUAL: all parties sign their QUAL set and send it to the server; the server sends the signed QUALs to all the parties. Each party then checks whether it receives 2ℓ + 1 or more valid signed QUAL sets that are the same. If so, then the QUAL set defines a secret key; otherwise the party aborts. We give the detailed algorithms and the corresponding proofs in Appendix B.2. Note that the relaxation from GJKR-DKG is that we allow parties to abort (so no secret key is shared at the end), and this is reasonable particularly in the federated learning setting because the server will not get the result if it misbehaves.

Decryptors run DKG. At the end of our DKG, a subset of the selected decryptors will hold the shares of the secret key SK. The generated PK is signed by the decryptors and sent to all of the N clients by the server; the signing prevents the server from distributing different public keys or distributing a public key generated from a set of malicious clients. Each client checks if it received 2ℓ + 1 valid signed PKs from the set of decryptors determined by the random seed (from the beacon); if not, the client aborts. In Appendix B, we provide the pseudocode for the entire setup protocol Πsetup (Fig. 11).
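After the DKG, the decryptors jointly hold Shamir shares of an ElGamal secret key. How such shares are later used for non-interactive threshold decryption (the mechanics are detailed in §6) can be illustrated with toy parameters (a sketch only: p = 23, q = 11, g = 2 give a tiny prime-order subgroup and stand in for curve P-256):

```python
import random

p, q, g = 23, 11, 2  # toy safe-prime group: g generates the order-q subgroup

def share_sk(sk, ell, L):
    # Degree-ell Shamir sharing of sk over Z_q; any ell+1 shares suffice.
    coeffs = [sk] + [random.randrange(q) for _ in range(ell)]
    return {u: sum(c * pow(u, k, q) for k, c in enumerate(coeffs)) % q
            for u in range(1, L + 1)}

def threshold_decrypt(c0, c1, responses):
    # responses: {u: c0^{s_u} mod p} from any set U with |U| > ell.
    # Combine with Lagrange coefficients beta_u to obtain c0^SK directly,
    # without ever reconstructing SK itself.
    c0_sk = 1
    for u in responses:
        beta = 1
        for v in responses:
            if v != u:
                beta = beta * (-v) * pow(u - v, -1, q) % q
        c0_sk = c0_sk * pow(responses[u], beta, p) % p
    return c1 * pow(c0_sk, -1, p) % p  # h = (c0^SK)^{-1} * c1
```

Each responding party touches only its own share, so the secret key is never materialized anywhere, and any ℓ + 1 of the L decryptors suffice.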
4.4. Collection phase

The collection phase consists of T rounds; each round t has three steps: report, cross-check, and reconstruction. Below we describe each step, and we defer the full protocol Πsum to Figure 13 in Appendix B. The cryptographic primitives we use here (SymAuthEnc and AsymEnc) are formally given in Appendix B.1.

Report. In round t, the server uses the random value v (obtained from the setup) to select the set of clients St ⊆ [N] of size nt by running ChooseSet(v, t, nt, N). It then establishes the graph Gt ← GenGraph(v, t, St) as specified in Figure 1. We denote the neighbors of i as At(i) ⊆ St. The server asks each client i ∈ St to send a message consisting of the following three things:
1) Veci = ⃗xi,t + Σ_{j∈At(i)} ±PRG(hi,j,t) + PRG(mi,t), where hi,j,t is computed as PRF(ri,j, t) and ri,j is derived from the key directory by computing (g^{aj})^{ai}; mi,t is freshly generated in round t by client i.
2) L symmetric ciphertexts: SymAuthEnc(ki,u, mi,u,t) for all u ∈ D, where mi,u,t is the share of mi,t meant for u (i.e., Share(mi,t, ℓ, L) → {mi,u,t}u∈D), and ki,u is the symmetric encryption key shared between client i and decryptor u (they can derive ki,u from the PKI);
3) |At(i)| ElGamal ciphertexts: AsymEnc(PK, hi,j,t) for all j ∈ At(i).

The above way of using symmetric encryption for individual masks and public-key encryption for pairwise masks balances computation and communication in practice. Technically, one can encrypt the shares of hi,j,t with symmetric authenticated encryption as well (eliminating public-key operations), but this increases client communication: for each client, the number of ciphertexts appended to the vector is |A(i)| · L. This is, for example, 1,600 when L and |A(i)| are both 40. On the other hand, if one encrypts both the pairwise and individual masks using only public-key encryption, then the number of expensive public-key operations for reconstructing secrets is proportional to nt, whereas it is only proportional to the number of dropouts in our proposed approach. In practice, the number of dropouts is much smaller than nt, hence the savings.

Cross-check. The server needs to recover mi,t for online clients, and hi,j,t for clients that drop out. To do so, the server labels the clients as "offline" or "online" and asks the decryptors to recover the corresponding masks. For BBGLR, we described how this step involves most clients during the fault recovery process and highlighted an issue where a malicious server can send inconsistent labels to clients and recover both the pairwise mask and individual mask for some target client (§3.2). Flamingo also needs to handle this type of attack (the server tells some of the honest decryptors to decrypt mi,t and other honest decryptors to decrypt hi,j,t, and utilizes the malicious decryptors to reconstruct both), but it only needs to involve decryptors. In detail, each decryptor signs the online/offline labels of the nt clients (each client can only be labeled either offline or online), and sends them to the other decryptors (via the server). Each decryptor checks that it received 2L/3 or more valid signed labels (recall from §2.3 that δD + ηD < 1/3). If so, each decryptor further checks that:
1) The number of online clients is at least (1 − δ)nt;
2) All the online clients in the graph are connected;
3) Each online client i has at least k online neighbors,² such that η^k < 2^{−κ} (η and κ are defined as in §2.3).
If any of the above checks fail, the decryptor aborts. This step ensures that either all the honest decryptors agree on a valid offline/online label assignment and consequently the server gets the result, or the honest decryptors abort and the server gets nothing.

Reconstruction. The server collects all the ciphertexts to be decrypted: the ciphertexts of mi,u,t (symmetric encryption) for the online clients, and the ciphertexts of hi,j,t (public-key encryption) for the offline clients. Then the server sends the ciphertexts to all the decryptors, who perform either a symmetric decryption or the threshold ElGamal decryption according to their agreed-upon labels.

The choice of using decryptors to check the graph and reconstruct all the secrets is based on an important observation in federated learning: the number of clients involved in one round, nt, is much smaller than the input vector length [43]. Therefore, the asymptotic costs at a decryptor (which are proportional to nt) are actually smaller than the size of an input weight vector.

4.5. Malicious labeling across rounds

The server, controlled by a malicious adversary, can ask for the decryption of hi,j,t in round t, and then in some other round t′, the server can ask for the decryption of mi,t (but not mi,t′, if the server does not care about obtaining a result in round t′). This allows the server to recover ⃗xi,t in the clear. To prevent this attack, honest decryptors need to know the round for which a ciphertext is sent. For symmetric ciphertexts, the client appends the round number t to the plaintext (e.g., mi,u,t || t) and uses authenticated encryption; for public-key ciphertexts, the client appends t to the ciphertext c and signs the tuple (c, t) (the verification key is in the PKI). Note that a malicious adversary can still fool enough honest decryptors into thinking it is round t while it is in fact t′. To prevent this, decryptors also include the round number in the online/offline labels and sign them. The cross-check (§4.4) guarantees that the decryptors agree on the round number.

4.6. Load balancing across decryptors

In each summation, a client who is not a decryptor only sends a single vector. This is nearly optimal since even if the aggregation is non-private the client has to send the vector (but without the additional small ciphertexts). The decryptors, however, have additional computation and communication in order to help with the result reconstruction.

2. This can be done efficiently in the inverse way, by checking how many offline neighbors each online client has, assuming the dropout rate is small.
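The report-step masking of §4.4, and the way the pairwise masks cancel once the server recovers the individual-mask seeds, can be sketched as follows (an illustrative sketch: SHA-256 stands in for the PRF/PRG, and the 32-bit modulus and all names are arbitrary; the paper uses AES-based primitives):

```python
import hashlib

P = 2**32  # modulus for masked vectors (illustrative choice)

def prg(seed: bytes, dim: int) -> list:
    # Expand a seed into dim values below P (SHA-256-based PRG sketch).
    out, ctr = [], 0
    while len(out) < dim:
        block = hashlib.sha256(seed + ctr.to_bytes(4, "big")).digest()
        out += [int.from_bytes(block[k:k + 4], "big") for k in range(0, 32, 4)]
        ctr += 1
    return [x % P for x in out[:dim]]

def masked_input(x, i, t, r, m_seed, neighbors):
    # Vec_i = x_i + sum_{j in A(i)} ±PRG(h_{i,j,t}) + PRG(m_{i,t}) mod P,
    # with h_{i,j,t} = PRF(r_{i,j}, t); the sign is + if i < j, else -.
    vec = [(a + b) % P for a, b in zip(x, prg(m_seed, len(x)))]
    for j in neighbors:
        h = hashlib.sha256(r[frozenset((i, j))] + t.to_bytes(4, "big")).digest()
        mask = prg(h, len(x))
        sign = 1 if i < j else -1
        vec = [(a + sign * b) % P for a, b in zip(vec, mask)]
    return vec
```

Neighbors derive the same hi,j,t from the shared long-term secret ri,j, so their pairwise masks cancel in the sum; the server then only needs the individual-mask seeds mi,t (or, for dropouts, the pairwise seeds) to unmask the total.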
This causes a load imbalance in the system and could be unfair, since a client selected to be a decryptor has to do more work than regular clients.

In Flamingo, the decryptor responsibility shifts across time. Every R rounds, the current decryptors transfer their shares of SK to a new set of randomly selected clients who serve as the new decryptors. To ensure security, the shares of SK have to be modified in a particular way during the transition, as otherwise the adversary may control some malicious decryptors before the transition and some malicious decryptors after the transition, and thus may obtain enough shares to reconstruct SK. We address this by relying on prior proactive secret sharing techniques [32,40,45]; they additionally enable Flamingo to change the number of decryptors and the threshold as needed. In Appendix B.4, we provide details of the transition protocol used in Flamingo.

A final clarification is that decryptors who dropped out (e.g., due to power loss) at one round can come back later and participate in another round (e.g., when power is resumed). The decryption always succeeds since we require that less than 1/3 of the decryptors are dropped out or malicious at any step (§5). The secret key transition is purely for system load balancing: neither dropout resilience nor security relies on the parameter R.

4.7. Considerations in federated learning

A recent work [56] describes an attack on the composition of federated learning and secure aggregation. The idea is that the server can elude secure aggregation by sending clients inconsistent models. For example, the server sends to client 1 model M1, to client 2 model M2, and to client 3 model M3. Each of the clients then runs the local training on the provided model. The server chooses the models it sends to clients 1 and 2 in a special way such that after clients 1 and 2 train their local models, their local weights will cancel out when added. Consequently, the server will get the model weights of client 3. The proposed defense, which works in our context without incurring any overhead, is for clients to append the hash of the model they receive in a given round to their PRG seed for that round: PRG(hi,j,t || Hash(M)), where M is the model received from the server. If all the clients receive the same model, the pairwise masks cancel out; otherwise, they do not.

5. Parameter Selection and Security Analysis

The parameters of Flamingo include:
• System parameters N, T and the number of clients nt chosen in round t ∈ [T];
• Threat model parameters δD, δ, η, which are given, and ηSt, ηD, which depend on η (their relation is summarized in Section 2.3 and fully derived in Appendix A);
• Security parameter κ, and the parameters that relate to security: the graph generation parameter ϵ, and the number of selected decryptors L.
We discuss these parameters in detail below and state our formal lemmas with respect to them.

Functionality Fsetup

Parties: clients 1, . . . , N and a server.
• Fsetup samples v ←$ {0,1}^λ.
• Fsetup samples a secret key and public key pair (SK, PK). // When the public-key cryptosystem is instantiated by ElGamal, then SK is s ←$ Zq and PK = g^s.
• Fsetup asks the adversary A whether it should continue or not. If A replies with abort, Fsetup sends abort to all honest parties; if A replies with continue, Fsetup sends v and PK to all the parties.

Figure 3: Ideal functionality for the setup phase.

5.1. Security of setup phase

Let δD upper bound the fraction of decryptors that drop out during the setup phase; note that in Section 2.3 we let δD upper bound the dropouts in one aggregation round, and for simplicity here we use the same notation. Flamingo's DKG requires that δD + ηD < 1/3. Note that ηD in fact depends on η, L and N, but we will give the theorems using ηD and discuss how to choose L to guarantee a desired ηD in Appendix C.

Theorem 1 (Security of setup phase). Assume that a PKI and a trusted source of randomness exist, and that the DDH assumption holds. Let the dropout rate of decryptors in the setup phase be bounded by δD. If δD + ηD < 1/3, then under the communication model defined in Section 2.2, protocol Πsetup (Fig. 11) securely realizes functionality Fsetup (Fig. 3) in the presence of a malicious adversary controlling the server and an η fraction of the N clients.

5.2. Security of collection phase

First, we need to guarantee that each graph Gt, even after removing the vertices corresponding to the δ + η fraction of dropout and malicious clients, is still connected. This imposes a requirement on ϵ, which we state in Lemma 1. For simplicity, we omit the exact formula for the lower bound of ϵ and defer the details to Appendix C.

Lemma 1 (Graph connectivity). Given a security parameter κ and threat model parameters δ, η (§2.3), let G be a random graph G(n, ϵ). Let O, C be two random subsets of nodes in G where |O| ≤ δn and |C| ≤ ηn (O stands for the dropout set and C stands for the malicious set). Let G̃ be the graph with the nodes in C and O, and the associated edges, removed from G. There exists ϵ* such that for all ϵ ≥ ϵ*, G̃ is connected except with probability 2^{−κ}.

Secondly, we require 2δD + ηD < 1/3 to ensure that all online honest decryptors reach an agreement in the cross-check step and the reconstruction is successful. Note that the decryptors in the setup phase who dropped out (a δD fraction) will not have the share of SK, while the clients who drop
out during a round (another δD fraction) in the collection phase can come back in another round, hence we have the above inequality.

Functionality Fmal

Parties: clients 1, . . . , N and a server.
Parameters: corrupted rate η, dropout rate δ, number of per-round participating clients n.
• Fmal receives from a malicious adversary A a set of corrupted parties, denoted as C ⊂ [N], where |C|/N ≤ η.
• For each round t ∈ [T]:
  1) Fmal receives a sized-n random subset St ⊂ [N] and a dropout set Ot ⊂ St, where |Ot|/|St| ≤ δ and |C|/|St| ≤ η, and inputs ⃗xi,t from client i ∈ St \ (Ot ∪ C).
  2) Fmal sends St and Ot to A, and asks A for a set Mt: if A replies with Mt ⊆ St \ Ot such that |Mt|/|St| ≥ 1 − δ, then Fmal computes ⃗yt = Σ_{i∈Mt\C} ⃗xi,t and continues; otherwise Fmal sends abort to all the honest parties.
  3) Depending on whether the server is corrupted by A:
     • If the server is corrupted by A, then Fmal outputs ⃗yt to all the parties corrupted by A.
     • If the server is not corrupted by A, then Fmal asks A for a shift ⃗at and outputs ⃗yt + ⃗at to the server.

Figure 4: Ideal functionality for Flamingo.

5.3. Main theorems

The full protocol, denoted as ΦT, is the sequential execution of Πsetup (Fig. 11) followed by a T-round Πsum (Fig. 13). We now give formal statements for the properties of Flamingo, and defer the proofs to Appendix D. Note that, as we see from the ideal functionality Fmal (Fig. 4), when the server is corrupted, the sum result in round t is not determined by the actual dropout set Ot, but instead by a set Mt chosen by the adversary (see details in Appendix D.5).

Theorem 2 (Dropout resilience of ΦT). Let δ, δD, η, ηD be threat model parameters as defined (§5.1, §5.2). If 2δD + ηD < 1/3, then protocol ΦT satisfies dropout resilience: when all parties follow the protocol ΦT, for every round t ∈ [T], and given a set of dropout clients Ot in the report step where |Ot|/|St| ≤ δ, protocol ΦT terminates and outputs Σ_{i∈St\Ot} ⃗xi,t, except with probability 2^{−κ}.

Theorem 3 (Security of ΦT). Let the security parameter be κ. Let δ, δD, η, ηD be threat model parameters as defined (§5.1, §5.2). Let ϵ be the graph generation parameter (Fig. 1). Let N be the total number of clients and n be the number of clients for summation in each round. Assuming the existence of a PKI, a trusted source of initial randomness, a PRG, a PRF, an asymmetric encryption AsymEnc, a symmetric authenticated encryption SymAuthEnc, and a signature scheme, if 2δD + ηD < 1/3 and ϵ ≥ ϵ*(κ) (Lemma 1), then under the communication model defined in Section 2.2, protocol ΦT securely realizes the ideal functionality Fmal given in Figure 4 in the presence of a static malicious adversary controlling an η fraction of the N clients (and the server),³ except with probability at most Tn · 2^{−κ+1}.

Additionally, if the asymmetric encryption is instantiated by the ElGamal cryptosystem, the security can be based on the DDH assumption and the other primitives stated above.

The final complication is how to choose L to ensure that 2δD + ηD < 1/3 holds; note that ηD depends on η and L. One can choose L to be N, but this does not give an efficient protocol; on the other hand, choosing a small L may result in all the decryptors being malicious. In Appendix C, we give a lower bound on L to ensure a desired ηD (w.h.p.), given N, η, and δD.

6. Implementation

We implement Flamingo in 1.7K lines and BBGLR in 1.1K lines of Python. For the PRG, we use AES in counter mode; for authenticated encryption, we use AES-GCM; and signatures use ECDSA over curve P-256. Our code is available at https://github.com/eniac/flamingo.

Distributed decryption. We build the distributed decryption scheme discussed in Section 4.1 as follows. We use ElGamal encryption to instantiate the asymmetric encryption. It consists of three algorithms (AsymGen, AsymEnc, AsymDec). AsymGen outputs a secret and public key pair SK ∈R Zq and PK := g^SK ∈ G. AsymEnc takes in PK and plaintext h ∈ G, and outputs ciphertext (c0, c1) := (g^w, h · PK^w), where w ∈R Zq is the encryption randomness. AsymDec takes in SK and ciphertext (c0, c1) and outputs h = (c0^SK)^{−1} · c1.

In threshold decryption [28,32,64], the secret key SK is shared among L parties such that each party u ∈ [L] holds a share su, but no single entity knows SK, i.e., (s1, . . . , sL) ← Share(SK, ℓ, L). Suppose Alice wants to decrypt the ciphertext (c0, c1) using the secret-shared SK. To do so, Alice sends c0 to each party in [L], and gets back c0^{su} for u ∈ U ⊆ [L]. If |U| > ℓ, Alice can compute from U a set of combination coefficients {βu}u∈U, and

  c0^SK = Π_{u∈U} (c0^{su})^{βu}.

Given c0^SK, Alice can get the plaintext h = (c0^SK)^{−1} · c1. Three crucial aspects of this protocol are that: (1) SK is never reconstructed; (2) the decryption is dropout resilient (Alice can obtain h as long as more than ℓ parties respond); and (3) it is non-interactive: Alice communicates with each party exactly once.

We implement ElGamal over an elliptic curve group G, and we use curve P-256. To map the output of PRF(ri,j, t), which is a binary string, to G, we first hash it to an element in the field of the curve using the hash-to-field algorithm from the

3. We assume that in each round, the corrupted fraction of the n clients is also η; see Section 2.
             |          BBGLR                            |          Flamingo
Phase        | Steps | Server          | Client          | Steps | Server               | Client
Setup        | —     | —               | —               | 4     | O(L^3)               | O(L^2)
Round setup  | 3T    | O(T A nt)       | O(T A)          | —     | —                    | —
Collection   | 3T    | O(T nt (d + A)) | O(T (d + A))    | 3T    | O(T nt (d + L + A))  | Regular client: O(T (d + A)); Decryptors: O(T (L + δ A nt + (1 − δ) nt))

Figure 5: Communication complexity and number of steps (client-server round-trips) of Flamingo and BBGLR for T rounds of aggregation. N is the total number of clients and nt is the number of clients chosen to participate in round t. The number of decryptors is L, and the dropout rate of clients in St is δ. Let A be the upper bound on the number of neighbors of a client, and let d be the dimension of a client's input vector.
IETF draft [39, §5]. We then use the SSWU algorithm [13,72] to map the field element to a curve point P ∈ G. A client will encrypt P with ElGamal, and then hash P with SHA256 to obtain hi,j,t, the input to the pairwise mask's PRG. When the server decrypts the ElGamal ciphertexts and obtains P, it uses SHA256 on P to obtain the same value of hi,j,t.

Optimizations. In Flamingo's reconstruction step, we let the server do the reconstruction using partial shares. That is, if the interpolation threshold is 15 and the server collected shares from 20 decryptors, it will only use 15 of them and ignore the remaining 5 shares. Furthermore, as we only have a single set of decryptors, when the server collects shares from U ⊆ D, it computes a single set of interpolation coefficients from U and uses it to do linear combinations on all the shares. This linear combination of all the shares can be done in parallel. In contrast, BBGLR requires the server to compute different sets of interpolation coefficients to reconstruct the pairwise masks (one set for each client).

Simulation framework. We integrate all of Flamingo's code into ABIDES [14,15], which is an open-source high-fidelity simulator designed for AI research in financial markets (e.g., stock exchanges). ABIDES is a great fit as it supports tens of thousands of clients interacting with a server to facilitate transactions (and in our case to compute sums). It also supports configurable pairwise network latencies.

7. Asymptotic Costs

An advantage of Flamingo over BBGLR is that Flamingo requires fewer round trips for one summation. BBGLR requires six round trips for one summation; in contrast, Flamingo requires three: the report step involves all clients selected in a round, and for the remaining two steps the server contacts only the decryptors. Meanwhile, the number of round trips has a significant impact on the overall runtime of the aggregation, as we show experimentally in Section 8.2. The reasons for this are twofold: (1) the latency of an RTT over the wide area network is on the order of tens of milliseconds; and (2) the server has to wait for enough clients in each step to send their messages, so tail latency plays a big role. Depending on the setting, client message arrival can vary widely, with some clients potentially being mobile devices on slow networks.

In addition to fewer round trips, Flamingo is expected to wait for less time during the reconstruction step. This is because the server in BBGLR has to wait for the vast majority of clients to respond in order to reconstruct secrets for dropout clients, while the server in Flamingo only needs responses from 1/3 of the decryptors.

Client and server costs. Figure 5 compares Flamingo's communication costs with those of BBGLR. In short, the total asymptotic costs for T aggregation rounds do not differ much between BBGLR and Flamingo, but the numbers of round trips differ considerably. The computation cost is analyzed as follows. For the setup phase, if a client is a decryptor, it has O(L^2) computation for both the DKG and the secret key transfer. In the collection phase at round t, each client computes O(A + L) encryptions (assuming that for all clients i, |A(i)| ≤ A). If a client is a decryptor, it additionally has an O(δAnt + (1 − δ)nt + ϵnt^2) computation cost.

8. Experimental Evaluation

In this section we answer the following questions:
• What are Flamingo's concrete server and client costs, and how long does it take Flamingo to complete one and multiple rounds of aggregation?
• Can Flamingo train a realistic neural network?
• How does Flamingo compare to the state of the art in terms of the quality of the results and running time?
We implement the following baselines:

Non-private baseline. We implement a server that simply sums up the inputs it receives from clients. During a particular round, each of the clients sends a vector to the server. These vectors are in the clear, and may be any sort of value (e.g., floating points), unlike in Flamingo, which requires data to be masked positive integers. The server tolerates dropouts, as Flamingo does, and aggregates only the vectors from clients who respond before the timeout.

BBGLR. For BBGLR, we implement Algorithm 3 in their paper [9] with a slight variation that significantly improves BBGLR's running time, although it might introduce security issues (i.e., we are making this baseline's performance better than it is in reality, even if it means it is no longer secure). Our specific change is that we allow clients to drop
out during the graph establishment procedure and simply 106
BBGLR
106 BBGLR (all clients)
communication (bytes)
Flamingo (regular clients)
construct a graph with the clients that respond in time.
communication (KB)
Flamingo Flamingo (decryptors)
BBGLR (originally) requires that no client drops out during 104 104
graph establishment to ensure that the graph is connected.
Without this modification, BBGLR’s server has to wait for 102 102
all the clients to respond and is severely impacted by the
long tail of the client response distribution—which makes
100 keyad graph shares report check recon 100 keyad graph shares report check recon
our experiments take an unreasonable amount of time.
(a) Server communication (b) Client communication
8.1. Experimental environment
Figure 6: Communication costs for different steps in a single
Prior works focus their evaluation on the server’s costs. summation over 1K clients for Flamingo and BBGLR.
While this is an important aspect (and we also evaluate it),
a key contributor to the end-to-end completion time of the aggregation (and of the federated learning training) is the number of round trips between clients and the server. This is especially true for geodistributed clients.

To faithfully evaluate real network conditions, we run the ABIDES simulator [14] on a server with 40 Intel Xeon E5-2660 v3 (2.60 GHz) CPUs and 200 GB DDR4 memory. Note that in practice, client devices are usually less powerful than the experiment machine. ABIDES supports the cubic network delay model [37]: the latency consists of a base delay (a range), plus a jitter that controls the percentage of messages that arrive within a given time (i.e., the shape of the delay distribution tail). We set the base delay to the "global" setting in ABIDES's default parameters (the range is 21 microseconds to 53 milliseconds), and use the default parameters for the jitter.

Both Flamingo and BBGLR work in steps (as defined in §2.3). We define a waiting period for each step of the protocol. During the waiting period, the server receives messages from clients and puts the received messages in a message pool. When the waiting period is over, a timeout is triggered and the server processes the messages in the pool, and proceeds to the next step. The reason that we do not let the server send and receive messages at the same time is that, in some steps (in both BBGLR and Flamingo), the results sent to the clients depend on all the received messages and cannot be processed in a streaming fashion. For example, the server must decide on the set of online clients before sending the request to reconstruct the shares.
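The waiting-period mechanism can be sketched as a small server loop (a simplified model for illustration, not the paper's ABIDES-based implementation; the queue-based inbox and `collect_step` are assumptions of the sketch):

```python
import queue
import threading
import time

def collect_step(inbox, wait_secs):
    """Buffer client messages for one protocol step until the timeout.

    Messages accumulate in a pool and are only processed as a batch
    once the waiting period ends, since later responses depend on the
    full set of online clients (no streaming processing).
    """
    pool = []
    deadline = time.monotonic() + wait_secs
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            pool.append(inbox.get(timeout=remaining))
        except queue.Empty:
            break  # waiting period elapsed with no further messages
    return pool

# Toy run: two clients answer promptly, one is too slow (a "dropout").
inbox = queue.Queue()
timers = [threading.Timer(d, inbox.put, args=(v,))
          for d, v in [(0.0, [1, 2]), (0.05, [3, 4]), (2.0, [9, 9])]]
for t in timers:
    t.start()

received = collect_step(inbox, wait_secs=0.5)
for t in timers:
    t.cancel()  # the late sender is simply excluded

online = len(received)                     # server fixes the online set,
total = [sum(c) for c in zip(*received)]   # then processes the pool
```

Here the 2-second sender misses the 0.5-second waiting period and is treated as a dropout, mirroring how both protocols exclude slow clients from a step.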
8.2. Secure aggregation costs and completion time

This section provides microbenchmarks for summation tasks performed by BBGLR and Flamingo. Client inputs are 16K-dimensional vectors with 32-bit entries. For parameters, unless specified differently later, we set N to 10K and the number of decryptors to 60 in Flamingo; and set the number of neighbors to 4 log n_t for both Flamingo and BBGLR (for BBGLR, this choice satisfies the constraints in Lemma 4.7 [9]). In Figures 6 and 7, "keyad" is the step for exchanging keys, "graph" is the step for clients to send their choices of neighbors, and "share" is the step for clients to share their secrets with their neighbors (marked as 1–8 in their Algorithm 3). The steps "report", "check" and "recon" in Flamingo are described in Section 4.4; in BBGLR, these steps correspond to the last three round trips in a summation, marked as 8–14 in their Algorithm 3.

Figure 7: Single-threaded microbenchmarks averaged over 10 runs for server and client computation for a single summation over 1K clients. "<" means less than.

  CPU costs            keyad  graph  share  report  check  recon
  Server (sec)
    BBGLR               0.11   0.27   0.09    0.09   0.08   0.76
    Flamingo              —      —      —     0.24     —    2.30
  Client (sec)
    BBGLR              <0.01  <0.01  <0.01    0.02   0.01  <0.01
    Flamingo
      Regular clients     —      —      —     0.22     —      —
      Decryptors          —      —      —       —    0.10   0.56

Communication costs. Figure 6 gives the communication cost for a single summation. The total costs per aggregation round for BBGLR and Flamingo are similar. This is because Flamingo's extra cost over BBGLR at the report step is roughly the message size that BBGLR has in its three-step setup; this is also reflected in the asymptotic cost analysis of Figure 5. In short, compared to BBGLR, Flamingo has fewer round trips with roughly the same total server communication. For clients, the story is more nuanced: each client has a slightly higher cost in Flamingo than in BBGLR during the report step, as clients in Flamingo need to append ciphertexts to the vector. However, in the reconstruction step, clients who are not decryptors do not need to send or receive any messages. Each decryptor incurs communication that is slightly larger than sending one input vector. Note that the vector size only affects the report step.

Computation costs. We first microbenchmark a single summation with 1K clients, and set δ to 1% (i.e., up to 1% of clients can drop out in any given step). This value of δ results in a server waiting time of 10 seconds. Figure 7 gives the results. The report step in Flamingo has slightly larger server and client costs than BBGLR because clients need to generate the graph "on the fly". In BBGLR, the graph is already established in the first three steps and stored for the report and reconstruction steps. For server reconstruction time, Flamingo is slightly more costly than BBGLR because of the additional elliptic curve operations.
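Both protocols lean on threshold secret sharing for dropout recovery: a secret is split so that any t shares reconstruct it. A minimal Shamir sketch over a prime field (illustrative only; the paper's protocol additionally involves elliptic-curve operations, which this toy omits):

```python
import random

P = 2**61 - 1  # prime field for the toy example

def share(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation of the polynomial at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # pow(den, P-2, P) is the modular inverse (P is prime)
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

shares = share(secret=123456789, t=3, n=5)
assert reconstruct(shares[:3]) == 123456789  # any 3 shares suffice
assert reconstruct(shares[2:]) == 123456789
```

The server-side "recon" cost measured above is dominated by exactly this kind of interpolation, done over elliptic-curve group elements rather than field integers.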
The main takeaway is that while Flamingo's computational costs are slightly higher than BBGLR's, these additional costs have negligible impact on completion time owing to the much larger effect of network delay, as we show next.

Figure 8: End-to-end completion time and accuracy of 10 secure aggregation rounds with 1K clients. The elapsed time is the finishing time of round t. For Flamingo, round 1 includes all of the costs of its one-time setup, and between round 5 and 6 Flamingo performs a secret key transfer. (a) Runtime with w = 10. (b) Sum accuracy τ; w = 10. (c) Runtime with w = 5. (d) Sum accuracy τ; w = 5.

Aggregation completion time. To showcase how waiting time w affects dropouts (which consequently affects the sum accuracy), we use two waiting times, 5 seconds and 10 seconds. The runtime for an aggregation round depends on the timeout for each step, the simulated network delay, and the server and client computation time. Figures 8a and 8c show the overall completion time across 10 rounds of aggregations. A shorter waiting time makes the training faster, but it also means that there are more dropouts, which in turn leads to more computation during reconstruction. As a result, the overall runtimes for the two cases are similar. On 1K clients, Flamingo achieves a 3× improvement over BBGLR; for Flamingo's cost we included its one-time setup and one secret key transfer. If the key transfer is performed less frequently, the improvement will be more significant.

The cost of the DKG procedure (part of the setup and which we also added to the first round in Figure 8) is shown in Figure 9. A complete DKG takes less than 10 seconds as the number of decryptors is not large and we allow decryptor dropouts. For 60 decryptors, the local computation performed by each decryptor during the DKG is 2 seconds.

Figure 9: Generating shares of the secret key among 60 decryptors, measured as message arrival time at the server (sec) for each DKG step: (1) share and commit, (2) accept or complain, (3) bcast share, (4) bcast qual. The four steps are described in Section 4 and given as part (1) in ΠDKG in Appendix B.2.

Summation accuracy. Figures 8b and 8d show that Flamingo achieves better sum accuracy τ (defined in §2.4) than BBGLR when they start a summation with the same number of clients. When the waiting time is shorter, as in Figure 8d, in each step there are more clients excluded from the summation and therefore the discrepancy between Flamingo and BBGLR grows larger.

8.3. Feasibility of a full private training session

We implement the federated learning algorithm FedAvg [52] on the non-private baseline. We also use this algorithm for Flamingo and BBGLR but replace its aggregation step with either Flamingo or BBGLR to produce a secure version. Inside of FedAvg, we use a multilayer perceptron for image classification. Computation of the weights is done separately by each client on local data, and then aggregated by the server to update a global model. The server then sends the global model back to the clients. The number of training iterations that clients perform on their local data is referred to as an epoch. We evaluated Flamingo on epochs of size 5, 10, and 20. We found that often, a larger epoch was correlated with faster convergence of FedAvg to some "maximum" accuracy score. Additionally, because our neural network model calculations run very fast (there was, on average, less than a second's difference between clients' model fitting times for different epochs) and because Flamingo and the baselines were willing to wait for clients' inputs for up to 10 seconds, the epoch size did not affect their overall runtime.
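The FedAvg loop described above can be sketched with a toy linear model (`local_update` stands in for the clients' model fitting; the least-squares objective is an illustrative assumption, not the paper's multilayer perceptron):

```python
import numpy as np

def local_update(global_weights, local_data, epochs=5, lr=0.1):
    """Toy client step: gradient descent on a local least-squares loss."""
    w = global_weights.copy()
    X, y = local_data
    for _ in range(epochs):  # the "epoch" count from the text
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

def fedavg_round(global_weights, client_datasets):
    """One FedAvg round: clients fit locally, the server averages.

    With secure aggregation, the server would learn only this
    average (sum), never the individual updates.
    """
    updates = [local_update(global_weights, d) for d in client_datasets]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(30):
    w = fedavg_round(w, clients)
# w converges toward true_w = [2, -1]
```

In the secure versions, `np.mean` is exactly the step replaced by a Flamingo or BBGLR summation.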
We use two of TensorFlow's federated datasets [2]: (1) EMNIST, the Extended MNIST letter dataset from the Leaf repository [16,17]; and (2) CIFAR100, from the CIFAR-100 tiny images dataset [46,47]. The EMNIST dataset has ∼340K training/∼40K test samples, each with a square of 28 × 28 pixels and 10 classes (digits). Each client has ∼226 samples. During training, we use weight vectors with 8K 32-bit entries. The CIFAR100 dataset has 50K training/10K test samples, each with a square of 32 × 32 pixels and 100 classes. Each pixel additionally has red/blue/green values. Each client has 100 samples. To achieve good accuracy for the CIFAR100 dataset, we use a more complex convolutional neural network than we do for the EMNIST dataset, with extra layers to build the model, normalize inputs between layers, and handle activation functions and overfitting. This results in longer weight vectors, with 500K 32-bit entries.

We randomly divide the datasets equally among 128 clients to create local data. Local models are trained with small batch sizes. In Flamingo and BBGLR, all weights (often floating point numbers) are encoded as positive integers. We do this by adding a large positive constant, multiplying by 2^12, and truncating the weight to an unsigned 32-bit integer. Figure 10 shows the result with δ = 1%.
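The encoding can be sketched as follows (the offset constant 2^10 is an arbitrary illustrative choice; any bound on the weights' magnitude works, and the decode step for a sum must subtract the offset once per client):

```python
import numpy as np

SCALE = 2**12    # 12 bits of fractional precision, as in the text
OFFSET = 2**10   # large positive constant shifting weights positive

def encode(w):
    """Float weights -> positive unsigned 32-bit integers."""
    return ((w + OFFSET) * SCALE).astype(np.uint32)  # truncates

def decode_sum(total, n_clients):
    """Undo the encoding on a *sum* of n encoded vectors."""
    return total.astype(np.float64) / SCALE - n_clients * OFFSET

weights = [np.array([0.5, -1.25]), np.array([2.0, 0.75])]
# the server adds encoded vectors, as it would over masked inputs
total = encode(weights[0]).astype(np.uint64) + encode(weights[1])
recovered = decode_sum(total, n_clients=2)
# recovered ≈ [2.5, -0.5], up to the 2^-12 truncation error
```

Each round loses at most 2^-12 per entry to truncation, which matches the observation below that the encoding does not measurably affect accuracy.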
Running time. From Figures 10b and 10d, we see that the EMNIST and CIFAR100 datasets do not converge until about round 150 and 200, respectively, though their accuracy continues to improve slightly after that. Figures 10a and 10c show that Flamingo's running time is about 5.5× lower (i.e., better) than BBGLR's for EMNIST and 4.8× for CIFAR100, and about 1.4× higher (i.e., worse) than the non-private baseline's for EMNIST and 1.7× for CIFAR100. We believe these results provide evidence that Flamingo is an effective secure aggregation protocol for multi-round settings such as those required in federated learning.

Figure 10: Evaluation for full training sessions on the EMNIST and CIFAR100 datasets, plotting elapsed time (hours) and accuracy score against training rounds for the baseline, BBGLR, and Flamingo: (a) run time, EMNIST; (b) accuracy, EMNIST; (c) run time, CIFAR100; (d) accuracy, CIFAR100. The number of clients per round is 128, the batch and epoch sizes for FedAvg are 10 and 20, respectively. Flamingo's setup cost is included during the first round, and it performs a secret key transfer every 20 rounds, which adds to the total run time. The accuracy score is TensorFlow's sparse categorical accuracy score [2].

Training accuracy. We measure training accuracy with TensorFlow's sparse categorical accuracy score, which is derived based on the resulting model's performance on test data. Due to the way in which we encode floating points as integers, a small amount of precision from the weights is lost each round. We compare the accuracy of Flamingo's and BBGLR's final global models to a model trained on the same datasets with the baseline version of FedAvg (which works over floating points) in Figures 10b and 10d. We find that the encoding itself does not measurably affect accuracy.

9. Extension with robustness

Recall that our threat model (§2.3) assumes that the server is controlled by a malicious adversary. However, in some cases, the server is actually honest and wants to obtain a meaningful result; this motivates the need of having the protocol be robust against malicious clients in addition to the other security properties. Here we discuss two types of robustness: 1) guaranteed output delivery, where the protocol always outputs the correct result even if some clients are malicious; and 2) a weaker notion where an incorrect result can be detected but the protocol will not output the correct result. Clearly the first notion is stronger.

In Section 3.2, we discussed that in BBGLR, when the server is honest, a malicious client may cause the output to be wrong or the protocol to abort: a malicious client changes the received share of a secret, and the server may not reconstruct the secret (in which case the protocol aborts) or the server may reconstruct a different secret. Or, a malicious client does not use the pairwise secret established in the BBGLR setup phase to mask the input vector. The protocol we give in Section 4 has neither type of robustness, for the same reason as in BBGLR, but we provide an extension such that malicious behaviors of the clients can be detected, though it does not have guaranteed output delivery. The full description of the extension is given in Appendix E.1.

As a result, the extended protocol, if it terminates, always outputs the correct sum, i.e., the sum of the inputs from clients who participated; or it aborts, in which case the server detects that the output is incorrectly computed (but it cannot detect which client is malicious). That said, we want to emphasize that this notion of security is not practically meaningful at the moment, since malicious clients are free to provide any input they want (see the paragraph on input correctness in §2.4). However, it could become important in the future: if there is ever some mechanism that could confirm the validity of clients' inputs (e.g., [10,50]), then this additional guarantee would ensure that an honest server always gets valid outputs even if malicious clients exist in the system.

10. Related Work

In this section we discuss alternative approaches to compute private sums and the reasons why they do not fit well in the setting of federated learning. Readers may also be interested in a recent survey of this area [51].

Pairwise masking. Bonawitz et al. [11] and follow-up works [9,65], of which BBGLR [9] is the state of the art, adopt the idea of DC networks [22] in which pairwise masks are used to hide individuals' inputs. Such a construction is critical for both client-side and server-side efficiency: first, since the vectors are long, a one-time pad is the most efficient way to encrypt a vector; second, the server just needs to add up the vectors, which achieves optimal server computation (even without privacy, the server at least has to do a similar sum). Furthermore, pairwise masking protocols support flexible input vectors, i.e., one can choose any b (the number of bits for each component in the vector) as desired. Flamingo improves on this line of work by reducing the overall round-trip complexity for multiple sums.
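The mask-cancellation idea at the heart of this line of work is easy to see in a toy example (a sketch only: real protocols expand the masks from PRG seeds derived via pairwise key agreement, and must handle dropouts):

```python
import random

P = 2**32  # arithmetic modulo 2^32, matching 32-bit vector entries

def masked_input(i, x, n, pairwise):
    """Client i adds +mask toward higher-indexed peers, -mask toward lower."""
    m = x
    for j in range(n):
        if j == i:
            continue
        mask = pairwise[min(i, j), max(i, j)]
        m = (m + mask if i < j else m - mask) % P
    return m

n = 5
inputs = [7, 13, 2, 40, 1]
pairwise = {(i, j): random.randrange(P)
            for i in range(n) for j in range(i + 1, n)}
reports = [masked_input(i, inputs[i], n, pairwise) for i in range(n)]

# Each report alone looks uniformly random, but every mask appears
# once with + and once with -, so they cancel in the sum:
assert sum(reports) % P == sum(inputs) % P
```

Because each pairwise mask enters the sum exactly once positively and once negatively, the server recovers the true sum without ever seeing an unmasked input.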
MPC. Works like FastSecAgg [42] use secret-sharing-based MPC to compute sums, which tolerates dropouts, but it has high communication as the inputs in federated learning are large. Other results use non-interactive MPC protocols for addition [65,66] where all the clients establish shares of zero during the setup. Then, when the vectors of clients are requested, each client uses its share to mask the vector and sends it to the server. However, to mask long vectors, the clients need to establish many shares of zeros, which is communication-expensive. Such shares cannot be reused over multiple summation rounds (which is precisely what we address with Flamingo). Furthermore, the non-interactive protocols are not resilient against even one dropout client.

Additively homomorphic encryption. One can construct a single-server aggregation protocol using threshold additively homomorphic encryption [27,29,54,58,59,69]. Each client encrypts its vector as a ciphertext under the public key of the threshold scheme, and sends it to the committee. The committee adds the ciphertexts from all the clients and gives the result to the server. However, this does not work well for large inputs (like the large weight vectors found in federated learning) because encrypting the vector (say, using Paillier or a lattice-based scheme) and performing the threshold decryption will be very expensive.

A recent work [67] uses LWE-based homomorphic PRGs. This is an elegant approach, but it has higher computation and communication costs than works based on pairwise masking, including Flamingo. The higher cost stems from having to choose parameters (e.g., vector length and the size of each component) that satisfy the LWE assumption, and particularly the large LWE modulus that is required.

Multi-round setting. Recent work [36] designs a new multi-round secure aggregation protocol with reusable secrets that is very different from Flamingo's design. The protocol works well for small input domains (e.g., vectors with small values) but cannot efficiently handle large domains as it requires brute-forcing a discrete log during decryption. In contrast, Flamingo does not have any restriction on the input domain. A variant of the above work can also accommodate arbitrary input domains by relying on DDH-based class groups [19] (a more involved assumption than DDH).
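The brute-force bottleneck is easy to see with addition "in the exponent" (a toy sketch under small, insecure parameters; this is not the construction of [36], and encryption randomness is omitted):

```python
# Toy "encryption in the exponent": Enc(x) = g^x mod p. Multiplying
# ciphertexts adds plaintexts, but decryption must brute-force a
# discrete log, so only small input domains are practical.
p = 2**31 - 1          # a Mersenne prime (toy parameter, insecure)
g = 7                  # fixed base for the sketch

def enc(x):
    return pow(g, x, p)

def dec_sum(ct, bound):
    """Recover x from g^x by trying every candidate up to `bound`."""
    acc = 1  # g^0
    for x in range(bound + 1):
        if acc == ct:
            return x
        acc = acc * g % p
    raise ValueError("plaintext sum exceeds brute-force bound")

c = enc(1000) * enc(2345) % p    # homomorphic addition of plaintexts
assert dec_sum(c, bound=10_000) == 3345
```

Decryption time grows linearly with the plaintext range, which is why such schemes suit small value domains but not the 32-bit entries used in federated learning.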
11. Discussion

We have focused our discussion on computing sums, but Flamingo can also compute other functions such as max/min using affine aggregatable encodings [4,8,24,38].

Limitations. Flamingo assumes that the set of all clients (N) involved in a training session is fixed before the training starts and that in each round t some subset S_t from N is chosen. We have not yet explored the case of handling clients that dynamically join the training session.

Another aspect that we have not investigated in this work is that of handling an adaptive adversary that can dynamically change the set of parties that it compromises as the protocol executes. In BBGLR, the adversary can be adaptive across rounds but not within a round; in Flamingo the adversary is static across all the rounds. To our knowledge, an adversary that can be adaptive within a single round has not been considered before in the federated learning setting. It is not clear that existing approaches from other fields [34] can be used, due to different communication models.

Finally, secure aggregation reduces the leakage of individuals' inputs in federated learning but does not fully eliminate it. It is important to understand what information continues to leak. Two recent works in this direction are as follows. Elkordy et al. [30] utilize tools from information theory to bound the leakage with secure aggregation: they found that the amount of leakage reduces linearly with the number of clients. Wang et al. [73] then show a new inference attack against federated learning systems that use secure aggregation in which they are able to obtain the proportion of different labels in the overall training data. While the scope of this attack is very limited, it may inspire more advanced attacks.

Acknowledgments

We thank the S&P reviewers for their comments, which improved the content of this paper. We also thank Fabrice Benhamouda for discussion on elliptic curve scalar multiplication efficiency, Varun Chandrasekaran for advice on machine learning datasets, Yue Guo for answering questions about the ABIDES simulator, and Riad Wahby for pointers on hashing to elliptic curves. We thank Mariana Raykova and Adrià Gascón for helpful discussions about the BBGLR paper. Finally, we thank Dario Pasquini for making us aware of model inconsistency attacks in federated learning. This work was partially funded by NSF grant CNS-2045861, DARPA contract HR0011-17-C0047, and a JP Morgan Chase & Co Faculty Award.

This paper was prepared in part for information purposes by Artificial Intelligence Research Group and the AlgoCRYPT CoE of JPMorgan Chase & Co and its affiliates ("JP Morgan") and is not a product of the Research Department of JP Morgan. JP Morgan makes no representation and warranty whatsoever and disclaims all liability, for the completeness, accuracy, or reliability of the information contained herein. This document is not intended as investment research or investment advice, or a recommendation, offer, or solicitation for the purchase or sale of any security, financial instrument, financial product, or service, or to be used in any way for evaluating the merits of participating in any transaction, and shall not constitute a solicitation under any jurisdiction or to any person, if such solicitation under such jurisdiction or to such person would be unlawful.

References

[1] Cloudflare randomness beacon. https://developers.cloudflare.com/randomness-beacon/.
[2] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems. Technical report, 2015. https://www.tensorflow.org/.
[3] I. Abraham, P. Jovanovic, M. Maller, S. Meiklejohn, and G. Stern. Bingo: Adaptivity and asynchrony in verifiable secret sharing and
distributed key generation. In Proceedings of the International Cryptology Conference (CRYPTO), 2023.
[4] S. Addanki, K. Garbe, E. Jaffe, R. Ostrovsky, and A. Polychroniadou. Prio+: Privacy preserving aggregate statistics via boolean shares. In C. Galdi and S. Jarecki, editors, Proceedings of the International Conference on Security and Cryptography for Networks (SCN), 2022.
[5] E. Anderson, M. Chase, W. Dai, F. B. Durak, K. Laine, S. Sharma, and C. Weng. Aggregate measurement via oblivious shuffling. Cryptology ePrint Archive, Paper 2021/1490, 2021. https://eprint.iacr.org/2021/1490.
[6] S. Angel, A. J. Blumberg, E. Ioannidis, and J. Woods. Efficient representation of numerical optimization problems for SNARKs. In Proceedings of the USENIX Security Symposium, 2022.
[7] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov. How to backdoor federated learning. In Proceedings of the Artificial Intelligence and Statistics Conference (AISTATS), 2020.
[8] A. Beimel, A. Gabizon, Y. Ishai, E. Kushilevitz, S. Meldgaard, and A. Paskin-Cherniavsky. Non-interactive secure multiparty computation. In Proceedings of the International Cryptology Conference (CRYPTO), 2014.
[9] J. Bell, K. A. Bonawitz, A. Gascón, T. Lepoint, and M. Raykova. Secure single-server aggregation with (poly)logarithmic overhead. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2020.
[10] J. Bell, A. Gascón, T. Lepoint, B. Li, S. Meiklejohn, M. Raykova, and C. Yun. ACORN: Input validation for secure aggregation. Cryptology ePrint Archive, Paper 2022/1461, 2022. https://eprint.iacr.org/2022/1461.
[11] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2017.
[12] D. Boneh, E. Boyle, H. Corrigan-Gibbs, N. Gilboa, and Y. Ishai. Lightweight techniques for private heavy hitters. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2021.
[13] E. Brier, J.-S. Coron, T. Icart, D. Madore, H. Randriam, and M. Tibouchi. Efficient indifferentiable hashing into ordinary elliptic curves. In Proceedings of the International Cryptology Conference (CRYPTO), 2010.
[14] D. Byrd, M. Hybinette, and T. H. Balch. ABIDES: Agent-based interactive discrete event simulation environment. https://github.com/abides-sim/abides, 2020.
[15] D. Byrd, M. Hybinette, and T. H. Balch. ABIDES: Towards high-fidelity multi-agent market simulation. In Proceedings of the 2020 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, 2020.
[16] S. Caldas, S. M. K. Duddu, P. Wu, T. Li, J. Konečný, H. B. McMahan, V. Smith, and A. Talwalkar. Leaf: A benchmark for federated settings. https://github.com/TalwalkarLab/leaf.
[17] S. Caldas, S. M. K. Duddu, P. Wu, T. Li, J. Konečný, H. B. McMahan, V. Smith, and A. Talwalkar. Leaf: A benchmark for federated settings. arXiv preprint arXiv:1812.01097, 2018. https://arxiv.org/abs/1812.01097.
[18] J. Canny and S. Sorkin. Practical large-scale distributed key generation. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), 2004.
[19] G. Castagnos and F. Laguillaumie. Linearly homomorphic encryption from DDH. Cryptology ePrint Archive, Paper 2015/047, 2015. https://eprint.iacr.org/2015/047.
[20] T.-H. H. Chan, E. Shi, and D. Song. Privacy-preserving stream aggregation with fault tolerance. In Proceedings of the International Financial Cryptography Conference, 2011.
[21] M. Chase, A. Deshpande, E. Ghosh, and H. Malvai. SEEMless: Secure end-to-end encrypted messaging with less trust. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2019.
[22] D. L. Chaum. The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of Cryptology, 1(1), 1988.
[23] A. R. Chowdhury, C. Guo, S. Jha, and L. van der Maaten. EIFFeL: Ensuring integrity for federated learning. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2022.
[24] H. Corrigan-Gibbs and D. Boneh. Prio: Private, robust, and scalable computation of aggregate statistics. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2017.
[25] S. Das, V. Krishnan, I. M. Isaac, and L. Ren. Spurt: Scalable distributed randomness beacon with transparent setup. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2021.
[26] S. Das, T. Yurek, Z. Xiang, A. Miller, L. Kokoris-Kogias, and L. Ren. Practical asynchronous distributed key generation. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2022.
[27] V. A. Dasu, S. Sarkar, and K. Mandal. PROV-FL: Privacy-preserving round optimal verifiable federated learning. In Proceedings of the ACM Workshop on Artificial Intelligence and Security, 2022.
[28] Y. Desmedt and Y. Frankel. Threshold cryptosystems. In Proceedings of the International Cryptology Conference (CRYPTO), 1989.
[29] T. Elahi, G. Danezis, and I. Goldberg. PrivEx: Private collection of traffic statistics for anonymous communication networks. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2014.
[30] A. R. Elkordy, J. Zhang, Y. H. Ezzeldin, K. Psounis, and S. Avestimehr. How much privacy does federated learning with secure aggregation guarantee? In Proceedings of the Privacy Enhancing Technologies Symposium (PETS), 2023.
[31] P. Feldman. A practical scheme for non-interactive verifiable secret sharing. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), 1987.
[32] R. Gennaro, S. Halevi, H. Krawczyk, and T. Rabin. Threshold RSA for dynamic and ad-hoc groups. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), 2008.
[33] R. Gennaro, S. Jarecki, H. Krawczyk, and T. Rabin. Secure distributed key generation for discrete-log based cryptosystems. In Journal of Cryptology, 2006.
[34] C. Gentry, S. Halevi, H. Krawczyk, B. Magri, J. B. Nielsen, T. Rabin, and S. Yakoubov. YOSO: You only speak once / secure MPC with stateless ephemeral roles. In Proceedings of the International Cryptology Conference (CRYPTO), 2021.
[35] E. N. Gilbert. Random graphs. In The Annals of Mathematical Statistics, 1959.
[36] Y. Guo, A. Polychroniadou, E. Shi, D. Byrd, and T. Balch. MicroFedML: Privacy preserving federated learning for small weights. Cryptology ePrint Archive, Paper 2022/714, 2022. https://eprint.iacr.org/2022/714.
[37] S. Ha, I. Rhee, and L. Xu. CUBIC: A new TCP-friendly high-speed TCP variant. ACM SIGOPS Operating Systems Review, 2008.
[38] S. Halevi, Y. Ishai, E. Kushilevitz, and T. Rabin. Best possible information-theoretic MPC. In Proceedings of the Theory of Cryptography Conference (TCC), 2018.
[39] A. F. Hernández, S. Scott, N. Sullivan, R. S. Wahby, and C. Wood. Hashing to elliptic curves. https://www.ietf.org/archive/id/draft-irtf-cfrg-hash-to-curve-10.html, 2021.
[40] A. Herzberg, S. Jarecki, H. Krawczyk, and M. Yung. Proactive secret sharing or how to cope with perpetual leakage. In Proceedings of the International Cryptology Conference (CRYPTO), 1995.
[41] Y. Hu, K. Hooshmand, H. Kalidhindi, S. J. Yang, and R. A. Popa. Merkle²: A low-latency transparency log system. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2021.
[42] S. Kadhe, N. Rajaraman, O. O. Koyluoglu, and K. Ramchandran. FastSecAgg: Scalable secure aggregation for privacy-preserving federated learning. In ICML Workshop on Federated Learning for User Privacy and Data Confidentiality, 2020.
[43] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N.
Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings, R. G. L. D'Oliveira, H. Eichner, S. El Rouayheb, D. Evans, J. Gardner, Z. Garrett, A. Gascón, B. Ghazi, P. B. Gibbons, M. Gruteser, Z. Harchaoui, C. He, L. He, Z. Huo, B. Hutchinson, J. Hsu, M. Jaggi, T. Javidi, G. Joshi, M. Khodak, J. Konečný, A. Korolova, F. Koushanfar, S. Koyejo, T. Lepoint, Y. Liu, P. Mittal, M. Mohri, R. Nock, A. Özgür, R. Pagh, H. Qi, D. Ramage, R. Raskar, M. Raykova, D. Song, W. Song, S. U. Stich, Z. Sun, A. Theertha Suresh, F. Tramèr, P. Vepakomma, J. Wang, L. Xiong, Z. Xu, Q. Yang, F. X. Yu, H. Yu, and S. Zhao. Advances and open problems in federated learning. In Foundations and Trends in Machine Learning, 2021.
[44] E. Kokoris-Kogias, D. Malkhi, and A. Spiegelman. Asynchronous distributed key generation for computationally-secure randomness, consensus, and threshold signatures. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2020.
[45] S. Krishna, D. Maram, F. Zhang, L. Wang, A. Low, Y. Zhang, A. Juels, and D. Song. CHURP: Dynamic-committee proactive secret sharing. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2019.
[46] A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, 2009. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
[47] A. Krizhevsky, V. Nair, and G. Hinton. The CIFAR-100 dataset. https://www.cs.toronto.edu/~kriz/cifar.html.
[48] F. Lai, X. Zhu, H. V. Madhyastha, and M. Chowdhury. Oort: Efficient federated learning via guided participant selection. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2021.
[49] D. Leung, Y. Gilad, S. Gorbunov, L. Reyzin, and N. Zeldovich. Aardvark: An asynchronous authenticated dictionary with short proofs. In Proceedings of the USENIX Security Symposium, 2022.
[50] H. Lycklama, L. Burkhalter, A. Viand, N. Küchler, and A. Hithnawi. RoFL: Robustness of secure federated learning. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2023.
[51] M. Mansouri, M. Önen, W. B. Jaballah, and M. Conti. SoK: Secure
In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), 2019.
[62] E. Roth, H. Zhang, A. Haeberlen, and B. C. Pierce. Orchard: Differentially private analytics at scale. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2020.
[63] E. Shi, T.-H. H. Chan, E. Rieffel, R. Chow, and D. Song. Privacy-preserving aggregation of time-series data. In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2011.
[64] V. Shoup and R. Gennaro. Securing threshold cryptosystems against chosen ciphertext attack. In Journal of Cryptology, 2002.
[65] J. So, B. Güler, and A. S. Avestimehr. Turbo-aggregate: Breaking the quadratic aggregation barrier in secure federated learning. In Journal on Selected Areas in Information Theory, 2021.
[66] J. So, C. He, C.-S. Yang, S. Li, Q. Yu, R. E. Ali, B. Guler, and S. Avestimehr. LightSecAgg: A lightweight and versatile design for secure aggregation in federated learning. In Proceedings of Machine Learning and Systems, 2022.
[67] T. Stevens, C. Skalka, C. Vincent, J. Ring, S. Clark, and J. Near. Efficient differentially private secure aggregation for federated learning via hardness of learning with errors. In Proceedings of the USENIX Security Symposium, 2022.
[68] A. Tomescu, V. Bhupatiraju, D. Papadopoulos, C. Papamanthou, N. Triandopoulos, and S. Devadas. Transparency logs via append-only authenticated dictionaries. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2019.
[69] S. Truex, N. Baracaldo, A. Anwar, T. Steinke, H. Ludwig, R. Zhang, and Y. Zhou. A hybrid approach to privacy-preserving federated learning. In Proceedings of the ACM Workshop on Artificial Intelligence and Security, 2019.
[70] N. Tyagi, B. Fisch, A. Zitek, J. Bonneau, and S. Tessaro. VeRSA: Verifiable registries with efficient client audits from RSA authenticated dictionaries. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2022.
[71] I. Tzialla, A. Kothapalli, B. Parno, and S. Setty. Transparency
aggregation based on cryptographic schemes for federated learning. dictionaries with succinct proofs of correct operation. In
In Proceedings of the Privacy Enhancing Technologies Symposium Proceedings of the Network and Distributed System Security
(PETS), 2023. Symposium (NDSS), 2022.
[52] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. [72] R. S. Wahby and D. Boneh. Fast and simple constant-time hashing
y Arcas. Communication-efficient learning of deep networks from to the BLS12-381 elliptic curve. In Proceedings of the Conference
decentralized data. In Proceedings of the Artificial Intelligence and on Cryptographic Hardware and Embedded Systems (CHES), 2019.
Statistics Conference (AISTATS), 2017. [73] L. Wang, S. Xu, X. Wang, and Q. Zhu. Eavesdrop the composition
[53] S. Melara, A. Blankstein, J. Bonneau, E. W. Felten, and M. J. proportion of training labels in federated learning.
Freedman. CONIKS: bringing key transparency to end users. In arXiv:1910/06044, 2023. https://arxiv.org/abs/1910.06044.
Proceedings of the USENIX Security Symposium, 2015. [74] H. Yuan and T. Ma. Federated accelerated stochastic gradient
[54] L. Melis, G. Danezis, and E. D. Cristofaro. Efficient private descent. In Neural Information Processing Systems (NeurIPS), 2020.
statistics with succinct sketches. In Proceedings of the Network and [75] K. Zhong, Y. Ma, and S. Angel. Ibex: Privacy-preserving ad
Distributed System Security Symposium (NDSS), 2016. conversion tracking and bidding. In Proceedings of the ACM
[55] L. Melis, C. Song, E. D. Cristofaro, and V. Shmatikov. Exploiting Conference on Computer and Communications Security (CCS),
unintended feature leakage in collaborative learning. In Proceedings 2022.
of the IEEE Symposium on Security and Privacy (S&P), 2019. [76] K. Zhong, Y. Ma, Y. Mao, and S. Angel. Addax: A fast, private,
[56] D. Pasquini, D. Francati, and G. Ateniese. Eluding secure and accountable ad exchange infrastructure. In Proceedings of the
aggregation in federated learning via model inconsistency. In USENIX Symposium on Networked Systems Design and
Proceedings of the ACM Conference on Computer and Implementation (NSDI), 2023.
Communications Security (CCS), 2022. [77] L. Zhu, Z. Liu, and S. Han. Deep leakage from gradients. In
[57] T. Pedersen. A threshold cryptosystem without a trusted party. In Neural Information Processing Systems (NeurIPS), 2019.
Proceedings of the International Conference on the Theory and
Applications of Cryptographic Techniques (EUROCRYPT), 1991.
[58] R. A. Popa, H. Balakrishnan, and A. J. Blumberg. VPriv: Protecting Appendix A.
privacy in location-based vehicular services. In Proceedings of the
USENIX Security Symposium, 2009. Failure and threat model details
[59] R. A. Popa, A. J. Blumberg, H. Balakrishnan, and F. H. Li. Privacy
and accountability for location-based aggregate statistics. In In this section, we give the full details of the dropout
Proceedings of the ACM Conference on Computer and rate and the corruption rate (§2.3).
Communications Security (CCS), 2011.
[60] E. Roth, K. Newatia, Y. Ma, K. Zhong, S. Angel, and A. Haeberlen. Dropout rate. Recall that we upper bound the dropout
Mycelium: Large-scale distributed graph queries with differential rate of the sum contributors (St ) in one round as δ . For
privacy. In Proceedings of the ACM Symposium on Operating
Systems Principles (SOSP), 2021.
decryptors, we consider the dropout rate in one summation
[61] E. Roth, D. Noble, B. H. Falk, and A. Haeberlen. Honeycrisp: round and assume it is at most δD . Note that δ and δD are
Large-scale differentially private aggregation without a trusted core.
individually determined by the server timeout at those steps (recall that in each round, clients in St only participate in the first step; the following two steps only involve decryptors).

Corruption rate. For corruption, we denote the corrupted fraction in St as ηSt and the corrupted fraction among decryptors as ηD. In the Flamingo system, η is given; ηSt and ηD depend on η. Note that the fraction of malicious clients in a chosen subset of [N] (e.g., St, D) may not be exactly η, but rather a random variable η* from a distribution that is parameterized by η, N, and the size of the chosen set. Since the expectation of η* is equal to η, when the size of the chosen set is large (e.g., St), the deviation of η* from η is negligible (i.e., η* is almost equal to η). Therefore, ηSt can be considered equal to η. On the other hand, since D is a small set, we cannot assume ηD is equal to η. Later, in Appendix C, we show how to choose L to ensure that ηD satisfies the inequality required in Theorem 3 with overwhelming probability.

Security parameters. In the following definitions and proofs, we use κ for the information-theoretic security parameter and λ for the computational security parameter.

Protocol Πsetup.

Parties. Clients 1, . . . , N and a server.
Parameters. Number of pre-selected decryptors L, with L = 3ℓ + 1.
Protocol outputs. A set of t clients (2ℓ + 1 ≤ t ≤ 3ℓ + 1) hold a secret sharing of a secret key SK. All the clients in [N] and the server hold the associated public key PK.
• The server and all the clients in [N] invoke Frand and receive a binary string v ←$ {0, 1}^λ.
• The server and all the clients in [N] compute D0 ← CHOOSESET(v, 0, L, N).
• All the clients u ∈ D0 and the server run ΠDKG (Fig. 12).
• The server broadcasts the signed PKs received from the clients in D0 to all the clients in [N].
• A client in [N] aborts if it received fewer than 2ℓ + 1 valid signatures on PKs signed by the parties defined by CHOOSESET(v, 0, L, N).

Figure 11: Setup phase with total number of clients N. Frand is modeled as a random beacon service.

Appendix B.
Full Protocol Description

B.1. Definition of cryptographic primitives

In this section, we formally define the cryptographic primitives used in the Flamingo protocol that are not given in Section 3.1.

Definition 1 (DDH assumption). Given a cyclic group G with order q, let the generator of G be g, and let a, b, c be uniformly sampled elements from Zq. We say that DDH is hard if the two distributions (g^a, g^b, g^ab) and (g^a, g^b, g^c) are computationally indistinguishable.

Definition 2 (ElGamal encryption). Let G be a group of order q in which DDH is hard. The ElGamal encryption scheme consists of the following three algorithms.
• AsymGen(1^λ) → (SK, PK): sample a random element s from Zq, and output SK = s and PK = g^s.
• AsymEnc(PK, h) → (c0, c1): sample a random element y from Zq and compute c0 = g^y and c1 = h · PK^y.
• AsymDec(SK, (c0, c1)) → h: compute h = (c0^SK)^{-1} · c1.
We say that ElGamal encryption is secure if it has CPA security. Note that if the DDH assumption (Def. 1) holds, then ElGamal encryption is secure.

Definition 3 (Authenticated encryption). An authenticated encryption scheme consists of the following algorithms:
• SymAuthGen(1^λ) → k: sample a key k uniformly at random from {0, 1}^λ.
• SymAuthEnc(k, m) → c: take in a key k and a message m, and output a ciphertext c.
• SymAuthDec(k, c): take in a key k and a ciphertext c, and output a plaintext m or ⊥ (decryption fails).
We say that the scheme is secure if it has CPA security and ciphertext integrity.

For simplicity, we use AsymEnc and SymAuthEnc to refer to the encryption schemes.

Definition 4 (Signature scheme). A signature scheme consists of the following algorithms:
• SGen(1^λ) → (sk, pk): generate a pair of signing key sk and verification key pk.
• Sign(sk, m) → σ: take in a signing key sk and a message m, and output a signature σ.
• VerSig(pk, m, σ) → b: take in a verification key pk, a message m, and a signature σ, and output b = 1 if the signature is valid and b = 0 otherwise.
We say that the signature scheme is secure if the probability that, given m1, . . . , mz, an attacker who can query the signing challenger finds a valid (m′, σ′) with m′ ∉ {m1, . . . , mz} is negligible.

B.2. Setup phase and distributed key generation

The setup protocol is conceptually simple, as shown in Figure 11. A crucial part of the setup phase is the distributed key generation (DKG). We first describe the algorithms used in DKG.

Algorithms. Let G be a group with order q in which discrete log is hard. The discrete-log-based DKG protocol builds on Feldman verifiable secret sharing [31], which we provide below. The sharing algorithm takes in the threshold parameters L, ℓ, and a secret s ∈ Zq, chooses a polynomial with random coefficients except the constant term, i.e., p(X) = a0 + a1·X + . . . + aℓ·X^ℓ (a0 = s), and outputs the commitments Ak = g^{ak} ∈ G for k = 0, 1, . . . , ℓ. The j-th share sj is p(j) for j = 1, . . . , L.
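As a concrete illustration, the Feldman sharing just described, together with the verification check defined next in the text, can be sketched in Python over a toy group (the parameters are illustrative only and far too small for real use):

```python
import random

# Toy subgroup parameters: g generates the order-q subgroup of Z_p*.
# (Illustrative only; real deployments use large, cryptographically safe groups.)
p, q, g = 23, 11, 4

def fshare(s, ell, L):
    """FShare(s, ell, L): Shamir-share s mod q with threshold ell among
    parties 1..L, and publish Feldman commitments A_k = g^{a_k}."""
    coeffs = [s % q] + [random.randrange(q) for _ in range(ell)]
    shares = {j: sum(a * j**k for k, a in enumerate(coeffs)) % q
              for j in range(1, L + 1)}
    comms = [pow(g, a, p) for a in coeffs]
    return shares, comms

def fverify(j, s_j, comms):
    """FVerify: check g^{s_j} == prod_k (A_k)^{j^k} (mod p)."""
    rhs = 1
    for k, A_k in enumerate(comms):
        rhs = rhs * pow(A_k, j**k, p) % p
    return pow(g, s_j, p) == rhs

shares, comms = fshare(7, ell=2, L=7)               # L = 3*ell + 1
assert all(fverify(j, s, comms) for j, s in shares.items())
assert not fverify(1, (shares[1] + 1) % q, comms)   # a tampered share fails
```

The check works because the exponent arithmetic is implicitly modulo q, the order of g.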
To verify the j-th share against the commitments, the verification algorithm takes in sj and a set of commitments {Ak}_{k=0}^{ℓ}, and checks if

g^{sj} = ∏_{k=0}^{ℓ} (Ak)^{j^k}.

We define the above algorithms as
• FShare(s, ℓ, L) → {sj}_{j=1}^{L}, {Ak}_{k=0}^{ℓ},
• FVerify(sj, {Ak}_{k=0}^{ℓ}) → b, where b ∈ {0, 1}.

The GJKR-DKG uses a variant of the above algorithms, PShare and PVerify, based on Pedersen commitments, for security reasons [33]. The PShare algorithm chooses two random polynomials

p(X) = a0 + a1·X + . . . + aℓ·X^ℓ, a0 = s,
p′(X) = b0 + b1·X + . . . + bℓ·X^ℓ,

and outputs

{p(j)}_{j=1}^{L}, {p′(j)}_{j=1}^{L}, Ck := g^{ak} h^{bk} for k = 0, . . . , ℓ,

where g, h ∈ G. To verify against the share sj = p(j), PVerify takes in s′j = p′(j) and {Ck}_{k=0}^{ℓ}, and checks if

g^{sj} h^{s′j} = ∏_{k=0}^{ℓ} (Ck)^{j^k}.

The algorithms PShare and PVerify can thus be defined analogously to FShare and FVerify:
• PShare(s, ℓ, L) → {sj}_{j=1}^{L}, {s′j}_{j=1}^{L}, {Ck}_{k=0}^{ℓ},
• PVerify(sj, s′j, {Ck}_{k=0}^{ℓ}) → b, where b ∈ {0, 1}.

Protocol. We give the modified DKG protocol ΠDKG, adapted from GJKR-DKG, in Figure 12. The participating parties can drop out, as long as ηD + δD < 1/3.

Correctness and security. We analyze the properties of ΠDKG in this section. We start by revisiting the correctness and security definitions of GJKR-DKG, and then discuss how our definition differs from theirs because of a weakening of the communication model. In GJKR-DKG, correctness is threefold:
1) All subsets of honest parties define the same unique secret key.
2) All honest parties have the same public key.
3) The secret key is uniformly random.
Security means that no information about the secret key can be learned by the adversary except for what is implied by the public key. For ΠDKG, if the server is honest, then our communication model (§2.3) is equivalent to having a fully synchronous channel, so in this case the correctness and security properties of the prior work hold. When the server is malicious, we show that ΠDKG satisfies the following correctness (C1, C2, C3, C4) and security (S) properties:
C1. Each honest party either has no secret at the end or agrees on the same QUAL with the other honest parties.
C2. The agreed QUAL set defines a unique secret key.
C3. The secret key defined by QUAL is uniformly random.
C4. Each honest party either has no public key or outputs the same public key as the other honest parties.
S. Malicious parties learn no information about the secret key except for what is implied by the public key.

Lemma 2. Let the participants in DKG be L parties and a server. If δD + ηD < 1/3, then under the communication model defined in Section 2.2, protocol ΠDKG (Fig. 12) has properties C1, C2, C3, C4, and S in the presence of a malicious adversary controlling the server and up to an ηD fraction of the parties.

Proof. Since L = 3ℓ + 1, and by δD + ηD < 1/3, at most ℓ parties are malicious (the dropouts can be treated as malicious parties who do not send the prescribed messages). We first show in which cases the honest parties will have no share. Note that the parties excluded from D2 are those who are honest but did not receive a complaint against a malicious party who performed wrong sharing; and the parties excluded from D3 are those who complained about i in step (b) but did not receive shares from i at this step. If the server drops messages (sent from one online honest party to another) in the above two cases, then in the cross-check step the honest parties will not receive more than 2ℓ + 1 QUALs, and hence will abort. In this case, the honest parties have no share at the end.

Now we prove C1 by contradiction. Suppose there are two honest parties P1 and P2 who hold secrets at the end of the protocol (did not abort) and have different QUALs. This means P1 received at least 2ℓ + 1 identical QUAL sets S1, and P2 received at least 2ℓ + 1 identical QUAL sets S2. W.l.o.g., assume that there are ℓ − v malicious parties (v ≥ 0). Among the 2ℓ + 1 sets S1, at least ℓ + 1 + v are from honest parties. Similarly, among the 2ℓ + 1 sets S2, at least ℓ + 1 + v are from honest parties different from the above (since an honest party cannot send both S1 and S2). However, note that we have in total 2ℓ + 1 + v honest parties, which yields a contradiction.

Recall that at most ℓ parties are malicious, so the QUAL set has at least ℓ + 1 parties, and since we have C1, C2 is guaranteed. Moreover, since QUAL contains at least one honest party, the secret key is uniform (C3). C4 follows exactly from the work of GJKR. The proof for S is the same as for GJKR, except that the simulator additionally simulates the agreement protocol in Lemma 3.

Remark. An important difference between ΠDKG and standard DKG protocols is that we allow aborts and allow honest parties to not have shares at the end of the protocol. When some honest parties do not have shares of the secret key, the server is still able to get the sum (decryption still works) because malicious parties hold the shares.
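The Pedersen-based pair PShare/PVerify admits an analogous toy sketch (again with illustrative, insecure parameters; in a real instantiation the discrete log of h base g must be unknown):

```python
import random

# Toy parameters: g and h both generate the order-q subgroup of Z_p*.
# (Both are fixed purely for illustration, so this is NOT secure.)
p, q, g, h = 23, 11, 4, 9

def pshare(s, ell, L):
    """PShare: share s with a second blinding polynomial p', and publish
    Pedersen commitments C_k = g^{a_k} h^{b_k}."""
    a = [s % q] + [random.randrange(q) for _ in range(ell)]
    b = [random.randrange(q) for _ in range(ell + 1)]
    ev = lambda cs, j: sum(c * j**k for k, c in enumerate(cs)) % q
    sh = {j: ev(a, j) for j in range(1, L + 1)}
    sh_blind = {j: ev(b, j) for j in range(1, L + 1)}
    comms = [pow(g, ak, p) * pow(h, bk, p) % p for ak, bk in zip(a, b)]
    return sh, sh_blind, comms

def pverify(j, s_j, s2_j, comms):
    """PVerify: check g^{s_j} h^{s'_j} == prod_k (C_k)^{j^k} (mod p)."""
    rhs = 1
    for k, C_k in enumerate(comms):
        rhs = rhs * pow(C_k, j**k, p) % p
    return pow(g, s_j, p) * pow(h, s2_j, p) % p == rhs

sh, sh_blind, comms = pshare(5, ell=2, L=7)
assert all(pverify(j, sh[j], sh_blind[j], comms) for j in sh)
```

Unlike the Feldman commitments, the Pedersen commitments statistically hide the coefficients, which is what GJKR rely on for simulatability.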
[Figure 12: Protocol ΠDKG based on discrete log. Parties in QUAL sign PK using their own signing keys and send the signed PK to the server.]
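The core computation behind a discrete-log DKG such as ΠDKG can be sketched under strong simplifying assumptions (every party is honest and lands in QUAL, no complaint or cross-check rounds; toy group parameters):

```python
import random

# Toy group as before; illustrative only.
p, q, g = 23, 11, 4
ell, L = 2, 7          # L = 3*ell + 1

def deal(z):
    """Shamir-share a contribution z among parties 1..L (degree ell)."""
    coeffs = [z] + [random.randrange(q) for _ in range(ell)]
    return [sum(a * j**k for k, a in enumerate(coeffs)) % q
            for j in range(1, L + 1)]

# Every party contributes a random value and shares it with everyone.
contrib = [random.randrange(q) for _ in range(L)]
dealt = [deal(z) for z in contrib]

# Party j's share of SK is the sum of the shares dealt to it.
sk_shares = [sum(d[j] for d in dealt) % q for j in range(L)]

# No single party ever sees SK = sum of contributions; PK = g^SK.
sk = sum(contrib) % q
pk = pow(g, sk, p)

def interp_zero(points):
    """Lagrange-interpolate a degree-ell polynomial at 0, mod q."""
    acc = 0
    for u, su in points:
        num = den = 1
        for v, _ in points:
            if v != u:
                num = num * (-v) % q
                den = den * (u - v) % q
        acc = (acc + su * num * pow(den, -1, q)) % q
    return acc

# Sanity check only: any ell+1 shares determine SK (the protocol itself
# never reconstructs SK in one place).
assert interp_zero([(j + 1, sk_shares[j]) for j in range(ell + 1)]) == sk
```

The real ΠDKG additionally runs the Feldman/Pedersen verification, complaint handling, and the QUAL agreement described above; this sketch only shows why the summed shares form a sharing of a uniformly random key.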
Collection phase: Πsum for round t out of T total rounds

Initial state from setup phase: each client i ∈ [N] holds a value v and the public key PK = g^s, where SK = s; each decryptor u ∈ D additionally holds a Shamir share of SK (threshold ℓ with 3ℓ + 1 = L). We require δD + ηD < 1/3.

Steps for round t:

1. Report step.
Server performs the following:
  Compute a set Qgraph ← CHOOSESET(v, t, nt, N) and a graph Gt ← GENGRAPH(v, t, Qgraph); store {At(i)}_{i∈Qgraph} computed from Gt.
  Notify each client i ∈ Qgraph that collection round t begins.
Each client i ∈ Qgraph performs the following:
  Compute Q^local_graph ← CHOOSESET(v, t, nt, N), and if i ∉ Q^local_graph, ignore this round.
  Read from the PKI g^{aj} for j ∈ At(i), and compute ri,j by computing (g^{aj})^{ai} and mapping it to {0, 1}^λ.
  Sample mi,t ←$ {0, 1}^λ and compute hi,j,t ← PRF(ri,j, t) for j ∈ At(i).
  Send to the server a message msgi,t consisting of
    Veci,t = ⃗xi,t + PRG(mi,t) + Σ_{j∈At(i)} ±PRG(hi,j,t),
    SymAuthEnc(ki,u, mi,u,t∥t) for u ∈ D, and AsymEnc(PK, hi,j,t) for j ∈ At(i),
  where mi,u,t ← Share(mi,t, ℓ, L), At(i) ← FINDNEIGHBORS(v, St, i), and AsymEnc (ElGamal) and SymAuthEnc (authenticated encryption) are defined in Appendix B.1;
  along with the signatures σi,j,t ← Sign(ski, ci,j,t∥t) for all ciphertexts ci,j,t = AsymEnc(PK, hi,j,t), ∀j ∈ At(i).

2. Cross-check step.
Server performs the following:
  Denote the set of clients that respond within the timeout as Qvec.
  Compute the partial sum z̃t = Σ_{i∈Qvec} Veci,t.
  Build decryption request req (req consists of clients in St to be labeled):
    Initialize an empty set Ei for each i ∈ Qgraph, and
    if i ∈ Qvec, label i with "online", and add SymAuthEnc(ki,u, mi,u,t∥t) to Ei, where ki,u is derived from the PKI (Appendix B.1);
    else label i with "offline", and add {(AsymEnc(PK, hi,j,t), σi,j,t)}_{j∈At(i)∩Qvec} to Ei.
  Send to each u ∈ D the request req and the Ei of all clients i ∈ Qgraph.
Each decryptor u ∈ D performs the following:
  Upon receiving a request req, compute σ*u ← Sign(sku, req∥t), and send (req, σ*u) to all other decryptors via the server.

3. Reconstruction step.
Each decryptor u ∈ D performs the following:
  Ignore messages with signatures (σi,j,t or σ*u) with a round number other than t.
  Upon receiving a message (req, σ*u′), run b ← VerSig(pku′, req∥t, σ*u′). Ignore the message if b = 0.
  Continue only if u received 2ℓ + 1 or more identical req messages that were not ignored. Denote this message as req*.
  For req*, continue only if
    each client i ∈ St is labeled either "online" or "offline";
    the number of "online" clients is at least (1 − δ)nt;
    all the "online" clients are connected in the graph;
    each online client i has at least k online neighbors such that η^k < 2^−κ.
  For each i ∈ Qgraph:
    for each SymAuthEnc(ki,u, mi,u,t∥t) in Ei, use ki,u (derived from the PKI) to decrypt, and send mi,u,t to the server if the decryption succeeds;
    for each (AsymEnc(PK, hi,j,t), σi,j,t) ∈ Ei, parse it as ((c0, c1), σ) and send c0^{su} to the server if VerSig(pki, (c0, c1)∥t, σ) outputs 1.
Server completes the sum:
  Denote the set of decryptors whose messages have been received as U. Compute a set of interpolation coefficients {βu}_{u∈U} from U.
  For each i ∈ Qgraph, reconstruct the mask mi,t or {hi,j,t}_{j∈At(i)∩Qvec}:
    for each parsed (c0, c1) meant for hi,j,t in Ei, compute hi,j,t as c1 · (∏_{u∈U} (c0^{su})^{βu})^{−1};
    for each set of received shares {mi,u,t}_{u∈U}, compute mi,t as Recon({mi,u,t}_{u∈U}).
  Output zt = z̃t − Σ_{i online} PRG(mi,t) + Σ_{i offline} Σ_{j∈At(i)∩Qvec} ±PRG(hi,j,t).
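The reconstruction-step decryption above (each decryptor returns c0^{su}, and the server combines the responses with interpolation coefficients βu) can be sketched as follows, under illustrative toy-group parameters:

```python
import random

# Toy sketch of threshold ElGamal reconstruction (illustrative group only).
p, q, g = 23, 11, 4
ell, L = 2, 7

def shamir(s):
    coeffs = [s] + [random.randrange(q) for _ in range(ell)]
    return {u: sum(a * u**k for k, a in enumerate(coeffs)) % q
            for u in range(1, L + 1)}

def betas(users):
    """Interpolation coefficients beta_u for evaluating at 0, mod q."""
    out = {}
    for u in users:
        num = den = 1
        for v in users:
            if v != u:
                num = num * (-v) % q
                den = den * (u - v) % q
        out[u] = num * pow(den, -1, q) % q
    return out

sk = 5
pk = pow(g, sk, p)
shares = shamir(sk)

# A client ElGamal-encrypts a group element h under PK (Definition 2).
h = pow(g, 9, p)
y = random.randrange(1, q)
c0, c1 = pow(g, y, p), h * pow(pk, y, p) % p

# Each responding decryptor u sends c0^{s_u}; any ell+1 responses suffice.
U = [1, 4, 6]
B = betas(U)
prod = 1
for u in U:
    prod = prod * pow(pow(c0, shares[u], p), B[u], p) % p

# Server: h = c1 * (prod_{u in U} (c0^{s_u})^{beta_u})^{-1}
assert c1 * pow(prod, -1, p) % p == h
```

The product equals c0^SK because the exponents Σ βu·su interpolate the sharing polynomial at zero, so no decryptor ever needs SK itself.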
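The report-step masking relies on the pairwise masks canceling in the server's sum. A toy numeric sketch, using a complete communication graph and a hash-based stand-in PRG (both simplifications: Flamingo masks only over the neighbor set At(i), and its PRG/seeds are cryptographic):

```python
import hashlib

MOD = 2**16
DIM = 4

def prg(seed):
    """Toy PRG: deterministically expand a seed string into a vector
    (illustrative, not cryptographically secure)."""
    return [int.from_bytes(hashlib.sha256(f"{seed}|{c}".encode())
                           .digest()[:2], "big") % MOD
            for c in range(DIM)]

n = 4
inputs = {i: [(i + 1) * 10] * DIM for i in range(n)}

def masked_vec(i):
    """Client i's report: x_i + PRG(m_i) + sum_j +-PRG(h_ij), with sign +1
    if i < j and -1 if i > j; the pairwise seed h_ij is shared by i and j."""
    v = list(inputs[i])
    for j in range(n):
        if j == i:
            continue
        mask = prg(f"h-{min(i, j)}-{max(i, j)}")
        sign = 1 if i < j else -1
        v = [(a + sign * b) % MOD for a, b in zip(v, mask)]
    return [(a + b) % MOD for a, b in zip(v, prg(f"m-{i}"))]

# Server: pairwise masks cancel in the sum; subtracting the (reconstructed)
# individual masks of the online clients leaves the sum of the inputs.
total = [0] * DIM
for i in range(n):
    total = [(a + b) % MOD for a, b in zip(total, masked_vec(i))]
for i in range(n):
    total = [(a - b) % MOD for a, b in zip(total, prg(f"m-{i}"))]

assert total == [100] * DIM   # 10 + 20 + 30 + 40 per coordinate
```

When a client drops out, its pairwise masks no longer cancel, which is exactly why the reconstruction step recovers the hi,j,t of offline clients instead of their individual masks.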
Number of nodes n                              128    512    1024
Parameter ϵ (failure probability 10^−6)        0.11   0.03   0.02
Parameter ϵ (failure probability 10^−12)       0.25   0.06   0.03

Figure 14: Parameters ϵ to ensure random graph connectivity.

(given the set D, we can compute {βu}_{u∈D}). Note that the same issue regarding the communication model for DKG also exists here, but the same relaxation applies.

Here we require that ηD + δD < 1/3 for both D and Dnew. As a result, each client j in a subset D ⊆ Dnew holds a share su,j. Each receiving decryptor j ∈ D computes s′j = Σ_{u∈Dold} βu · su,j, where each βu is some fixed constant.

Since the sharing part is exactly the same as in ΠDKG and the share combination happens locally, the same correctness and security arguments as for DKG apply. Specifically, for correctness, each honest party either has no secret at the end or agrees on the same QUAL with the other honest parties, and the QUAL defines the same unique secret key as before resharing. For security, a malicious adversary does not learn any information, but can cause the protocol to abort.

Appendix C.
Requirements on Parameters

C.1. The number of decryptors

In this section, we show how to choose L such that, given N, η, and δD, we can guarantee that fewer than 1/3 of the L chosen decryptors are malicious. Note that δD is given because it can be controlled by the server, i.e., the server can decide how long to wait for the decryptors to respond.

To guarantee 2δD + ηD < 1/3 (Theorem 3), a necessary condition is that η < 1/3 − 2δD. This can be formalized as a probability question: given N clients of which an η fraction are malicious, randomly choose L clients (decryptors) from them; the number of malicious clients X among the decryptors can be upper bounded using a Chernoff bound as

Pr[X ≥ (η + (1/3 − 2δD − η))L] ≤ e^{−2L(1/3−η−2δD)²}.

For η and δD both being 1%, the choice of L = 60 (Section 6) gives 10^−6 probability. If we double L to 120, then this guarantees 10^−12 probability.

C.2. Proof of Lemma 1

The algorithm in Figure 1 gives a random graph G(n, ϵ). A known result in random graphs is that, when the edge probability ϵ > (1 + ω) ln n / n, where ω is an arbitrary positive value, the graph is almost surely connected when n is large. Note that BBGLR also uses this observation to build a graph with significantly fewer neighbors than a prior work by Bonawitz et al. [11]; but in their work, since the clients choose their neighbors, the resulting graph is a biased one, and they only guarantee that there are no small isolated components in the graph.

Concretely, from Gilbert [35], let g(n, ϵ) be the probability that the graph G(n, ϵ) is connected, which can be computed recursively as

g(n, ϵ) = 1 − Σ_{i=1}^{n−1} (n−1 choose i−1) g(i, ϵ) (1 − ϵ)^{i(n−i)}.

We numerically evaluate the resulting disconnection (failure) probability 1 − g(n, ϵ) for different ϵ in Figure 14. For example, when n = 1024, to ensure less than 10^−6 failure probability, we need ϵ ≥ 0.02; hence the number of neighbors a client needs when δ = η = 0.01 is at least ⌈(ϵ + δ + η)n⌉ = 41. Recall that when η and δD are both 1%, the lower bound on the number of decryptors L is roughly 60 (Appendix C.1).

Appendix D.
Security Proofs

D.1. Security definition

We say a protocol Π securely realizes an ideal functionality F in the presence of a malicious adversary A if there exists a probabilistic polynomial-time algorithm, or simulator, S, such that for any probabilistic polynomial-time A, the distribution of the output of the real-world execution of Π is (statistically or computationally) indistinguishable from the distribution of the output of the ideal-world execution invoking F; the output of both worlds' execution includes the inputs and outputs of the honest parties and the view of the adversary A. In our proof, we focus on the computational notion of security. Note that S in the ideal world has one-time access to F, and has the inputs of the corrupted parties controlled by A.

D.2. Ideal functionalities

We provide the ideal functionality for Flamingo in Figure 4. Looking ahead in the proof, we also need to define an ideal functionality for the setup phase and an ideal functionality for a single round in the collection phase, which we give as Figure 3 and Figure 15.

We model the trusted source of initial randomness as a functionality Frand; that is, a party or ideal functionality calling Frand will receive a uniformly random value v in {0, 1}^λ, where λ is the length of the PRG seed.

D.3. Proof of Theorem 1

The ideal functionality Fsetup for the setup phase is defined in Figure 3. Depending on whether the server is corrupted or not, we have the following two cases.
1) When the server is not corrupted, the communication model is equivalent to a secure broadcast channel. By the security of GJKR, Πsetup securely realizes Fsetup.
2) When the server is corrupted, we build a simulator for the adversary A. We start by listing the messages that A sees throughout the setup phase:
• a random value v from Frand;
• all the messages in ΠDKG that are sent via the server;
• all the messages in ΠDKG that are seen by the corrupted decryptors.

The simulator first calls Frand and receives a value v. Then the simulator interacts with A acting as the honest decryptors. The simulator aborts if any honest decryptor would abort in ΠDKG. There are two ways that A can cheat: 1) A cheats in ΠDKG; in Appendix B.2, we show that the simulator can simulate the view of A. 2) A cheats outside ΠDKG; this means A chooses a wrong set of decryptors, or it broadcasts wrong signatures. So the simulator aborts as long as it does not receive 2ℓ + 1 valid signatures on PKs signed by the set defined by v.

Finally, note that our threat model (§2.3) assumes that the server is also controlled by the adversary, i.e., the first case is not needed here; but it will be useful when we analyze the robust version in Appendix E.1.

D.4. Proof of Theorem 2

The proof for dropout resilience is rather simple: in the setup phase, at most a δD fraction of the L selected decryptors drop out; then, in one round of the collection phase, another δD fraction of decryptors can drop out. Since 2δD + ηD < 1/3 and 3ℓ + 1 = L (Fig. 12), the online decryptors can always help the server reconstruct the secrets.

D.5. Proof of Theorem 3

We first present the proof for a single round: the collection protocol Πsum (Fig. 13) for round t securely realizes the ideal functionality Fsum(t) (Fig. 15) in the random oracle model. From the ideal functionality Fsum(t), we can see that the output sum is determined not by the actual dropout set Ot, but by the set Mt sent by the adversary.

Functionality Fsum(t)

Parties: clients in St and a server.
Parameters: dropout rate δ and malicious rate η over nt clients.
• Fsum(t) receives a set Ot such that |Ot|/|St| ≤ δ, from the adversary A a set of corrupted parties C, and ⃗xi,t from each client i ∈ St \ (Ot ∪ C).
• Fsum(t) sends St and Ot to A, and asks A for a set Mt: if A replies with Mt ⊆ St \ Ot such that |Mt|/|St| ≥ 1 − δ, then Fsum(t) computes ⃗zt = Σ_{i∈Mt\C} ⃗xi,t and continues; otherwise Fsum(t) sends abort to all the honest parties.
• Depending on whether the server is corrupted by A:
  • if the server is corrupted by A, then Fsum(t) outputs ⃗zt to all the parties corrupted by A;
  • if the server is not corrupted by A, then Fsum(t) asks A for a shift ⃗at and outputs ⃗zt + ⃗at to the server.

Figure 15: Ideal functionality for round t in the collection phase.

In the proof below for a single round, for simplicity, we omit the round number t when there is no ambiguity. We assume the adversary A controls a set of clients in [N], with the constraint 2δD + ηD < 1/3. Denote the set of corrupted clients in [N] as C; as before, the set of decryptors is D. The malicious decryptors form a set C ∩ D with |C ∩ D| < L/3. From the analysis in Appendix A, we have |C|/|St| ≤ η.

Case 1. We start with the case where the server is corrupted by A. We construct a simulator S in the ideal world that runs A as a subroutine. We assume both the simulator S and the ideal functionality Fsum(t) (Fig. 15) have access to an oracle Rdrop that provides the dropout sets Ot. In other words, the dropout set is not provided ahead of the protocol but instead provided during the execution (a similar notion appeared in prior work [11]). Assume that in the ideal world, initially a secret key SK is shared among at least 2L/3 clients in D. The simulation for round t is as follows.
1) S receives a set Ot from the oracle Rdrop.
2) S receives a set Mt from the adversary A.
3) S obtains ⃗zt from Fsum(t).
4) (Report step) S interacts with A as in the report step, acting as the honest clients i ∈ Mt \ C with masked inputs ⃗x′i such that Σ_{i∈Mt\C} ⃗x′i = ⃗zt. Here the input vector ⃗x′i and the mask mi are generated by S itself, and the pairwise secrets are obtained by querying the PKI.
5) (Cross-check step) S interacts with A acting as the honest decryptors as in the cross-check step.
6) (Reconstruction step) S interacts with A acting as the honest decryptors in the reconstruction step, where S uses the shares of the secret key SK to perform the decryption of honest parties.
7) In the above steps, if all the honest decryptors would abort per the protocol prescription, then S sends abort to Fsum(t), outputs whatever A outputs, and halts.

We construct a series of hybrid executions, starting from the real-world execution and ending at the ideal-world execution.

Hybrid 1. The view of A in the real-world execution is the same as the view of A in the ideal world when S would have the actual inputs of the honest parties, {⃗xi}_{i∈St\(C∪Ot)}, the pairwise and individual masks, and the shares of the secret key SK. (S would in fact know SK in full, because 3ℓ + 1 = L and the number of honest parties is 2ℓ + 1 or more.)

Hybrid 2. Instead of using the actual secret key s, S replaces s with 0 and sends the corresponding |C ∩ D| < L/3 shares of 0 in Zq to A. By the property of Shamir secret sharing, the joint distributions of fewer than L/3 shares (recall that the threshold is ℓ, where 3ℓ + 1 = L) of s and of 0 are the same. Hence this hybrid has a distribution identical to the previous hybrid.

Hybrid 3. Instead of using the actual pairwise masks between honest parties, S samples a random pairwise mask r′i,j from {0, 1}^λ and computes the corresponding ElGamal
ciphertext as (c′0, c′1). S does not change the pairwise mask between a client controlled by A and an honest client (such a pairwise mask can be obtained by querying the PKI to get g^{aj} for an honest client j and computing (g^{aj})^{ai} for a malicious client i). We argue that A's view in this hybrid is computationally indistinguishable from the previous one as follows.

First, we need to assume that the mapping from G to {0, 1}^λ is a random oracle. Specifically, in the real world the mask ri,j is computed by applying the mapping to g^{ai·aj}, while in the ideal world the mask r′i,j is randomly sampled. Let Mt be the set of online clients that A labels in the real world (recall that the server is controlled by A). A in both worlds observes PRF(ri,j, t) between a client i outside Mt and a client j in Mt; hence we require ri,j to be random as a PRF key.

Second, A in the ideal world does not observe the pairwise masks between clients in Mt, but only the ciphertexts generated from r′i,j for those clients; and the distribution of the ciphertexts is computationally indistinguishable from what A observes in the real world, by the security of ElGamal encryption (Definition 2).

Hybrid 4. Instead of using the symmetric encryption (SymAuthEnc) of the shares of the actual individual mask mi, S uses the symmetric encryption of shares of a randomly sampled m′i from {0, 1}^λ as the individual mask. Looking ahead in the proof, we also need to model the PRG as a random oracle RPRG, which can be thought of as a "perfect PRG" (see more details in prior work [11]). For all i ∈ Mt \ C, S samples Veci at random and programs RPRG to set PRG(m′i) as

PRG(m′i) = Veci − ⃗xi − Σ_j ±PRG(r′i,j),

where the vectors Veci are the vectors observed in the real world. The view of A regarding the Veci's in this hybrid is statistically indistinguishable from that in the previous hybrid. Moreover, A learns mi in the clear for those i ∈ Mt in both worlds, and the distributions of those mi's in the ideal and real worlds are identical; for those mi's with i ∉ Mt, which A should not learn, by the semantic security of the symmetric encryption scheme (Definition 3) and the threshold ℓ < L/3, A's view in this hybrid is computationally indistinguishable from the previous one.

Hybrid 7. Same as the previous hybrid, except that the label messages req from honest decryptors in the last step are replaced with the offline/online labels obtained from the oracle. In all steps, A may cheat by sending invalid signatures to S; in this case S will abort. In the cross-check and reconstruction steps, there are the following ways that A may cheat:
1) A sends multiple different Mt's to Fsum. S in the ideal world will simulate the protocol in Lemma 3 below, and output whatever the protocol outputs.
2) A sends to Fsum a set Mt with fewer than (1 − δ)nt clients, or the clients in Mt are disconnected, or there is a client in Mt with fewer than k online neighbors. In this case, S will abort, which is the same as in the real-world execution.

The last hybrid is exactly the ideal-world execution.

To better analyze the probability that the simulation succeeds, we use κ1 to denote the security parameter for graph connectivity (Lemma 1) and κ2 to denote the security parameter for the third check in the cross-check round (§4.4). The simulation can fail in two ways: 1) the graph becomes disconnected (even when the server is honest); 2) there exists a client in St such that all of its online neighbors are malicious. The former happens with probability 2^−κ1. The latter is bounded by n · 2^−κ2: the probability that the opposite of event 2) happens is (1 − η^k)^n ≈ 1 − nη^k (assuming η^k is very small); thus the failure probability nη^k ≤ n · 2^−κ2.

Case 2. For the case where the server is not corrupted by A, the simulation is the same as in Case 1, except that the simulator needs to compose the "shifts" added by A in each step to hit the value ⃗at.

This completes the proof that, for any single round t ∈ [T], the protocol Πsum for round t securely realizes Fsum(t) when δD + ηD < 1/3, except with probability 2^−κ1 + n · 2^−κ2 ≤ n · 2^{−κ+1}, where κ = min{κ1, κ2}.

Lemma 3. Assume there exists a PKI and a secure signature scheme, and there are 3ℓ + 1 parties with at most ℓ colluding malicious parties. Each party has an input bit of 0 or 1 from a server. Then there exists a one-round protocol for honest
Hybrid 5. S now instead of programming the oracle RPRG parties to decide if the server sent the same value to all the
as in the previous hybrid, it programs the oracle as honest parties.
X Proof. If an honest party receives 2ℓ + 1 or more messages
PRG(m′i ) = Veci − ⃗xi′ − ±PRG(ri,j′
),
with the same value, then it means the server sends to all
where ⃗xi′ ’s are chosen such that i∈Mt \C ⃗xi = i∈Mt \C ⃗xi′ .
P P honest parties the same value. If an honest party receives
From Lemma 6.1 in a prior work [11] under the same less than 2ℓ + 1 messages with the same value, it will abort;
setting, the distribution of A’s view in this hybrid is sta- in this case the server must have sent different messages to
tistically indistinguishable to that in the previous hybrid, different honest parties.
except probability 2−κ : if the graph becomes disconnected
or there is an isolated node after removing Ot and C from St , Remark. The above analysis of the agreement protocol
then the server learns xi in the clear and thus can distinguish shows where the threshold 1/3 comes from. Consider the
between the two worlds. When A cheats by submitting Mt to case where the threshold is 1/2 and 2ℓ + 1 = L. For a target
S where the graph formed by nodes in Mt is not connected, client, the (malicious) server can tell ℓ/2 decryptors that the
S simulates honest decryptors and output abort. In this case, client is online and tell another ℓ/2 + 1 decryptors that the
the distribution of A′ s view in the ideal world is the same client is offline. Then combined with the ℓ malicious de-
as that in the real world. cryptors’ shares, the server has ℓ/2 + ℓ shares to reconstruct
individual mask, and ℓ/2 + 1 + ℓ shares to reconstruct the pairwise mask.

Multi-round security. Our threat model assumes that A controls ηN clients throughout the T rounds (§2.3). There are two things we need to additionally consider on top of the single-round proof: 1) the set S_t is generated from a PRG, and 2) the pairwise mask h_{i,j,t} is computed from PRF(r_{i,j}, t). For the former, we program R_PRG (as in the single-round proof) such that ChooseSet outputs S_t.

Now we analyze the per-round pairwise masks. Let the distribution of the view of A in round t be ∆_t. We next show that if there exists an adversary B and two round numbers t1, t2 ∈ [T] such that B can distinguish between ∆_{t1} and ∆_{t2}, then we can construct an adversary B' who can break PRF security. We call the challenger in the PRF security game simply the challenger. There are two worlds (specified by b = 0 or 1) for the PRF game. When b = 0, the challenger uses a random function; when b = 1, the challenger uses the PRF with a random key. We construct B' as follows. On input t1, t2 from B, B' asks the challenger for h_{i,j,t} for all clients i and j, for rounds t1 and t2. Then B' creates the messages computed from the h_{i,j,t}'s as protocol Π_sum prescribes; it generates the two views ∆_{t1}, ∆_{t2} and sends them to B. B' outputs whatever B outputs.

Failure probability for T rounds. For a single round, we already showed that protocol Π^(t)_sum securely realizes F^(t)_sum except with probability p = n · 2^{−κ+1}. The probability that the protocol is secure in all T rounds is therefore (1 − p)^T, which is approximately 1 − T · p when T · p ≪ 1. Therefore, the probability of failure (there exists a round that fails the simulation) is Tn · 2^{−κ+1}.

Appendix E.
Extension for Robustness

E.1. Robust protocol

As discussed in Section 9, we want to additionally detect incorrect output when the server is honest. Figure 16 gives the ideal functionality of our robust version (detect-and-abort). Specifically, the protocol Π_sum-robust has two key properties that Π_sum does not have:

1) the inputs of clients are fixed after being submitted in the report step (i.e., malicious decryptors have no way to change the sum of those inputs);
2) when the server is honest, incorrect output caused by malicious clients can always be detected by the server.

Moreover, we present a proof-of-decryption technique which allows the server to identify malicious decryptors. We next describe the key techniques we use in Π_sum-robust; the full description is given in Figure 17, where we highlight the changes from Π_sum.

Checking aggregated masks. One can make black-box use of a technique in a very recent work by Bell et al. [10], published after our work. To specify, each client i computes the sum of its (expanded) masks, denoted as s⃗_i, which has the same length as its input vector. The client also generates a proof of masking that it correctly computed x⃗_i + s⃗_i. Each client i sends to the server the commitment to s⃗_i and the proof of masking, besides what it is supposed to send in protocol Π_sum.

The server first verifies the received proofs of masking, and then reconstructs the masks exactly as in Π_sum. Denote the sum of the recovered masks as s⃗. For those clients whose proofs are valid, the server and the clients engage in a distributed key correctness (DKC) protocol [10, Section 5], where the server checks whether the recovered mask s⃗ is the sum of the s⃗_i's underlying the commitments. If the server in the DKC protocol aborts, then it also aborts in Π_sum-robust; otherwise the server continues executing the rest of the steps as in Π_sum.

Now the server can detect incorrect output, but it does not identify the malicious clients or decryptors. If desired, we can further detect malicious decryptors using the following two techniques.

ElGamal encryption for both types of seeds. Instead of using symmetric encryption for the shares of m_{i,t}, each client encrypts m_{i,t} directly (but not the shares) using ElGamal encryption. This reduces the number of ciphertexts appended to the masked vector compared to Π_sum; more importantly, we will see later how it combines with another technique (a zero-knowledge proof for discrete logarithms) to bring robustness. See the details in the report step of Figure 17. This change in the report step brings a change in the reconstruction step: the decryptors perform threshold decryption on the ciphertext of m_{i,t}, instead of decrypting the shares of m_{i,t} using symmetric keys.

Proof of decryption. In the reconstruction step, in addition to decrypting, each decryptor u, for each ciphertext (c0, c1), proves to the server that

log_{c0} (c0)^{s_u} = log_g g^{s_u},

where g^{s_u} is known to the server. Note that g^{s_u} can be generated in the DKG along the way in the setup phase and stored by the server.

To prove that log_{c0} (c0)^{s_u} = log_g g^{s_u}, each decryptor chooses a random β ∈ Z_q and sends to the server c0^β, g^β. The server then sends to the decryptor a challenge e ∈ Z_q. The decryptor computes z = s_u · e + β and sends z to the server. The server checks that (c0^{s_u})^e · (c0)^β = c0^z and (g^{s_u})^e · g^β = g^z. We use the Fiat-Shamir transform to make the proof non-interactive.

Due to this proof of decryption, the server can exclude bogus partial decryptions from malicious decryptors; since the sharing polynomial for the secret key s is of degree ℓ and there are at least 2ℓ + 1 honest online decryptors, the server is always able to reconstruct the result.

E.2. Security proof

Theorem 4 (Security of Flamingo extension). Let Ψ_T be the sequential execution of Π_setup (Fig. 11) and the T rounds of Π_sum-robust (Fig. 17). Let κ be a security parameter. Let δ, δ_D, η, η_D be threat model parameters as defined
Functionality F_mal-robust

Parties: clients 1, ..., N and a server.
Parameters: corrupted rate η, dropout rate δ.

• F_mal-robust receives from the adversary A a set of corrupted parties, denoted as C ⊂ [N], where |C|/N ≤ η.

• For each round t ∈ [T]:

1) F_mal-robust receives a random subset S_t ⊂ [N], a set of dropout clients O_t ⊂ S_t, where |O_t|/|S_t| ≤ δ and |C|/|S_t| ≤ η, and inputs x⃗_{i,t} for each client i ∈ S_t \ (O_t ∪ C).

2) F_mal-robust sends S_t and O_t to A, and asks A for a set M_t: if A replies with M_t ⊆ S_t \ O_t such that |M_t|/|S_t| ≥ 1 − δ, then F_mal-robust computes z⃗_t = Σ_{i∈M_t\C} x⃗_{i,t} and continues; otherwise F_mal-robust sends abort to all the honest parties.

3) Depending on whether the server is corrupted by A or not:
a) If the server is corrupted by A, then F_mal-robust outputs z⃗_t to all the parties corrupted by A;
b) If the server is not corrupted by A, then F_mal-robust asks A whether it should continue or not: if A replies with continue, then F_mal-robust outputs z⃗_t to the server; otherwise it outputs abort to all the honest parties.

Figure 16: Ideal functionality for Flamingo with robustness (detect-and-abort).

(§5.1, §5.2). Let ϵ be the graph generation parameter (Fig. 1). Let N be the total number of clients and n be the number of clients in the summation in each round. Assuming the existence of a PKI, a trusted source of initial randomness, a PRG, a PRF, an asymmetric encryption scheme AsymEnc, a symmetric encryption scheme SymAuthEnc, and a signature scheme, if 2δ_D + η_D < 1/3 and ϵ ≥ ϵ*(κ) (Lemma 1), then under the communication model defined in §2.2, protocol Ψ_T securely computes functionality F_mal-robust (Fig. 16) in the presence of a static malicious adversary controlling an η fraction of the total N clients (and the server) and a per-round η fraction of the n clients, except with probability Tn · 2^{−κ+1}.

Proof. When the server is corrupted by A, the proof is essentially the same as the proof of Theorem 3 (Appendix D.5). Below we prove the case where the server is not corrupted. We construct a simulator S in the ideal world that runs A as a subroutine. We denote the ideal functionality for each round t as F^(t)_sum-robust, which executes steps 1), 2), and 3) in functionality F_mal-robust.

1) S obtains z_t from F^(t)_sum-robust.
2) S receives a set O_t from R_drop (see Appendix D.5).
3) (Report step) S computes offline/online labels from the set O_t, and interacts with A acting as the server. Here the collected masked inputs are all zeros, and the seeds for the masks are generated by S itself.
4) (Cross-check step) S acts as the honest decryptors in the cross-check step; if all the honest decryptors would abort per the protocol prescription, then S outputs whatever A outputs, sends abort to F^(t)_sum, and halts.
5) (Reconstruction step) S acts as the server in the reconstruction step; if the server would abort per the protocol prescription, then S outputs whatever A outputs, sends abort to F^(t)_sum-robust, and halts.
6) (DKC) S acts as the server in the DKC protocol [10, Figure 2]; if the server would abort in DKC, then S outputs whatever A outputs, sends abort to F^(t)_sum-robust, and halts.

We construct a series of hybrid executions, starting from the real-world execution and ending with the ideal-world execution.

Hybrid 1. The view of A in the real-world execution is the same as the view of A in the ideal world when S would have the actual inputs of the honest parties, {x⃗_i}_{i∈S_t\C}, the masks, and the shares of the secret key SK = s (S in fact would know s in full because of the threshold requirement 3ℓ + 1 = L and 2δ_D + η_D < 1/3).

Hybrid 2. S now, instead of using the actual secret key s, replaces s with 0 and sends the corresponding |C ∩ D| < L/3 shares of 0 in Z_q to A. This hybrid has a distribution identical to the previous hybrid by the property of Shamir secret sharing.

Hybrid 3. S now, instead of using the actual masks of honest parties, replaces each mask with a uniformly random mask generated by S itself. This hybrid has a distribution identical to the previous hybrid.

Hybrid 4. S now, instead of using the actual inputs of honest parties {x⃗_i}_{i∈S_t\C}, replaces each input vector with all zeros. Note that neither the masked vectors nor the vectors in the clear are seen by A. This hybrid has a distribution identical to the previous hybrid.

Hybrid 5. Same as the previous hybrid, except that the label messages from honest decryptors in the last step are replaced with the offline/online labels obtained from R_drop. This hybrid is the ideal-world execution, and the view of A in this hybrid is identical to the previous hybrid.

Combined with our analysis for the setup phase when the server is honest (Appendix B.2), we conclude that Ψ_T securely realizes F_mal-robust in the random oracle model.
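The Shamir structure invoked in Hybrids 1 and 2 (a degree-ℓ sharing of s among L = 3ℓ + 1 decryptors, where any ℓ + 1 shares determine the secret and any ℓ shares are distributed independently of it) can be sketched as follows. This is an illustrative sketch only; the field modulus and function names are our own choices, not part of the protocol specification.

```python
import random

Q = 2**127 - 1  # a Mersenne prime, standing in for the group order q

def share(secret, ell, L):
    """Shamir-share `secret` with a random degree-ell polynomial over Z_Q;
    any ell + 1 of the L shares determine the secret."""
    coeffs = [secret] + [random.randrange(Q) for _ in range(ell)]
    return [(u, sum(c * pow(u, k, Q) for k, c in enumerate(coeffs)) % Q)
            for u in range(1, L + 1)]

def reconstruct(shares):
    """Lagrange interpolation at 0 over any >= ell + 1 of the shares."""
    secret = 0
    for u, su in shares:
        lam = 1
        for v, _ in shares:
            if v != u:
                # coefficient factor v / (v - u), evaluated at x = 0
                lam = lam * v % Q * pow(v - u, -1, Q) % Q
        secret = (secret + su * lam) % Q
    return secret
```

Because any ℓ shares are uniformly distributed, the |C ∩ D| < L/3 shares handed to A in Hybrid 2 can just as well be shares of 0 as shares of s.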
Collection phase with robustness: Πsum-robust for round t
Initial state from setup phase: each client i ∈ [N] holds a value v and public key PK = g^s; each decryptor u ∈ D additionally holds a Shamir share of SK = s (threshold ℓ with 3ℓ + 1 = L). The server has g^{s_u} for each decryptor u ∈ D.
Parameters: δD + ηD < 1/3.
1. Report step.
Server performs the following:
Compute a set Q_graph ← ChooseSet(v, t, n_t, N) and a graph G_t ← GenGraph(v, t, Q_graph); store {A_i(t)}_{i∈Q_graph} computed from G_t.
Notify each client i ∈ Qgraph that collection round t begins.
Each client i ∈ Qgraph performs the following:
Compute Q^local_graph ← ChooseSet(v, t, n_t, N), and if i ∉ Q^local_graph, ignore this round.
Sample m_{i,t} uniformly at random from {0,1}^κ, and compute h_{i,j,t} ← PRF(r_{i,j}, t) for each j ∈ A(i), where r_{i,j} is derived from the PKI.
Send to the server a message msg_{i,t} consisting of
Vec_{i,t} = x⃗_{i,t} + PRG(m_{i,t}) + Σ_{j∈A_t(i)} ±PRG(h_{i,j,t}), together with AsymEnc(PK, m_{i,t}) and AsymEnc(PK, h_{i,j,t}) for j ∈ A_t(i),
where A_t(i) ← FindNeighbors(v, S_t, i) and AsymEnc is ElGamal encryption (Def. 2), along with the signatures for each ciphertext.
2. Cross check step.
Server performs the following:
Denote the set of clients that respond within timeout as Qvec .
Compute the partial sum z̃_t = Σ_{i∈Q_vec} Vec_{i,t}.
Build decryption request req (req consists of clients in St to be labeled):
for each i ∈ Qgraph ,
if i ∈ Qvec , label i with “online”,
and attach AsymEnc(PK, mi,t );
else label i with “offline”,
and attach {AsymEnc(PK, hi,j,t )}j∈At (i)∩Qvec .
Send to each u ∈ D the request req and a set of the attached ciphertexts Ei (in order to recover the mask(s) of client i).
Each decryptor u ∈ D performs the following:
Upon receiving a request req, compute σu∗ ← Sign(sku , req∥t), and send (req, σu∗ ) to all other decryptors via the server.
3. Reconstruction step.
Each decryptor u ∈ D performs the following:
Ignore messages with signatures (σi or σu∗ ) with round number other than t.
Upon receiving a message (req, σu∗ ), run b ← Vrf (pki , req, σu∗ ). Ignore the message if b = 0.
Abort if u received fewer than 2ℓ + 1 identical messages that were not ignored. Denote such a message as req*.
For req∗ , continue only if
each client i ∈ St is either labeled as “online” or “offline”;
the number of “online” clients is at least (1 − δ)nt ;
all the “online” clients are connected in the graph;
each online client i has at least k online neighbors, where k is such that η^k < 2^{−κ}.
For each i ∈ Qgraph ,
For each (c0, c1) ∈ E_i, send c0^{s_u}, along with the zero-knowledge proof that log_g (g^{s_u}) = log_{c0} (c0)^{s_u}.
Server completes the sum:
Denote the set of decryptors whose messages have been received as U. Compute a set of interpolation coefficients {βu }u∈U from U.
For each i ∈ Q_graph, verify the zero-knowledge proofs and reconstruct the mask m_{i,t} or {h_{i,j,t}}_{j∈A_t(i)∩Q_vec}:
for each ElGamal ciphertext (c0, c1) of the mask, compute c1 · (Π_{u∈U} (c0^{s_u})^{β_u})^{−1}.
Output z_t = z̃_t − Σ PRG(m_{i,t}) + Σ_{j∈A_t(i)∩Q_vec} ±PRG(h_{i,j,t}), where the first sum ranges over the "online" clients i and the second over the "offline" clients i.
Figure 17: Detect-and-abort version of collection protocol without DKC (Appendix E.1). One can run DKC in parallel to
the collection phase.
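The abort rule in the reconstruction step above ("fewer than 2ℓ + 1 identical messages") is exactly the one-round agreement check of Lemma 3. A minimal counting sketch of that rule follows; the function name is ours.

```python
from collections import Counter

def agree(received, ell):
    """Lemma 3 check: among 3*ell + 1 parties with at most ell malicious,
    a value echoed by 2*ell + 1 parties must have been sent by the server
    to every honest party (at least ell + 1 of the echoes are honest)."""
    if not received:
        return None
    value, count = Counter(received).most_common(1)[0]
    return value if count >= 2 * ell + 1 else None  # None means abort
```

With ℓ = 3 (so L = 10 and the threshold is 7), seven matching labels suffice to accept, while a 6/4 split forces an abort.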
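The proof of decryption used in the reconstruction step (Appendix E.1) is an equality-of-discrete-logs proof made non-interactive via Fiat-Shamir. The sketch below runs over a toy Schnorr group (p = 2q + 1 with g of order q); the group parameters, the hash-to-challenge choice, and the function names are our own, and a deployment would use a cryptographically sized group.

```python
import hashlib

# Toy Schnorr group (illustrative only): p = 2q + 1, g generates the
# order-q subgroup of Z_p^*.
P, Q, G = 23, 11, 4

def _challenge(*elems):
    # Fiat-Shamir: derive the challenge e from a hash of the transcript.
    digest = hashlib.sha256(repr(elems).encode()).digest()
    return int.from_bytes(digest, 'big') % Q

def dleq_prove(su, c0, beta):
    """Decryptor's proof that log_g(g^su) = log_{c0}(c0^su) = su.
    `beta` must be sampled uniformly from Z_q in a real execution."""
    t1, t2 = pow(G, beta, P), pow(c0, beta, P)   # commitments g^b, c0^b
    e = _challenge(G, c0, pow(G, su, P), pow(c0, su, P), t1, t2)
    z = (su * e + beta) % Q                      # response z = su*e + b
    return t1, t2, z

def dleq_verify(gsu, c0, c0su, proof):
    """Server's check: g^z == t1 * (g^su)^e and c0^z == t2 * (c0^su)^e."""
    t1, t2, z = proof
    e = _challenge(G, c0, gsu, c0su, t1, t2)
    return (pow(G, z, P) == t1 * pow(gsu, e, P) % P and
            pow(c0, z, P) == t2 * pow(c0su, e, P) % P)
```

The two verification equations mirror the interactive checks (c0^{s_u})^e · c0^β = c0^z and (g^{s_u})^e · g^β = g^z from Appendix E.1; a partial decryption c0^{s_u} that fails them is discarded as bogus.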