Cryptography
Zero-Knowledge Range Proofs: Proving Where Your Secret Lies
Suppose that Alice is a blockchain user and she wants to participate in a sealed bid auction, where the amount she bids is kept private. In order to make sure that Alice has enough money, she’s required to prove that her bid does not exceed her account balance. How can she do this while keeping the amount that she bids private?
A zero-knowledge proof allows Alice to prove a statement about a secret value \( x \) without revealing anything else about \( x \). Zero-knowledge proofs are widely used in the blockchain world, and allow us to prove very complicated statements. For our particular sealed bid auction scenario, all that Alice needs in order to enter the auction is a very specific kind of zero-knowledge proof: one that lets her show that her secret bid value \( x \) is in a given interval.
A zero-knowledge range proof (ZKRP) is a more specific type of zero-knowledge proof that allows a prover to prove that a secret value \( x \) lies in some range. Typically, we assume that the prover has published a commitment \({com}(x,r)\) to \( x \) with randomness r, where this commitment is also known to the verifier. The prover and verifier then engage in a protocol that convinces the verifier that the prover knows \( x \) such that \({com}(x,r)\) is indeed a commitment to \( x \), and \(a \leq x \leq b\). Since the statement being proven in a ZKRP is much more specific than in an arbitrary zero-knowledge proof, we can build more efficient protocols by using tailored techniques.
This blog post gives an overview of applications of ZKRPs and various techniques used to construct them.
(For those unfamiliar, a commitment scheme lets Alice publish a value \( c \)=\({com}(x,r)\) about a secret \( x \), where at a later time she can open \( c \) to reveal the secret \( x \). She should not be able to open \( c \) to any other value \( x' \), and \( c \) should reveal nothing about \( x \) until Alice chooses to open it.)
Confidential transactions
Confidential transactions are an application of range proofs that are especially relevant to cryptocurrencies and have spurred research yielding significant efficiency improvements for ZKRPs in recent years. In most cryptocurrencies, such as Bitcoin and Ethereum, transaction details are visible to everyone; in particular, anyone with access to the underlying blockchain can see the amounts of currency being transferred.
Confidential transactions, initially proposed by Maxwell [Max15], aim to hide the amounts involved in each transaction. In Maxwell’s approach, these amounts are stored using Pedersen commitments [Ped91], and the sender must prove to the miners that the sum of the output amounts does not exceed the input amount. In other words, if \({amt}_{in}\) is the input amount and each \({amt}_{out}\) is an output amount, it should hold that:
\({amt}_{in} - \sum_i \left(\mathsf{amt}_{out}\right)_i \geq 0 \)
However, this check on its own is not enough to ensure security. A malicious sender, Eve, could create coins by creating a transaction with herself as the recipient where, for example,
\(({amt}_{in}) = 0, (\mathsf{amt}_{out})_1 = -1, (\mathsf{amt}_{out})_2 = 1 \)
Since \( {amt}_{in} = 0 \), this transaction does not require Eva to spend any coins. However, since Eve receives the output \(({amt}_{out})_2 = 1\), she gains a coin. Yet this transaction satisfies the check above.
Thus, confidential transactions also require the sender to prove that each output amount is positive: \(({amt}_{out})_i \geq 0\).
Recall that zero-knowledge range enables us to prove that a committed (or encrypted) value \( x \) lies in a range \([a,b]\) without revealing any other information about \( x \). Thus, when applied to confidential transactions, they allow the sender to show exactly these checks without compromising confidentiality. Often it helps efficiency to consider a range \([0, 2^k - 1]\) for some integer \( k \), and ranges of this form are sufficient for many applications, like for showing that balances are positive in confidential transactions. For the rest of this post we focus on ranges of this form.
Other applications
Proofs of liabilities/reserves/solvency: Proofs of solvency allow exchanges or banks to show that they hold enough currency in reserves to pay out all customers. There are two aspects to this: proving that the bank holds at least a certain amount in reserves, and proving that the bank holds at most that amount in liabilities. These proofs should be privacy-preserving in that they don’t reveal users’ balances. Zero-knowledge range proofs have been used in many schemes such as Provisions [DBB+15, Cam14, and JC21].
Receiver hiding: In a public blockchain, everyone can see when a user \(A\) sends currency amount \(x\) to a user \(B\). By using confidential transactions that hide amounts, which we can construct using range proofs, \(A\) can increase the receiver anonymity set by making it appear that it could have sent \(x\) to any of \(n\) users. \(A\) can do so by creating a transaction that sends a confidential amount to \(n\) different recipients \(B_1, \ldots, B_n\), one of which is actually \(B\). \(A\) sends \(x\) to \(B\) and \(0\) to all other recipients, but because the amounts are hidden to the public, an observer cannot tell which recipient received nonzero currency.
A subtle issue is that the recipients \(B_i\) that receive zero currency in the transaction will likely never user this (zero) amount in a future transaction, which the public can see. One solution is to send these fake recipients \(B_i\) very small amounts as an incentive to use them in future transactions; however, this costs more for the sender.
Private voting: [Gro05] constructs several protocols for private voting, where users submit encrypted votes to a group of authorities that tally the votes and jointly decrypt the final outcome. The protocols are private in that a snooping authority should not be able to decrypt individuals’ votes. Since the votes are encrypted, a malicious voter might encrypt a negative value for a candidate they don’t like. [Gro05] prevents this attack by requiring users to prove that their encrypted votes contain positive values.
Federated learning: In federated learning, many distributed clients train machine learning models on locally held data. These models are then aggregated into a global model, often by averaging their weights. To protect the privacy of clients’ data, these weights are often submitted under encryption. One concern in this setting is that a malicious client may submit an encrypted model with very large weights, which could hurt the accuracy of the global model by skewing this average. One way to mitigate this problem is to use input validation to ensure that each client’s model has bounded weights. Range proofs, used for federated learning by Acorn [BGL+22], enable input validation even over encrypted weights, where clients can show that their inputs satisfy these bounds without revealing their models in the clear.
Auctions: [AW13] uses range proofs for verifiable auctions. Range proofs help an auctioneer to prove to an auditor that the sale price was set correctly without revealing the values of the bids. For example, in a second-price auction with \(n\) bids, (where the sale price is equal to the second-highest bid), the auctioneer can provide proofs that \(n-1\) bids were at most the sale price and one bid was greater than the sale price.
Anonymous credentials: In an anonymous credential system, users are issued credentials that they can later use to prove facts about their attributes, such as their age. Ideally, these proofs should reveal no more than is necessary. For example, a user should be able to prove that their age is at least 18 without revealing their exact age. Here, range proofs can be used; this was a motivation for [CCs08].
Group signatures: Group signatures allow a group of signers to create signatures on behalf of the group, without revealing which of them created the signature. [CM99] uses range proofs as a building block in their construction of a group signature scheme.
Verified location: Range proofs can be used to show that a location is in a permitted region, by proving that the latitude and longitude lie in the proper intervals. This and the following application, timestamping, are suggested applications of HashWires [CCLMR21].
Timestamping: Suppose one is issued a certificate with a secret expiration date. Range proofs can be used to privately show that the certificate is still valid; i.e., the current date is less than the expiration date.
Ciphertext validity: Range proofs can be used to show that ciphertexts for various lattice-based encryption schemes are valid; see [Lib23].
Differential Privacy: In some applications, encrypted consumer data is aggregated to support statistical studies. For example, encrypted electricity consumption data [MR14]. To enable privacy of individual users, [MR14] added noise for differential privacy. A ZKRP is used to prove that the added noise is within an accepted range.
Technical Details
Let’s now get into the details, although still high-level, of the technical approaches, and some additional desirable properties of ZKRPs.
Overview of approaches
Over the last couple of decades several approaches have been developed to construct ZKRPs. Many of these can be roughly categorized into approaches based on set memberships, binary decomposition, four squares decomposition, and iterated hash functions. Of course, we can instantiate ZKRPs from generic ZKPs as well.
Zero-knowledge set membership
[CCs08] describes a protocol for showing given a commitment \(C_x\) that the committed value \(x\) lies in an arbitrary set \(\Phi\) (\(Phi\) does not have to be an interval). The high level idea is that the verifier publishes a signature \(\sigma_i\) for every element \(i \in \Phi\). The prover then proves in zero-knowledge that it knows \(x\) and \(\sigma_x\) such that:
\(C_x\) is a commitment to \(x\)
\(\sigma_x\) is a valid signature for \(x\) under the verifier’s public key
Since the verifier publishes signatures only for elements of \(\Phi\), the prover cannot obtain a valid signature for an element outside of \(\Phi\).
Note that the verifier publishes \(|\Phi|\) signatures, which becomes burdensome if \(\Phi\) is large. In particular, if \(\Phi\) is a range \([0, 2^k - 1]\), the verifier must publish \(2^k\) signatures. It turns out that there are much more efficient protocols tailored to the case when \(\Phi\) is such an interval; [CCs08] gives a construction of such a protocol using the following idea.
Bi/n-ary decomposition
A folklore approach to proving that a committed value \(x\) is in a range \([0, 2^k - 1]\) is to express \(x\) in binary, as \(x = x_0 \cdot 2^0 + x_1 \cdot 2^1 + \ldots + x_{k-1} \cdot 2^{k-1}\). As long as this relation between \(x\) and the \(x_i\)’s holds and each \(x_i\) is indeed a bit in \(\{0,1\}\), we must have that \(x \in [0, 2^k - 1]\).
CCs08
Notice that we already know how to prove that each \(x_i \in \{0,1\}\) using the zero-knowledge set membership protocol of [CCs08], and for the set \(\{0,1\}\) in question the verifier only needs to publish two signatures. These same two signatures can be used to prove that \(x_i \in \{0,1\}\) for every \(i\). In total, the prover sends \(O(k)\) group elements.
Because the signatures published by the verifier can be used, [CCs08] observe that we can save even more communication-wise by expressing \(x\) in some other base \(u\) instead (assume for ease of explanation that \(2^k\) is exactly equal to \(u^\ell\)):
\(x = a_0 \cdot u^0 + a_1 \cdot u^1 + \ldots + a_{k-1}\cdot u^{\ell-1}\)
where the verifier publishes \(u\) signatures allowing the prover to show that each \(a_i \in \{0, \ldots, u-1\}\). Now the prover only needs to send \(O(\ell)\) group elements, which may be much smaller than \(O(k)\) if \(u\) is much larger than \(2\). Although the verifier must publish more signatures (\(u\) versus \(2\)), this is a one-time cost and the signatures may be reused for every digit of this proof and any subsequent proof.
By choosing the base \(u\) optimally, [CCs08] is able to prove that \(x\) is in the range \([0, 2^k - 1]\) using only \(O\left( \frac{k}{\log k - \log \log k}\right)\) group elements.
[PBF+] similarly uses a combination of binary decomposition and signatures to construct a range proof over Pedersen commitments. Their construction uses Borromean ring signatures [MP15] and does not require a trusted setup (for unfamiliar readers, we explain this near the end of this post).
Bulletproofs
Bulletproofs [BBB+18], considered the state-of-the-art range proof scheme, combines the binary decomposition technique with an inner product argument to enable the prover to send only \(O(\log_2 k)\) elements. An inner product argument (IPA) is a way to prove the value of the inner product of some committed vector and some other vector. Bulletproofs improves and uses an inner product argument of [BCC+16] where the prover sends only \(O(\log_2 n)\) group elements for in IPA over length-\(n\) vectors. The key idea in Bulletproofs is that the prover can use this IPA to execute the binary decomposition approach more efficiently.
As before, we write \(x = a_0 \cdot 2^0 + a_1 \cdot 2^1 + \ldots + a_{k-1} \cdot 2^{k-1}\) and let \({\bf a_L} = [a_0, a_1, \ldots, a_{k-1}]\). We let \({\bf 2^k} := [2^0, 2^1, \ldots, 2^{k-1}]\). The prover shows that it knows a vector \(\bf a_R\) such that the following hold:
\({\bf a_L}\circ{\bf a_R} = {\bf 0^k}\)
\(\bf{a_L - a_R = 1^k}\)
\({\bf a_L}\cdot{\bf 2^k} = x\)
where \({\bf a} \circ {\bf b}\) denotes the vector \([a_0 b_0, a_1 b_1, \ldots, a_{k-1} b_{k-1}]\).
Conditions (1) and (2) show that each component of \(\bf a_L\) is in \(\{0,1\}\):
Suppose that some component \((a_L)_i = c \neq 0\). Then \((a_R)_i = 0\) by (1) and \(c - 0 = 1\) by (2). Therefore, \(c = 1\).
Condition (3) shows that indeed \(\bf a_L\) contains the binary decomposition of \(x\).
Notice that condition (3) is already written as an inner product equation, so we can easily use the \(O(\log_2 n)\)-length IPA mentioned above. It turns out that we can also write (1) and (2) as inner product equations by using a random challenge. That is, the verifier chooses a random \(y\) and lets \({\bf y^k} = [1, y, y^2, \ldots, y^{k-1}]\) and the equation verifying (1) is \(({\bf a_L \circ a_R}) \cdot {\bf y^k} = 0\). The equation for (2) is \(({\bf a_L - 1^k - a_R}) \cdot {\bf y^k} = 0\). Observe that if the left side is not equal to \({\bf 0^k}\) in either of these inner products, it is unlikely that its inner product with \(\bf y^k\) will be zero.
Bulletproofs+ [CHJ+22] slightly optimizes the Bulletproofs argument to reduce the number of group elements sent by the prover. Bulletproofs++ [EKR+23] further improves efficiency by reducing both prover and verifier time.
BFGW20
[BFGW20] instantiates the binary composition approach even more efficiently, using polynomial commitment schemes (PCS). Their approach is general and allows you to plug in any hiding PCS. Notably, when KZG commitments are used, the proof length and verifier time are only \(O_\lambda(1)\).
They assume that the prover starts with a commitment \( \mathsf{com}_f \) to a polynomial \( f\) such that \(f(1) =x\). The prover then commits to a specially structured polynomial \(g\) that encodes \(x\) in binary. More specifically, the pover constructs \(g\) such that on a special set of points \( \{alpha_o, \ldots, \alpha_{k_1}\}\),
\(g(\alpha_{k-1}) = a_{k-1}\) and for all \(0 \leq i \leq k-2\), \(g(\alpha_i) = 2g(\alpha_{i+1}) + a_{i}\).
Observe that \(g(\alpha_i) = a_i + \sum_{i' > i} 2^{i'- i} a_{i'}\), and thus \(g(\alpha_0) = a_0 + \sum_{i=1}^{k-1} 2^i \cdot a_i = x\). The prover now shows the following:
\(g(1) = f(1)\)
\(g(\alpha_{k-1}) \in \{0,1\}\)
\(g(\alpha_i) - 2g(\alpha_{i+1}) \in \{0,1\}\) for all \(0 \leq i \leq k-2\)
(1) argues that \(g\) indeed encodes \(x\). (2) argues that \(g\) encodes the last digit of \(x\) in binary. (3) argues that \(g\) encodes the remaining digits of \(x\) in binary (recall that the prover constructed \(g\) such that \(a_i = g(\alpha_i) - 2g(\alpha_{i+1})\)).
It turns out that by choosing the \(\alpha_i\)’s to be \(k^{\text{th}}\) roots of unity \(\{1 = \omega^k, \omega, \omega^2, \ldots, \omega^{k-1}\}\), we can leverage properties of polynomial commitment schemes to get a very efficient proof.
Lib23
[Lib23] presents a proof system to show that a committed vector is binary (has all components in \(\{0,1\}\)) uses it to construct a constant-sized range proof that is even shorter than that of [BFGW20]. For this proof of binarity, [Lib23] uses similar techniques to previous work, This proof system can also be used to efficiently prove that lattice-based ciphertexts are valid.
Four square decomposition
[Bou00] introduced an approach to range proofs that uses the fact that every positive integer can be decomposed into the sum of squares. More specifically, by Lagrange’s four square theorem, for all \(x \in \mathbb{Z}_{\geq0}\), there exist \(x_1, x_2, x_3, x_4 \in \mathbb{Z}\) such that
\(x = x_1^2 + x_2^2 + x_3^2 + x_4^2\)
On the other hand, if \(x\) is negative, it can never be expressed as the sum of squares in \(\mathbb{Z}\). Thus we can prove that \(x\) is non-negative by proving that we know \(x_1, x_2, x_3, x_4\) such that their squares sum to \(x\).
Importantly, this fact must hold over \(\mathbb{Z}\) if we are working in some group, for example the integers modulo 5, it’s possible that \(x = -1\) and \(0^2 + 1^2 + 2^2 + 2^2 = 9 = -1 \mod{5}\). Therefore, for the four-square-decomposition approach we need a special kind of commitment called an integer commitment that allows us to prove that such equations hold over \(\mathbb{Z}\). [Bou00] uses Fujisaki-Okamoto commitments [FO97], which require a group of unknown order such as an RSA group. Damgård-Fujisaki commitments [DF02] are slightly more efficient integer commitments used in subsequent work [Lip03] which refined Boudot’s idea. [Gro05] similarly followed the same approach and improved its efficiency by observing that \(x\)’s of a certain form can be written as the sum of only three squares. [CPP17] further improved the efficiency and showed that the RSA assumption (rather than the strong RSA assumption) is sufficient to show the security of Damgård-Fujisaki commitments.
The integer commitments used by all of [Bou00, Lip03, Gro05] require groups of unknown order such as RSA groups, which require the modulus \(N = pq\) to be generated without any party knowing \(p\) and \(q\). This trusted setup requirement is undesirable. A newer line of work [CKLR21, CGKR22] develops new integer commitment schemes, some of which do not require a trusted setup. These schemes also yield better efficiency than earlier four-square-based schemes, though Bulletproofs and subsequent binary-decomposition-based proofs are still more concretely efficient.
Hash chains
An approach to range proofs that’s more concretely efficient in certain settings uses hash chains. An advantage of hash chains over the previous protocols is that the only cryptographic machinery required is a hash function. Thus, they require no trusted setup and offer concrete efficiency.
To see the simplest way to use a hash chain for a range proof, let \(H\) denote a hash function and let \(H^i(x) = H(H( \ldots H(x)))\) denote evaluating \(H\) on \(x\) \(i\) times.
Assume that a prover holds a commitment \(c = H^x(r)\) to a value \(x\) where \(r\) is a random value. To show that \(x \geq t\), where \(t\) is some publicly known threshold, the prover can generate a proof \(\pi := H^{x-t}(r)\). The verifier can check that \(H^t(\pi) = c\).
The idea is that if \(x < t\), \(x-t\) will be negative and generating a proof that makes the verifier accept involves computing a preimage of \(r\) under \(H\). This is impossible by the security of \(H\).
PayWord [RS96] first applied the idea of hash chains to electronic payments, and they have since seen other applications such as verifiable auctions [AW13].
In the above range proof construction, the prover must do \(x - t\) work to compute the proof \(H^{x-t}(r)\), and the verifier must do \(t\) work to check that \(H^t(\pi) = c\). If we use this construction to prove that \(x\) is in some interval \([0, 2^k - 1]\), the work for the prover scales linearly in \(2^k\), which quickly becomes costly if \(2^k\) is large. HashWires [CCLMR21] improves the efficiency of both the prover and verifier to scale linearly with \(k\), by observing that integers can be decomposed in a way that allows computing a much shorter hash chain for each digit of \(x\).
Generic zero-knowledge proofs
Another option is to use a generic zero-knowledge proof. However, such proofs are often less efficient for the prover than the best schemes above which are tailored to range proofs. Some of these SNARKs also require a trusted setup, which is undesirable.
Desirable Properties
Apart from the core properties of ZKRPs, many applications can benefit from some additional properties provided by specific ZKRPs. These properties, for example, enable amortizing size when several such proofs are required, remove the need for trusted setup, as well as enable homomorphic combination of proofs.
Aggregation
A convenient property that some range proofs have is that they can be aggregated:
Given \(m\) proofs \((\pi_1, \ldots, \pi_m)\) that \(x_1 \in [0, 2^k - 1]\), \(x_2 \in [0, 2^k - 1]\), \(\ldots, x_m \in [0,2^k - 1]\) we can efficiently generate a short aggregate proof \(\pi\) that shows all of these statements at once. For this aggregation property to be nontrivial, \(\pi\) should be shorter than the concatenation of \(\pi_1, \ldots, \pi_m\).
For Bulletproofs, Bulletproofs+, and Bulletproofs++, a single prover can aggregate range proofs for \(m\) different values in the same range \([0, 2^k - 1]\), where the aggregate proof takes space only \(O(\log_2(m \cdot k))\). For range proofs generated by many different provers that may not want to reveal their values to each other, Bulletproofs provides an MPC protocol that enables this aggregation.
Aggregation is especially useful for confidential transactions, where minimizing the amount of space used on chain decreases gas costs. Since range proofs are used to show non-negativity, all range proofs typically prove membership in the same interval \([0, 2^k - 1]\), and thus Bulletproofs aggregation is exactly what is needed.
In addition to proof aggregation, some ZKRPs including Bulletproofs, allow for batch verification. This means that several distinct proofs can be verified together more efficiently than verifying individually.
Transparent setup
Some range proofs require parties to agree on a common random string used in the protocol. For example, the four square decomposition range proofs based on RSA-based integer commitments require the parties to agree on an RSA modulus. Importantly, this modulus \(N\) must be generated in such a way that no parties know the factorization of \(N = pq\). Similarly, [BFGW20] instantiated with KZG commitments requires a powers-of-tau common reference string, a series of values \(g^{\tau^i}\) where no party knows \(\tau\). If \(p\) and \(q\) or \(\tau\) are known, these protocols lose security. Thus we say these protocols require trusted setup.
Ideally, protocols should have a transparent setup procedure that does not produce this toxic waste, and that each party can run itself.
Bulletproofs/Bulletproofs+/Bulletproofs++, [CKLR], [BFGW20] instantiated with DARKs, HashWires, and [PBF+] all have transparent setup.
Note that trusted setup is different from having a trusted issuer responsible for distributing the proper commitments to users, e.g., a Pedersen commitment corresponding to that user’s account balance or a HashWires hash commitment corresponding to the user’s age. Any protocol needs to assume that the prover and verifier agree on the commitment at hand.
Homomorphism
A commitment scheme is homomorphic if there is an easy way to generate a commitment to \(x + y\) from a commitment to \(x\) and a commitment to \(y\). For confidential transactions, homomorphism is highly desirable since it gives an easy way to add two amounts. As a result, many of the range proofs motivated by confidential transactions use Pedersen commitments, which are homomorphic.
An additional benefit of homomorphic commitments is to enable range proofs over arbitrary ranges \([a, b]\), even if we have a ZKRP for a custom range, such as \([0, 2^k-1]\), for a large enough \(k\). To show that \(com(x)\) is such that \(x≥a\), we can instead show that \(com(y:=x-a) = com(x) - com(a)\) is such that \(y≥0\). Similarly, to show that \(com(x)\) is such that \(x≤b\), we can instead show that \(com(y:=b-x) = com(b) - com(x)\) is such that \(y≥0\). Of course, \(k\) needs to be large enough to account for undesirable rounding.
However, the hash-based commitments used in hash chains are not homomorphic; there is no easy way to generate a commitment of \(x+y\) given a commitment \(H^x(r_1)\) to \(x\) and a commitment \(H^y(r_2)\) to \(y\). As such, hash chains are better suited for other applications such as verified location, KYC, and verifiable auctions; more of these applications are discussed in HashWires [CCLMR21].
Conclusion
We’ve given an overview of range proofs, their applications, and several approaches to constructing them. The Bulletproofs family seems to be the most popular in practice, due to their concrete efficiency, transparent setup, aggregatability, and compatibility with Pedersen Commitments. However, which scheme is most appropriate for a specific application can depend on the size of the range being proven and the size of the secret value \(x\). A comprehensive comparison of these schemes would be very useful.
Blog