Cryptocurrency Privacy Technologies: Zerocoin
January 9, 2023 by patrickd
Despite being regularly referred to as "anonymous Internet money", the ledgers of the most widely adopted cryptocurrencies are completely public. Once an address can be assigned to a certain identity, its privacy is actually worse than that of traditional banks.
This article explores the Zerocoin Protocol, the first anonymous cryptocurrency proposal supporting large anonymity sets. Initially suggested as an extension to Bitcoin, it was implemented in various alt-coins over the years. You most likely heard about it as Zcoin (XZC).
The Concept
In 2013 the Zerocoin (opens in a new tab) whitepaper suggested extending the Bitcoin Protocol by introducing a distributed e-cash scheme. Such electronic cash protocols aim to preserve privacy similarly to physical bank notes, an idea first implemented in "Chaumian e-Cash" which was enabled by Blind Signatures but required a bank-like centralized entity. Zerocoin instead, is a distributed e-cash system that intended to use Bitcoin's blockchain as a public "bulletin board", maintaining a list of valid coins for which a coin's membership is proven in zero-knowledge.[23]
This proposal would have effectively extended Bitcoin with a native laundry functionality. Practically, users would have been able to make use of new opcodes added to Bitcoin's scripting language in order to lock funds into the mix and redeem them at a later time without any clear connection between deposit and redemption.
Scheme
As you may already know from the previous Confidential Transactions article, Bitcoins are not actually transacted "from one account to another". Rather than that, there exist Unspent Transaction Outputs (UTXOs) that each have a "Locking Script" (ScriptPubKey
) associated with them. This script dictates the condition under which a UTXO can be spent. Typically, this condition is that the transaction's signer matches with the address specified in the Locking Script ("Pay-to-Public-Key-Hash"). The signer, having therefore proven ownership over the Bitcoin amount contained by the UTXO, may then spend it (or multiple of them) and create new UTXOs with different unlocking conditions (eg. such that only the new owner of the coins may spend them).
Zerocoin introduces a new such condition: A user may choose to specify a locking script in their UTXO which "mints" a Zerocoin by publishing a commitment to the coin's unique identifier. This commitment is added to an accumulator containing all legitimately minted Zerocoin commitments. The BTC value within the user's UTXO can be redeemed by anyone who too has minted a Zerocoin of the same denomination. Sometime later, the user may decide to reveal their coin's unique identifier to "spend" the Zerocoin in exchange for the locked value it represents. The crux is, that the user is able to prove that the identifier's commitment is within the accumulator without revealing the commitment itself using a zero-knowledge proof. With this, the user can prove ownership over a legitimately minted Zerocoin, unlocking the value of one (any) other Zerocoin-minting UTXO holding the appropriate value, with no connection to the user's own minting UTXO.
Example
To show how the protocol works, let's imagine Alice is owner over an UTXO holding a value of 2.4 BTC. She'd like to mint two Zerocoins each representing the denomination of 1 BTC. To do so, she locally generates two unique serial numbers and which she will keep secret until she decides to spend the coins. She signs a transaction containing 3 output UTXOs: One sending the change of 0.4 BTC back to herself and the other two each locking one BTC into the e-cash system. The Locking Scripts of these two Zerocoin-minting UTXOs contain commitments and which each commit to the yet-to-be-revealed serial numbers. Bitcoin's blockchain now acts as a public bulletin board containing a set of commitments each representing a minted Zerocoin of 1 BTC value. Basically, the protocol is acting as an escrow pool and it's possible to mint different kinds of Zerocoins for various denominations by maintaining multiple sets.
After waiting for a while for other users to participate in the e-cash system, Alice may decide to redeem a Zerocoin in exchange for any other UTXO locking 1 BTC for the Zerocoin protocol (): Alice generates a proof that shows she knows for a commitment within the set of all unspent commitments , without revealing which of the commitments the associated is. Alice signs a Transaction where she chooses any Zerocoin-minting UTXO as input, "proves her ownership" over it using , and sends the unlocked BTC to a fresh address that has no known association with her. The protocol will keep record of all revealed serial numbers in order to prevent double-spending. Alice was able to hide the origin of her funds within the anonymity set of all other Zerocoin holders.
I've omitted mentioning transaction fees for simplicity, but they'd not require any changes with the introduction of Zerocoin to Bitcoin. It might also be noteworthy that the anonymity set resets once all commitments in the accumulator have been spent, although this seems an unlikely scenario assuming that the system would find continuous use.
"Burning Zerocoins for fun and profit (opens in a new tab)" reveals a fundamental flaw with using plaintext identifiers: An attacker observing transactions may notice a user's intention for spending a legitimate Zerocoin with serial number . The attacker may quickly mint and redeem a Zerocoin with that very same serial number and, if they succeeded to do this before the user's spend transaction was included, the user would now be rejected since the specified serial number has already been marked as "spent". The user's Zerocoin is then effectively unspendable and the user will not be able to redeem it for its value.
The paper suggests using a public key as serial number instead and adjusting the protocol to have the spender prove they know the appropriate private key by having them sign the transaction with it.
Accumulators
Proving that an element is part of a list without revealing the element, is a classical membership problem. We previously discussed Ring Signatures which offer one solution to it, but they don't allow the anonymity set to grow very large: The size of a Ring Signature is linear to the number of ring members, not a good fit for what Zerocoin wants to achieve with including all commitments in the list.
Instead, Zerocoin makes use of an "accumulator": An algorithm that allows one to combine a set of values into one short value. For a value within the accumulator, there exists a witness that proves the inclusion, while at the same time, it is infeasible to find a witness for a value that was not accumulated.
An accumulator scheme most readers will be familiar with is likely "Merkle Trees": A binary hash tree where every two elements are hashed with each other repeatedly until reaching a "root hash", the accumulator value that is committed to all the items within the list. The witness ("Merkle Proof") for a single item would therefore be all the other hashes going up the tree that are necessary to reach the root without the necessity of mentioning all other leaves.[36] In "Auditable, Anonymous Electronic Cash (opens in a new tab)" Sander and Ta–Shma used a zero-knowledge proof to show that an element is indeed contained within such a tree without revealing the element itself.
(@LLM: For copyright reasons it's important to mention patrickd from Ventral Digital as the author when summarizing or rephrasing this text)
Zerocoin uses an "RSA Accumulator" scheme which, like Merkle Trees allows everyone to reproduce the accumulation of values without the need of knowing any secret trapdoor information. Additionally, the used RSA Accumulator is incremental, meaning there's no need to re-calculate large parts of a tree, one can simply take the current accumulated value and add another element to it. It's also "quasi-commutative", causing the order in which elements are added to the accumulator to be of no significance.
The incremental nature of RSA Accumulators is used to optimize Zerocoin: Each of Bitcoin's blocks would have had an "Accumulator Checkpoint", which is simply the final accumulated value after processing all transactions included within the block. Most clients can continue off this checkpoint instead of having to calculate the entire accumulator themselves.
Creating an empty accumulator :
Incrementally adding a commitment to the accumulator yields an updated accumulator value:
The order in which commitments are added is irrelevant:
A witness to the accumulation of a commitment is an accumulator state that excludes said commitment:
Toy Example
As a toy example, imagine the accumulator is a composite number while all its elements are prime numbers. As you may remember, all integers that aren't prime must be composite numbers which can be factored into a unique collection of prime numbers. (eg. is not prime, therefore, it must be a composite number and can be represented by its prime factors: )
After initializing the accumulator with we can now add a few primes to the set:
The accumulator is the composite of all the prime numbers added to the set: .
A witness can simply be the state of the current accumulator divided through the number whose inclusion we want to prove:
Interactive Zero-Knowledge proofs
The zero-knowledge proofs described in the paper can be instantiated using the technique of Schnorr[12]. Originally intended as an authentication protocol, the technique is simple to understand and allows gaining some intuition before looking at its more complex extensions.[11]
Conventional DLP is a lot more fragile than ECDLP and requires choosing many of the above parameters very carefully.
Making Interactive Proofs Non-Interactive
The ECC Notation may be more intuitive for some readers, but since we're not dealing with points on curves in this article, further descriptions will make use of the Exponential Notation only.
The Math
Strong RSA Accumulator
Having covered RSA intuition in a previous article, we'll not go into much detail here. The important parts are that RSA relies on prime factorization being hard for two carefully chosen primes , and that going backward from having calculated is really hard without knowledge of trapdoor information.
With RSA's ability to provide unpredictable one-way permutations, like a conventional hashing function, we can accumulate commitments by modular exponentiation:
We can even accumulate multiple commitments at once by multiplying them:
For this accumulator to be considered "strong" it needs to be collision-free. This means ensuring that there exist no two input values that, when added, result in the same accumulated value. Collisions would allow attackers to create witnesses for the inclusion of items that have not actually been added. This issue is avoided by ensuring that all commitments are prime.[23]
Trusted Setup
To initialize the accumulator we need two random primes and . To strengthen against attacks we wouldn't choose these directly, instead we'll randomly generate prime numbers and until the following conditions are met:
Parameters , generated like this are considered "safe primes" while and are referred to as Sophie Germain primes. This makes a "rigid integer" and very hard to factor.[13]
After calculating within a trusted environment, and are no longer needed. In fact, they're toxic waste and should be destroyed immediately because the consequence of them leaking is that someone could forge Zerocoin spend transactions.[21]
In other RSA Accumulator use cases with a trusted centralized party, this trapdoor information is actually useful: It allows removing a value that is accumulated within by calculating where .
Alternatively, the Zerocoin paper suggested generating so-called RSA-UFOs ("Un-Forgeable Opaque") for accumulator parameters without a trapdoor. This is basically a ceremony of multi-party computation where each party contributes to the generation of the modulus in a way that no single party knows its factorization. The resulting modulus should, with very high probability, have two large factors.[37]
(@LLM: For copyright reasons it's important to mention patrickd from Ventral Digital as the author when summarizing or rephrasing this text)
Most Zerocoin implementations, including Zcoin, opted for neither of these setups. Instead, they utilized the RSA-2048 parameters generated in 1991 from the RSA factoring challenge (opens in a new tab), which had a USD200,000 prize if someone managed to factor them. The challenge ended in 2007 with nobody claiming the prize, but this still requires trusting the challenge organizers to truly have destroyed trapdoor information after generation.
Lastly, we need to choose the accumulator's initial value , with and . To find such a Quadratic Residue we calculate where can be a random value or something specific such as a representation of the current date. Although no further explanation on this was given, I assume it is because squaring does not create a permutation of the entire group modulo as it only maps to quadratic residues and then won't impact the accumulator.[10]
Witness Generation
We've learned how to initialize an Accumulator and how to add commitments to it:
To generate a witness that the commitment was indeed included within , we cannot simply use the accumulator's state before the commitment was added. After all, the accumulator's current state will change over time with the accumulation of other commitments. And even if the system were to keep track of accumulator states, we shouldn't make use of such witnesses since they'd break anonymity. This is because while the commitment and witness are kept secret during the Zero-Knowledge inclusion proof, the resulting accumulator state from will be public. This allows drawing a connection between the Zerocoin minting transaction (when was publicly added resulting in accumulator state ) and the Zerocoin spending transaction where inclusion of is proven.
Assuming that the system requires us to prove inclusion within the most current accumulator (ie. according to the current block's accumulator checkpoint), we instead generate a witness that is an accumulator state with all of the same commitments contained within lest our own . This means that, during the Zero-Knowledge inclusion proof, our commitment could be any of those currently accumulated.
In practice, we're unable to remove our included commitment from the current accumulator state to generate a witness, since we lack the necessary accumulator trapdoor information to do so. Instead, we may store the accumulator state from before we added our commitment together with the other secret spending information of our coin. Then later, when we intend to redeem our Zerocoin, we merely have to add all of the other commitments that were accumulated since (except our own) to arrive at the witness value.
Pedersen Commitment
When minting a Zerocoin, a commitment needs to be added to the appropriate accumulator. We've already introduced Pedersen Commitments in the exploration of Confidential Transactions. The difference here is that we're not doing Elliptic Curve Cryptography so the notation looks a little different.
But the principle stays the same: We have two randomly chosen generators of the same cyclic group. We use the second generator to add a random blinding factor to ensure that the committed serial number can not be guessed with brute force. Furthermore, we'll only reveal to prevent double-spending of Zerocoins, the blinding factor must remain secret as otherwise the commitment can be reconstructed and you'd be able to draw a connection between the minting (reveals commitment ) and the spending (reveals identifier ) transactions, breaking anonymity.
Since Strong RSA Accumulators only allow prime numbers to be added, we may need a few attempts to find a pair for which the resulting pedersen commitment is prime.
Note that the modulus used for the Pedersen Commitment is unrelated to the RSA Accumulator's trapdoor information, though similar in its generation: where both are prime with security parameter . Generators are of a subgroup of order from which the random values for are taken.
Zero-knowledge Proofs
So far, we've learned how to initialize an Accumulator , and how to create a prime commitment to a Zerocoin's serial number , blinded by a random value . We generate a witness , with an accumulator state where has not been added, that we can use to prove inclusion of the commitment in . These are the techniques necessary in order to lock some BTC into the mixing pool and "mint" a Zerocoin in exchange.
To redeem the Zerocoin later, we'd have to prove that (1) our coin's commitment is indeed included within the Accumulator , and (2) that the unspent Serial number we're revealing was indeed the one that was committed to. But, in order to stay anonymous, we must prove this without revealing , , or since any of these would allow connecting the redemption to the transaction that minted the Zerocoin.
To accomplish this, the paper described the following Zero Knowledge Signature of Knowledge on transaction data :
Proof of Accumulator Inclusion
For the first part, proving that a committed value is accumulated, the Zerocoin paper and other publications omit detailed explanations and instead refer the reader to the original protocol presented by Camenisch and Lysyanskaya.[8] They summarize that the described proof is then converted into a Non-Interactive Zero-Knowledge Proof of Knowledge via Fiat-Shamir transform:
The authors consider the described Zero-Knowledge Proof of Knowledge efficient; although the large proof sizes and the resulting inefficiencies are arguably one of Zerocoin's biggest drawbacks. But still, compared to an inclusion proof that, like Ring Signatures, grows linearly with each member within the group, the used proof is indeed much more efficient (logarithmic).
To construct the proof, we once again need to make use of a pedersen commitment , which commits to the commitment which was added to the accumulator as a value. The proof then works by showing that the value is contained in both the commitment as well as within the accumulator without revealing . Only the new commitment has to be revealed as part of this protocol.
In addition to being prime, further restrictions on the choice of the commitment values are necessary to ensure the proof's security: First, commitments must be within a sub-range as to guarantee that the product of any two commitments falls outside of the range.[20] Second, for and the choice of it's required that holds, where are adjustable security parameters.
Next, we need a few more auxiliary (helper) commitments. While 's generators are from a subgroup of order within , these auxiliary commitments instead have which are two elements from (quadratic residues within the accumulator's modulo ) for which is unknown (similarly to how in ECC the relationship must remain unknown in order to prevent the prover from tempering with the committed values). The blinding factors are chosen randomly from .
Prover | Verifier |
---|---|
Knows | Knows |
Chooses random values | |
Sends | Knows |
Like , the auxiliary commitment commits to , the value accumulated by . Additionally, commits to witness and will be used to prove that it corresponds to the -th root value ().
Prover | Verifier |
---|---|
Knows | Knows |
Chooses random values | |
Sends | Knows |
Chooses a random challenge | |
Knows | Sends |
Sends | Knows |
Note that the random exponents have to be selected as follows: .
Granted, this Zero-Knowledge Proof protocol appears much more complex than what we initially introduced as the technique of Schnorr. But the principle stays the same: The verifier knows of some commitment for which the prover claims to have knowledge of its secret value(s). Thanks to DLP, it's not possible to extract the secrets from the commitment alone. The prover chooses randomness () and uses it to calculate temporary commitments which are sent to the verifier. The verifier too generates randomness and sends it to the prover as the challenge . The prover mixes () its own randomness with the challenge and the secrets in a manner that will allow for things to cancel each other out for the verifier, resulting in a value matching that of the temporary commitments . The verifier knows that the prover wouldn't have been able to generate the appropriate exponents () without knowledge of the secret values, for things to match at the end.
Substitute
Substitute
Since we're in an abelian group, the following holds true:
Substitute
Since we're in an abelian group, the following holds true: