The Privacy Questions Raised by Blockchain
Law360
To start, we will adopt the definition from the National Institute of Standards and Technology Blockchain Technology Overview:[1] “Blockchains are tamper evident and tamper resistant digital ledgers implemented in a distributed fashion (i.e., without a central repository) and usually without a central authority (i.e., a bank, company, or government).” Next, we will outline the fundamentals of privacy vs. information security so we can work through the nuances of each. In simple terms, privacy is the capability to choose whether information is disclosed to others and to determine how it is used. The highest degree of privacy for an element of information would be if its owner had complete control over its dissemination and use for as long as the information exists. In practical applications, this level of privacy is often impossible to achieve due to the limitations of technology and the demands of business needs.
The enforcement of privacy is how we will define security. It encompasses the mechanisms used to ensure the confidentiality, integrity and availability of information. The strongest level of security would guarantee that the information would be disclosed only to those who should access it, its integrity would be ensured at all times, and the information would be available to be used precisely as defined by the owner.
Now that we have defined a general framework we can dive into a deeper discussion of privacy and security implications of blockchain technology. This analysis will focus predominantly on the cryptocurrency use case because cryptocurrency is the easiest to study, having had real-world implementation for a decade. Also, key features associated with blockchain technology are all present in cryptocurrency in that they are “tamper evident and tamper resistant digital ledgers implemented in a distributed fashion (i.e., without a central repository) and ... without a central authority (i.e., a bank, company, or government).”
Blockchain Privacy
Blockchain has some fundamental privacy problems by virtue of its design. Specifically, the distributed aspect of a blockchain means that each full node that processes transactions and builds the blockchain necessarily has access to the blockchain transaction data itself. In a cryptocurrency like bitcoin, this means that the blockchain is publicly available and every transaction can be traced back to the genesis block. Bitcoin is said to be “pseudonymous,” which means that it has “[d]ata points which are not directly associated with a specific individual [but where] multiple appearances of [a] person can be linked together.”[2]
A recent study by the New York Times described how enough pseudonymous location data can make identification of the individual trivial.[3] This is a big concern for blockchain for a few reasons. Unlike the app data cited in the Times article, blockchain data is open for scrutiny by everyone, including every malevolent criminal looking to exploit information for financial gain. Also, the immutable record of the blockchain exacerbates this problem. Once attributed to an individual through any means, a lifetime of pseudonymous transactions will be permanently exposed as linked to that person.
The public nature of the blockchain allows opportunities for identification. An article on bitcoin investigations from the Journal of Forensic Research illustrates many of them.[4] One way is to monitor the communications between nodes on the blockchain, which can associate transactions and internet protocol addresses. Applications have been developed to perform these types of analyses on public blockchains. In addition, although not public, cryptocurrency wallet software can be forensically analyzed even without the passphrases or keys that are needed to use the wallet.
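The linkage risk described above can be illustrated with a minimal Python sketch. It assumes a toy public ledger of sender, receiver and amount records; the `Tx` type, the addresses and the name "Alice" are all made up for illustration and do not come from any real blockchain. The point is only that a single attribution of a pseudonym exposes every transaction in which that pseudonym appears.

```python
from collections import namedtuple

# Hypothetical, simplified ledger record; real blockchains store far more detail.
Tx = namedtuple("Tx", ["sender", "receiver", "amount"])

# Toy public ledger using made-up pseudonyms (illustrative data only).
LEDGER = [
    Tx("addr_A", "addr_B", 5),
    Tx("addr_B", "addr_C", 2),
    Tx("addr_A", "addr_D", 1),
    Tx("addr_E", "addr_A", 7),
]

def history_for(pseudonym, ledger):
    """Return every transaction in which the pseudonym appears."""
    return [tx for tx in ledger if pseudonym in (tx.sender, tx.receiver)]

def deanonymize(attributions, ledger):
    """Map each known real-world identity to its full transaction history."""
    return {person: history_for(addr, ledger)
            for addr, person in attributions.items()}

# One attribution (for example, learned from an exchange record)
# exposes the entire history tied to that pseudonym.
exposed = deanonymize({"addr_A": "Alice"}, LEDGER)
print(len(exposed["Alice"]))  # prints 3
```

Because the ledger is public and permanent, the same lookup works for anyone, at any later time, once the attribution is known.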
The most high-profile use of these techniques was the arrest of Ross Ulbricht for operating the deep web site “Silk Road,” which was a marketplace for illegal drugs, among other things. The techniques allowed law enforcement to identify Ulbricht as the operator of Silk Road. More interestingly, an IRS special agent was also able to track bitcoin transactions to determine that a U.S. Drug Enforcement Administration agent involved in the investigation, Carl Force, was laundering bitcoin related to Silk Road.
As can be seen from this discussion, a blockchain like bitcoin provides very little privacy protection. Aside from the public scrutiny, individuals who rely on intermediaries such as an exchange like Coinbase are subject to having their identities exposed by the exchange, such as when the IRS demanded Coinbase turn over certain records involving cryptocurrency transactions.[5] Although privacy is often the focus when weaknesses of blockchain are discussed, the technology has some comparably problematic security issues as well.
Blockchain Security
While an internet search will generate many results proclaiming blockchain as revolutionizing cybersecurity, such articles typically either are so high-level that their claims cannot be reasonably evaluated or focus on one aspect of security while ignoring others. To properly evaluate security we need to consider a technology’s ability to ensure the confidentiality, integrity and availability of information. We already addressed shortcomings of blockchain relating to confidentiality in the privacy section above. Therefore, this section will focus on availability and integrity.
Availability
The distributed nature of blockchain is fairly strong at ensuring availability, a remarkable achievement for an untrusted decentralized network. With respect to bitcoin, there have been some denial-of-service attacks, but the significant ones have only occurred against exchanges, not against the network nodes themselves. These are similar to attacks on typical publicly accessible websites and not indicative of a problem unique to blockchain, so we will not focus on them here. Beyond DoS attacks there are some other ways availability could be impacted.[6] There may be other possibilities, but these all appear to be academic as applied to the bitcoin network, because there is no evidence that any of them has been carried out to cause significant disruption.
Integrity
Those who conclude that blockchain will revolutionize cybersecurity often address integrity with the statement that blockchain is immutable and ensures integrity with near perfection. On one hand, if we focus on just the network itself, this is true. For example, if you view a transaction in the bitcoin network in a past block, particularly an older block, you can be essentially 100 percent sure that the transaction was valid at the time (e.g., the sender had the private key to authorize the transaction and had a sufficient amount of bitcoin to make the transaction). However, for any practical implementation, integrity must also consider the validity of the transaction itself. This means protecting against fraudulent or mistaken transactions as well as preventing inadvertent loss.[7]
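The network-level integrity described above rests on each block committing to the hash of the block before it. The following Python sketch shows that one idea under simplifying assumptions: it omits consensus, proof of work and digital signatures entirely, and the function names and sample records are hypothetical. It demonstrates tamper evidence only; it says nothing about whether a recorded transaction was fraudulent or mistaken in the first place.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, prev_hash, data):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "data": data}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def build_chain(records):
    """Build a list of blocks, each committing to its predecessor's hash."""
    chain, prev = [], GENESIS_PREV
    for i, data in enumerate(records):
        h = block_hash(i, prev, data)
        chain.append({"index": i, "prev": prev, "data": data, "hash": h})
        prev = h
    return chain

def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev"] != prev:
            return False
        expected = block_hash(block["index"], block["prev"], block["data"])
        if block["hash"] != expected:
            return False
        prev = block["hash"]
    return True

chain = build_chain(["A pays B 5", "B pays C 2", "C pays D 1"])
untampered_ok = is_valid(chain)     # True: every link checks out

chain[1]["data"] = "B pays C 200"   # alter a past record in place
tampered_ok = is_valid(chain)       # False: the stored hash no longer matches
```

Note that `is_valid` would accept "A pays B 5" just as readily if A had been defrauded into sending it, which is the gap the rest of this section addresses.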
Blockchain’s immutable record, which is often touted as demonstrating its superior integrity, conversely provides absolutely no defense against fraud or mistake. The bitcoin implementation resoundingly demonstrates this severe integrity issue. In July 2018, a cryptocurrency exchange engineer estimated that 4 million bitcoin were lost and another 2 million were stolen, when at the time fewer than 18 million bitcoin existed. This means over one-third of all bitcoin were compromised. Keeping in mind that bitcoin is the prototypical blockchain implementation and the most robust use case, this is a staggering number, and a far higher loss rate than anyone would tolerate in designing a system handling critical transactions.
For anyone who has interacted with the blockchain directly, this integrity issue is not hard to understand. In its truest decentralized form, the user assumes 100 percent of the responsibility for securing what in the cryptocurrency context is a form of wealth. But even worse than keeping cash under a mattress, with cryptocurrency it all hinges on digital keys that when accessed can be used to take that wealth right out from under someone, or can render it inaccessible if the keys are lost. And due to the uncompromising immutability of the blockchain coupled with the lack of centralized authority there is absolutely no recourse. Another factor that makes the one-third figure even worse is the fact that to date those who interact directly with the bitcoin blockchain are undoubtedly at the higher end of technical and computer proficiency. One can imagine the number would only get much worse if the system were used across the whole population.
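The one-third figure follows directly from the estimates quoted above. As a quick check in Python, using only those three numbers (the variable names are illustrative):

```python
# Estimates quoted in the text (July 2018).
lost = 4_000_000
stolen = 2_000_000
supply_upper_bound = 18_000_000  # fewer than this many bitcoin existed

# Dividing by the upper bound on supply gives a lower bound on the fraction.
min_fraction = (lost + stolen) / supply_upper_bound
print(f"{min_fraction:.1%}")  # prints 33.3%
```

Because the actual supply was below 18 million, the true compromised fraction implied by these estimates is somewhat above this lower bound.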
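The dependence on keys described above can be sketched with a toy ledger in Python. This is a deliberately simplified model, not how bitcoin works: real cryptocurrencies use public-key signatures, whereas here knowledge of a secret string stands in for possession of a private key, and `ToyLedger`, `address_of` and all values are hypothetical. The sketch shows the two failure modes discussed: a thief who obtains the secret is indistinguishable from the owner, and an owner who loses it has no recovery path.

```python
import hashlib

def address_of(secret):
    """Derive a toy 'address' from a secret (a stand-in for a key pair)."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()[:12]

class ToyLedger:
    """Append-only ledger with no administrator, reversal, or recovery path."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.log = []  # entries are only ever appended

    def transfer(self, secret, to_addr, amount):
        """Move funds if the caller knows the secret; identity is never checked."""
        sender = address_of(secret)
        if self.balances.get(sender, 0) < amount:
            return False
        self.balances[sender] -= amount
        self.balances[to_addr] = self.balances.get(to_addr, 0) + amount
        self.log.append((sender, to_addr, amount))
        return True

owner_secret = "correct horse battery staple"  # hypothetical key material
owner = address_of(owner_secret)
thief = address_of("thief secret")
ledger = ToyLedger({owner: 10})

# Theft: whoever holds the secret can spend, and the ledger has no undo.
stolen_ok = ledger.transfer(owner_secret, thief, 6)

# Loss: without the exact secret, the remaining balance cannot be moved.
guess_ok = ledger.transfer("forgotten passphrase", owner, 4)
```

In a conventional system, a bank or administrator could freeze, reverse or reissue; the toy ledger, like the decentralized design it imitates, offers no such method to call.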
Shifting Security to the End User
At a high level, the combination of the decentralized nature and immutability of the blockchain shifts essentially all of the integrity responsibility to the end user because there are no checks against fraud, mistakes, or loss. The old computing adage “garbage in, garbage out” seems particularly applicable here. The integrity of any real-world implementation is measured by its output, which in the cryptocurrency implementation is tied 1:1 to the integrity of its transaction inputs. If the system is constructed such that it facilitates fraudulent transactions[8] and provides no mechanism of redress, it has deep integrity flaws.
Trade-Offs
This article focuses on cryptocurrency because it is the quintessential implementation of blockchain and its analysis demonstrates that the benefits of blockchain come with some significant privacy and security drawbacks. A seemingly pervasive occurrence in blockchain discussion is the focus on the core benefits without addressing whether the attendant privacy and security problems can be reduced to a suitable risk tolerance. Another frequent occurrence is an attempt to address some of the privacy and security problems by proposing “trade-offs” of aspects of the technology, without acknowledging that those trade-offs remove some of the “revolutionary” blockchain benefits and begin to make the technology less distinguishable from legacy database technology. And finally, it is almost impossible to find any description of a use case that addresses all of these issues and then compares the all-in cost of implementation to legacy technologies that perform similar tasks with objectively comparable metrics.
With respect to the first issue of focusing on the core benefits and not providing equal consideration of the challenges, one possibility is that there are some irreconcilable problems that cannot be addressed. In January 2016, Ethereum co-founder Vitalik Buterin wrote a blog post laying out some theoretically possible technical approaches to blockchain privacy, which range from “horrendously inefficient” to more feasible for “private blockchains” to one that is “private by default” but leaks information to the outside world when there is a dispute.[9] One takeaway from Buterin’s post is that if this privacy issue can be suitably addressed, the solution will likely be far more complex than even blockchain itself.
Beyond the privacy paradox, few seem to acknowledge that there is a similar security incompatibility due to the inherent reliance on end users for integrity of transactions. An example of this can be found in a whitepaper on self-sovereign identity published in October 2018 by Blockchain Bundesverband.[10] In describing the “core capabilities,” the paper states “we assume that all individuals and entities as Identity Owners have one or more [digital identities], with a unique set of private keys that control each [digital identities].” In an appendix, the paper identifies the security risks, stating that “[i]f an attacker is able to get access to an identity (for example by phishing attacks) the full control over this identity is lost [and t]his may result in huge damage to the identity subject.” Considering the “huge damage” that can occur, if, as is the case with bitcoin, there is a double-digit fraud and loss rate of the keys used to secure the identities, the system will fall far short of being workable. While the paper describes aspirational features that need to be implemented, such as a “highly secure and usable key management,” there is no detailed proposal on how to accomplish this or carry out other requirements specified in the paper, such as that the subject must be “empowered to regain control over their identity.” The most pertinent questions of how, and perhaps more importantly whether, these goals can be achieved are not addressed.
Another common theme of blockchain commendation involves acknowledging a possible trade-off in response to a privacy or integrity concern while not addressing the full implications of the trade-off. An example is invoking “private permissioned blockchain” as a response to privacy concerns of public blockchain. Making a blockchain permissioned necessarily means introducing a controlling authority and a layer of credential management. This introduces the need to trust that authority, adds centralized privacy and security vulnerabilities, and runs counter to the trustless distributed nature of a blockchain that is often cited as the key novel feature.[11] A privately controlled blockchain cannot guarantee immutability the way a public one like bitcoin can because the entity or entities controlling it could manipulate the consensus mechanism. Likewise, it introduces security and privacy vulnerabilities because the compromise of a controlling entity (externally or from an insider) can compromise the integrity of the entire system. At some point, with enough trade-offs, the question becomes whether this purportedly revolutionary technology is any different from existing technology at all.[12]
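The immutability concern with a privately controlled chain can be sketched in Python. The example assumes the extreme case of a single hypothetical operator who controls every copy of a simple hash-linked log, with no proof of work and no independent validators; the function names and the sample records are made up for illustration. Under those assumptions, the operator can edit a past record and rebuild the chain, and the rewritten history passes the same internal checks as the original.

```python
import hashlib

def link(prev_hash, data):
    """Hash a record together with the previous block's hash."""
    return hashlib.sha256((prev_hash + "|" + data).encode("utf-8")).hexdigest()

def build(records):
    """Return (data, hash) blocks, each chained to its predecessor."""
    blocks, prev = [], "genesis"
    for data in records:
        prev = link(prev, data)
        blocks.append((data, prev))
    return blocks

def verify(blocks):
    """Check that every stored hash matches its record and predecessor."""
    prev = "genesis"
    for data, stored_hash in blocks:
        if link(prev, data) != stored_hash:
            return False
        prev = stored_hash
    return True

original = build(["lot 17 shipped", "lot 17 inspected: FAIL", "lot 17 recalled"])

# The operator edits history and recomputes every later hash.
rewritten = build(["lot 17 shipped", "lot 17 inspected: PASS", "lot 17 recalled"])

internally_consistent = verify(rewritten)               # True
head_changed = original[-1][1] != rewritten[-1][1]      # True
```

The rewrite is detectable only by someone who independently kept the original head hash, which is another way of saying that participants must trust, or audit, the controlling entity.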
When it comes to cryptocurrency blockchains like bitcoin, there is little question that they were transformative in the sense that no other technology solution exists where an untrusted decentralized network of self-interested actors can be incentivized to cooperate to facilitate transactions. The purpose of discussing the trade-offs in the preceding paragraphs is that when trade-offs are made for a particular use case this novelty is eroded, and there are likely other existing technologies that can readily perform the same functionality. A private permissioned blockchain implementing transactions within a single organization is functionally indistinguishable from currently implemented distributed databases. In such a case, the evaluation of new technology should be a relative comparison of costs and benefits to existing technology.
A widely publicized use case involves a large retailer that implemented blockchain to track its food supply. There is no question that the technical capability to implement such a system without blockchain existed long before, as demonstrated by shipping companies’ ability to track packages. By all accounts the blockchain implementation was an enormous undertaking, which included “years” of testing. But none of the articles about it shed any light on the cost of this project and how (or if) a blockchain implementation was better than alternatives by any metric.[13] While the vendor has made some statements about how this solution can reduce the cost of a recall,[14] those statements do not include any analysis of how alternative solutions could have provided similar results, perhaps with less complexity, shorter development time, and at a lower price point.
Conclusion
The most obviously novel and transformative use case of blockchain, cryptocurrency, has fundamental flaws in practical implementation from both a security and privacy perspective. If solutions to these problems are to be found, they will not come from touting the purported novelty of blockchain without addressing the privacy and security issues head-on. Hopefully this article provides a framework to further this discussion.
The complete article, "The Privacy Questions Raised by Blockchain," first appeared on law360.com on January 14, 2019.
[1] The NIST BTO, published in October 2018 and available at https://nvlpubs.nist.gov/nistpubs/ir/2018/NIST.IR.8202.pdf, is a comprehensive primer on blockchain technology.
[2] See IAPP Glossary, available at https://iapp.org/resources/glossary/
[3] Your Apps Know Where You Were Last Night, and They’re Not Keeping It Secret, Dec. 10, 2018, available at https://www.nytimes.com/interactive/2018/12/10/business/location-data-privacy-apps.html
[4] Bitcoin Investigations: Evolving Methodologies and Case Studies, May 14, 2018, available at https://www.omicsonline.org/open-access/bitcoin-investigations-evolving-methodologies-and-case-studies-2157-7145-1000420.pdf
[5] See IRS Notification available at https://support.coinbase.com/customer/portal/articles/2924446-irs-notification
[6] One way is isolating a particular user through a so-called “Sybil” attack (creating multiple fraudulent nodes) or “Eclipse” attack (monopolizing all in-bound and out-bound connections), which could be used to isolate a particular node by either disconnecting them from the network or possibly providing fraudulent data.
[7] This could be cast as an availability issue as well, but for the purpose of this article we will consider permanent inaccessibility to be a loss and therefore implicating integrity the same way as if data was deleted or corrupted.
[8] While the term “facilitates fraudulent transactions” may seem a bit overstated, it is defensible when considering that in any non-blockchain implementation there are layers of mechanisms and centralized controls dedicated to mitigating the same concerns. Even in the typical financial institutions with layers of protections in place, the FBI identified business e-mail compromise used for fraudulent money transfers as the most significant internet crime by victim loss in 2017.
[9] See Privacy on the Blockchain, available at https://blog.ethereum.org/2016/01/15/privacy-on-the-blockchain/
[10] Self-sovereign Identity, A position paper on blockchain enabled identity and the road ahead, October 23, 2018, Published by the Identity Working Group of the German Blockchain Association, available at https://www.bundesblock.de/wp-content/uploads/2018/10/ssi-paper.pdf (hereinafter “SSI Paper”).
[11] See, e.g., NIST BTO at 18 “A key feature of blockchain technology is that there is no need to have a trusted third party provide the state of the system. . . .”; see also Bitcoin Investigations, supra n. 4, “Blockchain technology is being considered separately for many applications, but its intrinsic function and value is in providing a mathematically demonstrable means of settling transactions between parties that do not trust each other without resorting to a ‘trusted third party’.”
[12] As noted in the SSI Paper, supra n. 10, “private permissioned” blockchains have “limited difference to distributed databases.”
[13] This point has been made by some other commentators as well. See https://medium.com/@jonelcordero/walmart-ibms-supply-chain-blockchain-is-missing-the-point-c4685b4939de (“However why not just use a string of databases at this point? If the idea is simply food trace-ability that is not trustless or verifiable, supply chain users could at this point just take their current database structures and offer permissions to other users, allow multiple input validation, multiple copies and append-only writes and logs of all people accessing it.”)
[14] See https://www.forbes.com/sites/rachelwolfson/2018/07/11/understanding-how-ibm-and-others-use-blockchain-technology-to-track-global-food-supply-chain/#3855b7f62d1e (“the use of blockchain technology can reduce the cost of the average product recall by up to 80%.”)