Corda 5: Restating the Corda vision

July 08, 2022

By: Dr. Kat Baker, Head of Software Engineering

Following the open sourcing of Corda 5, which we wrote about here, let’s dig into some of the thinking guiding its development.

Where we’ve come from

Imagine a world where two people in completely different geographic and political locations could agree to something and see the results unfold simultaneously, with no slow reconciliation and no business processes involving faxes as if the 1980s never ended. A world where data can actually be owned, and shared, not licensed through some central party with overarching control. Imagine your industry running on a single shared, yet secure, database.

That was the founding vision of Corda. Five years on, how are we getting on and what have we learned?

Corda was founded on a single vision, laid out in its original statement of intent. As well as a technical description of a platform, this was a rallying cry to Corda developers to get out into the world and be disruptive.

Half a decade later, it’s time to revisit this founding vision. We will bring in the lessons we have learned, account for changes in the rapidly growing industries we serve, and then restate our mission loudly with a new collective understanding of what is possible.

Let’s take a look at these foundational words:

A distributed ledger made up of mutually distrusting nodes would allow for a single global database that records the state of deals, obligations, and other agreements between institutions and people. This would eliminate much of the manual, time-consuming effort currently required to keep disparate ledgers synchronized with each other. It would also allow for greater levels of code sharing than is currently typical in the financial and other industries, reducing transaction costs for everyone.

Richard Gendal Brown, August 2016

A single global database…

A core part of that original vision is summed up in the phrase “a single global database”. Picture an ever-evolving spectrum of intertwining states where the devolved evolution would be the driving force of Corda and the industry. Applications exist at the periphery of this infrastructure as an important enabler, but they do not sit at its heart. Indeed, there is no mention of the orchestration of things in the above statement of purpose.

…with applications at its heart

Over time, we have been taught by our community that we must place business logic, encapsulating workflows, at the heart of what Corda is, with each chain constrained by some business context and operating effectively as a separate schema within that global database. Indeed, as the DLT space expands, there is no forcing function that restricts these schemas to the same underlying technology. Interoperation should mean removing the platform semantics from the ability to interoperate: what matters is the data, not whether it is on Ethereum or Corda. Operations across chains are constructed as independent actions by the same entity operating in two contexts. Entities, whether people or organizations, have identities that match the context in which they are transacting and interacting.

This allows each network, or chain, to set its own rules around governance, covering both who can join and interact and which other networks can interact with it, and how. A Corda network in our new vision must be application-centric.

…that isolates nobody.

No chain is, or should be, an island. Yes, isolation is important (and often a key regulatory requirement), but that choice must be opt-in and reversible. However, the capacity for the interconnection of different networks should be provided in such a way that allows for the reasoning about that interconnection without overburdening either side with operational details of the other or exposing information that should stay within the confines of one network.

In today’s world interoperability is paramount. Nobody wants to take the risk of being “trapped” by a single vendor or technology choice, or to adopt one whose features do not lend themselves to the problem domain, especially where there cannot be one tech choice to rule them all. For some use-cases, the public, permissionless model of a blockchain makes perfect sense.

Returning to the global database concept: whether designed in at the outset of an application or added later, each schema should expose the ability to mutate it safely using an identity trusted within its identity domain, rather than allowing a third party to “root around” and hope they don’t break any rules that aren’t even visible to them.

Explicit composability

The next iteration of the Corda Architecture places the solution to this composability problem at the center of its design. Whether it is Corda to Corda or Corda to another platform, the ability to reason about the interactions and compose an application able to live in both worlds using a well-defined API is key.

Let’s take, as an example, the task of purchasing racehorses for some digital currency, and contrast how this would be done in the existing version of the Corda architecture and that which is upcoming.

Alice and Bob, our favorite crypto couple, are set to trade: Alice will sell her “fine filly” to Bob. Conceptually this is a very simple exchange where each party is ostensibly happy with the value they receive from the other; of course, whether Alice’s horse is a future Derby winner or an old nag is something a DLT cannot help Bob ascertain.

Why horses

Alice, bless her, is well-known amongst the horse-racing fraternity, so a digitally signed document from her announcing that Bob is now the owner of her horse is expected to be accepted by the stable when Bob rocks up with it. It does seem par for the course when talking about DLTs and blockchains to use expensive, “big ticket” items as the example, but in reality, anything would serve equally well.

The crux of the matter is that it is those digital property rights that really matter: the ability for someone to “own” something and do what they want with it (under the terms of the smart contract) in a digital realm, such as selling it to someone else who has the equal ability to transfer value in return. They don’t hold a license to it, there is no central authority that allows them to transact it, and that digital information over which they hold sovereignty can be enriched by other data and other signatures, all in a cryptographically secure manner.

Horses are just the new black!

Buying a racehorse in Corda 4

Alice and Bob belong to a single network (indeed, the concept of network interop doesn’t exist). Alice proposes a transaction with Bob for the bilateral exchange of a “horse” (linear state) for some KatCoins (fungible state). We presume that the network has an issuer of those coins and that they are making their Contract Types available for other developers to use (of course, integrating and managing the lifecycle of that code is a challenge left to the integrator). Alice and Bob either discover an application that allows for the transacting of horses for coins or they write their own, with Bob being content to allow Alice to self-issue her pony as a token onto the ledger.

Bob accepts the transaction, which is atomically completed through the signature of some globally whitelisted notary, with the results communicated to all participants. Whilst that atomicity makes for a simple resolution or failure of the transaction in that moment, it inherently intertwines the two blockchains of the different states.
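To make the atomic exchange concrete, here is a minimal sketch in plain Python. It is not the Corda 4 API: the `State`, `Ledger`, `notarise`, and `swap` names are hypothetical and exist only for illustration. It assumes a single ledger and a notary whose only job is to reject any transaction that spends an already-consumed state, so the horse and the coins move together or not at all.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class State:
    """A ledger state: what it is, how much of it, and who owns it."""
    kind: str    # "horse" or "KatCoin", as in the example above
    amount: int
    owner: str


@dataclass
class Ledger:
    """Hypothetical single-network ledger of unconsumed states."""
    unconsumed: set = field(default_factory=set)

    def notarise(self, inputs, outputs, signers):
        """Commit every input and output together, or nothing at all."""
        # Every current owner of an input must have signed.
        if {s.owner for s in inputs} - set(signers):
            raise ValueError("missing signature")
        # Reject the whole transaction if any input is already spent.
        if not set(inputs) <= self.unconsumed:
            raise ValueError("input already consumed")
        self.unconsumed -= set(inputs)
        self.unconsumed |= set(outputs)


def swap(ledger, horse, coins):
    """Build one transaction that moves the horse and the coins together."""
    outputs = [
        State(horse.kind, horse.amount, coins.owner),  # horse goes to the buyer
        State(coins.kind, coins.amount, horse.owner),  # coins go to the seller
    ]
    ledger.notarise([horse, coins], outputs, [horse.owner, coins.owner])
    return outputs


horse = State("horse", 1, "Alice")
coins = State("KatCoin", 100, "Bob")
ledger = Ledger({horse, coins})
new_horse, new_coins = swap(ledger, horse, coins)
print(new_horse.owner, new_coins.owner)  # prints: Bob Alice
```

Note that the new horse state and the new coin state now share a transaction in their histories, which is the intertwining described above.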

Conceptually, this is very simple and easy to reason about, yet a number of issues arise at scale. Firstly, that global network does not account for the sharding of identities across various personas, nor does it allow for individual chains to specify which rules govern their evolution (in terms of notaries, consensus parameters, etc).

Additionally, we are entwining blockchains, which makes the resolution of validity extremely difficult at scale, but on a more pragmatic level, it means the mingling of Java code by many different vendors in an application whose lifecycles and quality assurance require management.

Buying a racehorse in Corda 5

Corda 5 posits that there is no need to intermingle back-chains. Indeed, a single chain can be thought of as its own application-level network with its own rules. Exchanging one state for another becomes a cross-network swap enabled through APIs and the concept of “vicarious trust” within a single entity universe. Essentially, if I have identities in many chains, then from one chain I can instruct my identity in another to do something, through an API the platform provides for this purpose.

This adds strong versioning, simple lifecycle management, devolved identity authentication, and easy back-chain verification at the expense of much more complex non-atomic operations on multiple chains. However, this becomes a platform concern, which means the platform, Corda, will be doing the heavy lifting instead of the application writer. It is a clear case where complexity in the platform allows for a simpler end-user experience, which is ultimately why platforms exist and are adopted.

Continuing our above example, Alice and Bob each have two identities (for Alice, these are Alice-Horse and Alice-Cash), having joined each network and provided to the operator(s) sufficient material to prove their identity matches the real-world entity. A transaction is proposed within the horse network, with Bob instructing himself in the cash network to pay Alice.

There is no need for the horse network to understand cash or to be able to model it, only to interact with it. Effectively Bob’s identities are able to vicariously trust instructions from one another through a well-defined API such that instructions can be issued to one’s self to interact with some other entity.
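The shape of that cross-network swap can be sketched in plain Python. None of this is the Corda 5 API: `Network`, `Entity`, `identity_in`, and `buy_horse` are hypothetical names, and the refund in the `except` branch is an assumed way of handling a failed second step, included only to show why non-atomic swaps need extra care that the platform would have to take on.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical application network; it only knows its own asset."""
    name: str
    balances: dict = field(default_factory=dict)  # identity -> units held

    def transfer(self, sender, receiver, units):
        if self.balances.get(sender, 0) < units:
            raise ValueError(f"{sender} cannot cover {units} in {self.name}")
        self.balances[sender] -= units
        self.balances[receiver] = self.balances.get(receiver, 0) + units


@dataclass
class Entity:
    """A real-world party with one identity per network it has joined."""
    name: str

    def identity_in(self, network):
        # Naming follows the example above: Alice-Horse, Alice-Cash.
        return f"{self.name}-{network.name}"

    def join(self, network, units=0):
        network.balances[self.identity_in(network)] = units


def buy_horse(horse_net, cash_net, seller, buyer, price):
    """Two independent transfers, linked only by the buyer's own identities."""
    # Step 1: the buyer instructs its own identity in the cash network to pay.
    cash_net.transfer(buyer.identity_in(cash_net), seller.identity_in(cash_net), price)
    try:
        # Step 2: the horse changes hands inside the horse network only.
        horse_net.transfer(seller.identity_in(horse_net), buyer.identity_in(horse_net), 1)
    except ValueError:
        # Assumed compensation: the swap is not atomic, so undo the payment.
        cash_net.transfer(seller.identity_in(cash_net), buyer.identity_in(cash_net), price)
        raise


horse_net, cash_net = Network("Horse"), Network("Cash")
alice, bob = Entity("Alice"), Entity("Bob")
alice.join(horse_net, units=1)
alice.join(cash_net)
bob.join(horse_net)
bob.join(cash_net, units=100)
buy_horse(horse_net, cash_net, alice, bob, price=100)
print(horse_net.balances)  # {'Alice-Horse': 0, 'Bob-Horse': 1}
print(cash_net.balances)   # {'Alice-Cash': 100, 'Bob-Cash': 0}
```

The horse network never touches a cash balance and the cash network never sees a horse; each keeps its own history, and the only link between the two transfers is that the same entity stands behind both identities.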

Applications are at the heart of it all.

Each schema added to the global database is not there by random chance. It is put there by some entities for a purpose, and that purpose is expressed as “an application”. Applications embody a set of states that can be mutated by onboarded members through workflows. Applications also set the rules by which their member identities can be interacted with by other facet identities of the owning entity.

Applications bound the concept of a CorDapp. They allow for the operation and execution of “the purpose” to be as distributed or centralized as needed to solve the problem and as will be tolerated by those wishing to use it.

Scale up, distribute out

Corda 5 is about starting small, scaling as needed, and facilitating distribution and federation when they are required.

Corda Architecture Version 1.0’s biggest shortcoming was its presumption that all entities would embrace true decentralization overnight. The design of the Corda node is built around the idea that each member of a network wants to manage their own participation and, critically, is able to technically and logistically do so.

Early adopters building application networks on top of Corda highlighted the error in our thinking here. Their own customers were not ready to take on the challenge of hosting their own infrastructure. Additionally, traditional businesses, whilst seeing the huge benefits DLT technology could bring from the removal of reconciliation or “middle-men” in a process, were equally unprepared to decentralize their own control to their customers.

Progressive decentralization

Corda 5 at its very roots embraces the concept of progressive decentralization. Its clustered architecture allows a single entity to manage many identities within a single, logical, compute environment whilst at the same time not preventing those identities from migrating to their own environment later. The technology solutions are there when users and industries are ready for them, and in the meantime it becomes the perfect environment to explore what DLT offers. We are no longer asking people to decide at the outset of a project if it is decentralized or not, which is an extremely powerful thing as it de-risks projects and enables things to get off the ground that might otherwise not have.

Through the identity isolation and P2P layers of the tech stack, those identities, along with their subset of an application chain, can be migrated seamlessly. The boundaries between self-hosted, remotely hosted, centralized, and managed vs devolved and sovereign become deployment issues rather than fundamental choices about trust boundaries that must stay inviolate through the lifetime of a chain.

Start small, start local, start centralized, and over time flow outward as needed without the limitations of reinventing and resolving your business problem at each step. Yet, that solution is still leveraging R3 DLT technology to move toward DLT-backed solutions to industry problems; Corda is still Corda.

Not all loads are equal

Finally, Corda 4’s architecture, in its pure decentralized focus, assumed all entities within a network would be roughly equal in their participation. The fixed costs associated with being “in” a network and running a Corda node mean the barrier to entry is extremely high, especially if the management of infrastructure for identities is being undertaken by a single entity in some hosted fashion.

Corda 5’s virtual nodes and scalable worker architecture allow thousands of identities to be hosted in a single logical compute stack, yet that stack can shrink to nothing should no actual activity by those identities be occurring. Corda 4 was never envisaged as needing to start up and shut down on demand; Corda 5 delivers this as an innate part of the architecture. This will result in a massive reduction in TCO where individual identities are not transacting at a high rate, allowing efficient sharing of computing resources.
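The cost argument can be illustrated with a back-of-the-envelope sketch in Python. The function names and the capacity figure of 50 pending requests per worker are assumptions made purely for illustration, not Corda 5 parameters. The point is only that a shared pool is sized by activity, while a node-per-identity model is sized by head count.

```python
import math


def nodes_needed_per_identity(pending_by_identity):
    """One always-on node per identity, however idle each one is."""
    return len(pending_by_identity)


def workers_needed(pending_by_identity, capacity_per_worker=50):
    """Size a shared worker pool from pending work, not identity count."""
    total_pending = sum(pending_by_identity.values())
    return math.ceil(total_pending / capacity_per_worker)


# 5,000 hosted identities, none of them currently transacting.
idle = {f"identity-{i}": 0 for i in range(5000)}
print(nodes_needed_per_identity(idle))  # prints: 5000
print(workers_needed(idle))             # prints: 0

# The same identities, three of which each have 40 requests waiting.
busy = dict(idle)
busy.update({"identity-1": 40, "identity-2": 40, "identity-3": 40})
print(workers_needed(busy))             # prints: 3
```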