Developer Preview: Java in Intel® SGX
Back in 2017 we at R3 first announced that we were working on a project to bring the JVM to the Intel® SGX platform.
Today I have the pleasure of announcing the first official release of our ongoing work to allow the secure execution of JVM bytecode inside Intel SGX. The project’s goal is to provide tooling that enables the development of Intel SGX enclaves in Java, Kotlin, Scala, and more generally in any JVM-based language. This first release showcases our progress to date: the basic capability of building and hosting JVM enclaves.
What is Intel SGX?
Intel Software Guard Extensions (SGX) is an extension to modern Intel CPUs designed to help increase the security of application code and data. It allows running code that operates on cryptographically isolated memory areas inaccessible to the owner of the CPU. Furthermore, it provides a way to prove to a remote party that they are indeed communicating with such a protected piece of code. This essentially enables the processing of sensitive data remotely, on untrusted machines. You can learn more in this introductory article.
How can I access the release?
During this developer preview phase, we are gathering feedback from a limited audience. To request an invite please sign up here.
Part of the project is public on GitHub, including a simple sample application and documentation.
Also, join us on Slack!
How does this help developers?
Developing code for Intel SGX presents some fundamental challenges. Most of the existing libraries, software stack, debugging tools, and even languages that developers are used to are simply not available in enclaves, for two reasons.
- Intel SGX code cannot make system calls. That’s right, no filesystem access, no networking, no nothing baby! Even memory allocation works differently.
- Enclaves aren’t regular programs. With every enclave comes an untrusted host process that loads the enclave and interfaces with it using a custom Intel SGX-specific ABI. Interactions are restricted through this interface, and for good reason, as this is The Attack Surface of Intel SGX.
To address these constraints, the Intel SDK provides tools to assist developers writing enclaves in C and C++. The SDK includes a partial libc implementation and a GDB plugin that allows debugging of an enclave loaded in a special mode. These tools are very helpful, but developers who want the safety and productivity of higher-level languages need more.
Faced with these challenges, Intel SGX development has so far consisted of two main approaches.
Specialized enclaves written in C/C++
These are enclaves written from scratch, for a specific purpose. The upside is that these enclaves are self-contained: the programmer has full control over what happens in them.
The downside is that it’s very hard to write such an enclave correctly. The usual caveats of C/C++ about unmanaged memory apply. Programmers must learn a brand new set of tools. Existing libraries will likely not work, or will need porting. The resulting enclaves are also inflexible: there’s no hope for dynamic code loading, for example.
Furthermore, if developers want to make sure their enclaves are actually secure, they also need to be well versed in security-sensitive programming. And I am not talking about simply importing the latest version of a crypto library. I’m talking about oblivious computation. Needless to say, this also increases audit requirements: every enclave must be inspected and audited separately, and it is difficult to reuse audits, to “compose” trust. But I’m getting ahead of myself.
Embedding of existing programs or containers
The other route some projects take is to try to embed existing programs and even full containers. In theory this solves the usability problem, but we believe it can undermine the security properties Intel SGX provides.
These projects proxy operating system functionality through the host-enclave boundary and try to mitigate attacks on a per-functionality basis. However, these pre-existing programs were not written with Intel SGX’s quite severe threat model in mind (a malicious operating system). This means the proxying opens a whole new world of possible side channel attacks! The attack surface increases from a well-defined, enclave-specific, auditable ABI to an unspecified list of proxied system calls that are difficult, nay, impossible to audit.
We strive for the happy medium between usability and security. We don’t promise that you will be able to run existing applications unmodified, but by enabling the running of JVM bytecode we make a ton of well-known libraries and tools available for enclave development, and open Intel SGX development to a large portion of the programmer community.
Furthermore, by embedding a JVM we provide a central control point for implementing security hardening, in the virtual machine itself! For example, one of our long-term plans is to modify the JIT compiler to automatically make memory accesses oblivious, possibly utilizing Intel TSX. Note that this is a longer-term research project and certainly is not part of this initial release.
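To make “oblivious” concrete, here is a minimal hand-written sketch of a data-oblivious array read. It is only an illustration of the idea, not the planned JIT transformation: the class and method names are made up for this post, and a stock JIT is free to rewrite this code, which is precisely why compiler-level support is the interesting part.

```java
// Illustrative only: a hand-written data-oblivious read, NOT the project's JIT work.
// The class and method names are hypothetical.
final class ObliviousRead {
    private ObliviousRead() {}

    // Returns table[secretIndex] while reading every element of the table, so the
    // sequence of addresses touched does not depend on secretIndex.
    // An out-of-range index yields 0 instead of throwing.
    static int read(int[] table, int secretIndex) {
        int result = 0;
        for (int i = 0; i < table.length; i++) {
            int diff = i ^ secretIndex;
            // (diff | -diff) has its sign bit set exactly when diff != 0, so
            // mask is -1 (all ones) when i == secretIndex and 0 otherwise.
            int mask = ~((diff | -diff) >> 31);
            // Branch-free select: only the matching element survives the AND.
            result |= table[i] & mask;
        }
        return result;
    }
}
```

A plain `table[secretIndex]` touches one address that reveals the index to anyone observing memory access patterns; the loop above trades a linear scan for hiding it.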
What’s in the box?
The 1.0 release includes the basic tooling required to build a JVM-based enclave.
- Partial JVM enclaves: these are binary blobs containing the embedded JVM, which can be linked together with a JAR to create a full enclave. This process will be transparently done by our build tools.
- General purpose enclave(let) host: this is a hosting process that loads enclaves that abide by a certain ABI. We creatively named such enclaves enclavelets. The programming model is akin to the webserver (host) vs webapp (enclave) separation: clients create sessions with your enclavelet through the host.
- Gradle build plugins: these plugins allow the building and testing of JVM-based enclaves, as well as bundling of the enclaves with the host process into a Docker container. Note that usage of Docker is optional: you can run the hosting process directly, or even write your own!
- Sample code: a simple RNG enclave that generates signed random numbers, and a CLI tool that can connect to such a hosted enclave.
- Hosting: We host a publicly accessible production-signed instance of the sample RNG enclave and a nightly debug build as well, for testing. See the documentation on how to connect to these.
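To give a feel for what “signed random numbers” means in plain JVM code, here is a rough sketch using only standard `java.security` classes. It is not the sample enclave’s actual code or the enclavelet API: the class names, the choice of ECDSA, and the 8-byte encoding are all assumptions made for illustration. In a real deployment the client would also need a way to tie the public key to the enclave itself, for instance through the remote attestation mentioned earlier.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

// Hypothetical sketch: names, algorithm and encoding are assumptions, not the sample's code.
final class SignedRngSketch {
    // A random value together with a signature over its 8-byte big-endian encoding.
    static final class SignedRandom {
        final long value;
        final byte[] signature;

        SignedRandom(long value, byte[] signature) {
            this.value = value;
            this.signature = signature;
        }
    }

    private final KeyPair keyPair;
    private final SecureRandom random = new SecureRandom();

    SignedRngSketch() {
        try {
            // Generate a fresh 256-bit elliptic curve key pair for this instance.
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(256);
            keyPair = generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("EC key generation unavailable", e);
        }
    }

    PublicKey publicKey() {
        return keyPair.getPublic();
    }

    // Draws a random long and signs it with the private key.
    SignedRandom next() {
        try {
            long value = random.nextLong();
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(keyPair.getPrivate());
            signer.update(encode(value));
            return new SignedRandom(value, signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    // Client-side check: does the signature match the value under the given key?
    static boolean verify(PublicKey key, SignedRandom signed) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(key);
            verifier.update(encode(signed.value));
            return verifier.verify(signed.signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("verification failed", e);
        }
    }

    private static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }
}
```

A client holding the public key would call `SignedRngSketch.verify(key, signed)` on each response; a response whose value was altered in transit fails the check.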
Our next release targets end-to-end encryption and dynamic enclave loading.
The former is simply a feature missing from the current release. It’s relatively straightforward to add, but requires care. We have a prototype implementation using SIGMA-I, which is what we will use for this work.
The latter, however, is something special. Embedding the JVM opens up a unique possibility of loading code on the fly! This is not easy. So far we have been working with static enclaves, which means the hosting process knew exactly which enclave and therefore what functionality it was loading. With dynamic loading, however, this is no longer true. We need to tackle the issue of malicious enclaves.
In order to do this we can utilize a sandbox we have developed, originally meant for a different use case (sandboxing Corda transaction verification). With this sandbox we can bound CPU and memory usage of dynamically loaded code, and restrict its access to a sandbox-controlled API. Using this we can effectively multiplex Intel SGX functionality between dynamically loaded enclaves.
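As a rough picture of what bounding CPU and memory usage can look like, here is a tiny budget-accounting sketch. It does not describe how our sandbox is implemented: the class, the method names, and the idea that instrumented code calls `chargeInstructions` and `chargeAllocation` are all hypothetical, chosen only to show the accounting.

```java
// Hypothetical sketch of resource accounting; not the actual sandbox implementation.
final class SandboxBudget {
    // Thrown when sandboxed code tries to use more than it was granted.
    static final class BudgetExceededException extends RuntimeException {
        BudgetExceededException(String message) {
            super(message);
        }
    }

    private long remainingInstructions;
    private long remainingBytes;

    SandboxBudget(long instructions, long bytes) {
        this.remainingInstructions = instructions;
        this.remainingBytes = bytes;
    }

    // In this sketch, instrumented code would call this before running a block of work.
    void chargeInstructions(long count) {
        if (count < 0 || count > remainingInstructions) {
            throw new BudgetExceededException("CPU budget exhausted");
        }
        remainingInstructions -= count;
    }

    // In this sketch, instrumented code would call this before each allocation.
    void chargeAllocation(long bytes) {
        if (bytes < 0 || bytes > remainingBytes) {
            throw new BudgetExceededException("memory budget exhausted");
        }
        remainingBytes -= bytes;
    }

    long remainingInstructions() {
        return remainingInstructions;
    }

    long remainingBytes() {
        return remainingBytes;
    }
}
```

Giving each piece of dynamically loaded code its own budget object means a runaway loop or an allocation spree ends in an exception for that code alone, rather than starving its neighbours.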
Dynamic code loading means that a large chunk of the auditing work entailed by Intel SGX can be reused between different enclaves, as the JVM and sandbox will stay the same. The separation also helps focus security research. As vulnerabilities in Intel SGX, the JVM and the sandbox are found, we can harden all functionality at once, as opposed to hardening individual functionality-specific enclaves. Of course the dynamically loaded code still needs auditing, but the more code we can share the better.
Furthermore, from the infrastructure point of view, dynamic code loading allows us to spin up a cluster of completely homogeneous enclaves, AWS Lambda style, and load the required functionality on the fly! Because the owner of the hardware no longer matters, this opens the possibility of a global peer-to-peer network of machines usable without trust: it’s hard to be more spiritually aligned with the blockchain vision than that.