Oxide set out to build a private cloud rack system that retains the advantages of the public cloud without sacrificing infrastructure control, efficiency, or flexibility.
Businesses that need to build on-premises private clouds have a hardware problem, according to startup Oxide Computer Company. The typical on-premises cloud infrastructure rack consists of multiple commodity appliances (servers, switches, storage) from different vendors, a chaotic tangle of cables, and software layers on top of all of that.
While these systems improve on legacy designs, many of the advantages of cloud computing, such as elastic capacity and multitenancy, are lost.
If an enterprise wants to enjoy cloud benefits, its only choice is to rent cloud infrastructure from the likes of AWS or Microsoft Azure. Renting infrastructure provides several benefits, including flexibility and shifting the burden of deploying and maintaining hardware to the service provider. However, renting forces businesses to cede control of critical infrastructure to a third party.
The problem is that many workloads must run on-premises, due to security, compliance, or even latency concerns. But cobbling together DIY cloud infrastructure is like assembling Ikea furniture where each piece comes from a different vendor. Deployment cycles are long, ongoing maintenance is a serious burden, and the one-size-fits-all nature of commodity hardware means that it’s never quite optimized for the software running on it. The hardware-software mismatch creates unforeseen complexity that can lead to a range of problems, including outages, ongoing performance issues, and underutilization.
Putting the cloud back in private clouds
According to Oxide CEO Steve Tuck, the way forward is to stop thinking of the cloud as a destination. “At its core, cloud computing is a way to programmatically interface with large pools of compute, networking, and storage resources,” he said. This not only makes it easy for developers to write, deploy, and manage software, but it delivers on original cloud promises, such as elastic capacity and better resource utilization.
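Tuck's definition can be sketched in a few lines: a caller asks a shared pool for resources and hands them back when done, rather than managing individual boxes. The sketch below is purely illustrative; the class names, fields, and capacity figures are invented for this example and are not Oxide's actual API.

```python
from dataclasses import dataclass

@dataclass
class Pool:
    """A shared pool of compute resources (hypothetical, for illustration)."""
    vcpus: int
    memory_gb: int
    storage_tb: int

    def provision(self, vcpus: int, memory_gb: int, storage_tb: int) -> dict:
        """Carve an instance out of the pool; fail if capacity is exhausted."""
        if vcpus > self.vcpus or memory_gb > self.memory_gb or storage_tb > self.storage_tb:
            raise RuntimeError("insufficient capacity")
        self.vcpus -= vcpus
        self.memory_gb -= memory_gb
        self.storage_tb -= storage_tb
        return {"vcpus": vcpus, "memory_gb": memory_gb, "storage_tb": storage_tb}

    def release(self, instance: dict) -> None:
        """Return an instance's resources to the pool -- the elasticity Tuck describes."""
        self.vcpus += instance["vcpus"]
        self.memory_gb += instance["memory_gb"]
        self.storage_tb += instance["storage_tb"]

# A rack-sized pool; the numbers are arbitrary.
pool = Pool(vcpus=2048, memory_gb=16384, storage_tb=1024)
vm = pool.provision(vcpus=8, memory_gb=64, storage_tb=2)
pool.release(vm)
```

The point of the abstraction is that callers reason about aggregate capacity, not individual machines, which is what makes elastic allocation and better utilization possible on either rented or owned hardware.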
In other words, Oxide’s main mission is to put the “cloud” back in private cloud computing.
The company is built on the premise that you should be able to choose to rent or own cloud capacity, depending on the workload, without losing cloud benefits like elasticity when you choose the latter. To accomplish this, the Oxide team set out to build an entirely new cloud hardware rack that would deliver all of the advantages public cloud vendors enjoy, without sacrificing control, efficiency, or flexibility.
Another limitation is that many enterprises attempt to build their private clouds on Kubernetes. The problem is that Kubernetes was not designed for multitenancy, and, thus, it does not offer a true cloud experience. That’s not a knock on Kubernetes itself; rather, the container orchestration software is typically deployed on top of bloated layers of software, adding complexity and making it difficult to manage at scale.
Hyperscalers build proprietary hardware to overcome OEM obstacles
Co-founders Tuck and Bryan Cantrill, CTO, struggled with these issues firsthand when they worked together at the cloud infrastructure provider Joyent. “The cloud software we built ran on commodity hardware. However, we kept having issues where the root cause was that our software did not perform as expected on the OEM hardware. There were so many disconnects, and they were impossible to predict ahead of time,” Tuck said. Then, when Joyent was acquired by Samsung in 2016, the problems grew right alongside the company’s expanding cloud footprint.
When the Joyent team sought ways to remedy these problems, they first attempted to acquire the same infrastructure that the large hyperscalers use. Cantrill contends that the hyperscalers have “infrastructure privilege.” “Hyperscalers like Facebook, Microsoft, and Google gave up on commodity hardware long ago,” said Cantrill. The major cloud companies have all built their own proprietary systems, which offer superior performance, but the typical enterprise can only rent, not buy, that hardware.
Eventually, Cantrill, Tuck, and co-founder Jessie Frazelle (now CEO of the hardware design company KittyCAD) raised seed funding and founded Oxide Computer Company in 2019. As the Oxide team investigated how to deliver the kind of cloud computing experience to the enterprise that hyperscalers enjoy, they soon learned that they were taking on a massive challenge, one that Cantrill said would involve rebuilding a “completely ossified” infrastructure stack, top to bottom.
This meant that Oxide had a talent challenge on its hands, as well. But as word trickled out that they intended to rebuild the cloud hardware-software stack, they soon attracted 60 seasoned hardware and software veterans, who worked with them to redesign not only compute and switching but also printed circuit boards, fans, the OS, and even power supplies.
To date, Oxide Computer is backed by $78 million in VC funding, which includes a $44 million Series A round secured in October. The round was led by Eclipse VC; Intel Capital, Riot Ventures, Counterpart Ventures, and Rally Ventures also participated.
After four years in stealth mode, in July 2023 the startup shipped its first product: a 3,000-pound, 9-foot-tall, rack-scale system of cloud resources.
Rethinking the cloud rack, from power to fans to firmware
While I spoke with Cantrill and Tuck over a Zoom call, Tuck (virtually) walked me through their office to a working Oxide rack, displaying a streamlined private cloud rack system. Tuck pointed to the back of the rack, which lacked the typical tangle of cables. “We built our own networking switch and compute sled,” Tuck said.
A cabled backplane allows sleds to snap right into the rack, so enterprise IT teams never have to worry about running cables. Each Oxide rack has 32 sleds, each of which contains an AMD CPU, DRAM, and storage (up to 32TB per sled). Even though it’s a massive rack, the system is designed to deliver an iPhone-like experience, ready to go right out of the box.
According to Oxide, this design allows enterprises to be fully deployed within a few hours of unboxing the system, versus the weeks or months typically required for a “kit car” build of OEM hardware.
Redesigning everything from the ground up meant that the Oxide team stumbled across other issues that mostly go unnoticed, like the legacy drag of BIOS. Dating to the 1980s, with implementations from vendors such as AMI, BIOS firmware runs on every x86 OEM server motherboard, handling boot-up and providing runtime services.
Cantrill argues that BIOS is a poorly written piece of software that is not only out of date, but also often serves as an unintentional system throttle because it sits at the lowest layer of the compute stack, but it has no awareness of the processes running on top of it. As a result, BIOS often hijacks resources for its own purposes, whether it needs them or not.
The startup eliminated BIOS, replacing it with its own cloud-optimized firmware. Now, after the AMD security processor executes, the first thing to run on boot-up is the startup’s open-source, Rust-based OS, Hubris. The startup also developed its own hypervisor and control plane. All of the Oxide software is open source, although the company does charge a subscription for support, updates, and bug fixes.
The Oxide team dug so deeply into the problems with the typical on-premises computing rack that they also redesigned the power supply, running the system on DC, and they even worked with the vendor that supplies cooling fans to make them quieter and more efficient.
After shipping its first rack in July, Oxide secured its $44 million Series A round in October and revealed its first named customer, Idaho National Laboratory, a Department of Energy research facility.
Startup snapshot: Oxide Computer Company
- Year founded: 2019
- Funding: $78 million
- Headquarters: Emeryville, Calif.
- CEO: Steve Tuck
- What they do: Develop cloud infrastructure for the enterprise
- Competitors include: OEM providers such as Dell, IBM, and HPE, as well as public cloud service providers such as Google, AWS, and Microsoft Azure
- Named customer: Idaho National Laboratory