News from EETimes points towards a startup that claims an extreme performance advantage over InfiniBand. A3Cube Inc. has developed a variation of PCI Express on a network interface card to offer lower latency. The company is promoting its Ronniee Express technology, currently implemented via a PCIe 2.0 driven FPGA, as offering sub-microsecond latency across a 128-server cluster.
In the Sockperf benchmark, numbers from A3Cube put performance at around 7x that of InfiniBand and PCIe 3.0 x8, leading the company to claim that its approach beats the top alternatives. Using PCIe at the physical layer enables quality-of-service features, and A3Cube claim the fabric can present a cluster of 10,000 nodes as a single system image without congestion.
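For context on what these benchmark figures measure: sockperf's ping-pong mode times a request and its echoed reply, then halves the round trip to estimate one-way latency. A minimal, purely illustrative sketch of that measurement pattern (sockperf itself is a far more capable C++ tool, and none of the names below come from A3Cube or sockperf):

```python
# Illustrative ping-pong latency measurement in the spirit of
# sockperf: send a small datagram, wait for the echo, and take
# half the average round-trip time as the one-way latency.
import socket
import threading
import time

def echo_server(sock, n):
    # Echo n datagrams straight back to their sender.
    for _ in range(n):
        data, addr = sock.recvfrom(64)
        sock.sendto(data, addr)

def ping_pong(n=1000):
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    port = server.getsockname()[1]
    t = threading.Thread(target=echo_server, args=(server, n))
    t.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    start = time.perf_counter()
    for _ in range(n):
        client.sendto(b"x", ("127.0.0.1", port))
        client.recvfrom(64)
    elapsed = time.perf_counter() - start
    t.join()
    server.close()
    client.close()
    # One-way latency is roughly half the round-trip time.
    return elapsed / n / 2

if __name__ == "__main__":
    print(f"approx one-way latency: {ping_pong() * 1e6:.1f} us")
```

Over loopback this will report latencies in the tens of microseconds; A3Cube's sub-microsecond figures are what a kernel-bypass fabric removes from exactly this path.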
A3Cube is aiming primarily at high-frequency trading, genomics, oil/gas exploration, and real-time data analytics. Customer prototypes are currently being worked on, and two versions of the network card plus a 1U switch based on the technology are expected before July.
A3Cube is keeping the new IP hidden away, but the logic points towards custom device enumeration and the extension of the PCIe root complex across a cluster of systems. This follows from the company's comment that PCIe 3.0 is incompatible because of the different device enumeration in that specification. The plan is to build a solid platform on PCIe 4.0, which puts the technology several years away in terms of non-specialized deployment.
As with many startups, the next step for A3Cube is to secure venture funding. The Ronniee Express approach differs from that of PLX, which is developing a direct PCIe interconnect for computer racks.
A3Cube’s webpage on the technology states that the fabric uses a combination of hardware and software while remaining transparent to applications. The product combines multiple 20 or 40 Gbit/s channels and is aimed at petabyte-scale Big Data and HPC storage systems.
Information from Willem Ter Harmsel describes the Ronniee NIC system as a global shared memory container, with an in-memory network between nodes. CPU, memory, and I/O are directly connected, with 800-900 nanosecond latencies, and the ‘memory windows’ facilitate low-latency traffic.
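A3Cube has not published its API, but the memory-window model amounts to nodes communicating through plain loads and stores into a shared mapped region rather than through send/receive calls. As a local analogy only (two mmap views of one file standing in for two nodes sharing a window over the fabric), the programming model looks like this:

```python
# Illustrative analogy only: A3Cube's actual interface is unpublished.
# Two mmap views of the same file model two "nodes" sharing a memory
# window: one side stores bytes, the other loads them directly, with
# no explicit send/receive and no protocol stack in between.
import mmap
import tempfile

def shared_window_demo():
    with tempfile.TemporaryFile() as f:
        f.truncate(4096)                      # size of the "window"
        writer = mmap.mmap(f.fileno(), 4096)  # "node A" view
        reader = mmap.mmap(f.fileno(), 4096)  # "node B" view
        writer[0:5] = b"hello"                # A stores into the window
        msg = bytes(reader[0:5])              # B loads it back directly
        writer.close()
        reader.close()
        return msg

if __name__ == "__main__":
    print(shared_window_demo())  # b'hello'
```

The appeal of doing this across a PCIe fabric is that the store/load path avoids the per-message software overhead that dominates sub-microsecond latencies.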
Using A3Cube’s storage OS, byOS, 40 terabytes of SSDs, and the Ronniee Express fabric, five storage nodes were connected together with four links per NIC, allowing for 810 ns latency in any direction. A3Cube claim 4 million IOPS with this setup.
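A quick back-of-the-envelope check shows the two quoted figures are mutually consistent only with concurrent operations in flight (the arithmetic below uses only the source's numbers; the "serial" framing is our assumption):

```python
# Sanity check on the quoted figures: if each I/O waited out a full
# 810 ns round trip serially, one stream would cap out near 1.23M
# ops/s, so 4M IOPS implies at least ~3-4 operations in flight at
# once -- plausible with 4 links per NIC across 5 nodes.
latency_s = 810e-9
serial_ops_per_s = 1 / latency_s                    # ~1.23 million
claimed_iops = 4_000_000
min_concurrency = claimed_iops * latency_s          # ops in flight

print(round(serial_ops_per_s))    # 1234568
print(round(min_concurrency, 2))  # 3.24
```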
Further, an interview by Willem with Antonella Rubicco states that “Ronniee is designed to build massively parallel storage and analytics machines; not to be used as an “interconnection” as Infiniband or Ethernet. It is designed to accelerate applications and create parallel storage and analytics architecture.”