HyperDex strategically places objects on servers so that both search and key-based operations contact a small subset of all servers in the system. Whereas typical key-value stores map objects to nodes using just the key, HyperDex takes into account all attributes of an object when mapping it to servers.
The actual mapping technique used by HyperDex ishyperspace hashing. Hyperspace hashing creates a multidimensional euclidean space into which objects are mapped. Each attribute in the original object corresponds to one dimension in the euclidean space. Objects are hashed into this space by hashing all attributes to determine a coordinate in the space.
For example a table of objects with “first name”, “last name” and “phone number” attributes corresponds to a three dimensional hyperspace where each dimension corresponds to one attribute in the original object. Such a space is depicted in the diagram at right. In this space, there exist three objects. The red point is “John Smith” whose phone number is 555-8000. The green point is “John Doe” whose whone number is 555-7000. The blue point is “Jim Bob” whose phone number is 555-2000. Anyone named “John” must map to somewhere in the yellow plane. Similarly, anyone with the last name “Smith” must map to somewhere within the translucent plane. Naturally, all people named “John Smith” must map to somewhere along the line where these two planes intersect.
In each multi-dimnesional space corresponding to a type of object, HyperDex assigns nodes to disjoint regions of the space. The above figure shows two of these assignments. Notice that the line for “John Smith” only intersects two out of the eight assignments. Consequently, performing a search for all phone numbers of “John Smith” requires contacting only two nodes. Furthermore, the search could be made more specific by restricting it to all people named “John Smith” whose phone number falls between 555-5000 and 555-9999. Such a search contacts only one out of the eight servers in this hypothetical
Some benchmark from HyperDex (compare with MongoDB, Cassandra, Redis)
HyperDex provides high throughput and low latency. The graph at right shows the performance of HyperDex as well as that of competing NoSQL key-value stores. Each workload specified by the YCSB benchmark simulates a real application. HyperDex outperforms Cassandra and MongoDB by a factor of 2-13.
Workload E from the YCSB benchmark tests search/scan functionality. While other systems operate on the indexed key, HyperDex operates on secondary attributes of the stored object — i.e. HyperDex is performing a look up the other systems cannot perform. The graph shows the performance advantages of hyperdimensional hashing used in HyperDex.
HyperDex scales linearly in the number of servers. The figure shows a scalability experiment conducted on the VICCIplatform with varying numbers of clients. With 32 servers and sufficient clients to create a heavy workload, HyperDex is able to sustain 3.2 million ops/s.
HyperDex outperforms Redis for common key-value store workloads. The figure shows an experiment comparing one HyperDex server to one Redis server with a fixed number of clients. HyperDex has a slight performance lead for GET/PUTworkloads (A-D,F), and holds a significant advantage forSEARCH workloads (E). Redis offers advanced datastructures, but this comes at a cost: all data must fit in memory, and the system must be deployed in a single-master configuration. HyperDex does not suffer from these limitations, and also provides higher performance.