Paper: MegaPipe: A New Programming Interface For Scalable Network I/O

The paper MegaPipe: A New Programming Interface for Scalable Network I/O (video, slides) hits a common theme: if you want to go faster, you need a better car design, not just a better driver. That’s why the authors started with a clean slate and designed a network API from the ground up with support for concurrent I/O, a requirement for achieving high performance while scaling to large numbers of connections per thread, multiple cores, etc. What they created is MegaPipe, “a new network programming API for message-oriented workloads to avoid the performance issues of BSD Socket API.”

The result: MegaPipe outperforms baseline Linux between 29% (for long connections) and 582% (for short connections). MegaPipe improves the performance of a modified version of memcached between 15% and 320%. For a workload based on real-world HTTP traces, MegaPipe boosts the throughput of nginx by 75%.

What’s this most excellent and interesting paper about?

Message-oriented network workloads, where connections are short and/or message sizes are small, are CPU intensive and scale poorly on multi-core systems with the BSD Socket API. We present MegaPipe, a new API for efficient, scalable network I/O for message-oriented workloads. The design of MegaPipe centers around the abstraction of a channel (a per-core, bidirectional pipe between the kernel and user space) used to exchange both I/O requests and event notifications. On top of the channel abstraction, we introduce three key concepts of MegaPipe: partitioning, lightweight socket (lwsocket), and batching.

We implement MegaPipe in Linux and adapt memcached and nginx. Our results show that, by embracing a clean-slate design approach, MegaPipe is able to exploit new opportunities for improved performance and ease of programmability. In microbenchmarks on an 8-core server with 64 B messages, MegaPipe outperforms baseline Linux between 29% (for long connections) and 582% (for short connections). MegaPipe improves the performance of a modified version of memcached between 15% and 320%. For a workload based on real-world HTTP traces, MegaPipe boosts the throughput of nginx by 75%.

Performance with Small Messages:

Small messages result in greater relative network I/O overhead in comparison to larger messages. In fact, the per-message overhead remains roughly constant and thus, independent of message size; in comparison with a 64 B message, a 1 KiB message adds only about 2% overhead due to the copying between user and kernel on our system, despite the large size difference.
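To make the numbers above concrete, here is a rough back-of-the-envelope sketch. It assumes a simple linear cost model (a fixed per-message cost plus a per-byte copy cost) and reads the "about 2%" figure as the total cost increase from a 64 B message to a 1 KiB message. The model and the function names are my assumptions, not something taken from the paper.

```python
SMALL = 64      # bytes, from the text
LARGE = 1024    # bytes (1 KiB), from the text
EXTRA = 0.02    # "about 2%" extra cost for the larger message


def implied_per_byte_cost(fixed_cost=1.0):
    """Solve cost(LARGE) = (1 + EXTRA) * cost(SMALL) for the per-byte cost.

    Assumed model: cost(size) = fixed_cost + per_byte * size.
    """
    return EXTRA * fixed_cost / (LARGE - (1 + EXTRA) * SMALL)


def cost(size, fixed_cost=1.0):
    """Per-message cost under the assumed linear model."""
    return fixed_cost + implied_per_byte_cost(fixed_cost) * size


def cost_per_byte_ratio():
    """How much more CPU each payload byte costs in a small message."""
    return (cost(SMALL) / SMALL) / (cost(LARGE) / LARGE)


if __name__ == "__main__":
    print(f"cost(1 KiB) / cost(64 B) = {cost(LARGE) / cost(SMALL):.3f}")
    print(f"per-byte cost, small vs large = {cost_per_byte_ratio():.1f}x")
```

Under this model almost all of the work is the fixed per-message part, so each byte carried in a 64 B message costs roughly 16 times as much CPU as a byte carried in a 1 KiB message. That is the sense in which small-message workloads are CPU intensive.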

Partitioned listening sockets:

Instead of a single listening socket shared across cores, MegaPipe allows applications to clone a listening socket and partition its associated queue across cores. Such partitioning improves performance with multiple cores while giving applications control over their use of parallelism.
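MegaPipe's own interface isn't shown here, but stock Linux has a mechanism that is similar in spirit: the SO_REUSEPORT socket option lets several listening sockets bind to the same address and port, so each worker can own its own accept queue. The sketch below uses that option as a stand-in. It is not MegaPipe's mechanism, the helper name is hypothetical, and it assumes a platform that supports SO_REUSEPORT.

```python
import socket


def make_partitioned_listeners(n_workers, host="127.0.0.1", port=0):
    """Open one listening socket per worker, all bound to the same port.

    Illustration only: uses SO_REUSEPORT, not the MegaPipe API.
    """
    listeners = []
    for _ in range(n_workers):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Must be set on every socket before bind() for the port to be shared.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        if port == 0:
            # The first bind picked an ephemeral port; reuse it for the rest.
            port = s.getsockname()[1]
        s.listen(128)
        listeners.append(s)
    return listeners


if __name__ == "__main__":
    socks = make_partitioned_listeners(4)
    print("workers share port", socks[0].getsockname()[1])
    for s in socks:
        s.close()
```

In a real server each worker would be pinned to a core and would call accept() only on its own socket, so no accept queue is shared across cores.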

Lightweight sockets:

Sockets are represented by file descriptors and hence inherit some unnecessary file-related overheads. MegaPipe instead introduces lwsocket, a lightweight socket abstraction that is not wrapped in file-related data structures and thus is free from system-wide synchronization.
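The summary above doesn't spell out which file-related overheads are meant. One familiar example of file descriptor semantics that forces coordination is the POSIX rule that a newly allocated descriptor must be the lowest unused number in the process, so every allocation has to consult a shared table. The small demo below shows that rule in action. It is my illustration of the general idea, not a description of how lwsocket works, and the function name is hypothetical.

```python
import os


def lowest_fd_is_reused():
    """Show that a freed descriptor number is handed out again first."""
    r, w = os.pipe()
    first = os.dup(r)    # lowest free descriptor at this point
    second = os.dup(r)   # a higher number, since `first` is now taken
    os.close(first)      # frees the lower number again
    third = os.dup(r)    # POSIX: must be the lowest unused descriptor
    for fd in (r, w, second, third):
        os.close(fd)
    return first, second, third


if __name__ == "__main__":
    first, second, third = lowest_fd_is_reused()
    print(f"first={first} second={second} third={third}")
```

A handle that doesn't have to obey this ordering rule, or be visible through the file system layer at all, can be allocated without touching shared state.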

System Call Batching:

MegaPipe amortizes system call overheads by batching asynchronous I/O requests and completion notifications within a channel.
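A toy model helps show why batching pays off. The sketch below counts simulated user/kernel boundary crossings for a stream of requests, with and without batching. ToyChannel and its methods are hypothetical names made up for this illustration. It does no real I/O and is not the MegaPipe API.

```python
class ToyChannel:
    """Hypothetical stand-in for a per-core channel that batches requests."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []      # requests queued in user space
        self.completed = []    # completion notifications handed back
        self.crossings = 0     # simulated user/kernel boundary crossings

    def submit(self, request):
        """Queue a request; cross the boundary only when the batch is full."""
        self.pending.append(request)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send all queued requests in one crossing and collect completions."""
        if not self.pending:
            return
        self.crossings += 1
        self.completed.extend(("done", r) for r in self.pending)
        self.pending.clear()


def count_crossings(n_requests, batch_size):
    """Submit n_requests through a ToyChannel and count boundary crossings."""
    ch = ToyChannel(batch_size)
    for i in range(n_requests):
        ch.submit(i)
    ch.flush()  # push out any partial final batch
    return ch.crossings


if __name__ == "__main__":
    for size in (1, 4, 16):
        print(f"batch_size={size}: {count_crossings(64, size)} crossings")
```

With a batch size of 1 every request pays for its own crossing. With larger batches the fixed cost of a crossing is spread across many requests, which matters most when the messages themselves are tiny.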


