Enabling Wide-spread Communications on Optical Fabric with MegaSwitch


Existing wired optical interconnects struggle to support wide-spread communications in production clusters. Initial proposals support hotspots between only a small number of racks (e.g., 2 or 4) at a time, with reconfiguration taking milliseconds. Recent efforts to reduce optical circuit reconfiguration time from milliseconds to microseconds partially mitigate this problem by rapidly time-sharing optical circuits across more nodes, but they remain limited by the total number of parallel circuits available simultaneously. In this paper, we seek an optical interconnect that can enable unconstrained communications within a computing cluster of thousands of servers. In particular, we present MegaSwitch, a multi-fiber ring optical fabric that exploits space division multiplexing across multiple fibers to deliver rearrangeably non-blocking communications to 30+ racks and 6000+ servers. We have implemented a 5-rack, 40-server MegaSwitch prototype with real optical devices, and we use testbed experiments as well as large-scale simulations to explore MegaSwitch’s architectural benefits and tradeoffs.

14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)