We present the design, optimization, and analysis of a highly flexible and efficient multi-leg stock trading system. Automated electronic multi-leg trading allows atomic processing of consolidated orders such as "Buy 200 shares of IBM and sell 100 shares of HPQ". While the expressive power of multi-leg trading brings significant value to investors, it also poses major challenges to stock exchange architecture design, because it introduces additional complexity in performance, tradability, and fairness. Performance can degrade significantly because transactions must be coordinated across multiple stocks at once. This paper studies the performance of multi-leg trading under different fairness constraints and under variability in order price and order quantity. We identify the major performance bottlenecks that arise when using traditional atomic commitment protocols such as 2-Phase Commit (2PC), and propose a new look-ahead algorithm to maximize transaction concurrency and minimize performance degradation. We have implemented a baseline 2PC prototype and a look-ahead optimized prototype on IBM z10 zSeries eServer mainframes. Our experimental results show that, relative to the 2PC baseline, the look-ahead optimization can improve throughput by 58% and reduce latency by 30%.
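
To make the atomicity requirement concrete, the following is a minimal sketch of how a generic 2PC coordinator might process the example order across two per-stock order books. It is an illustration under simplifying assumptions, not the prototype described in this paper: the names `Book` and `execute_multi_leg` are hypothetical, each leg is reduced to a quantity that must be fully matched against available counterparty liquidity, and price, order side, fairness constraints, and the look-ahead optimization are all omitted.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical per-symbol order book (assumption, not from the paper)."""
    symbol: str
    available: int      # counterparty shares this book can match
    reserved: int = 0   # shares held by prepared but uncommitted legs

    def prepare(self, qty: int) -> bool:
        # Phase 1: vote "yes" only if the leg can be filled in full.
        if self.available - self.reserved >= qty:
            self.reserved += qty
            return True
        return False

    def commit(self, qty: int) -> None:
        # Phase 2 (commit): consume the shares reserved in phase 1.
        self.reserved -= qty
        self.available -= qty

    def abort(self, qty: int) -> None:
        # Phase 2 (abort): release the reservation, leaving the book unchanged.
        self.reserved -= qty


def execute_multi_leg(books: dict, legs: list) -> bool:
    """Trade every (symbol, qty) leg, or none of them."""
    prepared = []
    for symbol, qty in legs:
        if books[symbol].prepare(qty):
            prepared.append((symbol, qty))
        else:
            # One "no" vote aborts every leg prepared so far.
            for s, q in prepared:
                books[s].abort(q)
            return False
    for symbol, qty in prepared:
        books[symbol].commit(qty)
    return True


books = {"IBM": Book("IBM", 500), "HPQ": Book("HPQ", 50)}
order = [("IBM", 200), ("HPQ", 100)]
first_attempt = execute_multi_leg(books, order)   # HPQ leg cannot be filled
books["HPQ"].available = 150                      # more HPQ liquidity arrives
second_attempt = execute_multi_leg(books, order)  # both legs can now trade
```

In this sketch, shares reserved on one book remain unavailable to other orders until every other book has voted, which is one way the coordination among multiple stocks can limit concurrency.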