With the 12th OpenStack Summit in full swing this week in Tokyo, the promise of OpenStack is on full display, while the ongoing quest to reduce OpenStack complexity, increase resiliency and deliver at scale remains front and center.  Over the years, early OpenStack disciples have highlighted three key challenges to achieving operational efficiency:

  • Deployment: Installing OpenStack and operationalizing change management has been a constant struggle, especially at scale.

  • Networking: L3 Neutron networking has been a major obstacle to OpenStack scale and resiliency.  The L3 agent becomes the choke point because all traffic that requires routing or floating IP services has to be handled by the agent.  While it is possible to deploy multiple pairs of L3 agents, significant care needs to be taken (in terms of developing custom code and manual workflows) even at moderate scale.

  • Operations: There are two network domains to manage: physical (e.g. leaf and spine switches) and virtual (e.g. vSwitches in compute hosts).  As a result, cross-domain tasks such as provisioning and change management, traffic visibility and troubleshooting of connectivity become extremely complicated.

While there is tremendous excitement around OpenStack for constructing on-premises as well as off-premises (managed) private clouds, current success stories have mostly been either (1) internal software teams developing custom solutions and workflows for scalable deployments or (2) out-of-box solutions deployed at limited scale (typically a few racks, ~100 nodes).

Broad-based OpenStack adoption by mainstream network operators requires ready-to-deploy, fully validated solutions that are easy to deploy and operate, scalable from tens to hundreds of compute nodes, and robust under a variety of failure conditions.  Such solutions are starting to appear, with new architectures and approaches delivering promising results.

Figure 1: Big Cloud Fabric (P+V Edition) Architecture for OpenStack

Big Switch Networks has taken an architectural approach to eliminating OpenStack networking challenges.  Specifically, our Big Cloud Fabric (BCF) solution provides the industry’s first unified physical + virtual SDN fabric, delivering a highly resilient and automated networking solution for OpenStack data centers.  As depicted in Figure 1, BCF is built as a leaf, spine and virtual switch fabric, with the BCF Controller acting as the single, centralized point for provisioning, control, troubleshooting, visibility and analytics across the entire physical and virtual network environment.  So how does BCF P+V Edition solve OpenStack challenges?

  1. Eliminate Neutron L3 Agent Bottlenecks: Switch Light VX for Neutron L2/L3 Networking is a user-space software agent for the KVM-based Open vSwitch (OVS) kernel module.  It is installed on each OpenStack compute node using Big Switch’s OpenStack installer.  It natively provides distributed virtual routing as well as distributed floating IP, thus eliminating Neutron L3 agent bottlenecks.  Physical switch provisioning occurs through Big Switch’s ML2 driver.

  2. Physical + Virtual SDN Fabric: With P+V, the BCF SDN controller manages both the physical (leaf/spine) switches and the Switch Light VX virtual switches, so there is only one network to deal with.  With built-in visibility and troubleshooting tools (e.g. test path, fabric span, flow visibility), network/cloud operations teams can rapidly resolve connectivity issues across physical and virtual domains.  We have even shown how tenants can perform self-service troubleshooting via the Horizon GUI.
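To make the bottleneck argument concrete, here is a minimal Python sketch that compares where routed traffic lands when a single L3 agent node routes everything versus when each compute host's vSwitch routes its own outbound traffic. It is a toy model only, not Big Switch's implementation: the host names, flow rates and the `routing_load` helper are all hypothetical, chosen purely for illustration.

```python
from collections import Counter

# Hypothetical flows: (source host, destination host, Mbps, needs_routing).
# All names and numbers are illustrative assumptions, not measured data.
FLOWS = [
    ("compute-1", "compute-2", 400, True),
    ("compute-2", "compute-3", 300, True),
    ("compute-3", "compute-1", 200, False),  # same subnet: L2 only, never routed
    ("compute-1", "compute-3", 100, True),
]


def routing_load(flows, mode, l3_agent_node="network-1"):
    """Return the Mbps of routed traffic handled at each routing point.

    mode="centralized": every routed flow passes through one L3 agent node.
    mode="distributed": each flow is routed at its source host's vSwitch.
    """
    if mode not in ("centralized", "distributed"):
        raise ValueError(f"unknown mode: {mode}")
    load = Counter()
    for src, _dst, mbps, needs_routing in flows:
        if not needs_routing:
            continue  # L2-only traffic never touches a router
        router = l3_agent_node if mode == "centralized" else src
        load[router] += mbps
    return dict(load)


centralized = routing_load(FLOWS, "centralized")
distributed = routing_load(FLOWS, "distributed")

print("centralized:", centralized)
print("distributed:", distributed)
```

In this toy model the centralized case concentrates all routed traffic on one node, while the distributed case spreads it across the source hosts, so the busiest routing point carries less as compute nodes are added.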

For additional details on this architecture and its customer benefits, check out this video demo: Big Switch Presents: Unified P+V Networking for OpenStack Clouds

Big Switch + Mirantis + Dell Scale Validation

Big Switch partnered with Mirantis and Dell to scale test a 300-node, 8-rack OpenStack test pod at the Dell Open Networking center of excellence in Santa Clara, CA (see Figure 2):

  • Dell Networking Switches & Servers including 22 10/40GbE open networking switches (18 S4048-ON switches and 4 S6000-ON switches) and 302 1RU rack servers (PowerEdge Series).

  • Mirantis OpenStack with Fuel installer for rapid, flexible and stable deployment of OpenStack components.

  • Big Cloud Fabric (Physical + Virtual Edition) with 297 Switch Light VX host vSwitches and the BCF OpenStack Neutron Plugin for L2+L3 networking (using the ML2 driver and L3 plugin), to provide the scale and resiliency necessary for production-grade OpenStack deployments.

  • Big Cloud Fabric OpenStack Installer – a plugin for OpenStack installation that seamlessly interoperates with Mirantis OpenStack and the Fuel installer to handle all networking-related installation and configuration tasks on OpenStack controller and compute nodes.

Figure 2: Big Switch, Mirantis and Dell scale testbed (300 nodes, 8 racks)

With seamless interoperability between the Mirantis Fuel installer for OpenStack and Big Switch’s Big Cloud Fabric OpenStack Installer, scalable deployment of OpenStack controller nodes and ~300 compute nodes can be accomplished in hours.  Figure 3 shows the resource dashboard for this test setup, in both the Fuel installer and the Big Cloud Fabric GUI.

Figure 3: Scalable OpenStack Deployment using Mirantis Fuel, BCF Controller and Dell switches/servers.

With the Big Switch Big Cloud Fabric approach, all legacy OpenStack Networking challenges have been resolved. The solution is production-grade, scalable and resilient, simple to deploy and operate, and currently deployed in some of the largest carrier and government agency internal OpenStack clouds. OpenStack Networking Nirvana has arrived.

Welcome to disruptive networking!

Prashant Gandhi

VP, Products and Strategy
