The storage industry is in the middle of a massive transformation, and the move to hyper-convergence is a key example of that. This architecture is rapidly gaining popularity as more workloads are placed on top of integrated server-storage systems. Gartner predicts almost 80% growth for this market, expecting it to reach $2B in 2016. Virtual SAN (VSAN), the hyper-convergence solution from VMware, is seeing accelerated adoption as part of this trend. VSAN pools distributed disk resources into one logical disk with a Virtual Machine File System (VMFS) on top of it, thereby assembling one logical datastore. It is tightly integrated with the ESXi hypervisor, which allows it to achieve high scale, performance, and ease of use.

Big Cloud Fabric (BCF) is a Software Defined Networking (SDN) fabric that brings agility, cost savings, and other benefits of virtualization to data center networks. It is an integrated networking solution based on an SDN controller and open networking hardware (whitebox / britebox) that allows a multi-rack network to be represented as one logical switch. As shown in Figure 1, there is a lot of synergy between the “one logical switch” and “one logical datastore” models, and combining them in one data center allows admins to realize the maximum benefits of SDN and hyper-convergence.

 

Figure 1: Big Cloud Fabric and VMware VSAN architectural synergy

 

With the distributed nature of hyper-converged solutions and an increase in east-west traffic between storage nodes, the role of the physical network becomes critical. During deployment and operation, VSAN and network admins will face cross-silo interaction challenges that will ultimately slow the project down. Let's take a look at what they are and how the already proven integration between Big Cloud Fabric and VMware vSphere allows you to keep the project on track and deploy VSAN in minutes.

Network Provisioning

If you are an admin trying to deploy VSAN, the last thing you want to worry about is your network configuration. Typical tasks like attaching VSAN nodes (ESXi hosts in a VSAN cluster) to leaf switches, configuring VLANs, and enabling multicast take time and are error-prone when done manually. If you are a network admin, you are probably tired of responding to tickets asking you to provision the “plumbing”. We often hear that network admins just want to respond with “don’t come to me for VLANs!” While some of these tasks can be automated using scripts or templates inside a bolted-on management solution, those scripts and templates take time to validate and have to be maintained. Let’s compare the amount of work needed to provision VSAN on top of a traditional box-by-box network against Big Cloud Fabric. Keep in mind that we are not taking any shortcuts here and want to be in full compliance with all VMware VSAN networking best practices and recommended designs.

Figure 2: Network Configuration Tasks for VMware VSAN

As shown in Figure 2, a switch in a traditional network needs quite a bit of configuration to enable Layer 2 as well as Layer 3 unicast and multicast routing. Many vendors have evolved their network designs to spine-leaf, where a pure Layer 3 spine does not need much of this configuration. While that is true, spine switches make up a shrinking share of the total number of switches as a design scales out, so the benefit of a lean spine is negligible. Meanwhile, all of this configuration has to be applied on each leaf switch.
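To make the scale-out argument concrete, here is a minimal back-of-the-envelope sketch. The numbers are illustrative assumptions, not values from Figure 2: the spine count is held fixed at two while racks are added, and `TASKS_PER_LEAF` is a hypothetical placeholder for however many manual items a leaf needs in a given environment.

```python
def spine_fraction(num_spines: int, num_leaves: int) -> float:
    """Share of all fabric switches that are spines."""
    return num_spines / (num_spines + num_leaves)


def leaf_config_touches(num_leaves: int, tasks_per_leaf: int) -> int:
    """Manual configuration items when every leaf is configured box by box."""
    return num_leaves * tasks_per_leaf


if __name__ == "__main__":
    NUM_SPINES = 2       # assumption: spine count stays fixed as leaves are added
    TASKS_PER_LEAF = 8   # assumption: placeholder value, not taken from Figure 2

    for leaves in (4, 16, 32):
        share = spine_fraction(NUM_SPINES, leaves)
        touches = leaf_config_touches(leaves, TASKS_PER_LEAF)
        print(f"{leaves:>2} leaves: spines are {share:.0%} of switches, "
              f"{touches} manual leaf config items")
```

Under these assumptions the spine share falls steadily as leaves are added, while the manual work on the leaves grows linearly with the number of racks.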

Big Cloud Fabric requires only two items to be configured, once, during initial deployment. The rest is taken care of automatically: vCenter sends information about VSAN nodes and their virtual networking configuration to the BCF controller, and the controller then programs the physical network. Adding more capacity and workloads to a VSAN cluster is now simpler than ever, regardless of what changes are made on VSAN or on Big Cloud Fabric:

  • New hard drive addition to existing nodes is transparent.

  • New node addition does not require manual entry of complex LAG / MLAG configurations on each connected switch.

  • Leaf switch addition is transparent and non-disruptive to the currently running VSAN cluster.

  • Leaf switch uplink addition for more bandwidth is transparent.

  • Regardless of whether VSAN communication happens over L2 or L3, zero complex unicast or multicast routing configuration is required. Enabling multicast, which is a requirement for VSAN, is literally one BCF GUI click!

Visibility and Troubleshooting

Solving configuration challenges is necessary but not sufficient. How does a VSAN admin really know whether the "plumbing" is the reason for poor performance of an application using VSAN storage? With VSAN running on top of a traditional box-by-box network, the VSAN admin is left to hope and pray that the network admin did not misconfigure something. How does a network admin prove that the network is not the culprit? It often takes days of back-and-forth communication between network and storage teams to root-cause the problem.

VMware recently released VSAN 6.2 with many useful cluster troubleshooting tools. The BCF + VSAN solution builds on these tools to further reduce troubleshooting time. With the controller providing full network visibility, the VSAN admin can use the BCF plug-in for vSphere Web Client to zero in on the exact problem area instead of simply knowing that there is a problem. As seen in Figure 3, the plug-in allows the VSAN admin to see the physical fabric configuration as it pertains to the specific VSAN cluster. The BCF Test Path feature has recently been incorporated into the plug-in, allowing the VSAN admin to trace the exact traffic path between two VMkernel interfaces used for VSAN communication. A successful test means that the network is not the problem and other areas need to be explored. If the network is the culprit, a failed test shows the exact location and reason for the poor performance or communication outage of a VSAN cluster.

This level of visibility into the network is available to the VSAN admin at the click of a button, without having to log in to a single networking component (including the BCF controller) or coordinate with the network admin. All of this can be accessed from the vSphere GUI familiar to the VSAN admin. With so much on their plate, no VSAN admin dreams of deciphering cryptic error messages from the network, so the enhanced network error reporting inside the plug-in is designed to be intuitive to VSAN admins. It enables them to understand what is happening in the network and provide precise information to the network admin for faster troubleshooting, without needing to know anything about the architecture or operations of the fabric.
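The idea behind a path test can be illustrated with a toy model. The sketch below is not the BCF API or the plug-in's implementation. It is a hypothetical walk over a list of hops between two VMkernel interfaces, with made-up host and switch names, that reports the first link found to be down.

```python
from typing import List, Optional, Set, Tuple

Link = Tuple[str, str]


def trace_path(path: List[str], down_links: Set[Link]) -> Tuple[bool, Optional[Link]]:
    """Walk consecutive hops; return (ok, first_failed_link).

    Links are treated as undirected, so (a, b) and (b, a) match the same link.
    """
    for a, b in zip(path, path[1:]):
        if (a, b) in down_links or (b, a) in down_links:
            return False, (a, b)
    return True, None


if __name__ == "__main__":
    # Hypothetical hop list between two VSAN VMkernel interfaces.
    path = ["esxi-01:vmk1", "leaf-1", "spine-1", "leaf-3", "esxi-07:vmk1"]

    ok, failed = trace_path(path, down_links=set())
    print("healthy fabric:", ok, failed)

    ok, failed = trace_path(path, down_links={("leaf-3", "spine-1")})
    print("one link down:", ok, failed)
```

A passing result points the investigation away from the network, while a failing result names the specific link to hand to the network admin, which is the workflow described above.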


Figure 3: Benefits of BCF Plugin for VMware vSphere Web Client

 

Ready to accelerate your VMware VSAN deployment? Feel free to reach out to us at info@bigswitch.com for more information on Big Cloud Fabric, and join us on this journey as we continue to innovate and disrupt the status quo of networking. To learn more about BCF with VMware VSAN, join our webinar on Sept. 8th. To register: http://bigswitch.com/webinars#upcoming

 

Arkadiy Shapiro

Principal Technical Marketing Engineer, Big Switch Networks

 

Srinivasan Ramasubramanian

Chief Architect, Big Switch Networks

 

 To Learn More