Monitoring Problems In A Virtual World.

Trevor Dearing, EMEA marketing director, Gigamon.

These days it is difficult to find an organisation that has not considered virtualisation as a means of reducing cost, improving management flexibility, and boosting business agility. Cloud services are evolving at breakneck speed, and VMware and other technologies have enhanced their offerings of virtual services, increasing the appeal for businesses as a viable and scalable solution.

However, this shift has not been without its challenges, and organisations must now consider how to move from tactical to strategic virtualisation – namely by improving the effectiveness of traffic management, and extending established monitoring best practices into the virtual world.

Virtualisation and the invisible network.

Virtualisation has been a prominent feature of computing for many years, but recent developments have led to significantly improved security and greater interoperability between network and storage vendors. This is of particular interest now, as cash-conscious businesses look to disruptive technologies to improve the economics of their operations. Virtualisation improves efficiency and helps organisations do more with less – creating dynamic and flexible infrastructures by maximising resource utilisation while improving IT service delivery. It is therefore not surprising that adoption rates are increasing so rapidly, with Gartner forecasting that the cloud market will grow to $131 billion by 2017[1], largely driven by virtualisation.

As more business-critical workloads migrate to virtual servers, an increasingly large share of network traffic is occurring between virtual machines residing on the same host. And even when packets do hit the physical network, they are encapsulated to ensure delivery to the appropriate destination. To maintain end-to-end service delivery and optimise performance, selected data streams between virtual machines on the same host or across hosts therefore need to be extracted and delivered to external monitoring tools – without compromising security. However, as more of the application data path is shrouded inside the virtual switching infrastructure, just when visibility is most critical for problem diagnosis and monitoring, traditional approaches to managing and monitoring become very difficult.
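To make the encapsulation point concrete, here is a minimal sketch (Python, standard library only) of stripping a VXLAN header – one common encapsulation format – to recover the inner Ethernet frame before handing it to a monitoring tool. The function name and the toy frame are illustrative, not drawn from any particular product.

```python
VXLAN_PORT = 4789  # IANA-assigned UDP port for VXLAN

def vxlan_decapsulate(udp_payload: bytes):
    """Strip the 8-byte VXLAN header from a UDP payload and return
    (vni, inner_ethernet_frame). Assumes the payload really is VXLAN."""
    if len(udp_payload) < 8:
        raise ValueError("payload too short for a VXLAN header")
    flags = udp_payload[0]
    if not flags & 0x08:  # the 'I' flag must be set for the VNI to be valid
        raise ValueError("VXLAN 'I' flag not set")
    # VXLAN header layout: flags (1), reserved (3), VNI (3), reserved (1)
    vni = int.from_bytes(udp_payload[4:7], "big")
    return vni, udp_payload[8:]

# A toy encapsulated frame: VXLAN header (flags=0x08, VNI=42) + inner frame
inner = bytes(range(14))  # stand-in for an inner Ethernet header
encapsulated = bytes([0x08, 0, 0, 0]) + (42).to_bytes(3, "big") + b"\x00" + inner
vni, frame = vxlan_decapsulate(encapsulated)
print(vni, frame == inner)  # → 42 True
```

A real visibility node would perform this stripping in hardware at line rate; the sketch only shows why a tool fed the raw encapsulated stream sees VXLAN headers rather than the application conversation it was built to analyse.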

By its very nature, virtualisation creates numerous blind spots within the server infrastructure. With so much traffic flowing across the network and being encapsulated across virtual tunnels, or in many cases the traffic not hitting the physical network at all, network operators are losing all-important visibility and control.

In addition, with security and compliance often top of the agenda when it comes to virtualisation, visibility is frequently left on the back burner, and organisations struggle to reconcile these competing priorities and virtualise their environments effectively.

Finding the right approach.

Over the years, there have been many different approaches to monitoring virtualised environments, each with its own shortcomings. For instance, vSphere vMotion was developed to enable the live migration of running virtual machines from one physical server to another – allowing the creation of continuous, self-optimising virtual infrastructure. While promising, this adds a new layer of complexity, as administrators need to ensure that monitoring can be seamlessly updated to reflect numerous, inevitable changes to the server infrastructure. The lack of traceability and historical event tracking is also a problem, as configurations can be compromised when virtual machines are constantly reallocated.

The VN-Tag standard was proposed as a versatile alternative, providing access-layer extension without extending management and STP domains – boosting visibility in Cisco virtualised environments. However, VN-Tags add fields to the Ethernet frame, which renders most standard monitoring tools useless, as they are incapable of understanding the tagged traffic. In addition, the use of VN-Tags increases traffic on the host server’s physical network links.
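As a rough illustration of why tag-unaware tools stumble: an Ethernet II parser reads the EtherType at byte offset 12, and a VN-tagged frame presents the VN-Tag EtherType there (commonly cited as 0x8926 – treat that value, and the 6-byte tag length used below, as assumptions) instead of the IPv4 EtherType the tool expects. A minimal Python sketch:

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VNTAG = 0x8926  # assumed value for the Cisco VN-Tag EtherType

def ethertype(frame: bytes) -> int:
    # Standard Ethernet II: dst MAC (6) + src MAC (6) + EtherType (2)
    return struct.unpack("!H", frame[12:14])[0]

plain = b"\xaa" * 12 + struct.pack("!H", ETHERTYPE_IPV4) + b"payload"
# VN-Tag inserts a tag (its own EtherType plus tag fields) ahead of the
# original EtherType, pushing the IPv4 marker deeper into the frame
vn_tagged = (b"\xaa" * 12 + struct.pack("!H", ETHERTYPE_VNTAG)
             + b"\x00" * 4 + struct.pack("!H", ETHERTYPE_IPV4) + b"payload")

print(hex(ethertype(plain)))      # → 0x800 — a tool expecting IPv4 parses on
print(hex(ethertype(vn_tagged)))  # → 0x8926 — unknown to most tools; parsing stops
```

A tool that does not recognise the tag's EtherType cannot skip past it, so the entire frame is opaque to it – which is exactly the shortcoming described above.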

While many solutions have limitations, one of the most effective approaches is to introduce intelligent filtering techniques that enable specific traffic flows between virtual machines – on the same host or across hosts – to be selected, forwarded, and delivered to the appropriate monitoring, analysis, or security tools. Such solutions can be installed without invasive agents or changes to the hypervisor, allowing system managers to achieve the same packet-level traffic visibility between virtualised applications as is normally available between discrete physical applications and servers. At the same time, traffic extracted from physical links that carries packets destined for or sourced from virtual networks needs to be normalised and optimised via decapsulation before being served up to the monitoring tools. Operators can then filter these traffic flows based on user-defined criteria, before sending them to a secondary tool on the physical network to be aggregated, replicated, and made available to network performance, application performance, and security monitoring systems.
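The "user-defined criteria" step can be pictured as a simple rule table: each rule matches on whichever header fields it specifies and names the tool that should receive matching packets, with replication when several rules match. All names here (FilterRule, route, the tool labels) are hypothetical, chosen purely for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FilterRule:
    """A user-defined criterion; a None field matches anything."""
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    protocol: Optional[str] = None
    dst_port: Optional[int] = None
    tool: str = "default-monitor"  # which tool receives matching packets

def match(rule: FilterRule, pkt: dict) -> bool:
    return all(
        getattr(rule, f) is None or getattr(rule, f) == pkt.get(f)
        for f in ("src_ip", "dst_ip", "protocol", "dst_port")
    )

def route(rules, pkt):
    """Return every tool that should receive this packet (replication allowed)."""
    return [r.tool for r in rules if match(r, pkt)]

rules = [
    FilterRule(protocol="tcp", dst_port=443, tool="app-performance"),
    FilterRule(src_ip="10.0.0.5", tool="security-monitor"),
]
pkt = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "protocol": "tcp", "dst_port": 443}
print(route(rules, pkt))  # → ['app-performance', 'security-monitor']
```

The point of the sketch is the separation of concerns the article describes: selection criteria live in one place, while aggregation and replication to the downstream tools are handled by the forwarding layer.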

To combat the problems associated with the lack of historical tracking, visibility policies should be tied to each virtual machine being monitored, and migrate with them as they move across physical hosts. This enables visibility rules to synchronise, and leads to seamless, real-time adjustment of monitoring and security best practice in an agile, virtual environment.
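One way to picture a policy that follows the virtual machine is a registry keyed by VM identity rather than by host: a migration event updates the placement but leaves the rules untouched, so the same visibility configuration applies wherever the VM lands. A toy sketch (class and identifier names are hypothetical):

```python
class VisibilityRegistry:
    """Illustrative registry that keys monitoring policy to the VM,
    not to the host, so rules survive a live migration."""

    def __init__(self):
        self._policy = {}  # vm_id -> visibility policy
        self._host = {}    # vm_id -> current physical host

    def attach(self, vm_id, policy, host):
        self._policy[vm_id] = policy
        self._host[vm_id] = host

    def on_migrate(self, vm_id, new_host):
        # Only the placement changes; the policy is untouched, so the
        # same filter rules apply on the destination host immediately.
        self._host[vm_id] = new_host

    def effective_policy(self, vm_id):
        return self._host[vm_id], self._policy[vm_id]

reg = VisibilityRegistry()
reg.attach("vm-42", {"mirror": "https-flows"}, host="esx-01")
reg.on_migrate("vm-42", "esx-02")
print(reg.effective_policy("vm-42"))  # → ('esx-02', {'mirror': 'https-flows'})
```

Had the policy been keyed to the host instead, the migration step would have silently orphaned the monitoring rules – the historical-tracking gap described above.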

So, while network virtualisation may on the face of it seem like an unnerving visibility nightmare for many, deploying the right solutions to sort and forward traffic from the hypervisor can ease this headache and help more organisations reap the cost and flexibility benefits of the technology. Having an end-to-end solution that spans both the physical and virtual infrastructure further empowers application, server, and network engineers with the granular insight needed to ensure consistent quality of service without affecting productivity.