As new technologies such as software-defined networking (SDN) and network functions virtualization (NFV) continue to change how networks are architected, it’s becoming increasingly difficult for IT to get a complete picture of the entire network or to measure performance KPIs accurately.
To compensate, organizations often use individual tools to solve individual problems, which can result in tool sprawl (a financial and resource burden), and/or they rely on a single source of network data, such as SNMP, which is no longer sufficient in today’s hybrid IT landscape. To overcome these challenges, organizations are increasingly deploying network performance monitoring and diagnostic (NPMD) platforms that collect and visualize a variety of network data, so IT can proactively manage the network from the core (data centers) to the edge (cloud or remote sites).
There are several types and formats of networking data, and each is useful for monitoring and troubleshooting in its own way. They all have pros, cons and unique quirks, and the most effective IT teams monitor as many of these data types as possible. What data types should you be collecting and why? Industry information for this eWEEK Data Points article is provided by Jay Botelho, director of engineering at LiveAction.
Data Point No. 1: Network Telemetry
These types of networking data are drawn from the network devices themselves and exported for off-device processing and performance analytics:
Flow: “Flow” is a generalized term that includes both NetFlow (which was created by Cisco Systems in approximately 1996) and an array of variants such as sFlow, J-Flow and IPFIX. Each of these gives an effective path view of traffic across a network by providing useful performance data on each device and interface along the entire source-to-destination path. Flow excels at tracking near real-time path data, which supports active notification and isolation of issues caused by changes in the network.
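To make the "path view" idea concrete, here is a minimal sketch of how a collector might group exported flow records by conversation and by the device that reported them. The FlowRecord fields, the exporter names and the path_view helper are illustrative assumptions, not the record layout of NetFlow, sFlow or IPFIX, and not any vendor's implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    """A simplified, hypothetical flow record as seen by a collector."""
    src: str
    dst: str
    src_port: int
    dst_port: int
    protocol: int   # IP protocol number, e.g. 6 for TCP
    octets: int     # bytes reported in this record
    exporter: str   # device that exported the record


FlowKey = Tuple[str, str, int, int, int]


def path_view(records: Iterable[FlowRecord]) -> Dict[FlowKey, Dict[str, int]]:
    """Group records by 5-tuple, then total the bytes each exporter reported.

    Devices that report the same conversation lie on its path, so the
    inner dict is a rough per-hop picture of that conversation.
    """
    view: Dict[FlowKey, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for r in records:
        key = (r.src, r.dst, r.src_port, r.dst_port, r.protocol)
        view[key][r.exporter] += r.octets
    return {key: dict(per_device) for key, per_device in view.items()}
```

A conversation whose byte counts differ sharply between two exporters, or that disappears from one of them, is a natural place to start isolating a problem.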
SNMP: The Simple Network Management Protocol polls network elements for a set of objects exposed through a Management Information Base (MIB). The MIB is a description of the network objects that can be polled and managed with SNMP. This provides data on devices, interfaces, CPU and other resources for monitoring and collecting the status of the network infrastructure. It’s a good foundation for basic network up/down monitoring, but SNMP typically does not provide the detail needed to analyze the root cause of application performance problems or of many user experience issues, such as those tied to quality of service (QoS) policies and tunnel performance.
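As an illustration of what polled SNMP data is typically turned into, the sketch below derives interface utilization from two successive readings of an octet counter such as ifInOctets and the interface speed (ifSpeed), both standard IF-MIB objects. The function names are hypothetical, the polling itself is left out, and the code assumes the counter wrapped at most once between polls.

```python
def counter_delta(previous: int, current: int, bits: int = 32) -> int:
    """Difference between two counter readings, allowing for one wrap.

    SNMP counters roll over to zero at 2**bits, so a plain subtraction
    goes negative after a wrap; the modulo corrects for that.
    """
    return (current - previous) % (2 ** bits)


def utilization_percent(prev_octets: int, curr_octets: int,
                        interval_s: float, if_speed_bps: float,
                        bits: int = 32) -> float:
    """Average utilization over one polling interval, as a percentage."""
    if interval_s <= 0 or if_speed_bps <= 0:
        raise ValueError("interval and interface speed must be positive")
    delta_bits = counter_delta(prev_octets, curr_octets, bits) * 8
    return 100.0 * delta_bits / (interval_s * if_speed_bps)
```

For example, 37,500,000 octets received over a 300-second interval on a 10 Mbps interface works out to 10% utilization. The result is an average over the whole interval, which is one reason polled data alone can hide short bursts.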
Data Point No. 2: Synthetic Testing
Virtual agents: Cloud applications can lack the visibility and performance data needed to confirm users are getting the experience they expect. By using virtual software agents, IT can continuously monitor these important applications to verify they’re delivering the latency and path quality that good end-user performance requires.
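A synthetic test can be as simple as timing a TCP connection to the application on a schedule. The sketch below is a minimal, assumed example of that idea using only the Python standard library; the function names are hypothetical, and a commercial agent would measure far more than connect time.

```python
import socket
import statistics
import time
from typing import List, Optional


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return None


def summarize(samples: List[Optional[float]]) -> dict:
    """Reduce a series of probe results to a few reportable numbers."""
    ok = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "failed": len(samples) - len(ok),
        "median_ms": statistics.median(ok) if ok else None,
        "worst_ms": max(ok) if ok else None,
    }
```

An agent could run something like summarize([tcp_connect_ms("app.example.com", 443) for _ in range(5)]) every minute, where the host name is a placeholder, and alert when the failure count or median crosses a threshold.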
Data Point No. 3: Application Recognition
NBAR and NBAR2: Network-based application recognition is a mechanism that classifies and regulates bandwidth for network applications on Cisco routers. This lets network administrators view the mix of applications in use on the network at any given time and decide how much bandwidth to allow each application so that available resources are used as efficiently as possible. NBAR can extract information from applications such as HTTP URL, HTTP User Agent and SIP URL for export or classification. NBAR2 detects well over 1,000 applications with regular updates through NBAR2 Protocol Packs. NBAR2 identifies applications regardless of the ports on which they may be running. Application categorization uses NBAR2 attributes to group similar applications, which simplifies application management for both classification and reporting.
AVC: Application Visibility and Control incorporates several technologies, including application recognition and performance monitoring capabilities, into the WAN router platform. Previously, network traffic could easily be identified using well-known port numbers, such as Port 80 for HTTP. Today, however, many more applications, both business and recreational, are delivered over HTTP. Many others, such as Exchange and the voice and video applications delivered over RTP, use dynamic ports, which makes them impossible to identify by looking at port numbers. In addition, some applications disguise themselves as HTTP because they do not want to be detected. As a result, identifying applications by checking well-known port numbers is no longer viable. AVC fills this gap.
Cisco AVC is enabled in Cisco IOS and IOS XE software. AVC data is collected using a combination of metric providers, embedded monitoring agents and Flexible NetFlow. It includes both TCP performance metrics, such as bandwidth use, response time and latency, and RTP performance metrics, such as packet loss and jitter. These metrics are aggregated and exported in NetFlow v9 or IPFIX format to a management and reporting package.
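For readers unfamiliar with the RTP metrics mentioned above, the sketch below shows one common way to compute them: the interarrival jitter estimator defined in RFC 3550 and a simple loss percentage from sequence numbers. This is an illustration of the standard formulas, not a description of how Cisco AVC computes its metrics. The function names are hypothetical, and the loss calculation assumes the 16-bit sequence number did not wrap within the sample.

```python
from typing import Iterable, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, float]]) -> float:
    """RFC 3550 jitter estimate.

    Each item is (rtp_timestamp, arrival_time), both in the same clock
    units. The transit time difference between consecutive packets is
    smoothed with a gain of 1/16, as the RFC specifies.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in packets:
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


def loss_percent(sequence_numbers: Iterable[int]) -> float:
    """Share of packets missing between the lowest and highest sequence seen."""
    seen = set(sequence_numbers)
    if not seen:
        return 0.0
    expected = max(seen) - min(seen) + 1
    return 100.0 * (expected - len(seen)) / expected
```

A stream whose packets all take the same time to arrive has zero jitter no matter how high the latency is, which is why latency and jitter are reported as separate metrics.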
Data Point No. 4: Systems Integration and Packet Capture
APIs: An application programming interface is a set of subroutine definitions, communication protocols and tools for building software. In general terms, it’s a set of clearly defined methods of communication among various components. In today’s SDN environments, the control plane is typically centralized, with a management application and controller that define policies and configurations and push them down to the devices and functions. An API integration with these management systems gives the monitoring platform access to path and application ID information, so it can determine the business class of traffic and how that traffic is routed through the SDN environment.
Also, many performance and analytics platforms, including LiveAction’s LiveNX, use APIs to integrate with ticketing software such as ServiceNow to streamline incident management workflows. When an alert is triggered, the analytics platform can automatically create an incident ID (trouble ticket) that carries semantic information such as the location, the time and the alert that triggered the incident. This reduces the time it takes for the data to reach the engineers working to solve the issue.
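The alert-to-ticket handoff described above amounts to mapping alert fields onto a ticket payload and posting it to the ticketing system's API. The sketch below covers only the mapping step. The alert dictionary keys and the build_incident helper are hypothetical, and the payload field names are assumptions that would need to be matched to the fields of the actual ticketing system.

```python
import json
from datetime import datetime, timezone


def build_incident(alert: dict) -> dict:
    """Turn a monitoring alert into a ticket payload (assumed field names)."""
    raised = datetime.fromtimestamp(alert["epoch_seconds"], tz=timezone.utc)
    return {
        # One-line summary an engineer sees in the ticket queue.
        "short_description": f"{alert['name']} at {alert['site']}",
        # Full context: what fired, where and when.
        "description": (
            f"Alert: {alert['name']}\n"
            f"Site: {alert['site']}\n"
            f"Raised (UTC): {raised.isoformat()}\n"
            f"Detail: {alert['detail']}"
        ),
    }


def to_json(payload: dict) -> str:
    """Serialize the payload for the body of an HTTP POST request."""
    return json.dumps(payload, sort_keys=True)
```

Keeping the mapping separate from the HTTP call makes it easy to test, and to retarget if the organization changes ticketing tools.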
Packet Data: Being able to capture packets and write them to disk allows for detailed network troubleshooting to fix problems that can’t be solved with flow data alone. For example, a flow with high latency could have several root causes. Packet data allows IT to see whether a particular application or a specific user is causing that latency, and how often it occurred.
Data Point No. 5: Achieving End-to-End Network Visibility
As we’ve seen, there are a number of ways to gather data to measure the performance of networked applications, depending on what the resources are and where they are located. Ultimately, multiple data sets are required to deliver a complete end-to-end view of the current state of a network. IT teams should look for tools that gather data from multiple sources, analyze it and present it as consumable insights for NetOps teams.
If you have a suggestion for an eWEEK Data Points article, email cpreimesberger@eweek.com.