Get rich data center telemetry with DPU-powered switches
Network telemetry is a source of truth for network engineers and security operations teams. Telemetry takes a variety of forms, including SNMP, device memory and CPU utilization, port status, firewall syslogs, and flow records. Flow records are particularly valuable because they track the source and destination of communications, identify applications, and monitor bandwidth consumption by devices, protocols, and applications.
However, telemetry can be hard to collect, especially in the data center. The typical data center approach is to attach hardware probes to network devices, or to install software on the servers. While these probes and agents can gather flow records, they tend to be expensive and complicated to deploy and only provide visibility where deployed, which typically shows just a fraction of the overall data center traffic. To get full fidelity, you'd almost have to build a second network, which is cost-prohibitive. What's more, devices or software agents also need to be monitored and maintained, which adds to the to-do lists of busy network engineers.
Given these constraints, many companies rely on the sampled telemetry they can gather from the data center switches. This approach means typical solutions can only provide insights based on a small sample of total network traffic; in some cases, as little as 1 in every 8,000 flows, or 0.0125% of all traffic.
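To make the sampling arithmetic concrete, here is a minimal Python sketch. The function names and the one-million-flow workload are illustrative assumptions, not figures from any product documentation; only the 1-in-8,000 ratio comes from the text above.

```python
from fractions import Fraction


def sampled_share(one_in_n: int) -> Fraction:
    """Fraction of flows that a 1-in-N sampler observes."""
    return Fraction(1, one_in_n)


def expected_visible(total_flows: int, one_in_n: int) -> float:
    """Expected number of flows observed out of total_flows."""
    return total_flows / one_in_n


share = sampled_share(8000)
print(f"{float(share) * 100:.4f}% of flows are sampled")

# Hypothetical workload: one million flows crossing the fabric.
print(expected_visible(1_000_000, 8000), "flows expected to be visible")
```

Under these assumptions, roughly 125 of one million flows would be observed, which is the gap that non-sampled telemetry is meant to close.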
I believe that this limited sampling is not acceptable. It restricts visibility and doesn't provide a full picture of the data center. It also hampers the effectiveness of AIOps tools by only providing partial awareness of what is happening in the network. Using only sampled flows creates a "garbage in, garbage out" scenario that drastically restricts the insights that modern AI/ML tools can provide.
The value of rich telemetry
Rich telemetry indicates the state of the network as well as the health of individual devices in it. It provides insights into performance and is essential for troubleshooting. With access to the right telemetry, network engineers can shorten mean time to resolution (MTTR), or mean time to innocence (MTTI) when the network isn't at fault.
Telemetry is also valuable for security operations. By tracking the east-west movement of traffic through a network fabric, security teams may be able to identify anomalies or patterns that indicate suspicious behavior, be it an intruder mapping out resources or an insider trying to access sensitive systems.
Lastly, telemetry is vital for network automation, including AIOps. AI and ML tools are fueled by telemetry; it is the raw data they analyze to generate context-based insights or take automated actions. Without telemetry, there would be no modern AIOps. Today, feeding non-sampled flows into AI/ML tools creates the conditions for the advanced automation that has been needed for decades in the data center.
DPUs put eyeballs in your switches
So how do you get better telemetry from your data center? A new option is to marry the computing power of data processing units (DPUs) with data center switches. The DPU is an evolution of the SmartNIC; it is a programmable processor designed to offload and accelerate networking, security, and other data center infrastructure services. DPUs can be deployed in servers and switches. By adding DPUs to Top of Rack (ToR) switches, network engineers can collect and export telemetry such as flows and logs via a compute platform that sits directly in the path of your data center traffic, the traffic to and from servers hosted in the data center.
HPE Aruba Networking and AMD have partnered to develop the industry's first DPU-enabled switch, the HPE Aruba Networking CX 10000 with AMD Pensando™ switch. The CX 10000 is a 1 RU device that offers 3.6 Tbps of standard line-rate stateless switching and supports 1, 10, and 25 GbE port options to servers with 40/100 GbE uplinks.
According to HPE, this CX 10000 Distributed Services Switch further delivers stateful services at 800 Gbps of throughput in each server rack. With its integrated programmable DPU, it can offer highly scalable east-west network firewall security, full non-sampled telemetry, IPsec encrypt/decrypt, and network address translation services. The form factor of the CX 10000 is designed to distribute these services to the edge of the data center fabric, directly connected to each server; by doing so, service resources automatically scale along with data center workloads. This is the same architecture leveraged by many of the world's largest hyperscalers.
The CX 10000 can export firewall logs as well as industry-standard non-sampled IPFIX flow records. Network engineers can set the interval at which flow records are exported based on their requirements, from as granular as every second to longer periods such as one or five minutes.
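For readers who have not worked with IPFIX on the wire, the following Python sketch decodes the fixed 16-byte message header that RFC 7011 defines for every IPFIX message. It is a generic illustration, not the CX 10000's export implementation: the `IpfixHeader` type, the `parse_ipfix_header` function, and the sample bytes are all made up for this example, and a real collector would go on to parse the template and data sets that follow the header.

```python
import struct
from typing import NamedTuple

IPFIX_VERSION = 10  # version number carried by every IPFIX message (RFC 7011)
HEADER = struct.Struct("!HHIII")  # network byte order, 16 bytes in total


class IpfixHeader(NamedTuple):
    version: int
    length: int                 # total message length in bytes, header included
    export_time: int            # seconds since the UNIX epoch
    sequence_number: int
    observation_domain_id: int


def parse_ipfix_header(datagram: bytes) -> IpfixHeader:
    """Decode and validate the fixed IPFIX message header."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than the 16-byte IPFIX header")
    header = IpfixHeader(*HEADER.unpack_from(datagram))
    if header.version != IPFIX_VERSION:
        raise ValueError(f"not an IPFIX message (version {header.version})")
    return header


# Synthetic header-only message, built here purely for illustration.
sample = HEADER.pack(IPFIX_VERSION, HEADER.size, 1_700_000_000, 42, 1)
print(parse_ipfix_header(sample))
```

Checking the version field first is a cheap way to reject NetFlow v5/v9 or unrelated UDP traffic that reaches the collector port.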
In the flow
For years, organizations have been bolting telemetry solutions onto the network. By embedding DPUs into the switch, telemetry capabilities are now woven into the network fabric itself. And because these capabilities are offloaded to DPUs, there is no impact on switch performance.
By monitoring flow records and logs, network engineers can quickly spot congestion, retransmission, packet drops, and bandwidth-hogging applications. This can speed up troubleshooting, and even allow network engineers to head off issues before they impact application performance or service levels. Uniquely, since there is now telemetry for all flows in the network, network visibility is now mapped directly to each application instead of the legacy model of examining trunk usage.
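As a rough illustration of how per-flow records map visibility to applications, the Python sketch below totals bytes per (source, destination, destination port) tuple and lists the heaviest talkers. The `FlowRecord` fields and the sample addresses are hypothetical; real IPFIX records carry many more fields, and a production analyzer would do this at far larger scale.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, simplified flow record for illustration only."""
    src_ip: str
    dst_ip: str
    dst_port: int
    byte_count: int


def top_talkers(flows: Iterable[FlowRecord], n: int = 3):
    """Return the n (src, dst, port) tuples that moved the most bytes."""
    totals = Counter()
    for flow in flows:
        totals[(flow.src_ip, flow.dst_ip, flow.dst_port)] += flow.byte_count
    return totals.most_common(n)


# Made-up records standing in for exported flow data.
flows = [
    FlowRecord("10.0.0.1", "10.0.1.5", 443, 5000),
    FlowRecord("10.0.0.2", "10.0.1.5", 443, 1200),
    FlowRecord("10.0.0.1", "10.0.1.5", 443, 7000),
    FlowRecord("10.0.0.3", "10.0.1.9", 5432, 9000),
]

for key, total in top_talkers(flows, n=2):
    print(key, total)
```

With sampled data, the same aggregation could miss small or short-lived flows entirely, which is why the totals are only as trustworthy as the records feeding them.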
Of course, it's one thing to collect telemetry; it also needs to be analyzed. This analysis is best handled by dedicated systems such as flow analyzers, log collectors, and SIEMs. HPE Aruba Networking has developed a set of APIs to provide flow records and logs to a variety of third-party tools that are widely used in network operations centers (NOCs) and security operations centers (SOCs). These integrations include solutions from Splunk, Elastic, Guardicore, and Augtera Networks.
And as more AI and ML-driven systems come to market, the DPU-powered CX 10000 switch will be ready to fuel these tools with the high-fidelity telemetry required for these systems to provide accurate, context-based insights or take automated actions.
I can see clearly now
Network engineers have lacked the ability to gather comprehensive telemetry in data center networks because of complex, cost-prohibitive collection architectures. That changes with the CX 10000, which now makes rich telemetry available for collection and analysis. HPE Aruba Networking and AMD have developed a unique approach that inserts telemetry collection directly into the data center fabric.
About the Author
Scott Stevens
With over 25 years of experience in the networking and security industries, Scott leads the Global Systems Engineering team for AMD Pensando. Here he is responsible for driving the DPU embedded solutions of AMD Pensando, both the Aruba CX 10000 and the VMware Project Monterey enabled DPUs. In addition, he leads the Technical Business Development team, bringing innovative new integrations with third-party AI/ML vendors to complement and automate the Aruba CX 10000 solution.
Previously, he ran the Global Systems Engineering team for Palo Alto Networks, building the team from 100 to over 1,300 customer-facing field engineers as revenue ramped to over $4B annually. Prior to that he ran the Global SE team for Juniper Networks, spending 14 years building various Systems Engineering and Sales organizations.
Scott holds a master's degree in business from Oklahoma City University and a Bachelor of Science degree in Electrical Engineering from the Massachusetts Institute of Technology (MIT). He speaks regularly at industry conferences and is viewed as an industry visionary in the area of network and security architectures.