WorkloadIntelligence

Maximizing Workload Performance in the Data Center


Challenges in the Data Center

Understanding real-world application workloads is easier said than done, and it has traditionally been one of the most challenging obstacles to storage validation and performance. Most organizations have little understanding of how their applications interact with their storage infrastructure. Many have tried internally developed or third-party load-generation tools to simulate production workloads. In some cases, these basic, synthetic workloads are “good enough,” but they never reproduce real-world production workloads accurately.


Data Centers face several challenges related to understanding their workloads, such as:

  • Inability to proactively capture, visualize and analyze production I/O workload problems when they are occurring in real time.
  • The lack of deep analytics tools to analyze, assess, and tune application performance to fix the problem.
  • New flash-based SSD technology that behaves differently from legacy mechanical HDDs.
  • No method for Data Centers to share and replay these production I/O workloads with SSD suppliers to help improve the storage and shorten its qualification cycle.
  • The difficulty in keeping up with new features and functionality since storage and SSD technology moves so quickly.

In response, Teledyne LeCroy has worked closely with its Data Center (and SSD) customers to deliver deep capture, analytics and replay tools called WorkloadIntelligence to identify, characterize and solve performance problems before systems are deployed.

WorkloadIntelligence Applications

The three WorkloadIntelligence applications (DataAgent, Analytics, and Replay) work in concert to solve some of the toughest challenges Data Center operators face daily.

WorkloadIntelligence has been used successfully to:

  • Find system and application latencies affecting performance
  • Detect database latencies affecting overall application performance
  • Characterize current and future workload performance
  • Identify rogue or unoptimized processes
  • Ensure applications utilize and allocate CPU cores efficiently
  • Tune the application and Linux block layer to achieve better workload performance


Deep Capture with WorkloadIntelligence Data Agent

DataAgent is a smart capture tool developed as a series of custom plug-ins that can capture different variants of production system and workload data in Data Center environments. The tool’s modular approach allows users to customize it for the desired system elements and keep system overhead extremely low. The plug-ins feature minimal CPU utilization and user-defined memory usage based on buffer sizes.

Designed with NVMe in mind, DataAgent allows users to capture more than just the I/O packets common with generally available tools. The tool can also capture admin commands so users can better understand their impact on system performance. With its extensible architecture, DataAgent will be able to capture other related traffic to give users a more holistic view of how their system performs across a range of events.

The DataAgent software provides user-defined filters and triggers to capture only the relevant data. Filters restrict capture to data that meets user-defined conditions. Triggers take action when a user-defined condition is met (e.g. latency > 10 ms), such as capturing the surrounding data. The data buffers capture events from both before and after the trigger, which can be exported and then visualized and analyzed with Teledyne LeCroy’s WorkloadIntelligence™ Analytics software. With the triggering capability, events of interest can be found without looking for a needle in a haystack. Designed for high-volume traffic, DataAgent is extremely efficient at capturing trace event data. The DataAgent plug-ins are lightweight, efficient and ready for production use.
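The filter and trigger behavior described above can be illustrated with a minimal Python sketch. It is not DataAgent code: the `IoEvent` record, the `capture_around_trigger` function, and its parameters are hypothetical names chosen for illustration. It assumes a fixed-size circular buffer holds the most recent events, so that when a trigger fires the events leading up to it are still available, and a set number of following events is then appended.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, NamedTuple


class IoEvent(NamedTuple):
    """Hypothetical trace record; real DataAgent records are not shown in the source."""
    timestamp_us: int  # completion time in microseconds
    op: str            # "read", "write" or "discard"
    latency_us: int    # submission-to-completion time in microseconds


def capture_around_trigger(
    events: Iterable[IoEvent],
    keep: Callable[[IoEvent], bool],
    trigger: Callable[[IoEvent], bool],
    pre: int,
    post: int,
) -> Iterator[List[IoEvent]]:
    """Yield one window per trigger: up to `pre` events before it,
    the triggering event, and up to `post` events after it."""
    history = deque(maxlen=pre)  # circular buffer of the most recent events
    window = None                # window currently being filled, if any
    remaining = 0                # post-trigger events still to collect
    for event in events:
        if not keep(event):      # filter: skip events that are not of interest
            continue
        if window is None and trigger(event):
            window = list(history) + [event]
            remaining = post
        elif window is not None:
            # Triggers that fire inside an open window are simply recorded in it.
            window.append(event)
            remaining -= 1
        if window is not None and remaining == 0:
            yield window
            window = None
        history.append(event)
    if window is not None:
        yield window             # stream ended before `post` events arrived


# Example: keep reads only, trigger on latency above 10 ms (10,000 us).
latencies = [1000, 2000, 3000, 15000, 4000, 5000, 6000]
events = [IoEvent(i, "read", lat) for i, lat in enumerate(latencies)]
windows = list(
    capture_around_trigger(
        events,
        keep=lambda e: e.op == "read",
        trigger=lambda e: e.latency_us > 10_000,
        pre=2,
        post=1,
    )
)
```

In this sketch the memory cost is bounded by `pre` and `post`, which mirrors the idea of memory usage being defined by buffer sizes; how DataAgent sizes or manages its buffers internally is not specified here.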

DataAgent is shipped with four distinct DataAgent plug-ins:

  • Storage Lite Collector: Captures Linux block events (blktrace replacement)
  • System Statistics Collector: Captures CPU, memory, and network events
  • Storage Pro Collector: Captures other storage information (e.g. NVMe admin commands)
  • SVF Collector: Captures events from our OakGate SSD Test Platform (i.e. NVMe Zones)


Visualizing and Analyzing with WorkloadIntelligence Analytics

The WorkloadIntelligence™ Analytics application enables users to optimize their infrastructure and applications in high-performance, hyperscale environments. It analyzes imported DataAgent deep workload traces as well as workload traces captured with standard methods, such as Linux block layer I/O traces from data center backend servers.

With the Analytics application, users can review, analyze and synchronize data from the application layer to the physical storage layer. The analytics tool makes it simple to create advanced performance charts with an extensive selection of parameters.

The Analytics Dashboard enables the user to create canvases of built-in or custom high-performance charts of various parameters and correlations. These charts can represent a range of hundreds of millions of data points to a single data point depending on the selected time range. This remarkable charting flexibility and depth provides an extraordinary insight into workload data for deep workload analytics, leading to well-informed decisions for SSD development and managing data center technology.
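One common way to chart a very large trace at any zoom level is to reduce the selected time range to a fixed number of buckets. The Python sketch below illustrates that general technique only; it is an assumption, not a description of how the Analytics application renders its charts, and `bucket_max_latency` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Tuple


def bucket_max_latency(
    points: Iterable[Tuple[int, int]],
    start_us: int,
    end_us: int,
    buckets: int,
) -> List[Tuple[float, int]]:
    """Reduce (timestamp_us, latency_us) points inside [start_us, end_us)
    to at most `buckets` chart points, keeping the peak latency per bucket."""
    if buckets <= 0 or end_us <= start_us:
        raise ValueError("need a positive bucket count and a non-empty time range")
    width = (end_us - start_us) / buckets
    peaks: Dict[int, int] = {}
    for timestamp_us, latency_us in points:
        if not (start_us <= timestamp_us < end_us):
            continue  # outside the selected time range
        index = min(int((timestamp_us - start_us) / width), buckets - 1)
        peaks[index] = max(latency_us, peaks.get(index, latency_us))
    # Report each non-empty bucket at its start time.
    return [(start_us + index * width, peaks[index]) for index in sorted(peaks)]


# Example: five points, a 10 us range, two buckets.
chart = bucket_max_latency(
    [(0, 10), (1, 50), (5, 20), (9, 30), (12, 99)], start_us=0, end_us=10, buckets=2
)
```

Keeping the per-bucket maximum preserves latency spikes that an average would hide; narrowing the time range while keeping the bucket count constant is what lets the same chart go from a summary of many points down to individual events.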

When operating the Analytics application, users can review and analyze I/O data and process identifiers (PIDs); plot charts with an extensive selection of parameters for reads, writes, and discard (trim) I/Os; evaluate advanced performance charts, which have detailed analytics and synchronization capabilities; and analyze and generate replay files for SSD analysis and comparative studies.
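As a rough illustration of the kind of per-operation statistics mentioned above, the following Python sketch groups trace records by I/O type (read, write, discard) and summarizes them. The record layout `(op, size_bytes, latency_us)` and the function name `summarize_by_op` are assumptions made for this example; they do not reflect the Analytics application's file formats or internals.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, Iterable, List, Tuple


def summarize_by_op(
    records: Iterable[Tuple[str, int, int]],
) -> Dict[str, Dict[str, float]]:
    """Summarize (op, size_bytes, latency_us) records per I/O type."""
    grouped: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for op, size_bytes, latency_us in records:
        grouped[op].append((size_bytes, latency_us))

    summary: Dict[str, Dict[str, float]] = {}
    for op, rows in grouped.items():
        latencies = [latency for _, latency in rows]
        summary[op] = {
            "count": len(rows),
            "total_bytes": sum(size for size, _ in rows),
            "mean_latency_us": mean(latencies),
            "median_latency_us": median(latencies),
            "max_latency_us": max(latencies),
        }
    return summary


# Example trace with reads, a write, and a discard (trim).
records = [
    ("read", 4096, 100),
    ("read", 4096, 300),
    ("write", 8192, 500),
    ("discard", 1048576, 50),
]
summary = summarize_by_op(records)
```

The same grouping could be keyed on a process identifier instead of the operation type to compare PIDs, assuming the trace records carry that field.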


Additionally, the Analytics application provides the capability to generate a WorkloadIntelligence™ Replay file from a trace collection, trace view, or specified time range that can be replayed in Teledyne LeCroy OakGate SVF Pro/Enduro storage validation software.

Running a replay file can provide deep insight into the device under test (DUT), such as:

  • Running it repeatedly, which can stress the DUT and expose areas of weakness or validate its stability.
  • Executing the same replay file on different DUTs, which can provide a comparative analysis between the DUTs.


Accurately Replay Production Workloads with WorkloadIntelligence Replay

Using WorkloadIntelligence Replay, decision makers can effectively and accurately replay real-world production workloads. The exact same baseline real-world workload can be replayed multiple times across multiple drives or firmware revisions, which helps decision makers understand what variation may exist within their drives. In addition, a deeper understanding of production workload behavior and performance is crucial to optimizing SSDs and storage systems.

Another important benefit of replay is that it does not require the physical production infrastructure to reproduce the workload, so the production infrastructure is not exposed to possible interruptions and performance impacts. Accurate workload replay ensures the workload is reproduced with the same performance, throughput, response time, and I/O request order.
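To make the idea of preserving I/O request order and timing concrete, here is a minimal Python sketch of a timing-preserving replay loop. It is an illustration under stated assumptions, not the WorkloadIntelligence Replay or SVF Pro/Enduro implementation: `IoRequest`, `build_schedule`, and `replay` are hypothetical names, and the sketch ignores queue depth, completion dependencies, and LBA mapping.

```python
import time
from typing import Callable, List, NamedTuple, Sequence, Tuple


class IoRequest(NamedTuple):
    """Hypothetical captured request; the real replay file format is not shown here."""
    timestamp_us: int  # capture time in microseconds
    op: str            # "read", "write" or "discard"
    lba: int           # starting logical block address
    blocks: int        # length in blocks


def build_schedule(trace: Sequence[IoRequest]) -> List[Tuple[float, IoRequest]]:
    """Pair each request with its offset in seconds from the first request.
    The stable sort keeps requests with equal timestamps in capture order."""
    ordered = sorted(trace, key=lambda request: request.timestamp_us)
    if not ordered:
        return []
    start = ordered[0].timestamp_us
    return [((r.timestamp_us - start) / 1_000_000, r) for r in ordered]


def replay(
    schedule: Sequence[Tuple[float, IoRequest]],
    submit: Callable[[IoRequest], None],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Issue each request in order, waiting until its original offset has elapsed."""
    began = clock()
    for offset_s, request in schedule:
        delay = offset_s - (clock() - began)
        if delay > 0:
            sleep(delay)
        submit(request)  # hypothetical hook that sends the I/O to the device under test


# Example: an out-of-order capture is replayed in timestamp order.
trace = [
    IoRequest(2_000_000, "write", 8, 8),
    IoRequest(1_000_000, "read", 0, 8),
    IoRequest(1_500_000, "read", 16, 8),
]
schedule = build_schedule(trace)
```

Passing `clock` and `sleep` as parameters lets the same schedule be driven by a simulated clock in tests, or run against several devices in turn so that each one sees an identical request sequence.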



WorkloadIntelligence Replay can provide deep insight into the DUT’s robustness and stability, such as:

  • Running a real-world replay file repeatedly, which can stress the DUT and expose areas of weakness or validate its stability.
  • Running the same replay file for multiple DUTs, which can provide a comparative analysis between the DUTs.

[Figure: WorkloadIntelligence replay process flow]

WorkloadIntelligence Product Features

DataAgent

  • Fast and efficient: traces every event without dropping any data, using in-memory circular buffers
  • Four data collectors are included:
      1. Storage Lite Collector: Captures Linux block events (blktrace replacement)
      2. System Statistics Collector: Captures CPU, memory, and network events
      3. Storage Pro Collector: Captures other storage information (e.g. NVMe admin commands)
      4. SVF Collector: Captures events from our OakGate SSD Test Platform (i.e. NVMe Zones)
  • Captures only relevant data with user-definable filters and triggers
  • Build your custom streaming engine from Collectors, Streaming Nodes, Buffers, Filters + Triggers, and Exporters
  • Minimal CPU utilization; memory usage definable based on buffers

Analytics

  • Detailed analytics and synchronization capability with the potential to calculate statistics on over 25 different workload trace attributes
  • Chart wizards to create charts of different types (such as scatter plots and histograms), with numerous predefined or user-defined parameters (such as time, latency, LBA, I/O size, CPU, and so on), and with various gradient colors to easily visualize data points and comparisons
  • The versatility to display canvases of multiple charts in one window, separate windows, or a combined window
  • Ingests DataAgent file types plus multiple other file types, such as Linux block trace files; OakGate SVF Pro/Enduro-generated files; and any generic, time-based, text-based files
  • Infinite zoom capability on performance charts
  • A list of process identifiers (PIDs)
  • The ability to take a snapshot of a time range of PIDs and their statistics
  • The flexibility to merge multiple traces, and analyze them as individual traces or a combined trace
  • The ability to save charts as PNG files and chart data as CSV files
  • A light and dark mode user interface

Replay

  • Easily transfer replay files with a flash drive or over the network, using SVF CLI commands or any off-the-shelf FTP application. Replay files are captured and stored automatically in the output directory
  • The results can be re-imported to the Analytics application for further analytics. Users can create customized filtered “views” of their workloads and then zoom in on any segment of the filtered workload view. All the performance graphs can then be synchronized to the zoom time stamps. The results can be compared and contrasted with the original workload(s)
  • Ensures the integrity of the workload during replay. The LBA Mapping Policy determines the type of policy to use
  • Integrated help system that provides interface hints, giving command and control context throughout the application
  • Replay is a feature of SVF Pro/Enduro and requires the user to have an OakGate SSD Test Solution


Ordering Information for DataAgent

Product Number     Product Description
OGT-WI-DA-50       WorkloadIntelligence DataAgent Base set of 4 Plugins Qty 50
OGT-WI-DA-100      WorkloadIntelligence DataAgent Base set of 4 Plugins Qty 100
OGT-WI-DA-500      WorkloadIntelligence DataAgent Base set of 4 Plugins Qty 500
000                WorkloadIntelligence DataAgent Base set of 4 Plugins Qty 1000
OGT-WI-DA-SITE     WorkloadIntelligence DataAgent Base set of 4 Plugins Qty Greater than 10,000



Ordering Information for Analytics

Product Number     Product Description
OGT-WIA-BASE-SW    WorkloadIntelligence Analytics Base Software Application 3-year License
OGT-WIA-1YR        WorkloadIntelligence Analytics Base Software Application 1-year License
OGT-WIA-UL         WorkloadIntelligence Analytics Single User License - 1 Year
OGT-WISAAS-PLTNM   WorkloadIntelligence Analytics Enterprise Platinum Package - SaaS 24x7x365
OGT-WISAAS-ENT     WorkloadIntelligence Analytics Enterprise Package - SaaS 12x7x365
OGT-WISAAS-PRO     WorkloadIntelligence Analytics Professional Package - SaaS 12x5 (Mon-Fri)



Ordering Information for Replay *

Product Number     Product Description
OGT-WIR-200-G4     WorkloadIntelligence Replay Appliance
OGT-WIR-SWP        WorkloadIntelligence Replay Software License Package and Hardware Add-in Card

*Replay is an application executed on an OakGate SSD Test Appliance running SVF Pro/Enduro.