- Introduction:
- -------------
- The tracer hwlat_detector is a special purpose tracer that is used to
- detect large system latencies induced by the behavior of certain underlying
- hardware or firmware, independent of Linux itself. The code was developed
- originally to detect SMIs (System Management Interrupts) on x86 systems,
- but there is nothing x86-specific about this patchset. It was
- originally written for use by the "RT" patch since the Real Time
- kernel is highly latency sensitive.
- SMIs are not serviced by the Linux kernel, which means that it does not
- even know that they are occurring. SMIs are instead set up by BIOS code
- and are serviced by BIOS code, usually for "critical" events such as
- management of thermal sensors and fans. Sometimes though, SMIs are used for
- other tasks and those tasks can spend an inordinate amount of time in the
- handler (sometimes measured in milliseconds). Obviously this is a problem if
- you are trying to keep event service latencies down in the microsecond range.
- The hardware latency detector works by hogging one of the CPUs for configurable
- amounts of time (with interrupts disabled), polling the CPU Time Stamp Counter
- for some period, then looking for gaps in the TSC data. Any gap indicates a
- time when the polling was interrupted and since the interrupts are disabled,
- the only thing that could do that would be an SMI or other hardware hiccup
- (or an NMI, but those can be tracked).
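The gap search described above can be illustrated with a small user-space sketch. This is not the kernel implementation; find_gaps is a hypothetical helper and the timestamps are made-up sample values in microseconds, chosen only to show how a stall stands out between consecutive readings.

```shell
# Hypothetical helper: print every gap between consecutive timestamps
# (in microseconds) that is larger than the given threshold.
find_gaps() {
    thresh=$1
    shift
    prev=
    for ts in "$@"; do
        if [ -n "$prev" ] && [ $((ts - prev)) -gt "$thresh" ]; then
            echo "gap of $((ts - prev)) us between $prev and $ts"
        fi
        prev=$ts
    done
}

# Made-up samples taken about 1 us apart, with one 250 us stall.
find_gaps 10 100 101 102 352 353
```

With a threshold of 10 usecs, only the jump from 102 to 352 is reported; the 1 usec steps are ordinary polling intervals.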
- Note that the hwlat detector should *NEVER* be used in a production environment.
- It is intended to be run manually to determine if the hardware platform has a
- problem with long system firmware service routines.
- Usage:
- ------
- Write the ASCII text "hwlat" into the current_tracer file of the tracing system
- (mounted at /sys/kernel/tracing or /sys/kernel/debug/tracing). It is possible to
- redefine the threshold in microseconds (us) above which latency spikes will
- be taken into account.
- Example:
- # echo hwlat > /sys/kernel/tracing/current_tracer
- # echo 100 > /sys/kernel/tracing/tracing_thresh
- The /sys/kernel/tracing/hwlat_detector interface contains the following files:
-   width  - time period to sample with CPUs held (usecs);
-            must be less than the total window size (enforced)
-   window - total period of sampling, width being inside (usecs)
- By default the width is set to 500,000 and window to 1,000,000, meaning that
- for every 1,000,000 usecs (1s) the hwlat detector will spin for 500,000 usecs
- (0.5s). If tracing_thresh contains zero when hwlat tracer is enabled, it will
- change to a default of 10 usecs. If any latency that exceeds the threshold is
- observed, the data is written to the tracing ring buffer.
- The minimum sleep time between periods is 1 millisecond, even if width is
- less than 1 millisecond apart from window, so that the system is not
- totally starved.
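The relationship between width, window and the minimum sleep can be worked through with a short sketch. hwlat_sleep_us is a hypothetical helper written for illustration, assuming the detector sleeps for whatever part of the window is left after spinning, subject to the 1 millisecond floor described above.

```shell
# Hypothetical helper: microseconds spent sleeping in each window,
# given width and window in microseconds.
hwlat_sleep_us() {
    width=$1
    window=$2
    sleep_us=$((window - width))
    # A floor of 1000 us (1 ms) keeps the rest of the system from starving.
    if [ "$sleep_us" -lt 1000 ]; then
        sleep_us=1000
    fi
    echo "$sleep_us"
}

hwlat_sleep_us 500000 1000000   # default width and window
hwlat_sleep_us 999500 1000000   # width within 1 ms of window
```

The defaults leave 500000 usecs of sleep per window, while a width only 500 usecs short of the window is still followed by a full 1000 usecs of sleep.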
- If tracing_thresh was zero when hwlat detector was started, it will be set
- back to zero if another tracer is loaded. Note, the last value in
- tracing_thresh that hwlat detector had will be saved and this value will
- be restored in tracing_thresh if it is still zero when hwlat detector is
- started again.
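The save and restore behaviour of tracing_thresh is easier to follow as a small model. The sketch below is only an illustration of the rules stated above: the variable and function names are hypothetical, and plain shell variables stand in for the real tracing_thresh file.

```shell
# Hypothetical model; all names are illustrative only.
tracing_thresh=0     # current contents of tracing_thresh (usecs)
last_thresh=10       # value hwlat last ran with; starts at the default
saved_thresh=0       # what tracing_thresh held before hwlat started

hwlat_start() {
    saved_thresh=$tracing_thresh
    # Only a zero threshold is replaced by the remembered value.
    if [ "$tracing_thresh" -eq 0 ]; then
        tracing_thresh=$last_thresh
    fi
}

hwlat_stop() {
    last_thresh=$tracing_thresh     # remember for the next hwlat run
    tracing_thresh=$saved_thresh    # hand back the pre-hwlat value
}

hwlat_start; echo "running: $tracing_thresh"
tracing_thresh=100                  # user raises the threshold mid-run
hwlat_stop;  echo "stopped: $tracing_thresh"
hwlat_start; echo "restarted: $tracing_thresh"
```

In this model the threshold reads 10 on the first run, returns to 0 once another tracer takes over, and comes back as 100 on the next run because that was the last value the detector used.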
- The following tracing directory files are used by the hwlat_detector:
- in /sys/kernel/tracing:
-   tracing_thresh        - minimum latency value to be considered (usecs)
-   tracing_max_latency   - maximum hardware latency actually observed (usecs)
-   tracing_cpumask       - the CPUs to move the hwlat thread across
-   hwlat_detector/width  - specified amount of time to spin within window (usecs)
-   hwlat_detector/window - amount of time between (width) runs (usecs)
- The hwlat detector's kernel thread will migrate across each CPU specified in
- tracing_cpumask between each window. To limit the migration, either modify
- tracing_cpumask, or modify the CPU affinity of the hwlat kernel thread (named
- [hwlatd]) directly, and the migration will stop.
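Since tracing_cpumask takes a hexadecimal CPU mask, a small helper can build one from a list of CPU numbers. cpus_to_mask is a hypothetical function shown as a sketch; it assumes CPU numbers low enough to fit in the shell's integer arithmetic, and the write to tracing_cpumask is left as a comment because it requires root.

```shell
# Hypothetical helper: turn a list of CPU numbers into a hex bitmask.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
}

cpus_to_mask 2 3
# As root, the result could then be applied with something like:
#   echo "$(cpus_to_mask 2 3)" > /sys/kernel/tracing/tracing_cpumask
```

For CPUs 2 and 3 the helper prints c, which would restrict the hwlat thread to migrating between those two CPUs.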