What is Dashboard?

Dashboard is a performance monitoring library that creates and updates a website dynamically at runtime. Separate pages report the performance of key hardware and software components. The website may be viewed by any device supporting a web browser, including desktops, tablets and smartphones.

To get an overview of the capabilities of Dashboard, you may run MatrixWarrior.

How is Dashboard used?

[Figure: software layers for the two ways of using Dashboard: Your Application, FMSlib, Dashboard, Operating System, Hardware]

There are two ways to use Dashboard:
  1. Instrument your application with calls to Dashboard routines, or
  2. Use one of the FMSlib utilities already instrumented with Dashboard. These utilities include:
    • Matrix algebra,
    • Parallel processing,
    • GPU accelerators,
    • Memory management, and
    • Disk file striping.
When any of these FMSlib utilities are used, Dashboard reports are automatically generated.

What reports does Dashboard generate?

Website pages generated by Dashboard are divided into three categories:

1. Hardware Reports:

Dashboard interfaces directly with hooks built into the operating system (Linux and Windows) to obtain configuration and performance information. Dashboard also uses functions provided by hardware manufacturers to obtain device-specific information. These reports are initialized by a single call at the beginning of your application. While your application is running, time-varying information is automatically updated at a specified interval.

2. Application Reports:

Dashboard also provides functions which monitor the performance of your application. These include calls you can make at the beginning and end of subroutines. Dashboard uses the information you provide to maintain a call stack, listing the currently active routines and the order in which they were called. A call history is also maintained, which lists the number of times each routine was called, the calling thread (parent or child) and the time used (CPU and Wall).

You may also provide Dashboard with the amount of Useful Work the routine is performing. Typical Useful Work items include floating point operations performed or bytes transferred. Dashboard combines this information with the routine timing to compute the rate of performing useful work. This information is then displayed as compute rate (flops/sec) or transfer rate (bytes/sec).
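
The rate calculation described above is a division of work by elapsed time. The following is a minimal sketch in Python, not Dashboard code; the function name and the sample numbers are hypothetical and chosen only for illustration.

```python
def useful_work_rate(work_units: float, wall_seconds: float) -> float:
    """Return useful work per second; 0.0 when no time has elapsed."""
    if wall_seconds <= 0.0:
        return 0.0
    return work_units / wall_seconds


# Hypothetical example: a routine that reports 2.0e9 floating point
# operations and took 0.5 s of wall time.
flops_per_sec = useful_work_rate(2.0e9, 0.5)
print(f"{flops_per_sec:.3e} flops/sec")  # prints 4.000e+09 flops/sec
```

The same function gives a transfer rate when the work is expressed in bytes instead of floating point operations.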

3. Custom Reports:

Dashboard can also generate application-specific pages to display information in the context of an application. As an example, FMSlib includes a Performance page which provides job status and a summary of compute and transfer rates. This page also includes a picture of the matrices and vectors being processed.

If your application uses FMSlib, Dashboard is already included. As an option, you can supplement the FMSlib reports by adding Dashboard calls to your own routines.

Where do I get Dashboard?

Dashboard is distributed as part of the FMSlib library. You may obtain Dashboard by downloading FMSlib from Multipath's website at www.fmslib.com.

Installing Dashboard in your application

The first step is to add a nonzero value of IWATCH to the license file between the FMSSET and RETURN lines as follows:
FMSSET
IWATCH=99
RETURN
This will instruct the FMS library to generate Dashboard reports. If FMSlib is already installed in your application, this is all that is required.

1. Hardware Reports

To generate reports on the installed hardware and its performance, make the following two calls in your application:
       CALL FMSINI()
During this call, Dashboard performs a series of initialization steps. A list of these steps is included in the subroutine description for FMSINI. This call should be at the beginning of your application so the overall timing reports will be accurate.

Next,

       CALL FMSEND()
The call to FMSEND should be at the end of your application.

These two calls are all that is required to generate the Dashboard hardware reports. If you are using FMSlib, these calls are already in your application.

2. Application Reports

You can instrument your application to provide performance and timing reports by placing your routines on the Dashboard call stack as follows:
       SUBROUTINE MYSUB(...)
       CALL FMSPSH("MYSUB")
       ...
       CALL FMSPOP("MYSUB")
       END
The first call places the subroutine on the stack and the second call removes it. Dashboard creates a time stamp on both calls to compute the net timing (CPU, WALL) spent in the routine. You may instrument all or part of your application. Subroutines can be nested. Dashboard also records which thread made the call, parent or child.
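
To make the push/pop bookkeeping concrete, here is a small Python sketch of a call stack with a call history. It is an illustration of the idea only, not the Dashboard implementation: the class and method names are hypothetical, it handles a single thread, and the times it records span the whole interval between push and pop, including any nested routines.

```python
import threading
import time
from collections import defaultdict


class CallStack:
    """Toy push/pop profiler (illustrative; not Dashboard's code)."""

    def __init__(self):
        # Currently active routines, in the order they were called.
        self.active = []
        # Per-routine totals: number of calls, wall time, CPU time.
        self.history = defaultdict(
            lambda: {"calls": 0, "wall": 0.0, "cpu": 0.0}
        )

    def push(self, name):
        """Time-stamp entry into a routine and place it on the stack."""
        self.active.append(
            (name, time.perf_counter(), time.process_time(),
             threading.current_thread().name)
        )

    def pop(self, name):
        """Remove the routine from the stack and accumulate its timing."""
        if not self.active or self.active[-1][0] != name:
            raise RuntimeError(f"pop({name!r}) does not match the stack")
        _, wall0, cpu0, _ = self.active.pop()
        entry = self.history[name]
        entry["calls"] += 1
        entry["wall"] += time.perf_counter() - wall0
        entry["cpu"] += time.process_time() - cpu0


stack = CallStack()
stack.push("MY_APPLICATION")
for _ in range(3):
    stack.push("MYSUB")      # nested inside MY_APPLICATION
    stack.pop("MYSUB")
stack.pop("MY_APPLICATION")
print(stack.history["MYSUB"]["calls"])  # prints 3
```

Checking the name on pop catches mismatched push/pop pairs early, which is one reason for passing the routine name to both calls.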

The name passed in the first call to FMSPSH is assumed to be your application name. This name is included in the header of each page.

3. Custom Reports

All timing data collected by Dashboard is available to generate an application-specific web page. As an example, the Performance page was designed to display information important to FMSlib. If you have an application that could benefit from a custom Dashboard page, please contact Multipath for more information.

How Dashboard works

[Figure: Computer, Disk, Terminal]

The display of information involves the following three steps:
  1. Dashboard and the instrumented application run on a computer. This is the machine and application that are monitored. It can be any machine running the Linux or Windows operating system, from a laptop to a high-performance server containing GPU accelerators.
  2. Dashboard creates a website as files on a disk. This disk may be in the instrumented computer or on any remote fileserver accessible from the instrumented computer. By default, webpages are stored in the current working directory used to run the application. You may change this location by specifying the directory in the environment variable FMSHTML.
  3. You view the website from any viewing device that contains a web browser, including a PC, tablet or smartphone.
The flow of information is in one direction: from the computer, to the disk, to your viewing device. You do not interact directly with Dashboard. This design allows you to monitor the performance of a computer which may be located in a restricted area.
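
The directory rule in step 2 (use FMSHTML if it is set, otherwise the current working directory) can be sketched as follows. This is a Python illustration of the lookup, not Dashboard code; the function name is hypothetical, and treating an empty FMSHTML value as unset is an assumption.

```python
import os
from pathlib import Path


def report_directory(environ=None) -> Path:
    """Return the directory where the web pages would be written."""
    env = os.environ if environ is None else environ
    # Assumption: an empty or blank FMSHTML is treated the same as unset.
    configured = env.get("FMSHTML", "").strip()
    return Path(configured) if configured else Path.cwd()


print(report_directory({"FMSHTML": "/data/reports"}))
print(report_directory({}))  # falls back to the current working directory
```

Pointing FMSHTML at a directory on a shared fileserver is what lets a browser on another device read the pages without any connection back to the monitored machine.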

How to control Dashboard reports

Controlling Report Generation

The following Parameters control what reports Dashboard generates and when they are generated:

Parameter   Default       Description
IWATCH      0             Which reports are generated. A value of 115 will generate a movie.
NSUPD       5 sec.        How frequently Dashboard updates the pages
NSREF       3 sec.        How frequently the browser reloads the pages
NSMOVE      2 sec.        The pause between movie frames
MAXMOV      1000 frames   Maximum number of movie frames
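
One standard way for a static page to make the browser reload at a fixed interval, as NSREF describes, is an HTML meta refresh tag. The Python sketch below shows that technique; whether Dashboard uses this exact mechanism is an assumption, and the function name is hypothetical.

```python
from html import escape


def render_page(title: str, body_html: str, refresh_seconds: int = 3) -> str:
    """Return a static HTML page that asks the browser to reload itself."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f'<meta http-equiv="refresh" content="{int(refresh_seconds)}">\n'
        f"<title>{escape(title)}</title>\n"
        "</head>\n"
        f"<body>\n{body_html}\n</body>\n</html>\n"
    )


# The default of 3 seconds mirrors the NSREF default in the table above.
page = render_page("Main", "<p>Node: myhost</p>")
print(page)
```

Because the reload is requested by the page itself, the viewing device only ever reads files; nothing is sent back to the monitored computer.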

Dashboard Reports

The pages displayed by Dashboard depend on what functions have been implemented and what hardware is available. This section presents the results of implementing Dashboard in FMSlib.

Main

This page lists the network node name of the machine.

You can include the name of your application in one of two ways:

  1. Explicitly setting it,
    CALL FMSCST(ANAME, 'My_Application')
    or
  2. Dashboard will use the name of the first routine put on the stack:
    CALL FMSPSH('My_Application')
If your application includes calls to FMSlib, the FMS logo is also included.

CPUs

This page lists the number and type of CPUs installed in the system. Note that for modern processors, the Clock Speed listed may change with utilization.

GPU-Fixed

This page lists information about the GPUs that does not change during the run. If multiple GPUs of different types are found, only the one(s) with the highest compute capability are listed. If no GPUs are available, pages GPU1 and GPU2 are not generated.

Nvidia uses two numbering systems: one used by the Run Time Library (RTL) and another used by the Nvidia Management Library (NVML). These may not be the same and can change on reboot. Correlation between these numbering systems is established by the PCI bus location of the device. This page lists the devices in the RTL order and shows the corresponding NVML number for reference. If you use the nvidia-smi utility to change the properties of a device, you should use the NVML number.
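
The correlation by PCI bus location amounts to a lookup keyed on the bus ID. The Python sketch below illustrates this with made-up device lists; the function name and the bus ID strings are hypothetical, and real bus ID strings from the two libraries may need normalizing to a common format before they can be compared.

```python
def match_by_pci_bus(rtl_devices, nvml_devices):
    """Map each RTL index to the NVML index that shares its PCI bus ID.

    Both arguments are lists of (index, pci_bus_id) pairs. An RTL device
    with no NVML match maps to None.
    """
    nvml_by_bus = {bus.lower(): index for index, bus in nvml_devices}
    return {
        rtl_index: nvml_by_bus.get(bus.lower())
        for rtl_index, bus in rtl_devices
    }


# Hypothetical system in which the two libraries number the GPUs differently.
rtl = [(0, "0000:65:00.0"), (1, "0000:17:00.0")]
nvml = [(0, "0000:17:00.0"), (1, "0000:65:00.0")]
print(match_by_pci_bus(rtl, nvml))  # prints {0: 1, 1: 0}
```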

Dashboard needs to know where the Nvidia NVML utilities are located in your file system. By default, the following directories are used:

   (Linux)   /usr/bin
   (Windows) "C:\Program Files\NVIDIA Corporation\NVSMI"
The nvidia-smi utility is also in this directory. If this utility is in a different directory, then the FMS Parameter CUPATH must be included in the license file to provide the name of this directory. If Dashboard cannot find this directory, the application will still run but some of the hardware information about the GPUs will be missing.
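
The lookup described above (use the override directory when one is given, otherwise the platform default, and continue without GPU details when neither works) can be sketched in Python as follows. This is not Dashboard code: the function name is hypothetical, the check for the nvidia-smi executable is an assumed way of validating the directory, and the cupath argument merely stands in for the CUPATH value read from the license file.

```python
import os
from pathlib import Path

# Default directories quoted in the text above.
DEFAULT_DIRS = {
    "posix": Path("/usr/bin"),
    "nt": Path(r"C:\Program Files\NVIDIA Corporation\NVSMI"),
}


def find_nvidia_smi_dir(cupath=None, os_name=os.name):
    """Return the directory holding nvidia-smi, or None if it is not found."""
    candidate = Path(cupath) if cupath else DEFAULT_DIRS.get(os_name)
    if candidate is None:
        return None
    exe = "nvidia-smi.exe" if os_name == "nt" else "nvidia-smi"
    return candidate if (candidate / exe).is_file() else None


directory = find_nvidia_smi_dir()
if directory is None:
    print("nvidia-smi not found; some GPU information would be missing")
else:
    print(f"nvidia-smi found in {directory}")
```

Returning None instead of raising an error mirrors the behavior described in the text: the application still runs, only with less GPU information.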

GPU-Chg.

This page lists GPU information which may be changed, either by the application program or using the nvidia-smi utility. This includes limits on temperature, power and clock speed.

GPU-Dyn.

This page lists GPU information that changes with time. This includes temperature, fan speed (for workstations), utilization rates, ECC errors and power management. This page is automatically updated during the run.

GPU-RTL

This page lists GPU information from the Run Time Library. This may include some of the information obtained from NVML, where available.

Memory

This page lists the total amount of memory in the system. For applications which use the FMS Memory Management utilities, it also shows how that memory is allocated. On systems with GPUs, it lists the memory for each GPU and its address range.

Disks

This page lists the following tables:

Files

For applications using the FMS File Striping System, information on open FMS files, their stripe configuration and performance are listed.

Software

This page lists the following information about the software being used by the application and FMS:

Calls

This page lists the subroutines which have been called. The order in each table is determined by the order of first call. The following tables are included: Routines highlighted in yellow are currently active on the stack.

Parameters

This page lists the current values of FMS parameters which have been changed from default values. In some cases, changing one parameter will cause others to be changed also. Some of the parameters are output only and are computed by FMS.

When FMS determines that different values of one or more Parameters would improve performance, those values are listed in the Suggested Tuning Values table.

Performance

This is the application-specific page for monitoring FMS performance. It first lists the FMS routine currently executing with a link to its manual page and the total wall time used. All information on this page pertains to that routine. When a new routine is called, this page is reset.

This page contains one or more of the following sections:

Movie

When IWATCH is set to generate a movie, this link provides a way to play back the movie. The link appears only after all the movie frames have been generated.

Usage

On systems running Linux, this page lists the overall usage of system resources since the job started. Memory performance information, including resident set size and page faults, is listed both as absolute values and as a ratio to FMS memory. This page lists links to Multipath's home page and the documentation for FMSlib, MatrixWarrior and Dashboard.

Dashboard

This page lists the number of times the contents of each page changed, the number of times the page was updated and the time (CPU, Wall) used. You may use this information to adjust NSUPD to increase or decrease how often pages get updated.

Help

For MatrixWarrior, this links to the Quick Start Guide. For FMS EXAMPLE problems, this links to documentation on how to run the example as well as a listing of the program code. For other applications, this links to the Dashboard documentation.