Version 6.0 of FMS offers the following significant enhancements:
- Support for NVIDIA GPUs. On systems equipped with GPUs, FMS automatically
detects them and divides the work between the CPUs and GPUs. GPUs can provide an improvement of an order
of magnitude or more in performance, cost/performance and power consumption. Several
FMS Parameters have been added to control the options available with GPUs.
- The Dashboard Performance Report has been added to provide a web-based report while FMS jobs are running.
The information includes current details on the CPUs, GPUs, memory, disks, files, software, subroutine calls
and system usage. In addition, performance data showing the progress of the calculation and the rate of
performing useful work (matrix terms filled, flops, etc.) is included.
- Subroutines FMSPSH and FMSPOP have been added to allow you to instrument your subroutines.
If you place a call to FMSPSH at the beginning of a routine and a call to FMSPOP at the end,
FMS computes the amount of time spent in that routine. In addition, your subroutine becomes
available for monitoring in the Dashboard Performance Report.
- Subroutines FMSILG, FMSRLG and FMSCLG have been added to allocate memory that is local to a processor.
Using local memory can improve performance on machines with several processors, especially NUMA machines.
When the memory is no longer required, subroutines FMSILR, FMSRLR and FMSCLR can be called to return it.
- The license file is now named fmslic.txt. License files named fmslic.52 will not work with version 6.0.
New FMS Subroutines
- FMSPSH
Add your subroutine to the FMS call stack.
- FMSPOP
Remove your subroutine from the FMS call stack.
- FMSILG, FMSRLG, FMSCLG
Allocate integer, real or complex memory local to each processor.
- FMSILR, FMSRLR, FMSCLR
Return the integer, real or complex local memory that was allocated.
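The push/pop instrumentation described for FMSPSH and FMSPOP can be pictured with a small sketch. This is not the FMS API or its implementation: the class `CallStackTimer` and its methods are hypothetical names, written in Python only to illustrate how pairing a push at routine entry with a pop at routine exit yields the time spent in each routine and keeps a list of the routines currently active.

```python
import time


class CallStackTimer:
    """Hypothetical stand-in for a push/pop call stack (not the FMS API)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._stack = []   # (name, start time) for each active routine
        self.totals = {}   # routine name -> accumulated seconds

    def push(self, name):
        """Record entry into a routine (the role FMSPSH plays in the text)."""
        self._stack.append((name, self._clock()))

    def pop(self):
        """Record exit from the most recently entered routine (the FMSPOP role)."""
        name, start = self._stack.pop()
        elapsed = self._clock() - start
        self.totals[name] = self.totals.get(name, 0.0) + elapsed
        return name, elapsed

    def current(self):
        """Names of the routines currently on the stack, outermost first."""
        return [name for name, _ in self._stack]


def my_routine(timer):
    """Example user routine instrumented at entry and exit."""
    timer.push("MYSUB")
    try:
        return sum(i * i for i in range(1000))
    finally:
        timer.pop()  # always paired with the push, even on error
```

In this sketch the time recorded for a routine includes the time spent in any instrumented routines it calls. Whether FMS reports inclusive or exclusive time is not stated here.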
New FMS Parameters
- MAXGPU
Determines the number of GPUs to use.
- GPUFL
Controls the use of GPUs.
- GPUPR
Determines how much information is printed about the GPUs.
- ICHUNK
Controls how data is distributed among the GPUs.
- IMMPCT
Specifies the percentage of work done by the CPUs during matrix multiply computations.
- ITRPCT
Specifies the percentage of work done by the CPUs during triangle solve computations.
- MAXEVT
Read-only. Returns the maximum number of events that were recorded during a single call to a
GPU routine.
- CUPATH
Path to the nvidia-smi program.
- IOLAST
Physically reserve file space when a file is opened by writing to the end of the file.
NOTE: In version 5.2 the default was to write to the end of the file. In version
6.0 the default is NOT to write to the end of the file, since this can take a considerable
amount of time for large files. Setting this Parameter to 1 restores the version 5.2
behavior.
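The technique that IOLAST refers to, writing to the end of a file when it is opened, can be sketched as follows. This is a generic illustration in Python, not FMS code: the function `open_reserved` and its `write_to_end` argument are hypothetical names that only mirror the on/off behavior described above. Whether the write physically allocates the space depends on the file system; some file systems create a sparse file instead.

```python
import os
import tempfile


def open_reserved(path, size_bytes, write_to_end=False):
    """Create a file for read/write access.

    If write_to_end is true, write a single byte at the last position so
    the file takes on its full length immediately. On file systems that
    must fill the gap, this step can be slow for large files.
    """
    f = open(path, "w+b")
    if write_to_end and size_bytes > 0:
        f.seek(size_bytes - 1)   # move to the last byte of the planned size
        f.write(b"\0")           # writing here extends the file to size_bytes
        f.flush()
        f.seek(0)                # leave the file positioned at the start
    return f


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.bin")
        with open_reserved(path, 1024 * 1024, write_to_end=True):
            pass
        print(os.path.getsize(path))
```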
New Example Problems
- EXAMPLE_21
This example illustrates the use of the matrix multiply routine FMSMM.
- EXAMPLE_22
This example is used to measure disk performance.
- EXAMPLE_23
This example shows how OpenMP and FMS calls can be mixed.