Data Type
Integer
Default Value
0, Most machines
1, NUMA machines
Description
This parameter directs FMS to redistribute the work among the threads on a node during matrix multiplication to reduce the number of nonlocal memory references. Where possible, FMS does this by computing blocks that are closer to square in shape.
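The benefit of squarer blocks can be illustrated with a simple operand count. The sketch below is not FMS code and does not describe how FMS partitions work internally; it assumes a plain product C = A * B in which each thread computes one rows x cols block of C, and the function names and block shapes are hypothetical. Under that assumption a thread reads (rows + cols) * inner operand elements, so for a fixed amount of work the operand traffic is smallest when rows and cols are equal.

```python
def operand_elements(rows, cols, inner):
    """Elements of A and B read to compute a rows x cols block of C = A * B.

    The block needs `rows` rows of A and `cols` columns of B, each of
    length `inner` (the shared dimension).
    """
    return (rows + cols) * inner


def flops(rows, cols, inner):
    """Multiply-add work for the same block (2 operations per term)."""
    return 2 * rows * cols * inner


# Hypothetical block shapes with the same area, so each does the same work.
inner = 1000
shapes = [(64, 64), (256, 16), (1024, 4)]

for rows, cols in shapes:
    reads = operand_elements(rows, cols, inner)
    work = flops(rows, cols, inner)
    print(f"{rows:5d} x {cols:<3d} reads={reads:8d} flops per element read={work / reads:6.1f}")
```

In this model the 64 x 64 block reads roughly one eighth as many operand elements as the 1024 x 4 block for the same number of operations. On a NUMA machine, where some of those operands may reside in another node's memory, fewer reads generally means fewer nonlocal references.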