Valgrind User Manual

12. Verrou: a floating-point rounding errors checker

To use this tool, you must specify --tool=verrou on the Valgrind command line.

12.1. Overview

12.1.1. Basic usage

Verrou helps you look for floating-point round-off errors in programs. It implements a stochastic floating-point arithmetic based on random rounding: all floating-point operations are perturbed by randomly switching rounding modes. This can be seen as an asynchronous variant of the CESTAC method, or a subset of Monte Carlo Arithmetic, performing only output randomization.

As with many other Valgrind tools, you probably want to recompile your program with debugging info (the -g option) and with optimization turned on.

To start a floating-point check for a program, execute:

valgrind   --tool=verrou  [verrou options] {program} [program options]

For example:

	$  valgrind --tool=verrou --rounding-mode=random python3
==20752== Verrou, Check floating-point rounding errors
==20752== Copyright (C) 2014-2019, EDF (F. Fevotte & B. Lathuiliere). 2019-2023, EDF (B. Lathuiliere). 2020-2021, TriScale innov (F. Fevotte)
==20752==  
==20752== Using Valgrind-3.21.0.verrou-dev and LibVEX; rerun with -h for copyright info
==20752== Command: python3
==20752== 
==20752== First seed : 325750
==20752== Backend verrou : 1.x-dev
==20752== Backend mcaquad : 1.x-dev
==20752== Backend checkcancellation : 1.x-dev
==20752== Backend check_float_max : 1.x-dev
==20752== Backend checkdenorm : 1.x-dev
==20752== Instrumented operations :
==20752== 	add : yes
==20752== 	sub : yes
==20752== 	mul : yes
==20752== 	div : yes
==20752== 	mAdd : yes
==20752== 	mSub : yes
==20752== 	sqrt : yes
==20752== 	cmp : no
==20752== 	conv : yes
==20752== 	max : no
==20752== 	min : no
==20752== Instrumented vectorized operations :
==20752== 	scal : no
==20752== 	llo  : yes
==20752== 	vec2  : yes
==20752== 	vec4  : yes
==20752== 	vec8  : yes
==20752== 	unk  : yes
==20752== Instrumented type :
==20752== 	flt : yes
==20752== 	dbl : yes
==20752== Backend verrou simulating RANDOM rounding mode
Python 3.7.3 (default, Oct 31 2022, 14:04:00) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> sum([0.1*i for i in range(1000)])
49949.999999999825
>>> sum([0.1*i for i in range(1000)])
49949.9999999999
>>> sum([0.1*i for i in range(1000)])
49949.99999999985
>>> sum([0.1*i for i in range(1000)])
49949.99999999991
>>> sum([0.1*i for i in range(1000)])
49950.00000000016
>>> sum([0.1*i for i in range(1000)])
49949.9999999999
>>> exit()
==20752== 
==20752==  ---------------------------------------------------------------------
==20752==  Operation                            Instruction count
==20752==   `- Precision
==20752==       `- Vectorization          Total             Instrumented
==20752==  ---------------------------------------------------------------------
==20752==  add                     7527                     7527          (100%)
==20752==   `- dbl                     7527                     7527      (100%)
==20752==       `- llo                     7527                     7527  (100%)
==20752==  ---------------------------------------------------------------------
==20752==  sub                       79                       79          (100%)
==20752==   `- dbl                       79                       79      (100%)
==20752==       `- llo                       79                       79  (100%)
==20752==  ---------------------------------------------------------------------
==20752==  mul                     7599                     7599          (100%)
==20752==   `- dbl                     7599                     7599      (100%)
==20752==       `- llo                     7599                     7599  (100%)
==20752==  ---------------------------------------------------------------------
==20752==  div                       11                       11          (100%)
==20752==   `- dbl                       11                       11      (100%)
==20752==       `- llo                       11                       11  (100%)
==20752==  ---------------------------------------------------------------------
==20752==  cmp                     7516                        0          (  0%)
==20752==   `- dbl                     7516                        0      (  0%)
==20752==       `- scal                    7516                        0  (  0%)
==20752==  ---------------------------------------------------------------------
==20752==  conv                     109                        0          (  0%)
==20752==   `- dbl=>int                 102                        0      (  0%)
==20752==       `- scal                     102                        0  (  0%)
==20752==   `- dbl=>sht                   7                        0      (  0%)
==20752==       `- scal                       7                        0  (  0%)
==20752==  ---------------------------------------------------------------------
==20752==  min                        1                        0          (  0%)
==20752==   `- dbl                        1                        0      (  0%)
==20752==       `- llo                        1                        0  (  0%)
==20752==  ---------------------------------------------------------------------
==20752== For lists of detected and suppressed errors, rerun with: -s
==20752== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

First, Verrou will output a header indicating which floating-point instructions will be perturbed, and how. By default, nothing will be perturbed. The --rounding-mode=random command-line option is the most standard way to perturb floating-point rounding-modes; see Rounding-mode switching for more details.

During program execution, floating-point operations will be perturbed by constantly and randomly switching the rounding-mode. This makes you program output (hopefully slightly) different results than in a normal execution. See Rounding-mode switching for more details on the different rounding-mode switching strategies. In the example above, the same python calculation performed several times yields varying results, whose analysis show that the first 14 decimal digits are always the same, and are thus probably reliable.

After program termination, a summary of floating point operations will be printed on screen. See Floating-point instructions counting for more details.

12.1.2. Restriction of the perturbations scope

Some algorithms are specifically and carefully designed to work in IEEE nearest rounding, and do not behave well when Verrou perturbs floating-point rounding modes.

This is for example the case of the douple-precision cosine implementation of the Gnu mathematical library (libm). By default the libm is not instrumented, so to illustrate the problem we add use the option --libm=manual_exclude. Getting back to the previous python example:

$  valgrind --tool=verrou --rounding-mode=random --libm=manual_exclude python3
[VERROU INIT]
Python 3.7.3 (default, Oct 31 2022, 14:04:00) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> math.cos(42.)
-0.3999853149883513
>>> math.cos(42.)
-0.3999853149883513
>>> math.cos(42.)
-1.0046831797202347
>>> math.cos(42.)
-1.0046831797202345

Here, we see that the same calculation performed several times did not only produce different results; it also produced completely wrong values, well outside the expected [-1;1] interval.

In such cases, where random rounding will almost always yield false positives, it might be a good idea to temporarily disable the perturbations during the execution of functions which are known to be correct. This can be performed by adding a --exclude=LIST command-line switch in the following way:

valgrind --tool=verrou --rounding-mode=random --libm=manual_exclude --exclude=libm.ex python3

where the file libm.ex contains a list of functions to be left unperturbed. For example, in order to disable random rounding modes in the whole libm:

# sym  lib
*      /usr/lib/x86_64-linux-gnu/libm-2.28.so

Warning

The library name listed in the second column of an exclusion file must be identified by a fully canonical path, as produced by readlink -f. The file libm.ex can be generated with the following command (python3 should be replace by the binary).

	echo "python3" | xargs which | xargs ldd  |grep libm | cut -d " " -f 3|xargs readlink -f | xargs echo "*"  > libm.ex

When the libm is excluded from perturbations in such a way, the python example above works as expected: the cosine is accurately computed, as usual. See Excluded symbols for more details about exclusion lists. But by doing this you are no able to measure the floating point error implied by libm. To pertub libm we need to use the option --libm=instrumentation, which replace the dynamically loaded libm by your own stochastic implementation. If libm is statically linked with your application (Please don't do it), you will need to exclude yourself each libm symbol and it is not possible to measure floating point error implied by libm.

12.2. Standard features

12.2.1. Floating-point instructions counting

Verrou detects and counts floating-point operations. A summary is printed after each program execution, listing the number of floating-point operations executed by the program, broken down into categories according to various criteria :

Operation type:

add: addition
sub: subtraction
mul: multiplication
div: division
mAdd: Fused Multiply-Add (FMA)
mSub: Fused Multiply-Sub
sqrt: hardware square root (not square root function provided by libm)
cmp: comparison
conv: conversion (or cast)

Floating-point precision:

sht: half precision (IEEE-754 binary16)
flt: single precision (C float / IEEE-754 binary32)
dbl: double precision (C double / IEEE-754 binary64)

Vector nature of the instruction:

scal: scalar instruction
llo: lowest-lane-only (unpacked) instruction (i.e. scalar instruction from the SSE set, such as addss)
vec: full (packed) vector instruction (such as addps)

Below is an example output of Verrou's summary of floating-point operations:

==18913==  ---------------------------------------------------------------------
==18913==  Operation                            Instruction count
==18913==   `- Precision
==18913==       `- Vectorization          Total             Instrumented
==18913==  ---------------------------------------------------------------------
==18913==  add                     7044                     7044          (100%)
==18913==   `- dbl                     7044                     7044      (100%)
==18913==       `- llo                     7044                     7044  (100%)
==18913==  ---------------------------------------------------------------------
==18913==  sub                       21                       21          (100%)
==18913==   `- dbl                       21                       21      (100%)
==18913==       `- llo                       21                       21  (100%)
==18913==  ---------------------------------------------------------------------
==18913==  mul                     7073                     7073          (100%)
==18913==   `- dbl                     7073                     7073      (100%)
==18913==       `- llo                     7073                     7073  (100%)
==18913==  ---------------------------------------------------------------------
==18913==  div                        7                        7          (100%)
==18913==   `- dbl                        7                        7      (100%)
==18913==       `- llo                        7                        7  (100%)
==18913==  ---------------------------------------------------------------------
==18913==  cmp                       78                        0          (  0%)
==18913==   `- dbl                       78                        0      (  0%)
==18913==       `- scal                      78                        0  (  0%)
==18913==  ---------------------------------------------------------------------
==18913==  conv                   14042                        0          (  0%)
==18913==   `- dbl=>int                  28                        0      (  0%)
==18913==       `- scal                      28                        0  (  0%)
==18913==   `- dbl=>sht               14014                        0      (  0%)
==18913==       `- scal                   14014                        0  (  0%)
==18913==  ---------------------------------------------------------------------
==18913== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

The set of columns labeled "Total" shows the total number of floating-point instructions as seen by Verrou. In the second set of columns (labeled "Instrumented"), instructions are only accounted for if they come from Instrumented sections. The last column shows the fraction of instrumented instructions.

Floating-point instructions counters can be displayed at any point during the program execution using the VERROU_DISPLAY_COUNTERS client request or the count monitor command. It is also possible to disable this feature using the --count-op=no.

12.2.2. Rounding-mode switching

When the instrumented program performs a floating-point operation, Verrou can replace it with a perturbed version of the same operation, using another rounding mode. The --rounding-mode command-line option allows choosing between different rounding strategies, described hereafter.

Verrou can simulate any of the four IEEE-754 standard rounding modes:

nearest (default),
upward,
downward,
toward zero,

as well as three other (non-IEEE-754) deterministic modes:

farthest: in this mode, all non-exact results are rounded in the opposite way to IEEE-754 nearest rounding mode. This helps producing results that are different from nearest rounding mode, while still being deterministic.
away zero: in this mode, all non-exact results are rounded in the opposite way to IEEE-754 toward zero rounding mode.
float: in this mode, all double precision floating-point operations are replaced by simple precision equivalent. Float operations remain unchanged. (See also --float=yes).
ftz: in this mode, denormal floating-point operations are flushed to zero.

Finally, the main use for Verrou is to randomly switch rounding mode at each floating-point operation, in order to implement the "random rounding" variant of Monte Carlo Arithmetic (MCA). Several strategies can be used to choose the rounding mode for a given operation:

random: Randomly pick one among upward and downward rounding, with equal probabilities. This is a form of asynchronous CESTAC method.
prandom: Randomly pick one among upward and downward rounding, with probabilities p and 1-p. By default p is selected randomly between 0 and 1 with uniform distribution, once by verrou tool initialization. p can be setup with option --prandom-pvalue . p can be dynamically modified with client requests VERROU_PRANDOM_UPDATE_VALUE / VERROU_PRANDOM_UPDATE or option --prandom-update. This rounding mode is designed to try to get wider distribution (useful for debug).
average: Randomly choose between upward and downward rounding mode, in such a way that the expectation of the random result equals the exact operation result (without rounding). This is called "uniform_absolute output randomization" or "SR-nearness" in the MCA literature.
random_det: Randomly choose between upward and downward rounding mode, with equal probabilities but insuring that, for a given seed, the same computation (same inputs and operation) will always be rounded in the same way. This is a variant of the random rounding mode that improves determinism and removes the need to use Deterministic sections.
prandom_det: Randomly choose between downward and upward rounding mode, with probabilities p and 1-p but insuring that, for a given seed, the same computation (same inputs and operation) will always be rounded in the same way. This is a variant of the prandom rounding mode that improves determinism and removes the need to use Deterministic sections. If the p is modified dynamically, determinism is no longer assured.
average_det: Randomly choose between upward and downward rounding mode, with the same probability law as average but insuring that, for a given seed, the same computation (same inputs and operation) will always be rounded in the same way. This is a variant of the average rounding mode that improves determinism and removes the need to use Deterministic sections.
random_comdet: variant of random_det ensuring that x op y and y op x will be rounded in the same way for commutative op (+,x, fma).
prandom_comdet: variant of prandom_det ensuring that x op y and y op x will be rounded in the same way for commutative op (+,x, fma).
average_comdet: variant of average_det ensuring that x op y and y op x will be rounded in the same way for commutative op (+,x, fma).
random_scomdet: variant of random_det ensuring that x op y and y op x will be rounded in the same way for commutative op (+,x, fma) and the following constraints :
- (-x)+(-y) == -(x+y)
- x+(-y) == -((-x)+y)
- x-y == x+(-y)
- (-x)*(-y) == (x*y)
- (-x)*y == -(x*y)
- (-x)/(-y) == (x/y)
- (-x)/y == -(x/y)
- x/(-y) == -(x/y)
- fma(x,y,z) == fma(-x,-y,z)
- fma(-x,y,-z) == -fma(x,y,z)
- fma(x,-y,-z) == -fma(x,y,z)
- fma(x,y,0) == (x*y)
- cast(-x) == -cast(x)
- With --libm=instrumented the odd/even properties of libm instrumented functions are fullfilled.
average_scomdet: variant of average_det which respects the same constraints as random_scomdet
sr_monotonic: Choose between downward and upward rounding mode, with a fixed limit. This limit is randomly selected for each interval between 2 successive floating point numbers.
sr_smonotonic: Choose between downward and upward rounding mode, with a fixed limit. This limit is randomly selected for each interval between 2 successive floating point numbers. This rounding mode satisfy the monotonicity of the rounding and respect also all the random_scomdet constraints as the limits are designed to be symmetric.

A pseudo-Random Number Generator (pRNG) is used to generate the randomness used in the modes average and random. The pRNG is normally seeded with a value that changes at each execution in order to produce different "random rounding" results at each run. In order to reproduce results from a given run, it is however possible to use the --vr-seed command-line option. For the modes average_[[s]com]det, random_[[s]com]det and sr_[s]monotonic, the pRNG seed is used as a parameter of hash functions.

12.3. Restriction of the perturbations scope

There is a number of reasons why it could be desirable to only perturb parts of a program:

when an algorithm is designed to work only in standard "rounding to nearest" mode, such as the libm example described in the overview;
to save time by avoiding costly perturbations in places which are known to be particularly stable;
to help locating the origin of numerical instabilities by studying the overall impact of localized perturbation; see the Delta-Debugging section for an instability localization method that build upon this idea.

Controlling which parts of the program are perturbed can be performed in one of two ways, described hereafter.

12.3.1. Excluded symbols

A first way to restrict the scope of instrumentation is based on either the function (or rather symbol) name, the object (library) name, or both.

12.3.1.1. Exclusion files

This can be done by providing an exclusion file via the --exclude command-line option. The file should follow the following format:

one rule per line;
each rule of the form FNNAME OBJNAME, where FNNAME is the function name, and OBJNAME is the symbol name (the name of the executable or the shared library). Either can be replaced by a star (*) to match anything. The two columns can be separated by any number of spaces and tabulations (\t);
each line beginning with a hash (#) is considered as a comment and disregarded.

When verrou finds a block of instructions (an IRSB, Intermediate Representation SuperBlock in valgrind terminology) whose address matches the FNNAME/OBJNAME specification, the whole chunk is left uninstrumented (we may still instrument IRSB to count uninstrumented operations).

Warning

The library name listed in the second column of an exclusion file must be identified by a fully canonical path, as produced by readlink -f.

A recommenced procedure to find the correct library path is the following:

$ ldd program
        linux-vdso.so.1 (0x00007fffebbc6000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fdbe1760000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fdbe155c000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fdbe1358000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fdbe113d000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fdbe0e38000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdbe0a96000)
        /lib64/ld-linux-x86-64.so.2 (0x000055ba64113000)

$ readlink -f /lib/x86_64-linux-gnu/libm.so.6
/lib/x86_64-linux-gnu/libm-2.23.so

Here are a few examples of exclusion rules for mathematics functions:

exclude the whole libm:

#FNNAME   OBJNAME
*         /lib/x86_64-linux-gnu/libm-2.23.so

exclude specific functions based on their names only (beware to list all functions in which the program will pass; not only top-level ones):
```
#FNNAME        OBJNAME
exp            *
__ieee754_exp  *
__exp1         *
```

exclude a specific function of a specific library:

#FNNAME  OBJNAME
exp      /lib/x86_64-linux-gnu/libm-2.23.so

12.3.1.2. Automatic generation of exclusion lists

Instead of manually establishing an exclusion list, using the --gen-exclude command-line option can help producing the complete list of FNNAME/OBJNAME pairs through which a given program passes. This complete list can then be filtered to keep only functions which really need to be excluded.

For example:

$ valgrind --tool=verrou --gen-exclude=all.tex python3
==10469== Verrou, Check floating-point rounding errors
==10469== Copyright (C) 2014-2019, EDF (F. Fevotte & B. Lathuiliere). 2019-2023, EDF (B. Lathuiliere). 2020-2021, TriScale innov (F. Fevotte)
==10469==
==10469== Using Valgrind-3.21.0.verrou-dev and LibVEX; rerun with -h for copyright info
==10469== Command: python3
==10469==
==10469== First seed : 147635
==10469== Backend verrou : 1.x-dev
==10469== Backend mcaquad : 1.x-dev
==10469== Backend checkcancellation : 1.x-dev
==10469== Backend check_float_max : 1.x-dev
==10469== Backend checkdenorm : 1.x-dev
==10469== Instrumented operations :
==10469== 	add : yes
==10469== 	sub : yes
==10469== 	mul : yes
==10469== 	div : yes
==10469== 	mAdd : yes
==10469== 	mSub : yes
==10469== 	sqrt : yes
==10469== 	cmp : no
==10469== 	conv : yes
==10469== 	max : no
==10469== 	min : no
==10469== Instrumented vectorized operations :
==10469== 	scal : no
==10469== 	llo  : yes
==10469== 	vec2  : yes
==10469== 	vec4  : yes
==10469== 	vec8  : yes
==10469== 	unk  : yes
==10469== Instrumented type :
==10469== 	flt : yes
==10469== 	dbl : yes
==10469== Backend verrou simulating NEAREST rounding mode
Python 3.7.3 (default, Oct 31 2022, 14:04:00)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> math.cos(42.)
-0.39998531498835127
>>> exit()
==10469==
==10469==  ---------------------------------------------------------------------
==10469==  Operation                            Instruction count
==10469==   `- Precision
==10469==       `- Vectorization          Total             Instrumented
==10469==  ---------------------------------------------------------------------
==10469==  add                      519                      519          (100%)
==10469==   `- dbl                      519                      519      (100%)
==10469==       `- llo                      519                      519  (100%)
==10469==  ---------------------------------------------------------------------
==10469==  sub                        8                        8          (100%)
==10469==   `- dbl                        8                        8      (100%)
==10469==       `- llo                        8                        8  (100%)
==10469==  ---------------------------------------------------------------------
==10469==  mul                      520                      520          (100%)
==10469==   `- dbl                      520                      520      (100%)
==10469==       `- llo                      520                      520  (100%)
==10469==  ---------------------------------------------------------------------
==10469==  div                        2                        2          (100%)
==10469==   `- dbl                        2                        2      (100%)
==10469==       `- llo                        2                        2  (100%)
==10469==  ---------------------------------------------------------------------
==10469==  mAdd                      15                       15          (100%)
==10469==   `- dbl                       15                       15      (100%)
==10469==  ---------------------------------------------------------------------
==10469==  cmp                      204                        0          (  0%)
==10469==   `- dbl                      204                        0      (  0%)
==10469==       `- scal                     204                        0  (  0%)
==10469==  ---------------------------------------------------------------------
==10469==  conv                      33                        0          (  0%)
==10469==   `- dbl=>int                  32                        0      (  0%)
==10469==       `- scal                      32                        0  (  0%)
==10469==   `- dbl=>sht                   1                        0      (  0%)
==10469==       `- scal                       1                        0  (  0%)
==10469==  ---------------------------------------------------------------------
==10469==  min                        1                        0          (  0%)
==10469==   `- dbl                        1                        0      (  0%)
==10469==       `- llo                        1                        0  (  0%)
==10469==  ---------------------------------------------------------------------
==10469== Dumping exclusions list to `/home/E52654/srcVerrou/valgrind-3.19.0-verrou/all.tex'... OK.
==10469== For lists of detected and suppressed errors, rerun with: -s
==10469== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

  $ cat all.tex |grep libm |tee libm.ex
__cos_fma	/usr/lib/x86_64-linux-gnu/libm-2.28.so

Note on C++ and symbol name mangling

The exclusion lists (generation and exclusion) use the mangled form. When analyzing C++ programs, symbol names in the exclusion list can still be demangled using an external program such as c++filt.

12.3.1.3. Source code lines selection

A more fine-grained way to restrict the scope of instrumentation is based on source code lines. In order to effectively use this feature, the instrumented binary file must embed debugging information (i.e. it must have been compiled with the -g switch or something equivalent).

This can be done by providing a list of source code lines to instrument, via the --source command-line option. The file should have the following format:

one item per line,
each item of the form FILENAME LINENUM [SYMNAME], where FILENAME is the source file name (not path!), LINENUM is the line number within the source file, and SYMNAME is optional and indicates the symbol name in which the source code has been compiled. The columns can be separated by any number of spaces and tabulations (\t).

Here is an example of a sources list file:

#FILENAME       LINENUM  SYMNAME
s_sin.c         594      __cos_avx
s_sin.c         598      __cos_avx
fenv_private.h  433      __cos_avx
s_sin.c         581      __cos_avx
e_asin.c        10       __acos_finite
s_sin.c         12       cos

When verrou finds an instruction coming from a listed source code line, it instruments it. Only the source file name and line number are considered during this selection. If the --source option is not specified all instructions are instrumented. If the file is empty, nothing is instrumented (contrary to the first verrou versions).

The --gen-source command-line switch can help generate the list of all source code lines encountered during program execution, for later filtering. It can be wise to combine it with the --exclude switch in order to restrict the source code lines list to a specific set of symbols.

Note

This fine-grained selection of code parts to instrument is primarily meant to be used by automatic debugging tools (see Delta-Debugging), not directly by humans.

12.3.2. Instrumented sections

Another way of controlling the scope of perturbations consists in toggling instrumentation on and off during program execution.

Instrumentation can be toggled off at program start using the --instr-atstart command-line option. Furthermore, it can be (de)activated programmatically using the VERROU_START_INSTRUMENTATION and VERROU_STOP_INSTRUMENTATION client requests. This feature is greatly inspired by Callgrind (see Limiting range of event collection in the Callgrind manual). Below is an example use:

#include <valgrind/verrou.h>
#include <stdio.h>

float compute ();

int main () {
  VERROU_START_INSTRUMENTATION;
  float result = compute();
  VERROU_STOP_INSTRUMENTATION;

  fprintf (stdout, "result = %f", result);
}

This program should be run in Verrou using the following command:

valgrind --tool=verrou --instr-atstart=no program

The same thing can be achieved interactively (and without the need for a recompilation of the program) using the instrumentation monitor command.

As the VERROU_START_INSTRUMENTATION and VERROU_STOP_INSTRUMENTATION client requests (usually called hard stop/start) imply to flush the valgrind instrumentation cache, it can be expensive for very frequent calls, we also propose an other couple of client requests (usually called soft sotp/start): VERROU_START_SOFT_INSTRUMENTATION and VERROU_STOP_SOFT_INSTRUMENTATION. The idea behind is to instrument only once, but with a function which dynamically select between the instrumented backend and the nearest backend. As the instruemented code is less efficient, this approach designed only for frequent stop/start. The both approach are compatible : if VERROU_STOP_SOFT_INSTRUMENTATION or VERROU_STOP_INSTRUMENTATION are activated, the instrumentation is stopped.

The hard and soft stop/start can also be called (without recompilation) thanks the an IOMatch script (See --IOmatch-clr).

For python program it is possible to use python bindings:

#!/usr/bin/python3
import valgrind.verrouPyBinding as verrou

verrou.start_instrumentation()
verrou.stop_instrumentation()
verrou.start_soft_instrumentation()
verrou.stop_soft_instrumentation()

As verrou_dd_stdout and verrou_dd_task use a stop/start mechanism (hard or soft), you can use the other mechanism to manually instrument your code, without unwanted side-effect.

12.4. Debugging and error localization

We describe in this section techniques which can help determining the origin of numerical errors in the source code of the analyzed program.

12.4.1. Delta-debugging with the verrou_dd_sym, verrou_dd_line, verrou_dd_task and verrou_dd_stdout commands

The first technique builds upon Verrou's Instrumentation scope features in order to perform a search of functions (or source code lines / verrou task / stdout task) whose perturbation produces the most important errors in final results. The idea is the following:

First, a complete list of all symbols (or line / task / stdout task) which contain perturbed floating point instructions is generated. It is expected that perturbing all functions in this list will produce inexact results. On the other hand, excluding all symbols from the scope of perturbations should produce unperturbed results. This list is called the search space.
By splitting the search space in two parts, and perturbing each half separately, it is possible to determine whether each perturbed half produces inexact results.
Going on like this and performing a bisection of the search space, the algorithm eventually finds a subset of functions whose only perturbation is enough to produce inexact results. This Trial and Error algorithm is called Delta-Debugging. Each trial (which corresponds to a subset of the search space) is called a configuration. As the evaluation of error is based on a stochastic evaluation there are several runs per configuration.

Depending on the search space there are four different commands:

verrou_dd_sym uses the symbol search space and uses Excluded Symbols to select parts of the program which are instrumented or not.
verrou_dd_line uses the code line search space and uses Source instrumentation to select parts of the program which are instrumented or not. The results of this command are more useful if the code is compiled with debug option (-g).
verrou_dd_task uses verrou task search space ( See how to perform instrumentation). This search space enable temporal localization. The task instrumentation can be automatic with python code but is intrusive for compiled codes. verrou_dd_task has one specific option.
verrou_dd_stdout performs localization in terms of code section defined by the lines of the stdout. The search space is not directly the stdout lines because from one perturbed execution to an other the lines may differ. To bypass the problem the search space is a list of wildcard expression (With the default search space, the float are replace by * and address by ?).

The user do not need to instrument the code but may need to modify the dataset to increase the verbosity. As the command works without code recompilation, it is possible to work from one result of an other verrou_dd_* commands, without rerunning the previous search.

The user has to pay attention to bufferization. verrou_dd_stdout automatically setup the PYTHONUNBUFFERED env variable and deactivate the c bufferization of stdout (not FILE*) but it not always sufficient.

As there are more corner cases, there are specific options to tackle it (cf. Specific options of verrou_dd_task/stdout and verrou_dd_stdout)

Synopsis

verrou_dd_sym [options]  {/path/to/run_script} {/path/to/cmp_script}

verrou_dd_line [options]  {/path/to/run_script} {/path/to/cmp_script}

verrou_dd_task [options]  {/path/to/run_script} {/path/to/cmp_script}

verrou_dd_stdout [options]  {/path/to/run_script} {/path/to/cmp_script}

Description

Arguments passed to verrou_dd_* are scripts. These scripts will be called during the delta-debugging process to automate the tasks of running your program under Verrou, and verifying the results. The scripts will be called with the following arguments:

run_script {dir}

Should run your program under Verrou, and put the results somewhere under the dir directory. A minimal example could look like the following:

#!/bin/bash
DIR="$1"
valgrind --tool=verrou --rounding-mode=random program args > ${DIR}/results.dat

cmp_script {ref_dir} {run_dir}

Should check whether program results in run_dir (as stored by a previous call to run_script) are "valid". The precise meaning of "valid" is left for the user to determine, depending on the program being analyzed. If a comparison to reference results is needed, directory ref_dir contains the results produced by an uninstrumented run of the program. cmp_script should return 0 if and only if the results are "valid".

A minimal example could look like the following:

#!/bin/bash
REF="$1"
RUN="$2"
diff ${REF}/results.dat ${RUN}/results.dat

A classical python script looks like this :

import sys
import os
def extractValue(rep):
    """This parse function has to be adapted to each case"""
    lines=(open(os.path.join(rep,"results.dat")).readlines())
    for line in lines:
        if line.startswith("res="):
            return float(line.partition("=")[2])
    return None
if __name__=="__main__":
    if len(sys.argv)==2:
        #this case is to be compatible with the extract script of verrou_plot_stat
        print(extractValue(sys.argv[1]))
    if len(sys.argv)==3:
        valueRef=extractValue(sys.argv[1])
        value=extractValue(sys.argv[2])

        relDiff=abs((value-valueRef)/valueRef)
	#the tolerance has to be adapted
        if relDiff < 1.e-6:
	    sys.exit(0)
        else:
            sys.exit(1)

Options and environment variables

The following options or environment variables affect the behavior of verrou_dd_sym, verrou_dd_line, verrou_dd_task and verrou_dd_stdout:

--nruns or VERROU_DD_NRUNS

An integer setting the required number successful runs needed to consider a configuration as stable. If this environment variable is not set, the default value is 5.

--num-threads or VERROU_DD_NUM_THREADS

Allows to perform VERROU_DD_NUM_THREADS runs in parallel. If this environment variable is not set, the parallelism is not activated.

--reference-rounding or VERROU_REFERENCE_ROUNDING [default native].

Select the reference rounding mode between native and nearest. The nearest and native rounding mode are different only when the verrou option --libm=instrumented is used.

--res-with-all-samples or VERROU_DD_RES_WITH_ALL_SAMPLES

Disable the early exit for result directories. It is useful to compute estimator for all ddmin* directories with the same number of samples and hence to sort them.

--algo= or VERROU_DD_ALGO

Option to choose one of the two Delta-Debugging algorithms (ddmax and rddmin).

rddmin: return recursively the min-set of unstable symbols.

ddmax: return the max-set of stable symbols (or lines).

If this environment variable is not set, the default value is set to rddmin.

--rddmin= or VERROU_DD_RDDMIN

Option to choose one of the variant of rddmin algorithms.

"", strict: rddmin algorithm with constant number of samples

s, stoch: rddmin algorithm with increasing number of samples (called srddmin)

d, dicho: srddmin algorithm pre-filtered by a dichotomy method (called drddmin)

The default value is set to dicho.

--rddmin-tab= or VERROU_DD_RDDMIN_TAB

Option to choose the evolution of the number of samples used by the srddmin algorithm.

exp: increases the number of samples from 1 to VERROU_DD_NRUNS with exponential rate.

all: consecutively try all numbers of samples from 1 to VERROU_DD_NRUNS.

single: directly use the number of samples given by VERROU_DD_NRUNS. srddmin is equivalent to rddmin in this case.

The default value is set to exp.

--dicho-tab= or VERROU_DD_DICHO_TAB

Option to choose the evolution of the number of samples used by the binary search in the first part of the drddmin algorithm.

exp: increases the number of samples from 1 to VERROU_DD_NRUNS with exponential rate.

all: consecutively try all numbers of samples from 1 to VERROU_DD_NRUNS.

single: use the number of samples given by VERROU_DD_NRUNS.

half: use half the number of samples given by VERROU_DD_NRUNS.

N: user-provided value (must be an integer between 1 and VERROU_DD_NRUNS).

The default value is set to half.

--dicho-granularity= or VERROU_DD_DICHO_GRANULARITY

Option to choose the granularity of the splitFunction of the dichotomy algorithm (used by drddmin).

The default value is set to 2. Higher value may help parallelism.

--quiet or VERROU_DD_QUIET

Option to reduce the verbosity of verrou_dd_*.

--cache= or VERROU_DD_CACHE

Option to choose the policy with the existing cache dd.sym, dd.line, dd.task or dd.stdout

clean: the previous cache is removed (after reading required information for --rddmin-heuristics-cache).

rename: the previous cache is renamed with time information (time of last symlink creation).

rename_keep_result: the previous cache is renamed and all configurations not pinpointed by symlink are removed. This option is a compromised between disk usage and the opportunity to used the --rddmin-heuristics-cache= option.

keep-run: the previous cache is kept but the cmp_script is run again. Useful when only the cmp_script is modified.

continue: the previous cache is kept for all computation. This option is really useful after manual cache modification to recover computation after error such as No space left on device or job timeout.

The default value is set to continue (will probably change soon).

--rddmin-heuristics-cache= or VERROU_DD_RDDMIN_HEURISTICS_CACHE

Option to choose the policy with cache to use to prefiltered the rddmin algorithm by previous ddmin results.

none: the current cache and old cache directory are not used.

cache: only the current cache directory is used.

all_cache: the current and old cache directories are used. The old cache directories come from previous run with rename or rename_keep_result values for --cache= option.

The default value is set to none.

--rddmin-heuristics-rep= or VERROU_DD_RDDMIN_HEURISTICS_REP

Option to choose the directory used to prefiltered the rddmin algorithm by previous ddmin results. Could be used several times (with option but not with environment variable). Usually useful after a manual cache renaming.

--rddmin-heuristics-line-conv or VERROU_DD_RDDMIN_HEURISTICS_LINE_CONV

Option to shift line numbers of line heuristics. The shift is computed thanks a diff between reference computations. This option is useful for a new delta debug with the same test case and few code modifications. This option can be applied for all tools but is only pertinent for verrou_dd_line.

--seed or VERROU_DD_SEED

Option to specify the seed used by verrou_dd_* tools to setup the VERROU_SEED. Each sample of a configuration receive a different seed (VERROU_SEED read by verrou tool) generated by a pseudo random generator initialized with VERROU_DD_SEED. The %i% sample (dd.run%i%) of each configuration receive the same seed (VERROU_SEED). By default VERROU_SEED is not set (ie each verrou run set is own seed). This option is useful for be more reproductible (You have to take care of parallelism, use of verrou_dd_* cache, and reproducibility of run script).

Results rddmin

verrou_dd_sym (resp. verrou_dd_line/verrou_dd_task/verrou_dd_stdout) stores many intermediate results within the dd.sym (resp. dd.line/dd.task/dd.stdout) directory. Synthetic results are present in the form of symbolic links named after the pattern: ddmin-N.

The list of all symbols (resp. lines/verrou tasks/stdout tasks) found to be unstable in the ddmin-N set can be retrieved in the file: ddmin-N/dd.include.

The symbolic link rddmin-cmp pinpoints the configuration which is the complement of the union of all ddmin configuration. This successful configuration terminates the rddmin algorithm.

For debug purpose there is one symbolic link (FullPerturbation) to pinpoint the configuration (which should fail) with all symbols (resp. lines) perturbed and another (NoPerturbation) to pinpoint the configuration without any perturbation (should be a successful configuration).

Results ddmax

verrou_dd_sym (resp. verrou_dd_line/verrou_dd_task/verrou_dd_stdout) stores many intermediate results within dd.sym (resp. dd.line / dd.task / dd.stdout) directories. The ddmax configuration is pinpointed by a symbolic link (ddmax).

Specific option of verrou_dd_task and verrou_dd_stdout

--stop_start= or VERROU_DD_STOP_START

hard[default] use the hard stop/start client requests

soft use the soft stop/start client requests. It should be used if the code use manual hard stop/start client requests or if the size of tasks is small (typically with python localization).

Specific options of verrou_dd_stdout

--unknown-task= or VERROU_DD_UNKNOWN_TASK

Option to choose the policy of task which are not in the search space.

previous: the status of the previous task is kept

stop: the task is uninstrumented.

start: the task is instrumented.

The default value is set to previous.

--ignore-empty-line=[true[default]|false] or VERROU_DD_IGNORE_EMPTY_LINE

Option to define the status of empty-line. If set (by default) the task defined by an empty line is merged with the previous task.

--IOmatch-header=FILE or VERROU_DD_IOMATCH_HEADER

This option allow to add the content of FILE in the IOMatch script used by verrou_dd_stdout. In practice it is useful to prevent excessive number of iterations (especially when the additional iterations recover the poor precision of the first ones) with this kind of FILE:

cmatch: IT=42 *
apply: exit

Be careful the match expression should not match any line of the reference computation.

verrou_dd_stdout tool performs few FILE verification, to let the user the full power of the functionalities. But you should keep in mind the use of bmatch and stop/start actions is really tricky and need a deep understanding of IOMatch and verrou_dd_stdout.

--filter-cmd=CMD or VERROU_DD_FILTER_CMD

The command CMD is used by verrou to replace the stdout lines by the result of the command. CMD reads lines from stdin and write his result to stdout for each line. The command is run only once, so it mean we have to pay attention to bufferization (We may need to use the option -u of sed or python3). The command "/usr/bin/sed -u s/debug:.*//" used together with --ignore-empty-line=true may be a way to ignore debug message. If you want to write your own python script, these is an example:

#!/usr/bin/python3 -u
import sys

def applyFilter(strLine,filterList):
    res=strLine
    for filterFnc in filterList:
        res=filterFnc(res)
    return res

def deleteDebug(strLine):
    if "debug" in strLine:
        return ""
    else:
        return strLine

def getLineCharByChar():
    line=""
    while True:
        c=sys.stdin.read(1)
        if line== "\x00": #EOF detected
            return None
        if c=='\n':
            return line
        line+=c

if __name__=="__main__":
    while True:
        line=getLineCharByChar()
        if line==None:
            sys.exit(0)
        print(applyFilter(line, [deleteDebug]))

To avoid problem with bufferization, we need to write your own readline function (cf. getLineCharByChar) and to use -u python3 option.

The first argument of CMD has to be an absolute path (or relative if there is no CWD change in run_script). This option is not (yet) able to read the $PATH env variable. All the spaces are used to separate command arguments. To prevent this behaviour, the only way is to escape the spaces by backslash.

--custom-search-space=FILE or VERROU_DD_CUSTOM_SEARCH_SPACE

The default search space generated may suffer of default. By default all floating point are replace by *. But if floating point are printed as int, there are missing. To bypass the issue it is possible to recover the file dd.stdout/ref/dd.IOMatch and perform manually the correction. Usually we only add * but it is also possible to do more complex modification:

The line order can be modified. Indeed if a line matches two wildcard expressions, the first one is taken into account. So if one expression is more specific than an other one, we should place it first.

By default the float are wildcarded by * and integer kept. But if we want specified iteration by the error order it is possible to rewrite the following example:

		    IT=0 ERRS=*
		    IT=1 ERRS=*
		    IT=2 ERRS=*
		    IT=3 ERRS=*

by this search space :

		    IT=* ERRS=*e-01
		    IT=* ERRS=*e-02
		    IT=* ERRS=*e-03
		    IT=* ERRS=*e-04
		    IT=* ERRS=*e-*
		    IT=* ERRS=*

--file-pattern=FILE_PATTERN or VERROU_DD_FILE_PATTERN

The stdout file can be replaced by the first file which matches the FILE_PATTERN. The FILE_PATTERN has to match only one opened file. To avoid difficulties with the absolute and relative path, the easiest way is to use a wildcard * for the path. If you want to be more precise, you could use a strace command to know how a file is opened:

strace -e openat,open,creat -f -ff -o strace_verrou MYPROG MYARGS
cat strace_verrou.*

Definition of verrou task

A verrou task is defined by a key string and an index integer. The integer is usually used to specify a line of code or an iteration number. If you want to use both, you have to build a string post-fixed by one of both integers. If you need neither of both, 0 is fine. The task represents the code between the verrou task call and the next one (or finalyze call).

C API

#include "libverrouTask.h"

void verrou_task_init();
void verrou_task(char const*const key, int index);
void verrou_task_finalyze();

To link you need to use the following link flag : -lverrouTask

Python API

The python API does not contain init and finalyze as it is done automatically by the module import.

#!/usr/bin/python3
import valgrind.verrouPyBinding as verrou

verrou.task("taskString", taskInt)

You can notice that verrouPyBinding can also provide binding to other client requests:

#!/usr/bin/python3
import valgrind.verrouPyBinding as verrou

verrou.start_instrumentation()
verrou.stop_instrumentation()
verrou.start_soft_instrumentation()
verrou.stop_soft_instrumentation()
index=verrou.dump_cover()
verrou.display_counters()
nbInst=verrou.count_fp_instrumented()
nbNotInst=verrou.count_fp_not_instrumented()

Python module instrumentation

The trace_verrou_task.py is a fork of trace.py (provided by cpython distribution) which add verrou task for each python line. The key is the filename and the index the line number. To use it with verrou_dd_task it imply to modify the run script following the example:

#!/bin/bash
DIR=$1
valgrind --tool=verrou --rounding-mode=random  trace_verrou_task.py ./myPythonProg.py > $DIR/res.dat

Remarks:

The APIs are not thread safe. As valgrind sequentialize threads, it is usually not an issue. But if due to parallelism, it is not clear which code is contained in the task, the user may have to modify his own code to suppress ambiguities.
Internal use of environnement variables:
- DEBUG_PRINT_TASK: configure the debug status
- TASK_LIST: read the file to setup which tasks are instrumented.
- GENERATE_TASK_LIST: generate the file containing the list of tasks.
- GENERATE_TASK_FP_LIST: generate the file containing the list of tasks containing instrumented floating point operation.

verrou_dd_clean

The tool verrou_dd_clean enables to clean the cache repository produced by verrou_dd_* commands (The cache directory is the first argument, and the default value is ./). In practice the command removes intermediate configurations which are not result (directories with hash name which are not pinpointed by a symlink).

Synopsis

verrou_dd_clean [options]  {/path/to/cache}

Options

--help: Print help
--dry-run: Only print what will do the command. Useful if you need to check carefuly what will be deleted.

12.4.2. Code coverage comparison

The first technique to localize unstable tests, is based on the compatibility with instrumentation provided by the compilers. The idea is to produce with compiler tools (option --coverage and tool gcov for gcc) two code coverage for the analyzed test case. The first one can be generate without verrou or with nearest rounding mode and the second one with random rounding mode. The second one can be produced several time, to get a coverage with sufficient discrepancies compared to the one with nearest rounding mode. Once both code coverage generated, simple graphical tools as meld or kompare can be used to pinpoint potential unstable test.

12.4.3. Basic Block coverage comparison

The second technique to localize unstable tests, is based on Basic Block (BB) Coverage provided by verrou. The idea behind this technique is the same as behind code coverage comparison except in the way to produce the coverage. First it is required to specify which part of the code to trace (instrument BB coverage), with a file format similar to symbols exclusion file. To instrument only the binary without libraries the following script can be used:

echo "*" ‘readlink -f program‘ > trace.inc

With the trace.inc the BB coverage instrumentation can be activated with the --trace= option:

valgrind --tool=verrou --rounding-mode=random --trace=trace.inc program args

Hence two files trace_bb_cov.log-PID and trace_bb_info.log-PID are generated. The first one contains counter information for each Basic Blocks and the second one contains debug information for each Basic Bloc. These two files (which must be moved together) can be post-treated to be as readable as possible with the genCovBB command.

genCovBB trace_bb_cov.log-*

The differences between generated coverage are more difficult to analyze compared to the ones generated by code coverage tools, because the code structure is lost. Nevertheless it is easier to generate and can produced really powerful information if used with the client request VERROU_DUMP_COVER to recover partially time information lost by coverage techniques.

The BB coverage generation can be simplified by post_verrou_dd.

12.4.4. post_verrou_dd

Synopsis

post_verrou_dd [options]  {/path/to/run_script} {/path/to/cmp_script}

Description

The tool post_verrou_dd enables to post-treat the result repository obtained by verrou_dd_line command. The post-treatment consists to run for each selected ddmin configuration new runs with modified (or not) configurations (rounding mode, instruction types). For each run it is possible to activate the Basic Bloc coverage generation.

The scripts run_script and cmp_script are the same parameters of the verrou_dd_line command.

Options and environment variables

--nruns= or VERROU_POST_NRUNS

An integer setting the run number of each configuration. If this variable is not set, the default value is 5.

--num-threads= or VERROU_POST_NUM_THREADS

Allows to perform VERROU_POST_NUM_THREADS runs in parallel. If this variable is not set, the parallelism is not activated. To use this option run_script has to be thread safe.

--rep= or VERROU_POST_REP

Set the cache repository to find the default configuration. If this variable is not set, its value is dd.line

--configuration= or --sub-rep= or VERROU_POST_CONFIGURATION

The option can be called several times to generate a list of configuration. By default the list contains $VERROU_POST_REP/ddmin* , $VERROU_POST_REP/rddmin-cmp and $VERROU_POST_REP/ref. Each element of the list has to be generated by the verrou_dd and so the directory ../ref has to exist.

--instr or VERROU_POST_INSTR.

The option can be called several times to generate a list of instruction type configuration. A configuration is a coma separated list of the following items : add, sub, mul, div, madd, msub, sqrt, conv.

--rounding= or --rounding-mode= or VERROU_POST_ROUNDING. The option can be called several times to generate a list of rounding mode ( random[_[[s]com]det], prandom[_[com]det], average[_[[s]com]det], sr_[s]monotonic, nearest, native, upward, downward, farthest, toward_zero, away_zero )

det corresponds to the list : [upward , downward, farthest, toward_zero, away_zero, nearest]

no_det corresponds to the list : [average, random, prandom]

--trace-bin or VERROU_POST_TRACE_BIN

Activate the BB coverage trace generation. The verrou trace file is automatically generated with the binary object (without libraries). If the trace file is not the one expected, the option --trace-file= has to be used.

--trace-pattern= or VERROU_POST_TRACE_PATTERN

Activate the BB coverage trace generation. The verrou trace file is automatically generated thanks pattern (* and ?). If the trace file is not the one expected, the option --trace-file= has to be used. This option is compatible with --trace-bin=. This option can be called several times.

--trace-file= or VERROU_POST_TRACE_FILE

Activate the BB coverage trace generation. The user has to provide directly a specific trace-file (See Verrou format). This option is incompatible with --trace-bin and --trace-pattern=. This option is less convenient than the two previous one but may be required for few corner cases.

12.4.5. verrou_plot_stat

Synopsis

verrou_plot_stat [options] [/path/to/run_script]  {/path/to/extract_script}

Description

The tool verrou_plot_stat enables to visualize the histogram of the result (extracted with extract_script) of a code run under verrou and to compute the non-gaussian error estimator (results are printed in stdout and in a file stat.last).

refValue[nearest] : 1.05891843750000000e+05
estimator	abs	rel	rel(in bit)
average:	1.034992e+03	9.7741e-03	6.68bit
average_det:	1.036070e+03	9.7842e-03	6.68bit
random:	1.983109e+03	1.8728e-02	5.74bit
random_det:	1.985172e+03	1.8747e-02	5.74bit
all:	4.337773e+03	4.0964e-02	4.61bit

The abs estimator is max_i(|X_i-X_ref|) .
The rel estimator is max_i(|X_i-X_ref|)/|X_ref|
The binary rel estimator is -log2(max_i(|X_i-X_ref|)/|X_ref|)
all means the i samples take into account all the sample of previous rounding mode and all deterministic computed rounding mode

The scripts run_script is similar to the one of verrou_dd_line command. This script is optional if the computation are already done (by example with a previous call to verrou_plot_stat or post_verrou_dd). The script extract_script has only one parameter (the repository of computation: the same parameter of run_script). It has to print the extracted value on the first line of the standard output. To study several output variables the command can be called several times the command with different extract_script. Thanks the cache repository (by default verrou.stat) the run commands won't be rerun. There is no guarantee in case of concurrent run of the command verrou_plot_stat and the run_script and extract_script should not be modify during a verrou_plot_stat call.

Options

--help= or -h

Print help.

--rep=DIR or -r DIR

Set the repository to store/read computation. By default DIR is set to verrou.stat. If the option --specific-pattern is used the default is set to ./.

--cache=continue[default]|clean|rename

Option to choose the policy with the existing cache defined by --rep option.

clean: the previous cache is removed.

rename: the previous cache is renamed with time information (time of last file modification).

continue: the previous cache is kept for all computation.

--samples= or --nruns= or -s

An integer setting the run number of random and average distribution. If this variable is not set, the default value is 100. -1 means the highest number without running additional computation.

--num-threads=

An integer setting the run number of threads to run code thanks run_script . Default is sequential computation. To use this option run_script has to be thread safe.

--rounding-list= or --rounding-mode= or --rounding=

A coma separated list of rounding mode. The list can contain random[_[[s]com]det], prandom[_[com]det], average[_[[s]com]det], sr_[s]monotonic, nearest, native, upward, downward, farthest, toward_zero, away_zero

nodet is an alias to average,random.

det is an alias to nearest,upward,downward,toward_zero,away_zero,farthest.

The list can also contain prandom_fun. This rounding mode corresponds to prandom with the --prandom-update set to func.

The list can also contain prandom_FLOAT with FLOAT a floating point number between 0 and 1. This rounding mode corresponds to prandom with the --prandom-pvalue set to FLOAT.

If the option is set multiple times, the list will be append to the current list.

--montecarlo or -m

If this option is set, the deterministic mode use also the number of sample defined by --samples=.

--png=FILE or -p FILE

Defined the output file for histogram. By default the plot is interactive.

--nb-bin=VALUE

Set the number of histogram bins to VALUE. The default value is 40.

--no-plot

Do not plot histogram. Only the error estimators are computed.

--run-only

Neither plot the histogram nor compute the error estimators.

--relative=VALUE or -r VALUE

With this option the abscissa is relative. The reference (defined by VALUE) is either a numerical value or a key to a rounding mode (nearest, upward, ...)

--ref=VALUE

This option is used to defined the reference used to compute accuracy estimator. The VALUE is key to a rounding mode (nearest, upward, ...). If the rounding mode is not deterministic, the average value is taken.

--continue-on-extract-value-error

This option is used to be able to plot and compute estimator even if few samples produce an execution error. If errors occur the plot and the stat indicate the failure rate. This option is not activated by default, as usually errors come from error in the extract script.

--mca=MODE-DBL-FLT

Add the MODE-DBL-FLT mca mode to the study. MODE has to be in ["rr","pb","mca"]. DBL is mca noise for double and FLT for float. Typical configuration is --mca=rr-53-24

--specific-pattern=

This option enables to plot histogram for computation run without the framework or after files renaming. Typical calls are:

--rep="." --specific-pattern=verrou.stat/random/dd.run*

--rep="verrou.stat" --specific-pattern=random/dd.run*

--seed=

Option to specify the seed used by verrou_plot_stat tool to setup the VERROU_SEED. This option is similar to --seed of verrou_dd_*.

12.5. Advanced features

12.5.1. Deterministic sections

Sometimes you want a part of your program to be instrumented and perturbed by rounding-mode switches, but you don't want to lose determinism. For example, in the following program, det is called twice with the same arguments, and the correct execution relies on the strong hypothesis that both calls will return the same result.

Note

Before to use this option, you should think about to use the _det, _comdet or _scomdet rounding modes (See --rounding-mode option).

double det (double x) {
  return 0.1*x;
}

int main () {
  double x1 = det (42);
  double x2 = det (42);
  assert (x1 == x2);
}

In this situation, you know that det can contain floating-point errors, which you want to quantify. However, you also know that whatever these errors, det will remain deterministic and the assertion only fails due to the instrumentation added by Verrou. The VERROU_START_DETERMINISTIC(LEVEL) client request can help dealing with such problems.

At the beginning of a deterministic section, the pseudo-random number generator (pRNG) used for random rounding mode switching is seeded with a new value. This value is computed deterministically from the location in the program source code. This ensures that each time the instrumented program enters the same deterministic section (same location in the source code), the pRNG is seeded with the same value, leading to the same sequence of rounding mode switches. The seed value also depends on the PID of the current process, so that different program executions lead to different results.

12.5.1.1. Basic usage

Use the VERROU_START_DETERMINISTIC(0) client request to mark the beginning of a deterministic section. Similarly, put a VERROU_STOP_DETERMINISTIC(0) client request at the end of the deterministic section to go back to (pseudo-)random rounding mode switching.

Here is an example instrumented program:

#include <valgrind/verrou.h>

double det (double x) {
  VERROU_START_DETERMINISTIC(0);
  double result = 0.1*x;
  VERROU_STOP_DETERMINISTIC(0);
  return result;
}

int main () {
  double x1 = det (42);
  double x2 = det (42);
  assert (x1 == x2);
}

whose execution yields the following output:

--8523-- Entering deterministic section 70660: det() (deterministic.c:4)
--8523-- Leaving deterministic section: det() (deterministic.c:6)
--8523-- Entering deterministic section 70660: det() (deterministic.c:4)
--8523-- Leaving deterministic section: det() (deterministic.c:6)

Here we can see that both calls to the det() functions used the same value to seed the pRNG (based on the client request location in the source).

12.5.1.2. Advanced usage

Assume the following program, in which two distinct deterministic sections are instrumented, but the client requests have been abstracted out in separate function calls (this is actually required for example for Fortran programs, which have to call a C function to issue client requests):

#include <valgrind/verrou.h>

void verrou_startDeterministic() {
  VERROU_START_DETERMINISTIC(0);
}

void verrou_stopDeterministic() {
  VERROU_STOP_DETERMINISTIC(0);
}

double det1 () {
  verrou_startDeterministic();
  /* ... */
  verrou_stopDeterministic();
}

double det2 () {
  verrou_startDeterministic();
  /* ... */
  verrou_stopDeterministic();
}

int main () {
  fprintf (stderr, "   det1\n");
  assert (det1() == det1());

  fprintf (stderr, "   det2\n");
  assert (det2() == det2());
}

Executing this program in Verrou yields the following output:

det1
--2909-- Entering deterministic section 82435: verrou_startDeterministic() (deterministic2.c:4)
--2909-- Leaving deterministic section: verrou_stopDeterministic() (deterministic2.c:8)
--2909-- Entering deterministic section 82435: verrou_startDeterministic() (deterministic2.c:4)
--2909-- Leaving deterministic section: verrou_stopDeterministic() (deterministic2.c:8)
det2
--2909-- Entering deterministic section 82435: verrou_startDeterministic() (deterministic2.c:4)
--2909-- Leaving deterministic section: verrou_stopDeterministic() (deterministic2.c:8)
--2909-- Entering deterministic section 82435: verrou_startDeterministic() (deterministic2.c:4)
--2909-- Leaving deterministic section: verrou_stopDeterministic() (deterministic2.c:8)

since the client requests are always issued from the same source location, the two deterministic sections are seeded with the same value.

It is possible to give the VERROU_START_DETERMINISTIC a non-0 LEVEL argument to look at the source location of a calling function in the stack. In the case described above, replacing the verrou_startDeterminisic and verrou_stopDeterministic function definitions like this:

void verrou_startDeterministic() {
  VERROU_START_DETERMINISTIC(1);
}

void verrou_stopDeterministic() {
  VERROU_STOP_DETERMINISTIC(1);
}

yields the following output:

det1
--4523-- Entering deterministic section 14298: det1() (deterministic2.c:12)
--4523-- Leaving deterministic section: det1() (deterministic2.c:14)
--4523-- Entering deterministic section 14298: det1() (deterministic2.c:12)
--4523-- Leaving deterministic section: det1() (deterministic2.c:14)
det2
--4523-- Entering deterministic section 65473: det() (deterministic2.c:18)
--4523-- Leaving deterministic section: det2() (deterministic2.c:20)
--4523-- Entering deterministic section 65473: det() (deterministic2.c:18)
--4523-- Leaving deterministic section: det2() (deterministic2.c:20)

in which the pRNG is seeded using source locations one level up the stack from the client request.

Since the source location is not needed to go back to (pseudo-)random rounding mode switching, the LEVEL argument to VERROU_STOP_DETERMINISTIC is only used for cosmetic and debug purposes.

12.6. Reference

12.6.1. Command-line options

12.6.1.1. General options

--vr-verbose=<yes|no> [default=no]

Toggle verbosity: prints messages for x387 instructions and client requests.

--count-op=<yes|no> [default=yes]

Toggle floating-point operations counting.

--backend=<verrou|mcaquad|checkdenorm> [default=verrou]

Select the verrou, mcaquad or checkdenorm backend. verrou enables to perform several rounding mode (See --rounding-mode option). mcaquad enables to perform MCA (Monte Carlo Arithmetics) based on extended precision in quad (See --mca-mode, --mca-precision-double and --mca-precision-float options). The integration of mcaquad backend in the frontend verrou is still considered as experimental. checkdenorm enables the ftz rounding-mode and the --check-denorm and --cd-gen-file options.

This option can be activated with the env variable VERROU_BACKEND

12.6.1.2. Perturbation of floating-point operations

Emulate the given rounding mode for operations instrumented with the verrou backend. If this option is not provided, Verrou always rounds to the nearest floating-point. Supported rounding modes are:

Stochastic rounding modes: random, average, prandom, random_det, average_det, prandom_det, random_comdet, average_comdet, prandom_comdet, random_scomdet, average_scomdet, sr_monotonic, sr_smonotonic.
IEEE-754 rounding modes: nearest, upward, downward, toward_zero.
Other: native[default] (similar to nearest (maybe different with --libm=instrumented), away_zero, farthest, float, ftz(imply checkdenorm backend).

This option can be activated with the env variable VERROU_ROUNDING_MODE.

--float=<yes|no> [default=no]

With this option, all double precision floating-point operations are replaced by simple precision equivalent in the frontend. Hence this option is compatible with all --rounding-mode options and with all other backends.

Numerically, option --rounding-mode=float is equivalent to --rounding-mode=nearest --float=yes.
Options --rounding-mode=nearest --float=yes instrument float operations by nearest wrapper function. So if your code contains float operations, it can be useful to use --vr-instr-float=no to speed up the instrumentation. The only drawback of this approach, is the deactivation of checks (such as Nan and Inf checks) over float operations.

This option can be activated with the env variable VERROU_FLOAT.

--libm=<auto_exclude|manual_exclude|instrumented> [default=auto_exclude]

Define the behavior of verrou with libm.

auto_exclude : The libm exclusion is automatically detected thanks pattern detection. The library detected are libm, vgpreload_verrou, interlibmath, libquadmath and libgcc_s. If you need to perturb libgcc_s, we will need to use libgcc_s.
manual_exclude : The automatic detection is ignored and the user has to define manually the exclusion thanks --exclude option. This option is useful if the pattern detection detect object you need to perturb.
instrumented : the interposition library Interlibmath is loaded (The complex libm function are not yet instrumented). This option is valid only with verrou backend. This option imply implicitly the libm exclusion.

The option instrumented is quite new. In a near future, the default will switch from auto_exclude to instrumented

--libm-noinst-rounding-mode==<native|nearest> [default=native]

--mca-mode=<mca|rr|pb|ieee> [default=mca]

Emulate the given MCA mode for operations instrumented with the mcaquad backend. Supported mca modes are:

mca : full mca (default)
rr : random rounding
pb : precision bounding
ieee : ieee (rounding to nearest)

The mcaquad backend implementation come from Verificarlo : More information on Verificalo github

--mca-precision-double= [default=53]

Configure the magnitude of inexact function used by mcaquad backend for double operation.

This option can be activated with the env variable VERROU_MCA_PRECISION_DOUBLE.

--mca-precision-float= [default=24]

Configure the magnitude of inexact function used by mcaquad backend for float operation.

This option can be activated with the env variable VERROU_MCA_PRECISION_FLOAT.

--vr-seed=RNG_SEED [default=automatically generated]

If present, this option allows setting the seed of the pseudo-Random Number Generator used for the random or average rounding modes. The same option can also be used to set the seed of the hash function used for the [p]random_[com]det and average_[com]det rounding mode. This helps reproducing the behavior of a program under Verrou.

If this option is omitted, the pRNG is seeded with a value based on the current time and process id, so that it should change at each execution.

This option can be activated with the env variable VERROU_SEED.

--prandom-update=func [default=none]

If present, this option the p-value of the prandom[|det|comdet] rounding mode is updated (with an uniform random selection between 0 and 1) at the beginning of each function call. Be careful the option is highly sensitive to the program compilation option. If each the update is inserted between each floating point operations the prandom and random modes are equivalent.

This option can be activated with the env variable VERROU_PRANDOM_UPDATE.

--prandom-pvalue=P

If present, this option the p-value of the prandom[|det|comdet] rounding mode is set to P instead of the random number between 0 and 1. If P is equal to 0., prandom and upward are equivalent. If P is equal to 1., prandom and downward are equivalent. If P is equal to .5, prandom and random are equivalent.

This option can be activated with the env variable VERROU_PRANDOM_PVALUE.

Toggle instrumentation of floating-point additions, subtractions, multiplications, divisions, fused multiply additions, fused multiply subtractions, square root, conversions (only double to float cast) respectively. This option can be set multiple times (or use "," to separate arguments) to instrument multiple types of operations.

If this option is not provided, all supported operations types are instrumented.

This option can be activated with the env variable VERROU_INSTR.

--vr-instr-scalar=<yes|no> [default=no]

Toggle instrumentation of x387 scalar instructions. On arm64 architecture, this option is set by default to yes, as scalar instructions respect the IEEE standard.

--vr-instr-llo=<yes|no> [default=yes]

Toggle instrumentation of llo scalar instructions (cast and all fma instructions are considered as llo).

--vr-instr-vec<2,4,8>=<yes|no> [default=yes]

Toggle instrumentation of vectorized instructions (number corresponds to the pack size)

--vr-instr-unk=<yes|no> [default=yes]

Toggle instrumentation of instructions with unknown vectorized status (fma is unvectorized in valgrind IR).

--vr-instr-[flt|dbl]=<yes|no> [default=yes]

Toggle instrumentation of float (or double) instructions.

12.6.1.3. Restriction of instrumentation scope

--instr-atstart=<yes|no> [default=yes]

Toggle hard instrumentation state on or off at program start. Useful in combination with client requests.

This option can be activated with the env variable VERROU_INSTR_ATSTART.

--instr-atstart-soft=<yes|no> [default=yes]

Toggle soft instrumentation state on or off at program start. Useful in combination with client requests.

This option can be activated with the env variable VERROU_INSTR_ATSTART_SOFT.

--exclude=FILE

Symbols listed in FILE will be left uninstrumented.

This option can be activated with the env variable VERROU_EXCLUDE.

--gen-exclude=FILE

Generate in FILE a list of all symbols (which contain perturbed floating point instruction) encountered during program execution. This is useful to build an exclusion list.

In combination with --exclude, only list symbols which were not already present in the provided exclusion list.

WARNING: in order to generate a correct list, the whole binary (including symbols listed in the list provided using --exclude) must be instrumented. When using both --gen-exclude and --exclude, it is advised to avoid perturbing rounding-modes using --rounding-mode=nearest.

This option can be activated with the env variable VERROU_GEN_EXCLUDE.

--source=FILE

When this option is present, only instructions coming from source code lines listed in FILE are instrumented.

This option can be activated with the env variable VERROU_SOURCE.

--warn-unknown-source=FILE

This option requires the use of --source option. When used, verrou generates warning for each line of code (which execute floating point operation) neither present in the FILE defined by --source option nor in the FILE provided by --warn-unknown-source option.

This option can be activated with the env variable VERROU_WARN_UNKNOWN_SOURCE.

--gen-source=FILE

Generate in FILE the list of all source code lines (which contain perturbed floating point instruction) encountered during program execution.

In combination with --source, only list source code lines which were not already present in the provided list.

This option can be activated with the env variable VERROU_GEN_SOURCE.

--IOmatch-clr=IOMATCH_FILE

When this option is present the IOMatch script is read. This file defines the interaction between verrou and the stdout (See IOMatch format). The main idea behind this format is to apply client request action when a line match a bmatch: (b for break) or cmatch: (c for continue) line. If a line match a bmatch the following match (b or c) are not tried. With a cmatch: the following match are tried.

The use of this option, may be sensitive to the bufferization of the program. In C++ std::endl flush the output, so we usually avoid the problem. In python, you can use PYTHONUNBUFFERED env variable. In C we proposed the library verrouUnbuffered.so (Loadable with LD_PRELOAD), to deactivate bufferization of stdout (For other file descriptors you have to flush manually, as it require more complex development). verrou_dd_task automatically use this two tricks.

This option can be activated with the env variable VERROU_IOMATCH_CLR.

--output-IOmatch-rep=REP

Specify the repository of IOmatch log file. By default the log file is IOMATCH_FILE.log-PID. With this option it is REP/IOMatch.log-PID. This option as also influence on the keys dump-stdout: and dump-filtered-stdout:of the IOMatch format).

This option can be activated with the env variable VERROU_OUTPUT_IOMATCH_REP.

12.6.1.4. Coverage generation

--trace=FILE

Activate the Basic Blocks Coverage for the symbols specified in FILE.

This option can be activated with the env variable VERROU_TRACE.

--output-trace-rep=REP

Specify the REP directory for the trace output files.

This option can be activated with the env variable VERROU_OUTPUT_TRACE_REP.

12.6.1.5. Detection

--check-nan=<yes|no> [default=yes]: Activate NaN detection. NaN produces a valgrind error. This functionality requires the verrou backend.
--check-inf=<yes|no> [default=yes]: Activate Inf detection. +/-Inf produces a valgrind error. This functionality requires the verrou backend.
--check-cancellation=<yes|no> [default=no]: Activate cancellation detection. Cancellation produces a valgrind error. This functionality is available for the verrou, mcaquad and checkdenorm backends. The level of detected cancellations can be configured with --cc-threshold-float and --cc-threshold-double.
--cc-gen-file=<FILE>: Generate in FILE with the source format for each code source line which produces at least one cancellation. This functionality is available for verrou, mcaquad and checkdenorm backends. The level of detected cancellations can be configured with --cc-threshold-float and --cc-threshold-double.
--cc-threshold-float=<integer> [default=24]: Configure the cancellation detection threshold for float operations. Default value is still experimental and could have to change.
--cc-threshold-double=<integer> [default=40]: Configure the cancellation detection threshold for double operations. Default value is still experimental and could have to change.
--check-denorm=<yes|no> [default=no]: Activate denormal number detection. Denormal number produced by floating point operation produces a valgrind error. This functionality is available for the checkdenorm backend.
--cd-gen-file=<FILE>: Generate in FILE with the source format for each code source line which produces at least one denormal number. This functionality is available for checkdenorm backend.
--check-max-float=<yes|no> [default=no]: Activate max float detection. This functionality is compatible only with verrou backend.

12.6.1.6. Performance optimization

--vr-unsafe-llo-optim=<yes|no> [default=no]: Activate faster instrumentation process but unsafe when binary mixes llo and vect instructions.

12.6.2. Client requests

Verrou provides the following client requests in the valgrind/verrou.h header.

VERROU_DISPLAY_COUNTERS: Display the current instructions counters.
VERROU_START_INSTRUMENTATION: Start Verrou hard instrumentation (including rounding mode switching) if not already enabled.
VERROU_STOP_INSTRUMENTATION: Stop Verrou hard instrumentation (don't switch rounding modes) if not already disabled.
VERROU_START_SOFT_INSTRUMENTATION: Start full Verrou soft instrumentation (including rounding mode switching) if not already enabled.
VERROU_STOP_SOFT_INSTRUMENTATION: Stop full Verrou soft instrumentation (don't switch rounding modes) if not already disabled.
VERROU_START_DETERMINISTIC(LEVEL): Start a deterministic section, i.e. one in which floating point operations are perturbed, but in a deterministic way.
VERROU_STOP_DETERMINISTIC(LEVEL): Stop a deterministic section, i.e. resume rounding mode switching in a (pseudo-)random way.
VERROU_DUMP_COVER: If --trace option is activated, the client request generates a partial cover and return the index of the cover.
VERROU_PRANDOM_UPDATE: The p-value of prandom[|_det|_comdet] rounding mode is set to a new random value between 0 and 1.
VERROU_PRANDOM_UPDATE_VALUE(floatStr): The p-value of prandom[|_det|_comdet] rounding mode is set to floatStr.

12.6.3. IOMatch script

The IOMatch script is activated thanks the option --IOmatch-clr=IOMATCH_FILE. The comment symbol is #.

The file can contain the configuration keys :

verbose: LEVEL: Configure the verbosity level (default 1)
ignore-empty-line:: If this key is present, the empty lines are ignored.
filter_line_exec: CMD: Activate the line replacement with CMD.
dump-stdout: [FILENAME]: Dump the stdout file. If FILENAME is not set the default name is IOMATCH_FILE.stdout-PID or IOMATCH_OUTPUT_REP/iomatch.stdout-PID (if option --output-IOmatch-rep=IOMATCH_OUTPUT_REP is used). Remark: FILENAME absolute path and --output-IOmatch-rep= are incompatible.
dump-filtered-stdout: [FILENAME]: Dump the stdout file after application of filter_line_exec CMD to each line. If FILENAME is not set the default name is IOMATCH_FILE.filtered.stdout-PID (or IOMATCH_OUTPUT_REP/iomatch.filtered.stdout-PID if option --output-IOmatch-rep=IOMATCH_OUTPUT_REP is used). Remark: FILENAME absolute path and --output-IOmatch-rep= are incompatible.

We can also set up action for each section.

default: ACTION: Add ACTION to the default section.
init: ACTION: Add ACTION to the init section. This action will be executed during the verrou initialization.
post-init: ACTION: Add ACTION to the post-init section. This action will be executed when the first stdout line appears.

Lastly match section with associated actions can be defined:

bmatch: PATTERN: Setup a break match section. PATTERN is a wildcard expression with * and ?. The first space after : is a part of the format. Every space after this one is meaningful. The order of match section is important. If the first match is *. All following match section are ignored. The action associated to the section are defined in the following lines by apply: and post-apply:. A break match (in contrast to continue match) mean that after a match the following match (continue or break) are ignored.
cmatch: PATTERN: Setup a continue match section. PATTERN is a wildcard expression with * and ?. In contrast to break match, if a line match the following match are tried. It is not possible to define post-apply action with cmatch:.
apply: ACTION: Add the ACTION to current match section. This action will be applied to the beginning of the match section. If no action is set for a match section, the default group of action will be executed. To avoid the default action, you can use a nop action.
post-apply: ACTION: Add the ACTION to current match section. This action will be applied to the end of the match section. If no post-action is defined, nothing is done (ie. no default action).

The different possible actions are :

start: Start instrumentation.
stop: Stop instrumentation.
start_soft: Start soft instrumentation.
stop_soft: Stop soft instrumentation.
nop: do nothing (useful to avoid default action)
display_counter: display counters
nb_instr: Display the number of instrumented floating point operation in the log file.
reset_counter: Set all floating point counters to zero.
dump_cover: Apply DUMP_COVER client request.
panic: Stop program with panic.
exit: Stop program with exit.
default,init,post-init: Apply the same group of actions defined for the section default, init and post-init.

12.6.4. Monitor commands

See Debugging your program using Valgrind's gdbserver and GDB to get more information about the Valgrind gdbserver and monitor commands. Below is a list of specific monitor commands provided by Verrou:

count: Display the current instructions counters.
instrumentation [on|off]: Set the current instrumentation state (or print it if no on / off parameter is given).