(DAEDALUS)

The Trier Data-Race Analyzer

The goal of our tool is to enhance reliability of multi-threaded C code by using program analyzer technology for obtaining sanity checks or even certificates for absence of certain programming errors. This type of application does not demand analyzers which run in a few seconds --- but flag thousands of unnecessary warnings which later-on must be checked manually by highly paid software engineers. We are clearly willing to spend some minutes analysis time on larger programs --- given only that the number of spurious errors is dramatically decreased. Therefore, we aim at a good balance between precision and analysis time.

Accordingly, the design goals for the analyzer are:

The analysis of multi-threaded programs has been considered as notoriously difficult and expensive. In fact, precise analyses are known for some restricted classes of parallel programs but for very simple program properties only. In order to arrive at the necessary precision for a non-trivial fragment of C, however, we have, e.g., to resolve function pointers and integrate some form of points-to analysis. Also, we have to take into account the possible interference between the execution of different threads.

The key observation for our inter-procedural C analyzer is that Posix threads communicate through global variables. In order to separate the analysis of the different threads, we attempt to infer for each global variable one single value which safely approximates all possible states of the global variable. This single invariant for the globals then is used for analyzing each thread individually. The separation is particularly successful in applications where the threads are only loosely coupled, i.e., where the control-flow mainly depends on the values of locals.

Consider, e.g., the control-flow edge:

(CFG edge)

The desired result of applying the edge's transfer function is shown right. Traversing the edge conceptually has two effects: first, the local state of the edge's target node receives the local variable assignment from the edge's source node, and second, the global z receives the value of the expression x+y. Conceptually, our tool is based on a very efficient solver for computing safe approximations to all possibly occurring side effects onto globals.

Our analyzer was tested on preliminary versions of a large (non-safety critical) on-board program provided by AIRBUS FRANCE. The whole system consists of seven components ranging in size from 23,000 to 80,000 LOC (before pre-processing and excluding header files). The analysis times for these components varied from a few minutes to less than half an hour. The numbers of flagged potential data-race errors were small enough to be manually inspected by humans. Since our analyses compute safe supersets of data-race errors, these experiments clearly indicate the high quality of the analyzed software. They also demonstrate that the global invariant approach is sufficiently efficient to deal with real-world software components and still precise enough to flag relatively few alarms.

A tarred and gzipped executable running under Linux is available for DOWNLOAD.



Helmut Seidl - DAEDALUS - Aug-13-2002