Thursday, January 28, 2010

Introducing Fatgrind

Fatgrind is a new valgrind plugin I've been working on recently. It lets you track the numerical accuracy of your code without heavy instrumentation or a recompile: you take a binary compiled with gcc -g and get a list of the source-level instructions that cause the most numerical error in your computations. Unlike other tools with a similar purpose, it is based not on static code analysis (static analysis uses a model that always over-estimates the errors, which makes it of little use in real situations) but on runtime analysis. In particular, this allows catching data-dependent issues, which is the reason I wrote it in the first place.

It works as follows:
  • First, thanks to valgrind, the code is analyzed and all floating point instructions are found and instrumented.
  • Each floating point instruction is doubled: it still gets executed, but a high-precision version (using GMP, the GNU MP Bignum library for high-precision computation) is executed alongside it.
  • At runtime, the result of each floating point computation is compared with its high-precision counterpart, and the resulting error (the difference between the floating point result and the high-precision result) is computed.
  • The error of each operation is added to the running total of the instruction that performed it.
At the end, what you get is an annotated listing of code with the total amount of error that each instruction is responsible for. In short, you know what parts of your code are to blame for those pesky floating point errors.
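
To make the steps above concrete, here is a toy sketch in Python. It is not Fatgrind's code, just the same idea in miniature. It assumes exact rationals (fractions.Fraction) as a stand-in for GMP, assumes the high-precision value is carried along from one operation to the next, and uses hand-written labels in place of the source lines valgrind would recover from the debug info. The names Shadowed, add, mul, report and demo are made up for the illustration.

```python
from collections import defaultdict
from fractions import Fraction


class Shadowed:
    """A float paired with a high-precision (exact rational) shadow value."""

    def __init__(self, value, shadow=None):
        self.value = float(value)
        # Fraction(float) converts the binary value exactly, with no rounding.
        self.shadow = Fraction(self.value) if shadow is None else shadow


# Total error charged to each "instruction" (here: a hand-written label).
errors = defaultdict(Fraction)


def _record(label, value, shadow):
    # Error = |floating point result - high-precision result|.
    errors[label] += abs(Fraction(value) - shadow)
    return Shadowed(value, shadow)


def add(label, a, b):
    # The float operation and its exact twin run side by side.
    return _record(label, a.value + b.value, a.shadow + b.shadow)


def mul(label, a, b):
    return _record(label, a.value * b.value, a.shadow * b.shadow)


def report():
    """Instructions sorted by accumulated error, worst first."""
    return sorted(errors.items(), key=lambda kv: kv[1], reverse=True)


def demo():
    errors.clear()
    x = Shadowed(0.1)
    # Multiplying by 0.5 is exact in binary, so this line gets no blame.
    mul("half = x * 0.5", x, Shadowed(0.5))
    acc = Shadowed(0.0)
    for _ in range(10):
        # Repeated addition of 0.1 rounds, so error piles up on this line.
        acc = add("acc += x", acc, x)
    return acc
```

Calling demo() and then report() puts "acc += x" at the top of the list with a small non-zero error, and "half = x * 0.5" at the bottom with an error of exactly zero.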

Ain't life great?

Not yet, as many things are still pending:
  • First and foremost, valgrind plugins cannot use libc functions, but GMP does, so I have to use a hacked GMP version which calls the valgrind internal functions instead. Not so clean. I need to take the relevant GMP files and have them compiled using the valgrind build system.
  • Second, I need to support more floating point instructions. I couldn't find any program that uses those extra instructions on x86/x86_64, but that doesn't mean these instructions won't show up. If computers teach you one thing, it's that shit happens. All the time.
  • Finally, I want to release this code and hopefully get it upstream into valgrind. We'll see how that goes.
