Generating Pony Coverage Reports

2018/06/08
[Image: Main, the Pony mascot, galloping]

Pony is my favorite programming language. I'm going to write another post about why it is better than everything else, promise, but this time I'm telling you how to extend your Pony tack so you can ride it even better.

The additional trick I want to teach you is generating coverage reports for your test runs.

The Pony compiler is implemented in C and uses LLVM to compile down to machine code. As far as I know, there is no easy way to track code coverage with the current compiler, be it line-based or instruction-based. But debug builds of Pony programs (compiled with the --debug flag) do know which part of the code sits on which source line, and we already use that in the debugger to get proper backtraces when segfaults happen or assertions are hit in the compiler. So why not leverage that?
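
For illustration, a debug build of a project's test package could look roughly like this — just a sketch, assuming your tests live in the package directory my_proj/test and you want the binary in bin (adjust to your layout):

mkdir -p bin
ponyc --debug -o bin my_proj/test
# the binary is named after the package directory, so this should give you bin/test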

I just recently found kcov, which extracts coverage information from compiled programs using DWARF debugging information. This sounded like black magic to me, but that is exactly the information we keep in debug builds. kcov only supports ELF and Mach-O binaries, so it should work on both Linux and macOS, but I only tested it on Linux.

I went the hard way and cloned and compiled kcov on my machine because using the kcov Docker image didn't work out for me. Detailed instructions can be found here.
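
If the Docker image doesn't work for you either, the build is a fairly standard CMake affair — roughly the following sketch, with the exact dependencies listed in kcov's own README:

git clone https://github.com/SimonKagstrom/kcov.git
cd kcov
mkdir build && cd build
cmake ..
make
sudo make install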

Then all I had to do was create a debug build of my Pony program and run it with kcov like so:

mkdir out
kcov --include-pattern="$PWD/my_proj" --exclude-pattern="$PWD/my_proj/test,$PWD/my_proj/_test.pony" ./out bin/test
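
kcov drops an HTML report into ./out; the entry point should be an index.html, typically in a subdirectory named after the binary (the exact path may differ between kcov versions), so something like this opens it in a browser:

xdg-open out/test/index.html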

It is important to only include the Pony files you would like to have coverage for and to exclude the test files; otherwise your nice little coverage percentage is screwed and nobody is going to believe a word you say, ever!

I am not 100% sure why yet, but from time to time I had to re-run the command above multiple times before my tests actually ran and a report came out of it. If I find out why, I'll tell you, second promise.
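
Until then, a dumb retry loop is one possible workaround — just a sketch, and the check for ./out/test/index.html is an assumption about where the report for your binary ends up:

for i in 1 2 3 4 5; do
  kcov --include-pattern="$PWD/my_proj" --exclude-pattern="$PWD/my_proj/test,$PWD/my_proj/_test.pony" ./out bin/test
  [ -f ./out/test/index.html ] && break
done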

How to read your Pony Coverage Reports

This is pretty straightforward: covered lines are marked green, uncovered lines are marked red. Easy!

But then something jumped out at me: some lines in the output are not colored at all. For code that is present in the binary but not covered, I would expect the corresponding lines to be colored red. But all I get is: nothing:

[Image: unmarked section in an example coverage report]

Let's revisit some interesting parts of the Pony compiler to understand what's going on here.

During the compilation process, after lexing, parsing, de-sugaring, type-checking and all that, there is an interesting compiler phase called Reachability Analysis. It takes your program's Main actor and tries to find all the code it needs to run, getting rid of unused code. We don't need that in our slick Pony binary. So some parts of your source code don't even make it into the binary you are producing.

So if you spot unmarked lines, that is because the compiler got rid of them during compilation. Not only are they not covered, Pony didn't even consider them worth being part of your binary. Go test them, now!

Last words of warning regarding Coverage

What follows are the words of an old hermit who has seen enough for a lifetime.

In general, a high coverage percentage does not mean a thing. You can reach it without validating the right properties of your system under test at all, leaving a broken system that you think works perfectly until the alerting system wakes you up on a Sunday night. What you can definitely use coverage reports for is finding blind spots: uncovered code.

If you really want to make use of what actually is covered, limit your scope, e.g. by setting a coverage goal for executing a single test only. Say I am writing a test for getting the head of a list. The coverage report for executing this test alone (using --only=<test-name> in ponytest) should show me coverage of both the empty-list case and the case of a list with a value. If it doesn't, I am doing something wrong in my test. The smaller the scope, the more meaningful the coverage report becomes.
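
With kcov that just means passing the filter on to the test binary and writing the scoped report to its own output directory — a sketch, where ListHeadTest is a made-up test name and ./out_head a hypothetical separate output directory:

kcov --include-pattern="$PWD/my_proj" --exclude-pattern="$PWD/my_proj/test,$PWD/my_proj/_test.pony" ./out_head bin/test --only=ListHeadTest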

Have fun and use your coverage reports responsibly!