Sunday, 30 June 2019

Fuzzing {fmt}

Earlier this year, I was mowing the lawn and listening to CppCast. Both are weekly activities for me, a fortunate coincidence. The guest was Victor Zverovich, interviewed about {fmt}, a formatting library in the process of being standardized.

I decided to see if fuzzing would catch anything, and the first bug turned up very quickly. It was fixed almost instantly, and I went on to find more. This matches my experience from fuzzing other projects: bugs tend to be found within the first minutes, or never.


I asked Victor if he would be interested in getting fuzzing into {fmt}, and eventually onto OSS-Fuzz. He was positive about both, so I started on my plan:
  1. get basic fuzzing running locally
  2. smoke out the initial bugs
  3. get the initial fuzzers onto OSS-Fuzz
  4. get the fuzzers merged
  5. repoint OSS-Fuzz to the upstream repo
This plan is nice because it minimizes coordination. I don't have to worry about getting the fuzzers up to merge quality right away, and there is no waiting for pull requests to be approved. Any issues are reported to me, so I can fix them without bothering anyone. Eventually, when the fuzzers are mature, they can be polished enough to be accepted into the upstream repo. I was open with Victor about my plan, and I think it worked very well.
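For readers who have not written a fuzzer before, step 1 of the plan usually boils down to one small entry point. The sketch below shows the general shape of a libFuzzer harness, not the actual {fmt} fuzzers. The function format_under_test is a hypothetical stand-in for the library call being fuzzed, and splitting the input into an int plus a string is just one assumed way of carving arguments out of the raw bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <exception>
#include <string>

// Hypothetical stand-in for the code under test. In a real harness this
// would be a call into the library being fuzzed.
std::string format_under_test(const std::string& format_str, int value) {
  return format_str + std::to_string(value);
}

// libFuzzer calls this entry point over and over with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
  int value = 0;
  if (size < sizeof(value)) {
    return 0;  // too short to carve out an argument
  }
  // The first bytes become the argument to format.
  std::memcpy(&value, data, sizeof(value));

  // Copy the remainder into an owning string, so the target never reads
  // the fuzzer's buffer directly.
  const std::string format_str(
      reinterpret_cast<const char*>(data) + sizeof(value),
      size - sizeof(value));

  try {
    (void)format_under_test(format_str, value);
  } catch (const std::exception&) {
    // A reported error is acceptable. Crashes and UB are what we hunt for.
  }
  return 0;
}
```

Assuming a reasonably recent clang, something like `clang++ -g -fsanitize=fuzzer,address,undefined harness.cpp` builds this into a fuzzer binary, where harness.cpp is a placeholder file name.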

An unexpected hurdle - std::chrono::duration_cast

{fmt} is able to format std::chrono::duration values. This turned out to be quite difficult to do correctly. I learned during this process that std::chrono::duration_cast gives no guarantees if its internals hit, for instance, signed integer overflow or other undefined behavior (UB). The fuzzer revealed a few cases of this, and there were several places in the {fmt} internals where it could creep in. I discussed this with Victor, and we agreed that {fmt} should either give the correct answer or signal an error, never UB or a wrong answer. Eventually I made a separate library providing a UB-free duration_cast. I used fuzzing to find the corner cases, and a tip from my C++ user group made me use cfenv for the first time, so I could catch the interesting cases and handle them.
This detour took a while, but eventually a condensed version ended up in {fmt}. I found it very interesting that the performance overhead turned out to be zero! I guess the compiler and CPU team up to run the extra checks in parallel.

With this in place, I could finally enable the chrono fuzzer on OSS-Fuzz.

Summing up

Overall this has been very fun and rewarding. Victor has been very responsive and helpful. I will definitely follow up with more fuzzing work! For instance, there seems to be a parser on its way in. Also, I think it is wise to follow the OSS-Fuzz guidelines of running through the corpus as a part of continuous integration.
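Running through the corpus in continuous integration can be as simple as feeding every saved input to the fuzz target once, without any mutation. The sketch below is one assumed way to do that in C++17. The helper replay_corpus and the FuzzTarget alias are hypothetical names, and the target is passed in as a function pointer with the same signature as a libFuzzer entry point.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <vector>

// Same signature as a libFuzzer entry point.
using FuzzTarget = int (*)(const std::uint8_t* data, std::size_t size);

// Hypothetical helper: feeds every regular file in corpus_dir to the target
// once and returns how many inputs were replayed.
std::size_t replay_corpus(const std::filesystem::path& corpus_dir,
                          FuzzTarget target) {
  std::size_t replayed = 0;
  for (const auto& entry : std::filesystem::directory_iterator(corpus_dir)) {
    if (!entry.is_regular_file()) {
      continue;  // skip subdirectories and other non-file entries
    }
    // Read the whole file as raw bytes.
    std::ifstream in(entry.path(), std::ios::binary);
    const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                  std::istreambuf_iterator<char>());
    target(reinterpret_cast<const std::uint8_t*>(bytes.data()), bytes.size());
    ++replayed;
  }
  return replayed;
}
```

Built with sanitizers enabled, a small test program that calls replay_corpus on the checked-in corpus would fail the CI run as soon as an old bug comes back.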
The good thing with OSS-Fuzz is that it will keep running without me or anyone else paying attention. I think that is the greatest benefit, more so than the CPU time. All in all I found seven bugs, and contributed fixes to get chrono formatting UB-free. 10/10 would do it again!