Fuzz testing to maintain LibreOffice code quality

By Hossein Nourikhah On 29 July 2024 In Tutorial

Here I discuss what fuzz testing is, and how LibreOffice developers use it incrementally to maintain LibreOffice code quality.

Maintaining Code Quality

LibreOffice developers use a range of methods and tools to maintain LibreOffice code quality. These are some of them:

1. Code review: Every patch from contributors should pass code review on Gerrit, and after conforming to coding standards and conventions, it can become part of the LibreOffice source code.

2. Static code checking: “Coverity Scan” continuously scans the LibreOffice source code to find possible defects. An automated script reports these issues to the LibreOffice developers mailing list so that developers can fix them.

3. Continuous Testing: There are various C++ unit tests and Python UI tests in the LibreOffice core source code to make sure that the software's functionality keeps working after later changes. They also help to make sure that fixed regressions do not happen again. These tests run continuously for every Gerrit submission on CI machines via Jenkins.

4. Crash testing: A good way to make sure that LibreOffice works well is to batch open and convert a huge set of documents. This task is done regularly, and if a failure occurs, developers are informed so that they can fix the issue.

5. Crash reporting: LibreOffice uses crash reports to find out about recurrent crashes, and fix them.

6. Tinderbox Platforms: Using dedicated machines with different architectures, LibreOffice developers make sure that the LibreOffice source code builds and runs without problems on different platforms. Here is the description of tinderbox (TB) from the TDF Wiki:

Tinderbox is a script to run un-attended build on multiple repos, for multiple branches and for gerrit patch review system.

LibreOffice tinderboxes status

You can see the build status here:

https://tinderbox.libreoffice.org/

7. Fuzz testing: LibreOffice is checked continuously using fuzz testing. This essentially means feeding many automatically generated inputs to the program to find the places in the code where problems occur. Developers then become aware of those problematic places in the code, and can fix them.
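
To make the idea concrete, here is a minimal sketch of a fuzzing loop in C++. It is not how LibreOffice is actually fuzzed: the function under test (parseRecord), the driver (fuzzParseRecord) and the input format are all hypothetical and exist only for illustration. The sketch generates pseudo-random byte buffers with a fixed seed and feeds them to a small parser that has to reject malformed input without reading out of bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical function under test: reads a one-byte length prefix and
// sums that many payload bytes. It must reject truncated input instead
// of reading past the end of the buffer.
bool parseRecord(const std::vector<std::uint8_t>& data, std::uint32_t& sum)
{
    sum = 0;
    if (data.empty())
        return false;
    const std::size_t length = data[0];
    if (length > data.size() - 1)
        return false; // declared length exceeds the available bytes
    for (std::size_t i = 0; i < length; ++i)
        sum += data[1 + i];
    return true;
}

// Hypothetical driver: feeds `iterations` pseudo-random buffers to
// parseRecord and returns how many were accepted. A fixed seed keeps
// any failure reproducible.
std::size_t fuzzParseRecord(std::size_t iterations, std::uint32_t seed)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> sizeDist(0, 64);
    std::uniform_int_distribution<int> byteDist(0, 255);
    std::size_t accepted = 0;
    for (std::size_t i = 0; i < iterations; ++i)
    {
        std::vector<std::uint8_t> input(static_cast<std::size_t>(sizeDist(rng)));
        for (auto& byte : input)
            byte = static_cast<std::uint8_t>(byteDist(rng));
        std::uint32_t sum = 0;
        if (parseRecord(input, sum))
            ++accepted;
    }
    return accepted;
}
```

Calling fuzzParseRecord(100000, 42) from a test program built with sanitizers enabled would, under these assumptions, report any out-of-bounds read in parseRecord as a failure. Production fuzzers are far more sophisticated, for example by using code coverage to guide which inputs to try next, but the principle is the same.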

Fuzz Testing LibreOffice

Fuzz testing of the LibreOffice source code has been active since 2017, and since then there have been many bug fixes for problems that the fuzzer reported. You can see more than 1500 such fixes in the git log so far:

$ git shortlog -s -n --grep=ofz#

Issues Found with Fuzz Testing

Fuzz testing can find many kinds of problems. These issues are filed in a section of the Chromium bug tracker, and after ~30 days, they are made public. When developers fix bugs of this kind, they refer to the issue number (for example 321) as ofz#321. A comprehensive list of all issues found is visible here:

Chromium Bug Tracker – LibreOffice Issues

Fixing the Issues

Let’s look at one of the fixes. You can find commits related to fuzzing with:

$ git log --grep=ofz

This is a recent fix from Caolán, an experienced LibreOffice developer who has provided most of the fixes for issues found through oss-fuzz:

commit d30ecb5fb07f005ebd944e864f0a15678289a4ed
    ofz#69809 Integer-overflow

--- a/filter/source/graphicfilter/icgm/cgm.cxx
+++ b/filter/source/graphicfilter/icgm/cgm.cxx
@@ -227,7 +227,7 @@ double CGM::ImplGetFloat( RealPrecision eRealPrecision, sal_uInt32 nRealSize )
         else
         {
             sal_Int32* pLong = static_cast<sal_Int32*>(pPtr);
-            nRetValue = static_cast<double>(abs( pLong[ nSwitch ] ));
+            nRetValue = fabs(static_cast<double>(pLong[nSwitch]));
             nRetValue *= 65536;
             nVal = static_cast( pLong[ nSwitch ^ 1 ] );
             nVal >>= 16;

As you can see, the old code called abs() first and then cast the result to double, while the new code casts to double first and then calls fabs(). The reason for this change lies in the data types of the variables involved.

pLong is an array of sal_Int32, a 32-bit signed integer type that can hold values from -2,147,483,648 to 2,147,483,647. The magnitude of the smallest value, 2,147,483,648, is one more than the largest positive value, so when abs() is applied to that input, the result cannot be stored in a 32-bit signed integer.

Because the result is stored in nRetValue, a variable of type double, the array item can be cast to double first and then passed to fabs(), the floating-point version of the absolute value function. A double can represent 2,147,483,648 exactly, so the “integer overflow” no longer happens.
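
The arithmetic can be checked with a short, standalone C++ sketch. The helper names (safeMagnitude, integerAbsWouldOverflow, checkExamples) are hypothetical and not part of LibreOffice, and std::int32_t stands in for sal_Int32 on the assumption that both are 32-bit signed integers. The sketch deliberately never calls the integer abs() on the smallest value, because that is exactly the overflow the fuzzer reported.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper mirroring the fixed line: widen to double first,
// then take the floating-point absolute value.
double safeMagnitude(std::int32_t value)
{
    return std::fabs(static_cast<double>(value));
}

// True when -value cannot be represented as a 32-bit signed integer,
// which is the one input for which the integer abs() overflows.
bool integerAbsWouldOverflow(std::int32_t value)
{
    return value == std::numeric_limits<std::int32_t>::min();
}

// Small self-check of the two helpers above.
void checkExamples()
{
    const std::int32_t smallest = std::numeric_limits<std::int32_t>::min();
    assert(integerAbsWouldOverflow(smallest));
    assert(!integerAbsWouldOverflow(-2147483647));
    // 2,147,483,648 does not fit in sal_Int32 but is exact as a double.
    assert(safeMagnitude(smallest) == 2147483648.0);
    assert(safeMagnitude(-5) == 5.0);
}
```

For every other input the old and the new code give the same result. Only the single value -2,147,483,648 behaves differently, which is why a problem like this is easy to miss in manual testing and well suited to being found by a fuzzer.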

This patch was one of the smallest examples of what a fix can be. There are many bugs that are more complex, and require more careful examination to provide a fix.