Crash fixes, part 2: abort
One category of bugs that we see in computer programs, including LibreOffice, is unexpected crashes. You’re working with the application, and it suddenly closes! In the previous part, I discussed crashes that are caused by segmentation faults. In this article, I discuss crashes that come from invoking the abort() function. Please note that an abort is not always a bad thing, or a bug.
Abort
In C/C++ we sometimes have to check the validity of certain conditions and avoid certain bad situations. We do that to avoid data corruption or other problems, and a failed check can lead us to terminate the application instead of trying to continue.
This is different from normal error handling routines. In error handling, we usually try to detect errors and then either find a way out of the problem automatically or ask the user to fix it, for example by changing the input. But this is not always possible.
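To make the difference concrete, here is a minimal sketch in C++. It is not taken from the LibreOffice code base; the names parsePercentage and checkInvariant are made up for illustration. The first function handles a recoverable error by reporting it to the caller, while the second one gives up when an internal condition does not hold.

```cpp
#include <cstdio>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical example of normal error handling: bad input is reported
// to the caller, who can ask the user to try again.
std::optional<int> parsePercentage(const std::string& text)
{
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0' || value < 0 || value > 100)
        return std::nullopt; // recoverable: nothing is terminated
    return static_cast<int>(value);
}

// Hypothetical example of an unrecoverable check: if the condition does
// not hold, continuing could corrupt data, so the program is terminated.
void checkInvariant(bool ok, const char* what)
{
    if (!ok)
    {
        std::fprintf(stderr, "fatal: %s, aborting.\n", what);
        std::abort(); // raises SIGABRT; destructors and atexit handlers do not run
    }
}
```

With these helpers, parsePercentage("abc") simply returns an empty optional, while checkInvariant(false, "...") ends the process immediately.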
Better to Be Safe Than Sorry!
If you search for abort in the .cxx files, you will find a lot of abort() usages:
git grep "abort(" -- "*.cxx"
Let’s look at some examples. When writing tests, one of the problems that may occur while testing font rendering is font fallback, which happens when a test uses fonts that are not available via LibreOffice. In this situation, it is better to stop the test application, so that the developers understand that they should use fonts that are available via LibreOffice. You can look at vcl/unx/generic/fontmanager/fontconfig.cxx for more information.
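The font fallback case can be sketched roughly as follows. This is not the code from fontconfig.cxx; the function names and the inTestRun flag are assumptions made only to illustrate the policy of stopping a test run when the requested font is silently replaced.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: true when the font that was found differs from
// the one that was requested.
bool isFontFallback(const std::string& requested, const std::string& resolved)
{
    return requested != resolved;
}

// Hypothetical policy: a fallback is tolerated in normal use, but it is
// fatal in a test run, because the test results would be unreliable.
void reportResolvedFont(const std::string& requested, const std::string& resolved,
                        bool inTestRun)
{
    if (!isFontFallback(requested, resolved))
        return;
    std::fprintf(stderr, "font '%s' was replaced by '%s'\n", requested.c_str(),
                 resolved.c_str());
    if (inTestRun)
        std::abort(); // stop the test so the missing font is noticed immediately
}
```

The point of this design is that the test fails loudly at the place where the wrong font is picked, instead of producing confusing rendering differences later.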
Another example is a problem with the GPU. If you look at vcl/skia/gdiimpl.cxx, you can see the code below. The comments explain the situation. When the GPU context runs out of memory, the error is “unrecoverable without possible data loss”. So the best thing that we can do is to abort the program.
// If there's a problem with the GPU context, abort.
if (GrDirectContext* context = GrAsDirectContext(mSurface->getCanvas()->recordingContext()))
{
    // Running out of memory on the GPU technically could be possibly recoverable,
    // but we don't know the exact status of the surface (and what has or has not been drawn to it),
    // so in practice this is unrecoverable without possible data loss.
    if (context->oomed())
    {
        SAL_WARN("vcl.skia", "GPU context has run out of memory, aborting.");
        abort();
    }
    // Unrecoverable problem.
    if (context->abandoned())
    {
        SAL_WARN("vcl.skia", "GPU context has been abandoned, aborting.");
        abort();
    }
}
Generally speaking, the developer should try to recover from errors, but in the end, some unrecoverable errors may remain, and the developer might decide to call abort().
Fixing Bugs Related to abort
On the other hand, sometimes it is possible to avoid calling abort() and recover from an error. In such a case, fixing the crash involves finding a way out of the problem and continuing the normal execution of the program.
For example, take a look at tdf#138022 – LibreOffice exits/crashes when minimizing start center after closing a document (SKIA). This issue is fixed in this patch:
commit 42e30c24615402c49351f80cc8a47d61d47267c6
Author: Jan-Marek Glogowski
Date: Mon Nov 16 22:43:51 2020 +0100

    tdf#138022 Skia don't recreate empty surfaces

    Skia can't create empty surfaces, so the recreation will hit the
    std::abort() in SkiaSalGraphicsImpl::createWindowSurface. Origin
    of the backtrace is some queued Resize event, which will hit this
    a few times via SkiaSalGraphicsImpl::checkSurface. This feels a
    bit like tdf#130831, where VCL tried to track damange for an
    empty Qt image...
The idea behind the fix is that Skia cannot create empty surfaces, so when resize events are queued, re-creating the surface leads to a std::abort(). To avoid this, Jan-Marek changed the code: LibreOffice no longer re-creates the surface when both the width and height of the surface are zero. Other than that, instead of using recreateSurface(), it now calls destroySurface() and then createSurface().
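The shape of such a fix can be illustrated with a small, self-contained sketch. This is not the actual patch and it does not use the Skia or VCL APIs; Surface, SurfaceOwner, resize() and hasSurface() are hypothetical names. Only the idea is taken from the description above: destroy the old surface, and skip creation when both dimensions are zero, so that the abort() inside the creation routine is never reached for an empty surface.

```cpp
#include <cstdio>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a drawing surface; not a Skia or LibreOffice type.
struct Surface
{
    int width;
    int height;
};

class SurfaceOwner
{
public:
    // Called for each queued resize event in this sketch.
    void resize(int width, int height)
    {
        destroySurface();
        // Guard: an empty surface cannot be created, so skip creation
        // instead of running into the abort() below.
        if (width == 0 && height == 0)
            return;
        createSurface(width, height);
    }

    bool hasSurface() const { return mSurface != nullptr; }

private:
    void destroySurface() { mSurface.reset(); }

    void createSurface(int width, int height)
    {
        if (width <= 0 || height <= 0)
        {
            std::fprintf(stderr, "cannot create an empty surface, aborting.\n");
            std::abort();
        }
        mSurface = std::make_unique<Surface>(Surface{ width, height });
    }

    std::unique_ptr<Surface> mSurface;
};
```

In this sketch, a resize to 0x0 leaves the owner without a surface, and a later resize to a real size creates a fresh one. The check that aborts is still there, but the normal code path no longer triggers it.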
Final Words on Fixing a Crash
We have a meta bug dedicated to crash bugs in Bugzilla. All the bugs related to crashes are listed there:
tdf#133092 (Crash) – [META] Crash bugs
Other than that, we have EasyHacks related to crashes. Some of them are discussed in our previous post, “Crashes that you can fix! – EasyHack”. You can try fixing them!