Crash fixes, part 1: segfaults

One of the bugs that we see in computer programs, including LibreOffice, is the crash: you are working with the application, and it suddenly closes. Here we discuss the usual causes of these crashes, and how to fix some of them.

Crash Report

There can be many reasons for a crash in C/C++, including segmentation faults (segfaults), assertion failures, aborts, and exceptions that lead to an abort. This part discusses segmentation faults.

Usual Causes for the Segmentation Fault Crashes

There can be many reasons for a segfault in C/C++, and most of them are related to the incorrect use of memory. For example:

1. Accessing memory that the program does not own
2. Using memory outside the allocated parts
3. Using uninitialized variables

If you use a variable without properly initializing it, that may lead to a segfault. For example:

int main() {
    int *p;
    *p = 1;
    return 0;
}

Trying to dereference a null pointer also leads to a crash:

int main() {
    int *p = nullptr;
    *p = 1;
    return 0;
}

$ g++ main.cpp -o main; ./main
Segmentation fault (core dumped)

In C/C++, most compilers do not check array bounds automatically. When working with arrays, you have to make sure that every access stays inside the array bounds. Look at this example:

int main() {
    int a[10];
    a[1000000] = 1;
    return 0;
}
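As a minimal sketch of how such an access can be guarded, the two helpers below refuse to write outside a fixed-size array. The names checked_store and checked_store_at are hypothetical and only serve this illustration; the first one compares the index against size(), and the second one relies on std::array::at(), which throws std::out_of_range for an invalid index.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Stores `value` at `index` only if the index is inside the array.
// Returns false (and writes nothing) for an out-of-range index.
bool checked_store(std::array<int, 10>& a, std::size_t index, int value)
{
    if (index >= a.size())
        return false;
    a[index] = value;
    return true;
}

// Same idea with std::array::at(), which throws std::out_of_range
// instead of writing outside the array.
bool checked_store_at(std::array<int, 10>& a, std::size_t index, int value)
{
    try
    {
        a.at(index) = value;
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
}
```

With either helper, an index like 1000000 is rejected, and no memory outside the array is touched.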

When working with pointers, it is important to make sure that the pointer is not null before working with it. For example, look at this file:

sc/source/ui/view/tabview.cxx

Inside it, look at this method:

bool lcl_HasRowOutline( const ScViewData& rViewData )
{
    const ScOutlineTable* pTable = rViewData.GetDocument().GetOutlineTable(rViewData.GetTabNo());
    if (pTable)
    {
        const ScOutlineArray& rArray = pTable->GetRowArray();
        if ( rArray.GetDepth() > 0 )
            return true;
    }
    return false;
}

Here, before working with pTable, the condition "if (pTable)" makes sure that the pTable pointer is not null.

The same applies to pointers that are not declared locally. Look at this function inside the same file:

IMPL_LINK_NOARG(ScTabView, TimerHdl, Timer *, void)
{
    if (pTimerWindow)
        pTimerWindow->MouseMove( aTimerMEvt );
}

The condition makes sure that pTimerWindow is not null.
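To try the same pattern outside LibreOffice, here is a small self-contained sketch. The types OutlineTable and Document and the function HasOutline are hypothetical stand-ins, not the real LibreOffice classes; the only point is that a getter which may return nullptr is always checked before use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an outline table.
struct OutlineTable
{
    int depth = 0;
    int GetDepth() const { return depth; }
};

// Hypothetical stand-in for a document that owns zero or more tables.
struct Document
{
    std::vector<OutlineTable> tables;

    // Returns nullptr when there is no outline table for the given tab.
    const OutlineTable* GetOutlineTable(std::size_t tab) const
    {
        return tab < tables.size() ? &tables[tab] : nullptr;
    }
};

bool HasOutline(const Document& doc, std::size_t tab)
{
    const OutlineTable* pTable = doc.GetOutlineTable(tab);
    if (!pTable)        // never dereference before this check
        return false;
    return pTable->GetDepth() > 0;
}
```

Calling HasOutline() with a tab that has no table simply returns false instead of crashing.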

How to Fix the Segfaults?

To describe how to fix some of the segfaults, I discuss two fixes that were merged recently by Caolan and me.

First, please take a look at this recently fixed crash by Caolan:

The crash was happening when you created a page break, and then tried to edit it. The backtrace showed that the problem was happening in this file:

sw/source/uibase/docvw/PageBreakWin.cxx

inside this method:

IMPL_LINK_NOARG(SwPageBreakWin, FadeHandler, Timer *, void)

on this line, when invoking IsVisible():

if (IsVisible( ) && m_nFadeRate > 0 && m_nFadeRate < 100)
    m_aFadeTimer.Start();

In a debug session for LibreOffice, one can see that the window object has already been destroyed, but the code still wants to check whether the window is visible, and that leads to a segfault. Looking carefully, it becomes clear that the problem comes from a few lines above that check: m_pLine->DestroyWin() is invoked, but the control flow continues so that IsVisible() is called right after it. To fix this, Caolan simply added a return after destroying the window, and it fixed the problem.

    if ( m_bIsAppearing && m_nFadeRate > 0 )
        m_nFadeRate -= 25;
    else if ( !m_bIsAppearing && m_nFadeRate < 100 )
        m_nFadeRate += 25;

    if ( m_nFadeRate != 100 && !IsVisible() )
        Show();
    else if ( m_nFadeRate == 100 && IsVisible( ) )
    {
        Hide();
        m_pLine->DestroyWin();
        // The fix adds a return here, so that IsVisible() below is not reached.
    }
    else
    {
        m_pLine->UpdatePosition();
        PaintButton();
    }

    if (IsVisible( ) && m_nFadeRate > 0 && m_nFadeRate < 100)
        m_aFadeTimer.Start();
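The idea behind this kind of fix can be sketched in a few lines. This is only an illustration under my own assumptions, not the LibreOffice code: Window and FadeController are hypothetical types, and a std::unique_ptr plays the role of the window that gets destroyed. Once the object is gone, the function returns at once instead of falling through to code that still uses it.

```cpp
#include <memory>

// Hypothetical stand-in for a window that can be destroyed.
struct Window
{
    bool visible = true;
};

struct FadeController
{
    std::unique_ptr<Window> window = std::make_unique<Window>();
    int fadeRate = 100;
    bool timerStarted = false;

    // Returns false if the window no longer exists after this step.
    bool Step()
    {
        if (!window)
            return false;

        if (fadeRate == 100 && window->visible)
        {
            window->visible = false; // hide the window
            window.reset();          // destroy it: `window` is now null
            return false;            // leave at once; the check below
                                     // would dereference a destroyed window
        }

        if (window->visible && fadeRate > 0 && fadeRate < 100)
            timerStarted = true;
        return true;
    }
};
```

Without the early return, the last condition would run on a window that was just destroyed, which is the same kind of mistake as calling IsVisible() after DestroyWin().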

Different Fixes for the Crashes

Not all crashes can be fixed this way. Sometimes you have to work on the logic of the application at a higher level of abstraction to be able to fix the crash. For example, see one of the crashes that I fixed recently.

You could select many footnotes by pressing the up/down keys while holding Shift, then delete all the footnotes at once. After that, hovering over a reference to such a footnote led to a crash.

To fix the crash, I bibisected the problem to find the responsible commit. In that specific change, the commit author changed the behavior of the arrow keys so that pressing up in the first line moves the cursor to the beginning of that line. The matching behavior for the last line was also part of the goal.

To fix the problem, I limited the behavior change to everywhere except footnotes, and restored the previous behavior inside footnotes. With this change, the crash no longer happened.

diff --git a/sw/source/core/crsr/swcrsr.cxx b/sw/source/core/crsr/swcrsr.cxx
index b379fe6..06c73af 100644
--- a/sw/source/core/crsr/swcrsr.cxx
+++ b/sw/source/core/crsr/swcrsr.cxx
@@ -2088,7 +2088,9 @@ bool SwCursor::UpDown( bool bUp, sal_uInt16 nCnt,
             }
             bRet = !IsSelOvr( SwCursorSelOverFlags::Toggle | SwCursorSelOverFlags::ChangePos );
         }
-        else
+        else if (!pFrame->IsInFootnote()) // tdf#150457 Jump to the begin/end
+                                          // of the first/last line only if the
+                                          // cursor is not inside a footenote
         {
             sal_Int32 nOffset = 0;

@@ -2114,6 +2116,8 @@ bool SwCursor::UpDown( bool bUp, sal_uInt16 nCnt,
             }

         }
+        else
+            *GetPoint() = aOldPos;

         DoSetBidiLevelUpDown(); // calculate cursor bidi level
     }

The last added line, *GetPoint() = aOldPos;, comes from the previous behavior, as it was before the commit that introduced the regression.

More Information

There are many other types of crashes, and many other tricks for fixing them. If you search the commit history for crash fixes, you will find many of them to look into:

git log --oneline|grep -i crash|grep -i fix

You can also refer to the list of recently fixed crashes in the recently published LibreOffice QA report for August 2022:

QA/Dev Report: August 2022

One can learn a lot from the code itself!

I will continue this tutorial, and talk about other sources of crashes including assertion failures, aborts, and exceptions that lead to an abort.

Links Related to Program Crashes

In the end, these are some related links: