25 Nov 2023

LibreOffice extensions with Python: create and debug

Ever wondered how to create a LibreOffice extension? Here I discuss how to do that with the Python programming language, and how to debug the code in an external IDE like PyCharm.

LibreOffice Extensions with Python

If you have used LibreOffice extensions, you know that many exciting things can be done with extensions. Extensions can open LibreOffice applications, create new documents, read and write text and images inside the documents, and convert them to all possible formats. They can have their own menus and toolbar buttons, and have nice looking GUIs to interact with the users.

To write an extension, the easiest way is to use the LibreOffice BASIC language. You can refer to this tutorial for such an approach:

But with Python, you have access to a large set of packages, which is one of the many strengths of the language. With those packages, you can do almost anything that software can do. Furthermore, LibreOffice has its own Python interpreter, which makes installing and using a Python extension much easier.

Handling Context in LibreOffice Extensions

First of all, you should know about the context: you need this object in order to use the LibreOffice API.

There are at least 3 different ways to run a Python program with LibreOffice:

  1. Running the Python program with APSO inside LibreOffice
  2. Running the Python program as an extension inside LibreOffice
  3. Running the Python program as a process outside LibreOffice

In each of these cases, the way to get the context and use it is different.
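As a rough illustration, here is a minimal sketch of a helper that covers the macro case and the external-process case. The function name get_component_context and its bootstrap parameter are my own for this example; XSCRIPTCONTEXT and officehelper.bootstrap() are the names used later in this post. In the extension case, LibreOffice hands the context to the constructor of your class, so no helper is needed there.

```python
def get_component_context(script_globals, bootstrap=None):
    """Return a component context for the macro or the external-process case.

    script_globals: the globals() dict of the calling script.
    bootstrap: optional callable used instead of officehelper.bootstrap
               (hypothetical parameter, handy for trying this without an office).
    """
    script_context = script_globals.get("XSCRIPTCONTEXT")
    if script_context is not None:
        # Running as a macro: LibreOffice injects XSCRIPTCONTEXT for us.
        return script_context.getComponentContext()
    if bootstrap is None:
        # Running as an external process: start or connect to an office.
        import officehelper  # ships with the LibreOffice Python
        bootstrap = officehelper.bootstrap
    ctx = bootstrap()
    if ctx is None:
        raise RuntimeError("Could not bootstrap default Office.")
    return ctx
```

A script would call it as get_component_context(globals()). Passing a fake bootstrap callable makes it possible to exercise the logic without a running LibreOffice.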

Structure of a LibreOffice Extension

Extensions are essentially zip files that have specific files known to LibreOffice inside them. This is the structure of a Python extension:

  • META-INF/ : required folder
    META-INF/manifest.xml: Specification of the script(s), menu/toolbar and language files
  • pkg-description/ : required folder
    pkg-description/pkg-description.en: Description of the extension in text, which can be also in languages other than English
  • registration/: required folder
    registration/license.txt: License of the extension
  • description.xml: Description of the extension in XML format, as displayed in the extension manager
  • main.py: The main script. The name can be anything, but it should
    be specified in the META-INF/manifest.xml
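Since an extension is just a zip file, packaging can be scripted. The sketch below is one possible way to do it with the Python standard library. The function name build_oxt and the choice of which files to check are assumptions for this example, taken from the list above and not from any official tool.

```python
import zipfile
from pathlib import Path

# Files from the structure listed above; checking them is an assumption of this sketch.
REQUIRED_ENTRIES = (
    "META-INF/manifest.xml",
    "pkg-description/pkg-description.en",
    "registration/license.txt",
    "description.xml",
)


def build_oxt(source_dir, oxt_path):
    """Zip every file under source_dir into oxt_path and return the entry names."""
    source = Path(source_dir)
    output = Path(oxt_path).resolve()
    missing = [name for name in REQUIRED_ENTRIES if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError("missing extension files: " + ", ".join(missing))
    names = []
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip folders and the archive itself if it lives inside source_dir.
            if path.is_file() and path.resolve() != output:
                name = path.relative_to(source).as_posix()
                archive.write(path, name)
                names.append(name)
    return names
```

For example, build_oxt("my-extension", "my-extension.oxt") would package a folder laid out as above, where both names are placeholders.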

Contents of the Files

Most of the contents of the files are re-usable, so you can take the skeleton extension and build your extension around it. The Python script is the important part, and we will talk about it here.

Of the 3 situations above, to be able to use the code as an extension, these 2 lines should be in the Python file:

g_ImplementationHelper = unohelper.ImplementationHelper()
g_ImplementationHelper.addImplementation(
    MainJob, "org.extension.sample.do", ("com.sun.star.task.Job",))

In addition, these imports are also required:

import unohelper
from com.sun.star.task import XJobExecutor

Then, a Python class with this definition is needed:

class MainJob(unohelper.Base, XJobExecutor):

The program should have this method:

def trigger(self, args):

To be able to debug the program, the main function should be defined as something like this:

import sys
import officehelper


def main():
    try:
        ctx = XSCRIPTCONTEXT
    except NameError:
        ctx = officehelper.bootstrap()
        if ctx is None:
            print("ERROR: Could not bootstrap default Office.")
            sys.exit(1)
    job = MainJob(ctx)
    job.trigger("keywords")


if __name__ == "__main__":
    main()

This is a sample implementation of the MainJob class:

class MainJob(unohelper.Base, XJobExecutor):
    def __init__(self, ctx):
        self.ctx = ctx
        # handling different situations (inside LibreOffice / different process)
        try:
            self.sm = ctx.getServiceManager()
            self.desktop = XSCRIPTCONTEXT.getDesktop()
        except NameError:
            self.sm = ctx.ServiceManager
            self.desktop = self.ctx.getServiceManager().createInstanceWithContext(
                "com.sun.star.frame.Desktop", self.ctx)

And this is a sample trigger() function that opens Writer, and writes a sample text containing the argument passed to it.

    def trigger(self, args):
        desktop = self.ctx.ServiceManager.createInstanceWithContext(
            "com.sun.star.frame.Desktop", self.ctx)
        model = desktop.getCurrentComponent()
        if not hasattr(model, "Text"):
            model = self.desktop.loadComponentFromURL("private:factory/swriter", "_blank", 0, ())
        text = model.Text
        cursor = text.createTextCursor()
        text.insertString(cursor, "Hello Extension argument -> " + args + "\n", 0)

You can use this structure to create the Python extension that you want.

A complete extension with the above files is available here, and the plan is to make it available among the other LibreOffice SDK examples:

https://gerrit.libreoffice.org/c/core/+/159938

Final Notes

This year we had an extensive workshop on LibreOffice development at the LibreOffice conference 2023. If you want to know more about using the LibreOffice API in Python, you can refer to the presentation:

16 Nov 2023

String literals: C/C++ string data types part 2

In the first part of the series on string types in LibreOffice, I discussed some of the string data types that are in use in various places of the LibreOffice code. I discussed various character and string data types briefly: OString, OUString, char/char*, sal_Unicode, sal_Unicode*, rtl_String, rtl_uString and also std::string. Now I want to explain string literals.

String Literals

In C/C++, a string literal is a sequence of characters in double quotation marks, and represents read-only textual data. For example:

const char *str = "abc";

Please note that it is different from a character literal, which is a single character in single quotation marks:

const char c = 'a';

The non read-only version of these data types does not have const in it.

The char* data type is widely used in the C programming language, but it is not the data type of choice in LibreOffice. As described in my previous post, OString is used for 8-bit text, and OUString is used for Unicode text in LibreOffice. It is worth noting that it is possible to store UTF-8 encoded Unicode text in OString.

In the past, it was possible to convert the const char* literal to OString/OUString like this: (it will not compile now)

OString sText = "abc";
OUString sUniText = u"abc";

It was not an efficient way to define and use such strings. Read-only memory is used to store the plain string literals. But then, a new chunk of dynamic memory is allocated on the heap to store the new O[U]String object, and the constructor copies the read-only data into that memory. Also, the new OUString needs reference counting. These are unnecessary, expensive operations, and we should avoid them.

O[U]StringLiteral

In LibreOffice, OStringLiteral and OUStringLiteral are the data types used to represent string literals for ASCII and Unicode data, respectively.

As an example, you can see lines like this in LibreOffice .cxx files:

static constexpr OUStringLiteral sStart = u"ABC";
static constexpr OStringLiteral sEnd("DEF");

The constexpr ensures that the expression is evaluated at compile time, and this can improve the performance of the program. Also, avoiding reference counting in O[U]String helps to make the operation cheaper.

Later, OString/OUString variables are constructed from the O[U]StringLiterals. Or, they are passed to functions that expect OString/OUString parameters. The difference is that when static constexpr literals are used, the memory used for storing the data is not dynamic memory: it is allocated once, and it is read-only, which improves performance. This approach is only usable when you work with strings that are initialized once, and are not manipulated later.

String Literals in Headers

If you are working with a .hxx C++ header file, you have to use the inline keyword to avoid creating duplicate copies of the global variable. For example:

inline constexpr OUStringLiteral ABC(u"abc");

Later we will see that we can re-write the above with a suffix as:

inline constexpr OUString ABC = u"abc"_ustr;

Essentially, that is a better replacement of the macro:

#define ABC "abc"

or, sometimes:

const char ABC[] = "abc";

These are no longer desirable in C++, now that string literals are available with the latest C++ standard and new LibreOffice code. Also, it is important to know that the goal is to eventually get rid of the O[U]StringLiteral data types in favor of the simpler form with suffixes.

Prefixes

String literals with no prefix are single byte strings which consist of 8-bit characters. Multi-byte Unicode string literals have various prefixes used to indicate their types. For example, to represent ABC in ASCII, UTF-8, UTF-16, UTF-32 and wide-char, you need to write:

// requires C++20
char ascii_cstr[] = "ABC";
char8_t utf8_cstr[] = u8"ABC";
char16_t utf16_cstr[] = u"ABC";
char32_t utf32_cstr[] = U"ABC";
wchar_t w_cstr[] = L"ABC";
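To get a feeling for what the prefixes mean in terms of storage, the following Python sketch counts the bytes that the text ABC needs in each encoding, without a terminating zero or byte order mark. It is only an illustration from outside C++. The function name is my own, and wchar_t is left out because its size depends on the platform.

```python
def encoded_sizes(text):
    """Return the number of bytes that text needs in each encoding."""
    codecs = {
        "ASCII": "ascii",
        "UTF-8": "utf-8",
        "UTF-16": "utf-16-le",   # little-endian variant, so no byte order mark is added
        "UTF-32": "utf-32-le",
    }
    return {name: len(text.encode(codec)) for name, codec in codecs.items()}


print(encoded_sizes("ABC"))
# {'ASCII': 3, 'UTF-8': 3, 'UTF-16': 6, 'UTF-32': 12}
```

The three characters take one byte each in ASCII and UTF-8, two bytes each in UTF-16, and four bytes each in UTF-32, which matches the element sizes of char, char8_t, char16_t and char32_t.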

Suffixes

Now that C++20 has become the baseline for LibreOffice source code, and thanks to Stephan Bergmann, it became possible to simplify the code, avoid the O[U]StringLiteral data types, and write it in a much shorter form, like:

static constexpr OUString sStr = u"abc"_ustr;
static constexpr OString sTransSource("def"_ostr);

As you can see in the above code snippet, for Unicode strings, _ustr is used, and for non-Unicode strings, _ostr.

Since the C++14 standard, you can use the s suffix to get a std C++ string out of a string literal, but you first need to say explicitly that you will use the std::string_literals namespace.

using namespace std::string_literals;

std::string ascii_str = "ABC"s;
std::u8string utf8_str = u8"ABC"s;
std::u16string utf16_str = u"ABC"s;
std::u32string utf32_str = U"ABC"s;
std::wstring wstring_str = L"ABC"s;

Final Words

Don’t be afraid of the various string types that we discussed here! Most of the time, you will be using OUString. The other types will come up occasionally when you work with different parts of the huge LibreOffice source code.

There are still other data types related to working with strings, like streams, buffers and string view types, that I will discuss in the next part of this series of blog posts.

If you want to know more, refer to the presentation from Stephan Bergmann at the LibreOffice conference 2023. He talks about the improvements in C++20 (class non-type template parameters) that made it possible to simplify the string literals in LibreOffice code:


2 Nov 2023

Integer data types improvement – EasyHack

Many different data types are used in LibreOffice code. During the long history of LibreOffice, and before that in OpenOffice, there were integer data types that are no longer in use today. The task I discuss here is to choose appropriate data types to use instead of sal_uLong and similar deprecated integer data types.

Integer Data Types in LibreOffice

One of the old deprecated integer data types is ULONG, which was then converted to sal_uLong. The latter, sal_uLong, is still problematic, because its size can differ between platforms. Being “long” does not mean that it can hold every value. Each use should be reviewed one by one, and replaced by another suitable data type. This EasyHack is focused on this change:

As sal_uLong is unsigned, usually an unsigned type like size_t, sal_uInt16, sal_uInt32, or sal_uInt64 can be suitable, but this is not always the case. Sometimes you should use signed types according to context. There are even cases where using floating point types like double is the correct choice.

Finding Instances to Change

Finding instances is easy. You can simply use grep to find the remaining instances:

$ git grep sal_uLong

Using the count.sh script provided in the EasyHack page, you can count the number of remaining instances in each folder of the LibreOffice core source code. It would be good to start with the folders that need fewer changes, in order to reduce the number of remaining folders.

$ ./count.sh | sort -h
1: dbaccess/
1: unotools/
2: desktop/
2: drawinglayer/
3: framework/
6: svl/
8: svx/
11: toolkit/
13: compilerplugins/
25: starmath/
33: svtools/
34: filter/
61: include/
111: sd/
416: vcl/
593: sc/
635: sw/
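If you prefer to do the counting yourself, a per-folder count like the one above can be sketched in Python from the output of git grep. This is not the count.sh script from the EasyHack page, only my own minimal approximation. It assumes that each input line has the usual path:content form that git grep prints.

```python
from collections import Counter


def count_per_folder(grep_lines):
    """Count matches per top-level folder from `git grep` output lines."""
    counts = Counter()
    for line in grep_lines:
        path = line.split(":", 1)[0]
        if "/" in path:
            # Keep only the first path component, e.g. "toolkit/".
            counts[path.split("/", 1)[0] + "/"] += 1
    # Smallest counts first, like the sorted listing above.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))
```

You could feed it the lines of git grep sal_uLong, for example after capturing them with subprocess.run() and splitting the text into lines.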

To find instances inside a specific folder like toolkit/, mention it after sal_uLong in the grep command, like this:

$ git grep sal_uLong toolkit/

Take care to preserve the capital letter L in sal_uLong.

Choosing Data Types

The main issue here is to find a specific integer data type that can replace sal_uLong, so that it can handle all the possible values in foreseeable scenarios.

You should look into where the data type is used to get an idea of the possible values that are stored in the variable, and are read later. Sometimes, it is obvious from the context. For example, as described in Bugzilla, in the commit below, the data type for the positions in an SvStream is chosen:

Here, sal_uInt64 is chosen because the files that are read and written via SvStream can be larger than 4 GB. As an example, the 32-bit unsigned integer sal_uInt32 can only handle sizes up to about 2^32 bytes, which is around 4*10^9 B = 4 GB. With a 64-bit unsigned integer, the possible size is much larger, and suitable for the purpose.
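The arithmetic is easy to check. The small Python sketch below computes the largest value of an unsigned integer with a given number of bits, and tests whether a 5 GB stream position would fit. The helper names are only for this illustration.

```python
def max_unsigned(bits):
    """Largest value that an unsigned integer with this many bits can hold."""
    return 2 ** bits - 1


def fits(value, bits):
    """True if value can be stored in an unsigned integer with this many bits."""
    return 0 <= value <= max_unsigned(bits)


five_gb = 5 * 10 ** 9          # a stream position 5 GB into a file

print(max_unsigned(32))        # 4294967295
print(fits(five_gb, 32))       # False
print(fits(five_gb, 64))       # True
```

So a position 5 GB into a file cannot be stored in a 32-bit unsigned integer, but a 64-bit one leaves plenty of room.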

Using Return Types of Functions

If the variable is filled from the output of a function, then the return type of that function can be suitable for the variable. This may not always be the case, and you should keep in mind that sometimes you also need to change the return types of the functions.

Using auto keyword and an IDE

Sometimes, you can use auto for the data type, and then use your IDE to find out the deduced type, for example by hovering the mouse over the variable.

Many C/C++ IDEs support this feature. For more information on how to set up an IDE, please refer to this wiki article:

Doing the Change

To actually do the change, you have to replace sal_uLong with the integer data type that you have chosen. But this is not the end! Sometimes you have to change many other places, like data type for return types in functions, member variables in classes, and many other places. That may also trigger another set of changes, where those functions or variables are used.

To get a better understanding of what needs to be changed, you can use a very handy feature of your IDE: “find usages”. This feature may be provided under different names, but is usually available when you right click on the variable/identifier name. For example:

  • Qt Creator: “Find reference to symbol under cursor”
  • Visual Studio: “Find all references”
  • Visual Studio Code: “Go to references”

You should look for similar functionality in the IDE of your choice.

Keep the Change Minimal

Please try to keep the changes minimal, and limit them to 1 or at most a few files. Otherwise, you may end up modifying several files, and facing a change that is difficult to manage. Such a change would not be suitable here. That is because the goal of the EasyHack is to give you the opportunity to change small parts of the code, and to gain a better understanding of LibreOffice development at an early stage.

There are rare cases where such large changes succeed. For example, look at this change:

tdf#152431 Fix line count resets to zero after 65535

This is a huge change. Although it is a spin-off of this EasyHack, it was eventually done as a fix to another bug, visible with the symptom that the line count was resetting to zero after 65535. Therefore, please keep your change minimal in this EasyHack, and postpone larger changes to the time when you have accomplished several difficultyBeginner EasyHacks.

Compromise: Keeping Some Deprecated Integer Data Types

It is not always possible, or easy, to remove all the deprecated data types like tools::Long. Sometimes, you have to keep them, and there are even situations where you have to convert sal_uLong to tools::Long. This is fine for now, as tools::Long and other tools:: data types are still in use extensively. You can count:

$ git grep tools::Long *.cxx *.hxx|wc -l
16481

Writing the Commit Message

In the commit message for this EasyHack, please justify your selection of data types briefly. Please do not describe the data types themselves, but the reasoning behind your actual choice of the data type for the variables in place of sal_uLong.

Final Words

To do this task, you need to be able to build LibreOffice from source code, and send your changes to Gerrit. To do that, you can refer to our getting started guide:

Getting Started (Video Tutorial)

26 Oct 2023

UNO API error reporting improvement – EasyHack

In this blog post, I discuss the EasyHack for improving UNO API error reporting. EasyHacks are good if you want to become familiar with LibreOffice programming, and this specific task is a good choice for beginners as it is a difficultyBeginner task.

What is UNO API?

UNO API is the programming interface that you can use to access LibreOffice capabilities programmatically. This API is usable across different languages, from LibreOffice BASIC macro programming to Python, Java, C++ and many more.

UNO API provides many services and functions, and they should be used according to the LibreOffice API documentation. The API is stable over different LibreOffice versions, and most of the API is even compatible with its older predecessor, OpenOffice!

Improving UNO API Error Reporting

Here I discuss improvements to error reporting in UNO API. While using the API, errors can occur because the API is used incorrectly, or for other reasons. Many of the functions already provide good error reports, but there are still places with primitive error reporting. In these cases, it is possible to provide improvements.

This suggested improvement is defined as an EasyHack:

Although it may look very easy at first, you have to be patient, and read more about it to make sure that your change is good and meaningful.

Finding Instances

First, you have to pick a C++ file to improve the UNO API error reporting inside it. You can use ‘grep’ tool to find instances. To do this, run this command in terminal:

$ git grep "throw .*RuntimeException *( *)" *.cxx

Then, pick one of the files, and work on it. You may have to change more than one instance of error reporting in a single C++ file. Using the counter program ‘wc’, you can see that more than 1400 instances still remain, across more than 300 files.
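To see what the grep pattern actually matches, here is a small Python sketch with the same pattern translated to Python regular expression syntax. In grep's basic syntax the parentheses are literal, so they have to be escaped in Python. The helper name is made up for this example.

```python
import re

# Same pattern as the git grep command above, in Python syntax.
BARE_THROW = re.compile(r"throw .*RuntimeException *\( *\)")


def find_bare_throws(lines):
    """Return (line number, line) pairs that throw a RuntimeException without a message."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if BARE_THROW.search(line)
    ]
```

A throw with empty parentheses matches, while a throw that already carries a message does not, so the remaining matches are the places that still need a better error report.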

Required Steps

It is important to do these steps to create a good patch to improve the UNO API reporting:

  1. Read and grasp the idea of the change from the similar commits from other people in the same EasyHack
  2. Read the code to understand the case where error occurs
  3. Choose appropriate Exception type
  4. Choose suitable constructor
  5. Understand and differentiate between the functions that are exported as API functions, and local functions.
  6. Provide good error messages that describe the situation.
  7. Reproduce and test the error message (if possible)

Similar Commits

There are similar commits, that are listed in the Bugzilla page of the issue tdf#42982. You can learn from them what Exception type to choose, how to write the error message, and many more things.

Please take a look at this related commit, which improves error reporting for the UNO XPath API:

For example, consider the function registerExtensionInstance(). The API documentation is here:

It is defined as:

void registerExtensionInstance ([in] com::sun::star::xml::xpath::XXPathExtension aExtension)

So, we know that it should take one parameter of type XXPathExtension. If that parameter is missing, we need to say so explicitly.

Choosing Exception Type

RuntimeException is usually the best choice, but for example, in commit 7e8806cd728bf906e1a8f1d649bef7337f297b1c you see that when a parameter is not initialized, NotInitializedException is used. If the argument is empty or Null, IllegalArgumentException is a good choice, and if the expected elements are missing, you can consider NoSuchElementException. But remember that you can only replace RuntimeException with exception types that are derived from it, to give a more specific exception. That rule prevents you from replacing RuntimeException with, for example, NoSuchElementException, which is not derived from RuntimeException.
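The reason for this rule is that existing callers that catch RuntimeException must keep working. The Python sketch below models the idea with stand-in classes that only mirror the relationships described above. They are not the real UNO types, so please check the IDL reference for the actual hierarchy.

```python
# Stand-in classes for illustration only; they are not the real UNO exceptions.
class UnoException(Exception):
    pass


class RuntimeException(UnoException):
    pass


class IllegalArgumentException(RuntimeException):
    pass


class NotInitializedException(RuntimeException):
    pass


class NoSuchElementException(UnoException):
    pass


def can_replace_runtime_exception(candidate):
    """A replacement is only safe if it is derived from RuntimeException."""
    return issubclass(candidate, RuntimeException)


def caught_as_runtime_exception(exception_type):
    """Simulate an existing caller that only handles RuntimeException."""
    try:
        raise exception_type("demo")
    except RuntimeException:
        return True
    except UnoException:
        return False
```

With these stand-ins, IllegalArgumentException is still caught by a handler for RuntimeException, while NoSuchElementException slips past it, which is exactly what the rule is meant to prevent.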

For a complete list of exception types inherited from RuntimeException, refer to the UNO IDL API reference:

There are multiple constructors for the exceptions, so you should make sure that you are using the right one. This is the comment from Stephan, an experienced LO developer:

RuntimeException constructors either take no arguments or two arguments (Message, a string; and Context, a com::sun::star::uno::XInterface reference to the relevant UNO object or a null reference).

So, in basic/source/uno/namecont.cxx you would need a second argument

static_cast< cppu::OWeakObject * >(this)

(where the cast is necessary as this derives from XInterface multiple times), and in the later files you would need to move your new, third argument to be the first one instead, replacing the empty rtl::OUString().

As an example, you can see that RuntimeException is sometimes called with only a message, and better, with the context. Also, in namecont.cxx you can see this:

void NameContainer::replaceByName( const OUString& aName, const Any& aElement )
{
    const Type& aAnyType = aElement.getValueType();
    if( mType != aAnyType )
    {
        throw IllegalArgumentException("types do not match", getXWeak(), 2);
    }
...
}

The getXWeak() method provides the context, and 2 means that the 2nd parameter is problematic.

By looking into similar commits, you can also learn how to write error messages. For example, if a parameter is null, you can say that it does not exist, or is null.

Reading the Code

The code itself can show you a good choice for the exception and the error message. In the above patch related to XPathAPI, you have to understand what the goal of each function is, and what the error means.

For example, the first change is:

-        if (!pCNode) { throw RuntimeException(); }
+        if (!pCNode) { throw RuntimeException("Could not use the namespace node in order to collect namespace declarations."); }

The error message is: “Could not use the namespace node in order to collect namespace declarations”. That is because the namespace node is used to collect the namespace declarations. The function name is lcl_collectNamespaces, which means a local function to collect namespaces, and this comment is also informative:

// get all ns decls on a node (and parent nodes, if any)

As this is only a local function, and is not exported using SAL_CALL, parameter names may not be understandable outside the code itself. But if you see SAL_CALL, you can use the parameter name in the error message.

In some cases, you have to read more to understand what the parameter is. For example, consider this code snippet:

        // get the node and document
        ::rtl::Reference<DOM::CDocument> const pCDoc(
                dynamic_cast<DOM::CDocument*>(xContextNode->getOwnerDocument().get()));
-        if (!pCDoc.is()) { throw RuntimeException(); }
+        if (!pCDoc.is()) { throw RuntimeException("Interface pointer for the owner document of the xContextNode does not exist."); }

In this case, get() function is used to get the interface pointer of the xContextNode->getOwnerDocument(), which can be described as the “owner document of the xContextNode”, and because pCDoc.is() is false, it means that it does not exist.

Testing UNO API Error Reporting

Using this BASIC code, you can see the error message in action:

Sub Main
    oXPath = createUnoService("com.sun.star.xml.xpath.XPathAPI")
    oXPath.registerExtensionInstance(Null)
End Sub

This is the error message, before the change:

Error message before the change

After the change, it becomes this:

Error message after the change


The new error message is more understandable and meaningful. Please note that it is not always easy to generate such an error, because the exception may occur in a specific situation that is not easy to reproduce. But when it is about a missing parameter or a similar situation, it is good to check the error message with code similar to the BASIC example above.

Final Words

Having good error messages in LibreOffice API helps macro programmers and developers who use LibreOffice programmatically. If you are interested in doing this EasyHack, make sure that you go through the above mentioned steps to improve the error reporting.

5 Oct 2023

LibreOffice conference 2023 workshop presentation slides

LibreOffice conference (LibOCon) 2023 was held in Bucharest from 20 to 23 September 2023. Among the other programs, an important part was the workshop “Introduction to LibreOffice Development”. Here you will find the slides for the presentations.

LibOCon 2023 Bucharest


LibreOffice Conference Workshop Program

The workshop was held in parallel with the main tracks over 3 days, and many different things around LibreOffice development were discussed in the workshop. You can see the detailed program of the workshop here:

Presentation Slides

Slides for the presentations can be found here:

Day One:
1. Office software, and the open source/free software development model (1 hour)
Presenter: Hossein Nourikhah

2. Effective communication in open source/free software projects (1 hour)
Presenter: Hossein Nourikhah

3. Bug reporting and triaging (2 hours)
Presenter: Stéphane Guillou

4. Git basics (2 hours)
Presenter: Stéphane Guillou

5. Gerrit for code reviews (2 hours)
Presenter: Xisco Faulí

Day Two:
6. Software localization (l10n) and internationalization (i18n) (1 hour)
Presenter: Hossein Nourikhah

7. LibreOffice automation via scripting (BASIC, Python) (3 hours)
Presenter: Rafael Lima / Alain Romedenne

8. Building LibreOffice from source code (4 hours)
Presenter: Hossein Nourikhah

Day Three:
9. LibreOffice Documentation (1 hour)
Presenter: Olivier Hallot

10. LibreOffice SDK development (Java, Python) (2 hours)
Presenter: Hossein Nourikhah

11. Introduction to problem solving techniques (30 minutes)
Presenter: Michael Meeks

12. Introduction to LibreOffice Core (30 minutes)
Presenter: László Németh

13. LibreOffice core design (C++) (2 hours)
Presenter: Heiko Tietze

14. LibreOffice core development (C++) (1 hour)
Presenter: Hossein Nourikhah

15. Introduction into Writer development (1 hour)
Presenter: Miklos Vajna

14 Sep 2023

Catalog and schema support for SQL functions – difficulty interesting EasyHack

LibreOffice has a database application called Base. It can connect to various database management systems, and is integrated with two internal database engines: Firebird and HSQLDB. Here I discuss how to add catalog and schema support for SQL functions in LibreOffice Base.

SQL window


One can use SQL to create and use internal functions. For example, with Firebird:

CREATE FUNCTION F(X INT) RETURNS INT
AS
BEGIN
  RETURN X+1;
END;

To run this, you can use “Tools > SQL…”, and then write the above SQL query. To see the result, you need to run this query:

SELECT F(5) FROM RDB$DATABASE;

Catalog and schema support

On the other hand, support for SQL commands is limited. For example, as the issue tdf#95174 describes, the SQL parser of LibreOffice currently does not handle catalog and schema in function names:

Currently, this command works fine:

SELECT function_name(a, b) FROM C

But this one does not:

SELECT schema_name.function_name(a, b) FROM C

The goal is to make the second one also work in LibreOffice Base.

Code Pointers

To add support for catalog and schema in function names, you should refer to the Yacc rule for the SQL parser.

Lionel, the experienced Base developer, describes what to do in the first comment. In the file connectivity/source/parse/sqlbison.y, you can find this rule:

function_name:
		string_function
	|	date_function
	|	numeric_function
	|	SQL_TOKEN_NAME

Here, you should add two new cases, like:

	|	SQL_TOKEN_NAME '.' SQL_TOKEN_NAME 
			{$$ = SQL_NEW_RULE;
			$$->append($1);
			$$->append(newNode(".", SQLNodeType::Punctuation));
			$$->append($3);
			}
	|	SQL_TOKEN_NAME '.' SQL_TOKEN_NAME '.' SQL_TOKEN_NAME
			{$$ = SQL_NEW_RULE;
			$$->append($1);
			$$->append(newNode(".", SQLNodeType::Punctuation));
			$$->append($3);
			$$->append(newNode(".", SQLNodeType::Punctuation));
			$$->append($5);}

After that, one should invoke this command:

git grep -E '(function_name|set_fct_spec)'

to find parts of the code that use them.

If the code is examining one of the above nodes, it expects a single token at the function_name. The code should be changed to expect a token or a node to handle the schema_name and function_name.
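To make the three accepted forms concrete, here is a Python sketch that splits a function name in the same way as the extended grammar rule: a single name, schema.name, or catalog.schema.name. It is only an illustration of the idea and not LibreOffice code. The function name is my own, and quoted identifiers are deliberately not handled.

```python
def split_function_name(qualified_name):
    """Split 'catalog.schema.function' into a (catalog, schema, function) tuple.

    Missing leading parts are returned as None. Quoted identifiers that
    contain dots are not handled in this sketch.
    """
    parts = qualified_name.split(".")
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError("not a valid function name: " + repr(qualified_name))
    padded = [None] * (3 - len(parts)) + parts
    return tuple(padded)


print(split_function_name("function_name"))
# (None, None, 'function_name')
print(split_function_name("schema_name.function_name"))
# (None, 'schema_name', 'function_name')
```

Code that used to expect a single token now has to deal with up to three names plus the dots between them, which is the change the callers found by the git grep command need.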

Final Notes

An implementation should be accompanied by a test to make sure that the code actually works, and will keep working after future changes. To see other discussed EasyHacks, follow the EasyHacks tag in this blog.

31 Aug 2023

Warning for low disk space – difficulty interesting EasyHack

Without enough disk space, one may face data corruption, which is a terrible thing to happen to someone's important data. In order to avoid falling into such a situation, it is a good idea to warn the users in advance.

Code Pointers for generating warning for low disk space

To implement such a feature in LibreOffice, the first place to look is this file: sfx2/source/doc/sfxbasemodel.cxx.

The method to query the free space should be added to the sal/osl folder in the LibreOffice core source code. To add OS specific code, one may use the unx and w32 folders inside it.

Please note that LibreOffice needs to know the disk space on different devices, so passing a vector containing path and free disk space is a good suggestion here.

You should know that guessing the disk space required to save a file is not easy. So, the idea is to keep several megabytes free to avoid problems, at least in cases where the file is not very large. It is possible to add that limit as an option, placed in Tools > Options. These days, even 100-200 megabytes is not much compared to the very fast disk consumption of browsers and other large applications that people use regularly.
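The idea of checking several paths against a threshold can be tried out quickly in Python before writing the C++ code. The sketch below uses shutil.disk_usage() from the standard library. The function names and the 200 MB default are assumptions for this example, and the real implementation would live in sal/osl as described above.

```python
import shutil

DEFAULT_THRESHOLD = 200 * 1024 * 1024  # 200 MB, an arbitrary value for this sketch


def free_space_report(paths):
    """Return a list of (path, free bytes) pairs, one per path."""
    return [(path, shutil.disk_usage(path).free) for path in paths]


def low_space_paths(paths, threshold_bytes=DEFAULT_THRESHOLD):
    """Return the paths whose file system has less free space than the threshold."""
    return [
        path
        for path, free in free_space_report(paths)
        if free < threshold_bytes
    ]
```

For example, low_space_paths(["/tmp/small", "/home"]) would list the locations that deserve a warning, given that both paths exist on your system.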

Another nice feature to implement is a handler that runs with low priority every several seconds and checks the available temporary space. That will help to avoid problems with saving images in that specific temp directory.

Testing the Warning for Low Disk Space

One needs to create a test environment to actually test the patch in action. Using a small RAM drive, it is possible to do that. These commands are useful to create a 20 MB partition for testing:

mkdir /tmp/small
sudo /bin/mount -t tmpfs -o size=20m,mode=0700,uid=$USER,gid=$GROUP /dev/shm /tmp/small

After running the above commands and filling the disk space, you can start LibreOffice with the command shown further below to use the temp drive. As a result, you will get this error message:

No disk space error


But, no warning message is shown when you have some small disk space which is < 1 MB.

$ instdir/program/soffice -env:SAL_USE_VCLPLUGIN=gen -env:UserInstallation=file:///tmp/small /tmp/small/1.pptx

With less than 1 MB of disk space, you will get this warning in the terminal, but not when the free space is between 1 and 2 MB.

warn:configmgr:57868:58063:configmgr/source/components.cxx:190: error writing modifications com.sun.star.uno.RuntimeException message: "cannot write to file:///tmp/small/user/nnePqE at ~/Projects/libreoffice/core/configmgr/source/writemodfile.cxx:109"

Please note that both the profile and the opened file were inside /tmp/small.

Final Words

The above issue is tdf#60909. If you are interested, follow the Bugzilla link for more information.

To implement this feature, first you have to build LibreOffice from the sources. If you have not done that yet, please refer to this guide first:

Getting Started (Video Tutorial)

31 Aug 2023

Find and replace For Base – difficulty interesting EasyHack

LibreOffice Base is the part of the LibreOffice productivity suite that makes it possible to work with databases. It is an alternative to MS Access. One of the proposed enhancements for Base is to add a “Find and replace” dialog. Right now, a “Find” dialog is available, but it is not possible to do the replacement with the LibreOffice Base dialogs. This issue is filed as tdf#32506.

The importance

This was requested a long time ago, but until now no developer has put in the time to make it a reality. This feature request is a difficultyInteresting EasyHack, which means it is among the EasyHacks that need more work compared to the difficultyBeginner and difficultyMedium ones.

I will describe the details of the task, and if you find it interesting, you can start working on it. Solving difficultyInteresting EasyHacks is among the criteria for selecting GSoC candidates, so it is worth trying if you want to be among next year's GSoC candidates.

It is worth mentioning that MS Office provides comparable functionality in the “Find and replace” dialog of MS Access. Thus, it would be helpful for people migrating from Access to Base.

Proposed UI Design for Find and Replace

Enrique, who proposed this enhancement, also provided a design for the “Search and replace” dialog.

Proposed design for LibreOffice Base Find and Replace dialog

Code Pointers For Implementing Find and Replace

As described, this enhancement extends the search functionality of Base with the ability to do replacements, which is not currently available from the dialogs. It is, however, possible to use SQL queries to do the replacement. So the task is to extend the search dialog, and then add the required methods that use SQL to do the search and replacement.
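To make the SQL approach concrete, here is a small Python sketch that runs a find and replace over one text column. It assumes SQLite through Python's standard `sqlite3` module purely for illustration; the databases used with Base may need different SQL, and the function name `find_and_replace` is hypothetical.

```python
import sqlite3


def find_and_replace(conn, table, column, search, replacement):
    """Replace every occurrence of `search` in one text column.

    Returns the number of rows changed. `table` and `column` are inserted
    into the SQL text, so they must come from trusted code, not user input.
    """
    if not search:
        return 0  # nothing to look for
    # instr() finds the substring literally, so '%' and '_' in the search
    # text are not treated as wildcards (as they would be with LIKE).
    sql = (
        f'UPDATE "{table}" SET "{column}" = REPLACE("{column}", ?, ?) '
        f'WHERE instr("{column}", ?) > 0'
    )
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(sql, (search, replacement, search))
    return cursor.rowcount
```

The WHERE clause limits the update to rows that actually contain the search text, so the returned row count can be shown to the user as the number of replacements made.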

Lionel, a LibreOffice Base developer, has suggested this path, which I have updated:

The discussed dialog is instantiated in this C++ file
dbaccess/source/ui/browser/brwctrlr.cxx:1798:

pDialog = pFact->CreateFmSearchDialog(getFrameWeld(), sInitialText, aContextNames, 0, LINK(this, SbaXDataBrowserController, OnSearchContextRequest));
pDialog->SetActiveField( sActiveField );
pDialog->SetFoundHandler( LINK( this, SbaXDataBrowserController, OnFoundData ) );
pDialog->SetCanceledNotFoundHdl( LINK( this, SbaXDataBrowserController, OnCanceledNotFound ) );
pDialog->Execute();
pDialog.disposeAndClear();

As SetFoundHandler() uses OnFoundData, we search the same file for "OnFoundData", and find it at line 2347:

IMPL_LINK(SbaXDataBrowserController, OnFoundData, FmFoundRecordInformation&, rInfo, void)
{
...
}

This function is called when a match is found.

The comment above the function SetFoundHandler() describes the idea of “found handlers”:

/** The found-handler gets in the 'found'-case a pointer on a FmFoundRecordInformation-structure
(which is only valid in the handler; so if one needs to memorize the data, don't copy the pointer but
the structure).
This handler MUST be set.
Furthermore, it should be considered, that during the handler the search-dialog is still modal.
*/
void SetFoundHandler(const Link<FmFoundRecordInformation&, void>& lnk)
{
...
}

In the above-mentioned file, brwctrlr.cxx, this is the start of the handler function:

Reference< css::sdbcx::XRowLocate > xCursor(getRowSet(), UNO_QUERY);

This "xCursor" is the form object. The file brwctrlr.cxx is only for grid (table) controls. For other controls, one should look into svx/source/form/fmshimp.cxx:1544:

SvxAbstractDialogFactory* pFact = SvxAbstractDialogFactory::Create();
ScopedVclPtr<AbstractFmSearchDialog> pDialog(
pFact->CreateFmSearchDialog(
m_pShell->GetViewShell()->GetViewFrame().GetFrameWeld(),
strInitialText, aContextNames, nInitialContext,
LINK(this, FmXFormShell, OnSearchContextRequest_Lock) ));
pDialog->SetActiveField( strActiveField );
pDialog->SetFoundHandler(LINK(this, FmXFormShell, OnFoundData_Lock));
pDialog->SetCanceledNotFoundHdl(LINK(this, FmXFormShell, OnCanceledNotFound_Lock));
pDialog->Execute();
pDialog.disposeAndClear();

The corresponding OnFoundData is at line 2150:

IMPL_LINK(FmXFormShell, OnFoundData_Lock, FmFoundRecordInformation&, rfriWhere, void)
{
    if (impl_checkDisposed_Lock())
        return;

    DBG_ASSERT((rfriWhere.nContext >= 0) && (o3tl::make_unsigned(rfriWhere.nContext) < m_aSearchForms.size()),
        "FmXFormShell::OnFoundData : invalid context!");
    Reference< XForm> xForm( m_aSearchForms.at(rfriWhere.nContext));
    DBG_ASSERT(xForm.is(), "FmXFormShell::OnFoundData : invalid form!");
...
}

And then we can use the form object to implement the required change to fulfill the request.

Possible Pitfalls

It is important not to cause trouble with the keys, both foreign keys and primary keys. The idea is to allow find and replace in primary and foreign keys, but then it is the role of the underlying database engine to decide whether the replacement is actually possible, and to raise an error if it is not.

Also, it is the responsibility of the users to make sure that the search and replace they issue is a meaningful one. In any case, the developer should handle the errors coming from the underlying database engine.
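The following Python sketch shows the kind of error handling meant here: the replacement is attempted, and if the engine rejects it, the change is rolled back and a message is returned. It assumes SQLite through the standard `sqlite3` module only as a stand-in engine, and the `items` table with its `code` column is a hypothetical example.

```python
import sqlite3


def replace_in_key(conn, old_key, new_key):
    """Try to change a primary key value; return an error message or None.

    The `items` table and its `code` column are hypothetical names.
    """
    try:
        with conn:  # rolls the UPDATE back if the engine rejects it
            conn.execute(
                "UPDATE items SET code = ? WHERE code = ?", (new_key, old_key)
            )
    except sqlite3.IntegrityError as exc:
        # The engine decided the replacement is not possible, for example
        # because the new value would duplicate an existing key.
        return f"Replacement rejected by the database: {exc}"
    return None
```

The dialog code does not need to know the key rules itself; it only has to report the engine's error to the user and leave the data untouched.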

Final Notes

To implement this feature, first you have to build LibreOffice from the sources. If you have not done that yet, please refer to this guide first:

Getting Started (Video Tutorial)

10 Aug 2023

Highlight the current row and column in Calc – difficulty interesting EasyHack

On large computer displays, it is somewhat hard to track the active cell and the associated row and column. One of the proposed solutions to this problem is to highlight the current row and column. The feature request is visible in tdf#33201.

30 Jul 2023

ccache for a 5 minutes LibreOffice build

If you have ever tried to build LibreOffice code, you know that it can take a lot of time. LibreOffice has ~6 million lines of C++ and some Java code (<280k). But, there are tools that can help you build LibreOffice from source code much faster, if you do it repeatedly! Here I discuss how to use one of these tools: “ccache”.