Telemetry required? Ask users first!

In this article, I discuss a recent problem with compiling LibreOffice using Microsoft Visual Studio: what I did to debug it, the root cause (which turned out to be Microsoft’s telemetry), and how I fixed it.

Describing The Problem

Recently, I encountered a problem when configuring LibreOffice’s source code before compilation. Sometimes, random error messages appeared without any details about the cause. The title, “powershell.exe”, was also strange, as I wasn’t using PowerShell directly.

Powershell Error

At first, I ignored the message, but then it became more common, and at some point the configuration was aborted. I ignored that too for a while, but after a few days, one of the mentees reported a somewhat similar problem.

The error was that the UCRT (the Universal C Runtime, Microsoft Visual C++’s standard C library) was not found. This is the error log:

$ ./autogen.sh
.
.
.
checking for Windows SDK... found Windows SDK 10.0 (/cygdrive/c/PROGRA~2/WI3CF2~1/10)
checking for midl.exe... C:\Program Files (x86)\Windows Kits\10\/Bin/10.0.20348.0/x64/midl.exe
checking for csc.exe... C:\Windows\Microsoft.NET\Framework\v4.0.30319\/csc.exe
checking for al.exe... C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\/al.exe
checking .NET Framework... found: C:/PROGRA~2/WI3CF2~1/NETFXSDK/4.8/
checking whether jumbo sheets are supported... yes
checking whether to enable runtime optimizations... yes
checking for valgrind/valgrind.h... no
checking for sys/sdt.h... no
checking what the C++ library is... configure: error: Could not figure out what C++ library this is
Error running configure at ./autogen.sh line 321.

Checking the Error Logs

The important log that contains the output of the configuration is the config.log file. In this file, I could see these related lines:

...
configure:19511: result: no
configure:20052: checking what the C++ library is
configure:20078: C:/PROGRA~1/MIB055~1/2022/COMMUN~1/VC/Tools/MSVC/1430~1.307/bin/Hostx64/x64/cl.exe -c  -IC:/PROGRA~2/WI3CF2~1/10/Include/ucrt  -IC:/PROGRA~2/WI3CF2~1/10/Include/ucrt -IC:/PROGRA~1/MIB055~1/2022/COMMUN~1/VC/Tools/MSVC/1430~1.307/Include conftest.cpp >&5
conftest.cpp
C:/PROGRA~1/MIB055~1/2022/COMMUN~1/VC/Tools/MSVC/1430~1.307/Include\cstddef(12): fatal error C1083: Cannot open include file: 'stddef.h': No such file or directory
Microsoft (R) C/C++ Optimizing Compiler Version 19.30.30711.2 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.
...
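
To jump straight to a failing check in config.log, grep with trailing context is one option. This is only a minimal sketch: the helper name show_configure_check and its arguments are made up for illustration and are not part of the LibreOffice build scripts.

```shell
# Hypothetical helper: show a configure check and the lines that follow it.
# $1: log file, $2: text of the check, $3: number of trailing lines (default 5).
show_configure_check() {
    # -F: treat the text literally, -n: print line numbers,
    # -A: also print the given number of lines after each match.
    grep -F -n -A "${3:-5}" -- "$2" "$1"
}

# Example:
#   show_configure_check config.log 'checking what the C++ library is'
```

The -F flag matters here because the search text contains "C++", which should be matched literally rather than as part of a pattern.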

The strange thing was that I could run the configuration successfully in another Cygwin terminal with slightly different settings. To find the differences, I used the export command to dump the environment variables in the two terminals, and compared them using diff.
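
The comparison can be sketched roughly as follows. The function name compare_envs and the file names are placeholders chosen for this example; the only real commands involved are export -p, sort, and diff.

```shell
# Hypothetical helper: print the differences between two saved environments.
# Each input file is the output of `export -p` captured in one terminal.
compare_envs() {
    cmp_good=$(mktemp) || return 2
    cmp_bad=$(mktemp) || return 2
    # Sort both dumps so that diff reports real differences, not ordering.
    sort "$1" > "$cmp_good"
    sort "$2" > "$cmp_bad"
    diff "$cmp_good" "$cmp_bad"
    cmp_status=$?
    rm -f "$cmp_good" "$cmp_bad"
    # 0 means identical, 1 means differences were found (same as diff).
    return "$cmp_status"
}

# In the terminal where configure works:
#   export -p > /tmp/env-good.txt
# In the terminal where configure fails:
#   export -p > /tmp/env-bad.txt
# Then, in either terminal:
#   compare_envs /tmp/env-good.txt /tmp/env-bad.txt
```

Lines prefixed with "<" exist only in the first dump and lines prefixed with ">" only in the second, so a variable set in just one terminal stands out immediately.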

Then, I found that I could work around the problem by setting the following environment variable, which was set in one of the terminals:

export CYGWIN="disable_pcon"
https://cygwin.com/cygwin-ug-net/using-cygwinenv.html

Unfortunately, this workaround did not help our mentee, who had the same problem. I also knew that this approach may degrade performance.

Looking Further Into the Details

I tried to look further into the details of configure.ac, and debug it to understand the root cause of the problem. At first, I hard-coded the UCRT version in configure.ac, and the configuration actually worked! If you take a look at the find_ucrt() function, the relevant part is:

PathFormat "$(win_get_env_from_vsdevcmdbat UniversalCRTSdkDir)"
UCRTSDKDIR=$formatted_path
UCRTVERSION=$(win_get_env_from_vsdevcmdbat UCRTVersion)

Setting UCRTSDKDIR and UCRTVERSION to values from a good build fixed the problem: configuration and make went smoothly, and finished successfully.

Then, I looked into the win_get_env_from_vsdevcmdbat() function. As the name implies, it runs VsDevCmd.bat and returns the value of an environment variable that the script sets; here, the two variables are UniversalCRTSdkDir and UCRTVersion.

This function creates a batch file in the temporary folder, runs it to get the output, and then removes it. So, I removed the cleanup step and kept the generated batch files.
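
To make the mechanism concrete, here is a simplified sketch of such a wrapper with the removal step left out. It is not the actual code from configure.ac: the function name write_env_probe_bat, the file name probe.bat, and the exact batch lines are assumptions made for illustration.

```shell
# Hypothetical sketch, NOT the real win_get_env_from_vsdevcmdbat().
# Writes a wrapper batch file that calls VsDevCmd.bat and echoes one variable.
# $1: Windows-style path to VsDevCmd.bat, $2: name of the variable to print.
write_env_probe_bat() {
    probe_dir=$(mktemp -d) || return 1
    probe_file="$probe_dir/probe.bat"
    # Batch files conventionally use CRLF line endings.
    printf '@call "%s"\r\n' "$1" > "$probe_file"
    printf '@echo %%%s%%\r\n' "$2" >> "$probe_file"
    # Deliberately no cleanup here: the file is kept for later inspection.
    printf '%s\n' "$probe_file"
}

# Example (the path to VsDevCmd.bat is a placeholder):
#   write_env_probe_bat 'C:\path\to\VsDevCmd.bat' UCRTVersion
```

On Cygwin, the kept file could then presumably be run by hand with cmd.exe /c after converting its path with cygpath -w, which makes it possible to repeat one step of the configuration in isolation.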

I was skeptical about the commands that were processing the outputs of the batch files, so I tried to change them a little, but that didn’t help. The nice thing was that each batch file worked fine on its own: I ran them several times, and there was no problem! Then I decided to run them immediately one after another, and saw that sometimes there was no output.
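
An intermittent failure like this is easier to catch with a loop than by hand. The following is a minimal sketch; count_empty_runs is a made-up helper, and the command passed to it stands in for whatever runs the saved batch file.

```shell
# Hypothetical helper: run a command several times back to back and count
# how many of the runs produced no output at all.
# $1: number of runs, remaining arguments: the command to run.
count_empty_runs() {
    runs=$1
    shift
    empty=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout only; a failing command counts as empty output.
        output=$("$@" 2>/dev/null) || true
        if [ -z "$output" ]; then
            empty=$((empty + 1))
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$empty"
}

# Example (the batch file path is a placeholder):
#   count_empty_runs 20 cmd.exe /c 'C:\path\to\probe.bat'
```

A result of 0 suggests the command is stable, while any other number shows how often the output went missing when the runs followed each other closely.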

Finding the Root Cause

At this point, I was almost certain that the problem was in VsDevCmd.bat itself, but I didn’t know why, or how to fix it. So, I took a look into the script, and guess what: the problem was the telemetry! If the variable VSCMD_SKIP_SENDTELEMETRY is not set, the script launches PowerShell to send data to Microsoft! That was the source of the problem. This is the relevant part of the code:

@REM Send Telemetry if user's VS is opted-in
if "%VSCMD_SKIP_SENDTELEMETRY%"=="" (
    if "%VSCMD_DEBUG%" NEQ "" (
        @echo [DEBUG:%~nx0] Sending telemetry
        powershell.exe -NoProfile -Command "& {Import-Module '%~dp0\Microsoft.VisualStudio.DevShell.dll'; Send-VsDevShellTelemetry -NewInstanceType Cmd;}"
    ) else (
        START "" /B powershell.exe -NoProfile -Command "& {if($PSVersionTable.PSVersion.Major -ge 3){Import-Module '%~dp0\Microsoft.VisualStudio.DevShell.dll'; Send-VsDevShellTelemetry -NewInstanceType Cmd; }}" > NUL
    )
)

To fix that, I used the value 1 for the variable to opt out of telemetry:

set VSCMD_SKIP_SENDTELEMETRY=1
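
The line above uses batch syntax. When working from a Cygwin shell, a rough equivalent is to export the variable before starting the configuration, on the assumption that processes started from that shell (including cmd.exe) inherit its environment. This is a sketch of a manual workaround, not necessarily how the change was implemented in LibreOffice.

```shell
# Opt out of the VsDevCmd.bat telemetry for everything started from this shell.
export VSCMD_SKIP_SENDTELEMETRY=1

# Check that a child process really sees the variable.
sh -c 'echo "VSCMD_SKIP_SENDTELEMETRY=$VSCMD_SKIP_SENDTELEMETRY"'
# prints: VSCMD_SKIP_SENDTELEMETRY=1

# Then run the configuration as usual:
#   ./autogen.sh
```

Because the check in VsDevCmd.bat only tests whether the variable is empty, any non-empty value disables the telemetry call.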

This change is now merged into the LibreOffice code, so the problem should be fixed by now.

Best Practices for Doing Telemetry

It took a lot of time to debug and find the root cause of the problem. I think the best way to avoid causing such problems for users of Visual Studio would be to ask for their consent before activating the telemetry.

I agree that there are legitimate or justifiable reasons to do telemetry, but getting the users’ consent is very important before sending data back to the corporate servers.

In LibreOffice, we consider users the top priority, and we are bound to the best practice of: “Telemetry required? Ask users first”, and we ask others to do the same.
