Debugging a futex crash August 15, 2023 on Rustam Kovhaev's blog

I was enjoying my quiet afternoon, when an interesting application core has been submitted to me for research.

And what made it interesting is that I did not have to load proprietary/private symbols for the application that crashed.

The first thing to check is whether the file is indeed a core file, and it turns out it is not.

rusty@nuc10:~/futex$ file veeamagent.0.crash 
veeamagent.0.crash: ASCII text, with very long lines (44565)

rusty@nuc10:~/futex$ head veeamagent.0.crash -n5
ProblemType: Crash
Architecture: amd64
Date: Thu Apr 27 19:26:47 2023
DistroRelease: Ubuntu 21.04
ExecutablePath: /opt/veeam/transport/VeeamAgentTmp_{24a4a58e-1080-4f6b-aa53-fd447b84ea47}/veeamagent

It is an apport report, let’s unpack it to get the core:

rusty@ubuntu22:~$ apport-unpack veeamagent.0.crash crash

rusty@ubuntu22:~$ file crash/CoreDump 
crash/CoreDump: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from 'veeamagent -g2500-3300 -i{24a4a58e-1080-4f6b-aa53-fd447b84ea47}  -lflush,/var/l', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/opt/veeam/transport/VeeamAgentTmp_{24a4a58e-1080-4f6b-aa53-fd447b84ea47}/veeamagent', platform: 'x86_64'

CoreDump is the file we are looking for, let’s load it in gdb.

rusty@nuc10:~/futex$ export DEBUGINFOD_URLS="https://debuginfod.ubuntu.com"
rusty@nuc10:~/futex$ gdb

>>> core-file crash/CoreDump 
Core was generated by `veeamagent -g2500-3300 -i{24a4a58e-1080-4f6b-aa53-fd447b84ea47}  -lflush,/var/l'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f7430470fbb in ?? ()
[Current thread is 1 (LWP 64283)]

>>> bt
#0  0x00007f7430470fbb in ?? ()
#1  0x0000000000000202 in ?? ()
#2  0x0000000000000000 in ?? ()

The core is from Ubuntu 21, and, since I am opening it on my Fedora machine, I need to set up a sysroot.

I need to pull the system libraries either from customer’s machine or set up an Ubuntu 21 machine in my lab.

I have one ready and here is what I downloaded from it:

rusty@nuc10:~/futex$ find sysroot/
sysroot/
sysroot/lib64
sysroot/lib64/ld-linux-x86-64.so.2
sysroot/lib
sysroot/lib/x86_64-linux-gnu
sysroot/lib/x86_64-linux-gnu/libc.so.6
sysroot/lib/x86_64-linux-gnu/librt.so.1
sysroot/lib/x86_64-linux-gnu/libdl.so.2
sysroot/lib/x86_64-linux-gnu/libm.so.6
sysroot/lib/x86_64-linux-gnu/libpthread.so.0
rusty@nuc10:~/futex$ 

Now, let’s get back to gdb, set sysroot, and examine the stack again:

>>> set sysroot /home/rusty/futex/sysroot/

>>> i sharedlibrary 
From                To                  Syms Read   Shared Object Library
0x00007f7430dc27a0  0x00007f7430dc5d10  Yes (*)     /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/librt.so.1
0x00007f7430dba220  0x00007f7430dbb1d9  Yes (*)     /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libdl.so.2
                                        No          /opt/veeam/transport/VeeamAgentTmp_{24a4a58e-1080-4f6b-aa53-fd447b84ea47}/libblkid.so.1
                                        No          /opt/veeam/transport/VeeamAgentTmp_{24a4a58e-1080-4f6b-aa53-fd447b84ea47}/libattr.so.1
                                        No          /opt/veeam/transport/VeeamAgentTmp_{24a4a58e-1080-4f6b-aa53-fd447b84ea47}/libacl.so.1
0x00007f743064d3c0  0x00007f74306f2358  Yes (*)     /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libm.so.6
0x00007f7430623a60  0x00007f7430632334  Yes (*)     /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libpthread.so.0
0x00007f7430456650  0x00007f74305beadd  Yes (*)     /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libc.so.6
0x00007f7430dd3090  0x00007f7430df9f36  Yes (*)     /home/rusty/futex/sysroot/lib64/ld-linux-x86-64.so.2
                                        No          /opt/veeam/transport/VeeamAgentTmp_{24a4a58e-1080-4f6b-aa53-fd447b84ea47}/libuuid.so.1
(*): Shared library is missing debugging information.
>>> 

>>> bt
#0  0x00007f7430470fbb in raise () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f7430456864 in abort () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f74304b949c in ?? () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f74304b97b0 in __libc_fatal () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libc.so.6
#4  0x00007f743062f01c in __lll_lock_wait () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007f7430627733 in pthread_mutex_lock () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libpthread.so.0
#6  0x0000000001acedc0 in ?? ()
#7  0x0000000000000000 in ?? ()
>>> 

Perfect, the system libraries have loaded just fine, and, at this point, I was about to go looking for the proprietary VeeamAgent symbols, when this stack trace caught my attention.

Why would an attempt to lock the mutex, eventually, call abort() function?

Why would this simple operation in one of the threads initiate a termination of the whole process?

It does not make sense.

The only way to find out is to check glibc source code because pthread library is part of glibc.

__lll_lock_wait() calls futex_wait()

void
__lll_lock_wait (int *futex, int private)
{
  if (atomic_load_relaxed (futex) == 2)
    goto futex;

  while (atomic_exchange_acquire (futex, 2) != 0)
    {
    futex:
      LIBC_PROBE (lll_lock_wait, 1, futex);
      futex_wait ((unsigned int *) futex, 2, private); /* Wait if *futex == 2.  */
    }
}
libc_hidden_def (__lll_lock_wait)

futex_wait() is always inlined, that is why it is not in the stack, and futex_wait() calls futex_fatal_error(), which subsequently calls abort()

static __always_inline int
futex_wait (unsigned int *futex_word, unsigned int expected, int private)
{
  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
  switch (err)
    {
    case 0:
    case -EAGAIN:
    case -EINTR:
      return -err;

    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
		     being normalized.  Must have been caused by a glibc or
		     application bug.  */
    case -ENOSYS: /* Must have been caused by a glibc bug.  */
    /* No other errors are documented at this time.  */
    default:
      futex_fatal_error ();
    }
}

After reading the comments, let’s check the easiest thing first, wrong alignment.

Back to gdb:

>>> frame 5
#5  0x00007f7430627733 in pthread_mutex_lock () from /home/rusty/futex/sysroot/lib/x86_64-linux-gnu/libpthread.so.0

>>> disassemble pthread_mutex_lock
...
   0x00007f7430627723 <+211>:	mov    %rdi,0x8(%rsp)
   0x00007f7430627728 <+216>:	and    $0x80,%esi
   0x00007f743062772e <+222>:	call   0x7f743062efc0 <__lll_lock_wait>
=> 0x00007f7430627733 <+227>:	mov    0x8(%rsp),%rdi
   0x00007f7430627738 <+232>:	jmp    0x7f7430627689 <pthread_mutex_lock+57>
...

>>> x/w ($rsp + 8)
0x7f7371ff1ce8:	0x08001ea1

Lo and behold! The mutex address 0x08001ea1 is indeed not aligned.

Let’s write a reproducer for this crash, just to confirm the hypothesis.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

pthread_mutex_t mutex __attribute__((section(".reproducer"))) __attribute__ ((aligned(1)));

int main(int argc, char **argv)
{
	printf("mutex %p\n", &mutex);
	pthread_mutex_lock(&mutex);
	sleep(1);
	pthread_mutex_unlock(&mutex);
	return 0;
}

Now, let’s compile it and test it, it should crash with the same stack trace as we have in our core.

rusty@nuc10:~/futex$ gcc -g futex0.c -lpthread  -Wl,--section-start=.reproducer=0x8001ea1
rusty@nuc10:~/futex$ ./a.out 
mutex 0x8001ea1
rusty@nuc10:~/futex$ 

And… It worked!

Why didn’t it crash?

Let’s trace that reproducer.

rusty@nuc10:~/futex$ sudo trace-cmd list -e syscalls:futex*
syscalls:sys_enter_futex
syscalls:sys_exit_futex
syscalls:sys_enter_futex_waitv
syscalls:sys_exit_futex_waitv

rusty@nuc10:~/futex$ sudo trace-cmd record -q -F -e sys_enter_fut* -e sys_exit_fut* ./a.out
mutex 0x8001ea1

rusty@nuc10:~/futex$ sudo trace-cmd report
version = 7
cpus=12
rusty@nuc10:~/futex$ 

And our reproducer did not call a single futex syscall.

Why? Well, it turns out that, locking an unlocked mutex is a purely user-space operation, no trip to the kernel.

This means that we need to modify our reproducer. The simplest way, without introducing another thread, would be to simply call pthread_mutex_lock() again. Normally this should deadlock your program, but we expect a crash just like in the original core.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

pthread_mutex_t mutex __attribute__((section(".reproducer"))) __attribute__ ((aligned(1)));

int main(int argc, char **argv)
{
	printf("mutex %p\n", &mutex);
	pthread_mutex_lock(&mutex);
	pthread_mutex_lock(&mutex);
}

Let’s run this:

rusty@nuc10:~/futex$ gcc -g futex0.c -lpthread  -Wl,--section-start=.reproducer=0x8001ea1
rusty@nuc10:~/futex$ ./a.out 
mutex 0x8001ea1
The futex facility returned an unexpected error code.
Aborted (core dumped)
rusty@nuc10:~/futex$ 

And there we have it, the same crash as in that core.

Now, to put the icing on the cake, let’s find the place in the kernel that returns that -EINVAL that causes glibc to call abort().

rusty@nuc10:~/code/kernel$ git grep -c "\-EINVAL" kernel/futex/
kernel/futex/core.c:1
kernel/futex/pi.c:4
kernel/futex/requeue.c:11
kernel/futex/syscalls.c:8
kernel/futex/waitwake.c:7
rusty@nuc10:~/code/kernel$ 

Quite a few places, let’s save time and try a different approach, we have a reproducer after all.

rusty@nuc10:~/futex$ sudo trace-cmd record -q -F -p function_graph ./a.out
mutex 0x8001ea1
The futex facility returned an unexpected error code.

rusty@nuc10:~/futex$ sudo trace-cmd report
...
           a.out-867356 [004] 112365.616487: funcgraph_entry:        0.168 us   |  exit_to_user_mode_prepare();
           a.out-867356 [004] 112365.616487: funcgraph_entry:                   |  __x64_sys_futex() {
           a.out-867356 [004] 112365.616488: funcgraph_entry:                   |    do_futex() {
           a.out-867356 [004] 112365.616488: funcgraph_entry:                   |      futex_wait() {
           a.out-867356 [004] 112365.616488: funcgraph_entry:        0.150 us   |        futex_setup_timer();
           a.out-867356 [004] 112365.616488: funcgraph_entry:                   |        futex_wait_setup() {
           a.out-867356 [004] 112365.616488: funcgraph_entry:        0.199 us   |          get_futex_key();
           a.out-867356 [004] 112365.616489: funcgraph_exit:         0.425 us   |        }       
           a.out-867356 [004] 112365.616489: funcgraph_exit:         0.954 us   |      }         
           a.out-867356 [004] 112365.616489: funcgraph_exit:         1.205 us   |    }         
           a.out-867356 [004] 112365.616489: funcgraph_exit:         1.581 us   |  }
...

get_futex_key() kernel function is the main suspect according to the ftrace, let’s check it:

         /*
          * The futex address must be "naturally" aligned.
          */
         key->both.offset = address % PAGE_SIZE;
         if (unlikely((address % sizeof(u32)) != 0))
                  return -EINVAL;

And there we have it again, the futex address must be aligned.

And with that, we can conclude our research and send this bug report to Veeam for fixing.

And we did not even need private/proprietary symbols for research, but Veeam will surely need those symbols to find the place where they allocate that mutex to fix the alignment.