NetBSD-Bugs archive
Re: kern/56828 (futex calls in Linux emulation sometimes hang)
The following reply was made to PR kern/56828; it has been noted by GNATS.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Hauke Fath <hf%spg.tu-darmstadt.de@localhost>
Cc: gnats-bugs%netbsd.org@localhost,
kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
Thomas Klausner <wiz%NetBSD.org@localhost>,
Chuck Silvers <chs%NetBSD.org@localhost>, Jason Thorpe <thorpej%NetBSD.org@localhost>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Sun, 21 Dec 2025 02:18:10 +0000
> Date: Sun, 21 Dec 2025 01:34:01 +0000
> From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
>
> Here's what the failing test test_wake02 does, in the main thread:
>
> 1. pthread_create a bunch of children that do futex_wait(0x1234)
> 2. wait a moment and do futex_wake(0x1234, 1) to wake one child
> 3. wait a moment and do futex_wake(0x1234, 2) to wake two children
> and so on, until all the children are woken, verifying that futex_wake
> woke the correct number each time (it returns the number of wakeups).
> [...]
> But I tried instrumenting uvm_voaddr_acquire to print the arguments,
> and...they're the _same arguments_ in the parent and child! Same
> struct vm_map pointer, same va. So how could uvm_voaddr_acquire be
> returning a different object in different (compat_linux) threads??
I instrumented some more and determined that:
1. uvm_voaddr_acquire arguments are the same in parent and children
2. uvm_map_lookup_entry arguments _and result_ (kva and content of the
struct vm_map_entry) are the same in parent and children
3. entry->aref.ar_map->am_slots[0], at the same _kernel_ virtual
address when the children do it and when the parent does it, holds
a different struct vm_anon pointer in the children (the same one
for all children) than in the parent
So something is overwriting entry->aref.ar_map->am_slots[0] between
the time the children all run and the time the parent runs.
One difference might be that the children all read the futex word
first before doing any futex syscall (futex_wait), while the parent
does the futex syscall (futex_wake) first without reading the futex
word.
So the children might take a different path from the parent in the
first access of the futex word's virtual address -- and that might
lead them to resolve it one way, via uvm_fault through normal access,
while the parent opts to resolve it another way, via
uvm_voaddr_acquire through futex syscall.
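
A rough sketch of the two access patterns, as they would look from a
Linux binary. This is not taken from the test source: the helper
names (futex_wait, futex_wake, fresh_futex_word, read_then_wait,
wake_untouched) are made up for illustration, and the sketch only
shows the order of operations, it does not reproduce the hang.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrappers around the raw Linux futex syscall. */
static long
futex_wait(uint32_t *word, uint32_t expected)
{
	return syscall(SYS_futex, word, FUTEX_WAIT, expected, NULL, NULL, 0);
}

static long
futex_wake(uint32_t *word, int nwake)
{
	return syscall(SYS_futex, word, FUTEX_WAKE, nwake, NULL, NULL, 0);
}

/* Map one fresh anonymous page that nothing has touched yet. */
static uint32_t *
fresh_futex_word(void)
{
	void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
	    PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	assert(p != MAP_FAILED);
	return p;
}

/*
 * Child-like path: an ordinary load touches the futex word first, so
 * the address is resolved by the normal page-fault path before the
 * kernel ever sees it in a futex call.
 */
static long
read_then_wait(uint32_t *word)
{
	uint32_t seen = *(volatile uint32_t *)word;

	/*
	 * Deliberately pass a mismatched expected value so that the
	 * call fails with EAGAIN instead of blocking.
	 */
	return futex_wait(word, seen + 1);
}

/*
 * Parent-like path: the futex syscall is the first thing to reference
 * the address, so the kernel has to resolve it on its own.
 */
static long
wake_untouched(uint32_t *word)
{
	return futex_wake(word, 1);
}
```

In the real test the children and the parent share a single futex
word; the sketch uses a separate fresh page per path only to keep the
"first access" in each case unambiguous.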
I bet there's a bug in the uvm_voaddr_acquire logic which we're only
hitting in compat_linux for some reason.