NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: port-xen/58561 (panic: kernel diagnostic assertion, "x86_read_psl() == 0" failed: file, "/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581)



On Sun, Jan 11, 2026 at 05:21:12PM +0000, Taylor R Campbell wrote:
> > Date: Sat, 10 Jan 2026 23:33:30 +0100
> > From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
> > 
> > And this assembly work may have allowed me to find the problem.
> > Basically, the CLI and STI macros are not atomic on Xen PV, and if preemption
> > happens at a bad time we may end up updating the upcall_mask of our previous
> > CPU. The attached patch should close this race condition; hopefully it's
> > the last one.
> 
> Nice!  I fixed some of this a few years ago but I must have neglected
> to search further for other uses like the assembly stubs using
> EVTCHN_UPCALL_MASK and EVTCHN_UPCALL_PENDING:
> 
> https://mail-index.netbsd.org/source-changes/2023/02/25/msg143510.html
> 
> There are still a lot of suspicious cases that turn up when I grep -i
> for evtchn_upcall:
> 
> - i386/spl.S -- might be fine because they're in the spllower and
>   doreti paths, so maybe preemption is already blocked by virtue of
>   having raised ipl, but needs another set of eyeballs to be sure:
> 
>    https://nxr.netbsd.org/xref/src/sys/arch/i386/i386/spl.S?r=1.58#218
>    https://nxr.netbsd.org/xref/src/sys/arch/i386/i386/spl.S?r=1.58#291
> 
> - i386 CLI/STI macros:
> 
>   https://nxr.netbsd.org/xref/src/sys/arch/i386/include/frameasm.h?r=1.35#12
> 
>   (Maybe we can just nix XEN_BLOCK/UNBLOCK_EVENTS and XEN_TEST_PENDING
>   and merge those into the CLI/STI macros?)

Yes, i386 needs a look too. Wlll do later.

> 
> - amd64 profile stubs:
> 
>   https://nxr.netbsd.org/xref/src/sys/arch/amd64/include/profile.h?r=1.21#82
> 
>   (Not sure why there's no obvious i386 analogue of that?  The i386
>   code has no XENPV conditionals.)

Maybe a Xen/i386 profiling kernel has never been tried. Or maybe Xen emulates
cli and related

> 
> - xen_init_amd64/i386_vcpuctxt -- guessing this runs before interrupts
>   or preemption can happen, but unclear, needs more eyeballs:
> 
>   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/cpu.c?r=1.145#850
>   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/cpu.c?r=1.145#947

Yes, this runs before interrupts are enabled.

> 
> - do_hypervisor_callback -- does this run with ipl raised so
>   preemption is not possible?  If so, can we assert the ipl is raised?
> 
>   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/hypervisor_machdep.c?r=1.46#301

It's called from interrupt context (with ci_idepth raised) so preemtion should
not be possible.

> 
> - hypervisor_send_event -- this is #if 0 code, can we delete it and
>   the one #if 0 caller in sys/arch/xen/xen/evtchn.c so we don't have
>   to think about it?  If not, I think we should either
>   KASSERT(kpreempt_disabled()) or assert the ipl is raised.
> 
>   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/hypervisor_machdep.c?r=1.46#336
>   https://nxr.netbsd.org/xref/src/sys/arch/xen/xen/evtchn.c?r=1.100#383

Sure, I deleted it.

> 
> - xen_debug_handler -- this runs with ipl raised, right?  Maybe not
>   worth asserting because this is a debug vector so dumping info is
>   better than crashing.
> 
>   https://nxr.netbsd.org/xref/src/sys/arch/xen/xen/evtchn.c?r=1.100#1142

That's the idea; and also it's called from interrupt context so no
preemtion possible

> 
> Also, I think we should either:
> 
> (a) delete the PUSHF/POPF macros altogether, if they're unused, or
> 
> (b) make sure they too block preemption in the critical path.

AFAIK they're unused, I will remove them.

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 ans d'experience feront toujours la difference
--


Home | Main Index | Thread Index | Old Index