NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: port-xen/58561 (panic: kernel diagnostic assertion, "x86_read_psl() == 0" failed: file, "/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581)



The following reply was made to PR port-xen/58561; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Cc: Konrad Schroder <perseant%hhhh.org@localhost>, gnats-bugs%NetBSD.org@localhost,
	port-xen-maintainer%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost,
	gnats-admin%NetBSD.org@localhost, cherry%NetBSD.org@localhost, port-xen%NetBSD.org@localhost
Subject: Re: port-xen/58561 (panic: kernel diagnostic assertion,
	"x86_read_psl() == 0" failed: file,
	"/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581)
Date: Sun, 11 Jan 2026 17:21:12 +0000

 > Date: Sat, 10 Jan 2026 23:33:30 +0100
 > From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
 >=20
 > And this assembly work may have allowed me to find the problem.
 > Basically, the CLI and STI macros are not atomic on Xen PV, and if preemp=
 tion
 > happens at a bad time we may end up updating the upcall_mask of our previ=
 ous
 > CPU. The attached patch should close this race condition; hopefully it's
 > the last one.
 
 Nice!  I fixed some of this a few years ago but I must have neglected
 to search further for other uses like the assembly stubs using
 EVTCHN_UPCALL_MASK and EVTCHN_UPCALL_PENDING:
 
 https://mail-index.netbsd.org/source-changes/2023/02/25/msg143510.html
 
 There are still a lot of suspicious cases that turn up when I grep -i
 for evtchn_upcall:
 
 - i386/spl.S -- might be fine because they're in the spllower and
   doreti paths, so maybe preemption is already blocked by virtue of
   having raised ipl, but needs another set of eyeballs to be sure:
 
    https://nxr.netbsd.org/xref/src/sys/arch/i386/i386/spl.S?r=3D1.58#218
    https://nxr.netbsd.org/xref/src/sys/arch/i386/i386/spl.S?r=3D1.58#291
 
 - i386 CLI/STI macros:
 
   https://nxr.netbsd.org/xref/src/sys/arch/i386/include/frameasm.h?r=3D1.35=
 #12
 
   (Maybe we can just nix XEN_BLOCK/UNBLOCK_EVENTS and XEN_TEST_PENDING
   and merge those into the CLI/STI macros?)
 
 - amd64 profile stubs:
 
   https://nxr.netbsd.org/xref/src/sys/arch/amd64/include/profile.h?r=3D1.21=
 #82
 
   (Not sure why there's no obvious i386 analogue of that?  The i386
   code has no XENPV conditionals.)
 
 - xen_init_amd64/i386_vcpuctxt -- guessing this runs before interrupts
   or preemption can happen, but unclear, needs more eyeballs:
 
   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/cpu.c?r=3D1.145#850
   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/cpu.c?r=3D1.145#947
 
 - do_hypervisor_callback -- does this run with ipl raised so
   preemption is not possible?  If so, can we assert the ipl is raised?
 
   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/hypervisor_machdep.c?r=
 =3D1.46#301
 
 - hypervisor_send_event -- this is #if 0 code, can we delete it and
   the one #if 0 caller in sys/arch/xen/xen/evtchn.c so we don't have
   to think about it?  If not, I think we should either
   KASSERT(kpreempt_disabled()) or assert the ipl is raised.
 
   https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/hypervisor_machdep.c?r=
 =3D1.46#336
   https://nxr.netbsd.org/xref/src/sys/arch/xen/xen/evtchn.c?r=3D1.100#383
 
 - xen_debug_handler -- this runs with ipl raised, right?  Maybe not
   worth asserting because this is a debug vector so dumping info is
   better than crashing.
 
   https://nxr.netbsd.org/xref/src/sys/arch/xen/xen/evtchn.c?r=3D1.100#1142
 
 Also, I think we should either:
 
 (a) delete the PUSHF/POPF macros altogether, if they're unused, or
 
 (b) make sure they too block preemption in the critical path.
 


Home | Main Index | Thread Index | Old Index