NetBSD-Bugs archive


Re: port-xen/58561 (panic: kernel diagnostic assertion, "x86_read_psl() == 0" failed: file, "/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581)



The following reply was made to PR port-xen/58561; it has been noted by GNATS.

From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: Konrad Schroder <perseant%hhhh.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, port-xen-maintainer%netbsd.org@localhost,
        netbsd-bugs%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, riastradh%NetBSD.org@localhost,
        campbell+netbsd%mumble.net@localhost, cherry%NetBSD.org@localhost
Subject: Re: port-xen/58561 (panic: kernel diagnostic assertion,
 "x86_read_psl() == 0" failed: file,
 "/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581)
Date: Sat, 10 Jan 2026 23:33:30 +0100

 --DyYqzEZ4vXS0XQ4C
 Content-Type: text/plain; charset=iso-8859-1
 Content-Disposition: inline
 Content-Transfer-Encoding: 8bit
 
 On Sat, Jan 10, 2026 at 10:01:05PM +0100, Manuel Bouyer wrote:
 > On Sat, Jan 10, 2026 at 11:30:27AM -0800, Konrad Schroder wrote:
 > > On 1/10/2026 4:55 AM, Manuel Bouyer wrote:
 > > > Hello,
 > > > can you try with the attached patch? It won't fix the problem but
 > > > should let us know if syscall() is already called with interrupts disabled,
 > > > or if they're disabled later.
 > > 
 > > Thanks!  Unfortunately that blows up almost instantly:
 > > [...]
 > 
 > 
 > Sorry; I should have checked more carefully. As there's a
 > call _C_LABEL(do_pmap_load)
 > early, I assumed it was safe to call C functions here, but do_pmap_load()
 > is in fact written in assembly.
 > Here's an updated patch which uses only assembly in copy.S.
 > It also adds the check to all copy* functions, not only copyout.
 > It boots multiuser on my test system.
 
 And this assembly work may have let me find the problem.
 Basically, the CLI and STI macros are not atomic on Xen PV: they first load
 the vcpu pointer, then write upcall_mask through it. If preemption happens
 between these two instructions, we may end up updating the upcall_mask of
 our previous CPU. The attached patch should close this race condition;
 hopefully it's the last one.
 
 -- 
 Manuel Bouyer <bouyer%antioche.eu.org@localhost>
      NetBSD: 26 years of experience will always make the difference
 --
 
 --DyYqzEZ4vXS0XQ4C
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=diff2
 
 Index: sys/arch/amd64/include/frameasm.h
 ===================================================================
 RCS file: /cvsroot/src/sys/arch/amd64/include/frameasm.h,v
 retrieving revision 1.55
 diff -u -p -u -r1.55 frameasm.h
 --- sys/arch/amd64/include/frameasm.h	30 Jul 2022 14:11:00 -0000	1.55
 +++ sys/arch/amd64/include/frameasm.h	10 Jan 2026 22:30:12 -0000
 @@ -24,13 +24,22 @@
  #define	NOT_XEN(x)
  
  #define CLI(temp_reg) \
 + 	movq CPUVAR(CURLWP),%r ## temp_reg ;			\
 +	incl L_NOPREEMPT(%r ## temp_reg);			\
   	movq CPUVAR(VCPU),%r ## temp_reg ;			\
 -	movb $1,EVTCHN_UPCALL_MASK(%r ## temp_reg);
 +	movb $1,EVTCHN_UPCALL_MASK(%r ## temp_reg);		\
 + 	movq CPUVAR(CURLWP),%r ## temp_reg ;			\
 +	decl L_NOPREEMPT(%r ## temp_reg);			\
  
  #define STI(temp_reg) \
 + 	movq CPUVAR(CURLWP),%r ## temp_reg ;			\
 +	incl L_NOPREEMPT(%r ## temp_reg);			\
   	movq CPUVAR(VCPU),%r ## temp_reg ;			\
 -	movb $0,EVTCHN_UPCALL_MASK(%r ## temp_reg);
 +	movb $0,EVTCHN_UPCALL_MASK(%r ## temp_reg);		\
 + 	movq CPUVAR(CURLWP),%r ## temp_reg ;			\
 +	decl L_NOPREEMPT(%r ## temp_reg);			\
  
 +#if 0 
  #define PUSHF(temp_reg) \
   	movq CPUVAR(VCPU),%r ## temp_reg ;			\
  	movzbl EVTCHN_UPCALL_MASK(%r ## temp_reg), %e ## temp_reg; \
 @@ -39,6 +48,7 @@
  #define POPF \
  	popq %rdi; \
  	call _C_LABEL(xen_write_psl)
 +#endif
 
 --DyYqzEZ4vXS0XQ4C--
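
The race described in the message can be illustrated with a small user-space model in C. This is only a sketch under stated assumptions: `vcpu_model`, `cpu_model`, `lwp_model`, `try_migrate`, `cli_model` and `masked_on_current_cpu` are hypothetical names invented for illustration, not kernel interfaces, and preemption is simulated by an explicit call placed between the pointer load and the store. The `nopreempt` counter stands in for the `L_NOPREEMPT` field that the patch increments and decrements around the sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Xen per-vCPU info and the per-CPU/LWP state. */
struct vcpu_model { unsigned char upcall_mask; };
struct cpu_model  { struct vcpu_model vcpu; };
struct lwp_model  { struct cpu_model *cpu; int nopreempt; };

/*
 * Model of a preemption point: the scheduler wants to move the LWP to
 * `target`. It does so only if the LWP has not asked to stay put.
 * Returns true if the migration happened.
 */
static bool try_migrate(struct lwp_model *l, struct cpu_model *target)
{
    assert(l != NULL && target != NULL);
    if (l->nopreempt > 0)
        return false;
    l->cpu = target;
    return true;
}

/*
 * Model of CLI. Step 1 loads the vCPU pointer (movq CPUVAR(VCPU),%reg);
 * step 2 stores the mask through it (movb $1,EVTCHN_UPCALL_MASK(%reg)).
 * A preemption attempt is injected between the two steps. With `guarded`
 * set, the sequence is bracketed by nopreempt++ / nopreempt--, mirroring
 * the incl/decl pair in the patch.
 */
static void cli_model(struct lwp_model *l, struct cpu_model *other, bool guarded)
{
    if (guarded)
        l->nopreempt++;
    struct vcpu_model *v = &l->cpu->vcpu;   /* step 1: load pointer */
    (void)try_migrate(l, other);            /* worst-case preemption */
    v->upcall_mask = 1;                     /* step 2: store via pointer */
    if (guarded)
        l->nopreempt--;
}

/* Returns true if, after CLI, events are masked on the CPU the LWP runs on. */
static bool masked_on_current_cpu(bool guarded)
{
    struct cpu_model cpu0 = { { 0 } }, cpu1 = { { 0 } };
    struct lwp_model l = { &cpu0, 0 };

    cli_model(&l, &cpu1, guarded);
    return l.cpu->vcpu.upcall_mask == 1;
}
```

In this model, `masked_on_current_cpu(false)` returns false: the mask is written to the first CPU's vCPU while the LWP now runs on the second one, which is the stale update the message describes. `masked_on_current_cpu(true)` returns true because the migration is refused while the counter is nonzero. The model does not cover what happens to a preemption that was deferred while the counter was raised.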
 

