Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

0 votes

0 answers

48 views

Linux kernel locking documentation. PREEMPT_RT caveats section question

In the start of this section we see next mentions: > local_lock on RT > > The mapping of local_lock to spinlock_t on PREEMPT_RT kernels has a > few implications. For example, on a non-PREEMPT_RT kernel the > following code sequence works as expected: > > local_lock_irq(&local_lock); > raw_spin_lock(...

                                  In the start of this section we see next mentions:

> local_lock on RT
> 
> The mapping of local_lock to spinlock_t on PREEMPT_RT kernels has a
> few implications. For example, on a non-PREEMPT_RT kernel the
> following code sequence works as expected:
> 
>     local_lock_irq(&local_lock);
>     raw_spin_lock(&lock);
> 
> and is fully equivalent to:
> 
>     raw_spin_lock_irq(&lock);
> 
> On a PREEMPT_RT kernel this code sequence breaks because
> local_lock_irq() is mapped to a per-CPU spinlock_t which neither
> disables interrupts nor preemption.

It is okay. First code example try lock irqs and hold raw spin lock and on non-PREEMPT_RT kernel it is equals to second section code. But, after last code the mention about PREEMPT_RT kernels says that local_lock_irq() does not work on PREEMPT_RT kernels, so on these kernels we cant use first code example instead second. So the continue of documentation should say something like next thing: "On PREEMPT_RT kernel you should use second code example strictly". BUT, documentation says next thing:

> The following code sequence works perfectly correct on both PREEMPT_RT
> and non-PREEMPT_RT kernels:
> 
> local_lock_irq(&local_lock); spin_lock(&lock);

What? On PREEMPT_RT kernel: typedef spinlock_t local_lock_t, and local_lock_irq mapped to __local_lock which does next things: disables migration and calls spin_lock with lock variable pointer of this cpu. After that we call spin_lock (again) with another lock variable. Why we should do this? I dont understand what does author want say me. Thanks.

**UPD:**

If no answer, then I will try. If lock and raw_lock same on non-PREEMPT_RT kernel then I think autor want to say that dont need use raw on PREEMPT_RT as it is extra whole locking for them, just use simple lock instead and all works as need in most cases. lock irq will disables migrate and holds this cpu lock variable, then spin_lock for general lock and we get ordered flow of execution. Correct me if I'm wrong.
                                

nx4n (111 rep)

Nov 10, 2024, 12:16 PM • Last activity: Nov 12, 2024, 04:59 AM

0 votes

0 answers

65 views

Need help to understand a kernel "deadlock" in armhf kernel 5.4

kernel deadlock spinlock

I'm struggling to understand a condition that occurs rarely (MTBF measured in weeks to months), but results in a fatal kernel lockup when it does. The environment is kernel 5.4.0, running on a dual-core ARM A9 32-bit (armhf) processor in an AMD/Xilinx Zynq 7020 SoC. I've been able to examine stack t...

                                  I'm struggling to understand a condition that occurs rarely (MTBF measured in weeks to months), but results in a fatal kernel lockup when it does.

The environment is kernel 5.4.0, running on a dual-core ARM A9 32-bit (armhf) processor in an AMD/Xilinx Zynq 7020 SoC.
I've been able to examine stack traces on two failed devices using the Xilinx Vitis system debugger. In both cases, both cores were waiting for (attempting to lock) the same spinlock. The spinlock is the pi_lock member of the struct task_struct structure associated with a threaded IRQ handler.

Core #0 is running the idle process, responding to an interrupt, attempting to wake up the IRQ handler thread. Core #1 is reading the /proc//stat sysfs pseudofile associated with the IRQ handler thread (while reading all of the sysfs pseudofiles of the form /proc//stat on behalf of pidof, probably at the request of sessionclean, invoked by apache2). 

The interrupt being handled on one device is for the BQ25890 battery charger (asserted ~zero to 10 times per second, and on the other device, for a custom kernel module (~100/sec.). Both interrupts use the default primary IRQ handler.

The current hypothesis is that that core #0 got stuck first, followed some time later by core #1, rather than both occurring simultaneously, but this is speculative.

Stack traces for the two cores and threaded IRQ handler source for one of the devices are below.

**Questions**

Is there enough information in the stack traces to provide a clue as to what went wrong? If so, what went wrong? If not, what additional information is needed? One of the devices is still available, and some (not all) variables can be interrogated.
Is the fact that interrupts are disabled (cpsr.i == 1 for both cores) significant?

Please note that because of the extremely long time between failures, black-box testing techniques - that is, making changes in order to observe the result - are not practical.

**ARM Cortex-A9 MPCore #0 (Suspended)**

    0x80165d68: ./include/linux/compiler.h, line 199
    0x80165d68 arch_spin_lock(): .../arch/arm/include/asm/spinlock.h, line 75
    0x8076f774: ./include/linux/spinlock.h, line 193
    0x8076f774: ...include/linux/spinlock_api_smp.h, line 119
    0x8076f774 _raw_spin_lock_irqsave(): kernel/locking/spinlock.c, line 159
    0x8014a394 try_to_wake_up(): kernel/sched/core.c, line 2551
    0x8014a724 wake_up_process(): kernel/sched/core.c, line 2667
    0x8016f600 __irq_wake_thread(): kernel/irq/handle.c, line 134
    0x8016f820 __handle_irq_event_percpu(): kernel/irq/handle.c, line 167
    0x8016f888 handle_irq_event_percpu(): kernel/irq/handle.c, line 189
    0x8016f924 handle_irq_event(): kernel/irq/handle.c, line 206
    0x80174230 handle_level_irq(): kernel/irq/chip.c, line 650
    0x8016e808: ./include/linux/irqdesc.h, line 156
    0x8016e808 generic_handle_irq(): kernel/irq/irqdesc.c, line 644
    0x80447a20: drivers/gpio/gpio-zynq.c, line 632
    0x80447a20 zynq_gpio_irqhandler(): drivers/gpio/gpio-zynq.c, line 661
    0x8016e808: ./include/linux/irqdesc.h, line 156
    0x8016e808 generic_handle_irq(): kernel/irq/irqdesc.c, line 644
    0x8016eec4 __handle_domain_irq(): kernel/irq/irqdesc.c, line 681
    0x80102300: ./include/linux/irqdesc.h, line 174
    0x80102300 gic_handle_irq(): drivers/irqchip/irq-gic.c, line 383
    0x80101a70 __irq_svc(): arch/arm/kernel/entry-armv.S, line 212
    0x805827c0 cpuidle_enter_state(): .../arch/arm/include/asm/irqflags.h, line 39
    0x805829d8 cpuidle_enter(): drivers/cpuidle/cpuidle.c, line 344
    0x8014ecf0: kernel/sched/idle.c, line 117
    0x8014ecf0: kernel/sched/idle.c, line 201
    0x8014ecf0 do_idle(): kernel/sched/idle.c, line 263
    0x8014eeac cpu_startup_entry(): kernel/sched/idle.c, line 355
    0x80769828 rest_init(): init/main.c, line 451
    0x80c00b30 arch_call_rest_init(): init/main.c, line 572
    0x80c00f78 start_kernel(): init/main.c, line 784
    0x00000000
    0x00000000

**The offending lock**

    lock	arch_spinlock_t *	{{slock=0x94449440, tickets={owner=0x9440, next=0x9444}}}	
    	slock	u32	0x94449440	
    	tickets	struct __raw_tickets	{owner=0x9440, next=0x9444}	
    		owner	u16	0x9440	
    		next	u16	0x9444	

**Threaded IRQ handler source**

Note that this code does not appear in the call stacks, but is the code for the thread that core #0 is attempting to wake up.

On one of the devices:

    static irqreturn_t bq25890_irq_handler_thread(int irq, void *private)
    {
    	struct bq25890_device *bq = private;
    	int ret;
    	struct bq25890_state state;
    
    	ret = bq25890_get_chip_state(bq, &state);
    	if (ret lock);
    	bq->state = state;
    	mutex_unlock(&bq->lock);
    
    	power_supply_changed(bq->charger);
    
    handled:
    	return IRQ_HANDLED;
    }

On the other device:

    static irqreturn_t core100_irq_handler( int irq, void* data )
    {
    	struct core100_drvdata* p_info;
    	struct core100_block_header* next;
    	unsigned long iflags;
    	p_info = ( struct core100_drvdata* ) data;
    	if ( ( p_info->state != state_reading ) && ( p_info->state != state_writing ) )
    	{
    		return IRQ_HANDLED;
    	}
    	next = &p_info->current_block->next->header;
    	next->count_at_pps = read_register( p_info->count_pps_register );
    	next->frc_at_pps = read_register( p_info->frc_pps_register );
    	next->count = adc_counter();
    	next->frc = free_running_counter();
    	spin_lock_irqsave( &p_info->state_lock, iflags );
    	p_info->dma_head += bytes_per_block( p_info );
    	if ( p_info->dma_head - p_info->dma_tail > bytes_per_buffer( p_info ) )
    	{
    		p_info->dma_tail = p_info->dma_head - bytes_per_buffer( p_info );
    	}
    	p_info->current_block = p_info->current_block->next;
    	wake_up_interruptible( &p_info->wq );
    	spin_unlock_irqrestore( &p_info->state_lock, iflags );
    	return IRQ_HANDLED;
    }

Kernel source: 
[struct task_struct]


**ARM Cortex-A9 MPCore #1 (Suspended)**

    0x80165d68: ./include/linux/compiler.h, line 199
    0x80165d68 arch_spin_lock(): .../arch/arm/include/asm/spinlock.h, line 75
    0x8076f774: ./include/linux/spinlock.h, line 193
    0x8076f774: ...include/linux/spinlock_api_smp.h, line 119
    0x8076f774 _raw_spin_lock_irqsave(): kernel/locking/spinlock.c, line 159
    0x80148914 task_rq_lock(): kernel/sched/core.c, line 109
    0x8014e22c: kernel/sched/cputime.c, line 281
    0x8014e22c thread_group_cputime(): kernel/sched/cputime.c, line 326
    0x8014e650 thread_group_cputime_adjusted(): kernel/sched/cputime.c, line 678
    0x802ccb5c do_task_stat(): fs/proc/array.c, line 510
    0x802cdaf4 proc_tgid_stat(): fs/proc/array.c, line 632
    0x802c7c0c proc_single_show(): fs/proc/base.c, line 756
    0x80281b18 seq_read(): fs/seq_file.c, line 229
    0x8025dd1c __vfs_read(): fs/read_write.c, line 425
    0x8025de60 vfs_read(): fs/read_write.c, line 461
    0x8025e088 ksys_read(): fs/read_write.c, line 587
    0x8025e0ec: fs/read_write.c, line 597
    0x8025e0ec __se_sys_read(): fs/read_write.c, line 595
    0x80101000 __idmap_text_end()
    0x76e627e6
    0x76edc616

Here's an alternative view, showing the calls stacks associated with their calling processes. "Swapper" is an archaic term for the idle process:

    Kernel	
    	0 swapper/0        (Suspended), ARM Cortex-A9 MPCore #0	
    		0x80165d68: ./include/linux/compiler.h, line 199	
    		0x80165d68 arch_spin_lock(): .../arch/arm/include/asm/spinlock.h, line 75	
    		0x8076f774: ./include/linux/spinlock.h, line 193	
    		0x8076f774: ...include/linux/spinlock_api_smp.h, line 119	
    		0x8076f774 _raw_spin_lock_irqsave(): kernel/locking/spinlock.c, line 159	
    
    11950 pidof	
    	11950 (Suspended), ARM Cortex-A9 MPCore #1	
    		0x80165d68: ./include/linux/compiler.h, line 199	
    		0x80165d68 arch_spin_lock(): .../arch/arm/include/asm/spinlock.h, line 75	
    		0x8076f774: ./include/linux/spinlock.h, line 193	
    		0x8076f774: ...include/linux/spinlock_api_smp.h, line 119	
    		0x8076f774 _raw_spin_lock_irqsave(): kernel/locking/spinlock.c, line 159	

**RCU Configuration**

    # RCU Subsystem
    CONFIG_PREEMPT_RCU=y
    # CONFIG_RCU_EXPERT is not set
    CONFIG_SRCU=y
    CONFIG_TREE_SRCU=y
    CONFIG_TASKS_RCU=y
    CONFIG_RCU_STALL_COMMON=y
    CONFIG_RCU_NEED_SEGCBLIST=y
    # end of RCU Subsystem
    # RCU Debugging
    # CONFIG_RCU_PERF_TEST is not set
    # CONFIG_RCU_TORTURE_TEST is not set
    CONFIG_RCU_CPU_STALL_TIMEOUT=21
    # CONFIG_RCU_TRACE is not set
    # CONFIG_RCU_EQS_DEBUG is not set
    # end of RCU Debugging

**Kernel Thread Prioities**

    # ps -eo pid,tid,class,rtprio,ni,pri,wchan:14,comm | grep -E '(PID|bq25890|raw_dma)'
      PID   TID CLS RTPRIO  NI PRI WCHAN          COMMAND
      214   214 FF      50   -  90 irq_thread     irq/59-bq25890_
      229   229 FF      50   -  90 irq_thread     irq/60-raw_dma

**Processor Status Registers**
On at least one failed system, the interrupt disable bit is set on both processors. That seems wrong, although the fast interrupt disable bit is not set. If I set a breakpoint in arch_spin_lock() on a working system, the interrupt disable bit is not set.
It would seem that the only way for the spinlock to be released is in response to an interrupt, since the active threads on both cores aren't going to do it.

    Core #0
    cpsr	200f0193	537854355	
    	n	0	0	Negative condition code flag	
    	z	0	0	Zero condition code flag	
    	c	1	1	Carry condition code flag	
    	v	0	0	Overflow condition code flag	
    	q	0	0	Cumulative saturation flag	
    	it	00	0	If-Then execution state bits	
    	j	0	0	Jazelle bit	
    	ge	f	15	SIMD Greater than or Equal flags	
    	e	0	0	Endianness execution state bit	
    	a	1	1	Asynchronous abort disable bit	
    	i	1	1	Interrupt disable bit	
    	f	0	0	Fast interrupt disable bit	
    	t	0	0	Thumb execution state bit	
    	m	13	19	Mode field	
    irq	
    	sp	80d94400	2161722368	
    	lr	80101e20	2148539936	
    	spsr	200f0193	537854355	
    		n	0	0	Negative condition code flag	
    		z	0	0	Zero condition code flag	
    		c	1	1	Carry condition code flag	
    		v	0	0	Overflow condition code flag	
    		q	0	0	Cumulative saturation flag	
    		it	00	0	If-Then execution state bits	
    		j	0	0	Jazelle bit	
    		ge	f	15	SIMD Greater than or Equal flags	
    		e	0	0	Endianness execution state bit	
    		a	1	1	Asynchronous abort disable bit	
    		i	1	1	Interrupt disable bit	
    		f	0	0	Fast interrupt disable bit	
    		t	0	0	Thumb execution state bit	
    		m	13	19	Mode field	

    Core #1
    cpsr	200f0093	537854099	
    	n	0	0	Negative condition code flag	
    	z	0	0	Zero condition code flag	
    	c	1	1	Carry condition code flag	
    	v	0	0	Overflow condition code flag	
    	q	0	0	Cumulative saturation flag	
    	it	00	0	If-Then execution state bits	
    	j	0	0	Jazelle bit	
    	ge	f	15	SIMD Greater than or Equal flags	
    	e	0	0	Endianness execution state bit	
    	a	0	0	Asynchronous abort disable bit	
    	i	1	1	Interrupt disable bit	
    	f	0	0	Fast interrupt disable bit	
    	t	0	0	Thumb execution state bit	
    	m	13	19	Mode field	
    irq	
    	sp	80d94440	2161722432	
    	lr	80101a00	2148538880	
    	spsr	60030193	1610809747	
    		n	0	0	Negative condition code flag	
    		z	1	1	Zero condition code flag	
    		c	1	1	Carry condition code flag	
    		v	0	0	Overflow condition code flag	
    		q	0	0	Cumulative saturation flag	
    		it	00	0	If-Then execution state bits	
    		j	0	0	Jazelle bit	
    		ge	3	3	SIMD Greater than or Equal flags	
    		e	0	0	Endianness execution state bit	
    		a	1	1	Asynchronous abort disable bit	
    		i	1	1	Interrupt disable bit	
    		f	0	0	Fast interrupt disable bit	
    		t	0	0	Thumb execution state bit	
    		m	13	19	Mode field	


                                

Omnia tu scis quod malum (1 rep)

Nov 24, 2023, 10:59 PM • Last activity: Nov 29, 2023, 05:49 PM

1 votes

1 answers

1055 views

How does a spinlock prevents context switching?

thread spinlock

I am using this code in order to visualize how a spinlock would prevent context switching: ```c pthread_spinlock_t lock; void pp() { pthread_spin_lock(&lock); char d = 'D'; while(1) write(1, &d, 1); } void ppp() { char a = 'C'; while(1) write(1, &a, 1); } int main() { pthread_t thread; pthread_t tth...

I am using this code in order to visualize how a spinlock would prevent context switching:

pthread_spinlock_t lock;
void pp()
{
        pthread_spin_lock(&lock);
        char d = 'D';
        while(1) write(1, &d, 1);
}
void ppp()
{
        char a = 'C';
        while(1) write(1, &a, 1);
}
int main()
{
        pthread_t thread;
        pthread_t tthread;
        pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
        pthread_create(&thread, NULL, pp, NULL);
        pthread_create(&tthread, NULL, ppp, NULL);
        pthread_join(thread, NULL); 
        pthread_join(tthread,NULL);
}

The problem is that I was expecting it to never switch to the second thread, since I never release the lock done in pp(), and to output `DDDDDDDDDDDD... because from my understanding, it should prevent context switching. But the output I get is of the form : DCDCDCDCDCDCDCDCDCDCDCDCDCDCDCDCDCDCDCDC`... How can we explain this? Is my understanding of spinlocks incorrect?

user514788

Mar 22, 2022, 03:58 PM • Last activity: Mar 22, 2022, 04:12 PM

0 votes

1 answers

147 views

A question about spin locks

spinlock

In studying the spin lock synchronization mechanism, i noticed that for each function there is also a corresponding macro, for example for spin_lock there is a function static __always_inline void spin_lock(spinlock_t *lock) { raw_spin_lock(&lock->rlock); } and the macro #define spin_lock(x) pthread...

                                  In studying the spin lock synchronization mechanism, i noticed that for each function there is also a corresponding macro, for example for spin_lock there is a function 

    static __always_inline void spin_lock(spinlock_t *lock) {
	    raw_spin_lock(&lock->rlock); }

and the macro

    #define spin_lock(x) pthread_mutex_lock(x)

Why there are two different definitions and in which circumstances it is used each one of them? Thanks a lot!

Athanasios Margaris (129 rep)

Aug 17, 2021, 12:09 PM • Last activity: Aug 17, 2021, 12:18 PM

1 votes

1 answers

177 views

Fedora booting failure under VirtualBox

fedora spinlock

A student of mine stuck with a Fedora virtual machine under VirtualBox: [![enter image description here][1]][1] [1]: https://i.sstatic.net/ghvkS.png I do not even recognize what is going on here. Do you have a clue what is going on? (on the left you see another VM that is broadcasted from me, a work...

                                  A student of mine stuck with a Fedora virtual machine under VirtualBox:

I do not even recognize what is going on here.

Do you have a clue what is going on? (on the left you see another VM that is broadcasted from me, a working instance of this VM)

Does this have anything to do with 

https://access.redhat.com/solutions/658783  ?

Gergely (826 rep)

Feb 8, 2021, 03:01 PM • Last activity: Feb 9, 2021, 09:52 AM

40 votes

5 answers

65158 views

What is a spinlock in Linux?

linux spinlock lock

I would like to know about Linux spinlocks in detail; could someone explain them to me?

                                  I would like to know about Linux spinlocks in detail; could someone explain them to me?  

                                

Navaneeth Sen (9719 rep)

Dec 22, 2010, 01:53 PM • Last activity: Jan 16, 2021, 12:25 PM

2 votes

1 answers

1663 views

How spin lock is implemented in Linux?

linux spinlock

I'm thinking about how the feature "prevent from context switch" is achieved in Linux for a spinlock. A big picture / pseudocode is acceptable since I just what to know the logic behind it.

                                  I'm thinking about how the feature "prevent from context switch" is achieved in Linux for a spinlock. A big picture / pseudocode is acceptable since I just what to know the logic behind it.
                                

Kindred (294 rep)

Dec 27, 2018, 04:58 PM • Last activity: Dec 30, 2018, 01:32 AM

0 votes

1 answers

133 views

Basic spinlock doubt

process lock spinlock

I read that in spinlock, process keeps on waiting for the lock continuously in a loop until it receives signal(lock) or release(lock) from other process. This causes busy waiting and hence CPU cycles are wasted. My doubt is if CPU is continuously checking the value of lock variable in loop, the...

                                  I read that in spinlock, process keeps on waiting for the lock continuously in a loop until it receives signal(lock) or release(lock) from other process. This causes busy waiting and hence CPU cycles are wasted.

My doubt is if CPU is continuously checking the value of lock variable in loop, then how will some other process completely execute critical section and release the lock as executing critical section also needs CPU and CPU is busy checking the value of lock in other process ?

Zephyr (183 rep)

Oct 31, 2017, 08:27 AM • Last activity: Oct 31, 2017, 08:52 AM

7 votes

4 answers

24621 views

Why are spin locks good choices in Linux Kernel Design instead of something more common in userland code, such as semaphore or mutex?

linux semaphore spinlock

I understand that Spinlocks are real waste in Linux Kernel Design. I would like to know why is it like spin locks are good choices in Linux Kernel Design instead of something more common in userland code, such as semaphore or mutex?

                                  I understand that Spinlocks are real waste in Linux Kernel Design.

I would like to know why is it like spin locks  are good choices in Linux Kernel Design instead of something more common in userland code, such as semaphore or mutex?

Navaneeth Sen (9719 rep)

Dec 23, 2010, 02:04 PM • Last activity: Aug 18, 2016, 03:16 AM

15 votes

3 answers

38178 views

what is the difference between spin locks and semaphores?

kernel spinlock semaphore lock

What are the basic differences between spin locks and semaphores in action?

                                  What are the basic differences between spin locks and semaphores in action?

                                

Renjith G (5988 rep)

Dec 29, 2010, 06:42 AM • Last activity: Jul 5, 2016, 07:05 AM

1 votes

1 answers

1546 views

how to monitor spinlocks

linux cpu crash spinlock

I'm wondering how I could monitor spinlocks. At my client, we have cpu soft lockup failure, for which, if I understand well, spinlock is a likely cause. Different team use that server for predictive modeling using R, Python and SAS, meaning we often have many unsupervized processes running in parall...

                                  I'm wondering how I could monitor spinlocks. At my client, we have cpu soft lockup failure, for which, if I understand well, spinlock is a likely cause.

Different team use that server for predictive modeling using R, Python and SAS, meaning we often have many unsupervized processes running in parallel, possibly with multiprocessing librairies.

Monitoring the number of spinlocks or, even better, which processes used them, might help in validating or invalidating them as a cause for our frequent failures (5 during the last 3 weeks).

Is there any way to monitor them? If not, how could we know what would be causing those soft lockups?

laurent exsteens (111 rep)

May 31, 2016, 08:39 AM • Last activity: May 31, 2016, 02:07 PM

Showing page 1 of 11 total questions