Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

callback hangs after upgrading to alsa-lib 1.1.2 on Linux #251

Closed
PortAudio-admin opened this issue Aug 19, 2016 · 9 comments
Closed

callback hangs after upgrading to alsa-lib 1.1.2 on Linux #251

PortAudio-admin opened this issue Aug 19, 2016 · 9 comments
Labels
linux Affects Linux P2 Priority: High src-alsa ALSA Host API /src/hostapi/alsa

Comments

@PortAudio-admin
Copy link
Collaborator

Issue created by @philburk

ALSA recently added recursive lock protection. PortAudio is attempting to lock recursively, which is now illegal. This causes a hang.

This was reported by Chris Brannon in an email to the PA list.

---------- Chris wrote:
Recently, after upgrading to alsa-lib 1.1.2 on Linux, I started noticing lockups in programs that I use.
They added some locking and thread safety to alsa with that library version.
I fired up Valgrind's helgrind tool, and I eventually tracked the problem down to Portaudio.
Here's part of my log from helgrind, along with an explanation of what I think is going on.

---begin log---
==13296== Thread #10: Attempt to re-lock a non-recursive lock I already hold
==13296== 0x4C2E7B4: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==13296== by 0x6A855F3: snd_pcm_lock (pcm_local.h:1101)
==13296== by 0x6A855F3: snd_pcm_drop (pcm.c:1197)
==13296== by 0x63FC24A: AlsaStop.isra.2 (pa_linux_alsa.c:3037)
==13296== by 0x63FFC79: OnExit (pa_linux_alsa.c:3388)
==13296== by 0x6402DF7: CallbackThreadFunc (pa_linux_alsa.c:4180)
==13296== by 0x4C31506: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==13296== by 0x661D493: start_thread (in /lib64/libpthread-2.22.so)
==13296== by 0x54535DC: clone (in /lib64/libc-2.22.so)
==13296== Lock was previously acquired
==13296== at 0x4C2E87D: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==13296== by 0x6A89404: snd_pcm_lock (pcm_local.h:1101)
==13296== by 0x6A89404: snd_pcm_mmap_commit (pcm.c:7030)
==13296== by 0x63FF6C6: PaAlsaStreamComponent_EndProcessing.part.13 (pa_linux_alsa.c:3489)
==13296== by 0x63FF941: PaAlsaStreamComponent_EndProcessing (pa_linux_alsa.c:3596)
==13296== by 0x63FF941: PaAlsaStream_EndProcessing (pa_linux_alsa.c:3596)
==13296== by 0x6402FAC: CallbackThreadFunc (pa_linux_alsa.c:4329)
==13296== by 0x4C31506: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==13296== by 0x661D493: start_thread (in /lib64/libpthread-2.22.so)
==13296== by 0x54535DC: clone (in /lib64/libc-2.22.so)
---end log---

And here's the first part of the definition of CallbackThreadFunc, from
portaudio's src/hostapi/alsa/pa_linux_alsa.c:

static void *CallbackThreadFunc( void *userData )
{
... variable definitions elided

/* Execute OnExit when exiting */
pthread_cleanup_push( &OnExit, stream );

That's around line 4180 in my copy of the file.

So the thread is executing in PaAlsaStreamComponent_EndProcessing, which
calls snd_pcm_mmap_commit. That function locks the pcm mutex, does some
stuff, and unlocks the mutex. But the thread is canceled during the
snd_pcm_mmap_commit call. The OnExit handler gets called, and it tries to
lock the mutex, but the lock was already taken in snd_pcm_mmap_commit,
which was not allowed to finish executing.

@PortAudio-admin
Copy link
Collaborator Author

Comment by gineera on Assembla

Attachment bI3ZfSArar5OkEdmr6CpXy

@PortAudio-admin
Copy link
Collaborator Author

Comment by gineera on Assembla

As discussed on the PA list, this issue is not actually about recursion (so I've edited the title), but that 'AbortStream' cancels the callback thread sometimes while it is calling Alsa-lib functions, and as a result one does not complete, retaining the lock. Is is not, and never has been, safe to cancel during Alsa execution (I've had this confirmed - "don't even think about it"). The newer Alsa-lib simply exposes this flaw in Portaudio due to the added locking.

A possible solution patch, that disables thead cancelability during the critical processing section of the callback, has been confirmed to be effective.

Similar cancelling is used in the OSS and ASIHPI host apis, so we may need to think about those also.

@PortAudio-admin
Copy link
Collaborator Author

PortAudio-admin commented Aug 23, 2016

Comment by gineera on Assembla

pa_linux_alsa.c-cancelability_2.diff.txt

@PortAudio-admin
Copy link
Collaborator Author

Comment by gineera on Assembla

A more thorough version of this patch 'pa_linux_alsa.c-cancelability_4.diff' (attached) disables cancelablity on the callback thread entirely, and just re-enables it for the poll() in PaAlsaStream_WaitForFrames(). This is necessary as there are various calls to Alsa-lib functions during the processing which might otherwise occasionally suffer cancellation. These changes are reported to be effective.

@PortAudio-admin
Copy link
Collaborator Author

PortAudio-admin commented Sep 11, 2016

Comment by gineera on Assembla

pa_linux_alsa.c-cancelability_4.diff.txt

@PortAudio-admin
Copy link
Collaborator Author

Comment by @philburk

Thanks Alan. I will make a Git MergeRequest from the last patch.

@PortAudio-admin
Copy link
Collaborator Author

Comment by @RossBencina

Merge request is this one:

pa_linux_alsa: fix hang in callback caused by abort
https://assembla.com/code/portaudio/git/merge_requests/3834073

@PortAudio-admin
Copy link
Collaborator Author

Comment by @philburk

Fix was merged in 2016.

@PortAudio-admin
Copy link
Collaborator Author

Issue closed by @philburk

@PortAudio-admin PortAudio-admin added P2 Priority: High linux Affects Linux src-alsa ALSA Host API /src/hostapi/alsa labels Sep 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linux Affects Linux P2 Priority: High src-alsa ALSA Host API /src/hostapi/alsa
Projects
None yet
Development

No branches or pull requests

1 participant