Bug 45373

Summary: [alsa-sink] memblockq.c: Assertion '!bq->blocks || (bq->write_index + (int64_t)chunk.length <= bq->blocks->index)' failed at pulsecore/memblockq.c:408, function pa_memblockq_push(). Aborting.
Product: PulseAudio Reporter: Tanu Kaskinen <tanuk>
Component: coreAssignee: pulseaudio-bugs
Status: RESOLVED MOVED QA Contact: pulseaudio-bugs
Severity: normal    
Priority: medium CC: barney.gale, denisgolovan, lennart
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: Backtrace
path to dump memblockq on assert
path to dump memblockq on assert
Patch to dump recent memblockq history when the crash happens
Patch to dump recent memblockq history when the crash happens (improved)
Pulseaudio log before crash
gdb log
gdb bt full log

Description Tanu Kaskinen 2012-01-29 11:03:49 UTC
Created attachment 56298 [details]
Backtrace

Reported in IRC:

19:28 -!- jelly-home [~jelly@homestuck.iskon.hr] has joined #pulseaudio
19:30 < jelly-home> Hello, I've caught pulseaudio killing itself with a SIGABRT and "Jan 29 15:44:43 kanta-linux pulseaudio[11171]: [alsa-sink] memblockq.c: Assertion '!bq->blocks || 
                    (bq->write_index + (int64_t)chunk.length <= bq->blocks->index)' failed at pulsecore/memblockq.c:408, function pa_memblockq_push(). Aborting."
19:30 < jelly-home> attempt at getting a backtrace: http://paste.debian.net/153989/
19:30 <@pulsator> Title: debian Pastezone (at paste.debian.net)
19:34 < jelly-home> this is PA 1.1, package 1.1-2 from Debian testing, i386 arch

I'll attach the backtrace here in case the pastebin link gets stale.
Comment 1 Arun Raghavan 2012-03-14 23:38:10 UTC
Do you have a reliable way to reproduce this?
Comment 2 Tanu Kaskinen 2012-03-15 18:41:11 UTC
I don't think I've seen this myself. I asked jelly-home, and he said "it's intermittent and happens every couple days it seems.  Can't trigger it reliably with any specific action though."
Comment 3 denisgolovan 2012-03-21 13:12:26 UTC
Same problem.
Gentoo x86, PulseAudio v1.1
Happens twice a day or more often.
Comment 4 Joe 2012-04-03 15:19:16 UTC
I'm also getting this error.

I'm using Archlinux and versions:

alsa-plugins 1.0.25-1
pulseaudio 1.1-2

I also have some output here:

https://gist.github.com/fd1c77fb122488f08966
Comment 5 Arun Raghavan 2012-04-13 22:02:59 UTC
Created attachment 59955 [details] [review]
path to dump memblockq on assert

I'm attaching a patch that could help debug this. If someone's willing to try it out, that'd be awesome.
Comment 6 Arun Raghavan 2012-04-13 22:03:41 UTC
(In reply to comment #5)
> Created attachment 59955 [details] [review] [review]
> path to dump memblockq on assert
> 
> I'm attaching a patch that could help debug this. If someone's willing to try
> it out, that'd be awesome.

Note that this will basically dump some output when the assert happens, so you'll need to run PA on the command-line or get logs from syslog once this happens.
Comment 7 Arun Raghavan 2012-04-13 22:06:01 UTC
Created attachment 59956 [details] [review]
path to dump memblockq on assert

Whoops, I screwed up translating that assert. Updated patch attached.
Comment 8 denisgolovan 2012-04-21 14:50:22 UTC
I tried building Pulseaudio-1.1 with https://bugs.freedesktop.org/attachment.cgi?id=59956 patch applied, but it fails with following error:

pulsecore/memblockq.c: In function 'dump_memblockq':
pulsecore/memblockq.c:275:145: error: 'pa_memblockq' has no member named 'name'
pulsecore/memblockq.c:278:9: warning: format '%lu' expects type 'long unsigned int', but argument 7 has type 'size_t' [-Wformat]
pulsecore/memblockq.c: In function 'pa_memblockq_push':
pulsecore/memblockq.c:423:13: warning: format '%lu' expects type 'long unsigned int', but argument 6 has type 'size_t' [-Wformat]
make[3]: *** [libpulsecommon_1.1_la-memblockq.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
Comment 9 Arun Raghavan 2012-04-21 23:20:22 UTC
(In reply to comment #8)
> I tried building Pulseaudio-1.1 with
> https://bugs.freedesktop.org/attachment.cgi?id=59956 patch applied, but it
> fails with following error:
> 
> pulsecore/memblockq.c: In function 'dump_memblockq':
> pulsecore/memblockq.c:275:145: error: 'pa_memblockq' has no member named 'name'
> pulsecore/memblockq.c:278:9: warning: format '%lu' expects type 'long unsigned
> int', but argument 7 has type 'size_t' [-Wformat]
> pulsecore/memblockq.c: In function 'pa_memblockq_push':
> pulsecore/memblockq.c:423:13: warning: format '%lu' expects type 'long unsigned
> int', but argument 6 has type 'size_t' [-Wformat]
> make[3]: *** [libpulsecommon_1.1_la-memblockq.lo] Error 1
> make[3]: *** Waiting for unfinished jobs....

Could you try rebuilding after applying this patch as well?

http://cgit.freedesktop.org/pulseaudio/pulseaudio/patch/?id=666261ece8cfa9b5458a1c1358b7ae4b9682b6a5
Comment 10 denisgolovan 2012-04-23 13:36:12 UTC
Ok.
I've got following crash dump in my syslog.

Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c: memblockq dump: [source output delay_memblockq] rindex: 1933523416, windex: 1933524752
Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c:  -> block: index: 1933522788, length: 876 (current read)
Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c:  -> block: index: 1933523664, length: 888
Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c:  -> block: index: 1933524552, length: 184
Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c:  -> block: index: 2549446944, length: 1792
Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c: current chunk length: 4
Apr 23 23:33:42 comp1 pulseaudio[6960]: [alsa-sink] memblockq.c: Code should not be reached at pulsecore/memblockq.c:431, function pa_memblockq_push(). Aborting.
Comment 11 Tanu Kaskinen 2012-04-27 06:03:51 UTC
The indexes are peculiarly large, nearly 2 gigabytes. That would point to 32-bit integer overflow (or underflow), but on the other hand, the difference to the overflow point is still more than 200 megabytes, which makes the overflow scenario sound unlikely. (To be clear: there shouldn't be any 32-bit variables involved in the memblockq index calculations.)

Another strange thing is that the index difference between the second last and the last chunk is more than 600 megabytes. WTF? One explanation would be that pa_source_output_push() was given a 600 MB chunk, which is more than the delay_memblockq can store (which is 32 MB). If that's what's happening, it will insert a gap in the queue by the pa_memblockq_seek() call in pa_source_output_push(). The memblockq code is impossible to understand, so I don't know how an extremely large chunk would lead to a crash, but that sounds like a possible cause (extra large chunks shouldn't crash memblockq, but maybe such cases just aren't tested much, so there can be bugs).
Comment 12 Tanu Kaskinen 2012-04-27 06:40:01 UTC
My previous comment about pa_source_output_push() is based on the assumption that the crash happened in the context as in this backtrace: https://bugs.freedesktop.org/attachment.cgi?id=56298

I didn't notice that https://gist.github.com/fd1c77fb122488f08966 contained a different backtrace. Comment #10 doesn't have enough information to tell in which context the crash happened, so my speculation about pa_source_output_push() might be misleading. The comments about the index size strangeness are valid anyway.
Comment 13 Tanu Kaskinen 2012-04-28 09:03:47 UTC
Created attachment 60741 [details] [review]
Patch to dump recent memblockq history when the crash happens

I've attached another debugging patch. Like Arun's patch, this one also prints some debugging information if the crash happens. This patch is not compatible with Arun's patch. If you're not using git master, you'll need also http://cgit.freedesktop.org/pulseaudio/pulseaudio/patch/?id=666261ece8cfa9b5458a1c1358b7ae4b9682b6a5 just like with Arun's patch.

If someone can reproduce the bug with this patch, that would be awesome.
Comment 14 Tanu Kaskinen 2012-04-28 09:09:30 UTC
Created attachment 60742 [details] [review]
Patch to dump recent memblockq history when the crash happens (improved)

Slightly improved patch.
Comment 15 denisgolovan 2012-04-28 11:49:52 UTC
After applying https://bugs.freedesktop.org/attachment.cgi?id=60742 Pulseaudio-1.1 fails almost instantly. Only following is in syslog.

Apr 28 21:47:26 comp1 pulseaudio[26207]: [alsa-sink] memblockq.c: Assertion '(state->n_blocks = bq->n_blocks) <= HISTORY_MAX_N_BLOCKS' failed at pulsecore/memblockq.c:203, function save_state(). Aborting.
Comment 16 Tanu Kaskinen 2012-04-28 12:06:00 UTC
Ok. I set HISTORY_MAX_N_BLOCKS to 100 in the patch. That seemed like a safe choice - in the tests that I did (not comprehensive in any way) the memblockqs contained less than 20 blocks at maximum.

So, near the beginning in the patch there's this:

#define HISTORY_MAX_N_BLOCKS 100

You could try increasing that to 1000 or 10000. That still shouldn't cause excessive memory consumption (maybe around 15 MB per stream).

I'll be away for tomorrow, so further responses from me will come on Monday at earliest.
Comment 17 denisgolovan 2012-05-06 00:11:02 UTC
I increase HISTORY_MAX_N_BLOCKS to 10000.
Now it works, but crashes became infrequent.
Moreover when crashing pulseaudio does not log anything in syslog.

What is next?
Comment 18 Tanu Kaskinen 2012-05-06 00:25:34 UTC
If there's nothing in syslog, but Pulseaudio is definitely crashing, then it's probably a segfault. For debugging that, do the following in a terminal:

echo autospawn = no >> ~/.pulse/client.conf
killall pulseaudio
LANG=C LD_BIND_NOW=1 gdb pulseaudio

At the gdb prompt, enter
run -vvv --log-time

When the crash happens, enter these commands and save the output:
thread apply all bt
thread apply all bt full

The first command prints the backtrace in a more readable format, and the second command prints the backtrace with more information.

The log output before the crash is useful too.
Comment 19 denisgolovan 2012-05-06 09:02:36 UTC
Still no crash, but some other strange issues occur when pulseaudio is being ran under gdb. Basically, suddenly instead of main sound, high-pitch sound is played. I assume it is some bug. Only following is logged into console.

(30629.042|   0.000) D: [alsa-sink] source.c: Processing rewind...
(30631.103|   2.060) D: [alsa-sink] sink-input.c: Requesting rewind due to corking
(30631.103|   0.000) D: [alsa-sink] alsa-sink.c: Requested to rewind 65536 bytes.
(30631.103|   0.000) D: [alsa-sink] alsa-sink.c: Limited to 5668 bytes.
(30631.103|   0.000) D: [alsa-sink] alsa-sink.c: before: 1417
(30631.103|   0.000) D: [alsa-sink] alsa-sink.c: after: 1417
(30631.103|   0.000) D: [alsa-sink] alsa-sink.c: Rewound 5668 bytes.
(30631.103|   0.000) D: [alsa-sink] sink.c: Processing rewind...
(30631.103|   0.000) D: [alsa-sink] sink-input.c: Have to rewind 5668 bytes on render memblockq.
(30631.103|   0.000) D: [alsa-sink] sink-input.c: Have to rewind 5668 bytes on render memblockq.
(30631.103|   0.000) D: [alsa-sink] sink-input.c: Have to rewind 5668 bytes on implementor.
(30631.103|   0.000) D: [alsa-sink] source.c: Processing rewind...
Comment 20 denisgolovan 2012-05-06 09:17:04 UTC
Now I am getting the following:

(31572.340| 151.789) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31572.340|   0.000) D: [alsa-sink] flist.c: pulsecore/memblock.c: unused_memblocks flist is full (don't worry)
(31605.310|  32.970) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31605.310|   0.000) D: [alsa-sink] flist.c: pulsecore/memblock.c: unused_memblocks flist is full (don't worry)
(31605.346|   0.035) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31605.346|   0.000) D: [alsa-sink] flist.c: pulsecore/memblock.c: unused_memblocks flist is full (don't worry)
(31605.367|   0.021) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31605.367|   0.000) D: [alsa-sink] flist.c: pulsecore/memblock.c: unused_memblocks flist is full (don't worry)
(31605.410|   0.042) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31605.410|   0.000) D: [alsa-sink] flist.c: pulsecore/memblock.c: unused_memblocks flist is full (don't worry)
(31605.432|   0.021) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31605.432|   0.000) D: [alsa-sink] flist.c: pulsecore/memblock.c: unused_memblocks flist is full (don't worry)
(31605.496|   0.063) D: [alsa-sink] flist.c: pulsecore/hashmap.c: entries flist is full (don't worry)
(31605.666|   0.170) D: [alsa-sink] protocol-native.c: Underrun on 'ALSA Playback', 0 bytes in queue.
(31609.006|   3.339) D: [alsa-sink] sink-input.c: Requesting rewind due to corking
(31609.006|   0.000) D: [alsa-sink] alsa-sink.c: Requested to rewind 0 bytes.
(31609.006|   0.000) D: [alsa-sink] alsa-sink.c: Mhmm, actually there is nothing to rewind

... and sound stops.
Comment 21 denisgolovan 2012-05-09 10:03:00 UTC
Created attachment 61293 [details]
Pulseaudio log before crash
Comment 22 denisgolovan 2012-05-09 10:03:22 UTC
Created attachment 61294 [details]
gdb log
Comment 23 denisgolovan 2012-05-09 10:03:42 UTC
Created attachment 61295 [details]
gdb bt full log
Comment 24 denisgolovan 2012-05-09 10:04:17 UTC
Ok.
It seems we have some logs now.
Comment 25 GitLab Migration User 2018-07-30 10:31:15 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/pulseaudio/pulseaudio/issues/482.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.