ExoPlayer: Gapless playback is not working

Issue description

When playing two MP3 files gaplessly, a noticeable gap is heard between them. The issue also reproduces in the ExoPlayer demo app with the test.mp3 file already included in its assets. I added the following snippet at line 299 of PlayerActivity in the ExoPlayer demo app for testing, and the issue reproduces: a gap is heard between the two media sources.

    // Queue the same asset twice via a ConcatenatingMediaSource to test gapless playback.
    val concatenatingMediaSource = ConcatenatingMediaSource()
    val defaultMediaSourceFactory = DefaultMediaSourceFactory(context)
    concatenatingMediaSource.addMediaSource(
        defaultMediaSourceFactory.createMediaSource(MediaItem.fromUri("asset:///test.mp3")))
    concatenatingMediaSource.addMediaSource(
        defaultMediaSourceFactory.createMediaSource(MediaItem.fromUri("asset:///test.mp3")))
    player.setMediaSource(concatenatingMediaSource)
    player.prepare()

Please notice that the gapless delay/padding metadata is correctly detected by ExoPlayer for the test.mp3 file in the demo app assets.
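
For what it's worth, here is a minimal sketch of how that can be verified (assuming a SimpleExoPlayer instance named player): ExoPlayer exposes the parsed gapless metadata on the audio Format as encoderDelay/encoderPadding.

    import android.util.Log
    import com.google.android.exoplayer2.SimpleExoPlayer

    // Sketch: log the gapless metadata ExoPlayer parsed for the playing track.
    // encoderDelay/encoderPadding are in frames; Format.NO_VALUE would mean not detected.
    fun logGaplessInfo(player: SimpleExoPlayer) {
        val format = player.audioFormat
        Log.d("Gapless", "encoderDelay=${format?.encoderDelay} encoderPadding=${format?.encoderPadding}")
    }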

  • ExoPlayer version: 2.13.0
  • Android version: Android 7 & Android 10
  • Android device: Samsung Galaxy S7 Edge & Samsung Galaxy S10

Please let me know if you need more details. Thank you.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 27 (16 by maintainers)

Most upvoted comments

Looking into the emulator codec config, it is clear that the C2 decoders are aliased to the OMX names.

The config is the same on emulated Pixel 3 XL devices at API 29 & 30 and on real Pixel 3 XL devices at API 29 and 30:

    $ adb shell "grep -A5  c2.android.mp3 $(find / -name media_codecs* 2>/dev/null)"
    /apex/com.android.media.swcodec/etc/media_codecs.xml:        <MediaCodec name="c2.android.mp3.decoder" type="audio/mpeg">
    /apex/com.android.media.swcodec/etc/media_codecs.xml-            <Alias name="OMX.google.mp3.decoder" />
    /apex/com.android.media.swcodec/etc/media_codecs.xml-            <Limit name="channel-count" max="2" />
    /apex/com.android.media.swcodec/etc/media_codecs.xml-            <Limit name="sample-rate" ranges="8000,11025,12000,16000,22050,24000,32000,44100,48000" />
    /apex/com.android.media.swcodec/etc/media_codecs.xml-            <Limit name="bitrate" range="8000-320000" />
    /apex/com.android.media.swcodec/etc/media_codecs.xml-        </MediaCodec>

Nevertheless, on the emulator at API 29, when querying MediaCodecInfo, the canonical name returned for OMX.google.mp3.decoder is the OMX name itself, not c2.android.mp3.decoder as expected. As a result it is not recognized as an aliased codec.
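
For reference, here is a minimal sketch (API 29+, hypothetical helper name) of the query that exposes the mismatch:

    import android.media.MediaCodecList
    import android.media.MediaFormat

    // Sketch: list every MP3 decoder with its canonical name and alias flag.
    // On an affected emulator, OMX.google.mp3.decoder reports itself as canonical
    // instead of c2.android.mp3.decoder, so isAlias is false.
    fun dumpMp3Decoders() {
        for (info in MediaCodecList(MediaCodecList.ALL_CODECS).codecInfos) {
            if (info.isEncoder) continue
            if (MediaFormat.MIMETYPE_AUDIO_MPEG !in info.supportedTypes) continue
            println("name=${info.name} canonical=${info.canonicalName} isAlias=${info.isAlias}")
        }
    }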

The weird thing is that this only affects an emulated Pixel 3 XL at API 29. Emulating a Pixel 3 XL at API 30 doesn’t show the bug: all codecs are correctly listed as both OMX (aliased) and C2. A real Pixel 3 XL flashed with the last public release of API 29 also correctly exposes the codecs twice (aliased and not aliased). I can’t explain the behaviour difference.

So whatever causes aliased codecs to be listed only under their alias name seems to affect only some devices on API 29 (Android 10).

As a result, if we find a way to differentiate the OMX and C2 codecs, it seems it will have to be by analysing their decoding behaviour rather than their metadata.

Alternatively, we could consider all API 29 OMX.google.mp3.decoder decoders to be aliases of C2, as that was the default config on Android 10. Nevertheless, such a change would be nearly impossible to test, and I’m afraid of regressing any devices that really do use OMX decoders on Android 10.

I found what is happening, but I have not root-caused it yet.

Normally (for example on a Pixel), here is what happens during a track transition:

  1. Extractor reaches the end of the 1st track. The last MP3 packet has presentation time τ.
  2. Renderer is notified that the current stream will end at τ.
  3. Renderer pulls the last buffer of the 1st track (with presentation time τ) from the MP3 buffer queue and queues it for decoding.
  4. Renderer pulls a format change from the MP3 buffer queue (due to the switch to the second track). It stops pulling from the MP3 buffer queue until τ is played and saves the new input format.
  5. Renderer pulls a buffer with time τ from the decoder and plays it. From step 2 it knows that τ was the last buffer of the stream, so it notifies the AudioSink that there is now an audio discontinuity.
  6. Renderer pulls the first buffer of the second track, with presentation time τ’, from the MP3 buffer queue.
  7. Renderer pushes the τ’ buffer for decoding.
  8. Renderer pulls a τ’ buffer from the decoder. From step 6, because of the buffer’s presentation time, the renderer knows it is the first buffer of a new format. It reconfigures the AudioSink with the new format and pushes the τ’ buffer to it.
  9. AudioSink receives the τ’ buffer. It knows a reconfiguration is pending, so it drains its AudioProcessors before reconfiguring.
  10. AudioSink’s TrimmingProcessor knows from step 8 that a reconfiguration is pending. It concludes that the previous track has ended, and therefore drops the amount of data required by the gapless metadata (see the sketch after this list).
  11. AudioSink then handles the buffer, but as there is a discontinuity pending from step 5, it drains the AudioProcessors again. This is a no-op, but confusing.
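
To make step 10 concrete, here is a heavily simplified sketch of the trimming decision; the class and fields below are illustrative, not ExoPlayer’s actual audio-processor internals.

    // Heavily simplified sketch of the step-10 decision; names are illustrative.
    class TrimmingSketch(private val paddingBytesFromGaplessMetadata: Int) {
        private var tail = ByteArray(0) // last buffered audio, held back for trimming

        fun onQueueEndOfStream(reconfigurationPending: Boolean) {
            tail = if (reconfigurationPending) {
                // Reconfiguration pending => the track really ended: drop the
                // encoder padding described by the gapless metadata.
                tail.copyOf(maxOf(0, tail.size - paddingBytesFromGaplessMetadata))
            } else {
                // No reconfiguration => treated as a seek: play the tail untrimmed.
                // In the emulator scenario below this branch is taken by mistake,
                // so the padding becomes audible as a gap.
                tail
            }
        }
    }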

The issue I’m observing on the emulator is at step 5. The renderer assumes that if the last buffer of the track queued for decoding has a presentation time of τ, then the last buffer decoded by the codec will also have a presentation time of τ. This is not the case on the emulator: an additional decoded buffer with presentation time τ’’ is output by the codec, even though no encoded buffer with presentation time τ’’ was ever pushed. Note that τ’’ == τ’; I don’t think that is significant, though.

Here is what happens on the emulator:

  1. same
  2. same
  3. same
  4. same
  5. same, except that because the renderer (incorrectly) assumes from step 2 that τ is the last buffer of track 1, it informs the AudioSink that there is a discontinuity. This leads to the following:
    1. Renderer pulls the extra τ’’ buffer (which is the real last buffer of track 1) from the decoder and sends it to the AudioSink for playing.
    2. AudioSink receives the τ’’ buffer. It was told in step 5 that there is a discontinuity, so it drains its internal processors.
    3. AudioSink’s TrimmingProcessor has no reconfiguration pending, so it concludes that this is a seek, not the end of the track. Therefore it does not drop the amount of data required by the gapless metadata and plays it instead => a gap is heard.

Following the reproduction steps from https://github.com/google/ExoPlayer/issues/8594#issuecomment-779985786, the presentation times consistently observed are:

  • τ = 1018775 us
  • τ’ = τ’’ = 1044897 us

I don’t know why the decoder (OMX.google.mp3.decoder) outputs one more buffer than was queued. We might want to rely on the EOS flag on the buffer rather than on the presentation time, as sketched below.
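
A minimal sketch of that more robust check (hypothetical helper, not ExoPlayer code):

    import android.media.MediaCodec

    // Sketch: trust the decoder's EOS flag instead of matching the output
    // presentation time against the last queued input time. Comparing
    // presentationTimeUs with the last queued time breaks when the decoder
    // emits an extra τ'' buffer, as observed on the emulator.
    fun isLastOutputBuffer(info: MediaCodec.BufferInfo): Boolean {
        return (info.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0
    }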

Regardless, we should avoid the consistent double onQueueEndOfStream call on the processors on each track transition in handleBuffer: one due to reconfigurationPending (step 10) and a second due to startMediaTimeUsNeedsSync (step 11), which is caused by handleDiscontinuity being called on the track transition (step 5). A possible shape of the fix is sketched below.
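
This sketch shows one way to coalesce the two drain triggers into a single onQueueEndOfStream per transition; the names reconfigurationPending and startMediaTimeUsNeedsSync mirror the fields mentioned above, but the class itself is illustrative, not ExoPlayer’s actual DefaultAudioSink.

    // Sketch: drain the AudioProcessors at most once per track transition.
    class DrainCoalescer(private val drainToEndOfStream: () -> Unit) {
        private var drainedForTransition = false

        fun maybeDrain(reconfigurationPending: Boolean, startMediaTimeUsNeedsSync: Boolean) {
            if ((reconfigurationPending || startMediaTimeUsNeedsSync) && !drainedForTransition) {
                drainToEndOfStream() // drain the AudioProcessors exactly once
                drainedForTransition = true
            }
        }

        fun onTransitionHandled() {
            drainedForTransition = false // re-arm for the next track transition
        }
    }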