"Starting playback..." delay with Matroska albums

I'm experimenting with file-per-album storage, using FLAC audio in a Matroska container, and have run into some unexpected behavior in foobar2000 v1.6.12.

I've been able to use mkvmerge to create such .flac.mka files with chapters and tags that foobar2000 handles correctly. However, after moving the files from an SSD to an SMB share accessed over a slower network connection, there is a lengthy delay before the audio starts, during which the status bar reads "Starting playback...". Experimenting further, the delay seems to be caused by .mka files with a timestamp scale of one audio sample, which mkvmerge uses by default for files with no video tracks, as its documentation describes:

Normally mkvmerge(1) will use a value of 1000000 which means that timestamps and durations will have a precision of 1ms. For files that will not contain a video track but at least one audio track mkvmerge(1) will automatically chose a timestamp scale factor so that all timestamps and durations have a precision of one audio sample. This causes bigger overhead but allows precise seeking and extraction.
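
For reference, the arithmetic behind the two scales can be sketched roughly as follows. This is only an illustration, not mkvmerge's actual code: the function names are made up, and the per-cluster limit assumes (as I understand the Matroska format) that each block stores its timestamp as a signed 16-bit offset from its cluster's timestamp, counted in units of the timestamp scale.

```python
# Rough arithmetic for Matroska timestamp scales (all scales in nanoseconds).
# Assumption: block timestamps are signed 16-bit offsets from the cluster
# timestamp, so a cluster can span at most 32767 ticks of the timestamp scale.

NS_PER_SECOND = 1_000_000_000
MAX_BLOCK_OFFSET_TICKS = 32767  # largest positive signed 16-bit value


def sample_period_ns(sample_rate_hz: int) -> float:
    """Duration of one audio sample in nanoseconds."""
    return NS_PER_SECOND / sample_rate_hz


def max_cluster_span_seconds(timestamp_scale_ns: int) -> float:
    """Longest time range a single cluster can cover at the given scale."""
    return MAX_BLOCK_OFFSET_TICKS * timestamp_scale_ns / NS_PER_SECOND


if __name__ == "__main__":
    print(f"one sample at 44100 Hz: {sample_period_ns(44100):.1f} ns")
    # Scales reported by mkvinfo for the test files in this post.
    for scale in (22674, 181392, 1_000_000):
        span = max_cluster_span_seconds(scale)
        print(f"scale {scale}: at most {span:.3f} s per cluster")
```

At a scale of 22674 ns this works out to roughly 0.74 s per cluster, against about 32.8 s at 1,000,000 ns, so a sample-accurate file presumably needs far more clusters for the same audio. Whether that is what makes foobar2000 read the whole file is not something this arithmetic can confirm.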

I wasn't able to find much on the topic of timestamp scale other than the (very old) thread "Matroska Chapter Issues...", where it sounds like sample accuracy was desired but not yet possible.

Here are some simplified test cases that reproduce the issue (again, when played from slower storage):

  • Encode WAVE to FLAC:

    flac.exe album.wav

    The resulting album.flac starts playback immediately.
  • Remux FLAC to Matroska with mkvmerge using single-sample precision:

    mkvmerge.exe -o album-mkvmerge.flac.mka album.flac

    The resulting album-mkvmerge.flac.mka starts playback after a long delay and has a timestamp scale of 22,674:

    mkvinfo.exe album-mkvmerge.flac.mka
    + EBML head
    |+ EBML version: 1
    |+ EBML read version: 1
    |+ Maximum EBML ID length: 4
    |+ Maximum EBML size length: 8
    |+ Document type: matroska
    |+ Document type version: 4
    |+ Document type read version: 2
    + Segment: size 269384309
    |+ Seek head (subentries will be skipped)
    |+ EBML void: size 4027
    |+ Segment information
    | + Timestamp scale: 22674
    | + Multiplexing application: libebml v1.4.2 + libmatroska v1.6.4
    | + Writing application: mkvmerge v70.0.0 ('Caught A Lite Sneeze') 64-bit
    | + Duration: 00:39:37.999985442
    | + Date: 2022-09-30 18:56:56 UTC
    | + Segment UID: 0xba 0xb5 0x03 0x93 0x13 0x81 0x4e 0xc3 0xd2 0xe0 0x15 0x88 0x51 0x4a 0x7c 0x22
    |+ Tracks
    | + Track
    |  + Track number: 1 (track ID for mkvmerge & mkvextract: 0)
    |  + Track UID: 17301554570749911966
    |  + Track type: audio
    |  + Codec ID: A_FLAC
    |  + Codec's private data: size 69914
    |  + Default duration: 00:00:00.092879818 (10.767 frames/fields per second for a video track)
    |  + Language: und
    |  + Language (IETF BCP 47): und
    |  + Audio track
    |   + Sampling frequency: 44100
    |   + Channels: 2
    |   + Bit depth: 16
    |+ EBML void: size 1060
    |+ Cluster
  • Clean the single-sample-precision Matroska file with mkclean:

    mkclean.exe album-mkvmerge.flac.mka

    The resulting clean.album-mkvmerge.flac.mka starts playback after a long delay and has a timestamp scale of 181,392:

    mkvinfo.exe clean.album-mkvmerge.flac.mka
    + EBML head
    |+ Document type: matroska
    |+ Document type version: 2
    |+ Document type read version: 2
    + Segment: size 269370136
    |+ Seek head (subentries will be skipped)
    |+ EBML void: size 86
    |+ Segment information
    | + Duration: 00:39:37.999985442
    | + Timestamp scale: 181392
    | + Multiplexing application: libebml2 v0.21.3 + libmatroska2 v0.22.3
    | + Writing application: mkclean 0.9.0 from libebml v1.4.2 + libmatroska v1.6.4 + mkvmerge v70.0.0 ('Caught A Lite Sneeze') 64-bit
    | + Date: 2022-09-30 18:56:56 UTC
    | + Segment UID: 0xba 0xb5 0x03 0x93 0x13 0x81 0x4e 0xc3 0xd2 0xe0 0x15 0x88 0x51 0x4a 0x7c 0x22
    |+ Tracks
    | + Track
    |  + Track number: 1 (track ID for mkvmerge & mkvextract: 0)
    |  + Track type: audio
    |  + Codec ID: A_FLAC
    |  + Track UID: 17301554570749911966
    |  + Codec's private data: size 69914
    |  + Default duration: 00:00:00.092879818 (10.767 frames/fields per second for a video track)
    |  + Language: und
    |  + Audio track
    |   + Sampling frequency: 44100
    |   + Channels: 2
    |   + Bit depth: 16
    |+ Tags
    | + Tag
    |  + Targets
    |   + Track UID: 17301554570749911966
    |  + Simple
    |   + Name: BPS
    |   + String: 905393
    |  + Simple
    |   + Name: DURATION
    |   + String: 00:39:37.999985442
    |  + Simple
    |   + Name: NUMBER_OF_FRAMES
    |   + String: 25603
    |  + Simple
    |   + Name: NUMBER_OF_BYTES
    |   + String: 269128005
    |  + Simple
    |   + Name: _STATISTICS_WRITING_APP
    |   + String: mkvmerge v70.0.0 ('Caught A Lite Sneeze') 64-bit
    |  + Simple
    |   + Name: _STATISTICS_WRITING_DATE_UTC
    |   + String: 2022-09-30 18:56:56
    |  + Simple
    |   + Name: _STATISTICS_TAGS
    |   + String: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    |+ Cues (subentries will be skipped)
    |+ Cluster
  • Remux FLAC to Matroska with mkvmerge using millisecond precision:

    mkvmerge.exe --timestamp-scale 1000000 -o album-mkvmerge+tss1000000.flac.mka album.flac

    The resulting album-mkvmerge+tss1000000.flac.mka starts playback after a brief delay and has a timestamp scale of 1,000,000:

    mkvinfo.exe album-mkvmerge+tss1000000.flac.mka
    + EBML head
    |+ EBML version: 1
    |+ EBML read version: 1
    |+ Maximum EBML ID length: 4
    |+ Maximum EBML size length: 8
    |+ Document type: matroska
    |+ Document type version: 4
    |+ Document type read version: 2
    + Segment: size 269355110
    |+ Seek head (subentries will be skipped)
    |+ EBML void: size 4027
    |+ Segment information
    | + Timestamp scale: 1000000
    | + Multiplexing application: libebml v1.4.2 + libmatroska v1.6.4
    | + Writing application: mkvmerge v70.0.0 ('Caught A Lite Sneeze') 64-bit
    | + Duration: 00:39:38.000000000
    | + Date: 2022-10-01 20:48:17 UTC
    | + Segment UID: 0x3d 0x82 0x27 0xba 0x8b 0xea 0x20 0x03 0xbd 0xa0 0x32 0xbd 0x85 0xc5 0x3e 0x59
    |+ Tracks
    | + Track
    |  + Track number: 1 (track ID for mkvmerge & mkvextract: 0)
    |  + Track UID: 12126524830391186088
    |  + Track type: audio
    |  + Codec ID: A_FLAC
    |  + Codec's private data: size 69914
    |  + Default duration: 00:00:00.092879818 (10.767 frames/fields per second for a video track)
    |  + Language: und
    |  + Language (IETF BCP 47): und
    |  + Audio track
    |   + Sampling frequency: 44100
    |   + Channels: 2
    |   + Bit depth: 16
    |+ EBML void: size 1060
    |+ Cluster
  • Remux FLAC to Matroska with ffmpeg:

    ffmpeg.exe -i album.flac -c copy album-ffmpeg.flac.mka

    The resulting album-ffmpeg.flac.mka starts playback after a brief delay and has a timestamp scale of 1,000,000:

    mkvinfo.exe album-ffmpeg.flac.mka
    + EBML head
    |+ EBML version: 1
    |+ EBML read version: 1
    |+ Maximum EBML ID length: 4
    |+ Maximum EBML size length: 8
    |+ Document type: matroska
    |+ Document type version: 4
    |+ Document type read version: 2
    + Segment: size 269326513
    |+ Seek head (subentries will be skipped)
    |+ EBML void: size 81
    |+ Segment information
    | + Timestamp scale: 1000000
    | + Multiplexing application: Lavf59.27.100
    | + Writing application: Lavf59.27.100
    | + Segment UID: 0xfe 0x5c 0xdc 0x86 0x74 0x33 0x05 0xef 0x8c 0xf6 0xf0 0xe4 0x33 0xf1 0x89 0xf2
    | + Duration: 00:39:38.000000000
    |+ Tracks
    | + Track
    |  + Track number: 1 (track ID for mkvmerge & mkvextract: 0)
    |  + Track UID: 1202185124873159546
    |  + "Lacing" flag: 0
    |  + Language: und
    |  + "Default track" flag: 0
    |  + Codec ID: A_FLAC
    |  + Track type: audio
    |  + Audio track
    |   + Channels: 2
    |   + Sampling frequency: 44100
    |   + Bit depth: 16
    |  + Codec's private data: size 42
    |+ Tags
    | + Tag
    |  + Targets
    |  + Simple
    |   + Name: ENCODER
    |   + String: Lavf59.27.100
    | + Tag
    |  + Targets
    |   + Track UID: 1202185124873159546
    |  + Simple
    |   + Name: DURATION
    |   + String: 00:39:38.000000000
    |+ Cluster
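
To compare more files than the handful above, the timestamp scale can be pulled out of mkvinfo's text output with a few lines of Python. This is a minimal sketch that assumes the English output format shown in the listings above; the function name is hypothetical, and it works on text that has already been captured (for example by redirecting mkvinfo's output to a file).

```python
import re
from typing import Optional

# Matches a line such as "| + Timestamp scale: 22674" in mkvinfo's output.
_SCALE_PATTERN = re.compile(r"\+ Timestamp scale: (\d+)")


def timestamp_scale_from_mkvinfo(output: str) -> Optional[int]:
    """Return the timestamp scale (in ns) from mkvinfo text, or None if absent."""
    match = _SCALE_PATTERN.search(output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # A few lines copied from the album-mkvmerge.flac.mka listing.
    sample = (
        "|+ Segment information\n"
        "| + Timestamp scale: 22674\n"
        "| + Duration: 00:39:37.999985442\n"
    )
    print(timestamp_scale_from_mkvinfo(sample))
```

A result of 1000000 indicates millisecond precision; a much smaller value on an audio-only file suggests it was muxed with per-sample precision.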

I also ran Process Monitor while starting playback and could see that album-mkvmerge.flac.mka was read in its entirety before playback began. For album-mkvmerge+tss1000000.flac.mka, the reads stepped through the file but appeared to cover only about 4-8 KB out of every 1 MB, so the much smaller amount of data read may explain the much shorter delay. Is it simply the nature of the finer timestamp resolution that there are more blocks of data to "touch" before the file can be played, or should these files perform about the same regardless of resolution?
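
To put the Process Monitor observation into numbers, here is a rough estimate of how much data the millisecond-precision file would need to transfer before playback. It is a back-of-the-envelope sketch only: it treats "KB" and "MB" as binary units, uses the segment size reported by mkvinfo (269355110 bytes) as a stand-in for the file size, and the function name is made up.

```python
KIB = 1024
MIB = 1024 * 1024


def estimated_bytes_read(file_size: int, low_kib: int = 4, high_kib: int = 8):
    """Return (low, high) bytes read if only a few KiB out of each MiB are touched."""
    mib_count = file_size / MIB
    return mib_count * low_kib * KIB, mib_count * high_kib * KIB


if __name__ == "__main__":
    size = 269355110  # segment size of album-mkvmerge+tss1000000.flac.mka
    low, high = estimated_bytes_read(size)
    print(f"sparse read: {low / MIB:.1f} to {high / MIB:.1f} MiB")
    print(f"full read:   {size / MIB:.1f} MiB")
    print(f"fraction:    {low / size:.2%} to {high / size:.2%}")
```

Under those assumptions the sparse pattern touches on the order of 1 to 2 MiB, well under 1% of the file, whereas the single-sample file requires transferring all of it over the slow link.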