ffmpeg-normalize: Can't normalize 32-bit float WAV files to the same bit depth?
I have a collection of 32-bit float WAV files I’m trying to normalize. Normalization itself works fine, but the output is 16-bit.
I tried -ofmt f32le (the format of the original), but that just generates an unreadable .wav file.
Command: ffmpeg-normalize in.wav --debug -nt peak -t -55 -ofmt f32le -f -o out.wav
Test file: http://lacinato.com/pub/audio/test.wav
Any ideas? Thanks!
Environment: Kubuntu 21.04, Python 3.9.5, ffmpeg 4.3.2-0+deb11u1ubuntu1
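One possible reason the -ofmt f32le output is unreadable: in ffmpeg, f32le is a raw PCM format, so the resulting file would carry bare samples with no RIFF/WAVE header even though it is named .wav. That is an assumption about this particular output file, not something the log confirms. The sketch below uses hypothetical helper names (describe_wav_header, make_float_wav) to show how a header check distinguishes the two cases; the format tag 3 it looks for is the same 0x0003 that ffmpeg prints for pcm_f32le in the log.

```python
import struct


def describe_wav_header(data: bytes):
    """Return (format_tag, bits_per_sample) from a RIFF/WAVE header, or None."""
    # A WAV file starts with "RIFF", a 4-byte size, then "WAVE".
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        return None
    pos = 12
    # Walk the chunk list until the "fmt " chunk turns up.
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            if pos + 8 + 16 > len(data):
                return None  # truncated fmt chunk
            fmt = struct.unpack_from("<HHIIHH", data, pos + 8)
            return fmt[0], fmt[5]  # wFormatTag, wBitsPerSample
        pos += 8 + chunk_size + (chunk_size & 1)  # chunks are word-aligned
    return None


def make_float_wav(samples, rate=96000):
    """Build a minimal mono 32-bit float WAV in memory (illustration only)."""
    payload = struct.pack(f"<{len(samples)}f", *samples)
    # Format tag 3 = IEEE float, 1 channel, 4 bytes per frame, 32 bits.
    fmt = struct.pack("<HHIIHH", 3, 1, rate, rate * 4, 4, 32)
    body = (b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", len(payload)) + payload)
    return b"RIFF" + struct.pack("<I", len(body)) + body


samples = [0.0, 0.5, -0.5]
raw_f32le = struct.pack("<3f", *samples)  # what a headerless raw stream looks like

print(describe_wav_header(make_float_wav(samples)))  # (3, 32)
print(describe_wav_header(raw_f32le))                # None
```

Running the same check on the first few hundred bytes of out.wav would show whether it has a header at all.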
$ ffmpeg-normalize test.wav --debug -nt peak -t -55 -ofmt f32le -f -o out.wav
DEBUG: found executable in path: /usr/bin/ffmpeg
DEBUG: found executable in path: /usr/bin/ffmpeg
DEBUG: Running command: ['/usr/bin/ffmpeg', '-filters']
DEBUG: Parsing streams of test.wav
DEBUG: Running command: ['/usr/bin/ffmpeg', '-i', 'test.wav', '-c', 'copy', '-t', '0', '-map', '0', '-f', 'null', '/dev/null']
DEBUG: Stream parsing command output:
DEBUG: ffmpeg version 4.3.2-0+deb11u1ubuntu1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10 (Ubuntu 10.2.1-20ubuntu1)
configuration: --prefix=/usr --extra-version=0+deb11u1ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'test.wav':
Duration: 00:00:03.14, bitrate: 6144 kb/s
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 96000 Hz, stereo, flt, 6144 kb/s
Output #0, null, to '/dev/null':
Metadata:
encoder : Lavf58.45.100
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 96000 Hz, stereo, flt, 6144 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=N/A time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used)
DEBUG: Found duration: 3.014 s
DEBUG: Found audio stream at index 0
INFO: Normalizing file test.wav (1 of 1)
DEBUG: Running normalization for test.wav
DEBUG: Parsing normalization info for test.wav
INFO: Running first pass volumedetect filter for stream 0
DEBUG: Volumedetect command output:
DEBUG: ffmpeg version 4.3.2-0+deb11u1ubuntu1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10 (Ubuntu 10.2.1-20ubuntu1)
configuration: --prefix=/usr --extra-version=0+deb11u1ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'test.wav':
Duration: 00:00:03.14, bitrate: 6144 kb/s
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 96000 Hz, stereo, flt, 6144 kb/s
[Parsed_volumedetect_0 @ 0x559b81679e40] n_samples: 0
Stream mapping:
Stream #0:0 (pcm_f32le) -> volumedetect
volumedetect -> Stream #0:0 (pcm_s16le)
Output #0, null, to '/dev/null':
Metadata:
encoder : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le, 96000 Hz, stereo, s16, 3072 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
[pcm_f32le @ 0x559b816c01c0] Multiple frames in a packet.
[pcm_f32le @ 0x559b816c01c0] Invalid PCM packet, data has size 4 but at least a size of 8 was expected
Error while decoding stream #0:0: Invalid data found when processing input
size=N/A time=00:00:03.14 bitrate=N/A speed= 446x
bitrate=N/A
total_size=N/A
out_time_us=3140354
out_time_ms=3140354
out_time=00:00:03.140354
dup_frames=0
drop_frames=0
speed= 446x
progress=end
video:0kB audio:1178kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_volumedetect_0 @ 0x559b816c6a40] n_samples: 602948
[Parsed_volumedetect_0 @ 0x559b816c6a40] mean_volume: -6.3 dB
[Parsed_volumedetect_0 @ 0x559b816c6a40] max_volume: 0.0 dB
[Parsed_volumedetect_0 @ 0x559b816c6a40] histogram_0db: 123265
INFO: Running second pass for test.wav
INFO: Adjusting stream 0 by -55.0 dB to reach -55.0
DEBUG: Moving temporary file from /tmp/oflr9v2g.wav to out.wav
DEBUG: Normalization finished
INFO: Normalized file written to out.wav
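The "Adjusting stream 0 by -55.0 dB to reach -55.0" line follows from the volumedetect numbers above: for peak normalization the gain is the target level minus the measured max_volume. A minimal sketch of that arithmetic, with a regular expression that pulls the two values out of the log text. The function names are made up for illustration and this is not ffmpeg-normalize's actual code.

```python
import re

# Matches lines such as "... mean_volume: -6.3 dB" in ffmpeg's stderr output.
VOLUMEDETECT_RE = re.compile(r"(mean_volume|max_volume): (-?\d+(?:\.\d+)?) dB")


def parse_volumedetect(log: str) -> dict:
    """Collect mean_volume and max_volume (in dB) from volumedetect output."""
    return {key: float(value) for key, value in VOLUMEDETECT_RE.findall(log)}


def peak_adjustment_db(target_db: float, max_volume_db: float) -> float:
    """Gain in dB that moves the measured peak to the target level."""
    return target_db - max_volume_db


def db_to_linear(db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


log = """\
[Parsed_volumedetect_0 @ 0x559b816c6a40] n_samples: 602948
[Parsed_volumedetect_0 @ 0x559b816c6a40] mean_volume: -6.3 dB
[Parsed_volumedetect_0 @ 0x559b816c6a40] max_volume: 0.0 dB
"""

stats = parse_volumedetect(log)
gain_db = peak_adjustment_db(-55.0, stats["max_volume"])
print(gain_db)  # -55.0
print(db_to_linear(gain_db))  # linear factor applied to every sample
```

The result is only as good as the measured max_volume. If the measurement pass cannot see peaks above full scale, the computed gain will be off by the same amount.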
About this issue
- State: closed
- Created 2 years ago
- Comments: 20 (10 by maintainers)
Commits related to this issue
- print warning for bit depths > 16, addresses #163 — committed to slhck/ffmpeg-normalize by slhck 2 years ago
- use astats instead of volumedetect filter, fixes #163 Allows floating point calculation. See: https://github.com/slhck/ffmpeg-normalize/issues/163 — committed to slhck/ffmpeg-normalize by slhck 2 years ago
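Why a floating-point measurement matters here: the stream mapping in the log shows volumedetect feeding a pcm_s16le stream, so the peak is measured on 16-bit samples. Float audio can exceed full scale, and anything above it is clipped during conversion and reported as 0.0 dB. The sketch below compares the two measurements on made-up samples. The conversion in to_s16 is a simplification (scale by 32768 and clip), not ffmpeg's exact resampling code, and whether the test file really contains over-full-scale samples is an assumption.

```python
import math


def peak_dbfs(samples):
    """Peak level in dBFS computed directly on float samples."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def to_s16(samples):
    """Convert float samples to 16-bit integers, clipping out-of-range values."""
    return [max(-32768, min(32767, round(s * 32768))) for s in samples]


def peak_dbfs_s16(samples):
    """Peak level in dBFS as seen after conversion to 16-bit integers."""
    peak = max(abs(v) for v in to_s16(samples)) / 32768
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


# Hypothetical float samples, two of them above full scale (|x| > 1.0).
samples = [0.25, -2.0, 1.5]

print(round(peak_dbfs(samples), 2))  # 6.02
print(peak_dbfs_s16(samples))        # 0.0
```

With these samples a peak normalizer working from the 16-bit figure would apply about 6 dB too little attenuation.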
Thanks for the info. My guess is that there’s a bug in how the samples are created from the Python utility. I don’t know enough about raw WAV to be able to tell you what’s wrong with it, though.
I have to check this in detail. I will merge the PR in the meantime, since it seems to be orthogonal to the issue you’re having.
Indeed. Gyan changed the docs to make that restriction clear: https://github.com/FFmpeg/FFmpeg/commit/b635ac126f340512335386d32bcd2693535945c3
I made a patch release that includes a warning for this.
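For anyone landing here with the original question: as far as I can tell from the tool's documented options, -ofmt selects the output container format, while -c:a selects the audio codec, which defaults to pcm_s16le. Keeping the .wav container and asking for the pcm_f32le codec is therefore one thing to try. A small sketch that assembles such a command follows; build_normalize_command is a hypothetical helper, the file names are placeholders, and the command has not been run against the test file from this thread.

```python
import shlex


def build_normalize_command(infile, outfile, target_db=-55.0, codec="pcm_f32le"):
    """Assemble an ffmpeg-normalize call that keeps WAV but sets the codec."""
    return [
        "ffmpeg-normalize", infile,
        "-nt", "peak",           # peak normalization, as in the original command
        "-t", str(target_db),    # target level in dB
        "-c:a", codec,           # audio codec instead of -ofmt f32le
        "-f",                    # overwrite an existing output file
        "-o", outfile,
    ]


cmd = build_normalize_command("test.wav", "out.wav")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

This only addresses the output bit depth. The accuracy of the measurement pass for float input is a separate matter, covered by the commits listed above.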