I'm relying on this guidance to split a WAV file at silent segments while preserving all of the silence. My problem is that sox misjudges the length of silent segments by roughly an order of magnitude. For example, with the following command I get splits at points that have only a second or two of silence, rather than the ten seconds I specified:
sox input.wav output.wav silence -l 0 1 10.0 0.1% : newfile : restart
To get splits only where there are longer silent segments, I have been trying higher values by trial and error; a setting of about 90.0 seconds appears to correspond to an actual 5-7 seconds of silence.
I assume this has something to do with missing timing information in the input file, which was created by capturing sound from an ALSA loopback device with this command:
sox -t alsa hw:1,1 -c 2 input.wav
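One way to test that assumption is to compare the duration sox reads from the header with the duration implied by the file size. What follows is a minimal sketch, not a confirmed diagnosis: it assumes `soxi` is installed alongside sox, that the file is uncompressed PCM named `input.wav` as above, and that the header size is negligible. `pcm_duration` is a hypothetical helper name introduced only for this example.

```shell
#!/bin/sh
# Hypothetical helper: seconds of audio implied by a raw PCM payload.
# $1 = size in bytes, $2 = sample rate (Hz), $3 = channels, $4 = bits per sample
pcm_duration() {
    awk -v b="$1" -v r="$2" -v c="$3" -v w="$4" \
        'BEGIN { printf "%.3f\n", b / (r * c * w / 8) }'
}

# Only run the comparison if soxi and the capture file are actually present.
if command -v soxi >/dev/null 2>&1 && [ -f input.wav ]; then
    rate=$(soxi -r input.wav)      # sample rate from the header
    chans=$(soxi -c input.wav)     # channel count from the header
    bits=$(soxi -b input.wav)      # bits per sample from the header
    bytes=$(wc -c < input.wav)     # file size, including the small WAV header

    echo "header:          ${rate} Hz, ${chans} ch, ${bits} bit"
    echo "soxi duration:   $(soxi -D input.wav) s"
    echo "size-based est.: $(pcm_duration "$bytes" "$rate" "$chans" "$bits") s"
fi
```

If the two durations differ substantially, or neither matches the wall-clock length of the recording, the header would be the first thing to suspect; if they agree, the discrepancy more likely lies in how the `silence` effect is being used.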
Is there something different I can do with my workflow, either at the capture stage or the split stage, so that sox correctly identifies the length of silent segments?
Asked by Adam J. Kessel
(81 rep)
Mar 13, 2025, 07:22 PM