
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

2 votes
1 answers
2243 views
Can I record sound until silence OR a maximum length of recording?
Looking at detecting sound until some silence occurs, I arrived at the command rec recording.flac rate 32k silence -l 1 0.1 3% 1 3.0 3%. I realize my specific use would be somewhat different: I do want to record until some silence is detected, but I also want an upper limit, say 10-15 seconds, of how long the recording will go on before moving on. I can just prepend a timeout 15s command, which would give me a maximum speech time of (15 seconds - leading silence, which will vary), but is there some way to tell sox I only need the first x seconds of a recording, which would give me a maximum speech time of 15 secs regardless of leading silence?
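One possible approach, sketched under the assumption that SoX runs effects in the order they are listed: a `trim 0 N` placed after `silence` should cap the recording at N seconds counted from the first detected sound, not from the start of capture. The function name `rec_until_silence` and its default of 15 seconds are illustrative, not taken from the question.

```shell
#!/bin/sh
# Sketch: record until 3 s of silence OR until a maximum duration is reached.
# rec_until_silence and its defaults are hypothetical names for illustration.
rec_until_silence() {
    out=$1
    max_seconds=${2:-15}
    # "silence" drops leading silence and stops after 3.0 s of quiet;
    # "trim 0 N" comes after it, so it only sees audio that "silence"
    # lets through and keeps at most the first N seconds of that.
    rec "$out" rate 32k \
        silence -l 1 0.1 3% 1 3.0 3% \
        trim 0 "$max_seconds"
}
```

Called as `rec_until_silence recording.flac 15`. Whether the recording ends promptly at the limit may depend on the SoX version and audio driver, so it is worth testing on the target machine.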
Niklas Raatikainen (21 rep)
Nov 13, 2015, 11:10 AM • Last activity: Jul 27, 2025, 10:04 PM
9 votes
4 answers
13021 views
How to create the spectrogram of many audio files efficiently with Sox?
I have a bunch of audio files and I would like to create the spectrogram for each individual file using Sox. Usually, for a single file, I do this:
sox audiofile.flac -n spectrogram
However I don't know how to extend this method to more than one file. Ideally I would like my output .png file to have a filename associated to its respective audio file; for example audiofile1.png for audiofile1.flac, audiofile2.png for audiofile2.flac and so on. Does anybody know how to do this?
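A minimal sketch of one way to do this, assuming all the files sit in one directory and end in `.flac`: loop over them and pass the spectrogram effect's `-o` option to name each PNG after its audio file. The function name `spectrogram_all` is illustrative.

```shell
#!/bin/sh
# Sketch: one spectrogram PNG per FLAC file, named after the audio file.
# spectrogram_all is a hypothetical helper name.
spectrogram_all() {
    dir=${1:-.}
    for f in "$dir"/*.flac; do
        [ -e "$f" ] || continue      # no matches: the glob stays literal, skip it
        # -o sets the PNG name; without it SoX writes spectrogram.png
        sox "$f" -n spectrogram -o "${f%.flac}.png"
    done
}
```

Running `spectrogram_all /path/to/audio` would then produce audiofile1.png next to audiofile1.flac, and so on. The loop is sequential; running several `sox` processes in parallel is a separate optimisation not shown here.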
Carl Rojas (1139 rep)
May 3, 2015, 09:19 PM • Last activity: May 1, 2025, 10:20 PM
0 votes
0 answers
60 views
sox split by silence incorrectly detects lengths of segments
I'm relying on this guidance to split a WAV file by silent segments, preserving all silence. My problem is that sox incorrectly detects the length of segments by approximately an order of magnitude. For example, using the following command, I get segments that only have a second or two of silence between them, rather than ten seconds I've specified:
sox input.wav output.wav silence -l 0 1 10.0 0.1%: newfile : restart
In order to only have splits occur where there are longer silent segments, I have been trying higher values just by trial and error -- something like 90.0 seconds appears to approximate an actual 5-7 seconds of silence. I assume this must be something about lack of timing information in the input file, which was created by capturing sound from an ALSA loopback device and this command:
sox -t alsa hw:1,1 -c 2 input.wav
Is there something different I can do with my workflow, either at the capture stage or the split stage, so that sox correctly identifies the length of silent segments?
Adam J. Kessel (81 rep)
Mar 13, 2025, 07:22 PM
1 votes
0 answers
66 views
How to make Microsoft Teams use a virtual microphone
My goal is to alter my voice in a particular way and redirect it to MS Teams. I managed to pipe my voice from my headphones microphone to a virtual microphone named wh40k like this: parec -d $PHYSICAL_MIC | sox $FILTERS | pacat -d $VIRTUAL_MIC. In detail, this is my code:
VIRTUAL_MIC="wh40k"
# Hard-coded headphones microphone name which I got from pactl list
PHYSICAL_MIC="alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__Headphones__sink"

# Check for required software: quit if any not found
for CMD in parec sox pacat; do
    if ! command -v "$CMD" &>/dev/null; then
        echo "Error: $CMD not found.."
        exit 1
    fi
done

cleanup() {
    echo "Removing virtual microphone $VIRTUAL_MIC..."
    pactl unload-module "$MODULE_ID"
    exit 0
}

if ! pactl list sinks short | grep -q "$VIRTUAL_MIC"; then
    echo "Creating virtual microphone '$VIRTUAL_MIC'..."
    MODULE_ID=$(pactl load-module module-null-sink \
        sink_name=$VIRTUAL_MIC \
        sink_properties=device.description="$VIRTUAL_MIC")
    if [[ -z "$MODULE_ID" ]]; then
        echo "Error: could not create virtual microphone '$VIRTUAL_MIC'."
        exit 1
    fi
else
    echo "Virtual microphone '$VIRTUAL_MIC' already exists."
fi

trap cleanup INT TERM

echo -e "\nConfigurated sinks:"
pactl list sinks short
echo -e "\nConfigured sources (monitor):"
pactl list sources short
echo -e "\nConfiguring '$VIRTUAL_MIC' audio pipeline..."

# $PITCH and other filters below are vars like "pitch -750"
parec -d "$PHYSICAL_MIC" --raw --format=s16le --rate=44100 --channels=1 | \
    sox -t raw -r 44100 -e signed -b 16 -c 1 - -t raw - \
    $PITCH $OVERDRIVE $GAIN $REVERB $EQUALIZER $BASS $TREBLE $CHORUS $ECHO | \
    pacat --raw --device="$VIRTUAL_MIC".monitor
I can confirm that parec | sox works because I tested it with parec | sox | pacat and sox filters were correctly applied. Now I can't use the wh40k virtual mic in MS Teams: when I select MS Teams > Settings > Devices > Microphone: Monitor of wh40k, I can't hear my voice in the Teams test call. Any clue on how I could troubleshoot this?
My versions:
- MS Teams 1.5.00.23861 (64-bit)
- Linux Mint 21, kernel 6.8.0-40-generic
- pipewire 1.2.7
elmazzun (169 rep)
Jan 2, 2025, 09:59 PM • Last activity: Jan 5, 2025, 01:21 PM
2 votes
1 answers
871 views
sox gives inconsistent results mixing stereo to mono
Whichever of the following ways of mixing stereo to mono I try *repeatedly*, the resulting files are always different (md5 hashes do not match):
sox stereo.wav -c 1 mono.wav
sox stereo.wav mono.wav remix 1,2
sox stereo.wav mono.wav remix 1-2
The differences appear to be in the binary meat of the audio, not in the headers. If I use say ffmpeg, the resulting file is always the same:
ffmpeg -hide_banner -i stereo.wav -ac 1 mono.wav
Does sox use some sort of randomness in the mixing-down algorithm? Why?
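A hedged sketch of two things to try, on the assumption that the run-to-run differences come from the dither SoX applies automatically when writing reduced-precision output: the global `-R` option runs SoX in "repeatable" mode with a fixed random seed, and `-D` switches automatic dithering off. The function names are illustrative.

```shell
#!/bin/sh
# Sketch: two ways to get byte-identical mono mixdowns on repeated runs.
# mixdown_repeatable / mixdown_no_dither are hypothetical helper names.
mixdown_repeatable() {
    # -R: repeatable mode, seeds SoX's random number generators with a fixed value
    sox -R "$1" -c 1 "$2"
}

mixdown_no_dither() {
    # -D: disable automatic dithering altogether
    sox -D "$1" -c 1 "$2"
}
```

Running either function twice on stereo.wav and comparing the outputs with md5sum should show whether dither was the cause.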
Greendrake (459 rep)
Jan 19, 2022, 09:33 AM • Last activity: Sep 3, 2024, 12:56 PM
8 votes
3 answers
19984 views
sox returns an error when I try to handle mp3 files
Hello, so here is the deal. I used:
$ yum install sox
to install it on CentOS 6. After that I did a quick test:
$ sox test.mp3 test.amr
and this is what it returns:
$ sox formats: no handler for file extension `mp3'
I need this done with sox, not lame, because I will need to use it for mixing and other functions not available with lame.
cppit (181 rep)
Nov 2, 2013, 07:56 AM • Last activity: Feb 5, 2024, 06:52 PM
0 votes
1 answers
95 views
Playing a sound every 15 minutes at a specific volume
I have monitor speakers that go into a sleep state when they don't receive a certain level of sound input. The speakers often turn off when I am watching/listening to something quietly. To remedy this, I have a crontab set to play a 10Hz wave file every 15 minutes using Sox Play:
*/15 * * * * XDG_RUNTIME_DIR=/run/user/1000 /usr/bin/play -v 1.25 /home/USER/rmas/10Hz.wav
This works much of the time, but I have to be sure to have my system volume up around 40+% and my application volume super low. This is due to the -v (--volume) option being **relative** to system volume. I need a way to do this in an absolute fashion, where the sound is played at 100% (or a user-specified variable) volume without affecting the sound level of anything else. And I do not want any windows popping up (action should occur completely in the background). Am I able to do this with Sox? Or what is another route I could take?
* Pop!_OS 22.04
* KRK Rokit 5 Monitors (Gen 2)
* Scarlett 2i2 USB Audio Interface routing sound to speakers.
keenwa (1 rep)
Dec 12, 2023, 11:07 AM • Last activity: Dec 14, 2023, 10:17 AM
0 votes
0 answers
1108 views
Low-latency realtime sound filtering with Pulseaudio and Sox
I'm using Linux for audio experimentation, so PulseAudio and ALSA. I'm having trouble achieving a consistent low latency. I had an idea that I could cover up unwanted environmental noises (such as sirens or backup alarms) by using my computer to create noise in the same part of the frequency spectrum. A very simplistic way of doing this is to multiply input samples by some power-law noise such as "pink noise" (1/f) or "brown noise" (1/f^2) and play the result out of a speaker. I think this corresponds to a convolution in the frequency domain, so it should have the effect of making frequency spikes wider and less annoying. I'm not a big fan of PulseAudio, but it is the standard application-level audio framework in Linux, and it seems to be the easiest tool which is able to do variable-rate resampling. Resampling is used to correct clock skew when working with multiple devices (in this case microphone and speakers). I got some advice for reducing latency [for PulseAudio here](https://juho.tykkala.fi/Pulseaudio-and-latency) and [for Unix pipes here](https://unix.stackexchange.com/a/730565/118662) . I have a Sox command which implements the filter effect I want, but I can't figure out how to get PulseAudio's input and output to have a predictable latency. The following simplified (Zsh) pipe command just sends samples directly from the microphone to speakers, but sometimes when I run it the subjective latency is almost negligible, and sometimes the latency is near 500ms (for example if I snap my fingers in front of the microphone, I might hear it immediately on some runs, and on other runs it'll echo twice a second). These differences occur when I just restart the pipe; I don't have to restart the PulseAudio server. 
PFMT=(--rate 48000 --format s16le --channels 1)
pacat -r --latency-msec=1 $PFMT | pacat --latency-msec=1 $PFMT
I tried putting stdbuf -o64 -i64 before each pacat, in case the problem was caused by the Unix pipe buffer, but this doesn't seem to change the behavior. I can always kill the pipe and restart it, and keep repeating until I get the pipe started up with low latency, but it would be nice to have a solution that works every time. I can't figure out from the PulseAudio logs what the difference is between the high-latency runs and the low-latency runs. From a low-latency run (the first line is a virtual "monitor" source):
$ (pactl list sources; pactl list sinks) | grep Latency
Latency: 0 usec, configured 1999818 usec
Latency: 4193 usec, configured 66000 usec
Latency: 2861 usec, configured 15012 usec
From a high-latency run:
$ (pactl list sources; pactl list sinks) | grep Latency
Latency: 0 usec, configured 1999818 usec
Latency: 505 usec, configured 66000 usec
Latency: 3305 usec, configured 15012 usec
Here are some relevant lines in the PulseAudio config, which I copied from Internet advice. I'm not sure that any of them are having an effect.
# .config/pulse/daemon.conf
;; https://forums.linuxmint.com/viewtopic.php?t=44862
default-fragments = 2
default-fragment-size-msec = 5
high-priority = yes
rlimit-nice = 31
nice-level = -11
realtime-scheduling = yes
rlimit-rtprio = 9
realtime-priority = 9
I'm running a version of PulseAudio which is a few years old, so please let me know if I'm running into some known bug that has already been fixed. Here is the full noise-multiplying command (Zsh again) that I want to run, it suffers from the same unpredictable latency problem as the simple pipe above. It is not really relevant to the latency problem I'm currently encountering, except that this is why I'm not just using PulseAudio's module-loopback to route samples from the source to the sink.
SFMT=(-e signed -r 48000 -b 16 -c 1 -t raw)
PFMT=(--rate 48000 --format s16le --channels 1)
STDB=(stdbuf -o64 -i64)
sox -n $SFMT - synth brownnoise vol 0.01 | sox --buffer 64 -T $SFMT - $SFMT ($STDB pacat --latency-msec=1 $PFMT) vol 100
Thanks.
----
Update, 5 December 2023: To answer some questions in the comments about my audio setup and about ALSA. The input is a USB microphone "JMTek, LLC. USB PnP Audio Device", output is my laptop's built-in 3.5mm audio jack, via an AMD audio controller. If I use ALSA (aplay, arecord) then it seems to achieve low latency more consistently, but I get strange messages like "underrun!!! (at least 806.752 ms long)" even when I use -B to shorten the buffer size to 50 microseconds. Also, unlike with Pulse, the apparent latency with ALSA sometimes will gradually increase over several minutes (from say 10ms to 100ms). Like Pulse, ALSA also has the problem of random latency changes from one invocation to another - sometimes I get 400ms - but as I said it seems to be more often that the latency is small with ALSA. Here is the shell code I use to experiment with ALSA. Note that I'm reading and writing directly from/to the devices, without using the 'plug' PCM to change rates or channel counts.
#!/bin/zsh
SFMT=(-e signed -r 48000 -b 16 -c 1 -t raw)
AFMT=(-r 48000 -f S16_LE -c 1)
AOPT=(-B 50 $AFMT)
STDB=(stdbuf -o64 -i64)
sox -n $SFMT - synth brownnoise vol 0.01 | sox --buffer 64 -T $SFMT - $SFMT ($STDB aplay $AOPT -c 2 -Dhw:1) vol 300
Example output from the above ALSA experiment:
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 48000 Hz, Mono
Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
underrun!!! (at least 18304.245 ms long)
underrun!!! (at least 870598.858 ms long)
underrun!!! (at least 241414.917 ms long)
underrun!!! (at least 1.451 ms long)
underrun!!! (at least 12.687 ms long)
overrun!!! (at least 4.934 ms long)
underrun!!! (at least 10.253 ms long)
underrun!!! (at least 11.326 ms long)
overrun!!! (at least 0.549 ms long)
...
Metamorphic (1219 rep)
Dec 3, 2023, 10:42 PM • Last activity: Dec 6, 2023, 06:28 AM
1 votes
2 answers
1579 views
Detecting sound / silence on a sox pipe?
I am trying to keep a sox pipe input from a sound card open and execute a player command only when there is sound in the pipe (without killing the pipe or using a file). This could be easily achieved with sox silence 1 0.1 5% -1 0.1 5% for files, but when I use it for a pipe output it doesn't work. This is the sox record command I'm using:
/bin/sox -V2 -q \
-r 48000 -b 16 -c 2 -t alsa hw:CARD=sndrpihifiberry,DEV=0 \
-t wav -r 44100 -b 16 -c 2 - \
silence 1 0.1 0.1% -1 2 0.5% \
> "$streamFile" &
I would like to attach and detach a player to the pipe only when there's a sound in the pipe. Something like:
while [ true ]; do 
  
        until [ WAIT FOR  SOUND ]; do
        
        TEST FOR SOUND IN THE PIPE
        
        done
        
        echo "Sound Detected starting @ $(date)" >> $log
        /usr/bin/player > $log

done
Any ideas?
nadigo (11 rep)
Jan 6, 2022, 11:46 PM • Last activity: Nov 28, 2023, 04:36 PM
0 votes
1 answers
164 views
Capture output from SOX "-n stat"
I am trying to capture/pipe the output from the following:
arecord -f S16_LE -qd 5 file && sox file -n stat
output:
Samples read: 8000
Length (seconds): 1.000000
Scaled by: 2147483647.0
Maximum amplitude: 0.992188
Minimum amplitude: -0.992188
Midline amplitude: 0.000000
Mean norm: 0.093221
Mean amplitude: -0.015338
RMS amplitude: 0.232947
Maximum delta: 0.617188
Minimum delta: 0.000000
Mean delta: 0.001067
RMS delta: 0.009643
Rough frequency: 52
Volume adjustment: 1.008
I need to capture the data to convert to json. Issue is "SOX" seems to defy any method I would normally use to capture/pipe the stdout. Any recommendations?
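The `stat` effect writes its report to standard error, not standard output, which is why an ordinary pipe sees nothing. The sketch below redirects stderr into the pipe and converts the `key: value` lines to a flat JSON object with awk. The function name `stat_to_json` is illustrative, and the filter assumes every value of interest is a plain number; any other line, such as a SoX warning, is skipped.

```shell
#!/bin/sh
# Sketch: turn "Key:   value" lines (as printed by `sox ... -n stat`) into JSON.
# stat_to_json is a hypothetical helper name; it reads the report on stdin.
stat_to_json() {
    awk -F':' '
        NF >= 2 {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)     # strip surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            # keep only plain numeric values; skip warnings and other text
            if (val ~ /^-?[0-9]+(\.[0-9]+)?$/)
                printf "%s\"%s\": %s", (n++ ? ", " : "{"), key, val
        }
        END { print (n ? "}" : "{}") }
    '
}
```

A typical call would be `sox file -n stat 2>&1 | stat_to_json`, where `2>&1` is what makes the report reach the pipe.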
Mark Miller (3 rep)
Nov 17, 2023, 05:08 PM • Last activity: Nov 17, 2023, 05:29 PM
2 votes
1 answers
1013 views
Why does `read` fail saying "read error: 0: Resource temporarily unavailable"?
**script**
#!/bin/bash --

# record from microphone
rec --channels 1 /tmp/rec.sox trim 0.9 band 4k noiseprof /tmp/noiseprof &&


# convert to mp3
sox /tmp/rec.sox --compression 0.01 /tmp/rec.mp3 trim 0 -0.1 &&


# play recording to test for noise
play /tmp/rec.mp3 &&


printf "\nRemove noise? "
read reply


# If there's noise, remove it
if [[ $reply == "y" ]]; then
  sox /tmp/rec.sox --compression 0.01 /tmp/rec.mp3 trim 0 -0.1 noisered /tmp/noiseprof 0.1
  play /tmp/rec.mp3
fi
**Errors with**: read error: 0: Resource temporarily unavailable **But**, the script works if I use the -e flag on read to enable readline
Pound Hash (327 rep)
Nov 16, 2023, 10:34 PM • Last activity: Nov 17, 2023, 01:21 AM
5 votes
2 answers
4324 views
Piping Sox and FFMPEG together
I'm processing a variety of audio files in a bunch of different formats and I'd like to unify their format and configuration using FFMPEG and SoX. There are two steps to my process:
1. Convert the file, whatever it may originally be, to a PCM 16-bit little-endian WAV file:
ffmpeg -i input.wav -c:a pcm_s16le output.wav
2. Process the file in Sox to make it conform to the sample rate and channel count that we need:
sox input.wav output.flac channels 2 rate 44.1k
I'd ideally like to pipe these two commands together so as to avoid creating an unnecessary file. I'm having a lot of trouble actually getting the format to work properly, though. SoX complains that it needs to explicitly know the format of the incoming audio, which is something that I don't even know at execution time. I know the format of the PCM audio, but I'm not sure of the channel count or the sample rate of the incoming audio. Is there a way to pipe these two commands together, or better, to only have to use one tool for the job? The reason I've used two tools rather than just trying to do it with one:
### FFMPEG ###
* Not sure if there's a way to safely convert a mono audio stream to a stereo audio stream by duplicating the channels. (SoX does this natively.)
* Not sure how to change sample rate. (SoX does this natively.)
* Not sure how to output to FLAC using the best compression rate.
### SoX ###
* Not able to do audio format detection as well as FFMPEG does. If I have a file without an extension, SoX asks me to manually specify the format, which doesn't work at all for my application.
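A sketch of both routes, with assumptions stated: because a WAV stream carries its own header, SoX only needs to be told the container type (`-t wav`) and can read channel count and sample rate from the stream itself. The second function assumes FFmpeg's `-ac`, `-ar` and the FLAC encoder's `-compression_level` cover the remaining requirements. Both function names are illustrative.

```shell
#!/bin/sh
# Sketch: normalise any input to stereo 44.1 kHz FLAC without a temp file.
# normalize_audio and normalize_ffmpeg_only are hypothetical helper names.
normalize_audio() {
    in=$1 out=$2
    # ffmpeg detects the input format and writes 16-bit PCM WAV to stdout;
    # sox reads that WAV from stdin ("-") and takes rate/channels from its header.
    ffmpeg -nostdin -loglevel error -i "$in" -f wav -c:a pcm_s16le - |
        sox -t wav - "$out" channels 2 rate 44.1k
}

normalize_ffmpeg_only() {
    # Single-tool variant: -ac sets the channel count, -ar the sample rate,
    # and -compression_level 12 asks the FLAC encoder for maximum compression.
    ffmpeg -nostdin -loglevel error -i "$1" -ac 2 -ar 44100 -compression_level 12 "$2"
}
```

Because the WAV header written to a pipe cannot be patched with the final length, SoX may print a warning about the header; the audio is normally still read to end of stream.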
Naftuli Kay (41346 rep)
Nov 5, 2013, 12:03 AM • Last activity: Nov 8, 2023, 10:20 AM
0 votes
1 answers
136 views
How to bulk edit wav files?
I need to cut off the first say 3 seconds from a batch of wav files. Is there a way to do it from the command line or using a linux native program? Thanks.
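A minimal sketch using SoX, assuming the files end in `.wav` and that writing the results to a separate directory is acceptable (SoX cannot safely overwrite its own input). The name `trim_start_all` and the 3-second default are illustrative.

```shell
#!/bin/sh
# Sketch: drop the first N seconds of every WAV file in a directory.
# trim_start_all is a hypothetical helper name.
trim_start_all() {
    src=$1 dst=$2 secs=${3:-3}
    mkdir -p "$dst"
    for f in "$src"/*.wav; do
        [ -e "$f" ] || continue      # directory without .wav files
        # "trim N" with one position skips the first N seconds, keeps the rest
        sox "$f" "$dst/$(basename "$f")" trim "$secs"
    done
}
```

For example, `trim_start_all ./samples ./trimmed 3` leaves the originals untouched and writes the shortened copies under ./trimmed.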
black-clover (383 rep)
Aug 21, 2023, 03:59 AM • Last activity: Sep 19, 2023, 10:38 PM
-1 votes
2 answers
358 views
How to fade out batch wav files?
I need to apply a fade out of 2 seconds on a batch of wav files (with names including spaces like C 0 120-127.wav). The files length varies, but I need a fade out of 2 seconds from the end of each file, regardless of the file length. I know this can be done with sox, but the tutorials I checked include other actions (fade in or convert) or are for single files where you specify input and output, and this is confusing (I'm a sox newbie). I need the files names to remain the same, although piping the output to another folder is also acceptable. Thanks.
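One way this could look, as a sketch: the `fade` effect takes a fade-in length, a stop position and a fade-out length, and a stop position of `-0` means "end of the audio" when SoX can read the length from the file header, which is the usual case for WAV files. The name `fade_out_all` is illustrative; output goes to another folder with the same file names.

```shell
#!/bin/sh
# Sketch: apply an N-second fade-out to the end of every WAV file.
# fade_out_all is a hypothetical helper name.
fade_out_all() {
    src=$1 dst=$2 secs=${3:-2}
    mkdir -p "$dst"
    for f in "$src"/*.wav; do
        [ -e "$f" ] || continue
        # fade <fade-in> <stop-position> <fade-out>:
        #   0     = no fade-in
        #   -0    = stop at the end of the audio, whatever its length
        #   $secs = length of the fade-out
        sox "$f" "$dst/$(basename "$f")" fade 0 -0 "$secs"
    done
}
```

Quoting `"$f"` everywhere is what keeps names with spaces, such as C 0 120-127.wav, intact.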
black-clover (383 rep)
Sep 19, 2023, 05:38 PM • Last activity: Sep 19, 2023, 10:32 PM
1 votes
1 answers
72 views
How to extend sustain in batch wav files?
I have a batch of sound samples which are too short (2.15 sec), and I want to extend the sustain to a total of about 10 seconds, meaning stretch the last 0.50 second of the file to 10 seconds. I can do this on each single file in audacity with paulstretch but was wondering if there's a way to do so in batch from the command line. Here is the original sample: original and here is the result I'd want: stretched
black-clover (383 rep)
Sep 18, 2023, 06:09 PM • Last activity: Sep 18, 2023, 10:42 PM
1 votes
1 answers
70 views
Divide frequency of live audiostream from an ultrasound microphone
I want to record ultrasound with a microphone, divide the frequency and listen to the transposed sound. I start with a normal microphone for tests. I tried sox with rec and play so far. It works, but the latency is 2 seconds.
rec -r 48000 -c 1 -t wav - pitch -1000 | play -t wav -
How can I improve this line and reduce the latency? My audio setup is
# API: ALSA v: k6.1.27-gentoo status: kernel-api
# Server-1: PulseAudio v: 16.1 status: active
Jonas Stein (4298 rep)
Aug 7, 2023, 01:19 AM • Last activity: Aug 7, 2023, 05:48 PM
3 votes
0 answers
1297 views
How to record audio device with more than 2 channels using SoX?
I have multiple USB audio interfaces that each have 10, 18 or even 32 input channels. Mainly used to record every instrument of a band into a separate track. I record in raw wav format (s32le @48kHz) which means I need crazy amounts of storage space if I record all channels. For that reason I need to only record the channels that I actually want to record. I found this to be possible using SoX by specifying the amount of channels I need with the -c flag and using the remix "effect" to select the channels to be recorded. And this little proof of concept shows me that it does work:
$ export SOURCE_NAME="alsa_input.usb-Behringer_FLOW_8_03-FF-02-11-55-44-00.Direct__hw_F8__source"

# Record only 1 channel(s) (-c 1) - The channel(s) to record: 2
$ sox -t pulseaudio "${SOURCE_NAME}" -r 48000 -c 1 -b 16 -e signed-integer output.w64 remix 2
Scaling it up however, doesn't work:
# Record only 4 channel(s) (-c 4) - The channel(s) to record: 1 2 6 8
$ sox -t pulseaudio "${SOURCE_NAME}" -r 48000 -c 4 -b 32 -e signed-integer output.w64 remix 1 2 6 8
For some reason SoX only recognizes the first two channels:
Input File     : 'alsa_input.usb-Behringer_FLOW_8_03-FF-02-11-55-44-00.Direct__hw_F8__source' (pulseaudio)
Channels       : 2
Sample Rate    : 48000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

sox FAIL remix: too few input channels
FFmpeg also fails when recording more than 2 channels:
$ ffmpeg -f pulse -i "${SOURCE_NAME}" -c:a pcm_s32le -ar 48000 -ac 10 -channel_layout 0x3ff output.w64
FFmpeg throws this error:
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, pulse, from 'alsa_input.usb-Behringer_FLOW_8_03-FF-02-11-55-44-00.Direct__hw_F8__source':
  Duration: N/A, start: 1689504465.730127, bitrate: 1536 kb/s
  Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Multiple -ac options specified for stream 0, only the last option '-ac 10' will be used.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s32le (native))
Press [q] to stop, [?] for help
[pcm_s32le @ 0x55bc4d7d6040] Channel layout '10 channels (FL+FR+FC+LFE+BL+BR+FLC+FRC+BC+SL)' with 10 channels does not match number of specified channels 2
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!
Double checking with ffmpeg probe:
$ ffprobe -f pulse -i "${SOURCE_NAME}"
Input #0, pulse, from 'alsa_input.usb-Behringer_FLOW_8_03-FF-02-11-55-44-00.Direct__hw_F8__source':
  Duration: N/A, start: 1689504633.181940, bitrate: 1536 kb/s
  Stream #0:0: Audio: pcm_s16le, 48000 Hz, 2 channels, s16, 1536 kb/s
So my next thought was that PulseAudio itself has a bug. But we can easily check for that using the pactl utility:
$ pactl list sources
Source #1414
    ...
    Name: alsa_input.usb-Behringer_FLOW_8_03-FF-02-11-55-44-00.Direct__hw_F8__source
    ...
    Sample Specification: s32le 10ch 48000Hz
    Channel Map: aux0,aux1,aux2,aux3,aux4,aux5,aux6,aux7,aux8,aux9
    ...
    Volume: aux0: 48287 /  74% / -7.96 dB,   aux1: 48287 /  74% / -7.96 dB,   aux2: 48287 /  74% / -7.96 dB,   aux3: 48287 /  74% / -7.96 dB,   aux4: 48287 /  74% / -7.96 dB,   aux5: 48287 /  74% / -7.96 dB,   aux6: 48287 /  74% / -7.96 dB,   aux7: 48287 /  74% / -7.96 dB,   aux8: 48287 /  74% / -7.96 dB,   aux9: 48287 /  74% / -7.96 dB
            balance 0.00
    ...
    Properties:
        ...
        audio.channels = "10"
        ...
This makes it quite obvious that PulseAudio is aware of all 10 input channels of that USB audio interface. So I tried using PulseAudio's tool parecord:
$ parecord --device=${SOURCE_NAME} --format=s32le --rate=48000 --channels 10 --file-format=w64 output.w64
Warning: failed to write channel map to file.
and although it produced this warning (whatever it means), it did actually record all 10 channels successfully. I was even able to select specific channels like this:
parecord --device=${SOURCE_NAME} --format=s32le --rate=48000 --channels 4 --channel-map=aux0,aux1,aux5,aux7 --file-format=w64 output.w64
So why is this not working with SoX or FFmpeg? I also tried telling SoX to use ALSA instead, but that doesn't work at all:
$ sox -t alsa "plughw:CARD=F8,DEV=0" -r 48000 -c 4 -b 32 -e signed-integer output.w64 remix 1 2 6 8
sox FAIL formats: can't open input  `plughw:CARD=F8,DEV=0': snd_pcm_open error: Device or resource busy
I guess access via ALSA just doesn't work when you have PipeWire and PulseAudio running on top of it. I checked if I can record via ALSA's arecord utility, but I get the same "device busy" error:
$ arecord -D plughw:CARD=F8,DEV=0 -r 48000 -c 10 -f S32_LE -t wav output.wav
arecord: main:867: audio open error: Device or resource busy
Directly recording using PipeWire's pw-record utility worked just fine though btw:
$ pw-record --target ${SOURCE_NAME} --format s32 --rate 48000 --channels 10
And I was also able to select the channels I want to record:
$ pw-record --target ${SOURCE_NAME} --format s32 --rate 48000 --channels 4 --channel-map=aux0,aux1,aux5,aux7 output.w64
I looked into SoX and if it supports PipeWire directly, but that doesn't appear to be the case unfortunately. But since PulseAudio does actually see all channels, I don't understand why SoX and FFmpeg are failing here. Any ideas?
Forivin (1193 rep)
Jul 16, 2023, 11:54 AM • Last activity: Jul 16, 2023, 02:44 PM
0 votes
1 answers
933 views
SoX - Mix original signal with effected signal
Is there an option in SoX effects processing to mix the wet and dry signals instead of only outputting the wet? For example, say my effects chain is overdrive into pitch shift:
sox in.wav out.wav overdrive 0.5 gain -0.5 pitch 700
Except I don't want the final file to be _just_ the shifted signal. I want a mix of the distorted, shifted signal and the distorted, unshifted signal. Does SoX support this somehow?
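A sketch of one workaround, under the assumption that no single effect provides a wet/dry control: render the two chains as separate inputs and combine them with `-m` (mix). SoX accepts an input of the form `"|command"` and runs that command itself, and `-p` makes the inner `sox` write to a pipe in SoX's internal format. The function name `mix_wet_dry` is illustrative.

```shell
#!/bin/sh
# Sketch: mix the distorted (dry) and distorted+pitch-shifted (wet) signals.
# mix_wet_dry is a hypothetical helper name; file names containing double
# quotes would break the inner command strings.
mix_wet_dry() {
    in=$1 out=$2
    # Each "|..." input is a command run by SoX; -p sends its result down a pipe.
    # -m mixes the two inputs sample by sample into one output.
    sox -m \
        "|sox \"$in\" -p overdrive 0.5 gain -0.5" \
        "|sox \"$in\" -p overdrive 0.5 gain -0.5 pitch 700" \
        "$out"
}
```

The balance between the two signals could be adjusted with a `-v` factor in front of each input; whether the pitch-shifted branch stays sample-aligned with the dry branch is not guaranteed and is worth checking by ear.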
vomitHatSteve (3 rep)
Oct 9, 2017, 10:13 PM • Last activity: May 2, 2023, 05:57 PM
1 votes
0 answers
294 views
Find the peak frequency in Khz of audio file using sox or python for mp3 flac and wav files
I wanted to batch analyze audio files for peak frequencies using python or SoX. [Image: file.mp3 with constant bit rate] [Image: file.mp3 with variable bit rate] The spectrograms above were created using sox. I want to extract the peak frequency, e.g. 20 kHz in image 1.
Arjun (11 rep)
Jan 13, 2023, 05:17 PM
-1 votes
1 answers
288 views
sox: How To Trim Audio From a .mp4
Using sox, how do I remove 0:20-0:25 from a .mp4 file? So far I have:
sox filename filename2 trim 20-25
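A hedged sketch: recent SoX versions (14.4.x) let `trim` take several positions and alternate between copying and discarding at each one, and a leading `=` makes a position absolute. So `trim 0 =20 =25` would copy 0 to 20 s, drop 20 to 25 s, then copy the rest. Many SoX builds cannot read MP4 containers, so the sketch assumes the audio is first extracted with ffmpeg. Both function names are illustrative.

```shell
#!/bin/sh
# Sketch: cut a time range out of an audio file with SoX.
# extract_audio and cut_out_range are hypothetical helper names.
extract_audio() {
    # -vn drops the video stream; the output type follows the file extension
    ffmpeg -nostdin -loglevel error -i "$1" -vn "$2"
}

cut_out_range() {
    in=$1 out=$2 from=$3 to=$4
    # trim alternates copy/discard at each position:
    #   0 ..  =from  copied
    #   =from .. =to discarded
    #   =to .. end   copied
    sox "$in" "$out" trim 0 "=$from" "=$to"
}
```

For the case in the question this would be `extract_audio filename.mp4 audio.wav` followed by `cut_out_range audio.wav cut.wav 20 25`. Putting the edited audio back into the MP4 is a separate ffmpeg step not covered here.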
mister mcdoogle (505 rep)
Jan 3, 2023, 05:23 AM • Last activity: Jan 3, 2023, 09:56 AM
Showing page 1 of 20 total questions