Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

1 votes

1 answers

4096 views

Unable to use python speech_recognition lib Microphone class due to ALSA

I am attempting to write a speech recognition program for the Raspberry Pi, but I am facing some issues using Python's speech_recognition library.

From the error messages (posted below), I think the wrong sound card may be being accessed; however, I am able to record with PyAudio (which I think the Microphone class uses) as well as with 'arecord'.

Below is the code I am trying to run:

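    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        while True:
            # Block until a complete phrase has been captured
            audio = r.listen(source)

            try:
                # recognize() and LookupError are the API of early
                # SpeechRecognition releases, as used when this was posted
                print("You said " + r.recognize(audio))
            except LookupError:
                print("Could not understand audio")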


I have made some adjustments to which soundcard is used as default.

My "/etc/modprobe.d/alsa-base.conf" file is untouched and standard.

I have created a file in /home/pi under the name ".asoundrc" which contains:

    pcm.!default {
        type asym
        playback.pcm "hw:0,0"
        capture.pcm "hw:1,0"
    }

This allows for recording from the USB microphone and playback through the on-board headphone jack port.
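To check whether the Microphone class is opening the intended card, it can help to list the devices PyAudio reports and pass an explicit index. The following is a minimal sketch, not a confirmed fix: `pick_input_device`, `list_devices`, and the hint string are hypothetical names chosen for illustration, and it assumes an installed PyAudio plus a SpeechRecognition release whose `Microphone` accepts a `device_index` argument.

```python
def pick_input_device(devices, name_hint):
    """Return the index of the first capture-capable device whose name
    contains name_hint (case-insensitive), or None if nothing matches."""
    hint = name_hint.lower()
    for index, info in enumerate(devices):
        has_input = info.get("maxInputChannels", 0) > 0
        if has_input and hint in info.get("name", "").lower():
            return index
    return None


def list_devices():
    """Collect PyAudio's device table as a list of dicts.

    PyAudio is imported lazily so the helper above can be used (and tested)
    on machines without audio hardware or without PyAudio installed.
    """
    import pyaudio  # assumption: PyAudio is installed on the Pi

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i)
                for i in range(pa.get_device_count())]
    finally:
        pa.terminate()
```

With an index in hand, for example `index = pick_input_device(list_devices(), "usb")`, the bare `sr.Microphone()` call could become `sr.Microphone(device_index=index)`.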

Below is the error message I received when trying to run the python script:


    pi@raspberrypi ~/Desktop $ python speechtester.py
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.front.0:CARD=0'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM front
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.surround40.0:CARD=0'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM surround40
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.surround51.0:CARD=0'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM surround41
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.surround51.0:CARD=0'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM surround50
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.surround51.0:CARD=0'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM surround51
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.surround71.0:CARD=0'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM surround71
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM iec958
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM spdif
    ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.bcm2835.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
    ALSA lib conf.c:4241:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4720:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM spdif
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
    ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
    ALSA lib pcm_dmix.c:957:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
    Cannot connect to server socket err = No such file or directory
    Cannot connect to server request channel
    jack server is not running or cannot be started

Apologies for the relatively long post; I just wanted to provide as much information as possible.

Aphire (131 rep)

Jan 22, 2015, 12:28 PM • Last activity: Jun 18, 2025, 06:09 AM

125 votes

13 answers

88532 views

Is there any decent speech recognition software for Linux?

software-rec speech-recognition

The short version of the question: I am looking for speech recognition software that runs on Linux and has decent accuracy and usability. Any license and price is fine. It should not be restricted to voice commands, as I want to be able to dictate text.


----------
More details:

I have tried the following, with unsatisfying results:

- [CMU Sphinx](http://cmusphinx.sourceforge.net/) 
- [CVoiceControl](http://www.kiecza.net/daniel/linux/) 
- [Ears](http://www.speech.cs.cmu.edu/comp.speech/Section6/Recognition/ears.html) 
- [Julius](http://julius.osdn.jp/) 
- [Kaldi](http://kaldi.sourceforge.net/)  (e.g., [Kaldi GStreamer server](https://github.com/alumae/kaldi-gstreamer-server)) 
- [IBM ViaVoice](https://web.archive.org/web/19990508201353/http://biz.yahoo.com/bw/990426/ny_ibm_1.html) (used to run on Linux but was discontinued years ago)
- [NICO ANN Toolkit](http://nico.nikkostrom.com/) 
- [OpenMindSpeech](http://freespeech.sourceforge.net/) 
- [RWTH ASR](http://www-i6.informatik.rwth-aachen.de/rwth-asr/) 
- [shout](http://shout-toolkit.sourceforge.net/) 
- [silvius](http://voxhub.io/silvius)  (built on the Kaldi speech recognition toolkit)
- [Simon Listens](http://simon-listens.org/index.php?id=122) 
- [ViaVoice / Xvoice](http://xvoice.sourceforge.net/) 
- [Wine + Dragon NaturallySpeaking](https://appdb.winehq.org/objectManager.php?sClass=application&iId=2077)  + [NatLink](https://sourceforge.net/projects/natlink/)  + [dragonfly](https://pypi.python.org/pypi/dragonfly)  +  [damselfly](https://github.com/TristenHayfield/damselfly) 
- https://github.com/DragonComputer/Dragonfire : only accepts voice commands

All the native Linux solutions mentioned above have both poor accuracy and poor usability (and some only allow voice commands, not free-text dictation). By poor accuracy, I mean accuracy significantly below that of the speech recognition software I mention below for other platforms. As for Wine + Dragon NaturallySpeaking, in my experience it keeps crashing, and unfortunately I don't seem to be the only one to have [such issues](https://appdb.winehq.org/objectManager.php?sClass=application&iId=2077).

On Microsoft Windows I use Dragon NaturallySpeaking, on Apple Mac OS X I use Apple Dictation and DragonDictate, on Android I use Google speech recognition, and on iOS I use the built-in Apple speech recognition.

Baidu Research released [yesterday](http://slashdot.org/story/16/01/17/1318228/baidu-releases-open-source-artificial-intelligence-code)  the [code](https://github.com/baidu-research/warp-ctc)  for its speech recognition library using [Connectionist Temporal Classification](http://www.cs.toronto.edu/~graves/icml_2006.pdf)  implemented with Torch. Benchmarks from [Gigaom](https://gigaom.com/2014/12/18/baidu-claims-deep-learning-breakthrough-with-deep-speech/)  are encouraging as shown in the table below, but I am not aware of any good wrapper around to make it usable without quite some coding (and a large training data set):

>|System          |Clean (94)  |Noisy (82)  |Combined (176)
>|:---------------|:----------:|:----------:|:------------:
>|Apple Dictation |   14.24    |  43.76     |  26.73
>|Bing Speech     |   11.73    |  36.12     |  22.05
>|Google API      |    6.64    |  30.47     |  16.72
>|wit.ai          |    7.94    |  35.06     |  19.41
>|**Deep Speech** |  **6.56**  |**19.06**   |**11.85**
>
>Table 4: Results (%WER) for 5 systems evaluated on the original audio. All systems are scored _only_ on the utterances with predictions given by all systems. The number in the parentheses next to each dataset, e.g. Clean (94), is the number of utterances scored.
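
For reference, the %WER figures in the table are word error rates: the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of reference words. Below is a minimal sketch of that calculation using a standard edit-distance recurrence. The function name `word_error_rate` and the sample sentences are illustrative only and are not taken from the benchmark.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference length,
    computed with a word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sheep") and one deletion ("now") over 4 reference words
print(word_error_rate("go to sleep now", "go to sheep"))  # 0.5
```

Multiplying the result by 100 gives a percentage comparable in form to the table entries.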

There exist some very alpha open-source projects:

- https://github.com/mozilla/DeepSpeech  (part of Mozilla's Vaani project: http://vaani.io   ([mirror](https://web.archive.org/web/20170424002550/http://vaani.io/info.html))) 
- https://github.com/pannous/tensorflow-speech-recognition 
- Vox, a system to control a Linux system using Dragon NaturallySpeaking: https://github.com/Franck-Dernoncourt/vox_linux  + https://github.com/Franck-Dernoncourt/vox_windows 
- https://github.com/facebookresearch/wav2letter 
- https://github.com/espnet/espnet 
- http://github.com/tensorflow/lingvo  (to be released by Google, mentioned at Interspeech 2018)

I am also aware of this [attempt at tracking the state of the art and recent results (bibliography) on speech recognition](https://github.com/syhw/wer_are_we), as well as this [benchmark of existing speech recognition APIs](https://github.com/Franck-Dernoncourt/ASR_benchmark).

----

I am aware of [Aenea](https://github.com/dictation-toolbox/aenea), which allows speech recognition via Dragonfly on one computer to send events to another, but it has some latency cost.

I am also aware of these two talks exploring Linux options for speech recognition:

- [2016 - The Eleventh HOPE: Coding by Voice with Open Source Speech Recognition](https://www.youtube.com/watch?v=YRyYIIFKsdU)  (David Williams-King)
- [2014 - Pycon: Using Python to Code by Voice](https://www.youtube.com/watch?v=8SkdfdXWYaI)  (Tavis Rudd) 
                                

Franck Dernoncourt (5533 rep)

Jan 18, 2016, 06:04 PM • Last activity: Mar 19, 2025, 08:46 PM

7 votes

3 answers

2281 views

Comfortable offline speech recognition software for Linux?

software-rec speech-recognition

I'm looking for _offline_ speech recognition software for Linux that can also handle German and is easy to use and configure.

I have already tried CMU Sphinx and a few others, but they all had one thing in common: they were far too complicated to install and use, mainly because of the lack of a good manual and a very crude concept (I try to avoid the word "usability" in this context).

So, is there speech recognition software out there that can be set up and configured in finite time, can execute scripts on recognised commands, and works fully offline, meaning it does not need a cloud service or remote server to analyse spoken words? I'm also willing to pay for a working and usable solution!

Every hint and idea is welcome!

Thanks!

PS: I'm aware of the thread https://unix.stackexchange.com/questions/256138/is-there-any-decent-speech-recognition-software-for-linux  - but the answers given there do NOT point to offline solutions!

Mike Maikaefer (73 rep)

Mar 14, 2018, 11:42 AM • Last activity: Sep 9, 2022, 09:47 PM

4 votes

3 answers

2343 views

What are some current transcription or dictation software packages for Linux?

linux repository software-rec text-to-speech speech-recognition

The Mozilla deepspeech project is interesting, but perhaps not sufficiently sophisticated. My results, at least, were underwhelming.

Online transcription or dictation services are fine, but an offline software package would be preferred.

Is this just not that common on Linux and with open source software? I'm looking to get transcriptions from mp3 files.

*I would prefer not to upload files or use an API that relies on a similar service.*

Nicholas Saunders (565 rep)

May 9, 2021, 06:54 PM • Last activity: Aug 5, 2021, 04:12 PM

1 votes

0 answers

95 views

speech recording and translate

flac text-to-speech speech-recognition wav

I have a problem with converting from "wav" to "flac". Command:

    arecord -D plughw:0,0 -f cd -t wav -d 0 -q -r 16000 | flac - -s -f --best --sample-rate 16000 -o daveconroy.flac

It always gives:

> ERROR: raw format options (--endian, --sign, --channels, --bps, and --sample-rate) are not allowed for non-raw input

While I can record with the following command, playback is extremely noisy:

    arecord -t raw -f S16_LE -r 8000 | flac - -f --endian little --sign unsigned --channels 1 --bps 16 --sample-rate 8000 -s -c -o test.flac

Project links:

- https://daveconroy.com/how-to/turn-raspberry-pi-translator-speech-recognition-playback-60-languages/
- https://makezine.com/projects/universal-translator/

I have searched with my colleague for hours, unfortunately without success, and have also tried sox. Maybe there is a Linux professional among you who can help me.
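
The error message itself suggests one thing to try: when `flac` receives WAV input it reads the sample rate from the WAV header, so the raw-only option `--sample-rate` can be left out. Below is a minimal sketch of that pipeline driven from Python. The helper names `build_pipeline` and `record` are hypothetical, the device string is copied from the command above, and a fixed recording duration is used as a simplifying assumption. It has not been verified on the hardware in question. For the raw variant, note that `S16_LE` denotes signed samples, so `--sign unsigned` is a plausible source of the noise.

```python
import subprocess


def build_pipeline(output_path, device="plughw:0,0", rate=16000):
    """Return the (arecord, flac) argument lists for a WAV-to-FLAC pipe.

    No raw-format options are passed to flac, because the WAV header
    already carries the sample rate, channel count, and sample format.
    """
    arecord_cmd = ["arecord", "-D", device, "-f", "cd", "-t", "wav",
                   "-q", "-r", str(rate)]
    flac_cmd = ["flac", "-", "-s", "-f", "--best", "-o", output_path]
    return arecord_cmd, flac_cmd


def record(output_path, seconds):
    """Record for a fixed number of seconds and encode to FLAC."""
    arecord_cmd, flac_cmd = build_pipeline(output_path)
    arecord_cmd += ["-d", str(seconds)]
    recorder = subprocess.Popen(arecord_cmd, stdout=subprocess.PIPE)
    encoder = subprocess.Popen(flac_cmd, stdin=recorder.stdout)
    recorder.stdout.close()  # let the encoder see EOF when arecord exits
    encoder.wait()
    recorder.wait()
    return encoder.returncode
```

Calling `record("daveconroy.flac", 5)` would then capture five seconds of audio, assuming both `arecord` and `flac` are installed.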

chilyourlifefist (11 rep)

Jul 16, 2021, 03:44 PM • Last activity: Jul 16, 2021, 03:50 PM

2 votes

1 answers

272 views

How can I run speech to text and save the result in a variable?

shell assignment speech-recognition

I would like to speak into my computer's microphone, have what I say converted to text and then have that available as a shell variable.

Is this possible? I thought I might do it using Google's speech input feature.

venkatraman (21 rep)

Dec 29, 2015, 08:24 AM • Last activity: May 27, 2021, 06:16 PM

1 votes

1 answers

237 views

How to manually begin/end speech recognition with X11?

speech-recognition

Having found speech recognition software that works well (see this question), I'm still left needing integration: in my case, an easy way to activate it.

The outcome I'm looking for is:

- Press a shortcut to begin dictation.
- Press a shortcut to end dictation.
- The result is typed out, as if I was typing it in from a keyboard.

---

This is something I could probably manage using shell scripts (manually control an audio recorder, then use xdotool to type out the result). However, solutions might already be out there, so I'm asking this question.

ideasman42 (1461 rep)

May 23, 2021, 03:11 AM • Last activity: May 26, 2021, 12:15 PM

1 votes

0 answers

301 views

Feedback on voice-recognition software for Linux

ubuntu speech-recognition

I'd like to get some feedback on any of the voice recognition software available for Linux, free or paid.

It should be able to type into any program and trigger the Enter key and a right mouse click. I'm currently using NaturallySpeaking.

I would like to switch from Windows 10. Thank you.

Daniel.B (11 rep)

Nov 1, 2020, 07:04 AM

4 votes

1 answers

307 views

Detect simple voice commands

speech-recognition

I would like to detect simple words or phrases from my microphone and perform actions based on those phrases. I've looked into Python libraries and Google text-to-speech, but these seem like extreme overkill [1]. I don't need something that is capable of recognizing every phoneme or word in the English language; I just want to detect certain phrases like "go to sleep" or even just "sleep" to make my computer sleep, for example.

I tried searching for this, but mostly I just get programs for dictation and posts from 10 years ago.

[1] For example, I stumbled across [this article](https://realpython.com/python-speech-recognition/) that relies on web services or installing something heavy-duty like Sphinx. Can't I just train a model to respond to certain phrases instead of every possible phrase?
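
Whatever produces the text, the step from recognized words to an action can stay very small. Below is a minimal sketch of that mapping only. It assumes some recognizer already yields a transcript string, and the `COMMANDS` table, the action labels, and `match_command` are hypothetical names chosen for illustration. It does not perform any speech recognition itself.

```python
import re

# Hypothetical phrase-to-action table; the action labels are placeholders
COMMANDS = {
    "go to sleep": "suspend",
    "sleep": "suspend",
    "lock screen": "lock",
}


def match_command(transcript, commands=COMMANDS):
    """Return the action for the longest known phrase found in transcript,
    matching on whole words only, or None if no phrase is present."""
    words = re.findall(r"[a-z']+", transcript.lower())
    normalized = " " + " ".join(words) + " "
    # Try longer phrases first so "go to sleep" wins over "sleep"
    for phrase in sorted(commands, key=len, reverse=True):
        if f" {phrase} " in normalized:
            return commands[phrase]
    return None
```

Matching on whole words keeps "asleep" from triggering the "sleep" command, which is the kind of false positive a substring test would produce.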

Display name (1337 rep)

Oct 25, 2020, 10:45 AM • Last activity: Oct 25, 2020, 03:58 PM

1 votes

1 answers

2684 views

Installing Simon Listens on Linux Mint

linux linux-mint software-installation speech-recognition

Recently I heard about the Simon Listens package, which enables you to create a speech recognition engine on Linux as well as Windows. I have Linux Mint 14 (Cinnamon) installed on my laptop.

I wanted to install Simon Listens on this system, so I downloaded the most recent version (0.4.0) from here and extracted the files. However, I cannot run the build.sh script. When I double-click it, a window pops up asking whether I want to run it or run it in a terminal. Regardless of which option I select, a terminal window flashes briefly and closes before I can read it, so I can't install it. How can I get it to work?

**EDIT** Here is what the sources.list says:

    deb http://packages.linuxmint.com/ nadia main upstream import
    deb-src http://packages.linuxmint.com/ nadia main upstream import #Added by software-properties
    deb http://archive.ubuntu.com/ubuntu/ quantal main restricted universe multiverse
    deb http://archive.ubuntu.com/ubuntu/ quantal-updates main restricted universe multiverse
    deb http://security.ubuntu.com/ubuntu/ quantal-security main restricted universe multiverse
    deb http://archive.canonical.com/ubuntu/ quantal partner
    deb http://packages.medibuntu.org/ quantal free non-free

    deb http://archive.getdeb.net/ubuntu quantal-getdeb apps
    deb http://archive.getdeb.net/ubuntu quantal-getdeb games

Jakub (721 rep)

Apr 30, 2013, 05:36 PM • Last activity: Oct 16, 2020, 02:52 PM

3 votes

2 answers

6281 views

Redirect Output of Pocketsphinx_continuous to a file

output speech-recognition

I have an ugly command:

    pocketsphinx_continuous -samprate 48000 -nfft 2048 -hmm /usr/local/share/pocketsphinx/model/en-us/en-us -lm 9745.lm -dict 9745.dic -inmic yes

*Breakdown:* It listens for any noise and when it detects some, it listens to it, and then performs speech recognition on it.

Now, the command output has a bunch of junk in it, and one line that matters. Here is the output of one speech recognition:

    READY....
    Listening...
    INFO: cmn_prior.c(131): cmn_prior_update: from 
    INFO: cmn_prior.c(149): cmn_prior_update: to   
    INFO: ngram_search_fwdtree.c(1553):      814 words recognized (9/fr)
    INFO: ngram_search_fwdtree.c(1555):    60871 senones evaluated (684/fr)
    INFO: ngram_search_fwdtree.c(1559):    37179 channels searched (417/fr), 6846 1st, 21428 last
    INFO: ngram_search_fwdtree.c(1562):     1415 words for which last channels evaluated (15/fr)
    INFO: ngram_search_fwdtree.c(1564):     2626 candidate words for entering last phone (29/fr)
    INFO: ngram_search_fwdtree.c(1567): fwdtree 0.66 CPU 0.742 xRT
    INFO: ngram_search_fwdtree.c(1570): fwdtree 3.36 wall 3.780 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 21 words
    INFO: ngram_search_fwdflat.c(948):      655 words recognized (7/fr)
    INFO: ngram_search_fwdflat.c(950):    40095 senones evaluated (451/fr)
    INFO: ngram_search_fwdflat.c(952):    31447 channels searched (353/fr)
    INFO: ngram_search_fwdflat.c(954):     1794 words searched (20/fr)
    INFO: ngram_search_fwdflat.c(957):     1006 word transitions (11/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.29 CPU 0.326 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.30 wall 0.333 xRT
    INFO: ngram_search.c(1253): lattice start node .0 end node .70
    INFO: ngram_search.c(1279): Eliminated 1 nodes before end node
    INFO: ngram_search.c(1384): Lattice has 127 nodes, 473 links
    INFO: ps_lattice.c(1380): Bestpath score: -2298
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(:70:87) = -132973
    INFO: ps_lattice.c(1441): Joint P(O,S) = -156371 P(S|O) = -23398
    INFO: ngram_search.c(875): bestpath 0.01 CPU 0.011 xRT
    INFO: ngram_search.c(878): bestpath 0.00 wall 0.005 xRT
    HELLO

That HELLO is the only thing that matters, and I want that to be output into a file somehow.

I have already tried adding `>foo.txt` to the end of the command, which works, except that it writes everything but HELLO to the file, and HELLO never even makes it to the command line.

I've also tried adding `&> foo.txt`, `2> foo.txt`, and `>> foo.txt`; each sends the output where it says, but every time it also causes READY...., Listening..., and HELLO to disappear.

How can I direct HELLO to a file, in any way? I don't care if other stuff comes with it; I can cut the other stuff out.
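
Once the recognized line does reach a pipe or file, cutting the other stuff out is straightforward. Below is a minimal sketch of such a filter, based only on the output format shown above: it drops the `INFO:`, `READY`, and `Listening` lines and keeps the rest. The names `extract_hypotheses` and `append_hypotheses` are hypothetical. If the result only goes missing when stdout is not a terminal, output buffering inside the recognizer is one plausible cause, but that is an assumption this sketch does not address.

```python
# Line prefixes treated as noise, taken from the sample output above
NOISE_PREFIXES = ("INFO:", "READY", "Listening")


def extract_hypotheses(lines):
    """Yield only the lines that look like recognition results."""
    for line in lines:
        text = line.strip()
        if text and not text.startswith(NOISE_PREFIXES):
            yield text


def append_hypotheses(lines, path):
    """Append each recognition result to path, flushing after every line
    so the file is current even while the recognizer keeps running."""
    with open(path, "a", encoding="utf-8") as out:
        for text in extract_hypotheses(lines):
            out.write(text + "\n")
            out.flush()
```

Fed from a pipe, `append_hypotheses(sys.stdin, "foo.txt")` (after `import sys`) would write one recognized phrase per line.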
                                

Patrick Cook (251 rep)

Jan 3, 2016, 12:27 AM • Last activity: Jul 4, 2020, 06:36 PM

1 votes

0 answers

1194 views

Error while trying to run a python program

linux python pulseaudio alsa speech-recognition

An error occurs when trying to run a Python program with speech recognition and PyAudio:

    ALSA lib pcm_dsnoop.c:641:(snd_pcm_dsnoop_open) unable to open slave
    ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
    ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
    ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
    ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
    ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
    ALSA lib pcm_a52.c:823:(_snd_pcm_a52_open) a52 is only for playback
    ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
    ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
    ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
                                

abduzaabi (33 rep)

Apr 17, 2020, 05:28 PM

0 votes

1 answers

243 views

Kali - problem with kaldi/egs/voxforge/s5 run.sh

linux kali-linux speech-recognition

I try to launch `run.sh`, but it fails. Does anyone know how to fix it? Text from the terminal after launching `run.sh`:

There was an error running the SLURM sbatch command.
The command was:
'/usr/bin/sbatch -o exp/make_mfcc/train/q/make_mfcc_train.log --export=none,PATH=/home/kvcper/kaldi/egs/voxforge/s5/../../../src/bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/chainbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/featbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/fgmmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/fstbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/gmmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/ivectorbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/kwsbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/latbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/lmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/nnet2bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/nnet3bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/nnetbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/online2bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/onlinebin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/rnnlmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/sgmm2bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/sgmmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/tfrnnlmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/cudadecoderbin:/home/kvcper/kaldi/egs/voxforge/s5/utils/:/home/kvcper/kaldi/egs/voxforge/s5/../../../tools/openfst/bin:/home/kvcper/kaldi/egs/voxforge/s5:/home/kvcper/kaldi/tools/python:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/chainbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/featbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/fgmmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/fstbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/gmmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/ivectorbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/kwsbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/latbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/lmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/nnet2bin:/hom
e/kvcper/kaldi/egs/voxforge/s5/../../../src/nnet3bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/nnetbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/online2bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/onlinebin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/rnnlmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/sgmm2bin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/sgmmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/tfrnnlmbin:/home/kvcper/kaldi/egs/voxforge/s5/../../../src/cudadecoderbin:/home/kvcper/kaldi/egs/voxforge/s5/utils/:/home/kvcper/kaldi/egs/voxforge/s5/../../../tools/openfst/bin:/home/kvcper/kaldi/egs/voxforge/s5:/home/kvcper/kaldi/tools/python:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/kvcper/kaldi/tools/irstlm/bin:/home/kvcper/kaldi/tools/srilm/bin:/home/kvcper/kaldi/tools/srilm/bin/i686-m64:/home/kvcper/kaldi/tools/sequitur-g2p/bin:/home/kvcper/kaldi/tools/irstlm/bin:/home/kvcper/kaldi/tools/srilm/bin:/home/kvcper/kaldi/tools/srilm/bin/i686-m64:/home/kvcper/kaldi/tools/sequitur-g2p/bin --array='1:2' /home/kvcper/kaldi/egs/voxforge/s5/exp/make_mfcc/train/q/make_mfcc_train.sh 2>&1'
and the output was:
'sbatch: error: s_p_parse_file: unable to status file /etc/slurm-llnl/slurm.conf: No such file or directory, retrying in 1sec up to 60sec
 sbatch: fatal: Unable to process configuration file
'

Full transcript at https://pastebin.com/J1TD9WNQ

Kacper Biały (1 rep)

May 3, 2019, 04:12 PM • Last activity: May 4, 2019, 08:53 PM

3 votes

1 answers

553 views

Logging in using voice commands with GDM

Is there a program capable of doing such a thing?

Something that would wait for me to either supply a username and a password, or select my username and give a voice command that does the same thing.

Mahmoud Hossam (485 rep)

Apr 1, 2011, 11:11 AM • Last activity: Jan 23, 2018, 07:09 PM

3 votes

3 answers

6777 views

Running a C++ compiled program in the background and sending input whenever needed

background-process c++ speech-recognition

I have a compiled program written in C++ for a UNIX environment which has this kind of structure:

    int main() {
        ...
        LoadEngine();
        ...
        while (1) {
            std::cin >> buffer;
            ...
            ExecuteFunction(buffer);
        }
    }

Loading the engine takes quite a while, so I'm trying to start the program in the background first, then send the input whenever I need to.

Appending the usual ampersand to the command does not seem to make the program run in the background. Instead, it halts at _std::cin_ until input arrives from the console, and stops once that input has been accepted.

How do I execute the program so that it runs continuously in the background, receives input, and executes the function whenever needed?

**EDIT**: The final product is a small device (RaspberryPi) which recognizes speech, and does something based on the words recognized. The program that I have is the part where the device does something based on the word input, and the word input corresponds to the variable _buffer_ from the code snippet above.

So the _std::cin_ part is a placeholder line that I'm using to test that my part of the code starts up as a background process (to load the engine) and does whatever it is designed to do.

**EDIT 2**: To clarify what I’m trying to achieve, the program takes input from a speech recognizer and does things with it (e.g. synthesizes speech from the input, sends signals to LEDs, etc.). The text input can be taken directly from the console (which my code currently does), or by some other method that I’m not familiar with. The only requirements are that the input _must be_ text and that it is sent from another program that recognizes speech (which is handled by another developer). So the exact method isn’t specified. All I have to worry about is my side of the program, which executes a function from a text input (i.e. _buffer_ from the code snippet).

So the general structure would look something like this:

    int main() {
        LoadEngine();
        while (1) {
            buffer = ReceiveInput();
            ExecuteFunction(buffer);
        }
    }

Here, _ReceiveInput()_ is currently implemented as _std::cin_. It can really be any method, as long as the engine is loaded once at the beginning and the program can run _ExecuteFunction_ on each input until the device is turned off.
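
To make the intended arrangement concrete, here is a minimal sketch of the sending side, written in Python rather than C++. It assumes the recognizer is a separate process that starts the compiled program once and keeps a pipe to its standard input open, so the loop above simply waits at _std::cin_ until a word arrives. The helper names `start_worker` and `send_word` and the `./engine` path are hypothetical, not part of the original program.

```python
import subprocess


def start_worker(cmd, **popen_kwargs):
    """Start the long-running program once, with a pipe attached to its stdin.

    `cmd` is an argument list such as ["./engine"] (hypothetical path).
    Line buffering (bufsize=1) is used so each word is delivered promptly.
    """
    return subprocess.Popen(
        cmd, stdin=subprocess.PIPE, text=True, bufsize=1, **popen_kwargs
    )


def send_word(worker, word):
    """Write one word plus a newline to the worker's stdin and flush it."""
    worker.stdin.write(word + "\n")
    worker.stdin.flush()


def stop_worker(worker, timeout=10):
    """Close stdin so the worker sees end-of-file, then wait for it to exit."""
    worker.stdin.close()
    return worker.wait(timeout=timeout)
```

The sender would call `start_worker(["./engine"])` once at boot and `send_word(worker, "lights")` each time a word is recognized. Because the pipe stays open, the engine is loaded only once; the worker reaches end-of-file only when `stop_worker` closes the pipe.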

                                

stock username (33 rep)

Aug 27, 2015, 03:27 AM • Last activity: Jan 23, 2018, 07:05 PM

1 votes

0 answers

259 views

Native speech recognition software for Linux

software-rec speech-recognition

I am looking for user-friendly speech-to-text software that runs offline on Linux.

Simon needs a dictionary and training, but I need full dictation software. Speechpad needs internet access, and Julius and Sphinx are engines, not end-user applications. I am not sure there is a solution.

Look, I am able to install software over an SSH tunnel, but it is a pain in the neck to hold that connection open forever. I am not allowed to give this machine internet access, and I would rather not. The tunnel is only for installation; after that, the software should run natively and offline.

For now it is Ubuntu 16.04.1. Is there any solution?

Tiago Pimenta (646 rep)

Dec 20, 2016, 11:56 AM

1 votes

0 answers

66 views

Use whole dictionary file with Julius or return null

speech-recognition

I have successfully set up [Julius](http://julius.osdn.jp/en_index.php) to work with my own grammar and .voca files.

The problem I have is that it always returns a suggested response, even though the phrase uttered may not sound like anything in the .voca file.

I would like to find a solution that either returns null if no match is found, uses the whole dictionary file to pull words from, or both. Can anybody help me with this or at least point me in the right direction?

Wildcard27 (165 rep)

May 15, 2016, 10:35 AM

2 votes

0 answers

559 views

arecord until sound level drops low enough

audio python alsa speech-recognition

I am trying to implement constant voice recognition on my Pi. At the moment I have two threads running. One records continuously (with `arecord` in a bash script) for X seconds, saves the audio to a WAV file, and then restarts. Each time the WAV file is written, the other thread performs recognition on it.

This is working pretty well. However, if the user's sentence happens to get cut off and continues in the next recording loop, the sentence ends up fragmented between two recognition results.

My question is:
Is there a way to ensure arecord will record until the sound level drops below a certain threshold, so that the entire sentence will be captured on the recording and then once the user has stopped speaking for a few seconds, the recording will stop?

(I am using Python for all this btw)

Also, if there is a better way of solving this problem, I'm open to suggestions. I'm relatively new to the Pi, so I'm not very well versed in all the wonderful things it can achieve.
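
The behaviour asked about can be sketched in Python as follows. This is only an illustration under stated assumptions, not a feature of `arecord` itself: it assumes `arecord` is run with `-t raw -f S16_LE -r 16000 -c 1` so that it streams raw 16-bit mono samples to standard output, and that the script measures the level of each chunk and stops after a run of quiet chunks. The function names, the threshold of 500, and the chunk sizes are hypothetical values that would need tuning for a real microphone.

```python
import struct
import subprocess


def rms(chunk: bytes) -> float:
    """Return the root-mean-square level of signed 16-bit little-endian PCM."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, chunk[: count * 2])
    return (sum(s * s for s in samples) / count) ** 0.5


def read_until_silence(stream, chunk_bytes=3200, threshold=500.0, quiet_chunks=20):
    """Read PCM from `stream` until speech is followed by sustained quiet.

    Quiet chunks before any speech are kept but not counted. Once a chunk
    reaches `threshold`, reading stops after `quiet_chunks` consecutive
    chunks below it (3200 bytes is 0.1 s at 16 kHz mono, so 20 chunks is 2 s).
    """
    captured = bytearray()
    heard_speech = False
    quiet = 0
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:          # end of stream
            break
        captured += chunk
        if rms(chunk) >= threshold:
            heard_speech = True
            quiet = 0
        elif heard_speech:
            quiet += 1
            if quiet >= quiet_chunks:
                break
    return bytes(captured)


def capture_utterance():
    """Hypothetical wrapper: stream raw audio from arecord and cut at silence."""
    cmd = ["arecord", "-q", "-t", "raw", "-f", "S16_LE", "-r", "16000", "-c", "1"]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        return read_until_silence(proc.stdout)
    finally:
        proc.terminate()
        proc.wait()
```

The bytes returned by `capture_utterance()` are headerless PCM, so they would need a WAV header (for example via the standard `wave` module) before being passed to a recognizer that expects WAV files.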

Aphire (131 rep)

Jan 21, 2015, 04:04 PM • Last activity: Jan 21, 2015, 10:51 PM

2 votes

1 answers

516 views

Can pocketsphinx_continuous read from stdin?

stdin speech-recognition

There is one parameter, -adcdev ("Name of audio device to use for input."), but the documentation doesn't say whether this can be stdin. Can pocketsphinx_continuous read from stdin?
                                

Yimin Rong (953 rep)

Nov 6, 2014, 09:02 PM • Last activity: Nov 10, 2014, 04:33 PM

6 votes

1 answers

6016 views

Convert audio to text

audio software-rec speech-recognition

I have heard that speech recognition systems exist, and it seems I need one of those. Basically, I have an audio file with speech (only one person is speaking most of the time), and I want to get a transcript of the speech.

Is something like that possible?

Rogach (6533 rep)

Sep 2, 2012, 05:15 AM • Last activity: Jul 20, 2014, 02:45 AM

Showing page 1 of 20 total questions