I'm trying to cook up a script that uses OpenAI's Whisper for speech-to-text. The basic idea is I press some key combination, it starts recording to an audio file somewhere, I stop it, it analyzes the audio and gets the text from it, and then it pastes all this text directly at my cursor in whatever program I was using. It's this last part I don't know how to do. I can output the text to the clipboard, but that's still an extra couple of keypresses to paste, and I want it to be as smooth as possible. Is there a way to do what I want?
EDIT: here is the complete script, which you can modify as necessary.
#!/bin/bash
gunchislas() (
    echo $BASHPID > [scripts path]/pid.txt
    arecord -f cd -D "default" --file-type raw | lame -r - "[scripts path]/.stt.mp3"
)
waiting() {
    zenity --info --text="Taking speech to text input now."
}
gunchislas & waiting & wait -n
pkill -P $(cat [scripts path]/pid.txt) --signal 9
aplay "[scripts path]/clock.wav"
whisper "[scripts path]/.stt.mp3" --language en --model tiny | cut -c 28- | awk NF=NF RS= OFS=" " | xclip
rm "[scripts path]/.stt.mp3"
rm "[scripts path]/pid.txt"
aplay "[scripts path]/recorded.wav" & xdotool type "$(xclip -o)"
clock.wav and recorded.wav are just little notification sounds I made so you can tell what it's doing.
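In case it helps anyone adapting this, here is a rough sketch of the text-cleanup and typing steps pulled out into functions. The names clean_transcript and type_at_cursor are made up for illustration and are not in the script above. It assumes Whisper prints each segment as a timestamp prefix like "[00:00.000 --> 00:02.000]" followed by two spaces, which is what the cut -c 28- relies on; if your Whisper version formats its output differently, the column number will need adjusting. It also skips the clipboard round trip and passes the text straight to xdotool, which may or may not suit your setup.

```bash
#!/bin/bash

# Hypothetical helper: strip the assumed 27-character timestamp prefix
# from each line, then join all lines into one space-separated line.
# RS= puts awk in paragraph mode, so newlines act as field separators;
# assigning $1 to itself rebuilds the record using OFS.
clean_transcript() {
    cut -c 28- | awk -v RS= -v OFS=' ' '{ $1 = $1; print }'
}

# Hypothetical helper: type the given text at the current cursor position.
# --clearmodifiers asks xdotool to release held modifier keys first, which
# can matter when the script is launched from a key combination.
type_at_cursor() {
    xdotool type --clearmodifiers "$1"
}

# Example wiring (commented out so the file can be sourced safely):
# text="$(whisper "[scripts path]/.stt.mp3" --language en --model tiny | clean_transcript)"
# type_at_cursor "$text"
```

Splitting it up this way lets you test the cleanup on a saved transcript without recording anything, and it leaves whatever was already on your clipboard untouched.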