I tried to use DeepSpeech to transcribe last Friday's stream for editing purposes. It came out reading as though it were from a Markov bot.

"it is again stir so one of their god lichonin ... and had you know six in the morning the spies they came to me that quickly but he realized it was it was this and the box tom that is a piece of part ... god blue desperate busy sea libraries the niobrara with lucid programming hospitally not cluttered the bissextile"

I thought to see the fairies in the field, but I saw only the evil elephants with their black backs.

@publius I'm glad I've reminded you of that odd passage.

I won't pretend I didn't have to look it up. Something to add to my reading list for this fall.


I actually use it in my current intro for "Hear Now the Words!"

It's very convenient as a reading exercise because it is completely meaningless but very erudite-sounding.

@publius I'm glad you mentioned your show -- I haven't been following AnonRadio as much as I should, so I'd lost track of what's being broadcast these days.

I'll have to tune in this week :]

@jakob I was surprised to find that DeepSpeech doesn't seem to be using a good language model, as it frequently produces obscure words and even weird combinations of letters. In the end, I found vosk from alphacep. It's quite accurate even when used by a non-native speaker :ablobowo:

@wzhd This time it wrote me a song:

"how are your
oh wow
our our own
how long
our share

her mom
oh oh"

I think this is my fault, though. I should read the documentation.

@jakob I find the example a good place to start: sample rate and format are handled, so I don't need to worry about converting codecs or getting the timing wrong. I think the first time I talked into the microphone, words were recognized one by one with little latency. The partial result also sometimes changes to make it more likely to be an English sentence.
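
For reference, here is a rough sketch of the kind of loop such an example is built around, reading from a WAV file instead of a microphone. It assumes the vosk Python bindings (`Model`, `KaldiRecognizer`, and the recognizer's `AcceptWaveform`, `PartialResult`, `Result` and `FinalResult` methods); the helper names, the chunk size and the file and model paths are made up for illustration, not taken from the actual example.

```python
import json
import wave


def wav_sample_rate(wav_path):
    """Return the sample rate of a WAV file, so the recognizer can be built to match."""
    with wave.open(wav_path, "rb") as wf:
        return wf.getframerate()


def make_recognizer(model_dir, sample_rate):
    """Build a vosk recognizer. model_dir is a hypothetical path to an unpacked model."""
    # Imported lazily so the streaming loop below can be tried without vosk installed.
    from vosk import Model, KaldiRecognizer
    return KaldiRecognizer(Model(model_dir), sample_rate)


def stream_transcribe(recognizer, wav_path, chunk_frames=4000):
    """Feed a WAV file to the recognizer chunk by chunk.

    Yields ("partial", text) while an utterance is still in progress and
    ("final", text) whenever the recognizer decides an utterance has ended.
    """
    with wave.open(wav_path, "rb") as wf:
        # Assumption: input is mono 16-bit PCM; fail loudly rather than feed it garbage.
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a mono 16-bit PCM WAV file")
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            if recognizer.AcceptWaveform(data):
                # End of an utterance: the text is settled.
                yield "final", json.loads(recognizer.Result()).get("text", "")
            else:
                # Still mid-utterance: this guess may be revised by later audio.
                yield "partial", json.loads(recognizer.PartialResult()).get("partial", "")
    # Flush whatever is still buffered after the last chunk.
    yield "final", json.loads(recognizer.FinalResult()).get("text", "")
```

Usage would look something like `rec = make_recognizer("model", wav_sample_rate("stream.wav"))` followed by `for kind, text in stream_transcribe(rec, "stream.wav"): print(kind, text)`. The "partial" lines are where you would see earlier guesses being rewritten as more audio arrives.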

