Two-wheeler Kate: in praise of speech recognition software

For many years I was a slow and inaccurate typist. Now, at relatively good speed, I am going to compose a piece to recommend that – if you don’t already do so – you try using speech recognition software for your own writing.

Initially my failure as a typist was hidden by the capabilities and commitment of two colleagues to whom I would regularly dictate. Leanne and Lexia served me and our shared endeavours with extraordinary patience, speed and confidence. And our work together also served to build lasting friendships between us.

But one day, perhaps 12 or 14 years ago, our colleague Michael suggested that I try Dragon NaturallySpeaking, a new-fangled computer system to convert spoken words into written text. As I recall it, the cost to the NRHA was something in the order of $200 for the software and the headset. The rest, as they say, is history – including the rest it meant for Leanne and Lexia.

As you probably know, the system ‘learns’ to ‘recognise’ a particular voice and a particular vocabulary. What this means is that one should not allow other people to dictate to the program, for fear of confusing the poor beast.

Those last words attest to the fact that, very soon, there was a significant amount of anthropomorphism attached to the Dragon and its use. For some reason we adopted a Tiger rather than a Dragon. Perhaps there’s something more friendly about the former, with less breathing of fire and brimstone?

Anyway, ‘Tiger’ soon became a critical member of the NRHA staff, one who could be blamed for written errors, while the credit for any good pieces written could remain with me.

I have become heavily reliant on my Tiger, particularly after the onset of Parkinsonian tremor. However, and despite very heavy usage, I am sure that I am not what used to be called ‘a power user’. What I mean by this is that, like the motor car I drive, the system I control has functionality that I use sparingly or not at all. For instance:

“The software has three primary areas of functionality: voice recognition in dictation with speech transcribed as written text, recognition of spoken commands, and text-to-speech: speaking text content of a document.” https://en.wikipedia.org/wiki/Dragon_NaturallySpeaking

My usage is restricted almost entirely to the transcription of speech. I use a few spoken commands (for punctuation and new paragraphs) and text-to-speech not at all.

While on that Wikipedia page, let me give credit where it is due:

“Dr James Baker laid out the description of a speech understanding system called DRAGON in 1975. In 1982 he and Dr. Janet M Baker founded Dragon Systems - ”

“DragonDictate - - utilized hidden Markov models, a probabilistic method for temporal pattern recognition. At the time, the hardware was not powerful enough to address the problem of word segmentation and DragonDictate was unable to determine the boundaries of words during continuous speech input. Users were forced to enunciate one word at a time, each clearly separated by a small pause.”

Well, things have improved immeasurably since then! My Tiger certainly has its moods and sometimes it helps if I give it some TLC, perhaps by checking the audio reception. But rarely do I think to improve its service by the other means available, such as adding special words to its learned vocabulary.

Tiger is willing but, despite becoming familiar with my voice, vocabulary and subject matter, still relatively naive. Tiger’s work is undertaken by the recognition of sound, by deducing or estimating what is phonetically apparent. (That is certain to be an entirely unsatisfactory description of the scientific means to which Tiger is slave!) The problem is that English is not wholly a phonetic language, and Tiger’s propensity to occasionally get something amusingly wrong is what provides some of its charm.

No matter how helpful, Tiger does not have the human capacity to select a word according to the context or nuance of the sentence. He’s just a machine. Vive la différence.

So let me leave you here with just a taster of my Tiger’s sense of humour. It’s actually this that I set out to tell you about in this piece; but enough, for now, is enough. There can be more of that later.

Exhibit 1: “R. would be happy to be apart of this great event.” One cannot criticise Tiger for choosing ‘apart’ rather than ‘a part’. But by so doing, the sense conveyed is the opposite of what was intended. It’s a reminder of the need to check what’s been drafted onto the page – a habit which is essential for good clear writing whether with a Tiger or not.

2: “One of the key recommendations endorsed by the dissidents in the 9th National Rural Health Conference related directly to this matter.” What I said was ‘the participants’, not ‘dissidents’. The amusement is enhanced by the fact that it is always hard to get everyone at a large conference to agree to a particular recommendation, so there may well be a number of dissidents.

3: “For those unaware, it was the Toowoomba Hospital Foundation which hospiced the very first National Rural Health Conference in Toowoomba in 1992.” ‘Auspiced’ was intended but ‘hospiced’ is nice in this context.

4: “The NRHA’s but it’s tried cheese and election policies.” This is lovely! The trick is for you to say the words over and over until an alternative truth is heard.

5: “Andrew Waters, Manager of policy and kinetic oceans.” (Communications.)

6: “In many areas, pregnant women and their family have two wheeler Kate for an extended period prior to the birds.”

Ladies bicycle made in Melbourne by Arthur James Sutherland for his wife Marion Sutherland about 1910. National Museum of Australia. Photo: Katie Shanahan.

Postscript: it occurs to me that in this matter I may owe a debt of gratitude to the incomparable Afferbeck Lauder (Alastair Ardoch Morrison, 1911-1998), whose Let Stalk Strine was published in 1965, way before I had any mind to go to Australia. Presumably it’s still available but I cannot answer the question Emma Chisit.