I Used Dictation and Voice Control for a Week. Here’s What Happened
It seems like everywhere you look, voice control and dictation are being added to every app, operating system, and game console. We love to joke about how badly they work, but I decided to dive in headfirst to see what it would be like to actually use them for everything. Here’s what happened.
Why bother with dictation at all
Like any real nerd, I have always been intrigued by voice control and dictation. It looks so cool in the movies, and while we certainly aren’t there yet, we get closer every day. Whether we like it or not, in the coming years we are likely to control our computers and phones with our voices much more than we do now.
Likewise, dictation has a certain romantic flavor. It is the modern equivalent of mumbling your thoughts into a tape recorder, with the added benefit of transcribing what you say as you say it. For those of us who type all day, this sounds awesome. Maybe I could write while walking. Or, to be honest, I could write without even getting out of bed or sitting up. (That’s the dream, isn’t it?)
The point is, even though I had low expectations of how things would go, the potential whimsy of talking to my electronic devices won out. Would I look and sound stupid? Yes. Would I annoy my friends by replying to text messages in public by talking to my phone? Yes. But it all seemed worth trying on the chance that I might fall in love with it.
Day one: learning the ropes
You’d think from countless sci-fi movies that voice control is pretty intuitive. As I quickly realized on my first day, it is not. I started out by trying to dictate a few blog posts. Here’s an excerpt from my first attempt, just to give you an idea of how poorly I understood how to use it:
DeleteBackspaceTalkHaveJessica the way you talk to my phone sounds good
As should be pretty clear, the first thing that came out was the wrong word, which I then tried to delete. After that, I just rambled at the computer for a while. Okay, so it was going to take a little more research on my part to figure this out.
Luckily, we have a guide for that. So I pulled out the mic, learned the basics of formatting (using commas, periods, etc.), and tried again. I started with a pretty simple little post. Here’s what came out initially:
This quote came from our recent Freakonomics interview with Wired co-founder Kevin Kelly and is a good reminder that it doesn’t need a specific definition.
Okay, that’s a hell of a lot better than my first try, and it even got Freakonomics right. I still needed to go back and capitalize Wired, replace “will” with “tool,” and add a hyphen to co-founder, but it was definitely better than before. The microphone, combined with a basic understanding of punctuation commands, clearly helps make the text more readable.
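To make the idea of punctuation commands concrete, here is a minimal sketch of how spoken words like “comma” and “period” could be turned into symbols in a transcript. This is not how Apple’s dictation actually works internally; the `SPOKEN_PUNCTUATION` table and the `apply_punctuation` function are hypothetical names made up for illustration, and the list of commands is only a small assumed sample.

```python
# Hypothetical table of spoken punctuation commands (an assumed sample,
# not the full list any real dictation system supports).
SPOKEN_PUNCTUATION = {
    "comma": ",",
    "period": ".",
    "question mark": "?",
    "exclamation point": "!",
}


def apply_punctuation(transcript: str) -> str:
    """Replace spoken punctuation commands with their symbols."""
    words = transcript.split()
    out = []
    i = 0
    while i < len(words):
        one = words[i].lower()
        two = " ".join(words[i:i + 2]).lower() if i + 1 < len(words) else None
        if two in SPOKEN_PUNCTUATION:
            # Two-word command such as "question mark".
            symbol = SPOKEN_PUNCTUATION[two]
            i += 2
        elif one in SPOKEN_PUNCTUATION:
            # One-word command such as "comma".
            symbol = SPOKEN_PUNCTUATION[one]
            i += 1
        else:
            # Ordinary word: keep it as spoken.
            out.append(words[i])
            i += 1
            continue
        # Attach the symbol to the previous word, if there is one.
        if out:
            out[-1] += symbol
        else:
            out.append(symbol)
    return " ".join(out)


print(apply_punctuation("hello comma how are you question mark"))
```

A sketch like this also shows why saying “comma” out loud feels odd at first: the command word never appears in the text, only its effect does.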
What tripped me up here, however, was not speech-to-text errors but the act of saying out loud what I wanted to write. It’s not as intuitive as I thought, and it turns out that I take a lot of long pauses while I think about what I want to write next. Typing on a keyboard gives you time to sit and think about your next sentence, while dictating pushes you along much faster. It took a while for my brain to adjust to this.
It’s worth noting that using dictation on my iPhone for short replies to text messages and emails was much smoother. Text conversations are concise, which made dictation on my phone much easier, and I enjoyed it despite the awkwardness of doing it in public.
Day two: setting up and learning to control the computer
As my second day of dictation went on, I realized I needed to dig a little deeper if I was going to make it useful. That meant learning real voice commands, not just dictating.
Voice commands are about more than dictating what you want to say; they also let you do a little editing on the fly. On a Mac, it turns out that you need to enable dictation commands if you want full control:
- Open System Preferences.
- Go to Accessibility.
- Choose Dictation.
- Click the Dictation Commands button.
- Check the box next to Enable Advanced Commands.
With the box for advanced commands checked, I could control my computer, open applications, and most importantly, edit text. It was here that I learned what I had done wrong on day one. To remove an erroneous word, the command is not just “delete” but “delete this.” I could now edit with commands such as “cut this,” “copy,” “undo,” and “capitalize.” If you ever don’t know what to say to trigger an action, you can always say “show commands” to get a popup with the available commands.
Windows users have their own set of commands, but in general they are very similar, although you can simply say “delete” instead of “delete this.” I have no experience with Windows voice control, but enabling commands looks really simple:
- Open Speech Recognition by clicking Start > All Programs > Accessories > Ease of Access.
- Click Windows Speech Recognition.
- Say “start listening” or press the microphone button.
Advanced dictation commands on the Mac also let you manage apps. You can use commands such as “switch to [application name],” “open document,” and “click [item name]” to control what you want. Can’t find the action you want in the list of commands? In the same System Preferences pane, click the “+” button to add your own. Just enter the phrase you want to use as the trigger, select the application you want to control, and then select the action to perform. Personally, I stuck with keyboard shortcuts as the action.
For example, being able to say “switch tabs” in Chrome, mapped to the Command + Option + Arrow keyboard shortcut, completely changed the way I used voice commands. If you really want to take this further, you can also trigger Automator actions, although my basic usage didn’t really require me to go that far.
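A custom command boils down to three pieces: the phrase you say, the application it applies to, and the action it triggers. The sketch below models that idea in Python purely for illustration. It is not Apple’s configuration format, and it does not send real keystrokes. The `VoiceCommand` class, the `lookup` function, the “previous tab” entry, and the choice of the Right and Left arrow keys are all assumptions added for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VoiceCommand:
    """One custom command: spoken phrase, target app, and action (hypothetical model)."""
    phrase: str
    application: str
    shortcut: str


# Illustrative entries. "switch tabs" comes from the article; the specific
# arrow keys and the "previous tab" phrase are assumptions for this sketch.
COMMANDS: List[VoiceCommand] = [
    VoiceCommand("switch tabs", "Chrome", "Command+Option+Right"),
    VoiceCommand("previous tab", "Chrome", "Command+Option+Left"),
]


def lookup(spoken: str, active_app: str,
           commands: List[VoiceCommand] = COMMANDS) -> Optional[VoiceCommand]:
    """Return the command matching the spoken phrase in the active app, if any."""
    # Normalize case and whitespace so "Switch  Tabs" still matches.
    normalized = " ".join(spoken.lower().split())
    for command in commands:
        if (command.phrase == normalized
                and command.application.lower() == active_app.lower()):
            return command
    return None


match = lookup("Switch  Tabs", "chrome")
print(match.shortcut if match else "no matching command")
```

Scoping each phrase to an application is what lets a short word like “next” mean one thing in a feed reader and nothing at all elsewhere.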
It’s the same with Siri: you need to know what language you can and can’t use. To their credit, Siri and Google Now are much more convenient than voice commands on a desktop computer. Controlling everything on your phone with your voice is almost effortless, and once you get used to looking like an oddball, it’s a pleasure to use. It’s only day two, and I’m already a slacker, too attached to the comfort to bother using both hands to text someone. It’s a really sad sight, but I honestly don’t care.
Day five: finally getting used to it
The fourth day goes by without much new insight, but by the fifth day I finally start to find my rhythm. I can not only do my job but do it somewhat efficiently.
At this point, I have configured voice commands for almost everything I need. I can switch tabs, switch windows, launch applications, control certain functions in applications (for example, saying “next” to move between RSS stories in Reeder), and get through most of the day without ever touching my keyboard or mouse. It’s kind of cool, although I can feel my voice getting a little hoarse.
The dictation itself has also started to click. It requires rewiring your brain: you used to communicate by typing and now you do it by speaking, so you need time to get comfortable. Where the first couple of days produced very bland, simple sentences, I am getting better at working my “voice” into what I say. You would think it would be the other way around, that speaking would bring out your personality, but for me it took practice. I just don’t speak the way I type. I also walk around a lot while dictating, which works surprisingly well.
As a side note, I should also point out that dictation commands have seeped into my real life. In at least one conversation with a real person, I said “comma” out loud. I’m sure this was mainly because I had completely immersed myself in researching this post, but it is still worth mentioning. Fortunately, my slip was greeted with healthy laughter by everyone who heard it.
Day seven: acceptance and returning to the keyboard
By the end of the week, I had gotten the hang of both voice dictation and voice control. Both have their uses, but I’ve since gone back to the keyboard and mouse.
In almost every article that mentions dictation, people like to point out that the article was written entirely by dictation. There are often funny little mistakes, missing punctuation, or a few odd phrasings scattered throughout. I wrote this post entirely by dictation. But I also edited it by dictation. When I was done with that, I edited it again using my keyboard and mouse. Then I sent it to other people to edit it too. Unless you physically can’t type, dictation is just a tool. It is not the be-all and end-all of writing. You still have to edit after you speak (that is part of the beauty of writing, after all).
As for voice control, it’s fun for a couple of days, and then it gets a little tedious. For me, keyboard shortcuts are faster. Navigating with my voice turned out to be more frustrating than helpful, although at least you can eat Cheetos without messing up your mouse. Typing is also easier because my brain is completely tuned to it. Unfortunately, I type much better than I speak, and even after a week of dictation, the improvement was minimal. It was a fun experiment, but I feel like the effort it takes to retrain my mind for dictation rather than typing isn’t worth it in the long run. Sure, I could lazily write articles lying on the floor (or, conversely, standing up without a standing desk), but I should probably just sit up straight in my chair and type.
Still, I can see it being helpful in plenty of ways. Dictation and voice control are handy if you have a computer set up as a media center. They’re also useful if you’re the type of person who likes to walk around to get ideas. Just don’t expect them to do everything. I enjoyed using dictation, and I believe it will be useful when I’m brainstorming or just need to get things out of my brain and onto paper without worrying too much about editing.
I am, however, now much better at using Siri, and I use it all the time on my iPhone, albeit not in social or public settings. It’s very useful anytime I can’t look at my phone, like when I’m walking, running, cycling, or, you know, too lazy to reach over and take it off the dock. Voice control makes more sense on mobile devices anyway, because you’re more likely to be unable to reach your phone than your computer. Driving is the most obvious case, but cooking, eating, or doing anything else that keeps your hands full all make it useful. It’s worth taking the time to find out what you can do with Siri and Google Now, because if you don’t, you’ll never know when and how you could be using them.