A growing number of gesture interactions are being built into consumer products. Tablet PCs now recognize a number of hand movements, new multi-touch phones and media players are on the market, and thousands of households now strap Wii remotes to their wrists every evening. (If you happen to be in Vegas, you can also stop by the Rio iBar owned by Harrah’s and order custom drinks or play virtual bowling on the Surface table.)
Gesture interactions are not new: visual designers have used gestures and styluses to tweak graphics files on Wacom tablets since the 1990s. However, the use of gestures in mainstream applications is novel for many consumers.
Over the past couple of years, we have studied users working with different gesture-based interfaces, and we'd like to share some of the issues that they have experienced:
- There is no consistent "visual language" to help users realize that a product will recognize their gestures. Users are often not aware which gestures a product will recognize until they experiment or see a demonstration of those specific gestures. For example, it is not always clear how to select or "pick up" an object using gestures. And owning a multi-touch phone that recognizes gestures does not necessarily make it easier for that user to play a game that uses gestures (or vice versa). The visual cues and application behaviors that help a user realize that a gesture will be recognized have not yet been standardized, although this is likely to happen over the next few years. Meanwhile, the gap between 2-D gestures (where the user’s hand remains in contact with the screen) and 3-D gestures (where the user can take advantage of all three dimensions) remains large. Although this lack of predictability can be fun in a gaming environment, it is neither fun nor effective when the user must figure out how to use a gesture interface for work or study.
- Naturalized human gestures often undershoot or overshoot "gesture interactions." Our fingers, hands, and arms are accustomed to making certain types of gestures in 3-D space. Through motor memory, we are also accustomed to holding and manipulating objects that have a certain amount of weight or resistance. This holds for both fine motor and gross motor movements. Users who try to manipulate a screen control through 2-D or 3-D gestures in the absence of resistance often act on the natural tendency to shorten their movements. When they find that their gestures fall short, they may then exaggerate their gestures and overshoot.
- Gestures can take more work. And we're not just talking about folks who work up a sweat from Rock Band or Wii Bowling. Gestures make sense on small-screen devices such as phones or media players because they maximize the ways that users can interact with the screen without adding buttons or burying functionality in menus. But for productivity applications (such as writing, entering data, editing video, or analyzing data), gestures are not necessarily easier than entering keystrokes or making menu selections. Dragging an object across the screen with your fingertips also requires the use of your forearm muscles and can take more time than using a mouse or trackpad, because the on-screen distance a gesture must cover is usually larger than the tracking area of a pointing device. However, the value of using gestures to trigger a series of pre-defined actions remains very high, especially when the gesture has a natural association with those actions.
- Commercialized gesture recognition does not yet tap the communication cues inherent in body language. All of the gestures we have seen associated with multi-touch or gesture interfaces focus on the interaction between the user’s hand and the screen. This is a natural extension of drag-and-drop selection of objects and seems to work well for "manipulations" where the user needs to adjust or modify the visual representation of data (such as a photo, video, or data file). However, it does not use the full range of expression that humans have attached to hand gestures. For example, when we are agitated or frustrated, we often tense the muscles in our hands or clench our hands altogether. And when we speak, we often use our hands, arm position, and body posture to communicate broader attitudes and stances. Think of all the data that user experience researchers could gather about attitude and performance if we could automatically capture information on user frustration this way!
One more thing... For our summer project, User Centric conducted a preliminary (and internally funded) study on the usability of the Wii Fit. (We purchased a Wii Fit in the interest of user research, of course!) We recruited a mixture of personal trainers and "everyday consumers" and examined task performance and participant comments. As it turns out, gesture/interface feedback was one of the major issues that both the personal trainers and the "everyday consumers" reported.