Hands-on with Touchless UX
If you are thinking of replacing physical interfaces with touch free or touchless ones for safety reasons, there are a number of things to consider. In this post, we will cover some of the design considerations for touchless interfaces.
The challenge
In our daily lives there are many interfaces which require touch, from door handles and elevator buttons to things like ATMs. Additionally, in our workplaces there could be a huge variety of machines or systems that require a number of different people to touch screens or physical interfaces like keyboards or buttons. In a world where virus transmission is becoming a primary concern for everyone, it’s important to consider best practices for designing a touch free interface.
Who will use your system?
Before you start building it is important to consider a few factors about your specific problem space. The first thing to consider is who will be using the system you are building. Is this something that will be accessed by thousands or millions of people over time, or something designed for a smaller group of people?
For example, a touchless ATM interface could be expected to be accessed by large numbers of people. While the modern ATM relies on a limited number of physical buttons that perform a number of tasks based on the on-screen dialog, replacing such a system with a touchless one is not trivial. An ATM that used voice commands would not be ideal, since privacy is a concern here. Today, ATMs are legally required to be accessible to people who are visually impaired. They do this by providing a headphone interface so data such as bank balances can be read aloud, and physical keys with braille numbers and symbols. Headphones solve one side of the privacy problem here by restricting the information that comes from the system to the user, but if the keys were to be replaced by voice commands the user could still potentially disclose personal information to anyone standing nearby.
This also raises a second issue – if you use a gesture control interface for a public system, it could become inaccessible to some disabled people. How do you signal to a visually impaired user what areas input should occur within? Are there gestures which are difficult to perform for some users? Does the input area for your system cover where someone in a wheelchair can reach?
The first and most important question when designing a touchless system, therefore, is: who are the users?
A second issue to consider on this point is skin color. Some sensor types rely on a certain amount of light being reflected back as a signal. Depending on its parameters, such a system can fail to work for people with darker skin. For any product that many different people could attempt to use, failing to account for all potential users means the product itself could be considered a failure.
How many things do you want to control?
For something like an elevator, the call button is a single push. This could easily be replaced by a simple sensor that registers when someone waves a hand through the sensor area. Once inside the elevator, however, how do we control which floor it goes to? In this case, perhaps a voice recognition system is the best solution, since the inputs are limited and well defined. In a safety critical environment like an elevator, though, a physical interface as a backup in case of emergency or fire is also worth considering. In safety critical applications, edge cases are as important to design for as the expected usage.
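As a rough illustration of the call-button idea, the sketch below registers a single call when a hand stays inside a sensing zone for a few consecutive frames, and does not fire again until the hand has left. It assumes a sensor that reports one distance reading in metres per frame; the function name and the thresholds (near_m, far_m, min_frames) are hypothetical values chosen for the example, not figures from any particular device.

```python
from typing import Iterable, List


def detect_calls(distances_m: Iterable[float],
                 near_m: float = 0.05,
                 far_m: float = 0.30,
                 min_frames: int = 3) -> List[int]:
    """Return the frame indices at which a call is registered.

    A call fires once a reading has stayed inside [near_m, far_m] for
    min_frames consecutive frames. It cannot fire again until a reading
    falls outside the zone, so a hand held in place counts only once.
    All thresholds are illustrative assumptions.
    """
    calls: List[int] = []
    streak = 0      # consecutive in-zone frames seen so far
    armed = True    # False after firing, until the hand leaves the zone
    for index, distance in enumerate(distances_m):
        if near_m <= distance <= far_m:
            streak += 1
            if armed and streak >= min_frames:
                calls.append(index)
                armed = False
        else:
            streak = 0
            armed = True
    return calls


if __name__ == "__main__":
    # A one-frame flicker is ignored; two deliberate waves are registered.
    readings = [1.0, 0.2, 1.0, 0.2, 0.2, 0.2, 0.2, 1.0, 0.15, 0.15, 0.15]
    print(detect_calls(readings))
```

Requiring several consecutive frames is one simple way to avoid reacting to someone merely walking past, at the cost of a short delay before the call registers.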
If your expected inputs are wide and varied, such as usernames and passwords, a more flexible touch free system should be considered. In an earlier blog post we discussed the difference between gesture tracking and hand tracking systems. An interface which requires a variety of input modalities would likely require a hand tracking system, where the user could touch a floating virtual keyboard with their fingers. Another consideration here is that if you are displaying icons on a screen that you do not expect the user to actually touch, you will need to design the system accordingly. At CES 2015, Intel RealSense Technology displayed a proof of concept demo where a ‘holographic’ floating piano keyboard could be played by a user without touching a screen or physical keys. It used a combination of mirrors and lenses to project a floating keyboard.
What can the user understand?
Touchscreens are ubiquitous today, but when they were initially launched, users required training to understand the basic interactions available to them – from simple touch, to pinch for zoom, and touch and drag for scroll. A good rule of thumb when expecting users to adopt a new method of interaction is to limit it to three basic interactions or fewer. Any more than that, taught simultaneously, is likely to confuse a user. Think about a mouse – while it supports many different interactions, most are performed by moving a pointer on a screen and either left or right clicking. While a scroll wheel might make it easier to scroll, we can still perform that action without one. If we are designing a touch free system for many users, limiting what we require them to learn in order to use the system is extremely important for comfort and adoption.
Digital signage is a great example of a place where gestures should be limited. Ideally, most of what the user wants to do should be accomplished by two or at most three gestures. A wave might work perfectly for scrolling or moving through menus or screens, and a thumbs up or OK gesture for a ‘click’ or confirmation is something that most users will grasp very quickly. There is currently no standard for gesture based interaction in the way that there is for something like a mouse or touchscreen, but leaning on gestures that users are already familiar with, either from social interactions or from other modalities, will enable quick and intuitive adoption.
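One way to keep a signage interface within that limit is to make the gesture vocabulary an explicit, capped table, so that adding a fourth gesture is a deliberate decision rather than an accident. The sketch below assumes some recognizer upstream that emits string labels; the labels ("wave_left", "wave_right", "thumbs_up") and the Action names are hypothetical, chosen only to mirror the wave and thumbs up example above.

```python
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    PREVIOUS = "previous"
    NEXT = "next"
    CONFIRM = "confirm"


MAX_GESTURES = 3  # rule of thumb from the text: three interactions or fewer


def build_gesture_map(mapping: Dict[str, Action]) -> Dict[str, Action]:
    """Return a copy of the mapping, refusing vocabularies that are too large."""
    if len(mapping) > MAX_GESTURES:
        raise ValueError(
            f"{len(mapping)} gestures defined; limit is {MAX_GESTURES}"
        )
    return dict(mapping)


# Hypothetical labels produced by an upstream gesture recognizer.
SIGNAGE_GESTURES = build_gesture_map({
    "wave_left": Action.PREVIOUS,
    "wave_right": Action.NEXT,
    "thumbs_up": Action.CONFIRM,
})


def handle_gesture(label: str,
                   gestures: Dict[str, Action] = SIGNAGE_GESTURES
                   ) -> Optional[Action]:
    """Map a recognized label to an action; unknown gestures are ignored."""
    return gestures.get(label)


if __name__ == "__main__":
    print(handle_gesture("thumbs_up"))   # Action.CONFIRM
    print(handle_gesture("peace_sign"))  # None
```

Ignoring unrecognized gestures, rather than guessing at the nearest match, keeps incidental hand movements from triggering actions the user did not intend.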
Additionally, if you expect other people to be standing near your primary user while they are interacting with the system, how will you differentiate and identify different users? Is it important that you know who is acting? Will the user need to define their local environment and make sure they are the only one within that interaction area? In some cases, such as the elevator button, it doesn’t matter who the interaction comes from. For an ATM, it very clearly matters.
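Where it does matter who is acting, one simple policy is to accept input only from the nearest person inside a defined interaction zone. The sketch below assumes an upstream tracker that reports each detected person's distance from the sensor and lateral offset from its centre line; the Person type, the zone dimensions, and the function name are all hypothetical, and a real deployment would likely need something more robust than "closest wins".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Person:
    track_id: int
    distance_m: float  # distance from the sensor
    offset_m: float    # lateral offset from the sensor's centre line


def primary_user(people: Iterable[Person],
                 max_distance_m: float = 1.0,
                 max_offset_m: float = 0.4) -> Optional[Person]:
    """Return the closest person inside the interaction zone, if any.

    The zone is a simple box in front of the sensor; its dimensions are
    illustrative assumptions.
    """
    in_zone = [
        p for p in people
        if p.distance_m <= max_distance_m and abs(p.offset_m) <= max_offset_m
    ]
    return min(in_zone, key=lambda p: p.distance_m, default=None)


if __name__ == "__main__":
    crowd = [
        Person(track_id=1, distance_m=0.5, offset_m=0.9),  # close, but off to the side
        Person(track_id=2, distance_m=0.7, offset_m=0.1),  # in front of the sensor
        Person(track_id=3, distance_m=1.8, offset_m=0.0),  # queuing behind
    ]
    user = primary_user(crowd)
    print(user.track_id if user else None)
```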
Exceptions to the rule
While limiting gestures on a system that requires quick adoption is incredibly important, it is also worth considering use cases where users can be trained over time. In a working environment where someone can receive training and learn a system over time, the number of gestures and interactions included can be increased – as a user becomes a power user, they are able to handle many different types of interaction. One key point here, though, is that a system based on gestures and touch free interaction may still not be the best solution in this case.
As novice users of any interaction system, we primarily focus on visual cues – we look at something to tell us what to do or to touch, with the exception of voice recognition systems. Think about a keyboard as an example. When we first learn to type, we look at the keys. Over time, as we gain mastery, we are able to type without ever needing to look at the keys, and do so incredibly quickly. Requiring us to look at an interface to use it slows us down once we have attained mastery of a system. If you are building something that either requires complex interaction or will ultimately require mastery, it may be worthwhile to look for alternate solutions that are per-user for safety. For example, is there a scenario in which someone could use their cellphone to interact with a system safely? In the case of industrial machinery, an individual keyboard or control interface that could be connected either wirelessly or directly might make more sense than a gesture control system.
Depth as the sensor
If you are going to use a depth camera as the sensor in a gesture control system, the first thing to consider is the environment you will be using it in. Is it solely an indoor solution? If so, the Intel RealSense LiDAR Camera L515 might be the perfect device for its high resolution and longer range.
For something which might be used in an indoor or outdoor environment, one of the stereo depth cameras we offer, such as the Intel RealSense Depth Camera D455, would be the appropriate solution. For very short range indoor use cases, the Intel RealSense Depth Camera SR305 might be both a cost effective and ideal solution. Which of these devices you choose will largely depend on the field of view and the distance from the camera within which you want the user to be able to interact. The further someone stands from the camera, the wider the area its field of view covers, but that may not be an important consideration if you are expecting the user to stand directly in front of it.
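The relationship between distance and field of view can be estimated with basic trigonometry: for a horizontal field of view angle θ, the width covered at distance d is roughly 2 · d · tan(θ / 2). The sketch below applies that formula in both directions. It is an idealized pinhole-style approximation, and the 90 degree angle in the usage example is a placeholder, not the specification of any camera named above; check the datasheet for the device you choose.

```python
import math


def coverage_width_m(distance_m: float, horizontal_fov_deg: float) -> float:
    """Approximate width of the area covered at a given distance."""
    half_angle = math.radians(horizontal_fov_deg) / 2.0
    return 2.0 * distance_m * math.tan(half_angle)


def distance_for_width_m(width_m: float, horizontal_fov_deg: float) -> float:
    """Approximate distance at which the view spans the requested width."""
    half_angle = math.radians(horizontal_fov_deg) / 2.0
    return width_m / (2.0 * math.tan(half_angle))


if __name__ == "__main__":
    fov_deg = 90.0  # placeholder angle, not a real device specification
    for distance in (0.5, 1.0, 2.0):
        width = coverage_width_m(distance, fov_deg)
        print(f"{distance:.1f} m away -> about {width:.2f} m wide")
```

Doubling the distance doubles the covered width, which is why a user standing directly in front of the sensor at close range needs far less field of view than an interaction area meant to be used from across a room.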