Alexa, what's the future like without keyboard and mouse?
(1 Nov 2018) First published on Zensar Blogs.
Until the mid-to-late 1970s, manufacturers didn't really care about computer interfaces. The few people who had access to computers were professionals or academics. Since the user base was small, there was little need to focus on how users interacted with their machines.
With the dawn of personal computing, the field of interaction opened up. All the major players, Xerox, Apple, IBM, started designing their own versions of the keyboard and mouse. In the 1980s, the discipline of human-computer interaction was born with a single objective: making computing easier for the masses.
Whenever there is a paradigm shift, it drives humans to innovate new technologies, and innovating new products is essential to the progression of society. We are facing a similar shift today: the rising use of smartphones, tablets, VR and AR devices, smartwatches, and smart home gadgets has made the keyboard and mouse impractical. And, just like every time before, we are innovating. Several technologies and products are poised to replace the traditional keyboard and mouse, and each of these new ways of interacting has its pros and cons: some claim to be faster, some claim to be easier, and some work where others simply can't. Let's explore the future of interacting with computing devices.
Voice
Speech is often considered the future of interacting with computers. Since the launch of Siri in 2010, the world has been enthralled with voice interfaces. How convenient it has become to tell Google to turn off all the lights when you leave a room, ask Alexa to play your favourite song when you feel bored, or have Siri tell you the current weather when you are planning a trip to Lonavla. According to some estimates, voice-first devices have reached a total footprint of around 33 million units in circulation.
Although speech can be useful and fast, it has its downsides. Suppose you have to compose a whole paragraph or write a paper: speaking out loud starts to interfere with the part of the brain that is composing the information for you. Speech interfaces can also be slow and embarrassing when other people are around. And they always require a wake phrase like "Okay Google", "Alexa", or "Hey Siri", which gets pretty annoying after a while.
Thankfully, though, talking into midair is no longer our only option.
Wearable Keyboards
Wearable technology is an emerging market. Revenue from wearable device sales was forecast to amount to around 26.43 billion U.S. dollars by 2018. Most people picture health monitoring devices when they hear "wearable technology", thanks to Fitbit, but its applications span much wider than health monitoring alone.
Tap Systems, Inc. is trying to reimagine how portable and ergonomic a keyboard can be. Tap is a smart wearable device that lets the user type without a physical keyboard. According to the company, it is designed to support strain-free input over long periods. Unlike a traditional keyboard, the user taps on any surface with a combination of fingers to produce a particular letter or symbol. The combinations work something like this: tapping with your thumb and index finger simultaneously inputs the letter 'N'.
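To make the chording idea concrete, here is a minimal sketch of how a decoder for such a device could work. Apart from the thumb-plus-index 'N' mentioned above, the chord-to-letter mappings are invented for illustration; the real Tap alphabet is defined by the company.

```python
# Toy decoder for a chorded wearable keyboard, in the spirit of Tap.
# A tap is modelled as the set of fingers that hit the surface together.
# NOTE: except for thumb+index = 'N' (from the article), this alphabet
# is hypothetical; the real mapping is defined by the vendor.

THUMB, INDEX, MIDDLE, RING, PINKY = range(5)

CHORD_TO_LETTER = {
    frozenset({THUMB}): "A",           # hypothetical single-finger vowels
    frozenset({INDEX}): "E",
    frozenset({MIDDLE}): "I",
    frozenset({RING}): "O",
    frozenset({PINKY}): "U",
    frozenset({THUMB, INDEX}): "N",    # the example from the article
}

def decode_tap(fingers_down):
    """Map one simultaneous tap to a letter, or '?' if the chord is unknown."""
    return CHORD_TO_LETTER.get(frozenset(fingers_down), "?")

# Typing "NINE" under this toy alphabet: each tap is one chord.
taps = [{THUMB, INDEX}, {MIDDLE}, {THUMB, INDEX}, {INDEX}]
print("".join(decode_tap(t) for t in taps))  # -> NINE
```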
The major market the company is focusing on is VR and AR, where you cannot see what you are typing on a keyboard. It's definitely an interesting piece of hardware, but there is a learning curve: the user has to memorise a lot of combinations for the different letters.
Gestures
The way computers perceive visual information is an entirely different discussion. Computer scientists have been researching computer vision for a fairly long time. Today, a computer can tell whether you are stirring a cup of coffee or opening a laptop. That in itself is fascinating, because each image or frame is just a long list of binary digits encoding information about each pixel.
A lot of these smart devices come with a camera, sometimes multiple cameras, that can perceive three dimensions and generate a depth map of the scene. Combining this with trained neural networks, computers can identify the humans in the scene, how they are posing, and how they are moving. This technology opens the gates to gesture-based interaction models.
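As a toy illustration of what a depth map buys you, the sketch below treats a frame as a plain grid of distances and pulls out whatever is closest to the camera, often the user's hand. Real pipelines feed much richer data into trained neural networks; the values and threshold here are made up.

```python
# A depth frame is just a grid of numbers: the distance (here in metres)
# from the camera to whatever is visible at each pixel.
depth_frame = [
    [2.1, 2.1, 2.0, 2.1],
    [2.0, 0.4, 0.5, 2.1],   # the 0.4-0.5 m blob is a near object,
    [2.1, 0.5, 0.4, 2.0],   # e.g. a hand held in front of the camera
    [2.1, 2.0, 2.1, 2.1],
]

NEAR_THRESHOLD = 1.0  # metres; anything closer counts as "foreground"

def foreground_pixels(frame, threshold=NEAR_THRESHOLD):
    """Return (row, col) coordinates of pixels closer than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(frame)
        for c, depth in enumerate(row)
        if depth < threshold
    ]

print(foreground_pixels(depth_frame))
# [(1, 1), (1, 2), (2, 1), (2, 2)] -> a candidate hand region
```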
Leap Motion takes the idea of hand interaction a step further. They build sensors that detect your hands moving in space and then model them in virtual environments. The key is detecting all the movements and reflecting them in VR in real time.
I worked on a hand-gesture recognition framework for controlling the cursor on a computer. The framework was also used in a retail catalogue system that worked entirely through facial recognition, hand gestures, and speech. The user could move the pointer across the screen, click icons, and use the scroll functionality.
Speed is the major advantage of gestures, as they are much quicker than speaking sentences. For example, instead of asking the computer to "scroll 10% down", you can simply close your fist, as if grabbing the page, and move it up or down to scroll. It's much more natural for that use case. But if you have to communicate a longer piece of information, it's probably not the best option.
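To make the grab-to-scroll idea concrete, here is a minimal sketch of the control loop. It assumes a hypothetical hand tracker that emits a gesture label and a normalised hand height for every frame; that tracker stands in for whatever vision pipeline you use, and the gain value is arbitrary.

```python
# Minimal grab-to-scroll loop. The per-frame (gesture, hand_y) pairs are
# assumed to come from some hand tracker; "fist" grabs the page, an open
# hand releases it, and vertical hand motion becomes scroll distance.

def scroll_deltas(frames, gain=1000):
    """Yield scroll amounts (in pixels) while the hand is a closed fist."""
    previous_y = None
    for gesture, hand_y in frames:          # hand_y in [0, 1], top = 0
        if gesture == "fist":
            if previous_y is not None:
                yield int((hand_y - previous_y) * gain)  # page follows hand
            previous_y = hand_y
        else:
            previous_y = None               # open hand releases the page

# Simulated tracker output: grab, drag downwards, release.
frames = [("open", 0.50), ("fist", 0.50), ("fist", 0.55),
          ("fist", 0.62), ("open", 0.62)]
print(list(scroll_deltas(frames)))  # -> [50, 70]
```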
Brain Control
How cool would it be if a real brain could communicate with an artificial one? Well, it's a reality now. We can capture data about physiological functions using devices that track the activity of different systems in the body. There are headsets that can sense brain waves: much like the neurons in our body carry signals, these headsets pick up brain signals and translate them into actions.
At Stanford, researchers have developed a technology called BrainGate that has enabled a woman suffering from A.L.S. to express her thoughts by typing on a screen, not with her fingers but with her brain waves.
The current generation of brain-wave-sensing headsets is not that powerful, so people need to focus hard to issue commands to the system.
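That need for sustained focus maps naturally onto a dwell-style trigger: the signal must stay above a threshold for long enough before a command fires. Below is a toy sketch of that idea over a made-up attention score; real EEG pipelines involve filtering, artefact removal, and per-user calibration, and every number here is invented.

```python
# Toy dwell trigger for a consumer EEG headset: fire a command only when
# a (made-up) attention score stays above a threshold for enough samples.
# Real headsets expose their own metrics through vendor SDKs; the values
# and threshold below are purely illustrative.

THRESHOLD = 0.7     # attention level that counts as "focused"
DWELL = 3           # consecutive samples required before triggering

def detect_commands(attention_stream):
    """Yield sample indices where a sustained-focus command fires."""
    run = 0
    for i, level in enumerate(attention_stream):
        run = run + 1 if level >= THRESHOLD else 0
        if run == DWELL:        # fire once per sustained burst of focus
            yield i

signal = [0.2, 0.8, 0.4, 0.75, 0.9, 0.85, 0.3, 0.72, 0.8, 0.9, 0.95]
print(list(detect_commands(signal)))  # -> [5, 9]
```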
Conclusion
All these new technologies beg the question: of the existing interaction technologies, sound (voice), haptics (touch), vision (gestures), and bio-feedback (brain control), which is the best one to use?
The use cases point towards the answer. I believe these technologies will not compete against each other; rather, they will work together in a multimodal interface. We will communicate through one interface and get the response in another. It will be complicated to develop and adapt to, but that has never stopped us from innovating, has it?
References
- Adam Marchick, "The 2017 Voice Report", Alpine (formerly VoiceLabs)
- The Wall Street Journal, "The Race to Replace Your Keyboard"
- David Rose (IDEO), "Why Gesture Is the Next Big Thing in Design"
- "Stanford researchers harness brain waves for movement with BrainGate"
- Statista, "Wearable device sales revenue worldwide from 2016 to 2022"