Smartphone Controls with Eyes and Simple Gestures are Here

Using this new gaze-tracking tech, you will soon be able to control your smartphones using your eyes and gestures instead of spoken or typed commands.

As users continue to use their phones to read news, watch movies, edit multimedia, and keep up with social media, these devices have grown to accommodate higher processing power and bigger screens needed for those demanding tasks.

However, these phones frequently need a voice command or a second hand to operate, which is quite inconvenient and cumbersome (1).

In response, researchers in the Future Interfaces Group at Carnegie Mellon University's Human-Computer Interaction Institute (HCII) are developing a new tool called EyeMU. It allows users to execute operations on their phones using gaze control and simple hand gestures.

More About the Gaze-Tracking Tool

“We asked if there is a more natural way to interact with smartphones. And the precursor for a lot of what we do is to look at something,” said Karan Ahuja, a doctoral student at HCII (2).

While gaze analysis and prediction are not new technologies, achieving an acceptable level of functionality on a smartphone would be a noteworthy advance.

“Our eyes have what we call the Midas touch problem. We cannot have a situation where something happens on the phone everywhere we look. We might open many applications at once,” said Chris Harrison, an associate professor in the HCII and the director of the Future Interfaces Group (3).

This difficulty can be solved with software that precisely tracks the eyes. Andy Kong, a senior majoring in computer science, has been fascinated by eye-tracking technology since his first days at CMU (4). Finding commercial versions too expensive, he wrote a program that used a laptop’s built-in camera to track the user’s eyes and move the cursor across the screen, a critical first step toward EyeMU.

“Right now, phones only interact when we ask for something, whether through speech, taps, or button clicks,” Kong explained. “Think how much more rewarding it would be if we could forecast what the user desired by analyzing gaze or other biometrics.”

It was challenging to streamline the program to run quickly on a smartphone.

“This is due to a lack of resources. You must ensure that your algorithms are sufficiently fast,” Ahuja remarked. “Your eye will skip along if it takes too long.”

For more details, consider reading the research article on EyeMU Interactions. The team also presented their findings at last year’s International Conference on Multimodal Interaction (5).

Kong and Ahuja upgraded that early prototype by studying the gaze patterns of users looking at various sections of the screen and rendering the mapping data using Google’s Face Mesh tool. The team then created a gaze predictor, which uses the phone’s front-facing camera to focus on and register what the user is looking at as the target.

The team made the tool more productive by integrating the gaze predictor with the smartphone’s built-in motion sensors to enable commands.

For example, a user may lock onto a notification as a target by looking at it long enough, then flick the phone to the left to dismiss it or to the right to respond to it. Similarly, while holding a tall cappuccino in one hand, a user might pull the phone closer to expand an image or move the phone aside to deactivate the gaze control.
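The interaction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not EyeMU's actual implementation: the Sample fields, the dwell time, the flick threshold, and the sign convention for lateral acceleration are all hypothetical values chosen for the example. It shows how requiring a dwell before locking a target, and a gesture before acting, sidesteps the Midas touch problem: looking alone never triggers anything.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical tuning values, chosen only for illustration.
DWELL_SECONDS = 0.5    # how long the gaze must rest on a target to lock it
FLICK_THRESHOLD = 3.0  # lateral acceleration (m/s^2) that counts as a flick


@dataclass
class Sample:
    t: float                    # timestamp in seconds
    gaze_target: Optional[str]  # id of the UI element under the gaze, or None
    accel_x: float              # lateral acceleration; negative = left (assumed)


def interpret(samples: Iterable[Sample]) -> Optional[Tuple[str, str]]:
    """Return (target, action) for the first flick made after a target is locked."""
    candidate: Optional[str] = None  # element currently being looked at
    since = 0.0                      # time the gaze first landed on the candidate
    locked: Optional[str] = None     # element locked by a long enough dwell

    for s in samples:
        if locked is None:
            if s.gaze_target != candidate:
                # Gaze moved to a different element: restart the dwell timer.
                candidate, since = s.gaze_target, s.t
            elif candidate is not None and s.t - since >= DWELL_SECONDS:
                locked = candidate
        if locked is not None:
            # In this simplified sketch the lock persists until a flick occurs.
            if s.accel_x <= -FLICK_THRESHOLD:
                return (locked, "dismiss")
            if s.accel_x >= FLICK_THRESHOLD:
                return (locked, "respond")
    return None
```

Feeding the function a stream in which the gaze rests on a notification for at least half a second and the phone is then flicked left yields a dismiss action for that notification; a flick without a prior dwell, or a dwell without a flick, yields nothing.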

“Big tech giants like Apple and Google have come fairly close with gaze prediction,” Harrison said, “but simply looking at something won’t get you there.”

“The key breakthrough in this effort is the integration of a second modality, such as flicking the phone left or right, in conjunction with gaze prediction. That is what makes it so effective. It seems obvious, yet it’s a brilliant idea that makes EyeMU a lot more intuitive.”

Gaze-Tracking the Future

Smartphone companies are striving to make their devices smarter. However, a mobile device’s ability to track the user’s gaze and execute the appropriate command has remained elusive (6).

Smartphones have been available since the late 20th century, but the fundamental design we use today is mostly based on the iPhone, which was released in 2007.

It has been 15 years since then. In 2022, as our phones become smarter, it’s important to remember that they’re also becoming the focus of our “smart” universe. Phones are essential in smart homes and automobile technology.

That isn’t likely to change anytime soon. Getting away from your smartphone in 2022 is more difficult than ever (in response, a countermovement is also rising; see Dumb Devices are Making a Come Back Across Industries).

In 2020, Google started experimenting with the Look to Speak app that allows people with motor or speech impairments to select and speak phrases (7).

The app allows users to choose preselected phrases on their phone screen with their gaze.

Richard Cave, a Google speech and language therapist, works with individuals who have speech and motor disabilities, particularly those who are nonverbal and require communication support (8).

“It’s more than a job for me; it’s a passion,” Cave wrote in a blog post unveiling the app. “Every day, I work to make it easier and more accessible for people to communicate their basic needs, thoughts, sentiments, and identities.”

Google designed the app to be used on a smartphone and in conjunction with more advanced assistive devices. After placing the phone slightly below eye level, a user looks left or right to select from a list of sentences, which the gadget subsequently speaks aloud.

According to Cave, the sentences can be customized to allow people to express themselves in their own voice. All data in the Look to Speak app, according to Google, is confidential and never leaves the phone.

Eye-tracking and other assistive touch technologies are also compatible with Apple devices (9). Eye-tracking has also made its way into games, with Meta filing patents for the technology, which will be included in the newest Oculus VR headset.

Meta CEO Mark Zuckerberg has previously claimed that a possible Quest Pro is in the works and will be packed with cutting-edge sensors for features such as facial and eye tracking. Sony’s planned PS VR2 is supposed to be the first consumer-level headset to include eye-tracking, but Meta could beat Sony to it if the Quest Pro arrives in the summer or fall, as is presently expected.

In February, Meta added full-body tracking to the Quest OS, demonstrating that the company is always working on new capabilities for prospective VR headsets. Face, eye, and body tracking would undoubtedly make the Quest Pro a formidable opponent.

Eye tracking is therefore likely to see mainstream use in VR first (10) (Suggested Reading: Eye-Tracking, Metaverse, and Health Crisis with Virtual Addiction).

Smarter Future

This advance in gaze tracking was made possible by adding a new modality: gesture.

Gestures provide more meaningful information than a look and improve the tech’s accuracy. This combination makes using a gadget while on the go much easier.

Gaze technology makes even more sense when used with AR glasses rather than mobile phones. For example, users can navigate through a feed or read emails while looking straight ahead instead of glancing down at a phone.

If it proves popular with developers of other digital platforms, second- and third-generation AR glasses will be able to do more with a glance or a simple gesture than ever before.

AR Glasses Could Replace Smartphones Soon

In November 2021, Apple analyst Ming-Chi Kuo predicted that the company would launch a new AR headset in 2022. In the same note, he also predicted that the iPhone would last another decade before Apple replaced it with AR (11).

It was not the first time an analyst had discussed the end of the iPhone. As we have previously discussed, Apple has been shifting more towards added services within its hardware and software products (Suggested Reading: Apple in an Aggressive Battle to Win the Indian Market).

Apple may be one of today’s most prominent companies, but CEO Tim Cook appears well aware that a business can’t rely so heavily on one product (Suggested Reading: Netflix Could Adopt New Growth Strategies to Recover Record Losses). Even today, half of Apple’s revenue still comes from the iPhone.

Hence, Kuo believes that Apple’s future depends on the success of its AR venture.

“Apple aims to replace the iPhone with AR in the upcoming decade, representing the demand for ABF of AR headsets will exceed at least one billion in the next ten years. Unimicron, Apple’s sole ABF supplier, is set to be the leading beneficiary,” said Kuo (12).

He added that positioning the AR headset only as a Mac or iPhone accessory could hurt the product’s success. If Apple instead offers an AR headset that works independently, it could deliver a complete and flexible user experience within its ecosystem.

With breakthroughs like gaze-tracking technology, AR headsets could eventually replace not only the iPhone but smartphones in general within the next ten years.