This article takes a technical perspective and uses Apple's recently launched MR headset as an example to analyze the three key technologies the AR world relies on: eye tracking, gesture recognition, and spatial computing. Let's look at the author's analysis of each.
In the physical world, observing with the eyes and operating with the hands are the most natural ways to interact. Carrying this natural interaction over into the AR world relies on three key technologies.
Apple Vision Pro did not disappoint: its release demonstrates the natural interaction that these three technologies make possible.
1. Eye tracking technology
When we want to interact further with an object in the real world, we naturally focus our eyes on it. Spending our attention on it in this way already expresses a choice.
This process covers two states found in today's interfaces: the activation state (focus) and the click state (selection). Eye tracking technology handles the first of these: looking at something is what focuses it.
This technology is certainly not the first of its kind. Microsoft HoloLens 2, a pioneer among AR glasses, offers an interaction feature called eye-gaze, which lets you focus content with your eyes.
Before that, the head-gaze interaction of the first-generation Microsoft HoloLens already carried the idea of activating content by looking at it. In head-gaze, however, the user moves the head slightly to steer a point (the gaze cursor) at the center of the view onto the content to activate it. This always falls a little short of truly natural interaction, because when we focus on an object we do not turn our head every time; often only our eyes move.
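Both head-gaze and eye-gaze can be thought of as casting a ray and asking which object it hits; what differs is where the ray direction comes from (head orientation versus measured eye direction). The following is a minimal sketch of that idea, assuming targets are approximated by spheres. The names `Target` and `pick_focus` and all coordinates are hypothetical and do not come from any headset SDK.

```python
import math
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    center: tuple   # (x, y, z) in metres, relative to the user's head
    radius: float

def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def pick_focus(origin, direction, targets):
    """Return the closest target whose sphere the gaze ray passes through, or None."""
    d = _normalize(direction)
    best, best_t = None, math.inf
    for target in targets:
        # Vector from the ray origin to the sphere centre.
        oc = tuple(c - o for c, o in zip(target.center, origin))
        # Distance along the ray to the point nearest the centre.
        t = sum(a * b for a, b in zip(oc, d))
        if t < 0:
            continue  # target is behind the user
        # Squared distance from the sphere centre to the ray.
        dist2 = sum(c * c for c in oc) - t * t
        if dist2 <= target.radius ** 2 and t < best_t:
            best, best_t = target, t
    return best

targets = [
    Target("menu", (0.0, 0.0, -2.0), 0.3),
    Target("video", (1.0, 0.0, -2.0), 0.3),
]
head_forward = (0.0, 0.0, -1.0)   # head-gaze: ray follows the head
eye_direction = (0.5, 0.0, -1.0)  # eye-gaze: eyes glance right, head stays still

head_focus = pick_focus((0.0, 0.0, 0.0), head_forward, targets)   # the "menu" target
eye_focus = pick_focus((0.0, 0.0, 0.0), eye_direction, targets)   # the "video" target
```

With the head held still, the head-gaze ray stays on the menu, while the eye-gaze ray can already reach the video to its right. That is the gap between the two methods described above.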
That said, although eye tracking comes closer to the standard of natural interaction, head-gaze interaction is not without merit. For example, the glasses my team is currently developing support only head-gaze interaction. This method costs less and is technically easier to implement, and compared with moving the focus by hand, mouse, or remote control, it is still closer to the natural idea of looking at something with our eyes.
In addition, eye tracking delivers the activation state, but it does not yet truly deliver selection, that is, the click state, whose job is to tell the machine "yes, I confirm this is the one."
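The split between the two states can be made concrete with a small sketch: gaze only ever moves the focus, and a separate confirmation signal turns the current focus into a selection. This is an illustration only, not how any particular headset implements it. `GazeInteraction`, `on_gaze`, and `on_confirm` are hypothetical names, and the source of the confirmation (a gesture, a voice command, or something else) is deliberately left open.

```python
from typing import Optional

class GazeInteraction:
    """Tracks which item is focused (looked at) and which is selected (confirmed)."""

    def __init__(self) -> None:
        self.focused: Optional[str] = None
        self.selected: Optional[str] = None

    def on_gaze(self, item: Optional[str]) -> None:
        # Eye tracking alone only changes the focus (activation state).
        self.focused = item

    def on_confirm(self) -> Optional[str]:
        # A separate signal is needed to tell the machine "this is the one".
        if self.focused is not None:
            self.selected = self.focused
        return self.selected

ui = GazeInteraction()
ui.on_gaze("play_button")
looked_only = ui.selected     # still None: looking is not selecting
ui.on_confirm()
after_confirm = ui.selected   # "play_button"
```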
Strictly speaking, there are situations where being able to skip the manual confirmation step really matters. For example, when I am eating snacks while watching a TV show, I would love not to have to wipe my dirty hands before every operation... In manufacturing, customers often raise this same need to keep their hands free.
For more natural interaction, perhaps this too can become part of what we imagine. I previously wrote about a patent that relied on brainwave technology for simple confirmation.
2. Gesture recognition technology
Eye tracking covers the "observing with the eyes" part of interaction, while "operating with the hands" requires gesture recognition technology.
This is not a new technology either. It has shipped on many AR/VR devices before, although how well it works has to be judged through hands-on experience.
In the Vision Pro promotional video, gesture recognition looks very natural, and the hand does not even need to be raised. This presumably relies on the 4 sets of downward-facing cameras (probably one of the reasons a 12-camera configuration is required).
By comparison, in the HoloLens 2 promotional video you can see that gestures are made within the area covered by the head-mounted camera.
Because gesture recognition relies on cameras (the computer needs input to know how your hands are moving), the same gesture captured from different angles produces different images, which in turn affects the recognition result.
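Why the viewing angle matters can be shown with a simple pinhole projection: the same two fingertip positions yield a different apparent gap in the image depending on where the camera looks from. The sketch below is rough and assumes an ideal pinhole camera with focal length 1 that can only rotate about the vertical axis. The function names and all coordinates are made up for illustration and do not describe any real device.

```python
import math

def project(point, camera_pos, yaw_deg):
    """Project a world point (x, y, z) into a pinhole camera.

    The camera sits at camera_pos and looks along +z when yaw is 0;
    yaw rotates its viewing direction about the vertical (y) axis.
    """
    px, py, pz = (p - c for p, c in zip(point, camera_pos))
    t = math.radians(yaw_deg)
    # Rotate the point into the camera's own frame.
    x_cam = px * math.cos(t) - pz * math.sin(t)
    z_cam = px * math.sin(t) + pz * math.cos(t)
    # Perspective divide (focal length assumed to be 1).
    return (x_cam / z_cam, py / z_cam)

def apparent_gap(a, b, camera_pos, yaw_deg):
    """Distance between two points as seen in the camera image."""
    return math.dist(project(a, camera_pos, yaw_deg),
                     project(b, camera_pos, yaw_deg))

# Thumb and index fingertips 4 cm apart, separated only in depth.
thumb = (0.0, 0.0, 0.40)
index = (0.0, 0.0, 0.44)

# Camera looking straight at the hand: the fingertips overlap in the image.
front_gap = apparent_gap(thumb, index, camera_pos=(0.0, 0.0, 0.0), yaw_deg=0)

# Camera looking at the hand from the side: the gap is clearly visible (about 0.1).
side_gap = apparent_gap(thumb, index, camera_pos=(-0.4, 0.0, 0.42), yaw_deg=90)
```

From the front, an open pinch and a closed pinch look the same here, whereas the side view separates them. Covering the hand from several angles reduces this kind of ambiguity.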
In addition, the experience of 2D gesture recognition and 3D gesture recognition technology is also different.
3. Spatial computing
To make seeing with the eyes easier and operating with the hands more natural, the device needs to be able to understand space. Spatial computing is what Vision Pro wants to promote. Apple places such weight on this capability because it believes it can define an era: the "Spatial Computing Era".
Many interactions feel natural precisely because the machine has spatial computing capabilities; without the ability to understand space, these interactions could not happen naturally. NeRF, SLAM, 3DoF, and 6DoF are all part of spatial computing technology, and most of the related terms you hear fall into this category.
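The difference between 3DoF and 6DoF can be illustrated with a virtual object pinned to a spot in the room. With rotation-only (3DoF) tracking, the device cannot tell that the user has stepped forward, so the object appears to move along with them. With 6DoF tracking, which also tracks position, the object stays put in the room and gets closer as the user approaches. The sketch below is simplified to yaw rotation and 2D floor coordinates; `Pose` and `to_device_frame` are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Pose:
    yaw_deg: float = 0.0  # rotation about the vertical axis (3DoF and 6DoF)
    x: float = 0.0        # position on the floor in metres (6DoF only)
    z: float = 0.0

def to_device_frame(world_point, pose, six_dof):
    """Express a world-anchored floor point (x, z) in the device's own frame."""
    px, pz = world_point
    if six_dof:
        # Positional tracking: account for where the user actually stands.
        px, pz = px - pose.x, pz - pose.z
    # Rotational tracking: undo the head's yaw.
    t = math.radians(pose.yaw_deg)
    return (px * math.cos(t) - pz * math.sin(t),
            px * math.sin(t) + pz * math.cos(t))

anchor = (0.0, 2.0)                                # object 2 m ahead of the start
stepped_forward = Pose(yaw_deg=0.0, x=0.0, z=1.0)  # user walked 1 m toward it

seen_3dof = to_device_frame(anchor, stepped_forward, six_dof=False)  # still 2 m away
seen_6dof = to_device_frame(anchor, stepped_forward, six_dof=True)   # now 1 m away
```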
From an experience perspective, this includes the perception of distance between objects, and between objects and the user, that comes from spatial depth; changes in an object's apparent shape as the user's position and posture change the viewing angle; changes in perceived color caused by differences in real-world lighting, such as time of day and weather; and even sounds arriving from different locations in space. Spatial computing gives designers more room to design natural interaction.
It is fair to say that spatial computing is the key technology that will set AR apart from ordinary screen interfaces in the future, and that will bring about what we are really looking forward to: "everything you look at can become an interface."
Columnist
Lin Yingluo (WeChat public account: There are shadows falling in the forest), columnist at Everyone Is a Product Manager. A user experience designer who knows how to play cards and author of "AR Interface Design", with 10 years of UI/UX design experience, including 6 years focused on user experience design for AR and intelligent products; educational background in design and psychology; nationally certified advanced OH card teacher and talent discovery coach. I hope my efforts can add value to the design field of an intelligent future and make the designer's career more valuable.
Title image from Unsplash, under the CC0 license.