Eye Tracking – what it means for XR Developers and Designers

Technical Overview: Eye Tracking

Eye tracking means that the direction of the eye gaze for each eye, plus detection of eyelid closure (i.e. blinks), are available. With a little mathematics it’s possible to turn this into a view direction in the rendered scene and determine the objects the user is gazing at. Additionally, humans typically don’t sweep their eyes smoothly to a new direction; instead both eyes jump simultaneously in a rapid shift of gaze called a “saccade”. Not only are saccades incredibly fast, but the brain also temporarily blinds you while you make them – that’s why you don’t see a blurry sweeping image when you shift your gaze – a phenomenon called “saccadic masking.” Instead the brain replaces the view during the saccade with a still image of the new direction.
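
To make the “little mathematics” concrete, here’s a minimal sketch (Python, with invented object data) of testing a world-space gaze ray against bounding spheres to decide what the user is looking at – a real engine would use its own raycast query instead:

```python
import numpy as np

def gazed_object(gaze_origin, gaze_dir, objects):
    """Return the nearest object whose bounding sphere the gaze ray hits.

    gaze_origin, gaze_dir: world-space 3-vectors (gaze_dir normalized).
    objects: list of (name, center, radius) bounding spheres.
    """
    best_name, best_t = None, np.inf
    for name, center, radius in objects:
        t = np.dot(center - gaze_origin, gaze_dir)    # distance along the ray
        if t < 0:
            continue                                   # object is behind the viewer
        closest = gaze_origin + t * gaze_dir           # nearest ray point to the center
        if np.linalg.norm(center - closest) <= radius and t < best_t:
            best_name, best_t = name, t
    return best_name

# Hypothetical scene: user at head height, looking straight down -Z
scene = [("door", np.array([0.0, 1.6, -3.0]), 0.5),
         ("lamp", np.array([2.0, 2.0, -3.0]), 0.3)]
print(gazed_object(np.array([0.0, 1.6, 0.0]),
                   np.array([0.0, 0.0, -1.0]), scene))   # -> door
```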

Eye tracking also provides positional eye information – that is, where each eye sits, usually in relation to the HMD. This opens up the possibility of unique user identifiers. (Eye tracking hardware requires a driver to talk to the hardware and hand you the information, but for security reasons the data is filtered to provide only eye-tracking values – actual eye images are never shared outside the driver.) With this positional information it’s possible to measure each user’s inter-pupillary distance (IPD) and adjust for it, so that the lenses can be moved to that user’s optimal position.
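
As a sketch of how that might work – the profile table and tolerance here are invented for illustration, and IPD alone is much too coarse to serve as real authentication:

```python
import numpy as np

def measure_ipd_mm(left_eye_pos, right_eye_pos):
    """IPD as the distance between the reported eye positions (HMD-relative, metres)."""
    return np.linalg.norm(np.asarray(right_eye_pos) - np.asarray(left_eye_pos)) * 1000.0

def match_profile(ipd_mm, profiles, tolerance_mm=0.5):
    """Naive profile lookup: the stored user whose IPD is closest, if within
    tolerance. Treat it as a hint, not an identity proof."""
    name, stored = min(profiles.items(), key=lambda kv: abs(kv[1] - ipd_mm))
    return name if abs(stored - ipd_mm) <= tolerance_mm else None

profiles = {"alice": 62.1, "bob": 67.4}           # hypothetical stored measurements
ipd = measure_ipd_mm([-0.031, 0.0, 0.0], [0.031, 0.0, 0.0])
print(ipd, match_profile(ipd, profiles))          # -> 62.0 alice
```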

Some tracking solutions (like the popular Tobii) also report pupil dilation – offering up yet more possibilities for gauging the user’s interest.
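
A toy way to turn raw pupil diameters into an “interest” signal is to score each sample against a rolling baseline. One big caveat: pupils also react strongly to scene brightness, which a serious implementation would have to control for and this sketch ignores:

```python
from collections import deque
import numpy as np

class InterestGauge:
    """Crude interest signal: how far the current pupil diameter sits above
    a rolling baseline, expressed as a z-score."""
    def __init__(self, window=300):               # ~3 s of samples at 100 Hz
        self.samples = deque(maxlen=window)

    def update(self, diameter_mm):
        self.samples.append(diameter_mm)
        baseline = np.mean(self.samples)
        spread = np.std(self.samples) or 1e-6     # avoid divide-by-zero early on
        return (diameter_mm - baseline) / spread  # >~2 suggests heightened interest
```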

In the vernacular of eye tracking, head-mounted displays and tracked input devices, this combination of position and direction (and sometimes an implied “up” vector) is called a “pose”. Just as you need the HMD’s pose to render an XR scene correctly, you (or the game engine) need to orient the eye position and gaze relative to the HMD to get them into world coordinates. But once you have that, there are all sorts of possibilities for mischief.
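
Here’s roughly what that transform looks like – a hypothetical helper that composes the HMD pose (position plus orientation quaternion) with HMD-relative eye data to yield a world-space gaze ray:

```python
import numpy as np

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z)."""
    w, xyz = q[0], q[1:]
    return v + 2.0 * np.cross(xyz, np.cross(xyz, v) + w * v)

def eye_gaze_to_world(hmd_pos, hmd_rot, eye_pos_hmd, gaze_dir_hmd):
    """Compose the HMD pose with HMD-relative eye data -> world-space gaze ray."""
    origin = hmd_pos + quat_rotate(hmd_rot, eye_pos_hmd)
    direction = quat_rotate(hmd_rot, gaze_dir_hmd)
    return origin, direction / np.linalg.norm(direction)

# Hypothetical frame: HMD at head height, yawed 90 degrees left about +Y,
# left eye 31 mm off-centre, gaze straight ahead in HMD space
yaw90 = np.array([np.cos(np.pi / 4), 0.0, np.sin(np.pi / 4), 0.0])
origin, direction = eye_gaze_to_world(np.array([0.0, 1.7, 0.0]), yaw90,
                                      np.array([-0.031, 0.0, 0.0]),
                                      np.array([0.0, 0.0, -1.0]))
print(origin, direction)   # gaze now points down -X in world space
```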

TLDR:

  1. Eye tracking will provide a view direction – typically per eye and averaged – and it might be up to you to transform that into “world space”. You can then use this to know what the user is looking at in your scene.
  2. Eye positional information can be used to measure (and possibly adjust) the IPD, perhaps retrieving a user’s profile if the IPD is unique.
  3. While the user saccades they are effectively blind – the brain shows the image from the new direction instead of a transitioning one. (Yes, you read that right)
  4. Same with blinks – which last a little longer. (A sketch for detecting both of these “blind” windows follows this list.)
  5. You might also have the option of gauging the user’s interest by monitoring pupil dilation.
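
Here’s the promised sketch for spotting those “blind” windows. The thresholds are assumptions you’d tune against your particular tracker:

```python
import numpy as np

SACCADE_DEG_PER_SEC = 300.0        # assumed threshold; tune per tracker

def blind_window(prev_dir, cur_dir, dt, eyelid_closure):
    """True while the user is effectively not seeing: mid-blink or mid-saccade.

    prev_dir, cur_dir: normalized gaze directions from consecutive samples.
    dt: seconds between samples.  eyelid_closure: 0.0 open .. 1.0 closed.
    """
    if eyelid_closure > 0.8:                          # blink
        return True
    cos_angle = np.clip(np.dot(prev_dir, cur_dir), -1.0, 1.0)
    speed = np.degrees(np.arccos(cos_angle)) / dt     # angular speed, deg/s
    return speed > SACCADE_DEG_PER_SEC                # saccade
```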

Practical/Ethical Options

So suddenly there’s a whole realm of possibilities open to the XR designer.

There are the “benign/helpful” ones, like:

  • Treating a long gaze on an object as an “open” or “activate” (a dwell-timer sketch follows this list)
  • Fusing with foveated rendering to use high quality rendering only where the user is looking (saving both GPU effort and energy)
  • Gauging user interest by creating a “gaze” heat-map of the user’s view.
  • Making sure the user has “seen” an item of interest (like an alert) in the scene – and possibly using that fact as an acknowledgement instead of a “Press OK”
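
And here’s the dwell-timer sketch promised above. The one-second threshold is a guess – real gaze UIs tune it carefully and show a fill/progress cue so the activation doesn’t feel like a surprise:

```python
import time

class DwellActivator:
    """Fire an 'activate' once the gaze has rested on one target long enough."""
    def __init__(self, dwell_seconds=1.0):
        self.dwell_seconds = dwell_seconds
        self.target = None
        self.since = 0.0

    def update(self, gazed_target, now=None):
        """Call every frame with the currently gazed object (or None).
        Returns the target when its dwell completes, else None."""
        now = time.monotonic() if now is None else now
        if gazed_target != self.target:
            self.target, self.since = gazed_target, now   # gaze moved: restart timer
            return None
        if self.target is not None and now - self.since >= self.dwell_seconds:
            self.since = float("inf")                     # fire only once per dwell
            return self.target
        return None

# Per frame: fired = activator.update(gazed_object(...)); if fired, activate it
```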

There are the “creative” ones:

  • Redirected walking is an opportunity to alter the topology of the virtual space compared to the real – either to make it seem virtually larger or to adjust the user’s trajectory through the real space – perhaps avoiding obstacles or other users.
  • Dynamically “adjusting” the scene in some way – either to bring attention to an object the user isn’t currently looking at or to remove or alter some object they aren’t currently looking at.
  • Changing the scene topology when they aren’t looking at it. This is a bigger version of redirected walking – you can alter whole sections of the scene while a saccade, a blink, or simply an averted gaze guarantees the change won’t be seen.

Redirected walking can be used in many ways – either scaling the amount of virtual head rotation versus actual (increasing or decreasing it as desired) or injecting extra rotation while the user is moving, to “guide” them in a desired direction – it’s amazing how much you can get away with.
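
A sketch of both tricks in one place. The gain values below are placeholders – the research literature reports detection thresholds for how much rotation manipulation users fail to notice, and those numbers vary by study and by person (and the blind windows above let you get away with even more):

```python
class RedirectedWalking:
    """Scale real head yaw by a gain, and inject a slow drift while the user
    walks, steering their real-world path without them noticing."""
    def __init__(self, rotation_gain=1.1, drift_deg_per_m=1.0):
        self.rotation_gain = rotation_gain        # virtual yaw per unit of real yaw
        self.drift_deg_per_m = drift_deg_per_m    # injected yaw per metre walked
        self.virtual_yaw = 0.0

    def update(self, real_yaw_delta_deg, metres_walked, steer_sign=+1):
        """steer_sign (+1 / -1) picks which way to nudge the user's real path."""
        self.virtual_yaw += self.rotation_gain * real_yaw_delta_deg
        self.virtual_yaw += steer_sign * self.drift_deg_per_m * metres_walked
        return self.virtual_yaw % 360.0           # yaw to use when rendering
```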

Questionable Options

As we strap more computing power to our bodies, we expect that power to be used to make the experience better. Unfortunately it also means we’re passing along more intimate physiological information to be processed. This can be used for good or evil.

Scenario 1:

Imagine that you’re creating a location-based VR group experience – say, some derelict alien spaceship. You have a physical location with a virtual landscape overlaid on it. Since you control the audio as well as the visuals, you let the group start off down a dark and creepy hallway – they’re chatting away, keeping their spirits up as they move as a group down the hallway.

You decide it’s time to up the excitement – you pick a subject and trigger an alternate experience for them, split off from the main group. Using various redirected-walking techniques you slowly guide the victim down an alternate physical route. Meanwhile everyone still seems grouped together – they can still clearly see and hear each other – but the main group is now seeing the victim’s avatar, while the victim is actually hanging with a virtual group.

You spring your trap.

The group suddenly sees the victim attacked and killed, with lots of screaming and giblets (remember, you own both audio and visuals – the two groups are now on separate rides), while the victim sees one of the group attacked in a similarly horrible manner. Hilarity ensues.

Scenario 2:

You’re playing an XR poker game, and manage to scrape the other users’ pupil dilation values – giving you a quantitative, real-time measure of their interest in the various cards. Seems like a profitable effort.

Scenario 3:

You’re selling ad time on your XR platform, and one of the things you provide is users’ physiological measurements, including saccade times, blink measurements and pupil dilation. Some of the fun facts you can correlate with this information include:

  • Above average saccade times can indicate a brain condition or the influence of drugs.
  • Above average pupil dilation can indicate either a brain condition or the influence of drugs.
  • Reduced observed upwards saccadic movement can indicate the user is elderly.
  • Excessive blinking can indicate the onset of a stroke, Tourette’s syndrome or some other disorder of the nervous system.
  • Blink rates for females are usually higher than for males.
  • Blink rates for females on oral contraceptives are 32% higher than for those not on oral contraceptives.

New market opportunities!
