Accessibility Technology

Control Your Phone With
Eyes & Gestures

A production-ready system for touchless interaction using hybrid gaze estimation, 6-DOF head pose, temporal stabilization, micro-saccade filtering, dynamic calibration, and AI intent prediction — 100% on-device, no cloud, no neural implants.

<80 ms Latency
60 FPS Camera
9-Point Calibration
100% On-Device
Real-time Gaze Tracking

System Layers

Vision Input Layer

MediaPipe Face Mesh and Hands capture eye position, head pose, and hand landmarks at 60–120 FPS, with multi-point iris and eyelid contours.

  • Pupil + iris detection
  • Head pose estimation
  • 21-point hand landmarks
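
With MediaPipe's refined-landmark mode, iris points are appended to the base face mesh (indices 468–472 for one eye, 473–477 for the other). As a small illustration, an iris center can be taken as the mean of each five-point ring; the `irisCenter` helper below is an assumption for demonstration, not AccessEye's actual code:

```typescript
// Sketch: extract an iris center from a refined Face Mesh result.
// With iris refinement enabled, MediaPipe emits 478 landmarks; indices
// 468-472 form one iris ring and 473-477 the other. Averaging the ring
// is a robustness choice here (the first index is the center point).
type LM = { x: number; y: number };

function irisCenter(landmarks: LM[], start: number): LM {
  let x = 0;
  let y = 0;
  for (let i = start; i < start + 5; i++) {
    x += landmarks[i].x;
    y += landmarks[i].y;
  }
  return { x: x / 5, y: y / 5 };
}
```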

Gaze Mapping Engine

A hybrid gaze model fuses binocular iris offset, the head pose vector, and the pupil boundary, with temporal filtering to suppress jitter.

  • Adaptive Kalman + EMA + window
  • Micro-saccade filter (12px/200ms)
  • Confidence-weighted fusion
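
As a rough illustration of the stabilization above, the sketch below combines an EMA smoother with a micro-saccade gate using the 12 px / 200 ms figures quoted. The `GazeSmoother` class and its exact gating rule are assumptions for demonstration, not the production filter:

```typescript
// Minimal sketch: EMA smoothing plus a micro-saccade gate.
// Tiny, fast jumps (<12 px within 200 ms) are treated as noise and held.
type GazePoint = { x: number; y: number; t: number }; // t = timestamp in ms

class GazeSmoother {
  private last: GazePoint | null = null;
  constructor(private alpha = 0.3) {}

  update(raw: GazePoint): GazePoint {
    if (this.last === null) {
      this.last = raw;
      return raw;
    }
    const dx = raw.x - this.last.x;
    const dy = raw.y - this.last.y;
    const dt = raw.t - this.last.t;
    // Micro-saccade gate: hold the previous estimate for small fast jumps.
    if (Math.hypot(dx, dy) < 12 && dt < 200) {
      return this.last;
    }
    // EMA: blend the new sample toward the previous estimate.
    this.last = {
      x: this.alpha * raw.x + (1 - this.alpha) * this.last.x,
      y: this.alpha * raw.y + (1 - this.alpha) * this.last.y,
      t: raw.t,
    };
    return this.last;
  }
}
```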

UI Target Detection

Registers interactive components as bounding boxes and detects gaze intersection, triggering focus after a 300 ms dwell time.

  • Bounding box registry
  • Dwell-time focus
  • Visual glow feedback
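
The dwell-time focus logic can be sketched as follows, using the 300 ms value above. The `DwellDetector` class is illustrative, not the actual registry implementation:

```typescript
// Sketch: gaze-over-bounding-box hit detection with a dwell timer.
interface Target { id: string; x: number; y: number; width: number; height: number }

class DwellDetector {
  private focusId: string | null = null;
  private focusStart = 0;
  constructor(private targets: Target[], private dwellMs = 300) {}

  // Returns a target id once gaze has rested on it for dwellMs;
  // a real system would also debounce repeated activations.
  update(gx: number, gy: number, now: number): string | null {
    const hit = this.targets.find(
      (t) => gx >= t.x && gx <= t.x + t.width && gy >= t.y && gy <= t.y + t.height
    );
    if (!hit) {
      this.focusId = null;
      return null;
    }
    if (hit.id !== this.focusId) {
      this.focusId = hit.id; // gaze entered a new target: restart the timer
      this.focusStart = now;
      return null;
    }
    return now - this.focusStart >= this.dwellMs ? hit.id : null;
  }
}
```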

Gesture Engine

MediaPipe Hands landmarks power pinch detection, air tap recognition, and open palm cancel.

  • Pinch → Select
  • Air tap → Click
  • Open palm → Cancel
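
Pinch detection reduces to a thumb-tip/index-tip distance test over the normalized landmarks (MediaPipe Hands indices 4 and 8), using the 0.05 threshold quoted later in this document. This is a sketch of the principle, not the engine's full classifier:

```typescript
// Sketch: pinch detection from normalized MediaPipe Hands landmarks.
// Index 4 is the thumb tip, index 8 the index fingertip.
type Landmark = { x: number; y: number; z: number };

function isPinch(landmarks: Landmark[], threshold = 0.05): boolean {
  const thumb = landmarks[4];
  const index = landmarks[8];
  const d = Math.hypot(thumb.x - index.x, thumb.y - index.y, thumb.z - index.z);
  return d < threshold;
}
```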

Calibration System

9-point calibration plus continuous dynamic micro-calibration from confirmed interactions corrects gaze drift over time.

  • 9-point polynomial regression
  • Dynamic bias drift correction
  • Confidence-gated updates
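
The dynamic drift correction can be sketched as a bias offset nudged toward the residual each time the user confirms a selection (so the true target is known). The `DriftCorrector` class, its learning rate, and the 0.8 confidence gate are illustrative assumptions:

```typescript
// Sketch: confidence-gated bias drift correction from confirmed interactions.
class DriftCorrector {
  private biasX = 0;
  private biasY = 0;
  constructor(private rate = 0.1, private minConfidence = 0.8) {}

  // Apply the current bias to a raw gaze estimate.
  apply(x: number, y: number): { x: number; y: number } {
    return { x: x + this.biasX, y: y + this.biasY };
  }

  // Called after a confirmed interaction: nudge the bias toward the
  // residual between predicted gaze and the activated element's center.
  observe(predX: number, predY: number, trueX: number, trueY: number, conf: number): void {
    if (conf < this.minConfidence) return; // confidence-gated update
    this.biasX += this.rate * (trueX - (predX + this.biasX));
    this.biasY += this.rate * (trueY - (predY + this.biasY));
  }
}
```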

Privacy & Safety

All video processing is 100% on-device. No camera data ever leaves the browser or device.

  • Zero cloud processing
  • No data transmission
  • Local calibration storage

How It Works

01

Launch

User launches accessibility mode — camera permissions requested

02

Calibrate

9-point calibration builds a personalized gaze-to-screen model

03

Gaze

User looks at a UI element — system detects and highlights it

04

Gesture

User performs pinch or air tap to confirm and activate the element

05

Action

Element activates with visual + audio feedback confirming the selection

Layer 1 — Vision Input
  • Front Camera: 120/60/30 FPS (auto)
  • Face Mesh: 468 landmarks
  • Hand Tracking: 21 landmarks
  • Head Pose: 6-DOF estimation
Layer 2 — Processing
  • Gaze Vector: pupil → direction
  • Kalman Filter: noise smoothing
  • EMA Filter: α=0.3 smoothing
  • Gesture Classifier: landmark distances
Layer 3 — Gaze Mapping
  • Calibration Model: polynomial regression
  • Screen Mapping: gaze → (x, y) coords
  • Dwell Timer: 300 ms stabilization
  • Debounce Logic: anti-jitter guard
Layer 4 — UI Interaction
  • Element Registry: bounding boxes
  • Hit Detection: gaze ∩ bbox
  • Visual Feedback: glow + highlight
  • Audio: TTS speech feedback
Layer 5 — Action Output
  • Element Activation: simulated tap/click
  • Action Log: local storage only
  • Privacy Guard: zero data egress

Technology Stack

Frontend

  • Hono + TypeScript
  • Vanilla JS
  • CSS Animations
  • Web Speech API

Vision AI

  • MediaPipe Face Mesh
  • MediaPipe Hands
  • WebGL Backend
  • WASM Processing

Signal Processing

  • Kalman Filter
  • EMA Smoothing
  • Polynomial Regression
  • Debounce Logic

Deployment

  • Cloudflare Pages
  • Edge Network
  • Zero-latency CDN
  • HTTPS Only

Performance Targets

  • System Latency: <80 ms (end-to-end response time)
  • Camera Processing: 60 FPS (real-time frame analysis)
  • Gesture Detection: <150 ms (hand gesture recognition)
  • Gaze Accuracy: >85% (post dynamic-calibration precision)


Accessibility Demo — Messaging App

Start the camera, then move your gaze over the buttons. Perform a pinch or air tap to activate.

Hi! This is the AccessEye demo. Try looking at the buttons below and performing a pinch gesture to interact.

10:30 AM

The system will highlight buttons as your gaze focuses on them. Hold your gaze for 300ms to select.

10:31 AM
Gaze at a reply, then pinch:

Overview

AccessEye provides a JavaScript API for integrating eye tracking and gesture control into any web application. All processing runs client-side using MediaPipe WebGL workers.

Privacy First: No video data ever leaves the device. All ML inference runs in WebAssembly/WebGL workers in the browser.

Initialization

// Initialize the AccessEye system
const eye = new AccessEye({
  videoElement: document.getElementById('camera'),
  overlayCanvas: document.getElementById('overlay'),
  dwellTime: 300,      // ms before element focuses
  smoothing: 0.3,      // EMA alpha (0-1)
  useKalman: true,     // Kalman filter enabled
  audioFeedback: true,  // Web Speech API TTS
  debug: false
});

await eye.initialize();
await eye.startCamera();

Register UI Elements

Register interactive elements to make them gaze-targetable:

// Register a single element
eye.registerElement({
  id: 'sendButton',
  element: document.getElementById('send-btn'),
  label: 'Send Message',    // TTS label
  onActivate: () => sendMessage()
});

// Or register multiple at once
eye.registerElements([
  { id: 'sendBtn',  x: 200, y: 400, width: 120, height: 60,
    label: 'Send', onActivate: () => send() },
  { id: 'menuIcon', x: 20,  y: 40,  width: 40,  height: 40,
    label: 'Menu', onActivate: () => openMenu() }
]);

// Remove element
eye.unregisterElement('sendButton');

Calibration System

// Run 5-point calibration flow
const result = await eye.calibrate({
  points: 5,          // 5-point grid
  samplesPerPoint: 30, // frames to average
  timeout: 10000      // max 10s
});

// Calibration result
// { success: true, accuracy: 92.3, model: [...] }

// Save calibration (localStorage)
eye.saveCalibration();

// Load saved calibration
eye.loadCalibration();

// Calibration points schema
// TopLeft(10%,10%), TopRight(90%,10%),
// Center(50%,50%), BottomLeft(10%,90%),
// BottomRight(90%,90%)
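
For intuition, the sketch below fits an affine (first-order polynomial) map from raw gaze coordinates to one screen axis by least squares. The real calibration model may include higher-order terms; `solve3`, `fitAxis`, and the sample shape are illustrative names:

```typescript
// Sketch: least-squares fit of screen = a*gx + b*gy + c from calibration samples.
type Sample = { gx: number; gy: number; sx: number; sy: number };

// Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting.
function solve3(m: number[][], rhs: number[]): number[] {
  const a = m.map((row, i) => [...row, rhs[i]]);
  for (let col = 0; col < 3; col++) {
    let piv = col;
    for (let r = col + 1; r < 3; r++)
      if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
    [a[col], a[piv]] = [a[piv], a[col]];
    for (let r = 0; r < 3; r++) {
      if (r === col) continue;
      const f = a[r][col] / a[col][col];
      for (let c = col; c < 4; c++) a[r][c] -= f * a[col][c];
    }
  }
  return a.map((row, i) => row[3] / row[i][i]);
}

// Fit one screen axis; `out` selects sx or sy from each sample.
// Builds the normal equations (A^T A) x = A^T b and solves them.
function fitAxis(samples: Sample[], out: (s: Sample) => number): number[] {
  const ata = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
  const atb = [0, 0, 0];
  for (const s of samples) {
    const row = [s.gx, s.gy, 1];
    for (let i = 0; i < 3; i++) {
      atb[i] += row[i] * out(s);
      for (let j = 0; j < 3; j++) ata[i][j] += row[i] * row[j];
    }
  }
  return solve3(ata, atb); // [a, b, c] such that a*gx + b*gy + c ≈ screen
}
```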

Event System

// Listen for gaze events
eye.on('gaze', ({ x, y, confidence }) => {
  console.log(`Gaze at ${x}, ${y}`);
});

// Element focused (gaze entered + dwell met)
eye.on('focus', ({ elementId, label }) => {
  console.log(`Focused: ${label}`);
});

// Element activated (gesture confirmed)
eye.on('activate', ({ elementId, gesture }) => {
  console.log(`Activated via ${gesture}`);
});

// Gesture detected
eye.on('gesture', ({ type, confidence }) => {
  // type: 'pinch' | 'airTap' | 'openPalm'
});

Gaze Engine Internals

Component | Method | Description
Pupil Detection | Face Mesh iris landmarks 468–472 | Left/right iris center coords
Gaze Vector | Head pose + iris offset | 3D direction from the eye
Kalman Filter | 2-state Kalman (position + velocity) | Removes jitter noise
EMA Smoothing | α=0.3 per axis | Temporal smoothing
Screen Mapping | Polynomial regression | Calibrated gaze → screen coords
Dwell Timer | 300 ms window | Guards against accidental selection
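
The 2-state Kalman stage can be sketched for a single axis as below. The process noise `q` and measurement noise `r` values are illustrative assumptions, not the tuned production parameters:

```typescript
// Sketch: constant-velocity (position + velocity) Kalman filter, one axis.
// Predict with x' = x + v*dt, then correct with the measured position z.
class Kalman1D {
  private x = 0; // position estimate
  private v = 0; // velocity estimate
  private p: number[][] = [[1, 0], [0, 1]]; // 2x2 error covariance
  constructor(private q = 0.01, private r = 4) {}

  update(z: number, dt: number): number {
    // Predict: P' = A P A^T + Q with A = [[1, dt], [0, 1]]
    this.x += this.v * dt;
    const [[p00, p01], [p10, p11]] = this.p;
    const a = p00 + dt * (p10 + p01) + dt * dt * p11 + this.q;
    const b = p01 + dt * p11;
    const c = p10 + dt * p11;
    const d = p11 + this.q;
    // Correct with measurement z (H = [1, 0])
    const s = a + this.r;          // innovation covariance
    const k0 = a / s;              // Kalman gain (position)
    const k1 = c / s;              // Kalman gain (velocity)
    const innov = z - this.x;
    this.x += k0 * innov;
    this.v += k1 * innov;
    this.p = [
      [(1 - k0) * a, (1 - k0) * b],
      [c - k1 * a, d - k1 * b],
    ];
    return this.x;
  }
}
```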

Gesture Recognition

Gesture | Detection Logic | Action | Debounce
Pinch | Thumb–index distance <0.05 (normalized) | Select / Click | 500 ms
Air Tap | Index fingertip forward Z-delta >0.04 within 150 ms | Click | 600 ms
Open Palm | Average finger spread >0.08 | Cancel / Back | 800 ms
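
The debounce windows above can be enforced with a small per-gesture timer. The `GestureDebouncer` class below is an illustrative sketch, not the engine's actual implementation:

```typescript
// Sketch: per-gesture debouncing using the windows from the table above.
const DEBOUNCE_MS: Record<string, number> = {
  pinch: 500,
  airTap: 600,
  openPalm: 800,
};

class GestureDebouncer {
  private lastFired: Record<string, number> = {};

  // Returns true if the gesture should fire, false while still in its window.
  fire(type: string, now: number): boolean {
    const windowMs = DEBOUNCE_MS[type] ?? 500;
    const last = this.lastFired[type];
    if (last !== undefined && now - last < windowMs) return false;
    this.lastFired[type] = now;
    return true;
  }
}
```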

Testing Plan

Glasses Users

Test with thick-frame glasses and anti-reflective coatings. Adjust iris detection threshold for glare compensation.

✓ Supported

Low Lighting

Test at <100 lux. MediaPipe Face Mesh remains robust down to 50 lux with a confidence threshold of 0.6.

✓ Supported

Slow Head Movement

EMA + Kalman smoothing accommodates users with slow head tremor; the dwell window is extended to 400ms for motor impairments.

✓ Supported

Tremor Conditions

Kalman velocity state dampens high-frequency tremor. Gaze stabilization window set to 300–400ms.

✓ Supported