Accessibility Technology

Control Your Phone With
Eyes & Gestures

A production-ready system for touchless interaction using hybrid gaze estimation, 6-DOF head pose, temporal stabilization, micro-saccade filtering, dynamic calibration & AI intent prediction — 100% on-device, no cloud, no neural implants.

<80ms Latency · 60 FPS Camera · 9-Point Calibration · 100% On-Device
Real-time Gaze Tracking

System Layers

Vision Input Layer

MediaPipe Face Mesh + Hands captures eye position, head pose, and hand landmarks at up to 120 FPS, with multi-point iris and eyelid contours.

  • Pupil + iris detection
  • Head pose estimation
  • 21-point hand landmarks

Gaze Mapping Engine

Hybrid gaze model fuses binocular iris offset, head pose vector, and pupil boundary with temporal filtering to eliminate jitter.

  • Adaptive Kalman + EMA + window
  • Micro-saccade filter (12px/200ms)
  • Confidence-weighted fusion
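The temporal-stabilization stage above can be sketched as an EMA smoother behind a micro-saccade gate: displacements under 12px within 200ms are treated as physiological jitter and held, everything else is smoothed with α = 0.3. This is an illustrative sketch — class and field names are assumptions, not the actual engine code.

```typescript
type GazePoint = { x: number; y: number; t: number }; // t in ms

const ALPHA = 0.3;             // EMA smoothing factor
const SACCADE_PX = 12;         // micro-saccade displacement threshold
const SACCADE_WINDOW_MS = 200; // micro-saccade time window

class GazeStabilizer {
  private smoothed: GazePoint | null = null;
  private lastAccepted: GazePoint | null = null;

  update(raw: GazePoint): GazePoint {
    // Gate: discard micro-saccades (small, fast displacements)
    if (this.lastAccepted) {
      const dist = Math.hypot(raw.x - this.lastAccepted.x, raw.y - this.lastAccepted.y);
      const dt = raw.t - this.lastAccepted.t;
      if (dist < SACCADE_PX && dt < SACCADE_WINDOW_MS) {
        return this.smoothed ?? raw; // hold the current stabilized point
      }
    }
    this.lastAccepted = raw;

    // Exponential moving average per axis
    if (!this.smoothed) {
      this.smoothed = { ...raw };
    } else {
      this.smoothed = {
        x: ALPHA * raw.x + (1 - ALPHA) * this.smoothed.x,
        y: ALPHA * raw.y + (1 - ALPHA) * this.smoothed.y,
        t: raw.t,
      };
    }
    return this.smoothed;
  }
}
```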

UI Target Detection

Registers interactive components as bounding boxes. Detects gaze intersection with 300ms dwell time.

  • Bounding box registry
  • Dwell-time focus
  • Visual glow feedback
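The registry-plus-dwell mechanism can be sketched as a point-in-bounding-box test with a 300ms timer that restarts whenever gaze enters a new target. Names and the return convention are illustrative, not the real API.

```typescript
interface Target {
  id: string;
  x: number;      // bounding box origin, px
  y: number;
  width: number;
  height: number;
}

const DWELL_MS = 300; // dwell time before a target counts as focused

class DwellDetector {
  private focusId: string | null = null;
  private focusStart = 0;

  constructor(private targets: Target[]) {}

  // Returns the id of a target once gaze has dwelt on it for DWELL_MS.
  update(gx: number, gy: number, now: number): string | null {
    const hit = this.targets.find(
      (t) => gx >= t.x && gx <= t.x + t.width && gy >= t.y && gy <= t.y + t.height
    );
    if (!hit) {
      this.focusId = null; // gaze left all targets: reset
      return null;
    }
    if (hit.id !== this.focusId) {
      this.focusId = hit.id; // gaze entered a new target: restart timer
      this.focusStart = now;
      return null;
    }
    return now - this.focusStart >= DWELL_MS ? hit.id : null;
  }
}
```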

Gesture Engine

MediaPipe Hands landmarks power pinch detection, air tap recognition, and open palm cancel.

  • Pinch → Select
  • Air tap → Click
  • Open palm → Cancel
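Classification from normalized hand landmarks can be sketched as simple distance checks. The pinch threshold (<0.05 thumb–index distance) matches the gesture table later in this document; landmark indices 4 and 8 are MediaPipe Hands' standard thumb-tip and index-tip points. The open-palm rule here is one plausible reading of "average finger spread > 0.08", not the engine's exact logic.

```typescript
type Landmark = { x: number; y: number; z: number };

const PINCH_THRESHOLD = 0.05; // normalized thumb–index distance
const SPREAD_THRESHOLD = 0.08; // average adjacent-fingertip distance

function dist(a: Landmark, b: Landmark): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// landmarks: the 21-point hand landmark array from MediaPipe Hands
function isPinch(landmarks: Landmark[]): boolean {
  return dist(landmarks[4], landmarks[8]) < PINCH_THRESHOLD; // thumb tip vs index tip
}

function isOpenPalm(landmarks: Landmark[]): boolean {
  const tips = [8, 12, 16, 20]; // index, middle, ring, pinky fingertips
  let total = 0;
  for (let i = 0; i < tips.length - 1; i++) {
    total += dist(landmarks[tips[i]], landmarks[tips[i + 1]]);
  }
  return total / (tips.length - 1) > SPREAD_THRESHOLD;
}
```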

Calibration System

9-point calibration + continuous dynamic micro-calibration from confirmed interactions keeps gaze drift corrected over time.

  • 9-point polynomial regression
  • Dynamic bias drift correction
  • Confidence-gated updates
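Dynamic micro-calibration can be sketched as follows: each gesture-confirmed interaction yields an error sample (element center minus estimated gaze), and a confidence-gated EMA step nudges a persistent drift bias. The gate and learning-rate constants below are illustrative assumptions.

```typescript
const MIN_CONFIDENCE = 0.7; // ignore low-confidence gaze samples
const LEARN_RATE = 0.1;     // EMA step toward the observed error

class DriftCorrector {
  biasX = 0;
  biasY = 0;

  // Called when the user confirms a selection via gesture, giving us a
  // ground-truth pair: where they actually looked (the target center).
  addSample(gazeX: number, gazeY: number,
            targetX: number, targetY: number,
            confidence: number): void {
    if (confidence < MIN_CONFIDENCE) return; // confidence gate
    this.biasX += LEARN_RATE * (targetX - gazeX - this.biasX);
    this.biasY += LEARN_RATE * (targetY - gazeY - this.biasY);
  }

  // Applied to every subsequent gaze estimate to correct drift.
  apply(gazeX: number, gazeY: number): { x: number; y: number } {
    return { x: gazeX + this.biasX, y: gazeY + this.biasY };
  }
}
```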

Privacy & Safety

All video processing is 100% on-device. No camera data ever leaves the browser or device.

  • Zero cloud processing
  • No data transmission
  • Local calibration storage

How It Works

01

Launch

User launches accessibility mode — camera permissions requested

02

Calibrate

9-point polynomial regression builds a personalized gaze-to-screen model

03

Gaze

User looks at a UI element — system detects and highlights it

04

Gesture

User performs pinch or air tap to confirm and activate the element

05

Action

Element activates with visual + audio feedback confirming the selection

Layer 1 — Vision Input
  • Front Camera: 120/60/30 FPS auto
  • Face Mesh: 478 landmarks (with iris refinement)
  • Hand Tracking: 21 landmarks
  • Head Pose: 6-DOF estimation

Layer 2 — Processing
  • Gaze Vector: pupil → direction
  • Kalman Filter: noise smoothing
  • EMA Filter: α=0.3 smoothing
  • Gesture Classifier: landmark distances

Layer 3 — Gaze Mapping
  • Calibration Model: polynomial regression
  • Screen Mapping: gaze → (x,y) coords
  • Dwell Timer: 300ms stabilization
  • Debounce Logic: anti-jitter guard

Layer 4 — UI Interaction
  • Element Registry: bounding boxes
  • Hit Detection: gaze ∩ bbox
  • Visual Feedback: glow + highlight
  • Audio TTS: speech feedback

Layer 5 — Action Output
  • Element Activation: simulated tap/click
  • Action Log: local storage only
  • Privacy Guard: zero data egress

Technology Stack

Frontend

Hono + TypeScript · Vanilla JS · CSS Animations · Web Speech API

Vision AI

MediaPipe Face Mesh · MediaPipe Hands · WebGL Backend · WASM Processing

Signal Processing

Kalman Filter · EMA Smoothing · Polynomial Regression · Debounce Logic

Deployment

Cloudflare Pages · Edge Network · Low-latency CDN · HTTPS Only

Performance Targets

  • <80ms System Latency: end-to-end response time
  • 60 FPS Camera Processing: real-time frame analysis
  • <150ms Gesture Detection: hand gesture recognition
  • >85% Gaze Accuracy: precision after dynamic calibration

Camera Feed & Detection Status

Live demo panel: camera feed (click Start to begin), active-mode indicator, and detection status for face, gaze, hand, and gesture, with live readouts for FPS, latency, confidence, gaze X/Y, and the current target.

Accessibility Demo — Messaging App

Start camera, then move your gaze over buttons. Perform pinch or air tap to activate.

Hi! This is the AccessEye demo. Try looking at the buttons below and performing a pinch gesture to interact.

10:30 AM

The system will highlight buttons as your gaze focuses on them. Hold your gaze for 300ms to select.

10:31 AM
Gaze at a quick reply or toolbar action, then pinch to activate it; every event is recorded in the interaction log.
Phase 2 — Gaze Engine

Diagnostics panel: gaze confidence, lighting conditions (brightness, occlusion, glare), latency, 6-DOF head pose (yaw / pitch / roll), saccade and fixation classification via I-VT (velocity in px/frame), AI intent prediction, adaptive dwell timer (preset: Normal, 300ms), dynamic calibration (micro-sample count and X/Y drift bias), pipeline benchmark, and PACE passive recalibration.
Tools

Gaze Calibration Layer
Optional post-processing layer. OFF by default — Phase 2 handles all calibration. Enable only if the cursor drifts after Phase 2 calibration.

Voice Navigation
Say a button name or "click" to activate the focused element.

Snap-To & Targeting
Snap-To Mode auto-snaps the cursor to the nearest interactive element; Auto Dwell-Click activates the element after the dwell time. Snap radius and dwell duration are adjustable.

Cursor Sensitivity
Adjusts how far the cursor travels per unit of eye movement (0.5× = narrow, 1.0× = default, 2.0× = wide). Lower it if the cursor overshoots; raise it if screen edges are hard to reach.

Adaptive Gaze Learning
Settings auto-tune as you use the system, based on total activations.

Gesture Spam Control
Debounce time between gesture fires (800–1200ms). Higher values mean fewer accidental repeats.
Accessibility Control Mode

Live Session Stats track gaze confidence, cursor X/Y, engine state, and an interaction log with per-modality counts (gaze activations, voice commands, key navigations, intent fusion) plus the number of elements indexed.

Gaze Dwell Time
Look at any element for the set time (default 800ms) to activate it via gaze. Only runs while ACM is ON.

Active Modalities
Gaze · Voice · Fusion · Snap-To · Keys

Compliance Export
Export timestamped interaction logs as proof of WCAG / ADA / Section 508 compliance.

Voice Commands
  • "Accessibility Mode" — toggle on/off
  • "Start Dictation" — type by voice
  • "Export Log" — download CSV
  • "Show Guide" — re-show hint
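Matching spoken transcripts to the voice commands above could look like the sketch below. The browser SpeechRecognition wiring is omitted (it only runs in a browser); this shows the pure matching step, and the action names are hypothetical.

```typescript
// Registered voice commands → hypothetical action identifiers
const COMMANDS: Record<string, string> = {
  "accessibility mode": "TOGGLE_ACM",
  "start dictation": "DICTATE",
  "export log": "EXPORT_CSV",
  "show guide": "SHOW_GUIDE",
};

// Returns the action for the first command phrase found in the transcript,
// or null if nothing matches. Transcripts come from SpeechRecognition results.
function matchCommand(transcript: string): string | null {
  const text = transcript.trim().toLowerCase();
  for (const [phrase, action] of Object.entries(COMMANDS)) {
    if (text.includes(phrase)) return action;
  }
  return null;
}
```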

Gesture Studio

Create and customise hands-free gestures using hand movements and head poses. Custom gestures are always active once the camera is running.

Start the camera on the Live Demo page to enable gesture detection. Custom gestures activate automatically once recorded.
An optional RAGE-net zero-shot mode can improve accuracy further.

Overview

AccessEye provides a JavaScript API for integrating eye tracking and gesture control into any web application. All processing runs client-side using MediaPipe WebGL workers.

Privacy First: No video data ever leaves the device. All ML inference runs in WebAssembly/WebGL workers in the browser.

Initialization

// Initialize the AccessEye system
const eye = new AccessEye({
  videoElement: document.getElementById('camera'),
  overlayCanvas: document.getElementById('overlay'),
  dwellTime: 300,      // ms before element focuses
  smoothing: 0.3,      // EMA alpha (0-1)
  useKalman: true,     // Kalman filter enabled
  audioFeedback: true,  // Web Speech API TTS
  debug: false
});

await eye.initialize();
await eye.startCamera();

Register UI Elements

Register interactive elements to make them gaze-targetable:

// Register a single element
eye.registerElement({
  id: 'sendButton',
  element: document.getElementById('send-btn'),
  label: 'Send Message',    // TTS label
  onActivate: () => sendMessage()
});

// Or register multiple at once
eye.registerElements([
  { id: 'sendBtn',  x: 200, y: 400, width: 120, height: 60,
    label: 'Send', onActivate: () => send() },
  { id: 'menuIcon', x: 20,  y: 40,  width: 40,  height: 40,
    label: 'Menu', onActivate: () => openMenu() }
]);

// Remove element
eye.unregisterElement('sendButton');

Calibration System

// Run 9-point calibration flow
const result = await eye.calibrate({
  points: 9,           // 9-point polynomial regression grid
  samplesPerPoint: 30, // frames to average per point
  timeout: 10000       // abort after 10s
});

// Calibration result
// { success: true, accuracy: 92.3, model: [...] }

// Save calibration (localStorage)
eye.saveCalibration();

// Load saved calibration
eye.loadCalibration();

// Calibration point schema (9-point grid):
// TopLeft(10%,10%), TopCenter(50%,10%), TopRight(90%,10%),
// MidLeft(10%,50%), Center(50%,50%), MidRight(90%,50%),
// BottomLeft(10%,90%), BottomCenter(50%,90%), BottomRight(90%,90%)
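The polynomial mapping behind calibrate() can be sketched as a least-squares quadratic fit: each calibration point pairs a raw gaze estimate with a known screen position, and the fitted weights map gaze to screen coordinates (one fit per axis). This is an illustrative implementation under assumed feature set and solver, not the library's internals.

```typescript
type Vec = number[];

// Quadratic feature vector for a raw gaze sample
function features(gx: number, gy: number): Vec {
  return [1, gx, gy, gx * gx, gx * gy, gy * gy];
}

// Solve A·w = b via Gauss-Jordan elimination with partial pivoting
function solve(A: number[][], b: Vec): Vec {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++)
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    [M[col], M[pivot]] = [M[pivot], M[col]];
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  return M.map((row, i) => row[n] / row[i]); // diagonal is at row[i]
}

// Fit one output axis: weights w such that features(gx, gy) · w ≈ screen coord s
function fitAxis(samples: { gx: number; gy: number; s: number }[]): Vec {
  const k = 6;
  const AtA = Array.from({ length: k }, () => new Array(k).fill(0));
  const Atb: Vec = new Array(k).fill(0);
  for (const { gx, gy, s } of samples) {
    const f = features(gx, gy);
    for (let i = 0; i < k; i++) {
      Atb[i] += f[i] * s; // accumulate normal equations AᵀA w = Aᵀb
      for (let j = 0; j < k; j++) AtA[i][j] += f[i] * f[j];
    }
  }
  return solve(AtA, Atb);
}

function predict(w: Vec, gx: number, gy: number): number {
  return features(gx, gy).reduce((acc, f, i) => acc + f * w[i], 0);
}
```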

Event System

// Listen for gaze events
eye.on('gaze', ({ x, y, confidence }) => {
  console.log(`Gaze at ${x}, ${y}`);
});

// Element focused (gaze entered + dwell met)
eye.on('focus', ({ elementId, label }) => {
  console.log(`Focused: ${label}`);
});

// Element activated (gesture confirmed)
eye.on('activate', ({ elementId, gesture }) => {
  console.log(`Activated via ${gesture}`);
});

// Gesture detected
eye.on('gesture', ({ type, confidence }) => {
  // type: 'pinch' | 'airTap' | 'openPalm'
});

Gaze Engine Internals

Component | Method | Description
Pupil Detection | Face Mesh iris landmarks 468–477 | Left/right iris center coords
Gaze Vector | Head pose + iris offset | 3D direction from eye
Kalman Filter | 2-state Kalman (position + velocity) | Removes jitter noise
EMA Smoothing | α=0.3 per axis | Temporal smoothing
Screen Mapping | Polynomial regression | Calibrated gaze → screen coords
Dwell Timer | 300ms window | Prevents accidental selection
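The per-axis 2-state (position + velocity) Kalman filter from the table above can be sketched as follows. The process and measurement noise values are illustrative assumptions; the real engine tunes them differently.

```typescript
class Kalman1D {
  private x = [0, 0];              // state: [position, velocity]
  private P = [[1, 0], [0, 1]];    // state covariance

  constructor(private q = 0.01, private r = 4) {} // process / measurement noise

  update(z: number, dt: number): number {
    // Predict with a constant-velocity model: F = [[1, dt], [0, 1]]
    const [p, v] = this.x;
    const xp = [p + v * dt, v];
    const [[p00, p01], [p10, p11]] = this.P;
    const Pp = [ // P' = F P Fᵀ + Q
      [p00 + dt * (p10 + p01) + dt * dt * p11 + this.q, p01 + dt * p11],
      [p10 + dt * p11, p11 + this.q],
    ];
    // Update with position measurement z (H = [1, 0])
    const S = Pp[0][0] + this.r;           // innovation covariance
    const K = [Pp[0][0] / S, Pp[1][0] / S]; // Kalman gain
    const y = z - xp[0];                    // innovation
    this.x = [xp[0] + K[0] * y, xp[1] + K[1] * y];
    this.P = [ // P = (I - K H) P'
      [(1 - K[0]) * Pp[0][0], (1 - K[0]) * Pp[0][1]],
      [Pp[1][0] - K[1] * Pp[0][0], Pp[1][1] - K[1] * Pp[0][1]],
    ];
    return this.x[0]; // filtered position
  }
}
```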

Gesture Recognition

Gesture | Detection Logic | Action | Debounce
Pinch | Thumb–index distance <0.05 (normalized) | Select / Click | 500ms
Air Tap | Index fingertip forward Z-delta >0.04 within 150ms | Click | 600ms
Open Palm | Average finger spread >0.08 | Cancel / Back | 800ms
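The per-gesture debounce column above can be sketched as a guard that only lets a gesture fire once its window has elapsed since the last fire. Names are illustrative.

```typescript
// Debounce windows per gesture, matching the table above (ms)
const DEBOUNCE_MS: Record<string, number> = {
  pinch: 500,
  airTap: 600,
  openPalm: 800,
};

class GestureDebouncer {
  private lastFired: Record<string, number> = {};

  // Returns true (and records the fire) only if the gesture's
  // debounce window has elapsed since it last fired.
  tryFire(gesture: string, now: number): boolean {
    const last = this.lastFired[gesture] ?? -Infinity;
    if (now - last < (DEBOUNCE_MS[gesture] ?? 500)) return false;
    this.lastFired[gesture] = now;
    return true;
  }
}
```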

Testing Plan

Glasses Users

Test with thick-frame glasses and anti-reflective coatings. Adjust iris detection threshold for glare compensation.

✓ Supported

Low Lighting

Test at <100 lux. MediaPipe Face Mesh remains usable down to roughly 50 lux with a detection confidence threshold of 0.6.

✓ Supported

Slow Head Movement

EMA + Kalman smoothing compensates for slow, involuntary head movement. The dwell window is extended to 400ms for users with motor impairments.

✓ Supported

Tremor Conditions

Kalman velocity state dampens high-frequency tremor. Gaze stabilization window set to 300–400ms.

✓ Supported
⚙ Gaze Diagnostics

Debug overlay streaming raw vs. mapped gaze values (rawGX/rawGY → screenX/screenY), pupil coordinates (px/py), confidence, calibration state, head-pose yaw/pitch, pipeline phase, and FPS, plus direction and axis-range readouts for verifying axes.