Accessibility Technology

Control Your Phone With Eyes & Gestures

A production-ready system for touchless interaction using hybrid gaze estimation, 6-DOF head pose, temporal stabilization, micro-saccade filtering, dynamic calibration & AI intent prediction — 100% on-device, no cloud, no neural implants.

<80ms Latency · 60 FPS Camera · 9-Point Calibration · 100% On-Device
Real-time Gaze Tracking

System Layers

Vision Input Layer

MediaPipe Face Mesh + Hands captures eye position, head pose, and hand landmarks at 60/120 FPS with multi-point iris + eyelid contours.

  • Pupil + iris detection
  • Head pose estimation
  • 21-point hand landmarks
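
The iris bullets above reduce to averaging a handful of mesh points into one pupil coordinate. A minimal sketch, assuming MediaPipe Face Mesh with iris refinement, where indices 468–472 are the left-iris points; the function name is illustrative:

```javascript
// Average the left-iris landmarks into a single pupil center.
// Landmark layout assumes the refined Face Mesh model, where
// indices 468-472 are the left-iris center + 4 boundary points.
const LEFT_IRIS = [468, 469, 470, 471, 472];

function irisCenter(landmarks, indices) {
  // landmarks: array of { x, y } in normalized [0, 1] image coordinates
  let sx = 0, sy = 0;
  for (const i of indices) {
    sx += landmarks[i].x;
    sy += landmarks[i].y;
  }
  return { x: sx / indices.length, y: sy / indices.length };
}
```

The right iris works the same way with indices 473–477.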

Gaze Mapping Engine

Hybrid gaze model fuses binocular iris offset, head pose vector, and pupil boundary with temporal filtering to eliminate jitter.

  • Adaptive Kalman + EMA + window
  • Micro-saccade filter (12px/200ms)
  • Confidence-weighted fusion
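
A minimal sketch of the stabilization stage: an EMA smoother plus a dead-band that ignores micro-saccade-sized excursions, using the 12px/200ms figures from the bullets above. The class structure and the 2× escape threshold are illustrative assumptions, not the shipped filter:

```javascript
// EMA smoothing + micro-saccade dead-band (sketch).
// Small excursions (< deadBandPx) are held in place; a move only
// passes through if it is sustained past holdMs or clearly large.
class GazeStabilizer {
  constructor({ alpha = 0.3, deadBandPx = 12, holdMs = 200 } = {}) {
    this.alpha = alpha;
    this.deadBandPx = deadBandPx;
    this.holdMs = holdMs;
    this.smoothed = null;        // EMA state
    this.held = null;            // last emitted point
    this.excursionStart = null;  // when gaze first left the dead band
  }
  update(x, y, tMs) {
    // 1. EMA smoothing
    if (this.smoothed === null) this.smoothed = { x, y };
    else {
      this.smoothed.x = this.alpha * x + (1 - this.alpha) * this.smoothed.x;
      this.smoothed.y = this.alpha * y + (1 - this.alpha) * this.smoothed.y;
    }
    const s = this.smoothed;
    if (this.held === null) { this.held = { x: s.x, y: s.y }; return { ...this.held }; }
    const dist = Math.hypot(s.x - this.held.x, s.y - this.held.y);
    if (dist < this.deadBandPx) {
      this.excursionStart = null;          // inside dead band: hold position
    } else {
      if (this.excursionStart === null) this.excursionStart = tMs;
      if (tMs - this.excursionStart >= this.holdMs || dist >= 2 * this.deadBandPx) {
        this.held = { x: s.x, y: s.y };    // sustained or large move: follow
        this.excursionStart = null;
      }
    }
    return { ...this.held };
  }
}
```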

UI Target Detection

Registers interactive components as bounding boxes. Detects gaze intersection with 300ms dwell time.

  • Bounding box registry
  • Dwell-time focus
  • Visual glow feedback
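
The registry and dwell logic above can be sketched as a bounding-box map plus a timer; names and structure are illustrative:

```javascript
// Bounding-box registry with 300ms dwell-time focus (sketch).
class DwellDetector {
  constructor(dwellMs = 300) {
    this.dwellMs = dwellMs;
    this.targets = new Map();  // id -> { x, y, width, height }
    this.current = null;       // id the gaze is currently resting on
    this.enteredAt = null;
  }
  register(id, box) { this.targets.set(id, box); }
  hitTest(x, y) {
    for (const [id, b] of this.targets) {
      if (x >= b.x && x <= b.x + b.width && y >= b.y && y <= b.y + b.height) return id;
    }
    return null;
  }
  // Returns the focused element id once dwell time is met, else null.
  update(x, y, tMs) {
    const id = this.hitTest(x, y);
    if (id !== this.current) {
      this.current = id;
      this.enteredAt = id ? tMs : null;
      return null;
    }
    if (id && tMs - this.enteredAt >= this.dwellMs) return id;
    return null;
  }
}
```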

Gesture Engine

MediaPipe Hands landmarks power pinch detection, air tap recognition, and open palm cancel.

  • Pinch → Select
  • Air tap → Click
  • Open palm → Cancel
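
The pinch mapping can be sketched directly from the 21-point hand model, where index 4 is the thumb tip and index 8 the index fingertip; the 0.05 normalized-distance threshold matches the gesture table later in this document:

```javascript
// Pinch = thumb tip and index fingertip closer than a normalized
// distance threshold. Landmark indices follow MediaPipe Hands.
function isPinch(landmarks, threshold = 0.05) {
  const thumb = landmarks[4];
  const index = landmarks[8];
  const d = Math.hypot(thumb.x - index.x, thumb.y - index.y, thumb.z - index.z);
  return d < threshold;
}
```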

Calibration System

9-point calibration + continuous dynamic micro-calibration from confirmed interactions keeps gaze drift corrected over time.

  • 9-point polynomial regression
  • Dynamic bias drift correction
  • Confidence-gated updates
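
The drift-correction bullets can be sketched as a confidence-gated bias update driven by confirmed interactions: each time the user activates an element, the gap between the gaze point and the element's center nudges a bias term. The gain and gate values here are illustrative assumptions:

```javascript
// Dynamic micro-calibration sketch: bias drifts toward the observed
// gaze error, but only when tracking confidence clears the gate.
class DriftCorrector {
  constructor({ gain = 0.1, minConfidence = 0.8 } = {}) {
    this.gain = gain;
    this.minConfidence = minConfidence;
    this.bias = { x: 0, y: 0 };
  }
  // Called on every confirmed activation.
  observe(gaze, elementCenter, confidence) {
    if (confidence < this.minConfidence) return;  // confidence-gated update
    this.bias.x += this.gain * (elementCenter.x - gaze.x - this.bias.x);
    this.bias.y += this.gain * (elementCenter.y - gaze.y - this.bias.y);
  }
  apply(gaze) {
    return { x: gaze.x + this.bias.x, y: gaze.y + this.bias.y };
  }
}
```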

Privacy & Safety

All video processing is 100% on-device. No camera data ever leaves the browser or device.

  • Zero cloud processing
  • No data transmission
  • Local calibration storage

How It Works

01

Launch

User launches accessibility mode — camera permissions requested

02

Calibrate

9-point polynomial regression builds personalized gaze-to-screen model

03

Gaze

User looks at a UI element — system detects and highlights it

04

Gesture

User performs pinch or air tap to confirm and activate the element

05

Action

Element activates with visual + audio feedback confirming the selection

Layer 1 — Vision Input
  • Front Camera — 120/60/30 FPS auto
  • Face Mesh — 468 landmarks
  • Hand Tracking — 21 landmarks
  • Head Pose — 6-DOF estimation

Layer 2 — Processing
  • Gaze Vector — pupil → direction
  • Kalman Filter — noise smoothing
  • EMA Filter — α=0.3 smoothing
  • Gesture Classifier — landmark distances

Layer 3 — Gaze Mapping
  • Calibration Model — polynomial regression
  • Screen Mapping — gaze → (x, y) coords
  • Dwell Timer — 300ms stabilization
  • Debounce Logic — anti-jitter guard

Layer 4 — UI Interaction
  • Element Registry — bounding boxes
  • Hit Detection — gaze ∩ bbox
  • Visual Feedback — glow + highlight
  • Audio TTS — speech feedback

Layer 5 — Action Output
  • Element Activation — simulated tap/click
  • Action Log — local storage only
  • Privacy Guard — zero data egress

Technology Stack

Frontend

Hono + TypeScript · Vanilla JS · CSS Animations · Web Speech API

Vision AI

MediaPipe Face Mesh · MediaPipe Hands · WebGL Backend · WASM Processing

Signal Processing

Kalman Filter · EMA Smoothing · Polynomial Regression · Debounce Logic

Deployment

Cloudflare Pages · Edge Network · Zero-latency CDN · HTTPS Only

Performance Targets

  • <80ms — System Latency — end-to-end response time
  • 60 FPS — Camera Processing — real-time frame analysis
  • <150ms — Gesture Detection — hand gesture recognition
  • >85% — Gaze Accuracy — post dynamic-calibration precision


Accessibility Demo — Messaging App

Start camera, then move your gaze over buttons. Perform pinch or air tap to activate.

Hi! This is the AccessEye demo. Try looking at the buttons below and performing a pinch gesture to interact.

10:30 AM

The system will highlight buttons as your gaze focuses on them. Hold your gaze for 300ms to select.

10:31 AM

Overview

AccessEye provides a JavaScript API for integrating eye tracking and gesture control into any web application. All processing runs client-side using MediaPipe WebGL workers.

Privacy First: No video data ever leaves the device. All ML inference runs in WebAssembly/WebGL workers in the browser.

Initialization

// Initialize the AccessEye system
const eye = new AccessEye({
  videoElement: document.getElementById('camera'),
  overlayCanvas: document.getElementById('overlay'),
  dwellTime: 300,      // ms before element focuses
  smoothing: 0.3,      // EMA alpha (0-1)
  useKalman: true,     // Kalman filter enabled
  audioFeedback: true,  // Web Speech API TTS
  debug: false
});

await eye.initialize();
await eye.startCamera();

Register UI Elements

Register interactive elements to make them gaze-targetable:

// Register a single element
eye.registerElement({
  id: 'sendButton',
  element: document.getElementById('send-btn'),
  label: 'Send Message',    // TTS label
  onActivate: () => sendMessage()
});

// Or register multiple at once
eye.registerElements([
  { id: 'sendBtn',  x: 200, y: 400, width: 120, height: 60,
    label: 'Send', onActivate: () => send() },
  { id: 'menuIcon', x: 20,  y: 40,  width: 40,  height: 40,
    label: 'Menu', onActivate: () => openMenu() }
]);

// Remove element
eye.unregisterElement('sendButton');

Calibration System

// Run 9-point calibration flow
const result = await eye.calibrate({
  points: 9,           // 9-point polynomial grid
  samplesPerPoint: 30, // frames to average
  timeout: 10000       // max 10s
});

// Calibration result
// { success: true, accuracy: 92.3, model: [...] }

// Save calibration (localStorage)
eye.saveCalibration();

// Load saved calibration
eye.loadCalibration();

// Calibration points schema (3×3 grid)
// TopLeft(10%,10%), TopCenter(50%,10%), TopRight(90%,10%),
// MidLeft(10%,50%), Center(50%,50%), MidRight(90%,50%),
// BottomLeft(10%,90%), BottomCenter(50%,90%), BottomRight(90%,90%)
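
For intuition, here is the degree-1 (affine) special case of the screen-mapping fit, solved per axis in closed form by least squares; the full 9-point model would add higher-order polynomial terms on top of this. Function names are illustrative:

```javascript
// Least-squares fit of target = a * raw + b for one axis.
function fitAxis(raw, target) {
  const n = raw.length;
  const mean = (v) => v.reduce((s, x) => s + x, 0) / n;
  const mr = mean(raw), mt = mean(target);
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    num += (raw[i] - mr) * (target[i] - mt);
    den += (raw[i] - mr) ** 2;
  }
  const a = num / den;
  return { a, b: mt - a * mr };
}

// Apply the fitted per-axis model to a raw gaze sample.
function mapGaze(model, gx, gy) {
  return { x: model.x.a * gx + model.x.b, y: model.y.a * gy + model.y.b };
}
```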

Event System

// Listen for gaze events
eye.on('gaze', ({ x, y, confidence }) => {
  console.log(`Gaze at ${x}, ${y}`);
});

// Element focused (gaze entered + dwell met)
eye.on('focus', ({ elementId, label }) => {
  console.log(`Focused: ${label}`);
});

// Element activated (gesture confirmed)
eye.on('activate', ({ elementId, gesture }) => {
  console.log(`Activated via ${gesture}`);
});

// Gesture detected
eye.on('gesture', ({ type, confidence }) => {
  // type: 'pinch' | 'airTap' | 'openPalm'
});

Gaze Engine Internals

  • Pupil Detection — Face Mesh iris landmarks 468–477 — left/right iris center coords
  • Gaze Vector — head pose + iris offset — 3D direction from eye
  • Kalman Filter — 2-state Kalman (pos + vel) — removes jitter noise
  • EMA Smoothing — α=0.3 per axis — temporal smoothing
  • Screen Mapping — polynomial regression — calibrated gaze → screen coords
  • Dwell Timer — 300ms window — anti-accidental selection
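
The two-state filter in the table can be sketched per axis as a constant-velocity Kalman filter. The process and measurement noise values (q, r) below are illustrative, not the tuned production constants:

```javascript
// 1D Kalman filter with state [position, velocity] and a
// position-only measurement (H = [1, 0]).
class Kalman1D {
  constructor({ q = 0.01, r = 4 } = {}) {
    this.q = q; this.r = r;        // process / measurement noise
    this.x = [0, 0];               // [position, velocity]
    this.P = [[1, 0], [0, 1]];     // state covariance
    this.initialized = false;
  }
  update(z, dt = 1 / 60) {
    if (!this.initialized) { this.x[0] = z; this.initialized = true; return z; }
    // Predict with the constant-velocity model
    const xp = this.x[0] + dt * this.x[1];
    const vp = this.x[1];
    const P = this.P, q = this.q;
    const P00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q;
    const P01 = P[0][1] + dt * P[1][1];
    const P10 = P[1][0] + dt * P[1][1];
    const P11 = P[1][1] + q;
    // Correct with the position measurement z
    const S = P00 + this.r;
    const k0 = P00 / S, k1 = P10 / S;
    const innov = z - xp;
    this.x = [xp + k0 * innov, vp + k1 * innov];
    this.P = [
      [(1 - k0) * P00, (1 - k0) * P01],
      [P10 - k1 * P00, P11 - k1 * P01],
    ];
    return this.x[0];
  }
}
```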

Gesture Recognition

  • Pinch — thumb–index distance < 0.05 (normalized) — Select / Click — 500ms debounce
  • Air Tap — index fingertip forward z-delta > 0.04 within 150ms — Click — 600ms debounce
  • Open Palm — average finger spread > 0.08 — Cancel / Back — 800ms debounce
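
As a sketch, the air-tap row above (forward z-delta over a short window, with a debounce guard) might look like the following; the class structure and the camera-facing sign convention are assumptions:

```javascript
// Air-tap detector: fires when the index fingertip moves forward by
// more than zDelta within windowMs, at most once per debounceMs.
class AirTapDetector {
  constructor({ zDelta = 0.04, windowMs = 150, debounceMs = 600 } = {}) {
    this.zDelta = zDelta; this.windowMs = windowMs; this.debounceMs = debounceMs;
    this.samples = [];          // recent { z, t } of the index fingertip
    this.lastFired = -Infinity;
  }
  update(indexTipZ, tMs) {
    this.samples.push({ z: indexTipZ, t: tMs });
    // keep only samples inside the detection window
    while (this.samples.length && tMs - this.samples[0].t > this.windowMs) {
      this.samples.shift();
    }
    if (tMs - this.lastFired < this.debounceMs) return false;  // debounce guard
    const maxZ = Math.max(...this.samples.map((s) => s.z));
    // MediaPipe z decreases toward the camera, so a forward tap is a drop
    if (maxZ - indexTipZ > this.zDelta) {
      this.lastFired = tMs;
      this.samples = [];
      return true;
    }
    return false;
  }
}
```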

Testing Plan

Glasses Users

Test with thick-frame glasses and anti-reflective coatings. Adjust iris detection threshold for glare compensation.

✓ Supported

Low Lighting

Test at <100 lux. MediaPipe Face Mesh remains robust down to 50 lux with a confidence threshold of 0.6.

✓ Supported

Slow Head Movement

EMA + Kalman smoothing accommodates users with slow, tremulous head movement. Dwell window extended to 400ms for motor impairments.

✓ Supported

Tremor Conditions

Kalman velocity state dampens high-frequency tremor. Gaze stabilization window set to 300–400ms.

✓ Supported
Chrome Extension — Manifest V3

AccessEye Voice Control

Control any website with your voice. Open tabs, scroll pages, navigate back and forward — all hands-free. Runs persistently in the background across every tab you visit.

Manifest V3 · Always-On Persistent · All-Site Coverage · Zero-Cloud Privacy
Download Extension ZIP · Unpack & Load in Chrome

Get Set Up in 4 Steps

1 — Download the Extension

Click the download button above to get the extension ZIP file onto your computer.

2 — Unzip the File

Extract the ZIP to a permanent folder on your computer. Don't delete this folder — Chrome needs it.

# Mac / Linux
unzip accesseye-extension.zip -d ~/AccessEye

# Windows: Right-click → Extract All

3 — Load in Chrome

Open Chrome's extension manager, enable developer mode, and load the unpacked extension folder.

  a. Toggle Developer mode ON (top-right)
  b. Click Load unpacked
  c. Select the extension/ folder inside your unzipped directory
  d. AccessEye appears in your extensions list ✓

4 — Start Listening

Click the AccessEye icon in your Chrome toolbar, then hit Start Listening. Allow microphone access when prompted. You're live.

  • Pin it to your toolbar for quick access
  • Works on every website automatically
  • Voice recognition runs in the background

All Voice Commands

Tab Control
  • "New tab" — Open + focus blank tab
  • "Next tab" — Switch right
  • "Previous tab" — Switch left
  • "Close tab" — Close current tab
  • "Go to next tab" — Same as next tab
  • "Switch tab" — Same as next tab

Scrolling
  • "Scroll down" — 600px down
  • "Scroll up" — 600px up
  • "Scroll to top" — Jump to top
  • "Scroll to bottom" — Jump to bottom
  • "Go down" — Same as scroll down
  • "Page down" — Same as scroll down

Navigation
  • "Go back" — Browser back
  • "Go forward" — Browser forward
  • "Reload" — Refresh page
  • "Refresh" — Same as reload

Interaction
  • "Click [text]" — Click element by label
  • "Press [text]" — Same as click
  • "Tap [text]" — Same as click
  • "Select [text]" — Same as click

How It Works

Mic Input — Web Speech API in an offscreen document
  ↓
Voice Engine — normalize + parse, intent mapping
  ↓
Background Worker — service worker, always running
  ↓
Chrome APIs — tabs.create(), tabs.update()
  ↓
Content Script — injected in every page for scroll/click