A production-ready system for touchless interaction using hybrid gaze estimation, 6-DOF head pose, temporal stabilization, micro-saccade filtering, dynamic calibration & AI intent prediction — 100% on-device, no cloud, no neural implants.
MediaPipe Face Mesh + Hands captures eye position, head pose, and hand landmarks at 60/120 FPS with multi-point iris + eyelid contours.
Hybrid gaze model fuses binocular iris offset, head pose vector, and pupil boundary with temporal filtering to eliminate jitter.
Registers interactive components as bounding boxes. Detects gaze intersection with 300ms dwell time.
MediaPipe Hands landmarks power pinch detection, air tap recognition, and open palm cancel.
5-point calibration + continuous dynamic micro-calibration from confirmed interactions keeps gaze drift corrected over time.
All video processing is 100% on-device. No camera data ever leaves the browser or device.
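The continuous micro-calibration above can be sketched as a feedback loop: every confirmed activation yields a ground-truth pair (predicted gaze point, center of the element the user actually selected), and a small learning rate nudges a per-axis bias toward zero residual error. This is an illustrative sketch of the idea, not the actual AccessEye internals; `DriftCorrector` and its parameters are assumptions.

```javascript
// Illustrative drift correction from confirmed interactions.
// Each confirmed activation supplies a (predicted, actual) pair;
// a small learning rate adapts a per-axis bias gradually so drift
// is corrected without destabilizing the calibration model.
class DriftCorrector {
  constructor(learningRate = 0.1) {
    this.lr = learningRate;
    this.bias = { x: 0, y: 0 };
  }

  // Called when the user confirms an element via pinch / air tap.
  observe(predicted, elementCenter) {
    this.bias.x += this.lr * (elementCenter.x - predicted.x - this.bias.x);
    this.bias.y += this.lr * (elementCenter.y - predicted.y - this.bias.y);
  }

  // Applied to every raw gaze estimate before hit-testing.
  apply(gaze) {
    return { x: gaze.x + this.bias.x, y: gaze.y + this.bias.y };
  }
}
```

With a constant 10px rightward drift, repeated confirmations converge the x-bias to +10 while the y-bias stays untouched.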
1. User launches accessibility mode — camera permissions requested
2. 5-point calibration builds personalized gaze-to-screen model
3. User looks at a UI element — system detects and highlights it
4. User performs pinch or air tap to confirm and activate the element
5. Element activates with visual + audio feedback confirming the selection
Production-ready layered architecture for on-device eye + gesture control
Start camera, then move your gaze over buttons. Perform pinch or air tap to activate.
Integration guide for registering UI elements and interacting with the AccessEye system
AccessEye provides a JavaScript API for integrating eye tracking and gesture control into any web application. All processing runs client-side using MediaPipe WebGL workers.
```javascript
// Initialize the AccessEye system
const eye = new AccessEye({
  videoElement: document.getElementById('camera'),
  overlayCanvas: document.getElementById('overlay'),
  dwellTime: 300,      // ms before element focuses
  smoothing: 0.3,      // EMA alpha (0-1)
  useKalman: true,     // Kalman filter enabled
  audioFeedback: true, // Web Speech API TTS
  debug: false
});

await eye.initialize();
await eye.startCamera();
```
Register interactive elements to make them gaze-targetable:
```javascript
// Register a single element
eye.registerElement({
  id: 'sendButton',
  element: document.getElementById('send-btn'),
  label: 'Send Message', // TTS label
  onActivate: () => sendMessage()
});

// Or register multiple at once
eye.registerElements([
  { id: 'sendBtn', x: 200, y: 400, width: 120, height: 60,
    label: 'Send', onActivate: () => send() },
  { id: 'menuIcon', x: 20, y: 40, width: 40, height: 40,
    label: 'Menu', onActivate: () => openMenu() }
]);

// Remove element
eye.unregisterElement('sendButton');
```
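Under the hood, registered bounding boxes are hit-tested against the gaze point each frame, and an element only "focuses" once the gaze has stayed inside its box for the dwell time (300ms by default). A minimal sketch of that logic, with illustrative names (`GazeDwellTracker` is not part of the AccessEye API):

```javascript
// Dwell-time hit testing: an element focuses only after the gaze
// point has remained inside its bounding box for dwellTime ms.
class GazeDwellTracker {
  constructor(dwellTime = 300) {
    this.dwellTime = dwellTime;
    this.currentId = null;  // element the gaze is currently inside
    this.enteredAt = null;  // timestamp when the gaze entered it
  }

  // Returns the id of the element whose dwell threshold is met, else null.
  update(gaze, elements, now) {
    const hit = elements.find(
      (el) =>
        gaze.x >= el.x && gaze.x <= el.x + el.width &&
        gaze.y >= el.y && gaze.y <= el.y + el.height
    );
    if (!hit) {
      this.currentId = null;
      this.enteredAt = null;
      return null;
    }
    if (hit.id !== this.currentId) {
      // Gaze entered a new element: restart the dwell timer.
      this.currentId = hit.id;
      this.enteredAt = now;
      return null;
    }
    return now - this.enteredAt >= this.dwellTime ? hit.id : null;
  }
}
```

Leaving the box or jumping to another element resets the timer, which is what prevents accidental selections during saccades.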
```javascript
// Run 5-point calibration flow
const result = await eye.calibrate({
  points: 5,           // 5-point grid
  samplesPerPoint: 30, // frames to average
  timeout: 10000       // max 10s
});

// Calibration result
// { success: true, accuracy: 92.3, model: [...] }

// Save calibration (localStorage)
eye.saveCalibration();

// Load saved calibration
eye.loadCalibration();

// Calibration points schema
// TopLeft(10%,10%), TopRight(90%,10%),
// Center(50%,50%), BottomLeft(10%,90%),
// BottomRight(90%,90%)
```
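Each calibration point contributes averaged (raw gaze, known screen position) pairs, from which the gaze-to-screen mapping is fitted. The production model is a polynomial regression; the first-order sketch below uses a per-axis linear least-squares fit to illustrate the idea. All names here are illustrative, not the AccessEye internals.

```javascript
// Fit one axis: screen = slope * raw + intercept (ordinary least squares).
function fitAxis(raw, target) {
  const n = raw.length;
  const mr = raw.reduce((a, b) => a + b, 0) / n;
  const mt = target.reduce((a, b) => a + b, 0) / n;
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    num += (raw[i] - mr) * (target[i] - mt);
    den += (raw[i] - mr) ** 2;
  }
  const slope = num / den;
  return { slope, intercept: mt - slope * mr };
}

// samples: [{ raw: {x, y}, screen: {x, y} }, ...] averaged per point.
// Returns a mapping function from raw gaze to screen coordinates.
function fitCalibration(samples) {
  const fx = fitAxis(samples.map(s => s.raw.x), samples.map(s => s.screen.x));
  const fy = fitAxis(samples.map(s => s.raw.y), samples.map(s => s.screen.y));
  return (raw) => ({
    x: fx.slope * raw.x + fx.intercept,
    y: fy.slope * raw.y + fy.intercept,
  });
}
```

A higher-order polynomial (or a 2D model with cross terms) captures the nonlinearity of real gaze data at screen edges, which is why the 5-point grid places targets at the corners.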
```javascript
// Listen for gaze events
eye.on('gaze', ({ x, y, confidence }) => {
  console.log(`Gaze at ${x}, ${y}`);
});

// Element focused (gaze entered + dwell met)
eye.on('focus', ({ elementId, label }) => {
  console.log(`Focused: ${label}`);
});

// Element activated (gesture confirmed)
eye.on('activate', ({ elementId, gesture }) => {
  console.log(`Activated via ${gesture}`);
});

// Gesture detected
eye.on('gesture', ({ type, confidence }) => {
  // type: 'pinch' | 'airTap' | 'openPalm'
});
```
| Component | Method | Description |
|---|---|---|
| Pupil Detection | Face Mesh iris landmarks 468–477 | Left/right iris center coords |
| Gaze Vector | Head pose + iris offset | 3D direction from eye |
| Kalman Filter | 2-state Kalman (pos+vel) | Removes jitter noise |
| EMA Smoothing | α=0.3 per-axis | Temporal smoothing |
| Screen Mapping | Polynomial regression | Calibrated gaze→screen coords |
| Dwell Timer | 300ms window | Anti-accidental selection |
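The EMA smoothing stage in the table can be sketched in a few lines; this is a minimal per-axis smoother matching the α = 0.3 above (the real pipeline also runs the Kalman filter, and the factory-function shape here is illustrative):

```javascript
// Per-axis exponential moving average: new = α·measured + (1−α)·previous.
// Lower α means heavier smoothing but more lag.
function makeEmaSmoother(alpha = 0.3) {
  let prev = null;
  return (point) => {
    if (prev === null) {
      prev = { ...point }; // first sample passes through unchanged
      return prev;
    }
    prev = {
      x: alpha * point.x + (1 - alpha) * prev.x,
      y: alpha * point.y + (1 - alpha) * prev.y,
    };
    return prev;
  };
}
```

With α = 0.3, a raw jump from x = 100 to x = 200 only moves the smoothed point to x = 130 on the next frame, which is what suppresses micro-saccade jitter.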
| Gesture | Detection Logic | Action | Debounce |
|---|---|---|---|
| Pinch | Thumb–index distance <0.05 (normalized) | Select / Click | 500ms |
| Air Tap | Index forward Z-delta >0.04 in 150ms | Click | 600ms |
| Open Palm | All finger spread >0.08 avg | Cancel / Back | 800ms |
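The pinch rule in the table reduces to a thresholded landmark distance plus a debounce. A sketch, assuming the MediaPipe Hands landmark convention (index 4 = thumb tip, index 8 = index fingertip, coordinates normalized to [0, 1]); `makePinchDetector` is illustrative, not the AccessEye API:

```javascript
// Pinch: normalized thumb-tip / index-tip distance below threshold,
// debounced so one sustained pinch fires a single event.
function makePinchDetector(threshold = 0.05, debounceMs = 500) {
  let lastFired = -Infinity;
  return (landmarks, now) => {
    const thumb = landmarks[4]; // thumb tip (MediaPipe Hands)
    const index = landmarks[8]; // index fingertip
    const dist = Math.hypot(thumb.x - index.x, thumb.y - index.y);
    if (dist < threshold && now - lastFired >= debounceMs) {
      lastFired = now;
      return true; // pinch event fires
    }
    return false;
  };
}
```

Air tap and open palm follow the same shape with a Z-delta window and an average finger-spread test, respectively.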
Test with thick-frame glasses and anti-reflective coatings. Adjust iris detection threshold for glare compensation.
Test at <100 lux. MediaPipe Face Mesh remains robust down to 50 lux with a confidence threshold of 0.6.
EMA + Kalman smoothing handles users with slow head tremor. The dwell window is extended to 400ms for users with motor impairments.
Kalman velocity state dampens high-frequency tremor. Gaze stabilization window set to 300–400ms.
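The 2-state (position + velocity) filter referenced above can be sketched per axis as a 1D constant-velocity Kalman filter. The noise values `q` and `r` here are placeholder assumptions for the sketch, not tuned production values:

```javascript
// 1D constant-velocity Kalman filter (run one instance per screen axis).
// State x = [position, velocity]; measurement z = observed position.
class Kalman1D {
  constructor(q = 0.01, r = 4) {
    this.q = q;                  // process noise
    this.r = r;                  // measurement noise
    this.x = [0, 0];             // [position, velocity]
    this.P = [[1, 0], [0, 1]];   // state covariance
    this.initialized = false;
  }

  update(z, dt = 1 / 60) {
    if (!this.initialized) {
      this.x = [z, 0];
      this.initialized = true;
      return z;
    }
    // Predict: x' = F x with F = [[1, dt], [0, 1]]; P' = F P Fᵀ + Q.
    const xp = [this.x[0] + dt * this.x[1], this.x[1]];
    const [[p00, p01], [p10, p11]] = this.P;
    const Pp = [
      [p00 + dt * (p10 + p01) + dt * dt * p11 + this.q, p01 + dt * p11],
      [p10 + dt * p11, p11 + this.q],
    ];
    // Update with position measurement (H = [1, 0]).
    const s = Pp[0][0] + this.r;
    const k = [Pp[0][0] / s, Pp[1][0] / s];
    const y = z - xp[0];
    this.x = [xp[0] + k[0] * y, xp[1] + k[1] * y];
    this.P = [
      [(1 - k[0]) * Pp[0][0], (1 - k[0]) * Pp[0][1]],
      [Pp[1][0] - k[1] * Pp[0][0], Pp[1][1] - k[1] * Pp[0][1]],
    ];
    return this.x[0];
  }
}
```

Because the velocity state absorbs consistent motion while the gain discounts sudden jumps, high-frequency tremor is damped without adding the lag a pure low-pass filter would introduce.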