Control YouTube hands-free with head gestures. Chrome MV3 extension.
Experience Vision Pro–like navigation on your $500 laptop with a $10 webcam.
Nodex injects a content script into youtube.com that captures webcam frames, runs MediaPipe Face Mesh (468 landmarks at 30 FPS), computes head pose (yaw/pitch/roll), eye aspect ratio, and mouth aperture, and maps recognized gestures to YouTube commands.
Implementation lives in shared/utils/gestureLogic.js. Below is the same geometry in compact form (mathematical expressions in Markdown on GitHub).
3-D distance between MediaPipe landmarks $\mathbf{p}$ and $\mathbf{q}$:
\[d(\mathbf{p}, \mathbf{q}) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2 + (p_z - q_z)^2}.\]6-point Eye Aspect Ratio (per eye) — two vertical spans (primary + secondary eyelid pairs) over horizontal eye width (inner–outer corners); one scalar averages both eyes:
\[\mathrm{EAR}_{\mathrm{eye}} = \frac{\frac{1}{2}\bigl(d^{(1)}_{\mathrm{vert}} + d^{(2)}_{\mathrm{vert}}\bigr)}{d_{\mathrm{inner},\mathrm{outer}}}, \qquad \mathrm{EAR} = \tfrac{1}{2}\bigl(\mathrm{EAR}_{\mathrm{right}} + \mathrm{EAR}_{\mathrm{left}}\bigr).\]Lower EAR ⇒ eyes more closed (used with calibrated enter/exit thresholds and frame streaks so ordinary blinks ≠ commands).
Head pose (degrees, approximate): with cheek edge landmarks $L,R$, nose $N$, forehead $F$, chin $C$, and midpoints $x_{\mathrm{mid}}=\frac{x_L+x_R}{2}$, $y_{\mathrm{mid}}=\frac{y_F+y_C}{2}$,
\[\psi = -\frac{x_N - x_{\mathrm{mid}}}{|x_R - x_L|/2}\cdot 45^{\circ} \quad (\text{yaw}), \qquad \phi = \frac{y_{\mathrm{mid}} - y_N}{|y_C - y_F|/2}\cdot 40^{\circ} \quad (\text{pitch}),\] \[\rho = -\operatorname{atan2}(y_R - y_L,\, x_R - x_L)\cdot\frac{180^{\circ}}{\pi} \quad (\text{roll}).\]Mouth opening ratio (upper/lower lip vs mouth width):
\[m = \frac{d(\mathrm{upper},\mathrm{lower})}{d(\mathrm{mouth\ L},\mathrm{mouth\ R})}.\]| Hysteresis (concept): a head gesture turns on when $ | \theta | > T$ and turns off only when $ | \theta | < T - h$ (threshold $T$, gap $h$ per axis in code) so the boundary does not chatter at 30 Hz. |
Chrome Web Store: [coming soon]
npm install
npm run build # one-off build → dist/
npm run dev # watch mode
npm run prod # clean build + zip → nodex.zip
npm run zip:source # sources → nodex-extension-source.zip (excludes node_modules, dist, .git)
Project map and file roles (AI / onboarding): AGENTS.md.
Load unpacked: chrome://extensions → Developer mode → Load unpacked → select dist/.
See PRIVACY.md. TL;DR: nothing leaves your device.
MIT