How to run @mediapipe/tasks-vision in a web worker
MediaPipe's Pose Landmarker (or other models/tasks by MediaPipe) is heavy and often blocks the main thread during initialization. However, we can offload initialization and prediction to a web worker to keep the main thread unblocked and responsive.
MediaPipe + Web Workers. Image created by Ankit Kumar. MediaPipe logo © Google.
Setting up the development environment
First, we'll set up our environment using Vite and vanilla JavaScript. You can also find all the source code in my GitHub repository, which includes examples of using MediaPipe in a web worker with different frameworks.
Create a new Vite project:
npm create vite@latest
Then follow the prompts to set up the project.
Running MediaPipe in the main thread
Before running MediaPipe's PoseLandmarker task in a web worker, we'll run it directly on the main thread. If you already know how to run MediaPipe on the main thread, you can skip to Running MediaPipe in a web worker.
Remove the boilerplate code and then add the following import in main.js:
import * as $mediapipe from "@mediapipe/tasks-vision";
This loads all exports from @mediapipe/tasks-vision and places them under the $mediapipe namespace.
Note: You can use any name for the namespace, such as vision, mp, or anything else. I used $mediapipe because it looks nice. ;)
Next, we'll initialize MediaPipe's PoseLandmarker:
```js
let poseLandmarker = null;

async function initializeMediapipePoseLandmarker() {
  const vision = await $mediapipe.FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm",
  );

  poseLandmarker = await $mediapipe.PoseLandmarker.createFromModelPath(
    vision,
    "https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_lite/float16/1/pose_landmarker_lite.task",
  );

  await poseLandmarker.setOptions({
    runningMode: "VIDEO",
  });
}
```
Once initialized, we need a function to run pose detection:
```js
async function detectPose(imageOrVideoElement) {
  if (!poseLandmarker) {
    throw new Error("PoseLandmarker is not initialized");
  }

  const timestamp = performance.now();
  return new Promise((resolve) => {
    poseLandmarker.detectForVideo(imageOrVideoElement, timestamp, (result) => {
      resolve(result);
    });
  });
}
```
We need additional code to set up the camera, the video, and a button to start the pose detection.
Add a video element and a button element in index.html as follows:
```html
<!-- Important: we will use `video` id to grab video element -->
<video id="video" autoplay muted playsinline height="300px" width="530px"></video>

<!-- Important: we will use `start` id to grab the button-->
<button id="start">Start</button>
```
Finally, we need to add four functions:
- setupDrawingTools: sets up the canvas and DrawingUtils, a helper class for drawing pose landmarks.
- drawLandmarks: draws the landmarks.
- loop: runs pose detection in a loop.
- initializeVideoAndStartPoseDetection: initializes the video and starts pose detection.
```js
let canvas;
let ctx;
let drawingUtils;

function setupDrawingTools() {
  canvas = document.createElement("canvas");

  canvas.style.position = "absolute";
  canvas.style.zIndex = 100;
  canvas.style.top = "0";
  canvas.style.left = "0";
  ctx = canvas.getContext("2d");
  document.body.appendChild(canvas);
  drawingUtils = new $mediapipe.DrawingUtils(ctx);
}

function drawLandmarks(result, video) {
  canvas.width = video.clientWidth;
  canvas.height = video.clientHeight;
  ctx.clearRect(0, 0, canvas.width, canvas.height);

  // Draw pose landmarks.
  for (const landmark of result.landmarks) {
    drawingUtils.drawLandmarks(landmark, {
      radius: (data) =>
        $mediapipe.DrawingUtils.lerp(data.from.z, -0.15, 0.1, 5, 1),
    });
    drawingUtils.drawConnectors(
      landmark,
      $mediapipe.PoseLandmarker.POSE_CONNECTIONS,
    );
  }
}

async function loop(video) {
  try {
    const result = await detectPose(video);
    drawLandmarks(result, video);
  } catch (error) {
    console.error("Error detecting pose:", error);
  }

  requestAnimationFrame(() => loop(video));
}

async function initializeVideoAndStartPoseDetection() {
  let stream;
  let video = document.getElementById("video");

  // Requesting camera and passing stream to video element
  try {
    stream = await navigator.mediaDevices.getUserMedia({ video: true });
    video.srcObject = stream;
  } catch (error) {
    console.error("Error getting camera stream:", error);
    return;
  }

  try {
    await initializeMediapipePoseLandmarker();
  } catch (error) {
    console.error("Error initializing mediapipe pose landmarker:", error);
    return;
  }

  setupDrawingTools();

  if (video.readyState < 2) {
    /**
     * If video is not ready then wait for it to be ready.
     */
    await new Promise((resolve) => {
      video.addEventListener("canplay", resolve);

      // A fallback, in case the video is ready but the `canplay` event never fired
      setTimeout(resolve, 5_000);
    });
  }

  loop(video);
}

document
  .getElementById("start")
  .addEventListener("click", initializeVideoAndStartPoseDetection);
```
Note: The code is only for demonstration purposes; it shouldn't be used as-is in production.
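The "wait for canplay, but give up after five seconds" step can be factored into a small helper. This is only a sketch: waitForEvent is a name made up here, and it works with any EventTarget, including a video element.

```javascript
// Resolves with "event" when `target` fires `eventName`, or with "timeout"
// after `timeoutMs` milliseconds, whichever comes first. The listener and
// the timer are both cleaned up in either case.
function waitForEvent(target, eventName, timeoutMs) {
  return new Promise((resolve) => {
    const onEvent = () => {
      clearTimeout(timer);
      resolve("event");
    };
    const timer = setTimeout(() => {
      target.removeEventListener(eventName, onEvent);
      resolve("timeout");
    }, timeoutMs);
    target.addEventListener(eventName, onEvent, { once: true });
  });
}

// Possible usage inside initializeVideoAndStartPoseDetection:
// if (video.readyState < 2) {
//   await waitForEvent(video, "canplay", 5_000);
// }
```

Compared with the inline version, the listener no longer stays attached after the timeout, and the timer no longer keeps running after the event.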
At this point, if everything is set up correctly, you'll see a button labeled "Start". Clicking it prompts for camera access, and once permission is granted, the Pose Landmarker will run.
Wonderful. Now, let's move on to the interesting part: running the Pose Landmarker in a web worker.
Running MediaPipe in a web worker
Before we start, we need to understand the following things:
- Web workers run in a different thread, hence they need to be in a separate file.
- Communication between workers and the main thread is done via messages. Both sides send their messages using postMessage() and respond to messages via the onmessage event handler.
- You can't directly manipulate the DOM from inside a worker; the window object is not available.
This means we need to create a new file and put the initialization and detection code in it. In addition, we cannot send the video element to the worker, because DOM elements cannot be copied by postMessage(). Instead, we need to send data that can be cloned or transferred, such as a string, a plain object, or an ImageBitmap.
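What can cross the worker boundary is decided by the structured clone algorithm, which postMessage() uses to copy its argument. As a rough sketch, the global structuredClone() function (available in current browsers and in Node.js 17+) applies the same algorithm, so it can be used to check a payload before sending it. The helper name canPostToWorker is made up for this illustration.

```javascript
// Returns true when `value` survives the structured clone algorithm,
// which is what postMessage() uses to copy data to a worker.
function canPostToWorker(value) {
  try {
    structuredClone(value);
    return true;
  } catch (error) {
    // Functions, DOM nodes, and similar values throw a DataCloneError.
    return false;
  }
}

console.log(canPostToWorker("init")); // true
console.log(canPostToWorker({ type: "detect", payload: {} })); // true
console.log(canPostToWorker(() => {})); // false
```

An ImageBitmap is cloneable as well, and it is also transferable: listing it in the second argument of postMessage() moves it to the worker instead of copying it.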
Let's start with creating a new file in the public directory called poselandmarker.worker.js.
```js
// Same namespace import as in main.js.
import * as $mediapipe from "@mediapipe/tasks-vision";

let poseLandmarker = null;

async function initializeMediapipePoseLandmarker() {
  const vision = await $mediapipe.FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm",
  );

  poseLandmarker = await $mediapipe.PoseLandmarker.createFromModelPath(
    vision,
    "https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_lite/float16/1/pose_landmarker_lite.task",
  );

  await poseLandmarker.setOptions({
    runningMode: "VIDEO",
  });
}

/**
 * Detecting the pose.
 */
async function detectPose(bitmapImage) {
  if (!poseLandmarker) {
    throw new Error("PoseLandmarker is not initialized");
  }

  console.log("Worker data received", bitmapImage);

  const timestamp = performance.now();
  return new Promise((resolve) => {
    poseLandmarker.detectForVideo(bitmapImage, timestamp, (result) => {
      resolve(result);
    });
  });
}
```
As we learned, web workers communicate with the main thread through messages, so we'll set up a message handler for that communication.
```js
self.onmessage = async (event) => {
  const { type, payload } = event.data;

  if (type === "init") {
    await initializeMediapipePoseLandmarker();
    self.postMessage({
      type: "init",
      payload: {
        isSuccess: true,
      },
    });
    return;
  }

  if (type === "detect") {
    const result = await detectPose(payload.image);
    self.postMessage({ type: "detect", payload: { result } });
  }
};
```
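The handler above never reports a failure: if initialization throws, the main thread keeps waiting for an "init" reply. A minimal sketch of one way to handle this is below. It routes messages by type and answers with an "error" message when a handler throws. The createMessageHandler helper and the "error" message type are assumptions of this sketch, not part of MediaPipe or of the code above.

```javascript
// Builds an onmessage-style function from a map of { type: handler }.
// `post` is the function used to reply (self.postMessage inside a worker).
function createMessageHandler(handlers, post) {
  return async (event) => {
    const { type, payload } = event.data;
    const handler = Object.hasOwn(handlers, type) ? handlers[type] : undefined;

    if (!handler) {
      post({ type: "error", payload: { message: `Unknown message type: ${type}` } });
      return;
    }

    try {
      const result = await handler(payload);
      post({ type, payload: result });
    } catch (error) {
      // Report the failure instead of leaving the caller waiting forever.
      post({ type: "error", payload: { source: type, message: error.message } });
    }
  };
}

// Inside the worker this could be wired up as follows:
// self.onmessage = createMessageHandler(
//   {
//     init: async () => {
//       await initializeMediapipePoseLandmarker();
//       return { isSuccess: true };
//     },
//     detect: async ({ image }) => ({ result: await detectPose(image) }),
//   },
//   (message) => self.postMessage(message),
// );
```

With this shape, the main thread would also need to listen for "error" replies and reject the corresponding promise.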
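Next, we need to change the loop and initializeVideoAndStartPoseDetection functions in main.js so that they talk to the worker instead of calling MediaPipe directly. The changes are the ImageBitmap round trip in loop, the worker creation, and the message-based initialization: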
```js
async function loop(worker, video) {
  try {
    const image = await createImageBitmap(video); // creating image bitmap from video.
    const result = await new Promise((resolve) => {
      worker.onmessage = (event) => {
        const { type, payload } = event.data;
        if (type === "detect") {
          resolve(payload.result);
        }
      };

      worker.postMessage({
        type: "detect",
        payload: { image },
      });
    });
    image.close();
    drawLandmarks(result, video);
  } catch (error) {
    console.error("Error detecting pose:", error);
  }

  requestAnimationFrame(() => loop(worker, video));
}


async function initializeVideoAndStartPoseDetection() {
  let stream;
  let video = document.getElementById("video");

  try {
    stream = await navigator.mediaDevices.getUserMedia({ video: true });
    video.srcObject = stream;
  } catch (error) {
    console.error("Error getting camera stream:", error);
    return;
  }

  // Creating a web worker.
  const worker = new Worker("/poselandmarker.worker.js");

  try {
    await new Promise((resolve, reject) => {
      worker.onmessage = (event) => {
        const { type, payload } = event.data;
        if (
          type === "init" &&
          typeof payload === "object" &&
          payload.isSuccess
        ) {
          resolve("MediaPipe initialized");
        }
      };
      // Reject if the worker script itself fails to load or throws.
      worker.onerror = reject;
      worker.postMessage({ type: "init" });
    });
  } catch (error) {
    console.error("Error initializing mediapipe pose landmarker:", error);
    return;
  }

  setupDrawingTools();

  if (video.readyState < 2) {
    await new Promise((resolve) => {
      video.addEventListener("canplay", resolve);
      setTimeout(resolve, 5_000);
    });
  }

  loop(worker, video);
}
```
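In the loop above, worker.onmessage is reassigned on every frame and the promise cannot reject. One possible alternative wraps the round trip in a helper. The sketch below assumes that the worker answers each request with a message of the same type, as the worker code in this article does. requestFromWorker and its options are hypothetical names.

```javascript
// Sends { type, payload } to the worker and resolves with the payload of the
// first reply that has the same type. Rejects if the worker raises an
// "error" event or if no reply arrives within `timeoutMs`.
function requestFromWorker(worker, type, payload, { transfer = [], timeoutMs = 10000 } = {}) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      clearTimeout(timer);
      worker.removeEventListener("message", onMessage);
      worker.removeEventListener("error", onError);
    };
    const onMessage = (event) => {
      if (!event.data || event.data.type !== type) return; // not our reply
      cleanup();
      resolve(event.data.payload);
    };
    const onError = (event) => {
      cleanup();
      reject(new Error(event.message || "Worker error"));
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`Timed out waiting for "${type}" reply`));
    }, timeoutMs);

    worker.addEventListener("message", onMessage);
    worker.addEventListener("error", onError);
    worker.postMessage({ type, payload }, transfer);
  });
}

// Possible usage in loop():
// const image = await createImageBitmap(video);
// const { result } = await requestFromWorker(worker, "detect", { image }, { transfer: [image] });
```

If the ImageBitmap is listed in transfer, it is moved rather than copied. The main-thread reference becomes unusable afterwards, so closing the bitmap is then the worker's job.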
However, when we run the app with this worker file, we will see the following error:
Cannot use import statement outside a module
The reason is that poselandmarker.worker.js uses an import statement to load the MediaPipe Tasks Vision module, and import statements are not allowed in a classic web worker.
To fix this, we can try something else: in initializeVideoAndStartPoseDetection, where we create the worker, we can pass a second argument that turns it into a module worker:
const worker = new Worker("/poselandmarker.worker.js", { type: "module" });
But if we try this, we will get a different error:
Failed to execute 'importScripts' on 'WorkerGlobalScope'
This means importScripts is being called somewhere, although we are not using it anywhere in our worker file. After digging into @mediapipe/tasks-vision, we find that it uses importScripts to load the necessary files whenever importScripts is defined. In a module worker the function is still defined, but calling it throws.
Screenshot: @mediapipe/tasks-vision uses importScripts to load scripts.
This means we need to run the worker in classic mode.
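The two errors can be pictured with a small simulation. In a classic worker, importScripts() loads scripts while static import statements are a syntax error. In a module worker, importScripts is still defined but throws a TypeError when called. The sketch below fakes both global scopes with plain objects (no real worker is started). The loader function only illustrates the "use importScripts if it is defined" check described above and is not MediaPipe's actual source.

```javascript
// Illustration only: mimics a loader that uses importScripts whenever the
// function exists on the global scope.
function loadWithImportScripts(scope, url) {
  if (typeof scope.importScripts === "function") {
    scope.importScripts(url); // throws inside a module worker
    return "loaded via importScripts";
  }
  return "importScripts not available";
}

// Fake global scope of a classic worker: importScripts works.
const classicScope = {
  loaded: [],
  importScripts(url) {
    this.loaded.push(url);
  },
};

// Fake global scope of a module worker: importScripts exists but throws.
const moduleScope = {
  importScripts() {
    throw new TypeError("Module scripts don't support importScripts().");
  },
};

function tryLoad(scope) {
  try {
    return loadWithImportScripts(scope, "/some-script.js");
  } catch (error) {
    return `failed: ${error.message}`;
  }
}

console.log(tryLoad(classicScope));
console.log(tryLoad(moduleScope));
```

Because the check only asks whether the function exists, it passes in both kinds of worker, which is why switching to a module worker trades one error for another.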
So, how do we fix the problem? The only thing stopping us from using @mediapipe/tasks-vision in a classic worker is the export statement at the end of the file:
Screenshot: Methods exported by @mediapipe/tasks-vision.
If we remove these lines and then load the file with importScripts in poselandmarker.worker.js, we can fix the problem.
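So, we need to download the ES module build from https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@<version>/+esm (replace <version> with the release you are using) and save it as mediapipe.js in the public directory. Then go to the end of the file and replace the export statement with an object like this. The short names such as Ia and Kc are minified identifiers, so copy them from the export statement of the file you downloaded: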
```js
const $mediapipe = {
  DrawingUtils: Ia,
  FaceDetector: Za,
  FaceLandmarker: uc,
  FaceStylizer: lc,
  FilesetResolver: Uo,
  GestureRecognizer: mc,
  HandLandmarker: _c,
  HolisticLandmarker: Ac,
  ImageClassifier: bc,
  ImageEmbedder: kc,
  ImageSegmenter: Rc,
  ImageSegmenterResult: Sc,
  InteractiveSegmenter: Vc,
  InteractiveSegmenterResult: Fc,
  MPImage: Ga,
  MPMask: Ea,
  ObjectDetector: Xc,
  PoseLandmarker: Kc,
  TaskRunner: Zo,
  VisionTaskRunner: Ja,
};
```
Note: The variable name is important. In this case we are using $mediapipe, because when we load this file via importScripts, all exported members are accessible through $mediapipe. So make sure the constant name matches the name you are using inside the web worker.
For simplicity, you can copy the code from here.
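The manual edit described above can also be automated. The sketch below is one possible approach. It assumes the downloaded bundle contains exactly one statement of the form export { local as Exported, ... } and rewrites it into a const object. rewriteExportAsGlobal is a hypothetical helper, and the two-entry sample string stands in for the real bundle.

```javascript
// Rewrites an `export { a as A, b as B };` statement into
// `const <globalName> = { A: a, B: b };` so the file can be loaded with
// importScripts() in a classic worker.
function rewriteExportAsGlobal(source, globalName) {
  const exportPattern = /export\s*\{([^}]*)\}\s*;?/;
  const match = source.match(exportPattern);
  if (!match) {
    throw new Error("No export statement found");
  }

  const entries = match[1]
    .split(",")
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      // "Ia as DrawingUtils" -> ["Ia", "DrawingUtils"]; "Foo" -> ["Foo"]
      const [local, exported = local] = part.split(/\s+as\s+/);
      return `  ${exported}: ${local},`;
    });

  const replacement = `const ${globalName} = {\n${entries.join("\n")}\n};\n`;
  // A replacer function avoids special handling of "$" in the replacement.
  return source.replace(exportPattern, () => replacement);
}

const sample = "const Ia=1,Kc=2;export{Ia as DrawingUtils,Kc as PoseLandmarker};";
console.log(rewriteExportAsGlobal(sample, "$mediapipe"));
```

Any other module-only syntax left in the file (for example a default export) would still have to be removed by hand, since a classic worker script cannot contain it.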
Next, we will use importScripts to load the mediapipe.js file in poselandmarker.worker.js (in place of the import statement), and revert the { type: "module" } change we made in initializeVideoAndStartPoseDetection.
```js
// poselandmarker.worker.js
importScripts('/mediapipe.js');

let poseLandmarker = null;
//... rest of the code
```
```js
// main.js
const worker = new Worker("/poselandmarker.worker.js");
//... rest of the code.
```
Congratulations! You have successfully implemented MediaPipe Pose Landmarker in a web worker.
Conclusion
This approach is a bit of a hack, but it provides a simple and effective solution for running MediaPipe Pose Landmarker in a web worker.
The major advantage is that the Pose Landmarker runs entirely in a web worker, leaving the main thread free to handle the UI.