Enabling Hardware Acceleration for the Chrome Browser on AWS Batch
In today’s rapidly evolving tech landscape, optimizing performance is critical, especially for large-scale processing on cloud platforms. Many of you may need to automate Chrome on a server, and some of you will also want to visit OpenGL-intensive websites with GPU acceleration enabled. I was one of those people. However, unless your tech stack is perfectly aligned, this can be a daunting task.
In my quest to solve this issue, I scoured various QA sites, but I noticed that many users facing similar problems never received a conclusive answer. Therefore, I’ve written this article to address the gap and provide a comprehensive guide to enable hardware acceleration for Chrome on AWS Batch.
Steps to Enable Hardware Acceleration
Create a Docker Image with Chrome and Nvidia Support
First, you’ll need a Docker image built on an Nvidia CUDA base that provides all the shared libraries Chrome depends on. Here is a basic Dockerfile to get you started:
FROM --platform=linux/amd64 nvidia/cuda:12.5.0-devel-ubuntu22.04

WORKDIR /app

ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Asia/Tokyo

# For timezone
RUN apt-get update && \
    apt-get install -y software-properties-common tzdata && \
    ln -fs /usr/share/zoneinfo/Asia/Tokyo /etc/localtime && \
    dpkg-reconfigure --frontend noninteractive tzdata && \
    apt-get clean

# ldd chrome | grep found
#   libnss3.so => not found
#   libnssutil3.so => not found
#   libsmime3.so => not found
#   libnspr4.so => not found
#   libatk-1.0.so.0 => not found
#   libatk-bridge-2.0.so.0 => not found
#   libcups.so.2 => not found
#   libdrm.so.2 => not found
#   libxcb.so.1 => not found
#   libxkbcommon.so.0 => not found
#   libatspi.so.0 => not found
#   libX11.so.6 => not found
#   libXcomposite.so.1 => not found
#   libXdamage.so.1 => not found
#   libXext.so.6 => not found
#   libXfixes.so.3 => not found
#   libXrandr.so.2 => not found
#   libgbm.so.1 => not found
#   libpango-1.0.so.0 => not found
#   libcairo.so.2 => not found
#   libasound.so.2 => not found
RUN apt-get update && apt-get install -y \
    libnss3 \
    libnss3-tools \
    libnspr4 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libdrm2 \
    libxcb1 \
    libxkbcommon0 \
    libatspi2.0-0 \
    libx11-6 \
    libxcomposite1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxrandr2 \
    libgbm1 \
    libpango-1.0-0 \
    libcairo2 \
    libasound2 && \
    apt-get clean

# [ERROR:egl_util.cc(44)] : Failed to load GLES library: libGLESv2.so.2: libGLESv2.so.2: cannot open shared object file: No such file or directory
# [ERROR:egl_util.cc(52)] : Failed to load EGL library: libEGL.so.1: libEGL.so.1: cannot open shared object file: No such file or directory
RUN apt-get update && apt-get install -y libgles2-mesa libegl1-mesa && apt-get clean

# [ERROR:gl_display.cc(520)] : EGL Driver message (Critical) eglInitialize: xcb_connect failed
RUN apt-get update && apt-get install -y xvfb && apt-get clean

# xvfb-run --server-args="-screen 0 1920x1080x24 +extension GLX +render -noreset" \
#     chrome \
#     --enable-logging --v=1 \
#     --headless \
#     --ignore-gpu-blocklist \
#     --enable-gpu-rasterization \
#     --enable-zero-copy \
#     --use-angle=default \
#     --no-sandbox
RUN apt-get update && apt-get install -y \
    libva2 \
    libva-x11-2 \
    libva-drm2 && \
    apt-get clean
This Dockerfile starts from an Nvidia CUDA base image and installs the libraries Chrome needs to run. In my setup, I used Python’s Pyppeteer, which downloads its own Chromium, so there is no need to install Chrome via apt-get. Of course, Puppeteer also works, and if you prefer, you can install Chrome directly with apt-get.
Pay close attention to the comments within the Dockerfile: they record the commands I executed, the errors I encountered, and the libraries I installed to address those errors. As of July 2024, this Dockerfile is functional. However, if you run into errors in the future, refer to these comments to guide your troubleshooting and adjustments.
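As a side note, since Pyppeteer fetches its own Chromium on first use, you can optionally pre-download it while building the image so the first Batch job does not spend time on the download. Here is a minimal sketch, assuming pyppeteer is already installed in the image; this is my own convenience addition, not a required step:

# Pre-download Pyppeteer's bundled Chromium at image build time (optional).
# You could run this from the Dockerfile, e.g. with
# `RUN python3.12 -c "..."` after installing requirements.
from pyppeteer import chromium_downloader

if not chromium_downloader.check_chromium():
    chromium_downloader.download_chromium()
print(chromium_downloader.chromium_executable())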
Prepare Your Project
In this step, we will show an example using Pyppeteer in Python. If you prefer using Puppeteer, feel free to substitute Python with Node.js as appropriate.
There are a few key points to keep in mind, but one of the most important is to run your Python or Node.js process through xvfb-run. This is crucial not only so that Chrome can run on a server with no physical display, but also so that it can actually use the GPU.
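If you want to confirm that your script really is running under the virtual display, a quick sanity check like the following can help. This is just an illustrative snippet of mine, not part of the required setup:

import os

# When launched via xvfb-run, DISPLAY points at the virtual X server.
# Without it, Chrome fails with xcb_connect errors like the one noted in the Dockerfile.
display = os.environ.get("DISPLAY")
if not display:
    raise RuntimeError("No X display found; run this script through xvfb-run")
print(f"Using X display {display}")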
Here is an overview of what you need to do:
1. Prepare a shell script for running Python or Node.js on X
xvfb-run --server-args="-screen 0 1920x1080x24 +extension GLX +render -noreset" \
    python3.12 /app/run.py "$@"
2. Run Chrome in a Python Script
import asyncio
import logging
from typing import TypeAlias

from pyppeteer.launcher import launch

logger = logging.getLogger(__name__)

_BrowserLaunchOptions: TypeAlias = dict[str, list[str] | bool]


def _get_launch_options() -> _BrowserLaunchOptions:
    args = [
        "--lang=ja-JP",
        "--no-sandbox",
        "--ignore-gpu-blocklist",
        "--enable-gpu-rasterization",
        "--enable-zero-copy",
        "--disable-gpu-process-crash-limit",
        "--use-angle=default",
    ]
    if logger.level == logging.DEBUG:
        args.extend(["--enable-logging", "--v=1"])
    launch_options: _BrowserLaunchOptions = {"args": args, "ignoreHTTPSErrors": True}
    if logger.level == logging.DEBUG:
        launch_options["dumpio"] = True
    logger.debug(f"launch_options={launch_options}")
    return launch_options


async def main() -> None:
    launch_options = _get_launch_options()
    browser = await launch(**launch_options)
    # do something you need
    await browser.close()


if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())
This snippet only creates the browser object; you will need to add your own Chrome automation on top of it. The key point is the launch options: if you get them wrong, Chrome will not start, so pay particular attention to the arguments passed in args. In DEBUG mode, Chrome is configured to output more verbose logs, which usually makes any issues that arise straightforward to resolve. If you cannot make sense of the logs yourself, you can always rely on ChatGPT for assistance.
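For completeness, here is one way the module-level logger used above could be wired to an environment variable. The LOG_LEVEL name matches the variable I set in the Batch container definition later in this article; the INFO default is just an illustrative choice:

import logging
import os

# Configure the module logger from LOG_LEVEL (e.g. "DEBUG" or "INFO").
# _get_launch_options() checks logger.level directly, so the level is set
# on this logger rather than only on the root logger.
logging.basicConfig()
logger = logging.getLogger(__name__)
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))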
If you want to quickly verify that everything is working correctly, you can navigate to chrome://gpu, capture a screenshot of the page, and upload the screenshot file to S3. This will help you confirm that GPU acceleration is enabled.
page = await browser.newPage()
await page.goto("chrome://gpu")
await page.screenshot({"path": "./screenshot.png"})
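To actually get the screenshot out of the Batch job, you can upload it to S3. A minimal sketch, assuming boto3 is in your requirements.txt and that BUCKET_NAME is provided by the job environment (as in the CDK container definition later); the object key is arbitrary:

import os

import boto3

# Upload the chrome://gpu screenshot so it can be inspected from the S3 console.
s3 = boto3.client("s3")
s3.upload_file("./screenshot.png", os.environ["BUCKET_NAME"], "chrome-gpu/screenshot.png")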
As a side note, every operation can fail: opening a tab, navigating to a URL, retrieving content, and so on. I therefore write my code so that if any single operation fails, it restarts from the point of opening a tab and keeps retrying until it succeeds, giving up after five failures. When automating Chrome, you need to write your code defensively to this extent.
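Here is a rough sketch of that retry pattern; the function names and the one-second wait between attempts are illustrative choices of mine, not the exact production code:

import asyncio
import logging

logger = logging.getLogger(__name__)

MAX_ATTEMPTS = 5  # give up after five failures


async def capture_gpu_report(browser) -> None:
    """One unit of work: open a tab, navigate, and take a screenshot."""
    page = await browser.newPage()
    try:
        await page.goto("chrome://gpu")
        await page.screenshot({"path": "./screenshot.png"})
    finally:
        await page.close()


async def capture_with_retry(browser) -> None:
    # If any single operation fails, restart from the point of opening a tab.
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            await capture_gpu_report(browser)
            return
        except Exception:
            logger.exception(f"Attempt {attempt}/{MAX_ATTEMPTS} failed")
            await asyncio.sleep(1)
    raise RuntimeError(f"Giving up after {MAX_ATTEMPTS} failed attempts")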
3. Update the Dockerfile for the Python project
Please add the following content to the Dockerfile we prepared earlier. I will omit the details on how to create the requirements.txt.
# Install python3.12
RUN add-apt-repository ppa:deadsnakes/ppa -y && \
    apt-get update && \
    apt-get install -y python3.12 python3.12-distutils && \
    apt-get install -y wget && \
    apt-get clean

# Install pip
RUN wget https://bootstrap.pypa.io/get-pip.py && \
    python3.12 ./get-pip.py

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
Set Up an Environment on AWS Batch for GPU-Accelerated Chrome
In this final step, we will set up an environment on AWS Batch that can utilize GPU-accelerated Chrome. For this example, we will use the g4dn instance type, which is optimized for GPU workloads.
Here is a snippet of the AWS CDK code to set up the necessary infrastructure:
export class BatchChromeStack extends cdk.Stack {
  constructor(scope: Construct, id: string, stage: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc: ec2.IVpc = this.getVpc();
    const subnets = this.getSubnets(vpc);
    const securityGroup = this.getSecurityGroup(vpc);
    const instanceRole = this.createInstanceRoleAndProfile();
    const { repository, image_name } = this.createAndDeployDockerImage();

    const jobRole = this.createJobRole();
    this.addSSMPolicyToJobRole(jobRole, stage);

    const bucket = this.createS3Bucket();
    this.addS3PutObjectPolicyToRole(jobRole, bucket);

    const container = this.createContainer(image_name, jobRole, bucket, stage);
    repository.grantPull(container.executionRole);
    this.createEcsJobDefinition(container);

    const computeEnvironment = this.createComputeEnvironment(vpc, [securityGroup], subnets, instanceRole);
    this.createJobQueue(computeEnvironment);
  }

  // Many helper methods omitted

  private createContainer(image_name: string, jobRole: iam.Role, bucket: s3.Bucket, stage: string): batch.EcsEc2ContainerDefinition {
    const logLevel = stage === 'production' ? 'INFO' : 'DEBUG';
    return new batch.EcsEc2ContainerDefinition(this, `${this.NAME_PREFIX}ContainerDefinition`, {
      image: ecs.ContainerImage.fromRegistry(image_name),
      // g4dn.xlarge
      gpu: 1,
      cpu: 4,
      memory: cdk.Size.gibibytes(15), // 16GB - 1GB for OS
      jobRole: jobRole,
      logging: new ecs.AwsLogDriver({ streamPrefix: this.KEBAB_NAME_PREFIX }),
      environment: {
        BUCKET_NAME: bucket.bucketName,
        LOG_LEVEL: logLevel,
      },
    });
  }
}
I have omitted quite a lot of detail in this step. Anything you can find by searching online or asking ChatGPT has been left out, since those answers are readily accessible.
The biggest issue I faced was that when I set the container memory in EcsEc2ContainerDefinition to the full 16GB that a g4dn.xlarge offers, the instance failed to launch. With 15GB it launched fine; presumably part of the instance memory is reserved for the OS, which is why the code above uses cdk.Size.gibibytes(15). I believe this is a quirk of AWS Batch.
If you need the full version of the CDK code or if something is not working correctly, feel free to reach out. If I feel inclined, I might provide more detailed information.