Getting Started with Roboflow Inference on NVIDIA® Jetson Devices
This wiki guide explains how to easily deploy AI models using the Roboflow inference server running on NVIDIA Jetson devices. Here we will use Roboflow Universe to select an already-trained model, deploy it to the Jetson device, and perform inference on a live webcam stream!
Roboflow Inference is the easiest way to use and deploy computer vision models, providing an HTTP API for running inference. Roboflow Inference supports:
- Object detection
- Image segmentation
- Image classification
and foundation models like CLIP and SAM.
Prerequisites
- Ubuntu Host PC (native or VM using VMware Workstation Player)
- reComputer Jetson or any other NVIDIA Jetson device
This wiki has been tested and verified on a reComputer J4012 and a reComputer Industrial J4012 powered by the NVIDIA Jetson Orin NX 16GB module.
Flash JetPack to Jetson
Now you need to make sure that the Jetson device is flashed with a JetPack system. You can use either NVIDIA SDK Manager or the command line to flash JetPack to the device.
For flashing guides for Seeed Jetson-powered devices, please refer to the links below:
- reComputer J2021 | J202
- reComputer J1020 | A206
- reComputer J4012 | J401
- A203 Carrier Board
- A205 Carrier Board
- A206 Carrier Board
- A603 Carrier Board
- A607 Carrier Board
- Jetson Xavier AGX H01 Kit
- Jetson AGX Orin 32GB H01 Kit
- reComputer Industrial
- reServer Industrial
Make sure to flash JetPack version 5.1.1, because that is the version we have verified for this wiki.
Tap into 50,000+ Models at Roboflow Universe
Roboflow offers 50,000+ ready-to-use AI models so that everyone can get started with computer vision deployment as quickly as possible. You can explore them all at Roboflow Universe. Roboflow Universe also offers 200,000+ datasets that you can use to train a model on Roboflow cloud servers, or you can bring your own dataset, use the Roboflow online image annotation tool, and start training.
Step 1: We will use a people detection model from Roboflow Universe as a reference
Step 2: Here the model name will follow the format "model_name/version". In this case, it is people-detection-general/7. We will use this model name later in this wiki when we start running inference.
Obtain Roboflow API Key
Now we need to obtain a Roboflow API key for the Roboflow inference server to work.
Step 1: Sign up for a new Roboflow account by entering your credentials
Step 2: Sign in to the account, navigate to Projects > Workspaces > <your_workspace_name> > Roboflow API, and click Copy next to the "Private API Key" section
Keep this private key safe because we will need it later.
Running Roboflow Inference Server
You can get started with Roboflow inference on NVIDIA Jetson in 3 different ways.
- Using pip package - Using the pip package will be the fastest way to get started; however, you will need to install SDK components (CUDA, cuDNN, TensorRT) along with JetPack.
- Using Docker Hub - Using Docker Hub will be a little slower because it will first pull a Docker image which is around 19GB. However, you do not need to install SDK components because the Docker image already includes them.
- Using local Docker build - Using a local Docker build is an extension of the Docker Hub method, where you can change the source code for the Docker image according to your desired application (such as enabling TensorRT precision with INT8).
Before moving on to running the Roboflow inference server, you need to obtain an AI model to run inference on and a Roboflow API key. We will go through that first.
Using pip Package
- Step 1: If you have only flashed the Jetson device with Jetson L4T, you need to install the SDK components first
sudo apt update
sudo apt install nvidia-jetpack -y
- Step 2: Execute the commands below in the terminal to install the Roboflow inference pip package
sudo apt update
sudo apt install python3-pip -y
pip install inference-gpu
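To quickly confirm that the package was installed correctly, you can try importing it. This is only a minimal sanity check; importing alone does not download or run any model.
# Minimal sanity check: if this import succeeds, the inference package is installed
import inference
print("inference package imported successfully")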
- Step 3: Execute the command below, replacing your_key_here with the Roboflow Private API Key you obtained earlier
export ROBOFLOW_API_KEY=your_key_here
- Step 4: Connect a webcam to the Jetson device and execute the following Python script to run an open-source people detection model on your webcam stream
webcam.py
import cv2
import inference
import supervision as sv

# Annotator used to draw the predicted bounding boxes on each frame
annotator = sv.BoxAnnotator()

inference.Stream(
    source="webcam",
    model="people-detection-general/7",
    output_channel_order="BGR",
    use_main_thread=True,  # run the prediction callback on the main thread
    on_prediction=lambda predictions, image: (
        print(predictions),
        cv2.imshow(
            "Prediction",
            annotator.annotate(
                scene=image,
                detections=sv.Detections.from_roboflow(predictions)
            )
        ),
        cv2.waitKey(1)
    )
)
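Save the script above as webcam.py and run it with python3 webcam.py. A window should open showing the webcam feed with bounding boxes drawn around detected people.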
Finally, you will see the result as follows
Using Docker Hub
To use this method, flashing the device with Jetson L4T is enough. This uses a client-server architecture in which the Roboflow inference server runs on a specific network port on the Jetson, and you can access it from any PC on the same network or use the Jetson itself as both server and client.
Server Setup - Jetson
Execute the following to download and run the Roboflow inference server Docker container
sudo docker run --network=host --runtime=nvidia roboflow/roboflow-inference-server-jetson-5.1.1
If you see the following output, the inference server has started successfully
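Optionally, before setting up the full client, you can check that the server responds to requests. The snippet below is a minimal sketch that reuses the same HTTP endpoint pattern as the client script later in this wiki; the localhost address, the test.jpg file name, the test_server.py file name and the your_key_here placeholder are just examples for illustration, and the requests package must be installed (pip install requests).
test_server.py
import base64
import requests

# Build the inference URL (the server listens on port 9001 by default)
url = ("http://localhost:9001/"
       "people-detection-general/7"
       "?api_key=your_key_here")

# Read a local test image and encode it as a base64 string
with open("test.jpg", "rb") as f:
    img_str = base64.b64encode(f.read())

# Send the image to the inference server and print the predictions
resp = requests.post(url, data=img_str,
                     headers={"Content-Type": "application/x-www-form-urlencoded"})
print(resp.json())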
Client Setup - Jetson/PC
- Step 1: Install the necessary packages
sudo apt update
sudo apt install python3-pip -y
git clone https://github.com/roboflow/roboflow-api-snippets
cd roboflow-api-snippets/Python/webcam
pip install -r requirements.txt
Step 2: Create a roboflow_config.json file in the same directory, including your Roboflow API key and model name. You can refer to the sample roboflow_config.sample.json file included in this GitHub repo
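For reference, the snippet below sketches what such a config file might contain for this wiki's model. The field names shown (ROBOFLOW_API_KEY, ROBOFLOW_MODEL, ROBOFLOW_SIZE) are assumptions based on common Roboflow webcam samples, so copy roboflow_config.sample.json from the repo for the authoritative keys.
# Illustrative sketch only: writes a roboflow_config.json with assumed field names.
# Refer to roboflow_config.sample.json in the repository for the authoritative keys.
import json

config = {
    "ROBOFLOW_API_KEY": "your_key_here",             # your private API key
    "ROBOFLOW_MODEL": "people-detection-general/7",  # model_name/version
    "ROBOFLOW_SIZE": 416,                            # inference image size
}

with open("roboflow_config.json", "w") as f:
    json.dump(config, f, indent=4)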
Step 3: On the same device in a different terminal window, or on another PC on the same network as the Jetson, execute the following Python script to run an open-source people detection model on your webcam stream
python infer-simple.py
Using Local Docker Build
Server Setup - Jetson
As with the Docker Hub method, flashing the device with Jetson L4T is enough. The same client-server architecture applies: the Roboflow inference server runs on a specific network port on the Jetson, and you can access it from any PC on the same network or use the Jetson itself as both server and client.
- Step 1: Clone the Roboflow inference server repository
git clone https://github.com/roboflow/inference
- Step 2: Enter the "inference" directory and start building your own Docker image
cd inference
sudo docker build \
-f docker/dockerfiles/Dockerfile.onnx.jetson.5.1.1 \
-t roboflow/roboflow-inference-server-jetson-5.1.1:seeed1 .
Here the text after "-t" is the name and tag of the Docker image we are building. You can give it any name.
- Step 3: Execute the command below to check whether the Docker image we built is listed
sudo docker images
- Step 4: Start a Docker container based on the Docker image that you just built
sudo docker run --privileged --net=host --runtime=nvidia roboflow/roboflow-inference-server-jetson-5.1.1:seeed1
If you see the following output, the inference server has started successfully
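As with the Docker Hub method, you can optionally verify that the server responds using the same quick test request shown earlier in this wiki.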
Client Setup - Jetson/PC
Execute the following Python script to run an open-source people detection model on your webcam stream
webcam.py
import cv2
import base64
import requests
import time

# Roboflow inference server URL
# (replace <ip_address_of_jetson> and the api_key value with your own)
upload_url = ("http://<ip_address_of_jetson>:9001/"
              "people-detection-general/7"
              "?api_key=xxxxxxxx"
              "&stroke=5")

video = cv2.VideoCapture(0)

while True:
    start = time.time()
    ret, img = video.read()
    if ret:
        # Resize (while maintaining the aspect ratio) to improve speed and save bandwidth
        height, width, channels = img.shape
        scale = 416 / max(height, width)
        img = cv2.resize(img, (round(scale * width), round(scale * height)))

        # Encode image to base64 string
        retval, buffer = cv2.imencode('.jpg', img)
        img_str = base64.b64encode(buffer)

        # Get prediction from Roboflow Infer API
        resp = requests.post(upload_url, data=img_str, headers={
            "Content-Type": "application/x-www-form-urlencoded"
        }, stream=True)
        resp = resp.json()

        # Draw a bounding box and class label for each prediction
        for bbox in resp["predictions"]:
            img = cv2.rectangle(
                img,
                (int(bbox['x'] - (bbox['width'] / 2)), int(bbox['y'] - (bbox['height'] / 2))),
                (int(bbox['x'] + (bbox['width'] / 2)), int(bbox['y'] + (bbox['height'] / 2))),
                (0, 255, 0),
                2)
            cv2.putText(
                img, f"{bbox['class']}",
                (int(bbox['x'] - (bbox['width'] / 2)), int(bbox['y'] - (bbox['height'] / 2) - 5)),
                0, 0.9,
                (0, 255, 0), thickness=2, lineType=cv2.LINE_AA
            )

        cv2.imshow('image', img)
        print((1 / (time.time() - start)), " fps")

    if cv2.waitKey(1) == ord('q'):
        break

video.release()
cv2.destroyAllWindows()
Note that the elements that need to be included in the upload_url in the script are:
- The IP address of the Roboflow inference server
- The model you want to run
- Your Roboflow API key
The model can be selected on Roboflow Universe.
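For example, with the model used earlier in this wiki, a Jetson reachable at 192.168.1.10 (a placeholder address) and your own API key, the upload_url would look like http://192.168.1.10:9001/people-detection-general/7?api_key=your_key_here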
Enable TensorRT
By default, the Roboflow inference server uses the CUDA runtime. However, if you want to switch to the TensorRT runtime to increase inference speed, you can add the following line to the file "inference/docker/dockerfiles/Dockerfile.onnx.jetson.5.1.1" before building the Docker image
ENV ONNXRUNTIME_EXECUTION_PROVIDERS=TensorrtExecutionProvider
Learn more
Roboflow offers very detailed and comprehensive documentation, so it is highly recommended to check it out here.
Tech Support & Product Discussion
Thank you for choosing our products! We are here to provide you with various kinds of support to ensure that your experience with our products is as smooth as possible. We offer several communication channels to cater to different preferences and needs.