Jetson Containers - Building TensorFlow Object Detection Samples
Prologue
This post is part of a series covering the NVIDIA Jetson platform. It may help to take a peek through the other posts beforehand.
- Introduction
- Building the Samples
- Maximizing Jetson Nano Dev Kit Storage
- Pushing Images to Devices
- Building DeepStream Images
Introduction
Installing TensorFlow for Jetson devices seems very straightforward to start. The Jetson Zoo lists the packages needed to install TensorFlow and you’re off to go. This works fine if you you install and run everything on the host. If you want to run TensorFlow in a container, then we need to dig deeper. Before going into details, we should look at the image sizes so that there is no shock later.
Installing TensorFlow in a container requires a ton of space. Depending on what dependencies you bring on, your final base images will be 2.76 GB
to 7.76 GB
. The compiled TensorFlow .so
file is massive. The TensorFlow built by NVIDIA is linked against cublas
, cudart
, cufft
, curand
, cusolver
, cusparse
, cuDNN
, and TensorRT
. If you don’t leverage cuDNN
or TensorRT
, that can save gigabytes off of your images. A little visual aid might help:
+-------------------------------------------------------------------+
| TensorFlow (1.0 GiB + all of its dependencies) |
+-------------------------------------------------------------------+
+-------------+-------------+-------------+------------+------------+
| L4T Devel |
| (5.67 GiB) ______________________________________________________|
| | cuDNN | TensorRT | Dev | Extra |
| | (720 MiB) | (1.1 GiB) | Versions | Depends |
| | | | of | |
| | | | Libs | |
| | | | | |
+-------------------------------------------------------------------+
+-------------------------------------------------------------------+
| L4T Release (cuda libraries) (1.21 GiB) |
| |
+-------------------------------------------------------------------+
+-------------------------------------------------------------------+
| L4T Base (cuda runtime and profiling apis) (493 MiB) |
+-------------------------------------------------------------------+
+-------------------------------------------------------------------+
| L4T Driver Pack (minimum nvidia libs) (483 MiB) |
+-------------------------------------------------------------------+
+-------------------------------------------------------------------+
| Ubuntu 18.04 (80.4MiB) |
+-------------------------------------------------------------------+
TensorFlow Base
We can build our images from base
, release
, or devel
images layering in what we need. Given that we know the exact CUDA libraries we need, we can skip over release
and focus on base
and devel
images. This is a little murky and depends on the definition of dependency
and required
you want to use. TensorFlow is compiled against many things, but we don’t have to use those features. Dockerfiles for TensorFlow along with the Object Detection sample can be found in the jetson-containers repository. The SHA
s used in this post are for the Jetson Nano dev kit, copied directly from the JetPack 4.2.1 devel
image Dockerfile. You can build these images out for your device by copying the appropriate lines.
TLDR for images sizes built in the sections below:
l4t 32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-base-min 2.76GB
l4t 32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-base-min-full 4.41GB
l4t 32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-devel 6.76GB
Devel
If we start off from the devel
images for our device, the TensorFlow installation is pretty straightforward. I’ve moved the h5py
installation into the apt
portion as it is set at the correct version range (as compared to the Jetson Zoo).
ARG IMAGE_NAME
ARG TAG
FROM ${IMAGE_NAME}:${TAG}-devel as tensorflow-base
RUN apt-get update && apt-get install -y \
build-essential \
libhdf5-dev \
libhdf5-serial-dev \
python3-dev \
python3-h5py \
python3-pip \
python3-setuptools \
&& \
python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir --upgrade setuptools && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN python3 -m pip install --no-cache-dir -U numpy grpcio absl-py py-cpuinfo psutil portpicker grpcio six mock requests gast astor termcolor
# Install TensorFlow
#RUN python3 -m pip install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu
# can browse from https://developer.download.nvidia.com/compute/redist/jp/
#RUN python3 -m pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==$TF_VERSION+nv$NV_VERSION
RUN python3 -m pip install --no-cache-dir --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==1.14.0+nv19.7
This will give us an image we can use as a base for build agents and quickly testing, but we are sitting at ~6.76 GB
which is a lot of space for something like a Jetson Nano. With an image that large, you’ll be hard pressed to be able to issue an update of the base image without running out of eMMC storage and you’ll have to have additional storage configured for docker images. If you’re using a TX2 or Xavier, you will probably be fine.
Minimum Full
Like the devel
sample above, setting up the base image for TensorFlow, which has all required NVIDIA libraries, requires a considerable amount of the devel
image, but we can still make some nice gains. We need to install cublas-dev
and cudart-dev
for TensorRT
- ¯\_(ツ)_/¯
- as well as cublas
, cudart
, cufft
, curand
, cusolver
, cusparse
as normal CUDA libs. Then we’ll need to set up cuDNN
, TensorRT
, Graph Surgeon
, UFF Converter
, OpenCV
(with dependencies), and finally TensorFlow.
ARG DEPENDENCIES_IMAGE
ARG IMAGE_NAME
ARG TAG
FROM ${DEPENDENCIES_IMAGE} as dependencies
ARG IMAGE_NAME
ARG TAG
FROM ${IMAGE_NAME}:${TAG}-base as tensorflow-base
# CUDA Toolkit for L4T
# TensorRT: cuda-cublas-dev-10-0 cuda-cudart-dev-10-0
# TensorFlow: cuda-cublas-10-0 cuda-cudart-10-0 cuda-cufft-10-0
# cuda-curand-10-0 cuda-cusolver-10-0 cuda-cusparse-10-0
ARG CUDA_TOOLKIT_PKG="cuda-repo-l4t-10-0-local-${CUDA_PKG_VERSION}_arm64.deb"
COPY --from=dependencies /data/${CUDA_TOOLKIT_PKG} ${CUDA_TOOLKIT_PKG}
RUN echo "0e12b2f53c7cbe4233c2da73f7d8e6b4 ${CUDA_TOOLKIT_PKG}" | md5sum -c - && \
dpkg --force-all -i ${CUDA_TOOLKIT_PKG} && \
rm ${CUDA_TOOLKIT_PKG} && \
apt-get update && \
apt-get install -y --allow-downgrades \
cuda-cublas-dev-10-0 \
cuda-cudart-dev-10-0 \
cuda-cublas-10-0 \
cuda-cudart-10-0 \
cuda-cufft-10-0 \
cuda-curand-10-0 \
cuda-cusolver-10-0 \
cuda-cusparse-10-0 \
&& \
dpkg --purge cuda-repo-l4t-10-0-local-10.0.326 \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# NVIDIA CUDA Deep Neural Network library (cuDNN)
ENV CUDNN_VERSION 7.5.0.56
ENV CUDNN_PKG_VERSION=${CUDA_VERSION}-1
LABEL com.nvidia.cudnn.version="${CUDNN_VERSION}"
COPY --from=dependencies /data/libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb
RUN echo "9f30aa86e505a3b83b127ed7a51309a1 libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb" | md5sum -c - && \
dpkg -i libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb && \
rm libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb
COPY --from=dependencies /data/libcudnn7-dev_$CUDNN_VERSION-1+cuda10.0_arm64.deb libcudnn7-dev_$CUDNN_VERSION-1+cuda10.0_arm64.deb
RUN echo "a010637c80859b2143ef24461ee2ef97 libcudnn7-dev_$CUDNN_VERSION-1+cuda10.0_arm64.deb" | md5sum -c - && \
dpkg -i libcudnn7-dev_$CUDNN_VERSION-1+cuda10.0_arm64.deb && \
rm libcudnn7-dev_$CUDNN_VERSION-1+cuda10.0_arm64.deb
# NVIDIA TensorRT
ENV LIBINFER_VERSION 5.1.6
ENV LIBINFER_PKG_VERSION=${LIBINFER_VERSION}-1
LABEL com.nvidia.libinfer.version="${LIBINFER_VERSION}"
ENV LIBINFER_PKG libnvinfer5_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb
ENV LIBINFER_DEV_PKG libnvinfer-dev_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb
ENV LIBINFER_SAMPLES_PKG libnvinfer-samples_${LIBINFER_PKG_VERSION}+cuda10.0_all.deb
COPY --from=dependencies /data/${LIBINFER_PKG} ${LIBINFER_PKG}
RUN echo "dca1e2dadeae2186b57a11861fac7652 ${LIBINFER_PKG}" | md5sum -c - && \
dpkg -i ${LIBINFER_PKG} && \
rm ${LIBINFER_PKG}
COPY --from=dependencies /data/${LIBINFER_DEV_PKG} ${LIBINFER_DEV_PKG}
RUN echo "0e0c0c6d427847d5994f04fbce0401d2 ${LIBINFER_DEV_PKG}" | md5sum -c - && \
dpkg -i ${LIBINFER_DEV_PKG} && \
rm ${LIBINFER_DEV_PKG}
COPY --from=dependencies /data/${LIBINFER_SAMPLES_PKG} ${LIBINFER_SAMPLES_PKG}
RUN echo "e8f021ea1fad99d99f0f551d7ea3146a ${LIBINFER_SAMPLES_PKG}" | md5sum -c - && \
dpkg -i ${LIBINFER_SAMPLES_PKG} && \
rm ${LIBINFER_SAMPLES_PKG}
ENV TENSORRT_VERSION 5.1.6.1
ENV TENSORRT_PKG_VERSION=${TENSORRT_VERSION}-1
LABEL com.nvidia.tensorrt.version="${TENSORRT_VERSION}"
ENV TENSORRT_PKG tensorrt_${TENSORRT_PKG_VERSION}+cuda10.0_arm64.deb
COPY --from=dependencies /data/${TENSORRT_PKG} ${TENSORRT_PKG}
RUN echo "66e6df17b7a92d32dd3465bdfca9fc8d ${TENSORRT_PKG}" | md5sum -c - && \
dpkg -i ${TENSORRT_PKG} && \
rm ${TENSORRT_PKG}
# Graph Surgeon
COPY --from=dependencies /data/graphsurgeon-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb graphsurgeon-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb
RUN echo "5729cc195d365335991c58abd75e0c99 graphsurgeon-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb" | md5sum -c - && \
dpkg -i graphsurgeon-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb && \
rm graphsurgeon-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb
# UFF Converter
COPY --from=dependencies /data/uff-converter-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb uff-converter-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb
RUN echo "b6310b19820a8b844d36dc597d2bf835 uff-converter-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb" | md5sum -c - && \
dpkg -i uff-converter-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb && \
rm uff-converter-tf_${LIBINFER_PKG_VERSION}+cuda10.0_arm64.deb
# Install dependencies for OpenCV
RUN apt-get update && apt-get install -y --no-install-recommends \
libavcodec-extra57 \
libavformat57 \
libavutil55 \
libcairo2 \
libgtk2.0-0 \
libswscale4 \
libtbb2 \
libtbb-dev \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
## Additional OpenCV dependencies usually installed by the CUDA Toolkit
RUN apt-get update && \
apt-get install -y \
libgstreamer1.0-0 \
libgstreamer-plugins-base1.0-0 \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Open CV 3.3.1
ENV OPENCV_VERSION 3.3.1
ENV OPENCV_PKG_VERSION=${OPENCV_VERSION}-2-g31ccdfe11
COPY --from=dependencies /data/libopencv_${OPENCV_PKG_VERSION}_arm64.deb libopencv_${OPENCV_PKG_VERSION}_arm64.deb
RUN echo "dd5b571c08a0098141203daec2ea1acc libopencv_${OPENCV_PKG_VERSION}_arm64.deb" | md5sum -c - && \
dpkg -i libopencv_${OPENCV_PKG_VERSION}_arm64.deb && \
rm libopencv_${OPENCV_PKG_VERSION}_arm64.deb
## Open CV python binding
COPY --from=dependencies /data/libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb
RUN echo "35776ce159afa78a0fe727d4a3c5b6fa libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb" | md5sum -c - && \
dpkg -i libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb && \
rm libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb
The rest follows just like from the devel
image to install TensorFlow:
RUN apt-get update && apt-get install -y \
build-essential \
libhdf5-dev \
libhdf5-serial-dev \
python3-dev \
python3-h5py \
python3-pip \
python3-setuptools \
&& \
python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir --upgrade setuptools && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN python3 -m pip install --no-cache-dir -U numpy grpcio absl-py py-cpuinfo psutil portpicker grpcio six mock requests gast astor termcolor
# Install TensorFlow
#RUN python3 -m pip install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu
# can browse from https://developer.download.nvidia.com/compute/redist/jp/
#RUN python3 -m pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==$TF_VERSION+nv$NV_VERSION
RUN python3 -m pip install --no-cache-dir --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==1.14.0+nv19.7
Minimum Usable
If you aren’t using TensorRT
, we can drop almost 1.7GB
from the image including all of its dependencies.
Note: In this example, we installed cuDNN
, but we can skip this if you don’t need the APIs that leverage it. You will get a warning when running your application that it can’t load the cuDNN
’s .so
file, but you’re app should still be runnable. This will drop the image down to 2.37GB
.
ARG DEPENDENCIES_IMAGE
ARG IMAGE_NAME
ARG TAG
FROM ${DEPENDENCIES_IMAGE} as dependencies
ARG IMAGE_NAME
ARG TAG
FROM ${IMAGE_NAME}:${TAG}-base as tensorflow-base
# CUDA Toolkit for L4T
# TensorFlow: cuda-cublas-10-0 cuda-cudart-10-0 cuda-cufft-10-0
# cuda-curand-10-0 cuda-cusolver-10-0 cuda-cusparse-10-0
ARG CUDA_TOOLKIT_PKG="cuda-repo-l4t-10-0-local-${CUDA_PKG_VERSION}_arm64.deb"
COPY --from=dependencies /data/${CUDA_TOOLKIT_PKG} ${CUDA_TOOLKIT_PKG}
RUN echo "0e12b2f53c7cbe4233c2da73f7d8e6b4 ${CUDA_TOOLKIT_PKG}" | md5sum -c - && \
dpkg --force-all -i ${CUDA_TOOLKIT_PKG} && \
rm ${CUDA_TOOLKIT_PKG} && \
apt-get update && \
apt-get install -y --allow-downgrades \
cuda-cublas-10-0 \
cuda-cudart-10-0 \
cuda-cufft-10-0 \
cuda-curand-10-0 \
cuda-cusolver-10-0 \
cuda-cusparse-10-0 \
&& \
dpkg --purge cuda-repo-l4t-10-0-local-10.0.326 \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# NVIDIA CUDA Deep Neural Network library (cuDNN)
ENV CUDNN_VERSION 7.5.0.56
ENV CUDNN_PKG_VERSION=${CUDA_VERSION}-1
LABEL com.nvidia.cudnn.version="${CUDNN_VERSION}"
COPY --from=dependencies /data/libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb
RUN echo "9f30aa86e505a3b83b127ed7a51309a1 libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb" | md5sum -c - && \
dpkg -i libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb && \
rm libcudnn7_$CUDNN_VERSION-1+cuda10.0_arm64.deb
# Install dependencies for OpenCV
RUN apt-get update && apt-get install -y --no-install-recommends \
libavcodec-extra57 \
libavformat57 \
libavutil55 \
libcairo2 \
libgtk2.0-0 \
libswscale4 \
libtbb2 \
libtbb-dev \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
## Additional OpenCV dependencies usually installed by the CUDA Toolkit
RUN apt-get update && \
apt-get install -y \
libgstreamer1.0-0 \
libgstreamer-plugins-base1.0-0 \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Open CV 3.3.1
ENV OPENCV_VERSION 3.3.1
ENV OPENCV_PKG_VERSION=${OPENCV_VERSION}-2-g31ccdfe11
COPY --from=dependencies /data/libopencv_${OPENCV_PKG_VERSION}_arm64.deb libopencv_${OPENCV_PKG_VERSION}_arm64.deb
RUN echo "dd5b571c08a0098141203daec2ea1acc libopencv_${OPENCV_PKG_VERSION}_arm64.deb" | md5sum -c - && \
dpkg -i libopencv_${OPENCV_PKG_VERSION}_arm64.deb && \
rm libopencv_${OPENCV_PKG_VERSION}_arm64.deb
## Open CV python binding
COPY --from=dependencies /data/libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb
RUN echo "35776ce159afa78a0fe727d4a3c5b6fa libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb" | md5sum -c - && \
dpkg -i libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb && \
rm libopencv-python_${OPENCV_PKG_VERSION}_arm64.deb
The rest follows just like from the devel
image to install TensorFlow:
RUN apt-get update && apt-get install -y \
build-essential \
libhdf5-dev \
libhdf5-serial-dev \
python3-dev \
python3-h5py \
python3-pip \
python3-setuptools \
&& \
python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir --upgrade setuptools && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN python3 -m pip install --no-cache-dir -U numpy grpcio absl-py py-cpuinfo psutil portpicker grpcio six mock requests gast astor termcolor
# Install TensorFlow
#RUN python3 -m pip install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu
# can browse from https://developer.download.nvidia.com/compute/redist/jp/
#RUN python3 -m pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==$TF_VERSION+nv$NV_VERSION
RUN python3 -m pip install --no-cache-dir --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==1.14.0+nv19.7
TensorFlow Object Detection Libraries
You might want to give TensorFlow a spin on your Jetson. We can build off of the base images we’ve created here to layer in the TensorFlow Object Detection APIs. To do this we’ll need to:
- Package the APIs as wheels to be installed later. (I’d rather not clone the whole repo and run from there)
- Install the wheels and their dependencies into our app base.
- Install our test app and its dependencies.
Adding this sample application will increase your image size by ~1.48GB
and the model will be downloaded on application run. You can decrease startup time by bundling the model as a layer.
Object Detection Wheels
FROM tensorflow-base as objectdetection-builder
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
protobuf-compiler \
&& \
rm -rf /var/lib/apt/lists/*
# Clone the TensorFlow Models Repository
# the release branches usually don't contain the research folder, so we have to use master.
ARG TF_MODELS_VERSION=master
RUN git clone --depth 1 https://github.com/tensorflow/models.git -b ${TF_MODELS_VERSION}
WORKDIR /models/research
# Compile the Protos
RUN protoc object_detection/protos/*.proto --python_out=.
# Build the Wheels
RUN python3 setup.py build && \
python3 setup.py bdist_wheel && \
(cd slim && python3 setup.py bdist_wheel)
App Base
Now we get to layer in the wheels, data, and app dependencies so that the last bit is simple.
FROM tensorflow-base as app-base
RUN apt-get update && apt-get install -y \
libfreetype6-dev \
pkg-config \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
COPY --from=objectdetection-builder /models/research/dist/object_detection-0.1-py3-none-any.whl .
COPY --from=objectdetection-builder /models/research/slim/dist/slim-0.1-py3-none-any.whl .
RUN python3 -m pip install --no-cache-dir object_detection-0.1-py3-none-any.whl && \
python3 -m pip install --no-cache-dir slim-0.1-py3-none-any.whl && \
rm object_detection-0.1-py3-none-any.whl && \
rm slim-0.1-py3-none-any.whl
COPY --from=objectdetection-builder /models/research/object_detection/data /data
# App dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libeigen3-dev \
libglew-dev \
libtiff5-dev \
libjpeg-dev \
libpng-dev \
libpostproc-dev \
libswscale-dev \
libtbb-dev \
libgtk2.0-dev \
libxvidcore-dev \
libx264-dev \
zlib1g-dev \
libxml2-dev \
libxslt1-dev \
libcanberra-gtk-module \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
The App
This part should be nice and small (unless we drop something big in requirements.txt
) and this is where our app releases should be living for most changes in our deployments.
FROM app-base
RUN mkdir app
WORKDIR /app
COPY requirements.txt ./
RUN python3 -m pip install --no-cache-dir --user -r requirements.txt
COPY app.py ./
RUN echo "#!/bin/bash" >> entrypoint.sh && \
echo "python3 app.py \$*" >> entrypoint.sh && \
chmod +x entrypoint.sh
ENTRYPOINT ["sh", "-c", "./entrypoint.sh $*", "--"]
The application itself, we’re just going to reuse most of the object detection notebook hooked up to a USB camera
# Code adapted from:
# https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
# License notice for TensorFlow Models and TensorFlow Object Detection
# -------------------------------
# Copyright 2016 The TensorFlow Authors. All rights reserved.
# Licensed under the Apache License, Version 2.0.
#
# Available at
# https://github.com/tensorflow/models/blob/master/LICENSE
from object_detection.utils import visualization_utils as vis_util
from object_detection.utils import label_map_util
from object_detection.utils import ops as utils_ops
import object_detection
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from PIL import Image
import argparse
import cv2
import time
if StrictVersion(tf.__version__) < StrictVersion('1.12.0'):
raise ImportError(
'Please upgrade your TensorFlow installation to v1.12.*.')
# What model to download, pick one or find others :)
#MODEL_NAME = 'ssd_mobilenet_v1_coco_2018_01_28'
MODEL_NAME = 'ssd_mobilenet_v1_ppn_shared_box_predictor_300x300_coco14_sync_2018_07_03'
#MODEL_NAME = 'ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03'
#MODEL_NAME = 'ssdlite_mobilenet_v2_coco_2018_05_09'
#MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('/data', 'mscoco_label_map.pbtxt')
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
file_name = os.path.basename(file.name)
if 'frozen_inference_graph.pb' in file_name:
tar_file.extract(file, os.getcwd())
detection_graph = tf.Graph()
with detection_graph.as_default():
od_graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
tf.import_graph_def(od_graph_def, name='')
category_index = label_map_util.create_category_index_from_labelmap(
PATH_TO_LABELS, use_display_name=True)
def load_image_into_numpy_array(image):
# (im_width, im_height) = image.size
height, width, channels = image.shape
image_np = np.array(np.asarray(image)).reshape(
(height, width, channels)).astype(np.uint8)
return image_np
def run_inferences(video_capture, graph):
with graph.as_default():
with tf.compat.v1.Session() as sess:
# Get handles to input and output tensors
ops = tf.compat.v1.get_default_graph().get_operations()
all_tensor_names = {
output.name for op in ops for output in op.outputs}
tensor_dict = {}
for key in [
'num_detections', 'detection_boxes', 'detection_scores',
'detection_classes', 'detection_masks'
]:
tensor_name = key + ':0'
if tensor_name in all_tensor_names:
tensor_dict[key] = tf.compat.v1.get_default_graph().get_tensor_by_name(tensor_name)
if 'detection_masks' in tensor_dict:
# The following processing is only for single image
detection_boxes = tf.squeeze(
tensor_dict['detection_boxes'], [0])
detection_masks = tf.squeeze(
tensor_dict['detection_masks'], [0])
# Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
real_num_detection = tf.cast(
tensor_dict['num_detections'][0], tf.int32)
detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
detection_masks, detection_boxes, image.shape[1], image.shape[2])
detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
# Follow the convention by adding back the batch dimension
tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
if video_capture.isOpened():
windowName = "Jetson TensorFlow Demo"
width = 1280
height = 720
cv2.namedWindow(windowName, cv2.WINDOW_NORMAL)
cv2.resizeWindow(windowName, width, height)
cv2.moveWindow(windowName, 0, 0)
cv2.setWindowTitle(windowName, "Jetson TensorFlow Demo")
font = cv2.FONT_HERSHEY_PLAIN
showFullScreen = False
while cv2.getWindowProperty(windowName, 0) >= 0:
ret_val, frame = video_capture.read()
# the array based representation of the image will be used later in order to prepare the
# result image with boxes and labels on it.
image_np = load_image_into_numpy_array(frame)
# Expand dimensions since the model expects images to have shape: [1, None, None, 3]
image_np_expanded = np.expand_dims(image_np, axis=0)
# Actual detection.
output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image_np_expanded})
# all outputs are float32 numpy arrays, so convert types as appropriate
output_dict['num_detections'] = int(output_dict['num_detections'][0])
output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
output_dict['detection_scores'] = output_dict['detection_scores'][0]
if 'detection_masks' in output_dict:
output_dict['detection_masks'] = output_dict['detection_masks'][0]
# Visualization of the results of a detection.
vis_util.visualize_boxes_and_labels_on_image_array(
image_np,
output_dict['detection_boxes'],
output_dict['detection_classes'],
output_dict['detection_scores'],
category_index,
instance_masks=output_dict.get('detection_masks'),
use_normalized_coordinates=True,
line_thickness=8)
displayBuf = cv2.resize(image_np, (width, height))
cv2.imshow(windowName, displayBuf)
key = cv2.waitKey(10)
if key == -1:
continue
elif key == 27:
break
elif key == ord('f'):
if showFullScreen == False:
cv2.setWindowProperty(
windowName, cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
else:
cv2.setWindowProperty(
windowName, cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_NORMAL)
showFullScreen = not showFullScreen
else:
print("Failed to open the camera.")
def parse_cli_args():
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group()
group.add_argument("--capture_index", dest="capture_index",
help="Video device # of USB webcam (/dev/video?) [0]",
default=0, type=int)
arguments = parser.parse_args()
return arguments
if __name__ == '__main__':
arguments = parse_cli_args()
print("Called with args:")
print(arguments)
print("OpenCV version: {}".format(cv2.__version__))
print("Capture Index:", arguments.capture_index)
video_capture = cv2.VideoCapture(arguments.capture_index)
run_inferences(video_capture, detection_graph)
video_capture.release()
cv2.destroyAllWindows()
Assembling the Pieces
Assuming you’ve cloned the jetson-containers repository:
make build-32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-min
# or
make build-32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-min-full
# or
make build-32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-devel
In a terminal, on the device, lets open X11 forwarding from docker containers
~/jetson-containers$ xhost +local:docker
Now we can run the container image and see boxes and confidence scores rendered to the screen. Press f
for full screen or esc
to exit.
# only using --privileged here for brevity
user@nano-dev:~/$ docker run --rm -it --privileged --net=host -e "DISPLAY" l4t:32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-min
# or
user@nano-dev:~/$ docker run --rm -it --privileged --net=host -e "DISPLAY" l4t:32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-min-full
# or
user@nano-dev:~/$ docker run --rm -it --privileged --net=host -e "DISPLAY" l4t:32.2-nano-dev-jetpack-4.2.1-tensorflow-zoo-devel