Simplifying Tech for You


OpenCV: Object Detection Technology

Post author: Luke Barber
Post published: May 15, 2023
Post category: Python / Python Projects and Guides

A Guide to Object Detection with OpenCV

In the expansive canvas of computer vision, the ability to perceive and recognize objects is a groundbreaking capability. Welcome to the world of object detection with OpenCV, where algorithms breathe life into pixels, enabling machines to ‘see’ and understand the visual world. In this exploration, we delve into the intricate workings of object detection, from basic shapes to the identification and tracking of complex entities. Join us as we demystify the magic behind OpenCV, unveiling the unseen and opening doors to a future where machines comprehend the visual tapestry as we do.

What’s Covered:

What is Object Detection?
The SSD MobileNet v2 Convolutional Neural Network
NVidia’s Object Detection Program
Creating our own Object Detection Program
Object Detection on Images

What is Object Detection?

Object detection uses computer vision and image processing to locate and identify objects of specific classes in an image or video. Classes such as humans, vehicles, and animals are regularly monitored using object detection software. It can be used for many tasks, such as vehicle counting, people counting, face detection, image annotation, image retrieval, animal monitoring, monitoring consumer activities and habits, and even tracking an object’s movement, like a ball or a bat in a sports match. Object detection is now common across many industries. The consumer giant Amazon, for example, uses a robotic arm called “Sparrow”, which relies on computer vision to recognize and handle millions of different products without human intervention.
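
To make tasks like vehicle or people counting more concrete, here is a minimal sketch that tallies detections per class. The Detection tuple, the count_by_class helper and the sample values are all made up for illustration; in a real program the detector would supply the labels, scores and boxes.

```python
from collections import Counter
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical container for one detected object (illustration only)."""
    label: str         # class name, e.g. "person"
    confidence: float  # detector score between 0 and 1
    box: tuple         # (left, top, right, bottom) in pixels


def count_by_class(detections, min_confidence=0.5):
    """Count detections per class, ignoring low-confidence results."""
    return Counter(d.label for d in detections if d.confidence >= min_confidence)


# Made-up sample data standing in for one frame of detector output
sample = [
    Detection("person", 0.91, (10, 20, 110, 300)),
    Detection("car", 0.78, (200, 150, 420, 310)),
    Detection("person", 0.42, (300, 40, 360, 200)),  # below threshold, skipped
    Detection("car", 0.66, (430, 160, 600, 300)),
]

counts = count_by_class(sample)
print(counts["person"], counts["car"])
```

Lowering min_confidence counts more objects but also lets in more false positives, so the right value depends on the task.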

The SSD MobileNet v2 Convolutional Neural Network

I will be using the SSD MobileNet v2 neural network, which combines two components: SSD and MobileNet. SSD (Single Shot Detector) is a family of detection algorithms that predict both the class and the bounding box of each object in a single pass over the image. MobileNet v2 is a lightweight CNN (convolutional neural network) architecture that is commonly used as the backbone for classification, object detection and semantic segmentation models. It is designed to work well on low-end hardware such as mobile devices, Raspberry Pis and, in my case, the Jetson Nano.

Together, the MobileNet base works as a feature extractor for the SSD layers, which then predict a bounding box and a class for each detected object.
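
As a rough picture of the last step only, the sketch below takes a few candidate boxes, each with one score per class, keeps the best-scoring class for each box, and drops anything under a threshold. The function name, class list and scores are invented for illustration. A real SSD head does considerably more (it works with anchor boxes and removes overlapping duplicates), so this is not how the network itself is implemented.

```python
def pick_detections(candidates, class_names, threshold=0.5):
    """Keep the best-scoring class for each candidate box if it clears the threshold.

    candidates: list of (box, scores) pairs, where scores has one value per class.
    """
    results = []
    for box, scores in candidates:
        # Index of the highest class score for this box
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            results.append((class_names[best], scores[best], box))
    return results


# Made-up class list and scores, purely for illustration
class_names = ["person", "dog", "car"]
candidates = [
    ((12, 30, 140, 310), [0.88, 0.05, 0.07]),    # confident "person"
    ((200, 90, 330, 240), [0.10, 0.47, 0.43]),   # nothing clears 0.5, dropped
    ((350, 120, 600, 300), [0.02, 0.03, 0.95]),  # confident "car"
]

for label, score, box in pick_detections(candidates, class_names):
    print(label, score, box)
```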

NVidia’s Object Detection Program

Below is NVidia’s object detection program. It is very effective and needs only a handful of lines of code. Because this is a post about OpenCV, my aim is to replicate what Dusty from NVidia has done, but with OpenCV handling the camera capture and the display. To run Dusty’s code, you will need to install the Jetson Inference library for Python.

Python Code:

import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource('/dev/video0')
display = jetson.utils.videoOutput()

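# Run until the display window is closed
while display.IsStreaming():
  img = camera.Capture()
  if img is None:  # capture timed out, try again
    continue
  detections = net.Detect(img)
  print(detections)
  display.Render(img)
  display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))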

Creating our own Object Detection Program

I tried to replicate NVidia’s object detection with OpenCV as closely as I could. While the two look very similar, mine would only display the data and bounding box for one detection at a time. This was mainly due to the way I used cv2.FILLED to draw the bounding box filled with color. I won’t share the code for the replica yet, as it wasn’t as successful as I had hoped.
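
One possible way to keep a color fill without hiding what is underneath is to blend the color into the box region instead of painting over it. The sketch below is not the code from the replica and has not been run on the Jetson; blend_box is a made-up helper that does the blending with plain NumPy on a small test image. OpenCV’s cv2.addWeighted can produce a similar effect.

```python
import numpy as np


def blend_box(frame, left, top, right, bottom, color, alpha=0.3):
    """Tint the region inside a box with `color` while keeping the pixels visible.

    frame is an HxWx3 uint8 image (BGR, as OpenCV uses); it is modified in place.
    alpha=0 leaves the image unchanged, alpha=1 is a solid fill.
    """
    region = frame[top:bottom, left:right].astype(np.float32)
    tint = np.array(color, dtype=np.float32)
    blended = (1.0 - alpha) * region + alpha * tint
    frame[top:bottom, left:right] = blended.round().astype(np.uint8)
    return frame


# Small grey test image standing in for a camera frame
frame = np.full((120, 160, 3), 100, dtype=np.uint8)
blend_box(frame, 20, 10, 80, 60, color=(100, 255, 100), alpha=0.5)
print(frame[30, 40], frame[100, 150])  # a pixel inside the box, then one outside
```

Because each box only tints its own region, overlapping detections stay visible instead of covering one another.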

The Final Program

My code is full of comments describing each step, so I won’t go into too much detail as we proceed through this post. If you have any questions, please leave them in the comments section and I will answer them as soon as possible.

To achieve multiple detections at a time I had to remove the color fill from the bounding boxes. I also changed the bounding box color depending on the confidence values.

Python Code:

import cv2
import jetson_inference
import jetson_utils
import time
import numpy as np

#Create Inference Object
net = jetson_inference.detectNet("ssd-mobilenet-v2",threshold=.5)

#Set Desired Font for Text
font=cv2.FONT_HERSHEY_SIMPLEX

#Set up Camera
width=640
height=480
camera=cv2.VideoCapture(0)
camera.set(cv2.CAP_PROP_FRAME_WIDTH,width)
camera.set(cv2.CAP_PROP_FRAME_HEIGHT,height)

#Create Objects for FPS Results
fpsTime=time.time()
fpsAVG=0

while True:
  ret,frame=camera.read()

  #Stop if the Camera did not Return a Frame
  if not ret:
    break

  #Grab Frame Height and Width
  height=frame.shape[0]
  width=frame.shape[1]

  #Convert Frame to RGBA so it can be Processed by Cuda 
  img=cv2.cvtColor(frame, cv2.COLOR_BGR2RGBA).astype(np.float32)
  img=jetson_utils.cudaFromNumpy(img)

  #Run Detections on Captured Frame 
  detections=net.Detect(img,width,height)

  #Create Loop to Grab Detection Data
  for obj in detections:
    print(obj)
    ID=obj.ClassID
    Confidence=obj.Confidence
    top=int(obj.Top)
    bottom=int(obj.Bottom)
    left=int(obj.Left)
    right=int(obj.Right)
    item=net.GetClassDesc(ID)

    #Convert the Confidence to a Whole Percentage
    myConf2=int(round(Confidence*100))

    #Create Bounding Box Color Based On Confidence (OpenCV Colors are BGR)
    if myConf2>=75: #Green
      objColor=(100,255,100)
    elif myConf2>=55: #Yellow
      objColor=(0,180,255)
    else: #Red
      objColor=(100,100,255)

    #Add Bounding Box and Text to Detected Object
    cv2.rectangle(frame, (left,top), (right,bottom), (objColor), 2)
    cv2.putText(frame, str(myConf2)+'%'+' '+ item, (left,top+20), font, .5, (255,255,255), 2)

  #Add FPS to Frame
  dt=time.time()-fpsTime
  fpsTime=time.time()
  fps=1/dt
  fpsAVG=.9*fpsAVG+.1*fps
  cv2.putText(frame, str(round(fpsAVG,1))+'fps',(10,20),font,.5,(0,0,0),2)

  #Display Final Frame 
  cv2.imshow('Object Detection',frame)

  if cv2.waitKey(1)==ord('q'):
    break

camera.release()
cv2.destroyAllWindows()
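
The FPS figure in the loop above is an exponential moving average: each new reading contributes 10% and the previous average 90%, which stops the number from jumping around. The same idea is pulled out into a small stand-alone class below so it can be tried without a camera. FpsSmoother is a made-up name, and the guard against a zero time step is an addition that the original loop does not have.

```python
class FpsSmoother:
    """Exponential moving average of the frame rate."""

    def __init__(self, weight=0.1):
        self.weight = weight  # share given to the newest reading
        self.average = 0.0

    def update(self, dt):
        """Fold in one frame that took dt seconds and return the new average."""
        if dt <= 0:  # guard against a zero time step
            return self.average
        fps = 1.0 / dt
        self.average = (1.0 - self.weight) * self.average + self.weight * fps
        return self.average


smoother = FpsSmoother()
for _ in range(50):
    value = smoother.update(1 / 30)  # pretend every frame takes 1/30 s
print(round(value, 1))
```

Because the average starts at zero, the displayed value takes a few dozen frames to climb towards the true frame rate.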

As you can see from the images above, multiple detections were successful. A detection with a confidence of 75% or higher gets a green bounding box, one from 55% up to 75% gets a yellow box, and anything below 55% gets a red box. I was pleased with the end result, though I do wish I had figured out how to keep the color fill in the bounding boxes. I will return to this and make it work.
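
The color rule can also be written as a small stand-alone function, which makes the thresholds easy to check without a camera. confidence_color is a hypothetical helper name; the thresholds and BGR values are the ones used in the program above.

```python
def confidence_color(confidence):
    """Return a BGR color for a confidence score between 0 and 1."""
    percent = int(round(confidence * 100))
    if percent >= 75:
        return (100, 255, 100)  # green
    if percent >= 55:
        return (0, 180, 255)    # yellow
    return (100, 100, 255)      # red


for score in (0.92, 0.75, 0.60, 0.50):
    print(score, confidence_color(score))
```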

Object Detection on Images

The code for detecting objects in still images is slightly different from the live video version, mainly because of the extra color conversions needed to display the detections.
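
As a rough picture of what those conversions do: OpenCV stores pixels as 8-bit values in BGR order, while the program hands the Jetson library a float32 image in RGBA order and later converts it back. The two NumPy functions below are illustrative stand-ins for the cv2.cvtColor calls, written only to show how the channels move; the program itself keeps using cv2.cvtColor.

```python
import numpy as np


def bgr_to_rgba_float(bgr):
    """BGR uint8 image -> RGBA float32 image with a fully opaque alpha channel."""
    rgb = bgr[..., ::-1]  # reverse B,G,R to R,G,B
    alpha = np.full(bgr.shape[:2] + (1,), 255, dtype=bgr.dtype)
    return np.concatenate([rgb, alpha], axis=-1).astype(np.float32)


def rgba_float_to_bgr(rgba):
    """RGBA float32 image -> BGR uint8 image (the alpha channel is dropped)."""
    rgb = rgba[..., :3]  # discard alpha
    return np.ascontiguousarray(rgb[..., ::-1]).astype(np.uint8)


# One made-up pixel: pure blue in BGR order
pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgba = bgr_to_rgba_float(pixel)
print(rgba[0, 0])                     # blue now sits in the third slot, alpha last
print(rgba_float_to_bgr(rgba)[0, 0])  # back to the original BGR values
```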

Python Code:

import cv2
import jetson_inference
import jetson_utils
import numpy as np

#Create Inference Object
net = jetson_inference.detectNet("ssd-mobilenet-v2",threshold=.5)

#Set Image Window Parameters
width=800
height=650
dim=width,height

#Location of images
imgDIR='/home/meganano/Desktop/Python-AI/ObjectDetection/objectImages/'

#Start with first image
image=1

while True:
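  #read image, stopping if the file is missing or unreadable
  img=cv2.imread(imgDIR+str(image)+'.jpg')
  if img is None:
    print('Could not read image '+str(image))
    break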

  #resize image
  resImg=cv2.resize(img,dim)

  #Convert image to RGBA so it can be processed by Cuda 
  imgRGB=cv2.cvtColor(resImg, cv2.COLOR_BGR2RGBA).astype(np.float32)
  imgRGB=jetson_utils.cudaFromNumpy(imgRGB)

  #Run detections on image 
  detections=net.Detect(imgRGB, width, height)

  #Wait for the GPU to finish, then convert image from Cuda back to Numpy array 
  jetson_utils.cudaDeviceSynchronize()
  imgRGB=jetson_utils.cudaToNumpy(imgRGB)

  #Convert RGBA image to RGB
  imgRGB=cv2.cvtColor(imgRGB, cv2.COLOR_RGBA2RGB).astype(np.uint8)

  #Convert RGB image back to BGR
  imgRGB=cv2.cvtColor(imgRGB, cv2.COLOR_RGB2BGR)

  #Display image
  cv2.imshow('image', imgRGB)

  #Add Pause, press any key to continue
  cv2.waitKey(0)

  #Next image control
  image+=1

  #Exit program if last image has been displayed
  if image>8: 
    print('Program Exited Cleanly')
    break

cv2.destroyAllWindows()
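
The loop above assumes exactly eight images named 1.jpg to 8.jpg. If the number of images changes, one option is to collect the numbered files from the folder instead of counting by hand. The sketch below is a hypothetical helper using only the standard library, and "objectImages" is a placeholder folder name rather than a real path.

```python
from pathlib import Path


def numbered_images(folder, suffix=".jpg"):
    """Return image paths named 1.jpg, 2.jpg, ... in numeric order."""
    paths = [p for p in Path(folder).glob("*" + suffix) if p.stem.isdigit()]
    return sorted(paths, key=lambda p: int(p.stem))


# "objectImages" is a placeholder folder name; point this at your own images
for path in numbered_images("objectImages"):
    print(path)
```

Sorting by the numeric value of the file name keeps 10.jpg after 9.jpg, which a plain alphabetical sort would not.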

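To my surprise, I didn’t need to add the bounding boxes and confidence values myself, as they appeared without my input. This is most likely because detectNet’s Detect() call draws an overlay (boxes, labels and confidence values) onto the image it is given by default, rather than anything the SSD side of the network does on its own. In the video program the overlay went unnoticed because I displayed the original OpenCV frame, not the CUDA image that Detect() had drawn on. Maybe this is the key to making the perfect replica of Dusty’s object detection program. I will hopefully return soon with an updated program.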

Conclusion

As we conclude our expedition through the fascinating landscape of object detection with OpenCV, it’s evident that the ability to recognize and understand visual elements is a cornerstone of modern technology. From surveillance systems to autonomous vehicles, the applications are as diverse as the objects themselves. Armed with the insights gained from this journey, you now hold the key to unlocking the potential of object detection with OpenCV. As you venture into your own projects, remember that every detected object tells a story, and OpenCV empowers you to reveal and interpret these narratives.

In the next installment of our OpenCV for Beginners guide, we will be learning about color detection.

That’s All Folks!

You can find all of our OpenCV guides here: OpenCV for Beginners

Tags: Object Detection, OpenCV, Python

Luke Barber

Hey there! I’m Luke, a tech enthusiast simplifying Arduino, Python, Linux, and Ethical Hacking for beginners. With creds like CompTIA A+, Sec+, and CEH, I’m here to share my coding and tinkering adventures. Join me on Meganano for easy guides and a fun dive into tech, no genius required!
