YOLO object detection using ONNXRuntime with Ruby

kojix2 - Sep 2 '19 - - Dev Community

ONNXRuntime

Scoring engine for ML models.

YOLOv3

Download

# model
wget https://onnxzoo.blob.core.windows.net/models/opset_10/yolov3/yolov3.onnx
# coco-labels
wget https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-2014_2017.txt
# gems
gem instal numo-narray mini_magick
Enter fullscreen mode Exit fullscreen mode

yolov3.rb

require 'mini_magick'
require 'numo/narray'
require 'onnxruntime'

SFloat = Numo::SFloat

input_path = ARGV[0]
output_path = ARGV[1]

model = OnnxRuntime::Model.new('yolov3.onnx')
labels = File.readlines('coco-labels-2014_2017.txt')

# preprocessing
img = MiniMagick::Image.open(input_path)
image_size = [[img.height, img.width]]
img.combine_options do |b|
  b.resize '416x416'
  b.gravity 'center'
  b.background 'transparent'
  b.extent '416x416'
end
img_data = SFloat.cast(img.get_pixels)
img_data /= 255.0
image_data = img_data.transpose(2, 0, 1)
                     .expand_dims(0)
                     .to_a # NArray -> Array

# inference
output = model.predict(input_1: image_data, image_shape: image_size)

# postprocessing
boxes, scores, indices = output.values
results = indices.map do |idx|
  { class: idx[1],
    score: scores[idx[0]][idx[1]][idx[2]],
    box: boxes[idx[0]][idx[2]] }
end

# visualization
img = MiniMagick::Image.open(input_path)
img.colorspace 'gray'
results.each do |r|
  hue = r[:class] * 100 / 80.0
  label = labels[r[:class]]
  score = r[:score]
  # draw box
  y1, x1, y2, x2 = r[:box].map(&:round)
  img.combine_options do |c|
    c.draw        "rectangle #{x1}, #{y1}, #{x2}, #{y2}"
    c.fill        "hsla(#{hue}%, 20%, 80%, 0.25)"
    c.stroke      "hsla(#{hue}%, 70%, 60%, 1.0)"
    c.strokewidth (score * 3).to_s
  end
  # draw text
  img.combine_options do |c|
    c.draw "text #{x1}, #{y1 - 5} \"#{label}\""
    c.fill 'white'
    c.pointsize 18
  end
end

img.write output_path
Enter fullscreen mode Exit fullscreen mode

Run

ruby yolov3.rb a.jpg b.jpg
Enter fullscreen mode Exit fullscreen mode

Elephants.
elephant.png
Ok. Easy.
elephant2.png

Person, horse, dog.
dog2.jpg
dog3.jpg

Where is a dog?
It is a pity that the dog is not recognized by my computer.

dog.png

Now It's time for a human being to do something.

inu.jpg

That's it!
(inu means dog in the Japanese dialect of human language.)

I don't want to lose the human mind even in the age of AI.

Images

Thank you for reading.
Have a nice day!

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player