#+date: <2020-09-01>
#+title: IBM Watson Visual Recognition
#+description:
#+slug: visual-recognition
* What is IBM Watson?
If you've never heard of [[https://www.ibm.com/watson][Watson]], it's a suite of enterprise-ready AI
services, applications, and tooling provided by IBM. Watson offers quite a few
useful tools for data scientists and students, including the subject of today's
post: visual recognition.
If you'd like to view the official documentation for the Visual Recognition API,
visit the [[https://cloud.ibm.com/apidocs/visual-recognition/visual-recognition-v3?code=python][API Docs]].
* Prerequisites
To use Watson Visual Recognition, complete the following steps:
1. Create a free account on [[https://www.ibm.com/cloud/watson-studio][IBM Watson Studio]].
2. Add the [[https://www.ibm.com/cloud/watson-visual-recognition][Watson Visual Recognition]] service to your IBM Watson account.
3. Get your API key and URL. To do this, first go to the [[https://dataplatform.cloud.ibm.com/home2?context=cpdaas][profile dashboard]] for
   your IBM account and click on the Watson Visual Recognition service you
   created. This will be listed in the section titled *Your services*. Then
   click the *Credentials* tab and open the *Auto-generated credentials*
   dropdown. Copy your API key and URL so that you can use them in the Python
   script later.
4. *[Optional]* While not required, you can also create the Jupyter Notebook for
   this project right inside [[https://www.ibm.com/cloud/watson-studio][Watson Studio]]. Watson Studio will save your
   notebooks inside an organized project and let you use its other integrated
   products, such as storage containers, AI models, documentation, and external
   sharing.
* Calling the IBM Watson Visual Recognition API
Okay, now let's get started.
To begin, we need to install the proper Python package for IBM Watson.
#+begin_src sh
pip install --upgrade --user "ibm-watson>=4.5.0"
#+end_src
Next, we need to specify the API key, version, and URL given to us when we
created the Watson Visual Recognition service.
#+begin_src python
apikey = "<your-apikey>"
version = "2018-03-19"
url = "<your-url>"
#+end_src
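If you'd rather not hard-code credentials in the script, one alternative is to
pull them from environment variables instead. Here's a minimal sketch, assuming
you've exported hypothetical =WATSON_APIKEY= and =WATSON_URL= variables in your
shell beforehand:
#+begin_src python
import os

# Read credentials from the environment rather than hard-coding them.
# WATSON_APIKEY and WATSON_URL are assumed to be exported beforehand.
apikey = os.environ["WATSON_APIKEY"]
version = "2018-03-19"
url = os.environ["WATSON_URL"]
#+end_src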
Now, let's import the necessary libraries and authenticate our service.
#+begin_src python
import json
from ibm_watson import VisualRecognitionV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
# Authenticate with the IAM API key, then point the client at the service URL
authenticator = IAMAuthenticator(apikey)
visual_recognition = VisualRecognitionV3(
    version=version,
    authenticator=authenticator
)
visual_recognition.set_service_url(url)
#+end_src
*[Optional]* If you'd like to opt out of having IBM use your data to improve
its products, set the following header.
#+begin_src python
visual_recognition.set_default_headers({'x-watson-learning-opt-out': "true"})
#+end_src
Now we have our API all set and ready to go. For this example, I'm going to
include a =list= of photos to load as we test out the API.
#+begin_src python
data = [
    {
        "title": "Grizzly Bear",
        "url": "https://example.com/photos/image1.jpg"
    },
    {
        "title": "Nature Lake",
        "url": "https://example.com/photos/image2.jpg"
    },
    {
        "title": "Welcome Sign",
        "url": "https://example.com/photos/image3.jpg"
    },
    {
        "title": "Honey Badger",
        "url": "https://example.com/photos/image4.jpg"
    },
    {
        "title": "Grand Canyon Lizard",
        "url": "https://example.com/photos/image5.jpg"
    },
    {
        "title": "Castle",
        "url": "https://example.com/photos/image6.jpg"
    }
]
#+end_src
Now that we've set up our libraries and have the photos ready, let's create a
loop to call the API for each image. The code below loops over each image,
sends its URL to the API, and requests only results with at least 60%
confidence. The results are printed to the console, with dotted lines
separating each image. If an API error occurs, the status code and explanation
are printed instead.
#+begin_src python
from ibm_watson import ApiException
for item in data:
    try:
        # Send the image URL to the API, keeping only classes scored
        # with at least 60% confidence by IBM's built-in classifier
        classes = visual_recognition.classify(
            url=item["url"],
            images_filename=item["title"],
            threshold='0.6',
            owners=["IBM"]).get_result()

        print("-----------------------------------------------")
        print("Image Title: ", item["title"], "\n")
        print("Image URL: ", item["url"], "\n")

        # Pull the list of classes out of the nested response
        classification_results = classes["images"][0]["classifiers"][0]["classes"]
        for result in classification_results:
            print(result["class"], "(", result["score"], ")")
        print("-----------------------------------------------")
    except ApiException as ex:
        print("Method failed with status code " + str(ex.code) + ": " + ex.message)
#+end_src
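The API isn't limited to remote URLs, either. If your photos live on disk, the
same =classify()= call accepts a file handle instead. Here's a quick sketch,
assuming a hypothetical local file named =bear.jpg=:
#+begin_src python
# Classify a local image file instead of a URL.
# "bear.jpg" is a placeholder; point this at any image on disk.
with open("bear.jpg", "rb") as image_file:
    classes = visual_recognition.classify(
        images_file=image_file,
        threshold='0.6',
        owners=["IBM"]).get_result()
#+end_src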
* The Results
Here we can see the full result set from our loop above. If you view each of
the URLs that we sent to the API, you'll see that it was remarkably accurate.
To be fair, these are clear, high-resolution photos shot with a professional
camera. In reality, you will most likely be processing images that are lower
quality and contain far more noise.
However, we can clearly see the benefit of being able to call this API instead
of attempting to write our own image recognition function. Each of the
classifications returned was a fair description of the image.
If you wanted to restrict the results to those with at least 90% confidence,
you would simply adjust the =threshold= parameter in the call to
=visual_recognition.classify()=, as shown below.
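For instance, here is the same call from the loop above, with only the
threshold changed:
#+begin_src python
classes = visual_recognition.classify(
    url=item["url"],
    images_filename=item["title"],
    threshold='0.9',  # only keep classes with at least 90% confidence
    owners=["IBM"]).get_result()
#+end_src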
When your program runs, it should show the output below for each photo you
provide.
#+begin_src txt
----------------------------------------------------------------
Image Title: Grizzly Bear
Image URL: https://example.com/photos/image1.jpg
brown bear ( 0.944 )
bear ( 1 )
carnivore ( 1 )
mammal ( 1 )
animal ( 1 )
Alaskan brown bear ( 0.759 )
greenishness color ( 0.975 )
----------------------------------------------------------------
#+end_src
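If you'd like to inspect everything the API returns for an image (classifier
IDs, image metadata, and so on), you can pretty-print the raw response with the
=json= module we imported earlier:
#+begin_src python
# Dump the full response from a single classify() call.
print(json.dumps(classes, indent=2))
#+end_src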
* Discussion
Now, this was a very minimal implementation of the API. We simply supplied some
images and checked how accurate the results were. However, you could integrate
this type of API into many machine learning (ML) workflows.
For example, imagine you work for a company that scans its warehouses or
inventory using drones. Would you want to pay employees to sit and watch drone
footage all day just to identify or count things in the video? Probably not.
Instead, you could use a classification system similar to this one to train a
machine learning model to correctly identify the items that the drones capture
on video. More concretely, you could have the model watch a drone fly over a
field of sheep to count how many sheep live in that field.
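As a rough sketch of that counting idea (purely hypothetical, building on the
=classify()= results from earlier): you could tally how often each class
appears across a batch of classified frames or photos.
#+begin_src python
from collections import Counter

# Hypothetical tally of detected classes across many classified images.
# `all_results` is assumed to be a list of classify() result dicts.
counts = Counter()
for classes in all_results:
    for result in classes["images"][0]["classifiers"][0]["classes"]:
        counts[result["class"]] += 1

print(counts.most_common(5))  # the five most frequently seen classes
#+end_src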
There are many ways to implement machine learning functionality, but hopefully
this post helped inspire some deeper thought about the tools that can help
propel us further into the future of machine learning and AI.