aboutsummaryrefslogtreecommitdiff
path: root/blog/2020-09-01-visual-recognition.org
diff options
context:
space:
mode:
authorChristian Cleberg <hello@cleberg.net>2023-12-02 11:23:08 -0600
committerChristian Cleberg <hello@cleberg.net>2023-12-02 11:23:08 -0600
commitcaccd81c3eb7954662d20cab10cc3afeeabca615 (patch)
tree567ed10350c1ee319c178952ab6aa48265977e58 /blog/2020-09-01-visual-recognition.org
downloadcleberg.net-caccd81c3eb7954662d20cab10cc3afeeabca615.tar.gz
cleberg.net-caccd81c3eb7954662d20cab10cc3afeeabca615.tar.bz2
cleberg.net-caccd81c3eb7954662d20cab10cc3afeeabca615.zip
initial commit
Diffstat (limited to 'blog/2020-09-01-visual-recognition.org')
-rw-r--r--blog/2020-09-01-visual-recognition.org189
1 files changed, 189 insertions, 0 deletions
diff --git a/blog/2020-09-01-visual-recognition.org b/blog/2020-09-01-visual-recognition.org
new file mode 100644
index 0000000..9e4f739
--- /dev/null
+++ b/blog/2020-09-01-visual-recognition.org
@@ -0,0 +1,189 @@
+#+date: 2020-09-01
+#+title: IBM Watson Visual Recognition
+
+* What is IBM Watson?
+
+If you've never heard of [[https://www.ibm.com/watson][Watson]], this service is a suite of enterprise-ready AI
+services, applications, and tooling provided by IBM. Watson contains quite a few
+useful tools for data scientists and students, including the subject of this
+post today: visual recognition.
+
+If you'd like to view the official documentation for the Visual Recognition API,
+visit the [[https://cloud.ibm.com/apidocs/visual-recognition/visual-recognition-v3?code=python][API Docs]].
+
+* Prerequisites
+
+To be able to use Watson Visual Recognition, you'll need the following:
+
+1. Create a free account on [[https://www.ibm.com/cloud/watson-studio][IBM Watson Studio]].
+2. Add the [[https://www.ibm.com/cloud/watson-visual-recognition][Watson Visual Recognition]] service to your IBM Watson account.
+3. Get your API key and URL. To do this, first go to the [[https://dataplatform.cloud.ibm.com/home2?context=cpdaas][profile dashboard]] for
+ your IBM account and click on the Watson Visual Recognition service you
+ created. This will be listed in the section titled *Your services*. Then
+ click the *Credentials* tab and open the *Auto-generated credentials*
+ dropdown. Copy your API key and URL so that you can use them in the Python
+ script later.
+4. *[Optional]* While not required, you can also create the Jupyter Notebook for
+ this project right inside [[https://www.ibm.com/cloud/watson-studio][Watson Studio]]. Watson Studio will save your
+ notebooks inside an organized project and allow you to use their other
+ integrated products, such as storage containers, AI models, documentation,
+ external sharing, etc.
+
+* Calling the IBM Watson Visual Recognition API
+
+Okay, now let's get started.
+
+To begin, we need to install the proper Python package for IBM Watson.
+
+#+BEGIN_SRC sh
+pip install --upgrade --user "ibm-watson>=4.5.0"
+#+END_SRC
+
+Next, we need to specify the API key, version, and URL given to us when we
+created the Watson Visual Recognition service.
+
+#+BEGIN_SRC python
+apikey = "<your-apikey>"
+version = "2018-03-19"
+url = "<your-url>"
+#+END_SRC
+
+Now, let's import the necessary libraries and authenticate our service.
+
+#+BEGIN_SRC python
+import json
+from ibm_watson import VisualRecognitionV3
+from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
+
+authenticator = IAMAuthenticator(apikey)
+visual_recognition = VisualRecognitionV3(
+ version=version,
+ authenticator=authenticator
+)
+
+visual_recognition.set_service_url(url)
+#+END_SRC
+
+*[Optional]* If you'd like to tell the API not to use any data to improve
+their products, set the following header.
+
+#+BEGIN_SRC python
+visual_recognition.set_default_headers({'x-watson-learning-opt-out': "true"})
+#+END_SRC
+
+Now we have our API all set and ready to go. For this example, I'm going to
+include a =dict= of photos to load as we test out the API.
+
+#+BEGIN_SRC python
+data = [
+ {
+ "title": "Grizzly Bear",
+ "url": "https://example.com/photos/image1.jpg"
+ },
+ {
+ "title": "Nature Lake",
+ "url": "https://example.com/photos/image2.jpg"
+ },
+ {
+ "title": "Welcome Sign",
+ "url": "https://example.com/photos/image3.jpg"
+ },
+ {
+ "title": "Honey Badger",
+ "url": "https://example.com/photos/image4.jpg"
+ },
+ {
+ "title": "Grand Canyon Lizard",
+ "url": "https://example.com/photos/image5.jpg"
+ },
+ {
+ "title": "Castle",
+ "url": "https://example.com/photos/image6.jpg"
+ }
+]
+#+END_SRC
+
+Now that we've set up our libraries and have the photos ready, let's create a
+loop to call the API for each image. The code below shows a loop that calls the
+URL of each image and sends it to the API, requesting results with at least 60%
+confidence. The results are output to the console with dotted lines separating
+each section.
+
+In the case of an API error, the codes and explanations are output to the
+console.
+
+#+BEGIN_SRC python
+from ibm_watson import ApiException
+
+for x in range(len(data)):
+try:
+ url = data[x]["url"]
+ images_filename = data[x]["title"]
+ classes = visual_recognition.classify(
+ url=url,
+ images_filename=images_filename,
+ threshold='0.6',
+ owners=["IBM"]).get_result()
+ print("-----------------------------------------------")
+ print("Image Title: ", data[x]["title"], "\n")
+ print("Image URL: ", data[x]["url"], "\n")
+ classification_results = classes["images"][0]["classifiers"][0]["classes"]
+ for result in classification_results:
+ print(result["class"], "(", result["score"], ")")
+ print("-----------------------------------------------")
+except ApiException as ex:
+ print("Method failed with status code " + str(ex.code) + ": " + ex.message)
+#+END_SRC
+
+* The Results
+
+Here we can see the full result set of our function above. If you view each of
+the URLs that we sent to the API, you'll be able to see that it was remarkably
+accurate. To be fair, these are clear high-resolution, clear photos shot with a
+professional camera. In reality, you will most likely be processing images that
+are lower quality and may have a lot of noise in the photo.
+
+However, we can clearly see the benefit of being able to call this API instead
+of attempting to write our own image recognition function. Each of the
+classifications returned was a fair description of the image.
+
+If you wanted to restrict the results to those that are at least 90% confident
+or greater, you would simply adjust the =threshold= in the
+=visual_recognition.classify()= function.
+
+When your program runs, it should show the output below for each photo you
+provide.
+
+#+BEGIN_SRC txt
+----------------------------------------------------------------
+Image Title: Grizzly Bear
+Image URL: https://example.com/photos/image1.jpg
+
+brown bear ( 0.944 )
+bear ( 1 )
+carnivore ( 1 )
+mammal ( 1 )
+animal ( 1 )
+Alaskan brown bear ( 0.759 )
+greenishness color ( 0.975 )
+----------------------------------------------------------------
+#+END_SRC
+
+* Discussion
+
+Now, this was a very minimal implementation of the API. We simply supplied some
+images and looked to see how accurate the results were. However, you could
+implement this type of API into many machine learning (ML) models.
+
+For example, you could be working for a company that scans their warehouses or
+inventory using drones. Would you want to pay employees to sit there and watch
+drone footage all day in order to identify or count things in the video?
+Probably not. Instead, you could use a classification system similar to this one
+in order to train your machine learning model to correctly identify items that
+the drones show through video. More specifically, you could have your machine
+learning model watch a drone fly over a field of sheep in order to count how
+many sheep are living in that field.
+
+There are many ways to implement machine learning functionality, but hopefully
+this post helped inspire some deeper thought about the tools that can help
+propel us further into the future of machine learning and AI.