authorChristian Cleberg <hello@cleberg.net>2024-03-29 01:42:38 -0500
committerChristian Cleberg <hello@cleberg.net>2024-03-29 01:42:38 -0500
commit00b2726e0561f174393ae600f0f11adb8afebaab (patch)
treea4733d553ce68f64277ffa3a52f800dc58ff72de /content/blog/2020-09-01-visual-recognition.org
parent8ba3d90a0f3db7e5ed29e25ff6d0c1b557ed3ca0 (diff)
parent41bd0ad58e44244fe67cb36e066d4bb68738516f (diff)
merge org branch into main
Diffstat (limited to 'content/blog/2020-09-01-visual-recognition.org')
-rw-r--r--content/blog/2020-09-01-visual-recognition.org197
1 files changed, 197 insertions, 0 deletions
diff --git a/content/blog/2020-09-01-visual-recognition.org b/content/blog/2020-09-01-visual-recognition.org
new file mode 100644
index 0000000..d703113
--- /dev/null
+++ b/content/blog/2020-09-01-visual-recognition.org
@@ -0,0 +1,197 @@
+#+title: IBM Watson Visual Recognition
+#+date: 2020-09-01
+#+description: Classifying images with the IBM Watson Visual Recognition API and Python.
+#+filetags: :dev:
+
+* What is IBM Watson?
+If you've never heard of [[https://www.ibm.com/watson][Watson]], it's a
+suite of enterprise-ready AI services, applications, and tooling from
+IBM. Watson contains quite a few useful tools for data scientists and
+students, including the subject of today's post: visual recognition.
+
+If you'd like to view the official documentation for the Visual
+Recognition API, visit the
+[[https://cloud.ibm.com/apidocs/visual-recognition/visual-recognition-v3?code=python][API
+Docs]].
+
+* Prerequisites
+To use Watson Visual Recognition, you'll need to complete the following
+steps:
+
+1. Create a free account on
+ [[https://www.ibm.com/cloud/watson-studio][IBM Watson Studio]].
+2. Add the [[https://www.ibm.com/cloud/watson-visual-recognition][Watson
+ Visual Recognition]] service to your IBM Watson account.
+3. Get your API key and URL. To do this, first go to the
+   [[https://dataplatform.cloud.ibm.com/home2?context=cpdaas][profile
+   dashboard]] for your IBM account and click on the Watson Visual
+   Recognition service you created. This will be listed in the section
+   titled *Your services*. Then click the *Credentials* tab and open the
+   *Auto-generated credentials* dropdown. Copy your API key and URL so
+   that you can use them in the Python script later (a sketch for
+   loading them from environment variables follows this list).
+4. *[Optional]* While not required, you can also create the Jupyter
+ Notebook for this project right inside
+ [[https://www.ibm.com/cloud/watson-studio][Watson Studio]]. Watson
+ Studio will save your notebooks inside an organized project and allow
+ you to use their other integrated products, such as storage
+ containers, AI models, documentation, external sharing, etc.
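+
+Rather than pasting the key and URL directly into your script (as the
+snippets below do for simplicity), one option is to read them from
+environment variables. Here's a minimal sketch, assuming you've exported
+them under names of my choosing, =WATSON_APIKEY= and =WATSON_URL=:
+
+#+begin_src python
+import os
+
+# Assumes WATSON_APIKEY and WATSON_URL were exported in your shell
+apikey = os.environ["WATSON_APIKEY"]
+url = os.environ["WATSON_URL"]
+#+end_src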
+
+* Calling the IBM Watson Visual Recognition API
+Okay, now let's get started.
+
+To begin, we need to install the proper Python package for IBM Watson.
+
+#+begin_src sh
+pip install --upgrade --user "ibm-watson>=4.5.0"
+#+end_src
+
+Next, we need to specify the API key, version, and URL given to us when
+we created the Watson Visual Recognition service.
+
+#+begin_src python
+apikey = "<your-apikey>"
+version = "2018-03-19"
+url = "<your-url>"
+#+end_src
+
+Now, let's import the necessary libraries and authenticate our service.
+
+#+begin_src python
+import json
+from ibm_watson import VisualRecognitionV3
+from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
+
+authenticator = IAMAuthenticator(apikey)
+visual_recognition = VisualRecognitionV3(
+ version=version,
+ authenticator=authenticator
+)
+
+visual_recognition.set_service_url(url)
+#+end_src
+
+*[Optional]* If you'd like to tell IBM not to use your data to improve
+its products, set the following header.
+
+#+begin_src python
+visual_recognition.set_default_headers({'x-watson-learning-opt-out': "true"})
+#+end_src
+
+Now we have our API all set and ready to go. For this example, I'm going
+to include a list of photo =dict= objects to load as we test out the API.
+
+#+begin_src python
+data = [
+ {
+ "title": "Grizzly Bear",
+ "url": "https://example.com/photos/image1.jpg"
+ },
+ {
+ "title": "Nature Lake",
+ "url": "https://example.com/photos/image2.jpg"
+ },
+ {
+ "title": "Welcome Sign",
+ "url": "https://example.com/photos/image3.jpg"
+ },
+ {
+ "title": "Honey Badger",
+ "url": "https://example.com/photos/image4.jpg"
+ },
+ {
+ "title": "Grand Canyon Lizard",
+ "url": "https://example.com/photos/image5.jpg"
+ },
+ {
+ "title": "Castle",
+ "url": "https://example.com/photos/image6.jpg"
+ }
+]
+#+end_src
+
+Now that we've set up our libraries and have the photos ready, let's
+create a loop to call the API for each image. The code below sends the
+URL of each image to the API, requesting results with at least 60%
+confidence. The results are output to the console with dashed lines
+separating each section.
+
+In the case of an API error, the codes and explanations are output to
+the console.
+
+#+begin_src python
+from ibm_watson import ApiException
+
+for item in data:
+    try:
+        # Send the image URL to the API, keeping classes with >= 60% confidence
+        classes = visual_recognition.classify(
+            url=item["url"],
+            images_filename=item["title"],
+            threshold='0.6',
+            owners=["IBM"]).get_result()
+        print("-----------------------------------------------")
+        print("Image Title: ", item["title"], "\n")
+        print("Image URL: ", item["url"], "\n")
+        # Pull the class list out of the nested response structure
+        classification_results = classes["images"][0]["classifiers"][0]["classes"]
+        for result in classification_results:
+            print(result["class"], "(", result["score"], ")")
+        print("-----------------------------------------------")
+    except ApiException as ex:
+        print("Method failed with status code " + str(ex.code) + ": " + ex.message)
+#+end_src
+
+* The Results
+Here we can see the full result set of our function above. If you view
+each of the URLs that we sent to the API, you'll see that it was
+remarkably accurate. To be fair, these are clear, high-resolution photos
+shot with a professional camera. In reality, you will most likely be
+processing lower-quality images with a lot of noise.
+
+However, we can clearly see the benefit of being able to call this API
+instead of attempting to write our own image recognition function. Each
+of the classifications returned was a fair description of the image.
+
+If you wanted to restrict the results to those that are at least 90%
+confident or greater, you would simply adjust the =threshold= in the
+=visual_recognition.classify()= function.
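+
+For example, here's a minimal sketch (reusing the =visual_recognition=
+object from above; the image URL is a placeholder) that keeps only
+classes with at least 90% confidence:
+
+#+begin_src python
+# Same call as before, but only return classes scored at 0.9 or higher
+classes = visual_recognition.classify(
+    url="https://example.com/photos/image1.jpg",
+    threshold='0.9',
+    owners=["IBM"]).get_result()
+#+end_src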
+
+When your program runs, it should show the output below for each photo
+you provide.
+
+#+begin_src txt
+----------------------------------------------------------------
+Image Title: Grizzly Bear
+Image URL: https://example.com/photos/image1.jpg
+
+brown bear ( 0.944 )
+bear ( 1 )
+carnivore ( 1 )
+mammal ( 1 )
+animal ( 1 )
+Alaskan brown bear ( 0.759 )
+greenishness color ( 0.975 )
+----------------------------------------------------------------
+#+end_src
+
+* Discussion
+Now, this was a very minimal implementation of the API. We simply
+supplied some images and looked to see how accurate the results were.
+However, you could incorporate this type of API into many machine
+learning (ML) workflows.
+
+For example, you could be working for a company that scans its
+warehouses or inventory using drones. Would you want to pay employees to
+sit and watch drone footage all day just to identify or count things in
+the video? Probably not. Instead, you could use a classification system
+similar to this one to train a machine learning model to correctly
+identify the items that the drones capture on video. More specifically,
+you could have your model watch a drone fly over a field of sheep to
+count how many sheep live in that field.
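+
+As a toy illustration (not production code), here's a sketch that reuses
+the =classify()= call from above on a set of hypothetical frame URLs and
+counts the frames in which Watson reports a sheep-related class:
+
+#+begin_src python
+# Hypothetical still frames extracted from drone footage
+frame_urls = [
+    "https://example.com/frames/frame1.jpg",
+    "https://example.com/frames/frame2.jpg",
+    "https://example.com/frames/frame3.jpg",
+]
+
+sheep_frames = 0
+for frame_url in frame_urls:
+    result = visual_recognition.classify(
+        url=frame_url,
+        threshold='0.6',
+        owners=["IBM"]).get_result()
+    classes = result["images"][0]["classifiers"][0]["classes"]
+    # Count the frame if any returned class name mentions sheep
+    if any("sheep" in c["class"].lower() for c in classes):
+        sheep_frames += 1
+
+print(sheep_frames, "of", len(frame_urls), "frames contained sheep")
+#+end_src
+
+Counting frames is, of course, much cruder than counting individual
+animals; a real pipeline would likely need object detection rather than
+whole-image classification. Still, it shows how little code is needed to
+get a useful signal out of the API.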
+
+There are many ways to implement machine learning functionality, but
+hopefully this post helped inspire some deeper thought about the tools
+that can help propel us further into the future of machine learning and
+AI.