#+author: Christian Cleberg <hello@cleberg.net>
#+date: 2020-09-01
#+title: IBM Watson Visual Recognition

* What is IBM Watson?

If you've never heard of [[https://www.ibm.com/watson][Watson]], it is a suite of enterprise-ready AI services, applications, and tooling provided by IBM. Watson contains quite a few useful tools for data scientists and students, including the subject of this post: visual recognition.

If you'd like to view the official documentation for the Visual Recognition API, visit the [[https://cloud.ibm.com/apidocs/visual-recognition/visual-recognition-v3?code=python][API Docs]].

* Prerequisites

To use Watson Visual Recognition, you'll need to do the following:

1. Create a free account on [[https://www.ibm.com/cloud/watson-studio][IBM Watson Studio]].
2. Add the [[https://www.ibm.com/cloud/watson-visual-recognition][Watson Visual Recognition]] service to your IBM Watson account.
3. Get your API key and URL. To do this, first go to the [[https://dataplatform.cloud.ibm.com/home2?context=cpdaas][profile dashboard]] for your IBM account and click on the Watson Visual Recognition service you created. This will be listed in the section titled *Your services*. Then click the *Credentials* tab and open the *Auto-generated credentials* dropdown. Copy your API key and URL so that you can use them in the Python script later.
4. *[Optional]* While not required, you can also create the Jupyter Notebook for this project right inside [[https://www.ibm.com/cloud/watson-studio][Watson Studio]]. Watson Studio will save your notebooks inside an organized project and let you use its other integrated products, such as storage containers, AI models, documentation, and external sharing.
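As a small aside: rather than hardcoding the API key and URL from step 3 into your script, you could pull them from environment variables. This is only a sketch; the variable names =WATSON_VR_APIKEY= and =WATSON_VR_URL= are my own invention, not anything defined by IBM.

#+BEGIN_SRC python
import os

# Hypothetical variable names -- choose whatever fits your environment.
# Falls back to placeholders if the variables are unset.
apikey = os.environ.get("WATSON_VR_APIKEY", "<your-apikey>")
url = os.environ.get("WATSON_VR_URL", "<your-url>")
#+END_SRC

This keeps credentials out of version control and lets you swap keys without editing the script.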
* Calling the IBM Watson Visual Recognition API

Okay, now let's get started.

To begin, we need to install the proper Python package for IBM Watson.

#+BEGIN_SRC sh
pip install --upgrade --user "ibm-watson>=4.5.0"
#+END_SRC

Next, we need to specify the API key, version, and URL given to us when we created the Watson Visual Recognition service.

#+BEGIN_SRC python
apikey = "<your-apikey>"
version = "2018-03-19"
url = "<your-url>"
#+END_SRC

Now, let's import the necessary libraries and authenticate our service.

#+BEGIN_SRC python
from ibm_watson import VisualRecognitionV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator(apikey)
visual_recognition = VisualRecognitionV3(
    version=version,
    authenticator=authenticator
)

visual_recognition.set_service_url(url)
#+END_SRC

*[Optional]* If you'd like to tell the API not to use any of your data to improve IBM's products, set the following header.

#+BEGIN_SRC python
visual_recognition.set_default_headers({'x-watson-learning-opt-out': "true"})
#+END_SRC

Now we have our API all set up and ready to go. For this example, I'm going to include a list of =dict= entries describing the photos we'll use to test the API.

#+BEGIN_SRC python
data = [
    {
        "title": "Grizzly Bear",
        "url": "https://example.com/photos/image1.jpg"
    },
    {
        "title": "Nature Lake",
        "url": "https://example.com/photos/image2.jpg"
    },
    {
        "title": "Welcome Sign",
        "url": "https://example.com/photos/image3.jpg"
    },
    {
        "title": "Honey Badger",
        "url": "https://example.com/photos/image4.jpg"
    },
    {
        "title": "Grand Canyon Lizard",
        "url": "https://example.com/photos/image5.jpg"
    },
    {
        "title": "Castle",
        "url": "https://example.com/photos/image6.jpg"
    }
]
#+END_SRC

Now that we've set up our libraries and have the photos ready, let's create a loop to call the API for each image.
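Before sending anything to the API, it can be worth a quick sanity check that every entry in =data= carries the keys the loop relies on. This is purely illustrative and not part of the Watson SDK; the sample entries below are fabricated stand-ins for the real list.

#+BEGIN_SRC python
def missing_keys(entries):
    """Return the entries that lack a 'title' or 'url' key."""
    return [e for e in entries if {"title", "url"} - e.keys()]

# Fabricated sample to show the behavior; run missing_keys(data) on the
# real list above before calling the API.
sample = [
    {"title": "Grizzly Bear", "url": "https://example.com/photos/image1.jpg"},
    {"title": "Nature Lake"},  # deliberately missing "url"
]

bad_entries = missing_keys(sample)  # -> [{'title': 'Nature Lake'}]
#+END_SRC

Catching a malformed entry here is cheaper than burning an API call on it.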
The code below shows a loop that takes the URL of each image and sends it to the API, requesting results with at least 60% confidence. The results are output to the console with dotted lines separating each section.

In the case of an API error, the error code and explanation are output to the console.

#+BEGIN_SRC python
from ibm_watson import ApiException

for item in data:
    try:
        classes = visual_recognition.classify(
            url=item["url"],
            images_filename=item["title"],
            threshold='0.6',
            owners=["IBM"]).get_result()
        print("-----------------------------------------------")
        print("Image Title: ", item["title"], "\n")
        print("Image URL: ", item["url"], "\n")
        classification_results = classes["images"][0]["classifiers"][0]["classes"]
        for result in classification_results:
            print(result["class"], "(", result["score"], ")")
        print("-----------------------------------------------")
    except ApiException as ex:
        print("Method failed with status code " + str(ex.code) + ": " + ex.message)
#+END_SRC

* The Results

Here we can see the full result set from the loop above. If you view each of the URLs that we sent to the API, you'll see that it was remarkably accurate. To be fair, these are clear, high-resolution photos shot with a professional camera. In reality, you will most likely be processing images that are lower quality and contain a lot of noise.

However, we can clearly see the benefit of calling this API instead of attempting to write our own image recognition function. Each of the classifications returned was a fair description of the image.

If you wanted to restrict the results to those with at least 90% confidence, you would simply adjust the =threshold= argument in the =visual_recognition.classify()= call.

When your program runs, it should show output like the example below for each photo you provide.
#+BEGIN_SRC txt
----------------------------------------------------------------
Image Title: Grizzly Bear
Image URL: https://example.com/photos/image1.jpg

brown bear ( 0.944 )
bear ( 1 )
carnivore ( 1 )
mammal ( 1 )
animal ( 1 )
Alaskan brown bear ( 0.759 )
greenishness color ( 0.975 )
----------------------------------------------------------------
#+END_SRC

* Discussion

Now, this was a very minimal implementation of the API. We simply supplied some images and looked to see how accurate the results were. However, you could incorporate this type of API into many machine learning (ML) pipelines.

For example, you could be working for a company that scans its warehouses or inventory using drones. Would you want to pay employees to sit and watch drone footage all day in order to identify or count things in the video? Probably not. Instead, you could use a classification system similar to this one to train your machine learning model to correctly identify items that the drones capture on video. More specifically, you could have your model watch a drone fly over a field of sheep to count how many sheep live in that field.

There are many ways to implement machine learning functionality, but hopefully this post helped inspire some deeper thought about the tools that can help propel us further into the future of machine learning and AI.
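To make the sheep-counting idea a bit more concrete, here is a minimal sketch of tallying images that contain a target class at a chosen confidence threshold. The response dicts below are fabricated stand-ins shaped like the classifier output shown earlier, not real API output.

#+BEGIN_SRC python
def count_class(results, target, min_score=0.9):
    """Count images whose classifier reports `target` at or above `min_score`."""
    count = 0
    for image in results:
        classes = image["classifiers"][0]["classes"]
        if any(c["class"] == target and c["score"] >= min_score for c in classes):
            count += 1
    return count

# Fabricated stand-ins for classify() results -- not real API output.
sample_results = [
    {"classifiers": [{"classes": [{"class": "sheep", "score": 0.97}]}]},
    {"classifiers": [{"classes": [{"class": "sheep", "score": 0.42}]}]},
    {"classifiers": [{"classes": [{"class": "grass", "score": 0.88}]}]},
]

sheep_count = count_class(sample_results, "sheep")  # -> 1
#+END_SRC

Lowering =min_score= trades precision for recall: at a threshold of 0.4, the second (low-confidence) sheep image would be counted as well.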