
IBM Watson Services vs Microsoft Cognitive Services



I’ve changed jobs many times over the years, but by far the hardest and most important transition I ever made was leaving IBM.

I started my career back in 2000 as a software engineer at IBM. It was a fascinating company and a wonderful job: IBM in the 2000s was exciting, growing rapidly, and setting the stage for computing around the world. During my last years there I had the opportunity to work for the Watson division, and I had a great time using cognitive computing services such as Visual Recognition and other machine learning services like Retrieve and Rank.

You can browse my blog to find a lot of scenarios where I use Watson’s services.  

In general I used web-based APIs to help customers tailor services to their businesses, integrate them into their core systems, and jump-start efforts to develop new solutions and offerings. I consistently used the Bluemix cloud platform to fuel innovation.

I have been with Microsoft for almost one year now and I am really enjoying my role. Everyone has been extremely friendly and helpful, and there is a central drive to succeed in everything we set out to do. I am part of a services company that is innovating and creating mind-blowing technology solutions every day. Working as a digital advisor allows me to work with Microsoft Cognitive Services, so I thought I would write a comparison between Watson services and Cognitive Services.

As you know, cognitive computing is the simulation of human thought processes in a computerized model. Thanks to cloud computing and Big Data, cognitive computing has become affordable and accessible to businesses. The availability of abundant compute and storage resources, combined with the evolution of analytics, is accelerating its adoption. Cognitive computing systems depend on various aspects of artificial intelligence (AI) such as machine learning, natural language processing, speech and vision, human-computer interaction, dialog and more.

IBM Watson Services and Microsoft Cognitive Services are popular cognitive computing platforms that expose powerful capabilities through simple APIs. Though they were initially meant for developers, both companies are building additional layers that are aligned with enterprise needs.

The Watson Services catalog:

  [Image: the Watson Services catalog]


The Cognitive Services catalog:

  [Image: the Cognitive Services catalog]

  • To use the Microsoft Cognitive Services you will need a Microsoft Account. All the APIs have a free trial plan. As paid offerings become available for each API, you will be directed to the Azure portal to complete the purchase. You can find Buy links in your Subscriptions page if you are already using the APIs, or you can skip the trial altogether and purchase through the links provided on the Pricing page. You will need to set up an Azure subscriber account with a credit card and a phone number. First-time subscribers get a $200 discount.


  • To use the IBM Watson Services you will need an IBM account on Bluemix. You can create a free 30-day trial account through the "Sign up for free" link. After you activate your account and log in, a wizard takes you through a simple process to set up your environment.


Let’s take a first look at IBM Watson Services vs Microsoft Cognitive Services. Here is a simple side-by-side comparison of some services that I have already tested. I have also taken the liberty of adding my personal view on the innovative use of these services.

Each comparison below lists the Watson service, the corresponding Cognitive Services API, and my personal opinion, in that order.

Watson Service: Visual Recognition. It understands the contents of images: visual concepts tag the image, and the service can find human faces, approximate age and gender, and find similar images in a collection. You can also train the service by creating your own custom concepts. Use Visual Recognition to detect a dress type in retail, identify spoiled fruit in inventory, and more.

Cognitive Services:
  • Computer Vision: this feature returns information about visual content found in an image. Use tagging, descriptions and domain-specific models to identify content and label it with confidence. Apply the adult/racy settings to enable automated restriction of adult content. Identify image types and colour schemes in pictures.
  • Face: detect one or more human faces in an image and get back face rectangles showing where in the image the faces are, along with face attributes that contain machine learning-based predictions of facial features. After detecting faces, you can take the face rectangle and pass it to the Emotion API to speed up processing. The face attribute features available are Age, Gender, Pose, Smile, and Facial Hair, along with 27 landmarks for each face in the image.
  • Emotion: the API takes a facial expression in an image as an input, and returns the confidence across a set of emotions for each face in the image, as well as a bounding box for the face, using the Face API. If a user has already called the Face API, they can submit the face rectangle as an optional input. The emotions detected are anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. These emotions are understood to be cross-culturally and universally communicated with particular facial expressions.

My personal opinion: I tested IBM Visual Recognition in great depth and enjoyed using it in many scenarios. The strong point of the IBM service is custom classifier creation: it is very simple to adapt IBM Visual Recognition to almost any domain. Moreover, IBM Visual Recognition can analyze human faces in images and return data about them, such as estimated age, gender, and the names of celebrities, but this functionality is not trainable and does not support general biometric facial recognition. In my previous job I proposed Visual Recognition to an Italian customer for quality control on production lines, and I got a really good feeling about this approach.

Microsoft Cognitive Services splits visual recognition into three different APIs. Computer Vision returns information about visual content found in an image. It is the most effective way to analyze an image, but unfortunately it is not possible to use the API to create a custom classifier. The Face service lets you create a database of persons and use it for real authentication based on human faces, and it can be trained with many different face images of a single person. The API also lets you count the number of faces contained in an image, an interesting feature for the many scenarios where you want to estimate the number of people that you have to manage, check, or control.

The Emotion service captures human emotion from facial expressions. I tested it and in my opinion it works fine. It could be used in any scenario where real emotional feedback is useful, for example in the retail domain.
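To make the face counting and emotion feedback ideas above a bit more concrete, here is a minimal Python sketch. It makes no network call: `sample_response` is a hand-written stand-in for the kind of per-face result described above (a face rectangle plus a confidence score per emotion). The field names `faceRectangle` and `scores`, and the helper names, are my assumptions for illustration, so check the real API reference before relying on them.

```python
from typing import Dict, List

# Hypothetical, hand-written response: one entry per detected face.
# The field names are assumptions, not copied from the official API reference.
sample_response: List[dict] = [
    {
        "faceRectangle": {"left": 68, "top": 97, "width": 64, "height": 64},
        "scores": {"anger": 0.01, "contempt": 0.0, "disgust": 0.0, "fear": 0.0,
                   "happiness": 0.93, "neutral": 0.05, "sadness": 0.0,
                   "surprise": 0.01},
    },
    {
        "faceRectangle": {"left": 210, "top": 80, "width": 60, "height": 60},
        "scores": {"anger": 0.02, "contempt": 0.01, "disgust": 0.0, "fear": 0.02,
                   "happiness": 0.05, "neutral": 0.20, "sadness": 0.68,
                   "surprise": 0.02},
    },
]


def count_faces(faces: List[dict]) -> int:
    """Number of faces in the image: one list entry per detected face."""
    return len(faces)


def dominant_emotion(scores: Dict[str, float]) -> str:
    """Return the emotion with the highest confidence score."""
    return max(scores, key=lambda name: scores[name])


def summarize(faces: List[dict]) -> List[str]:
    """Dominant emotion for each face, in the order the faces were returned."""
    return [dominant_emotion(face["scores"]) for face in faces]


if __name__ == "__main__":
    print(count_faces(sample_response), summarize(sample_response))
```

In a retail scenario, a summary like this could be aggregated over time, for example counting how many visitors looked happy in front of a given shelf.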

Watson Service: Language Translator. It translates text from one language to another. The service offers multiple domain-specific models that you can customise based on your unique terminology and language.

Cognitive Service: Translator. It adds text translation, for any of the 60 supported languages, to your app.

My personal opinion: Both services work fine. I used the IBM Language Translator in a customer support scenario, a real case where service requests are opened by UK users through a self-service application and handled by a Latin American call and contact centre. The Microsoft Translator works fine too; I tested it through the API console. Honestly, I have no clear preference between the two services.

Watson Service: Tone Analyzer. It uses linguistic analysis to detect three types of tones in written text: emotions, social tendencies, and writing style.

Cognitive Service: Text Analytics. It detects sentiment, key phrases, topics, and language from your text.

My personal opinion: I used the IBM Tone Analyzer to classify service requests by internal priority based on the emotional tone of the request. The IBM Tone Analyzer calculates scores for various emotions such as anger, joy, and sadness.

Microsoft Text Analytics returns a numeric score between 0 and 1. Scores close to 1 indicate positive sentiment and scores close to 0 indicate negative sentiment. The sentiment score is generated using classification techniques. The input features of the classifier include n-grams, features generated from part-of-speech tags, and word embeddings.

If you only want to separate negative sentiment from positive sentiment, the Microsoft API is very simple to use. If you want the sentiment details, you may use the IBM API.
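The difference between the two result styles can be sketched in a few lines of Python. This is only an illustration with no API call: the cut-off values (0.4, 0.6, 0.5), the function names, and the priority rules are hypothetical choices of mine, not values documented by either service.

```python
from typing import Dict


def sentiment_label(score: float, low: float = 0.4, high: float = 0.6) -> str:
    """Map a single 0..1 sentiment score to a coarse label.

    The `low` and `high` cut-offs are hypothetical; tune them on your own data.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"


def request_priority(emotion_scores: Dict[str, float],
                     threshold: float = 0.5) -> str:
    """Derive an internal priority from per-emotion scores (hypothetical rules).

    Strong anger escalates to "high", strong sadness to "medium",
    anything else stays "normal".
    """
    if emotion_scores.get("anger", 0.0) >= threshold:
        return "high"
    if emotion_scores.get("sadness", 0.0) >= threshold:
        return "medium"
    return "normal"


if __name__ == "__main__":
    print(sentiment_label(0.12))
    print(request_priority({"anger": 0.8, "joy": 0.1, "sadness": 0.2}))
```

A single score is enough for a positive/negative split, while per-emotion scores allow finer routing rules such as the priority example.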

Watson Service: Retrieve and Rank. It can surface the most relevant information from a collection of documents.

Cognitive Service: QnA Maker. It extracts all possible pairs of questions and answers from user-provided content such as FAQ URLs, documents and editorial content.

My personal opinion: The QnA Maker service allows you to quickly build, train and publish a question and answer bot service based on FAQ URLs or structured lists of questions and answers. Once published, you can call a QnA Maker service using simple HTTP calls and integrate it with applications, including bots built on the Bot Framework. The strong point of QnA Maker is its simplicity: you can use it for any chat bot.

The IBM Retrieve and Rank service helps users find the most relevant information for their query by using a combination of search and machine learning algorithms to detect “signals” in the data. You can load data into the service, train a machine learning model based on known relevant results, then leverage this model to provide improved results to your end users based on their question or query.

The service works in two phases: Retrieve, where you send runtime queries, and Rank, where you use the ranker in your runtime queries so that the trained model boosts the relevancy of your results, even for queries that the model has not previously seen.

From my point of view, QnA Maker is the better choice to support a chat bot, while in a customer support scenario the IBM service may be a good alternative.
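The two phases can be illustrated with a toy Python sketch. This is not how the IBM service is implemented: the in-memory `DOCS`, the keyword-overlap retrieval, the two features, and the fixed `WEIGHTS` (standing in for a model that would really be trained on known relevant results) are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Toy document collection (hypothetical data).
DOCS: Dict[str, str] = {
    "d1": "password rules require eight characters",
    "d2": "reset your password from the account settings page",
    "d3": "contact the billing team about invoices",
}

# Stand-in for a trained ranking model: one weight per feature.
WEIGHTS: Dict[str, float] = {"overlap": 1.0, "coverage": 2.0}


def tokenize(text: str) -> Set[str]:
    return set(text.lower().split())


def retrieve(query: str, docs: Dict[str, str]) -> List[str]:
    """Phase 1: return every document sharing at least one word with the query."""
    q = tokenize(query)
    return [doc_id for doc_id, text in docs.items() if q & tokenize(text)]


def features(query: str, text: str) -> Dict[str, float]:
    """Simple signals: shared word count and fraction of query words covered."""
    q, d = tokenize(query), tokenize(text)
    overlap = len(q & d)
    return {"overlap": float(overlap),
            "coverage": overlap / len(q) if q else 0.0}


def rank(query: str, candidates: List[str], docs: Dict[str, str],
         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Phase 2: score each candidate with the weights, best first."""
    scored = []
    for doc_id in candidates:
        feats = features(query, docs[doc_id])
        scored.append((doc_id, sum(weights[k] * v for k, v in feats.items())))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    candidates = retrieve("reset password", DOCS)
    print(rank("reset password", candidates, DOCS, WEIGHTS))
```

The point of the split is that retrieval stays cheap and broad, while ranking applies the learned model only to the short candidate list.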

Watson Service: AlchemyData. It provides news and blog content enriched with natural language processing to allow for highly targeted search and trend analysis. Now you can query the world’s news sources and blogs like a database.

Cognitive Service: Bing News Search. It searches the web for news articles. Results include details like an authoritative image of the news article, related news and categories, provider info, article URL, and date added.

My personal opinion: AlchemyData News indexes 250k to 300k English-language news and blog articles every day, with historical search available for the past 60 days. You can query the News API directly with no need to acquire, enrich and store the data yourself, enabling you to go beyond simple keyword-based searches. I used it to feed a social network channel with hyper-relevant news about a specific topic, for example IoT. Bing News Search may be used in almost the same way. The real difference is the concept that IBM defines as hyper-relevant.


To summarize:

With Microsoft Cognitive Services, any developer can create powerful applications with the capability to think and process information like humans. You can easily build customer-facing AI applications with Vision, Language, Speech, Knowledge and Search functionality.

The Cognitive Services API collection includes almost 30 APIs, and you can use them to apply AI in the real world to solve business problems.

