Industry titans are pushing their AI solutions to market. In some cases, however, there is more intelligence in the marketing push than in the actual solutions. To measure real performance, we tested how leading text analysis solutions from Amazon, Google, IBM and Microsoft compare to Leiki’s Semantic AI service.

Much of today’s online content is short. Videos with brief headlines, tweets and Instagram posts are very popular and commercially valuable – but poorly understood by machines. Semantic understanding of short text is needed for both content discovery and audience data analysis of these content types. To see how leading natural language analysis tools match up, we compiled a list of varied short text strings and ran them through five services.

Leiki’s SmartProfiles came up with the largest number of categories for all text strings – with 16.2 correct categories on average for our batch of 2-4 word phrases. Leiki also produced the smallest number of incorrect categories (0).

Solutions compared

  • Leiki – SmartProfiles
  • IBM – Watson Natural Language Understanding
  • Microsoft – Azure Text Analytics
  • Google – Cloud Natural Language
  • Amazon – Comprehend

Text strings used

  • Keith Richards
  • Champagne Glass
  • Payment Card Readers
  • Marathon Running Shoes
  • Macaroni and Cheddar Cheese

How many semantic categories do different text analysis solutions find in short text samples?

Results from IBM’s Watson Natural Language Understanding suggest that a pure machine learning approach, one that takes words rather than meaning as input and learns which words occur in the same articles, does not reliably find the correct meaning of words. Although Watson is still far behind Leiki, it comes closest in the number of categories it can extract from short text, with 5 correct and 2 incorrect categories on average per phrase.

For example, Watson’s analysis of “Champagne Glass” correctly categorises it as Food and drink / beverages / alcoholic beverages / wine and Art and entertainment / visual art and design / design, but it fails to recognise the phrase’s relation to stemware or tableware. It also incorrectly categorises “Champagne Glass” as flowers.
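
The correct and incorrect counts reported here can be tallied along the following lines. This is only a minimal sketch in Python, not the tooling used for the test: the function name score and the dictionary layout are hypothetical, and the data covers only the three “Champagne Glass” categories named above, not any service’s full output.

```python
from statistics import mean

def score(returned, judged_correct):
    """Return (correct, incorrect) counts for one phrase."""
    hits = sum(1 for category in returned if category in judged_correct)
    return hits, len(returned) - hits

# Illustrative data: only the categories named in the example above.
returned = {
    "Champagne Glass": [
        "Food and drink / beverages / alcoholic beverages / wine",
        "Art and entertainment / visual art and design / design",
        "flowers",
    ],
}

# Categories a human reviewer judged to be correct for each phrase.
judged_correct = {
    "Champagne Glass": {
        "Food and drink / beverages / alcoholic beverages / wine",
        "Art and entertainment / visual art and design / design",
    },
}

scores = {
    phrase: score(categories, judged_correct[phrase])
    for phrase, categories in returned.items()
}

# Per-phrase averages, the form in which the results are reported.
avg_correct = mean(s[0] for s in scores.values())
avg_incorrect = mean(s[1] for s in scores.values())
print(scores, avg_correct, avg_incorrect)
```

With more phrases added to both dictionaries, the same two averages give the per-phrase figures for a service.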

Google Cloud Natural Language does not offer semantic analysis of short text. In some cases it can extract entities and sentiment from short text. With long pieces of text (articles), Google’s tool does produce categories, but the set is limited in depth: Google’s taxonomy consists of 700+ topics, whereas Leiki’s ontology consists of 200,000+ topics. Microsoft Azure and Amazon Comprehend do not offer semantic analysis.

Semantic understanding of short text – Leiki in a class of its own

In summary, Leiki SmartProfiles semantic text analysis is in a class of its own in understanding text compared to what huge IT companies can offer. This makes a big difference in the accuracy and flexibility of content recommendation and audience segmentation. Leiki’s solution is the only one that can, for example, reliably find videos with short text descriptions that are most closely related to a long article. Semantic understanding means that no keyword matches are needed for Leiki to understand that the content is related to the same topics.