Machine-based analysis of customer reviews

Cosmic Paladin
Jan 15, 2022


Co-authored with Manjunath Srinivas

Customers and prospects send us feedback and signals across various channels and modes. The one they use most often is the dedicated reviews section for a particular product or store they may have transacted with. So do we just hear them out and jump into action? If only it were that easy! The reviews section is also a hub for biased opinions, rants, fraudulent posts and the like. Hence getting closer to “how” customers feel about the product they’ve purchased, and about the overall journey that led them to it, is very important for the business. After all, we need to hear the “voice of the customer” and serve the genuine ones best.

Reviews can arrive at many levels: products, stores and the overall platform, to name a few. With marketplaces hosting thousands of merchants selling millions of products, the reviews too can run into millions per week, especially for the global ones. Analysing these manually is not only tedious but can also be imprecise, error-prone, inconsistent and, most importantly, time-consuming. The reaction time can become so long that the customer may have said goodbye after a bitter experience. All of this applies to Rakuten too, which deals with around 50K merchants in Japan, 300 million products and close to 100 million customers. Here is one of the ways we are approaching the “voice of the customer”!

Step 1 : Business-driven categorization

The first step is understanding which aspects of the reviews the business is most interested in and how it ranks them; we term this “review categorisation”. The categories can span many levels, but the top level can consist of the usual suspects: Delivery, Quality, Price, Customer Support and Overall Experience.

Step 2 : Sub-Classifying the problem

After deliberation and analysis, we scoped this as a “supervised” classification problem and broke it down further into two tasks: “Categorization” and “Sentiment Analysis”.

Step 3 : Choosing the right models

We needed a library to deal with word embeddings and text classification, and started with fastText, a very popular library from Facebook. Some reasons why we went for it:

  • State-of-the-art library for NLP tasks, developed by Facebook AI and backed by industry-class research
  • Minimal text preprocessing required before the training phase
  • Training phase completes within a few seconds
  • Automatic hyperparameter tuning for optimal performance
  • Excellent documentation and support community
  • Supports 290+ languages
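
As a rough illustration of the minimal preprocessing point above: fastText's supervised mode reads a plain text file in which each line starts with a label prefixed by `__label__`, followed by the text. The sketch below only builds such lines. The sample reviews, the label names and the helper `to_fasttext_line` are hypothetical and not taken from our pipeline, and the Japanese text is assumed to be already segmented into space-separated tokens.

```python
def to_fasttext_line(label: str, text: str) -> str:
    """Format one annotated review as a fastText supervised-training line."""
    # fastText labels must not contain whitespace and carry a "__label__" prefix.
    clean_label = label.strip().replace(" ", "_")
    # Collapse newlines and repeated spaces so each review stays on one line.
    clean_text = " ".join(text.split())
    return f"__label__{clean_label} {clean_text}"


# Hypothetical annotated reviews: (category, pre-segmented text).
samples = [
    ("Delivery", "配送 が とても 早かった"),
    ("Customer Support", "問い合わせ の 返信 が\n遅い"),
]

lines = [to_fasttext_line(label, text) for label, text in samples]
for line in lines:
    print(line)
```

With such lines saved to a file, say `train.txt`, training would be a call along the lines of `fasttext.train_supervised(input="train.txt")`, and passing `autotuneValidationFile` with a held-out file turns on the automatic hyperparameter search mentioned in the list.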

Tohoku, a BERT-based Japanese model, was chosen for Sentiment Analysis for the following reasons:

  • BERT model trained on a large corpus of Japanese text, leading to better generalisation of results
  • Text preprocessing tasks like word segmentation and tokenization are built in (Japanese text is not separated by spaces like English text)
  • BERT is fine-tuned for the sentiment analysis task, which gives optimal performance compared with building a model from scratch

Step 4 : Data Cleaning & Preparation

We used manually annotated data to train the models. The models learn from this pre-annotated data and follow suit when they are applied to the actual data. The business had 13 categories of interest, but a considerable amount of data was either yet to be annotated or couldn’t be assigned to the pre-defined aspects. This data was discarded. To further stabilise the training data, we removed the Japanese “stop words”. This reduces the overall noise for the categorization task.
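
The cleaning described above can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the category set and the stop-word list are tiny hypothetical samples, the helper `clean_record` is made up for this post, and reviews are assumed to be already split into tokens. Dropping reviews that contain only stop words is an extra assumption of the sketch.

```python
from typing import List, Optional, Tuple

# Hypothetical subset of the business categories and of a Japanese stop-word list.
CATEGORIES = {"Delivery", "Quality", "Price"}
STOP_WORDS = {"の", "が", "は", "を", "に", "です", "ます"}


def clean_record(label: Optional[str], tokens: List[str]) -> Optional[Tuple[str, List[str]]]:
    """Return (label, filtered tokens), or None if the record should be discarded."""
    # Discard reviews that are not annotated or fall outside the known categories.
    if label is None or label not in CATEGORIES:
        return None
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Assumption of this sketch: a review made only of stop words is dropped too.
    if not kept:
        return None
    return label, kept


# Hypothetical raw records: (annotation or None, tokens).
raw_records = [
    ("Delivery", ["配送", "が", "早い"]),
    (None, ["まあまあ", "です"]),        # not yet annotated
    ("Other", ["特に", "なし"]),          # outside the pre-defined categories
    ("Quality", ["品質", "は", "良い", "です"]),
]

cleaned = [
    record
    for record in (clean_record(label, tokens) for label, tokens in raw_records)
    if record is not None
]
print(cleaned)
```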

Step 5 : Results

The categorization model (fastText) gave a precision of 92%, as validated by a human operator, for predictions with a confidence score > 70%.

The sentiment analysis model (BERT-based) gave a precision of 93%, as validated by a human operator.
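
To make the confidence cut-off in the categorization figure concrete, here is a minimal sketch of how such a number can be computed: keep only the predictions whose confidence exceeds the threshold, then take the share of those that the human operator judged correct. The function name and the sample values are hypothetical and are not our actual validation data.

```python
def precision_above_threshold(predictions, threshold=0.70):
    """Precision over predictions whose confidence exceeds the threshold.

    Each prediction is a (confidence, is_correct) pair, where is_correct is
    the human operator's verdict on the predicted category.
    """
    accepted = [is_correct for confidence, is_correct in predictions if confidence > threshold]
    if not accepted:
        return None  # nothing passed the threshold, so precision is undefined
    return sum(accepted) / len(accepted)


# Hypothetical validation sample: (model confidence, operator says correct?).
validated = [
    (0.95, True),
    (0.88, True),
    (0.74, False),
    (0.71, True),
    (0.65, False),  # below the threshold, ignored
    (0.40, True),   # below the threshold, ignored
]

precision = precision_above_threshold(validated)
print(f"precision = {precision:.2f}")
```

Raising the threshold usually trades coverage for precision: fewer reviews get an automatic category, but the ones that do are more likely to be right.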

Next Steps

Now that the working principle of the models has been demonstrated, it’s time to apply them to a bigger or complete corpus that the business considers important and valid, for a predetermined period of time.

Rather than going after all 13 categories, discussions with the business are leaning towards picking the top 5 categories of interest, or the most impactful ones, so that the focus is sharper and we can aim for higher model accuracy, and then expanding in a staggered manner!