Whether it’s a brick-and-mortar company or a cutting-edge SaaS enterprise, there is hardly a business that does not value customer feedback. Making sense of that feedback, however, is tough. It’s easy to analyze feedback in a spreadsheet when you have a handful of customers, but as your customer base grows you need a system to make sense of all the feedback you receive. At Wootric, we built CXInsight™ to solve this exact problem.
Wootric CXInsight offers machine learning and NLP-powered customer feedback analysis within our Customer Experience (CX) management system. It classifies each customer comment into a set of pre-defined categories and determines the sentiment for each category. It then aggregates this analysis across your customer feedback to show you which topics are trending negative or positive. CX professionals can use the platform to slice and dice data to test hypotheses, or set up custom watchlists to track feedback themes or customer segments over time. For this data to be meaningful, our sentiment analysis must have a low margin of error; otherwise there will be lots of false alarms.
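To make the idea concrete, here is a minimal sketch of what per-category sentiment on a single comment can look like. The category names and the output schema are hypothetical, chosen for illustration; they are not CXInsight's actual data model.

```python
# One piece of customer feedback that mixes praise and a complaint.
feedback = "Love the dashboard, but exporting reports is painfully slow."

# Hypothetical analysis result: each detected category carries its own
# sentiment, rather than one blanket sentiment for the whole comment.
analysis = {
    "feedback": feedback,
    "categories": [
        {"label": "Dashboard/UI", "sentiment": "positive"},
        {"label": "Reporting", "sentiment": "negative"},
    ],
}

# Rolling this up across many comments is what surfaces trending topics.
negative_topics = [c["label"] for c in analysis["categories"]
                   if c["sentiment"] == "negative"]
print(negative_topics)  # ['Reporting']
```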
Here is an example of the kind of feedback classification we do:
Early sentiment analytics model shows its limitations
We had been using Google's Natural Language API (GCP NL API) since its early beta days for sentiment analysis, entity detection, and syntax analysis. We had been very happy with this service; had it not existed, the launch of our product would have been significantly delayed. The GCP NL API returns an overall sentiment for a document (a single piece of customer feedback in our case) as well as a sentiment for each sentence in the feedback. We would then feed this into our feedback classification ML system to help provide the insights described above.
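For readers unfamiliar with the API, here is a minimal sketch of consuming the document- and sentence-level sentiment it returns. The response dict below is hand-written in the shape of the API's analyzeSentiment JSON (scores range from -1.0 to 1.0); the negativity threshold is our own illustrative choice, not something the API prescribes.

```python
# Hand-written response in the GCP NL API analyzeSentiment JSON shape.
response = {
    "documentSentiment": {"score": -0.2, "magnitude": 1.1},
    "sentences": [
        {"text": {"content": "Setup was easy."},
         "sentiment": {"score": 0.8, "magnitude": 0.8}},
        {"text": {"content": "Support response times are terrible."},
         "sentiment": {"score": -0.9, "magnitude": 0.9}},
    ],
}

# Overall sentiment for the whole piece of feedback.
overall = response["documentSentiment"]["score"]

# Pull out individual sentences below an (illustrative) negativity threshold.
negative_sentences = [
    s["text"]["content"]
    for s in response["sentences"]
    if s["sentiment"]["score"] < -0.25
]
print(overall, negative_sentences)
```

Sentence-level scores are what let a downstream classifier attach different sentiments to different topics within one comment, as in the mixed example above.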
However, we noticed that using the Google product was producing a significant incidence of incorrect sentiment classifications. This, of course, reduces the value our customers can get from our platform. While we are aware that no ML system is 100% correct (a few misclassifications are expected), we wondered if we could improve upon these results. We tried AWS Comprehend and saw similar misclassification issues.
Here are some samples of incorrect sentiment classification on feedback using Google and AWS in September of this year:
Moving on to our own model
In parallel, over the past 12 months we have focused on improving the accuracy of our classification system (as distinct from sentiment analysis) for customer feedback. You can read the details on our ML Journey blog. Following recent research papers advocating transfer learning in NLP, we created our own transfer learning model trained on the millions of customer feedback responses we have collected over the last four years. This approach vastly improved accuracy over our previous, ensemble-based classification system.
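Structurally, transfer learning here means a pretrained encoder stays frozen while only a small task-specific head is trained on domain data. The toy sketch below illustrates that shape: the "encoder" is a deterministic bag-of-words hash standing in for a real pretrained language model, and the head is a simple perceptron. Everything here (the encoder, the labels, the examples) is invented for illustration, not our production architecture.

```python
import hashlib

DIM = 32

def encode(text):
    """Frozen 'pretrained' encoder: text -> fixed-size vector.
    A deterministic word-hash stands in for a real language model."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec

def train_head(examples, epochs=200, lr=0.1):
    """Train only the task head (a perceptron); the encoder never updates."""
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for text, label in examples:  # label: +1 positive, -1 negative
            x = encode(text)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != label:
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b

# Tiny in-domain dataset: the head specializes on customer-feedback language.
data = [("love the new dashboard", 1),
        ("support is great", 1),
        ("exports are slow and buggy", -1),
        ("slow ui and buggy filters", -1)]
w, b = train_head(data)

def predict(text):
    x = encode(text)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

The design point is the division of labor: the encoder carries general language knowledge learned elsewhere, so the head needs far less labeled domain data than training a model from scratch.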
Inspired by our success classifying feedback with our transfer learning model, we created a new sentiment analysis model. To our surprise, it gave us a 4.6% boost in accuracy (or a 56% reduction in error) over the GCP NL API. We could not believe it on the first attempt, so we triple-checked our tests because it was such a huge jump.
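The two figures are consistent with each other, as a quick back-of-the-envelope check shows. Note the baseline error rate below is derived from those two numbers, not a figure reported anywhere in this post.

```python
boost_pts = 4.6          # absolute accuracy gain, in percentage points
error_reduction = 0.56   # relative drop in the error rate

# If cutting the error by 56% recovers 4.6 points of accuracy,
# the implied baseline error rate is:
old_error = boost_pts / error_reduction   # ~8.2 points of error
new_error = old_error - boost_pts         # ~3.6 points of error

print(round(old_error, 1), round(new_error, 1))
print(round(1 - new_error / old_error, 2))  # recovers the stated 0.56
```

In other words, a modest-looking absolute gain translates into a large relative one precisely because the baseline error rate was already small.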
Overall, our system's accuracy rocks when it comes to customer feedback sentiment analysis and classification. In full transparency, though, it still gets sentiment classification wrong sometimes. For example, take this comment:
“The interface leaves a good amount to be desired. I would like to be able to understand this data better than just clicking through an unfiltered list of every file in my app.”
Our system classified this comment as positive, which is obviously incorrect. Still, to reiterate what we have said in previous posts, we believe our results are better because our models are fine-tuned for customer feedback. This gives us confidence in the future of our classification models as they continue to evolve.
Stay tuned for more of our journey.