
Anomaly Detection in eXperience Management | Response Count Time Series

Introduction

[Figure: Line chart of predicted vs. actual daily survey response counts over the past 90 days; the actual count on Feb 11 is twice the predicted value.]

Anomaly detection is an important business problem, as it allows anomalous events to be identified, analyzed, and remediated [1]. According to Hawkins, an anomaly is “an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism” [2]. In the data mining and statistics literature, anomalies are also referred to as abnormalities, outliers, novelties, or discordants. Depending on the domain, anomalies typically indicate some kind of problem, such as bank fraud, a medical condition, or data errors. Early and accurate detection of such events enables a fast reaction, which can be essential to avoid potentially serious consequences.

At Qualtrics we’re helping our customers collect and analyze experience data (X-data) to enable the design of breakthrough products and services and to continuously improve customer and brand experiences. X-data contains rich information, e.g., Net Promoter Score (NPS) [3] and Customer Satisfaction (CSAT) [4], that can be used to measure the effectiveness of past business decisions and to guide future ones. Qualtrics enables customers to collect X-data in more than 100 ways, including surveys, social media, text messages, and WhatsApp. Any abnormal behavior in the recorded survey response counts, such as a sudden spike or dip, can indicate a serious issue that should be analyzed and mitigated as soon as possible. Moreover, to get a full picture, the X-data needs to be broken down by factors such as touchpoint, geography, and demographics, which makes manual investigation very time-consuming, if not infeasible. Therefore, we decided to design and implement an automated anomaly detection system that monitors the X-data over time and notifies our customers of any identified abnormalities.

Problem Definition

Our goal is to detect anomalies in a timely manner so that the notified customers can take action and address the underlying issues without delay. Therefore, we posed the problem in the following way:

Given a univariate time series slice containing daily response counts from the past 90 days, detect whether the last observation is an anomaly. 

Moreover, we aim to run the system in a high-precision regime to minimize the number of false alerts, since sending too many irrelevant notifications would cause alarm fatigue.
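
To make this framing concrete, the following minimal sketch shows the shape of the detection problem, with a robust z-score standing in for the real scorers described later; the function name and the threshold are illustrative, not the production configuration:

```python
import numpy as np

def is_anomalous(daily_counts: np.ndarray, threshold: float = 4.0) -> bool:
    """Decide whether the last observation of a 90-day window is anomalous.

    Placeholder scorer: a robust z-score based on the median and MAD of the
    preceding 89 days; a conservative threshold keeps precision high.
    """
    history, last = daily_counts[:-1], daily_counts[-1]
    median = np.median(history)
    mad = np.median(np.abs(history - median)) or 1.0  # guard: MAD can be 0
    score = abs(last - median) / (1.4826 * mad)       # 1.4826 = Gaussian consistency factor
    return score > threshold
```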

Anomaly Detection System Design

Because the response count time series are very noisy (Fig. 1), we designed an anomaly detection system consisting of the following modules:

  1. Time Series Classification: This module characterizes an input signal by assigning it to one of the predefined time series classes.
  2. Anomaly Detection: This module picks an appropriate time series model based on the class from the previous step and performs anomaly detection.
  3. Anomaly Validation: This module protects customers from false alerts and allows for customer-specific adjustments, e.g., ignoring anomalies of a specific type or notifying only about very severe anomalies.
  4. Feedback Collection: Each notification contains a section for collecting feedback on whether the detected anomaly was relevant. This information will be used to improve and fine-tune the system. (A sketch of how these modules fit together follows this list.)
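
As a rough sketch of how these modules could fit together (all names, rules, and thresholds here are hypothetical placeholders, not the production implementation):

```python
from dataclasses import dataclass
from typing import Callable, Dict
import numpy as np

@dataclass
class Anomaly:
    score: float       # severity of the last observation
    signal_class: str  # time series class assigned in step 1

def classify(series: np.ndarray) -> str:
    """Step 1: assign the input signal to a predefined class (toy rule)."""
    return "sparse" if np.mean(series == 0) > 0.5 else "dense"

# Step 2: one scorer per class; the real models are listed in the next section.
SCORERS: Dict[str, Callable[[np.ndarray], float]] = {
    "sparse": lambda s: 0.0,  # placeholder: sparse signals need a dedicated model
    "dense": lambda s: abs(s[-1] - s[:-1].mean()) / (s[:-1].std() + 1e-9),
}

def detect(series: np.ndarray) -> Anomaly:
    cls = classify(series)
    return Anomaly(score=SCORERS[cls](series), signal_class=cls)

def validate(anomaly: Anomaly, threshold: float = 4.0) -> bool:
    """Step 3: customer-specific gating before a notification is sent."""
    return anomaly.score > threshold
```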

[Figure: Six line charts of daily survey response counts, illustrating how noisy the counts can be depending on context.]

Fig. 1 - Examples of most frequent signal types present in the survey response counts data.

Implementation

Anomaly Detection Models

Our anomaly detection system is designed to pick a dedicated model depending on the characteristics of a given time series. The selected model is then used to compute an anomaly score for the last observation, quantifying the anomaly’s severity. We consider the following anomaly detection approaches:

  • Gaussian Model is a simple model assuming that data points come from a single Gaussian distribution. The probability, under the model, of observing a value more extreme than a new data point can be used as an anomaly score (a minimal scoring sketch follows this list).
  • Gaussian Mixture Model is a generalization of the Gaussian model that assumes the data comes from a generative process in which each data point belongs to one of k clusters, each following a Gaussian distribution. This method outputs membership probabilities for the different clusters, which provides a natural way to model anomalies.
  • ARIMA [5] is one of the most widely used approaches for time series forecasting. It can be extended to capture seasonality and exogenous regressors, for instance to account for holidays.
  • Gaussian Process [6] is a nonparametric model that works well in a small data regime. It’s characterized by a kernel function that encodes desired properties of the modeled time series, such as long-term trends, seasonal patterns, and holiday effects. The probability of future observations under the model can be used as an abnormality score.
  • Prophet [7] is an algorithm designed specifically for forecasting time series “at scale”, capable of modeling seasonality, holiday effects, and changepoints. Additional regressors can be fed into the model to account for external factors. The forecasting model returns a predictive distribution that can be used to derive an anomaly score.
  • DeepAR [8] is a recurrent, autoregressive encoder-decoder deep neural network that can be used in a forecasting setting. It learns jointly from multiple time series, which improves its prediction accuracy. Moreover, it readily allows for explicit modeling of different signal types, e.g., count data. An anomaly score can be computed from the empirical distribution obtained by sampling from the model.
  • Matrix Profile [9] is a vector that stores the distance between each subsequence within a time series and its nearest neighbor. This distance information can be used to find the most outlying observations, which translate to anomalies.
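
To illustrate how such models yield an anomaly score, here is a sketch of two of the simpler approaches: a Gaussian tail probability and a Prophet predictive-interval check. The function names and the interval width are illustrative choices, not the production configuration:

```python
import numpy as np
import pandas as pd
from scipy import stats
from prophet import Prophet  # pip install prophet

def gaussian_score(series: np.ndarray) -> float:
    """Probability of observing a value at least as extreme as the last point
    under a single Gaussian fitted to the history (smaller = more anomalous)."""
    history, last = series[:-1], series[-1]
    z = (last - history.mean()) / (history.std() + 1e-9)
    return 2 * stats.norm.sf(abs(z))  # two-sided tail probability

def prophet_is_anomaly(dates: pd.DatetimeIndex, counts: np.ndarray,
                       interval_width: float = 0.99) -> bool:
    """Flag the last observation if it falls outside the predictive interval
    of a Prophet model fitted on the preceding history."""
    train = pd.DataFrame({"ds": dates[:-1], "y": counts[:-1]})
    model = Prophet(interval_width=interval_width).fit(train)
    forecast = model.predict(pd.DataFrame({"ds": [dates[-1]]}))
    low, high = forecast["yhat_lower"][0], forecast["yhat_upper"][0]
    return not (low <= counts[-1] <= high)
```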

Anomaly Validation

Major responsibilities of the anomaly validation module include the following:

  1. Adjusting the anomaly score threshold to find an optimal customer-specific trade-off between sensitivity and false alerts.
  2. A notification overload protection mechanism limiting the number of anomaly notifications sent per given time frame, e.g., a week.
  3. A sanity check mechanism preventing notifications from being sent when a disproportionate number of anomalies is detected. In such situations, the system requests a human review. (A sketch of this gating logic follows this list.)
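
A sketch of how these three responsibilities could be composed; the limits and the fraction used for the sanity check are illustrative values, not the ones used in production:

```python
from collections import deque
from datetime import datetime, timedelta

class NotificationGate:
    """Illustrative validation layer combining the three responsibilities:
    a score threshold, a weekly rate limit, and a sanity check."""

    def __init__(self, score_threshold: float = 4.0,
                 max_per_week: int = 3, sanity_fraction: float = 0.2):
        self.score_threshold = score_threshold
        self.max_per_week = max_per_week
        self.sanity_fraction = sanity_fraction
        self.sent = deque()  # timestamps of notifications already sent

    def should_notify(self, score: float, now: datetime,
                      anomalous_series: int, total_series: int) -> bool:
        # 1. Customer-specific sensitivity threshold.
        if score <= self.score_threshold:
            return False
        # 2. Overload protection: at most max_per_week alerts per 7 days.
        week_ago = now - timedelta(days=7)
        while self.sent and self.sent[0] < week_ago:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_week:
            return False
        # 3. Sanity check: a disproportionate share of anomalous series
        #    suggests a systemic glitch, so escalate to human review instead.
        if anomalous_series / total_series > self.sanity_fraction:
            return False
        self.sent.append(now)
        return True
```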

Feedback Collection

Anomaly notifications contain a feedback section allowing the recipient to rate whether the notification was useful and, optionally, to provide a textual comment (see Fig. 2). Information gathered this way will be one of the drivers of future improvements to the system.

[Figure: An email from Qualtrics with a line chart showing that the actual response count on May 10 is twice the predicted range, followed by a feedback question asking whether the notification is useful.]

Fig. 2 - E-mail anomaly notification with a feedback section. Clicking Yes or No redirects to a Qualtrics-powered survey where additional textual information can be provided.

Offline Evaluation

In the absence of ground-truth anomaly labels, we evaluated our anomaly detection system manually to estimate its precision. To this end, we ran the system with a low anomaly threshold on a random sample of 50,000 cases to find anomaly candidates. This approach greatly reduced the labeling effort, at the cost of a biased sample. The anomaly candidates were then reviewed and labeled manually by three annotators. This process resulted in 1,375 labeled data points that we used to assess the precision of our system.
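
The resulting precision estimate is simply the fraction of mined candidates that the annotators confirmed. A minimal computation, assuming, purely for illustration, that the three annotators’ labels are aggregated by majority vote:

```python
from statistics import mode

def precision_from_annotations(annotations: list[list[bool]]) -> float:
    """annotations[i] holds three yes/no votes for candidate i; a candidate
    counts as a true anomaly when the majority voted yes."""
    confirmed = sum(mode(votes) for votes in annotations)
    return confirmed / len(annotations)

# Toy example with 3 candidates (the real evaluation used 1,375 data points).
print(precision_from_annotations(
    [[True, True, False], [False, False, True], [True, True, True]]))  # -> 0.666...
```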

Online Evaluation

Following the offline evaluation, we ran our anomaly detection system in a production setting for two months. The collected customer feedback showed that the vast majority of respondents found our anomaly notifications useful. A detailed feedback analysis revealed that some anomaly types received unanimously positive feedback (see Fig. 3a-b) whereas others got mixed ratings (see Fig. 3c-d). Anomalies detected in sparse, event-driven signals received negative feedback (see Fig. 3e); we plan to address this in the near future by refining the underlying model or by taking external event data into account. We also saw that all respondents found non-obvious spike anomalies in trending signals useful (see Fig. 3f), which is in line with our previous UX research showing that users are interested in trend detection functionality.

[Figure: Six line charts of daily survey response counts relating different anomaly patterns to the customer feedback received on the notification emails.]

Fig. 3 - Representative signals for which our customers provided feedback on whether they found the anomaly notification useful.

Summary

Monitoring and analyzing experience data (X-data) is essential to continuously improving how customers perceive a company. At Qualtrics we’re dedicated to helping our customers collect, process, and understand X-data. To that end, we designed and deployed a modular and flexible anomaly detection system that identifies abnormal patterns early, enabling our customers to quickly investigate and resolve the underlying issues. In the feedback collected so far, 87% of respondents found the anomalies detected by our system to be useful.


References

[1] Blázquez-García A, Conde A, Mori U, Lozano JA. A review on outlier/anomaly detection in time series data. arXiv preprint arXiv:2002.04236. 2020 Feb 11.

[2] Hawkins DM. Identification of Outliers. Chapman and Hall; 1980.

[3] What is NPS? Your ultimate guide to Net Promoter Score. https://www.qualtrics.com/experience-management/customer/net-promoter-score/

[4] What is CSAT and how do you measure it? https://www.qualtrics.com/experience-management/customer/what-is-csat/ 

[5] Hyndman RJ, Athanasopoulos G. Forecasting: principles and practice. OTexts; 2018 May 8.

[6] Rasmussen CE. Gaussian processes in machine learning. In Summer School on Machine Learning 2003 Feb 2 (pp. 63-71). Springer, Berlin, Heidelberg.

[7] Taylor SJ, Letham B. Forecasting at scale. The American Statistician. 2018 Jan 2;72(1):37-45.

[8] Salinas D, Flunkert V, Gasthaus J, Januschowski T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting. 2020 Jul 1;36(3):1181-91.

[9] Yeh CC, Zhu Y, Ulanova L, Begum N, Ding Y, Dau HA, Silva DF, Mueen A, Keogh E. Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. In 2016 IEEE 16th International Conference on Data Mining (ICDM) 2016 Dec 12 (pp. 1317-1322). IEEE.
