The Cutting Edge of Contact Centers - Chapter 2 -
What is the speech recognition solution "transpeech"?
In 2018, transcosmos released "transpeech," a speech recognition solution for call centers. The service name combines "Trans" (to exceed) and "Speech" (speech recognition system), and reflects the concept of creating essential value in contact centers across systems.
"transpeech" uses AmiVoice as its underlying speech recognition system. Its distinguishing features are a dashboard built with the BI tool Tableau and our original algorithms written in Python. A recent upgrade added an emotion analysis function, which gives users new metrics for analyzing the emotions of the speaker.
Thanks to everyone's support, "transpeech" has been well received. As of 2019, the number of workstations had reached almost 1,500 (an increase of 300%), far exceeding our earlier projection. Within transcosmos, it has also proved effective in improving operations such as risk management, quality control, and cost reduction, as well as in creating added value. Within a year of its introduction, it generated a business impact of approximately 30 million yen. In addition, we have compiled a list of standard recognition rates of "transpeech" for each industry, with fairly detailed categorization, and have built up a knowledge base. As a result, it scored higher than the benchmark at the time of implementation.
In this chapter, I will explain why transcosmos is actively working to utilize such speech recognition, and in which direction these technologies will develop in the future.
AI automatic scoring system for customer service quality control
Here are a few examples of transcosmos's efforts in speech recognition and emotion analysis. Figure 1 illustrates customer service quality management using speech recognition and emotion analysis.
Figure 1: Illustration of AI automatic scoring system for customer service quality control
The concept behind the AI automatic scoring system is to ensure quality from 3 perspectives.
(1) AI monitoring: checking the entire conversation with speech recognition and machine learning function
AI ensures response quality by automatically checking every conversation for rule compliance, covering "basic behaviors such as greeting and thanking the customer" and "correct guidance according to the script."
(2) Emotion monitoring: checking the customer's feelings and operator's condition with emotion analysis function
It secures kind and empathetic service quality by checking psychological and emotional aspects, such as "Does the operator draw out the customer's requests?", "Does the customer feel anxious?", and "Does the operator earn sincere thanks from the customer?"
(3) Human monitoring: checking the right voice expression based on the contact center operation know-how accumulated over many years
It ensures customer service quality by drawing on human strengths, such as a warm response, and covers evaluation items such as "comfort," "clarity of voice," and "short pauses," which existing AI cannot score.
By combining AI and human work in a balanced manner, transcosmos aims to merge these 3 perspectives and to maintain and improve customer service quality at an optimal level. Figure 2 shows an example of improved efficiency in checking all calls at the inbound center of a financial services business and a loan contact desk.
Figure 2: Example of improved efficiency in checking all calls at the inbound center of a financial services business and a loan contact desk
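As a rough illustration of perspective (1), the rule-observance check can be pictured as a checklist run over the speech recognition transcript of each call. The sketch below is a minimal Python example under stated assumptions: the rule names, the phrases, and the helpers (Rule, check_call, observance_rate) are hypothetical and are not taken from "transpeech," which also relies on machine learning rather than on fixed phrases alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One checklist item; any one of its phrases satisfies the rule."""
    name: str
    phrases: tuple


# Hypothetical rules, for illustration only.
RULES = (
    Rule("greeting", ("thank you for calling", "good morning")),
    Rule("thanks", ("thank you for your time", "thank you very much")),
    Rule("script_guidance", ("may i confirm your contract number",)),
)


def check_call(operator_utterances, rules=RULES):
    """Return {rule name: True/False} for one call transcript."""
    text = " ".join(u.lower() for u in operator_utterances)
    return {r.name: any(p in text for p in r.phrases) for r in rules}


def observance_rate(result):
    """Share of rules that were observed in the call."""
    return sum(result.values()) / len(result)


call = [
    "Thank you for calling, this is the loan contact desk.",
    "May I confirm your contract number?",
    "Goodbye.",
]
result = check_call(call)
print(result, observance_rate(result))
```

Because every call is checked the same way, a per-call result like this can be aggregated across all calls instead of relying on sampled monitoring.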
Use of emotion analysis for maximizing sales
Next, I will introduce some of our more advanced, forward-looking initiatives.
Currently, transcosmos proposes the following 4 themes for using the data obtained from emotion analysis: (1) substitution of CS/NPS, (2) control of customer expectations, (3) motivation/trouble prediction, and (4) maximization of sales.
Of these, the solution for (4) maximization of sales grew out of a case in which a manufacturer's call center used the emotion recognition function for inbound and inside sales.
This office promoted product replacement and cross-selling to customers when handling inbound calls about product inquiries, repair consultations, and repair requests. However, operators were falling short of the target number of orders. The office faced a dilemma: if operators promoted products on every incoming call to win orders, conversations would run longer and the total number of calls handled would decrease.
Moreover, even when they wanted to narrow down the promotion targets, some operators hesitated to promote because they did not know which customers were likely to respond well. As a result, the rate at which inbound sales promotion was carried out varied from operator to operator.
That is where emotion analysis came in. The development project started from the idea that if we could use emotion analysis data to identify "the customers who are highly likely to respond to promotion" in real time during an incoming call, operators could start sales promotion with confidence.
Specifically, we analyzed emotion data to capture the characteristics of "the customers who are likely to respond to promotion." We extracted and processed 43 emotions × 100,000 utterances, for a total of 4.3 million data points, calculated statistics on these data, and then repeatedly tested and refined our hypotheses while referring to a spreadsheet.
As a result, we found differences related to inbound sales success in 6 emotion scores (displeased, regretful, sad, happy, energetic, and emotional) that could be regarded as statistically significant. We named them the "Customer Success Triggers of 6 Emotions," shown in Figure 3, and developed them into our own solution.
Figure 3: "Customer Success Triggers of 6 Emotions" derived from speech recognition and emotion analysis
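The article does not say which statistical test was used, so the following is only a sketch of how such a comparison could be run in Python. It assumes one score per utterance and per emotion, split by whether the sales promotion succeeded, and uses Welch's statistic with a normal approximation for the p-value (reasonable at sample sizes like 100,000) plus a Bonferroni correction for testing many emotions. The emotion names and the toy scores are hypothetical, not real results.

```python
import math
from statistics import NormalDist, mean, variance

# Data volume quoted in the text: 43 emotions x 100,000 utterances.
TOTAL_POINTS = 43 * 100_000  # 4,300,000


def welch_z_test(success, failure):
    """Welch's statistic and a two-sided p-value (normal approximation)."""
    se = math.sqrt(variance(success) / len(success)
                   + variance(failure) / len(failure))
    t = (mean(success) - mean(failure)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


def significant_emotions(scores_success, scores_failure, alpha=0.05):
    """Emotions whose mean score differs between groups (Bonferroni-corrected)."""
    n_tests = len(scores_success)
    hits = []
    for emotion, success in scores_success.items():
        _, p = welch_z_test(success, scores_failure[emotion])
        if p < alpha / n_tests:
            hits.append(emotion)
    return hits


# Deterministic toy scores (assumption): emotion_a is shifted, emotion_b is not.
base = [i % 10 for i in range(1000)]
scores_failure = {"emotion_a": base, "emotion_b": base}
scores_success = {"emotion_a": [x + 2 for x in base], "emotion_b": list(base)}

hits = significant_emotions(scores_success, scores_failure)
print(TOTAL_POINTS, hits)
```

With 43 emotions tested at once, some correction for multiple comparisons matters, because otherwise a few emotions would look significant by chance alone.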
Use of AI and automation technologies for customer service quality control
Another example of our efforts to control customer service quality is "AI Defender."
As shown in Figure 4, "AI Defender" is an original AI developed by transcosmos that checks, at high speed and with high accuracy, whether the operator has said "the talk that must be said."
Figure 4: High-speed, high-accuracy check of "the talk that must be said" using speech recognition
"AI Defender" makes it possible to evaluate talk content, optimize administrative man-hours, and prevent the risk of giving incorrect guidance. It also reduces transcription work (from audio to text) by approximately 98%. Furthermore, because it judges content at the "sentence" level rather than the word level, it can absorb the inconsistencies caused by misrecognition, which is technically unavoidable when using a speech recognition system.
In addition, we have succeeded in standardizing the know-how of experienced on-site call center staff with our latest technology and incorporating it into a highly accurate AI. With one click, it can rate calls with an accuracy of 99%, which is equal to or better than humans.
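To make the sentence-level idea concrete, here is a minimal Python sketch that compares a required sentence with each recognized sentence as a whole, so that a single misrecognized word does not cause a miss. It uses difflib.SequenceMatcher from the Python standard library as a stand-in similarity measure. This is an assumption for illustration, not the method used in "AI Defender," and the 0.8 threshold and the example sentences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between two sentences, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def was_said(required, recognized_sentences, threshold=0.8):
    """Return (found, best_score) for one required sentence."""
    best = max((similarity(required, s) for s in recognized_sentences),
               default=0.0)
    return best >= threshold, best


required = "This call is recorded for quality improvement."

# "recorded" was misrecognized as "record it" by the recognizer.
heard = [
    "Thank you for calling.",
    "This call is record it for quality improvement.",
]

word_level_hit = "recorded" in " ".join(heard).lower().split()
sentence_level_hit, score = was_said(required, heard)
print(word_level_hit, sentence_level_hit)

# A call in which the required sentence was never spoken.
missing_hit, _ = was_said(required, ["Thank you for calling.",
                                     "How can I help you today?"])
```

In this toy case the word-level keyword check misses the required talk, while the sentence-level comparison still finds it, and a call that lacks the sentence is still flagged.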
In addition, we have developed the following 3 automation tools to dramatically reduce work time:
1. Automation of "data preprocessing" (automating the processing of data before it is input to the AI)
2. Automation of "AI prediction" (automating prediction tasks so that users do not have to operate unfamiliar tools)
3. Automation of "results aggregation" (automating aggregation so that the data can be used immediately after prediction)
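The three tools above can be thought of as stages of one pipeline. The Python sketch below chains them so that a single call runs preprocessing, prediction, and aggregation in order. All function names are hypothetical, and the predict stage is a placeholder rule standing in for a trained model, since the article does not describe the actual tools.

```python
from collections import Counter


def preprocess(raw_utterances):
    """Stage 1: normalize whitespace and case, and drop empty rows."""
    cleaned = (" ".join(u.split()).lower() for u in raw_utterances)
    return [u for u in cleaned if u]


def predict(utterances):
    """Stage 2: placeholder classifier (assumption); a trained model goes here."""
    return ["ok" if "thank you" in u else "needs_review" for u in utterances]


def aggregate(labels):
    """Stage 3: count labels so the results can be used right away."""
    return dict(Counter(labels))


def run_pipeline(raw_utterances):
    """Run all three stages with one call."""
    return aggregate(predict(preprocess(raw_utterances)))


raw = ["  Thank you   for calling ", "", "Please hold", "THANK YOU very much"]
summary = run_pipeline(raw)
print(summary)
```

Wrapping the stages in one entry point is what removes the manual steps between them: the user supplies raw data and receives aggregated results.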
The most important and the hardest things in using speech recognition and emotion analysis
Finally, I would like to share with you what I feel every day through the projects with speech recognition and emotion analysis.
I think that the most important thing in using AI, including speech recognition and emotion analysis in call centers, is the ability to make decisions without losing sight of the actions after analysis.
At transcosmos, we believe that developing AI and algorithms that do not lead to action is meaningless. We keep in mind that what matters most is defining the task, analyzing, and designing with an emphasis on how we can bring about improvements at the site. This kind of field-based AI development cannot be done by "analysts" such as data scientists alone, and if we leave it entirely to them, it can go in the wrong direction.
Although many people may feel that AI and data are far from the daily call center operations, I feel that the success or failure of the project depends on whether or not the site staff make efforts with a sense of ownership and lead the project in the right direction to resolve the problems at the site.
However, the hardest part of a project is handling large amounts of data. Processing big data requires expertise and analytical knowledge. But what is needed even more than expertise and knowledge is the "perseverance" to complete the pre-analysis processing work called annotation.
A data scientist at transcosmos always says, "There is no magic in data analysis, only the fruit of human wisdom and effort." I am painfully aware of this.
In constructing next-generation contact centers, it is important not to be swayed by the word "AI" or by hollow concepts that we have no real feel for, but to turn them into specific tasks and work steadily at the site.
Thanks to everyone's effort, the "transpeech" development project has been very well received by the public, but there is still much room for improvement in terms of service. We will continue to respond sincerely to requests and inquiries from our customers and from the site, and will stay a step ahead to meet the needs of customers.
transcosmos is a global, one-stop solution provider that offers digital marketing, 5A analysis, eCommerce, and contact center support that adds value by optimizing cost and driving sales. For more information, please feel free to email us at firstname.lastname@example.org.