Overall Best Practice: Amplify Organic Adoption
Both customer care and security specialists see the value of establishing higher levels of trust with their customers and fellow employees. That equates to stronger authentication of individuals, not “end-points,” tokens or mobile devices. With Apple’s introduction of Siri, a speech-enabled mobile assistant, the general public is becoming attuned to the power of the spoken word for search, control and dictation. Authentication will not be far behind. The prerequisite here is one of packaging.
The result of these “best practices” is reflected in the slope of the curve in Figure 1, which is Opus Research’s forecast for registered voiceprints on a global basis. Note that in this document we sometimes refer to “indicated best practices,” which we define as the approaches that solution providers have indicated they will carry forward because they are finding the highest acceptance rates in real-world implementations.
Voice biometric solution providers are unlikely to be able to change behavior among the general public. But they can demonstrate how automated speech processing, when coupled with “artificial intelligence” (AI) as it is with Siri, is an empowering force. Early application of multi-factor authentication (including voice biometrics) helps accomplish two very important goals. First, as people carry out more e-commerce and other routine activities on their phones, biometric-based security will help prevent fraud in general. Second, during each conversation, strong authentication promotes “trust,” meaning that both parties can have confidence that they are in touch with the individual with whom they want to do business.
“Mobile” Means Feature Phones, Smartphones and Tablets
Enterprise security experts know that they cannot rely on any single factor, but must instead look to more robust security methods. As soon as a C-level executive starts using her iPad for work and personal use, it becomes more important to authenticate the person who’s in possession of that device. The move to mobile makes user authentication more important than ever, precisely because it is no longer sufficient to merely secure hardware endpoints. Participants want assurances that the individual at that endpoint is the person he or she claims to be.
Waiting in the wings are large-scale implementations by large government agencies that have the need to closely manage disbursement of transfer payments (like social security or unemployment benefits). Such implementations have already moved past the “pilot” stage in Australia and New Zealand. The largest pension organization in the Philippines has been offering voice-based authentication of retirees, leveraging an existing smart card-based identification/verification infrastructure. The ability to enroll and authenticate remotely has proven to be valuable in all of these cases.
Lessons from the Past
The rest of this document provides specifics surrounding what Opus Research regards as “best practices” based on the lessons learned over more than a decade of bringing voice biometric-based solutions to the market. In it we evaluate a set of offerings, including proof-of-concept, trials and formal offerings, that failed to reach a level of enrollment or use that would deem them successful. Our assumption is that today’s best practices – at the very least – will be those that correct long-standing shortcomings. This will lead to high growth in both the enrollment of voiceprints and the repeated use of voice authentication.
To support our efforts, we will organize the roots of failure into five major categories:
• Product shortcomings – including technological failures, which were prevalent in the “first generation” of voice biometric-based authentication (1999-2003).
• Packaging – addressing how well the technology was integrated into solutions to known problems.
• Partnerships and Promotion – looking at the go-to-market strategies and relationships with system integrators, third-party solution providers and specialists in adjacent industries like identity management and credit service bureaus.
• Personnel – assessing the considerations that need to be given to organizational structure on both the buy side and the sell side, as solution providers learn which executives and departments influence acquisition and implementation strategies.
• Pricing issues – looking at the ROI of specific implementations and making comparisons to alternative forms of authentication, access control and transaction authorization.
Today’s voice biometric-based authentication engines start with greater accuracy than the turn-of-the-century technologies. Solution providers also benefit from indicated best practices that package voice-based authentication as an extension of existing caller authentication and imposter detection software and services. Thus, the primary point of integration is not an IVR.
Instead, solutions are, more accurately, customer-centric and context-sensitive, leveraging business logic to apply the proper level of security as needed given a caller’s identity and the nature of the transaction being undertaken. They also leverage the token-based service that back-ends today’s PIN and password infrastructure.
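The kind of business logic described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the transaction names, the three authentication levels, the $1,000 high-value cutoff and the "known device" signal are all assumptions introduced here for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class AuthLevel(IntEnum):
    NONE = 0        # caller ID or account number only
    VOICE = 1       # voiceprint match required
    VOICE_PLUS = 2  # voiceprint plus a second factor (e.g., token or PIN)


@dataclass
class CallContext:
    transaction: str    # hypothetical label, e.g., "wire_transfer"
    amount: float       # monetary value at stake, 0 if not applicable
    known_device: bool  # call originates from a number on file


# Hypothetical baseline requirement per transaction type.
BASELINE = {
    "balance_inquiry": AuthLevel.NONE,
    "address_change": AuthLevel.VOICE,
    "wire_transfer": AuthLevel.VOICE,
}


def required_level(ctx: CallContext, high_value: float = 1000.0) -> AuthLevel:
    """Return the authentication level to demand for this call."""
    # Unknown transaction types default to the strictest level.
    level = BASELINE.get(ctx.transaction, AuthLevel.VOICE_PLUS)
    # High-value transactions always require the strictest level.
    if ctx.amount >= high_value:
        level = max(level, AuthLevel.VOICE_PLUS)
    # Step up one level when the call comes from an unrecognized number.
    if not ctx.known_device:
        level = AuthLevel(min(level + 1, AuthLevel.VOICE_PLUS))
    return level
```

Under these assumed rules, a balance inquiry from a number on file needs no extra step, while the same inquiry from an unrecognized number triggers a voiceprint check. In practice the thresholds would come from the organization's existing security policies rather than from constants in code.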
Partnering To Address Perceived Deficiencies
Voice biometric-based authentication platforms are on the front line and in the critical path of all manner of secure e-commerce. Thus it is understandable that large companies in financial services, telecommunications, healthcare and retailing might hesitate to contract with small, unproven firms. This explains why a number of the early adopters with large, customer-facing implementations contracted with large system integrators or solutions providers.
Voice authentication was only “one of many” IVR applications. It was also treated as one of seven major IVR applications relevant to user authentication. It was primarily treated as a PIN replacement when PINs are forgotten or they expire. Trials were also delayed during protracted vendor evaluations.
Lessons learned include:
• Make sure security is balanced with customer usability
• Engage your stakeholders early in the process and take them along the journey with you
• Prove the technology works and gain business confidence
• Be ready to adapt, including improving call flows and / or business rules
• Think about the future, including upgrades / new technology
• Plan ahead for change, for example an aging template
These cautionary suggestions, especially those involving the engagement of stakeholders early in the process, can be considered best practices for implementations in large businesses as well as government bureaucracies.
This brings us to ABN AMRO, which provides another set of cautionary considerations for today’s solution providers. In this case, a commercial bank sought to differentiate itself by giving a high profile to a novel approach to user authentication and security. In 2007, as it started to evaluate and deploy speaker authentication in its contact centers, it was one of the top 20 independent banks in the world. Its Dutch contact center received over 35 million calls each year and all but 7 million were routed to live agents.
In 2006 ABN AMRO began two separate but related initiatives surrounding speech technologies. One was to replace the menu-driven IVR systems with voice-based self-service using speech-recognition-based navigation. At the same time the bank initiated a “PIN replacement” effort that closely linked voice biometric-based user authentication with a token-based device. In essence it was replacing the “PIN” part of “Chip and PIN” with a spoken passphrase. This approach was quite different from other layered, multifactor approaches that we had observed.
Opus Research believes that replacing PIN-based authentication with an approach that requires unique hardware and multiple steps, where entry of a simple PIN would have sufficed, was a barrier to adoption. While it was promoted as a mechanism for voice-based verification of frequent callers, bank management approved the technology based on the use of a hardware token as a back-up for the less frequent callers.
Pricing and ROI Concerns
There is much evidence that many firms have found themselves unable to cost-justify voice biometric implementations. This is quite frustrating for veteran salespeople in both the security and the speech application domains. At a time when e-commerce fraud and identity theft are reportedly at an all-time high, it is frustrating that banks, credit card issuers, major retailers and even government agencies find it necessary to leave the losses from such activity largely undisclosed. What’s more, they seldom “connect the dots” between strong authentication and reduction in such fraud losses.
By contrast, the providers of speech-enabled applications have become adept at proving the ROI of their solutions by documenting that the combination of speech-based navigation and voice biometric-based authentication shortens the average time that it takes for a caller to complete a phone-based transaction. For high-volume, customer-facing contact centers, saving seconds can amount to millions of dollars in savings over the course of a year.
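The arithmetic behind "saving seconds can amount to millions" is straightforward. The sketch below uses purely hypothetical inputs (call volume, seconds saved per call, loaded cost per agent-minute and implementation cost are all assumptions chosen for illustration, not figures from any implementation discussed here).

```python
def annual_savings(calls_per_year: int,
                   seconds_saved_per_call: float,
                   cost_per_agent_minute: float) -> float:
    """Hard-dollar savings from shorter calls over one year."""
    minutes_saved = calls_per_year * seconds_saved_per_call / 60.0
    return minutes_saved * cost_per_agent_minute


def payback_months(implementation_cost: float, yearly_savings: float) -> float:
    """Months needed for call-time savings to cover the implementation cost."""
    return 12.0 * implementation_cost / yearly_savings


# Hypothetical contact center: 20 million calls a year, 15 seconds saved
# per call, $0.50 per agent-minute, $1 million implementation cost.
savings = annual_savings(20_000_000, 15, 0.50)
months = payback_months(1_000_000, savings)
print(f"Annual savings: ${savings:,.0f}; payback in {months:.1f} months")
```

With these assumed inputs the savings come to $2.5 million a year. The same two functions can be rerun with an organization's own call volumes and costs; they deliberately leave out fraud reduction and the "soft" benefits, which are harder to quantify.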
Opus Research sees that a best practice in pricing is to bring implementation costs in line with the expected savings that will result from the reduction in call times. These are the “hard dollar” savings resulting from reduced personnel costs, minutes of use on communications networks and facilities costs. Beyond these documented benefits are “soft” benefits that relate to customer satisfaction, retention and the ability to up-sell and cross-sell based on rapid authentication of callers and high levels of confidence that the individual on the other end of a phone call is, indeed, who he or she claims to be. But the unspoken benefit of strong authentication is, of course, fraud reduction. While most companies (especially among financial services providers, card issuers and retailers) are loath to reveal exact figures for fraud losses, they are well aware of the “hard” dollar benefits of fraud reduction.
Does Enrollment Equate to Loyalty?
Acceptance and repeated use of voice-based authentication starts with an enrollment process that puts emphasis on a long-term relationship. The largest implementers have learned that a proposition that says, “We’d like to take about 3 minutes of your time to save you a minute on all your subsequent calls” elicits a significant positive response and a large percentage of callers opt in to the service.
When it comes to promoting a loyal, long-term relationship, hosted voice biometric solutions provider VoiceTrust experiences very high enrollment rates for its G2P (Government to People) transfer payment services. In this case, registering one’s voiceprint is a prerequisite for receiving pension benefits.
Is Bulk Enrollment a Possibility?
Because some compliance directives (in healthcare, insurance, banking) require audio files to be stored for a matter of years, it has been suggested that large companies could take advantage of this rich set of spoken utterances to engage in bulk caller authentication. Once a single speaker’s voice has been isolated from others on a phone line, it can be associated with a known identity and, upon capture of sufficient spoken material, be distilled into a voiceprint.
There are ethical, and perhaps legal, issues associated with this “passive” approach to enrollment. But some vendors have argued that callers can be informed that their voice is being recorded to provide more convenient security on future calls. It cannot be characterized as a “best practice” and, given the pace at which buyers move to new approaches, it probably won’t become a common practice either. What is more, the “creepiness factor” will always be very high for capture of personal characteristics (whether voice, fingerprints, iris scans or pictures of one’s face); it will always seem surreptitious.
What’s in the Wings for Authentication?
The most common practice, when looking at the largest, customer-facing implementations, is the use of a pass phrase – such as “my voice is my password” – for speaker verification. Such “text-dependent” solutions predominate because they have proven demonstrably more mature. Enrollment routines are well-defined, Equal Error Rates (EER) are well documented and tools exist for assessing scores and adjusting acceptable False Accept Rates (FARs) and False Reject Rates (FRRs) to suit each use case.
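To make the relationship between FAR, FRR and EER concrete, here is a small sketch that computes both error rates at a given decision threshold and then scans for the threshold where they are closest. The score lists are invented sample data, and the convention that a higher score means a better match is an assumption; real engines expose their own scoring scales and tuning tools.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a score >= threshold is accepted.

    genuine:  match scores from true speakers
    impostor: match scores from impostor attempts
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def approximate_eer(genuine, impostor):
    """Scan observed scores as candidate thresholds and return
    (threshold, FAR, FRR) where FAR and FRR are closest to equal."""
    candidates = sorted(set(genuine) | set(impostor))

    def gap(t):
        far, frr = far_frr(genuine, impostor, t)
        return abs(far - frr)

    threshold = min(candidates, key=gap)
    far, frr = far_frr(genuine, impostor, threshold)
    return threshold, far, frr


# Invented sample scores, for illustration only.
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.59]
impostor_scores = [0.62, 0.48, 0.40, 0.33, 0.21]

print(approximate_eer(genuine_scores, impostor_scores))
```

On this toy data both error rates are 20% at a threshold of 0.62. Raising the threshold lowers FAR at the cost of a higher FRR, which is the trade-off implementers adjust for each use case: a high-value transfer might tolerate more false rejects than a balance inquiry.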
Where randomness is required, as part of a routine to discourage replay attacks, a digit-based system is well-received by current implementers. In these instances, enrollment involves capturing the digits zero through nine a number of times, from which the voiceprint is distilled. For authentication, the digits are displayed in a random order so that the replay of a tape-recorded pass phrase would not be accurate. Providers of pass phrase-based authentication systems have other ways of detecting the use of a tape recorder, such as detecting a “tape hiss” or humming sound that commonly occurs with playback equipment.
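A random-digit challenge of this kind can be sketched as follows. This is an illustrative outline only: the function names and the six-digit default are assumptions, the challenge is generated as an unpredictable digit sequence (one way to realize "random order"), and the speech recognition and voiceprint comparison steps that a real system performs are outside the sketch.

```python
import secrets

DIGITS = "0123456789"


def digit_challenge(length: int = 6) -> str:
    """Generate an unpredictable digit string for the caller to speak."""
    return "".join(secrets.choice(DIGITS) for _ in range(length))


def transcript_matches(challenge: str, recognized: str) -> bool:
    """Check that the recognized speech contains exactly the prompted digits.

    `recognized` is assumed to be the recognizer's text output; spaces and
    other non-digit characters are ignored before comparing.
    """
    spoken = "".join(ch for ch in recognized if ch in DIGITS)
    return spoken == challenge


challenge = digit_challenge()
print("Please say:", " ".join(challenge))
```

Because the prompt changes on every call, a recording of an earlier session will almost never contain the right digits in the right order. Passing this check only shows that the right words were spoken; the voiceprint comparison still has to confirm who spoke them.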
Is Text-Independence a “Best Practice”?
Text-independent methods for enrollment and subsequent authentication are making slow headway out of development labs and the government sector, where text- and language-independent solutions have matured as part of surveillance and speaker detection schemes. Prior to acquisition by Nuance, PerSay had installed its FreeSpeech solution at a commercial bank in Israel. Text-independent, “passive” enrollment has been demonstrated for several years. A few investment houses have trialled “convenient and secure” interactions for high net worth individuals.
Text-independence equates to language-independence as well and, for that reason, a number of multinational banks have been monitoring or trialing new systems with an eye to determining the minimum length of spoken utterances required to build a voiceprint (this had been 30 seconds to a minute but has been reduced to 20 seconds “in the lab”) and the minimum amount of “conversational speech” to do an accurate authentication (moving toward 10 seconds).
As layered, multi-factor and risk-based authentication solutions prevail, we expect to see more creative, text-independent applications. In many industries, enterprises are required to monitor and record all phone-based interactions, which means that a certain amount of forensic work can be performed on stored audio files. Text-independent solutions can support “passive” enrollment as well as “speaker change detection,” for which there is an obvious need from educational institutions that offer remote testing or certification and need assurance that the person taking the test is the person he or she claims to be. There are similar applications for companies with virtual contact centers that require work-at-home agents to authenticate themselves at the beginning of a shift but offer no assurance that the same person is there to complete a shift.
Promoting Trust and Long-term Relationships
After initial enrollment, constant behavioral reinforcement is required to keep interest and activity up. The most conspicuous and appropriate applications have been for financial services companies that have launched voice biometric-based authentication services designed to give high net worth individuals confidence that their financial services provider is making a special effort to provide good service and high levels of security.
Fitting into Multi-factor Solutions
The mission for solutions providers at large is to make voice biometrics part of a multi-factor, layered solution that provides a highly secure, convenient service to targeted customers. This will become especially important as transactions of all kinds are initiated from mobile devices and it becomes more crucial than ever before to authenticate the individual in possession of the mobile device through which each transaction is initiated.
In all cases, such solutions should provide a pleasing experience for callers. One strategy is to closely link voiceprints with a very convenient mechanism for strong authentication on an “as-needed” basis. Other factors that may come into play are obvious signs that an individual is an imposter: for example, a customer living in California originates a call from Sudan, or a caller named “Dan Miller” has a decidedly female voice.
When it comes to detecting imposters or assigning the “high-risk” tag to a particular caller or individual, the “voice channel” alone does not provide all the answers. Most large businesses already have rule-based systems and business logic in place to support the goals of the Chief Security Officer.
These well-established and time-tested rules are the ones that should be applied to govern the level of risk associated with the individual on the other end of the phone or at a student or employee’s workstation.
Expect Accelerated Adoption
Bolstered by real world experience, regulatory guidelines and analysts’ recommendations, executives at financial services companies, healthcare providers and telecommunications service providers are forging ahead with plans to implement voice biometric technologies. Opus Research has identified approaches to user enrollment and subsequent authentication that overcome long-standing barriers to adoption by creating a pleasing and engaging user experience and leveraging existing security infrastructure to support the goal of fraud prevention.
To answer the question, “What has changed?” with regard to adoption of the technology in the past year: companies are finding that both employees and customers have grown comfortable using multiple mobile devices and making seamless transitions from real-time conversations to ones that happen over a period of time via texting, chat, tweeting or other social network-based tools.
What’s more, they use smartphones, wireless PCs and laptops as “mobile assistants,” on which personal contact lists, applications and data coexist with sensitive information that is supposed to be under tight corporate control.
In the social, mobile environment, user authentication (as opposed to device-oriented access control) has become more important than ever before in order to prevent unauthorized access to corporate resources. Meanwhile, “user experience” aficionados are new to the security discussion, but have rapidly come to a similar conclusion: banks, insurance companies, government agencies, telcos and retailers serve customers, not their phones or other devices. Over time, they will find that strong, rapid authentication will enable them to serve their best, longest-standing customers in the most personal and efficient way. Because the devices are often phones, voiceprints are perceived as the most natural and efficient authentication mechanism. In conjunction with “layers” of other factors, protocols and business rules, decision makers are making voice biometrics part of security solutions that provide high levels of confidence and foster stronger customer bonds over time.
To support secure, customer-centric e-commerce using voice biometrics, the time is now.