Early in the story of Neal Stephenson’s 1995 novel The Diamond Age, a wealthy client is mildly peeved to learn that an interactive educational project that he has commissioned must be delivered partly in the form of a service, rather than a completely self-contained device. Regretfully, the project leader explains the problem: “After all of our technology, the pseudo-intelligence algorithms, the vast exception matrices, the portent and content monitors, and everything else, we still can’t come close to generating a human voice that sounds as good as what a real, live ractor can give us.”
On the bright side, that engineer continues, “At any given time there are tens of millions of professional ractors in their stages all over the world, in every time zone, ready to take on this kind of work at a moment’s notice.” Stephenson envisions a vast pool of talent, ready to read a script that’s generated at a moment’s notice by what we would today call a Web service, with the interactive actor (the “ractor”) needing little or no actual knowledge of the subject matter. All that’s needed is the ability to read, to speak, and to create a verbal illusion of being involved and interested in the resulting conversation. Even the face of the speaker is synthesized, if needed to support a video link, to follow the spoken words while being tailored in appearance to the customer’s personal preferences.
This isn’t a huge leap of techno-fantasy beyond some of the things that we do today. When you call a toll-free telephone number, you really don’t know what time zone is at the other end of the conversation. You don’t know whether the person who helps you is really an expert in that area, or is merely well-supported by a keyword-searchable database of frequently asked questions and standard procedures. A telephone call center in India is staffed by people who don’t merely speak excellent English; they’ve even been trained in the differences, for example, between Canadian and U.S. dialects and accents.
Combining text, video and speech into presence awareness is now only a mildly challenging piece of the problem. You may or may not consider this an advance, but you can now be approached by a helpful salesperson while shopping online. Adaptive learning algorithms guide live representatives in timing their approach. Electronic gaming increasingly involves interacting with communities of other players you’ve never met.
Two things have to happen, though, before reality can close the remaining shortfalls from Stephenson’s vision. The first is that script-based customer interaction has to become much better than the one-size-fits-all compromise that we tolerate today. It’s always an exercise in patience to use any kind of telephone-based technical support, for example, knowing that it’s going to take either a lot of time or a delicately tailored exhibit of annoyance to get the voice at the other end to skip the baby talk and start actually solving the problem.
It seems as if the current state of the art in writing tech-support scripts has a No. 1 goal of making sure that the customer isn’t confused. I’d urge people who build such services to change that: to make the No. 1 priority a clear understanding of what the customer already knows, and what the customer believes. This would mean putting the state of the customer’s mind ahead of the state of the product or the problem, but any good salesman already knows the importance of doing that.
The second thing that has to happen is a massive transfer of knowledge from the minds of people into the databases and decision trees of customer relationship management systems. Remember “expert systems”? They turned out, back in the AI ’80s, to be enormously difficult to build and maintain, but perhaps their time has finally come.
What are the other missing links between today’s call center and tomorrow’s “racting”? Tell me at firstname.lastname@example.org.