A New Call Center Tool -- Automatic E-Mail Distributor
David Goodtree
David M. Cooperstein
Volume One, Number Seventeen
June 30, 1997
Aetna, Nike, and others are soliciting customer inquiries on their Web sites by asking users to click a button and fill out an e-mail form. But few firms have worked out how to handle the incoming deluge destined for sales and service reps (see the February, 1997 Telecom Strategies Report, "Call Centers Meet The Web"). Forrester believes that the solution will arrive in a new product, the "automatic e-mail distributor" (AED). Expanding on and integrated with the call center's automatic call distributor, the AED is defined by Forrester as a system for shepherding inbound customer e-mail from receipt through final handling by a group of agents. The AED will:
1) Look up addresses and categorize messages. Like today's screen pop application, the AED will search for a match of the sender's e-mail address in the corporation's customer database. Additionally, by scanning the header and body of the message for keywords like "credit card" or "mortgage," the AED will classify the subject of the e-mail for the next step of processing.
2) Auto-reply. The AED will offer businesses the option to send an instant, canned e-mail reply upon message receipt, promising further action within a specified time. If the incoming message was preformatted by a Web page (like a catalog request), the AED will pass the inquiry to other systems for automatic processing.
3) Route the mail and report on the queue. The AED will
route the inquiry to the most appropriate rep for handling based on the
customer and subject information already discovered. Supervisors
will also be able to monitor and trend the number of messages handled,
speed of answer, and agent productivity from a console integrated with
the voice management system.
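The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer table, keyword table, queue names, reply wording, and the 24-hour promise are all hypothetical, and a real AED would sit on top of the corporate customer database and the call center's routing engine.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical customer database keyed by e-mail address.
CUSTOMERS = {"pat@example.com": {"name": "Pat Lee", "segment": "gold"}}

# Hypothetical keyword-to-category table; the first matching keyword wins.
KEYWORDS = {"credit card": "cards", "mortgage": "loans"}

# Hypothetical mapping from category to the agent queue that handles it.
QUEUES = {"cards": "card-services", "loans": "home-lending"}


@dataclass
class Routed:
    customer: Optional[dict]
    category: str
    queue: str
    auto_reply: str


def handle(sender: str, subject: str, body: str) -> Routed:
    # Step 1: look up the sender, then classify by scanning header and body.
    customer = CUSTOMERS.get(sender.lower())
    text = f"{subject} {body}".lower()
    category = next((cat for kw, cat in KEYWORDS.items() if kw in text), "general")
    # Step 2: canned acknowledgement promising action within a set time.
    reply = f"We received your message about {category} and will respond within 24 hours."
    # Step 3: pick the queue from what the earlier steps discovered.
    queue = QUEUES.get(category, "general-inbox")
    return Routed(customer, category, queue, reply)
```

For example, `handle("pat@example.com", "Question", "I lost my credit card")` matches the known customer and lands in the hypothetical `card-services` queue, while a message with no known keyword falls through to `general-inbox`.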
BUILD OR BUY AN AED?
Some components of an AED already exist from vendors like Brightware and Silknet, but no one has assembled the total package yet. If call center managers need the help today, MIS is forced to cobble together its own system. But in 1998, Forrester sees additional first-generation offers rolling out from:
Computer-telephony ISVs. Firms like Genesys and Nabnasset understand the tie between customer asset management systems and phone calls. Making the leap to e-mail will not be very hard.
E-mail ISVs. Software.com, Lotus, and Oracle already sell corporate e-mail servers and could team up with the ISVs above to build a potent system based on their routing engines.
Switch makers. Lucent, Nortel, Aspect, and Rockwell will
become the most credible of the lot by extending their telephony-based
queue management expertise to e-mail.
PROFICIENT HANDLING OF E-MAIL WILL BEGET OUTBOUND CAMPAIGNS
Once call centers can comfortably handle large flows of customer inbound e-mail, businesses will feel confident that they can kick off outbound campaigns. As a result:
Marketers will weave e-mail into their channel strategy. In addition to setting up "click here to give feedback" buttons, marketers will target customer segments with: 1) teaser messages to pull Web site visitations; 2) clearance promotions to sell overstocks while avoiding print catalog expense; and 3) appreciation letters to thank clients for their patronage, with embedded coupons or notification of a credit line increase.
E-mail addresses will be worth real money. Godiva chocolatier will pay Bon Appétit magazine a nickel or more for subscriber names with e-mail addresses. In turn, the confectioner will blast out pre-Valentine's Day missives pitching its romantic truffle assortment to well-qualified target customers.
Junk e-mail will rise from nuisance to fact of life. Just
like postal mail, unsolicited messages will arrive en masse. Occasionally,
recipients will be interested; most of the time, the message will end up
in the trash. Some consumers will seek out the help of organizations
like the Direct Marketing Association to put their names on the "do not
e-mail" list.
IBM Research today has the best technology for e-mail categorization, based on up-to-date research in categorization, search, clustering, and document similarity techniques. Groups at the Haifa Research Lab and at T.J. Watson have been working on e-mail handling technology using Information Retrieval techniques for some time. The "war on the mailers" is on. Currently, IBM computers still handle a large percentage of e-mail traffic. To maintain IBM's position in this area, enhancements to mailers such as automatic categorization, responding, and filing are essential.
IBM Call Center solutions, such as CallPath, EarlyCloud and DirectTalk, should be enhanced with features to handle inbound e-mail and with generators and responders to handle outbound e-mail.
Text categorization (SuperCat), based on an automatic rule induction method developed at the IBM T.J. Watson Research Center.
Lotus agent: a LotusScript application that runs on the Lotus Notes server and serves as an intelligent agent, reacting to every incoming message and processing it for automatic categorization and response.
LotusScript client application (with a call-center-based application, CallFlow or CallPath), invoked on the agent workstation when a new e-mail reaches the top of the agent's worklist.
Categorization development software: an off-line preparation phase, executed as a setup process, that creates the SuperCat rules.
Administration tools to edit and manipulate the mapping table between categories and skills, agents, or queues.
Logging, tracking, and monitoring tools to log every incoming message and track its time in the queue, the method used for reply, and the time taken to handle it through completion, along with statistics on agent performance and reporting capabilities for post-mortem analysis.
Call center software products that can handle queues, workflow, and worklists and can distribute messages among agents (such as CallFlow or CallPath).
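The mapping table and the logging and tracking tools listed above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the category and queue names, the `MailLog` class, and its fields are hypothetical and do not describe CallFlow, CallPath, or any actual IBM interface. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping table: category -> queue (it could equally map to skills or agents).
CATEGORY_TO_QUEUE = {"billing": "billing-queue", "outage": "tech-support"}


@dataclass
class MailLogEntry:
    message_id: str
    category: str
    queue: str
    received_at: float                    # seconds since some reference point
    completed_at: Optional[float] = None  # set when handling finishes
    reply_method: Optional[str] = None    # e.g. "auto" or "agent"


class MailLog:
    def __init__(self):
        self.entries = {}

    def receive(self, message_id, category, now):
        # Log every incoming message and record the queue it was mapped to.
        queue = CATEGORY_TO_QUEUE.get(category, "default-queue")
        self.entries[message_id] = MailLogEntry(message_id, category, queue, now)
        return queue

    def complete(self, message_id, reply_method, now):
        # Record how the message was answered and when handling ended.
        entry = self.entries[message_id]
        entry.completed_at = now
        entry.reply_method = reply_method

    def mean_handling_time(self, queue):
        # Average time from receipt to completion for one queue, or None if no data.
        times = [e.completed_at - e.received_at
                 for e in self.entries.values()
                 if e.queue == queue and e.completed_at is not None]
        return sum(times) / len(times) if times else None
```

Keeping the mapping in a plain table, separate from the categorizer, is what lets an administrator re-point a category at a different queue without retraining any rules.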
Presentation explaining the main principles of the E-Mail Responder
is attached below:

Statistical text analysis techniques are used to extract from textual data the most significant words and word combinations.
This task is conducted in several steps:
1.1 tokenization and lemmatization
Words and sentences are first isolated from documents
(via an automaton-based mechanism), then every individual word is mapped into
a canonical form. The individual word is the atomic unit of conceptual
information in most indexing schemes. Therefore, it is highly desirable
that several variants of the same word (plural vs. singular, various verb
inflections, construct forms) be mapped into the same canonical form, the
"lemma", as they are conceptually identical. This lemmatization stage is
crucial and can be approached in two ways:
via "stemming" (based on ad-hoc suffix-stripping rules
and exception lists), or via morphological analysis (based
on actual grammatical rules and a dictionary). The
approach we propose will make use of context-based disambiguation
methods that proved valuable in other indexing applications (see [Maarek
et al. 1989]).
Assuming that a unique lemma has been obtained, the next step is to obtain a profile of indexing units that will characterize each document.
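To make the "stemming" alternative concrete, here is a minimal Python sketch of tokenization followed by suffix stripping with an exception list. It is not the context-based disambiguation approach proposed above: the suffix rules, the exception list, and the three-character minimum are hypothetical and deliberately crude, and the simple regular expression stands in for the automaton-based tokenizer.

```python
import re

# Hypothetical exception list for irregular forms that suffix rules would miss.
EXCEPTIONS = {"children": "child", "went": "go", "mice": "mouse"}

# Hypothetical ad-hoc suffix rules, tried in order: (suffix, replacement).
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("es", ""), ("s", "")]


def tokenize(text):
    # Isolate lowercase alphabetic words; a stand-in for a real tokenizer.
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    # Exceptions take priority over the general rules.
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    for suffix, replacement in SUFFIX_RULES:
        # Keep at least three characters before the suffix to limit over-stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + replacement
    return word
```

With these rules, "queries" and "query" both map to "query", and "children" maps to "child" through the exception list. Rules this simple also over-strip some words (for instance "messages" becomes "messag"), which is one reason dictionary-based morphological analysis is the more precise alternative.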
1.2 indexing
The indexing stage consists of inferring, from the list of lemmas that compose a text, the most significant ones so as to form a "profile" or "document vector" in Salton's vector space model [Salton & McGill 83], in which each different indexing unit induces a distinct dimension. Most indexing units are based on single words (in their lemmatized form). The document similarity engine takes advantage of an original indexing unit developed at IBM Research, based upon the notion of "lexical affinity" and embodied in several IBM products, in which indexing units consist of word pairs in close context that disambiguate each other. Lexical-affinity-based schemes have been shown to give higher precision in retrieval [Maarek 91].
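As a rough illustration of word-pair indexing units in a vector space, the Python sketch below builds a profile of unordered word pairs that co-occur within a small window and compares two profiles by cosine similarity. The window size of five, the stopword list, and the raw-count weighting are assumptions made for the example; the text above does not specify how the IBM engine selects or weights its units.

```python
from collections import Counter
from math import sqrt

# Hypothetical stopword list; function words are dropped before pairing.
STOPWORDS = {"the", "a", "an", "of", "to", "my", "is", "on", "for", "and"}


def affinity_profile(lemmas, window=5):
    # Each unordered pair of distinct words within `window` positions
    # becomes one dimension of the document vector.
    words = [w for w in lemmas if w not in STOPWORDS]
    profile = Counter()
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                # Sort so "credit card" and "card credit" share a dimension.
                profile[tuple(sorted((word, other)))] += 1
    return profile


def cosine(p, q):
    # Cosine of the angle between two sparse vectors stored as Counters.
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0
```

A message containing "lost my credit card" yields the pair ("card", "credit") among its dimensions, so it scores as similar to another message about credit cards but shares no dimension with one that only mentions "mortgage rate".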
2 text categorization (machine learning)
The SuperCat text categorization solution is based on supervised machine learning methods that induce symbolic rules, and includes:
(A) Categorization Development software and
(B) Runtime Categorization system.
The Categorization Development software consists of three components:
1. Feature Selection and Vector Generation,
2. Machine Learning Method, and
3. Rules Generation and Simplification.
The Categorization Development phase (or training phase) is a batch process. The time required by the machine learning method depends on the amount of data and the number of categories. It is reasonably fast on a reasonably powerful machine: although we do not have precise benchmark timings, our experience indicates that training should take very roughly no more than 15 minutes per category, including feature selection, vector generation, and rule induction.
The runtime system consists of:
1. Linguistic Preprocessor,
2. Rule Applier. Note that one of the competitive edges of SuperCat is that rules can be hand-altered or hand-built as well as automatically generated. Thus the rule applier can mix and match machine-generated and hand-built rules.
3. Performance Tracker, which provides information on precision and recall, overall and on a per-category basis; reports which rules applied correctly or incorrectly on test data; and reports processing speed (documents per second, rules per second, bytes per second). In categorizing a new document, the system reports back as "supporting data" which rules applied to the document. One could also customize a system so that the words or phrases in a document that were involved in rule application are highlighted, so that portion of the text could be examined (not currently implemented).
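The rule applier and performance tracker can be pictured with the following Python sketch. The rule format (a set of terms that must all be present), the rule names, and the categories are hypothetical; the text above says only that SuperCat rules are symbolic and may be machine-generated or hand-built, so this is not SuperCat's actual rule language or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    required: frozenset  # every one of these terms must appear in the document
    category: str
    source: str          # "machine" or "hand"


# Hypothetical rule set mixing machine-generated and hand-built rules.
RULES = [
    Rule("r1", frozenset({"credit", "card"}), "cards", "machine"),
    Rule("r2", frozenset({"mortgage"}), "loans", "machine"),
    Rule("h1", frozenset({"refinance"}), "loans", "hand"),
]


def apply_rules(terms, rules=RULES):
    # Return the assigned categories plus the names of the rules that fired,
    # which plays the role of the "supporting data".
    present = set(terms)
    fired = [r for r in rules if r.required <= present]
    return {r.category for r in fired}, [r.name for r in fired]


def precision_recall(predicted, actual, category):
    # Per-category precision and recall over parallel lists of category sets.
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if category in p and category in a)
    fp = sum(1 for p, a in pairs if category in p and category not in a)
    fn = sum(1 for p, a in pairs if category not in p and category in a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Because a document can fire several rules, it can receive several categories at once, and the cost of applying the rules depends on the number of rules rather than on the number of training documents.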
Advantages of Rule-Induction Systems:
The use of rules has a number of advantages:
1. Rule application is fast and does not depend on the size of the database of documents (unlike, e.g., Nearest Neighbor algorithms),
2. Training is straightforward,
3. Rules are understandable,
4. Rules can be hand-altered or hand-crafted,
5. Rule induction approaches support multiple categorization and hierarchical categorization.
SuperCat Runtime System Requirements:
The SuperCat runtime system runs on Windows 95, Windows NT, AIX, and OS/2 and requires only modest CPU and RAM,
e.g., a 100 MHz Pentium with 16-32 MB of RAM. The runtime classification system is delivered as a DLL with an easy-to-use API.