Motivation

Authoring an effective and coordinated multimedia presentation is difficult and time-consuming. Authoring a tailored multimedia discourse in a dynamic human-computer conversation is even more demanding. In a dialogue, a multimedia discourse is a stream of multimedia acts (e.g., graphics acts and language acts) performed by a computer during its turn(s). A coherent multimedia discourse must also ensure the necessary coordination between relevant multimedia acts (e.g., coordinating a graphics display act with a speech act) and proper transitions between them (e.g., using a visual dissolve act to bridge the gap between two display acts).

Unlike in a retrieval-based multimedia authoring system, most of our multimedia content, such as spoken sentences, textual tables, and graphics scenes, is generated automatically on the fly based on the conversation context. Since it is difficult to predict exactly how a human-computer conversation will unfold, it is not practical to hand-craft all multimedia discourse in advance.

Approach

To support a full-fledged human-computer multimedia dialogue, we are developing IMPRESA, which can automatically generate an interactive and coordinated multimedia discourse. IMPRESA authors a multimedia discourse in two phases. First, it authors a multimedia draft. Based on the draft, media-specific designers (e.g., a visual designer and a language designer) then work cooperatively to create a multimedia blueprint. Both the draft and the blueprint are made up of a set of media acts.

Beyond the usual functions (e.g., content selection, content organization, and media allocation), IMPRESA has three functions that existing automated authoring systems do not address. First, IMPRESA can automatically author multimedia interaction acts (e.g., posing an inquiry verbally or visually) to engage users in conversation. Second, IMPRESA can dynamically insert proper multimedia punctuation acts, which separate or connect relevant media acts to form a coherent and effective discourse. Third, IMPRESA can systematically design cross-media acts (e.g., using speech to refer to content conveyed in graphics) to present users with a coordinated, rich multimedia tour of information.
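The two-phase idea described above (a draft of media acts refined into a blueprint) can be illustrated with a small sketch. This is not IMPRESA's implementation: the `MediaAct` type, the functions `author_draft` and `refine_to_blueprint`, the act names, and the rule for inserting a dissolve are all hypothetical, chosen only to show how punctuation and interaction acts might be added to a stream of media acts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MediaAct:
    """Hypothetical media act; not IMPRESA's actual representation."""
    medium: str   # e.g. "graphics" or "speech"
    kind: str     # e.g. "display", "say", "dissolve", "ask"
    content: str


def author_draft(topics: List[str]) -> List[MediaAct]:
    """Phase 1 (assumed): one display act and one speech act per topic."""
    draft: List[MediaAct] = []
    for topic in topics:
        draft.append(MediaAct("graphics", "display", topic))
        draft.append(MediaAct("speech", "say", f"Here is {topic}."))
    return draft


def refine_to_blueprint(draft: List[MediaAct]) -> List[MediaAct]:
    """Phase 2 (assumed): add punctuation and interaction acts to the draft."""
    blueprint: List[MediaAct] = []
    previous_display: Optional[MediaAct] = None
    for act in draft:
        if act.kind == "display":
            if previous_display is not None:
                # Punctuation act: bridge the gap between two display acts.
                transition = f"{previous_display.content} -> {act.content}"
                blueprint.append(MediaAct("graphics", "dissolve", transition))
            previous_display = act
        blueprint.append(act)
    # Interaction act: close the turn with a verbal inquiry to the user.
    blueprint.append(MediaAct("speech", "ask", "Would you like more detail?"))
    return blueprint


if __name__ == "__main__":
    blueprint = refine_to_blueprint(author_draft(["house A", "house B"]))
    print([act.kind for act in blueprint])
    # ['display', 'say', 'dissolve', 'display', 'say', 'ask']
```

In this sketch the draft fixes what is shown and said, while the second pass only adds acts around it. That split is an assumption made to keep the example short; the cooperative work between media-specific designers described above is not modeled here.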