Workshop: The 5th IEEE International Conference on Advanced Learning Technologies
Adel Elsayed
The University of Bolton
a.elsayed@bolton.ac.uk
Roger Hartley
The University of Leeds
j.r.hartley@education.leeds.ac.uk
Milena Pesheva
The University of Bolton
mp6ect@bolton.ac.uk
Active multimodal presentations (AMPs) are stand-alone, free-running presentations that optimize the use of the principal perceptual modalities of the audience, i.e. the auditory and visual modes. This optimization enhances the effectiveness of presentations, which is particularly useful for achieving their educational objectives. It can be realized in two ways: mode assignment and integration of modalities. In mode assignment, the message content is distributed amongst modalities: a verbal component that addresses the auditory mode, and a pictorial component that addresses the visual mode. In addition, gestures can be considered a third modality that supports and enhances the semantic content of the message. Gestures also provide an instrument for integrating message modalities, which is the second means of optimization.
AMPs, therefore, have three main components: an auditory component, representing the speech narrative; a visual component, representing visual objects; and a gestural component, representing the integrating object, which binds the other two components into a coherently integrated multimodal (CIM) environment. In human communication, the natural integrating object is the hand. We use our hands not only to externalize our internal representations, by scripting, sketching, etc., but also to add a gesture component to the communication act that directs attention, provides illustration, emphasizes the contours of the narrative, and expresses affect and empathy. The role of gestures, therefore, is not confined to the integration of message modalities, but extends to a social level, integrating the presenter and the audience into a socially coherent communication environment.
One of our research themes is concerned with the analysis and understanding of the nature and role of these gestural attributes in various types of multimodal presentations. One of these is the computer-based presentation mode that we refer to as an AMP, the version introduced in this workshop. The other versions include board-based presentations and table-based presentations, captured through video cameras. AMPs are usually captured either through a screen-capturing software utility, sometimes referred to as a screen cam, or with a dedicated software presentation tool developed specifically for that purpose.
Reflecting on the differences between the above three methods of capturing presentations (video, screen cam and bespoke software) reveals the motivation for developing AMP technology. These differences relate to three main factors: the first concerns resource issues; the second concerns tractability to machine processing; and the third relates to production flexibility, an important factor that influences cost and quality.
As for the resource aspect, which relates to communication resources, video is known to be highly demanding of streaming bandwidth. Hence, there is a natural interest in developing a medium that is as effective as video but less demanding of bandwidth. The solution lies in AMP, for the reasons given below. In terms of tractability to machine processing, video is a scanning-based production technology, which does not lend itself to readily identifying the individual components that make up the captured scene. In fact, this has become the objective of another research area, computer vision, which aims at identifying and manipulating individual components. In addition, there is some research interest in ‘component-based encoding’ for interactive television. Using these technologies, however, would add unnecessary complication and cost to the process when it is possible to compose the scene from its constituent components at the point of production.
Video, therefore, is not the right technology for a composition-based production approach. AMP, on the other hand, is a synthetic medium and hence naturally suited to this kind of flexible production approach: in AMP, every component is independently acquired and is therefore conveniently encoded for full accessibility to machine processing. Moreover, the independent acquisition of AMP components makes the medium compatible with the natural requirements of human delivery of presentations, which makes this approach very economical in production costs, yet very effective in its communication attributes. In addition, the component-based approach allows every element of content to be independently optimized for streaming bandwidth, hence reducing bandwidth requirements to a minimum.
Part of our research at the APT Lab aims at developing software that addresses the problems of AMP acquisition, processing, streaming and delivery, while the major part is concerned with investigating the characteristics and attributes of AMP components, e.g. the roles and attributes of gestures, the characteristics of AMP discourse, the semantic processing of AMP content, etc. As our target software is still under development, we have identified intermediate solutions for the creation of AMP content using generic multimedia tools, e.g. Macromedia’s Director. In spite of its limitations, Director provides a convenient platform for prototyping AMPs, e.g. for empirical proof-of-concept investigations. The generic nature of Director, however, means that its IDE (integrated development environment) has become so complex that a specialist is usually required to use it. A simpler alternative can be found in screen-capturing utilities, e.g. ScreenCam. This is a straightforward option that needs no special training, yet is capable of producing content that is moderate in its demand for bandwidth. Its drawback, however, is the lack of component accessibility for machine processing. Since the scope of this workshop does not include the machine processing aspect of AMP, we will confine ourselves to the use of screen cam software, which is quite adequate for this purpose.
The purpose of the workshop is to introduce and illustrate to participants:
and
The software tools come in three categories:
The hardware tools include a tablet PC, or a standard PC/Notebook with a digitizing tablet as an additional peripheral.
The techniques offered in the workshop are not limited to how to use the software and hardware elements, but extend to cover the following:
The workshop session will be organized in the following order:
Although anyone with an interest in developing and using computer-based presentations would benefit from this workshop, it would be of particular interest to those engaged in educational, training and instructional activities, or who have some interest in researching this topic. No specific prerequisite experience is expected of participants other than an interest in producing stand-alone, free-running, computer-based presentations. The workshop audience, however, is expected to know how to interact with a software application, and to have a minimum level of familiarity with computer graphics, e.g. how to prepare a simple PowerPoint presentation.
Adel Elsayed
Adel Elsayed is the Research Leader of the Active Presentation Technology Lab, based in the Department of Computing and Electronic Technology at the University of Bolton. His early research training and interests were in control engineering and speech technology. However, he has a long and passionate interest in using technology to support the learning/teaching processes, which has become the hub of his research activities. His current research interests include computer-mediated multimodal communication, and computer-support for learning.
Roger Hartley
Roger Hartley is Professor of Computers in Education at the University of Leeds. His background is in the physical sciences and psychology, and until 1997 he was director of the Computer Based Learning Unit at Leeds. His research interests are in communication and interaction in multimodal contexts, and linkages between autonomous and collaborative learning.
Milena Pesheva
Milena Pesheva is reading for a PhD degree at the APT Lab, Department of Computing and Electronics, The University of Bolton. Her work relates to the multimodal aspects of human communication in an educational context, with particular interest in the role of gestures. She also has an interest in computer assisted learning and multimedia learning design. Her background, before joining this research programme, was in linguistics and ICT, in the course of which she obtained a Masters in IT management.