Advanced acoustic solutions for an IT-based healthcare system
The aim of the project is to provide acoustic communication solutions that allow development of a new product for IT-based health care. The product is intended for instructor-led gymnastics exercises held on-site at companies, i.e. a health care activity that companies can offer to their employees using video conferencing techniques. A typical example is a 15-minute daily shoulder-and-neck exercise session performed by several employees in the company's own facilities and led by a professional instructor from a remote location using video conference equipment. A pilot system, built with existing techniques, has shown that the product is promising.
Sufficient video performance can be achieved using large screens and high-quality cameras. However, a challenging audio problem needs to be solved in order to achieve a well-functioning solution. A main objective for the product is to give the feeling that the instructor is present in the room. This requires simultaneous two-way speech communication between the instructor and the participants, which in turn requires echo cancellation. Without echo cancellation, the acoustic echo (loudspeaker sound picked up by the microphones and sent back to the talker) means that the system can allow only one side to speak at a time.
For some programs it is desirable to accompany the exercises with background music. This implies that there will be a constant transmission of sound. Further, the participants will move around constantly during the exercises. The constant sound and lively motion make the environment acoustically very challenging. In this project we aim to research and develop acoustic echo cancellation methods that provide a satisfactory audio part of the system.
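To make the idea of acoustic echo cancellation concrete, the following is a minimal sketch using a normalized LMS (NLMS) adaptive filter. NLMS is a standard textbook technique chosen here for illustration only; it is not necessarily the method developed in the project. The function name, the parameters (num_taps, step, eps) and their default values are assumptions made for this example.

```python
def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Illustrative NLMS echo canceller (hypothetical helper, not the project's method).

    far_end: samples sent to the loudspeaker.
    mic:     samples picked up by the microphone (echo plus any local sound).
    Returns (residual, weights), where residual[n] = mic[n] - estimated echo[n].
    """
    weights = [0.0] * num_taps   # current estimate of the echo path
    history = [0.0] * num_taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalizing by the input power keeps the step size independent
        # of the loudspeaker signal level.
        power = sum(h * h for h in history) + eps
        weights = [w + step * e * h / power for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights
```

In this sketch the filter adapts on every sample. With music playing and participants speaking at the same time, a practical canceller would have to slow or freeze adaptation during doubletalk, which is one reason the environment described above is demanding.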
Realization of the project
The industrial goal of this project is to provide means for two-way speech communication and to realize a commercial product with superior quality compared to existing systems. The project starts with the development of a mathematical computer model, which will contain different sets of impulse responses corresponding to different rooms and microphone/loudspeaker setups. The model will be run with test signals corresponding to different operating scenarios, as well as with input/output signals recorded from real physical systems. The simulated environment enables testing of algorithms under conditions very close to real surroundings. However, successful operation in the simulations does not automatically imply that an algorithm will function in a real system. The model is crucial for research efficiency, as it allows efficient evaluation of innovations and thus enables a high innovation pace.
Invented solutions and methods will be implemented in the form of signal processing algorithms. The functionality will be tested and improved in simulations. Promising algorithms from the simulated environment will then be developed further and tested in real system scenarios. In this phase, signals from real systems will be used during the development.
Evaluation of the developed system will be performed in collaboration with IKSU. Their training experts will test the equipment in real acoustic environments, take an active role in improving it, and give feedback on new technical inventions.
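The core of such a simulation model can be sketched as a convolution of the loudspeaker signal with a room impulse response, with any local (near-end) sound added on top. The function below is a minimal illustration under that assumption; its name and arguments are hypothetical, and the impulse response values in the usage note are made up rather than measured.

```python
def simulate_microphone(far_end, room_impulse_response, near_end=None):
    """Simulate the microphone signal for one room/loudspeaker/microphone setup.

    The echo is the far-end (loudspeaker) signal convolved with the room
    impulse response, truncated to the length of far_end. If near_end is
    given (assumed to have the same length), it is added sample by sample.
    """
    mic = []
    for n in range(len(far_end)):
        acc = 0.0
        for k, h in enumerate(room_impulse_response):
            if n - k < 0:
                break  # no loudspeaker samples exist before time zero
            acc += h * far_end[n - k]
        mic.append(acc)
    if near_end is not None:
        mic = [m + s for m, s in zip(mic, near_end)]
    return mic
```

For example, simulate_microphone([1.0, 0.0, 0.0, 0.0], [0.5, 0.25]) returns [0.5, 0.25, 0.0, 0.0]. Different rooms and setups would then correspond to different impulse responses passed to the same function.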
An intention of this project is to use our current knowledge in the fields of fullband echo cancellation, subband solutions, doubletalk detection, low-complexity techniques, fixed-point processing and systematic evaluation, and to use parallel filters for echo cancellation as a starting point for research on an innovative new type of parallel-filter/subband-based acoustic echo canceller. We aim to meet the project goal through a combination of these fields, together with new solutions invented in this project.
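As an illustration of what doubletalk detection involves, the sketch below implements the classic Geigel detector, which flags near-end activity when the microphone level exceeds a fraction of the recent far-end peak level. It is shown only as a well-known baseline and is not claimed to be the detector used in the project. The function name and the window and threshold values are assumptions for this example.

```python
def geigel_doubletalk(far_end, mic, window=64, threshold=0.5):
    """Geigel-style doubletalk detector (illustrative baseline).

    Returns one boolean per sample: True when |mic[n]| exceeds
    threshold times the largest |far_end| value among the last
    `window` samples. Assumes far_end and mic have equal length.
    """
    flags = []
    for n, d in enumerate(mic):
        start = max(0, n - window + 1)
        far_peak = max(abs(x) for x in far_end[start:n + 1])
        flags.append(abs(d) > threshold * far_peak)
    return flags
```

A detector of this kind is typically used to freeze or slow filter adaptation while it signals doubletalk. The fixed threshold assumes a known echo attenuation, which is harder to guarantee in an acoustic setting than on a telephone line.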
The project aims to find methods for an audio conferencing system, i.e. acoustic echo cancellation and other key functions such as doubletalk detection and non-linear processing of residual echo. The system should perform continuous acoustic echo cancellation at a level of more than about 30-50 dB Echo Return Loss Enhancement (ERLE); the required level will also depend on the other key functions. Echo cancellation together with the other key functions should then provide a full-duplex solution, i.e. one that allows simultaneous two-way communication. Furthermore, there should be no perceivable audio deterioration on the customer side (verified by evaluating processed and unprocessed sound in psychoacoustic tests) in an environment where more than 10 persons constantly move around less than approximately 10 m from the system's microphones and/or loudspeakers. Finally, the methods should be implementable as industrially attractive solutions, i.e. they should be fit for real-time implementation on a low-cost processor.
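ERLE is commonly computed as the ratio, in dB, between the power of the microphone signal and the power of the residual signal after echo cancellation, measured while only the far end is active. The helper below is a minimal sketch of that definition; the function name and the small eps guard against division by zero are assumptions for this example.

```python
import math


def erle_db(mic, residual, eps=1e-12):
    """Echo Return Loss Enhancement in dB.

    mic:      microphone samples containing the echo (no near-end speech).
    residual: the same segment after echo cancellation.
    """
    mic_power = sum(d * d for d in mic) / len(mic)
    residual_power = sum(e * e for e in residual) / len(residual)
    return 10.0 * math.log10((mic_power + eps) / (residual_power + eps))
```

As a sanity check, reducing the echo amplitude by a factor of 100 corresponds to a power ratio of 10^4, i.e. about 40 dB, which lies in the middle of the target range stated above.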
Project leader: Ingvar Claesson