Audio Technology to Support Video Conference Systems

Audio quality has been greatly improved in the RICOH Unified Communication System (RICOH UCS). The video conference system now features an impulse noise reducer, which reduces noise such as the sound of typing on a keyboard, and a high-performance echo canceller.

Impulse noise reducer

Impulse noise (non-stationary noise) is noise that occurs suddenly; it includes the noise generated when one types on a keyboard, clicks a ballpoint pen, or presses a button. Impulse noise obstructs communication in video conferences because it makes conversations inaudible for an instant.

The impulse noise reducer eliminates impulse noise by estimating its amplitude spectrum. The RICOH UCS significantly reduces impulse noise, so the people at the other end of the network hear sound in which the impulse noise is attenuated.

The specific processes are as follows: First, the RICOH UCS estimates the noise spectrum, taking into account the dynamic characteristics of impulse noise (the rising and falling of the spectrum). The estimation is based on the general properties of impulse noise, so it is independent of the types of noise and timing of occurrence. The estimation is highly accurate even when the noise overlaps other sounds. The estimated noise spectrum is subtracted from the spectrum of the original signals, thus eliminating the noise.
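
The subtraction step described above can be sketched in Python as follows. This is a minimal illustration only: the source does not describe how the RICOH UCS estimates the noise spectrum, so the noise estimate is simply passed in, and the function name subtract_noise_spectrum, the floor parameter, and the sample values are assumptions.

```python
import cmath

def subtract_noise_spectrum(signal_spectrum, noise_amplitude, floor=0.0):
    """Subtract an estimated noise amplitude from each frequency bin.

    signal_spectrum: complex values, one per frequency bin (e.g. from an FFT).
    noise_amplitude: estimated noise amplitude for the same bins (assumed given).
    The phase of the observed signal is kept; only the amplitude is reduced.
    """
    if len(signal_spectrum) != len(noise_amplitude):
        raise ValueError("spectrum and noise estimate must have the same length")
    cleaned = []
    for value, noise in zip(signal_spectrum, noise_amplitude):
        amplitude = abs(value)
        # Clamp at the floor so an over-estimate never yields a negative amplitude.
        reduced = max(amplitude - noise, floor)
        cleaned.append(cmath.rect(reduced, cmath.phase(value)))
    return cleaned

# Hypothetical three-bin example: observed amplitudes are 5, 2, and 1.
observed = [3 + 4j, 0 + 2j, 1 + 0j]
noise_estimate = [2.0, 0.5, 3.0]
cleaned = subtract_noise_spectrum(observed, noise_estimate)
amplitudes = [round(abs(c), 6) for c in cleaned]
print(amplitudes)
```

In the third bin the estimate exceeds the observed amplitude, so the result is clamped to the floor rather than going negative. A real system would apply this frame by frame and convert the cleaned spectrum back to a waveform.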

An experiment has yielded the following result:

Figure 1: Experiment reducing impulse noise

Ricoh's original impulse noise reduction technology has greatly reduced unwanted noise, improving the comfort of voice communication.

High-performance echo canceller

Issues with the conventional technology

Conventional echo cancellers had the following issues:

(1) Convergence is slow for colored signals like voice and music; residual echoes obstruct conversations.

Convergence refers to how accurately and how quickly the system estimates the way sound propagates between the speakers and microphones (the transfer function). If the convergence is fast (the transfer function is accurate and the solution is obtained quickly), the echo is small and conversation is easy.

(2) The conventional system is not sufficiently robust against external disturbances such as the listener's voice and noise. This issue is often addressed with a voice switch, which disconnects one voice channel to make the communication simplex. When both sides talk at the same time, however, the voice of one side can fail to reach the other end.

External disturbances (air-conditioner noise, fan noise, and the listener's voice, for instance) destroy the estimated transfer function. This destruction needs to be as small as possible. In a robust system, the transfer function is not easily destroyed and duplex communication is enabled.

(3) Wideband communications can adversely affect convergence, making it difficult to improve sound quality.

The IT infrastructure has advanced and communications have become wideband. Wideband systems can now send many more signals at lower cost than conventional telephone lines could. The transfer function estimation system has to support wideband communications to increase the quality of voice communication.
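
For reference, the kind of adaptive filter that conventional echo cancellers commonly use can be sketched as follows. The example uses the normalized least mean squares (NLMS) algorithm, a widely used adaptive filter chosen here only for illustration; the source does not say which algorithm any particular product uses. The function name, the step size, and the three-tap echo path are hypothetical. The input is white noise with no external disturbance, which is the easy case; the issues listed above arise with colored signals, disturbances, and long filters.

```python
import random

def nlms_echo_canceller(far_end, microphone, taps, step=0.5, eps=1e-8):
    """Estimate the echo path with NLMS; return (residual, coefficients)."""
    weights = [0.0] * taps      # current estimate of the transfer function
    history = [0.0] * taps      # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate           # signal sent to the other end
        power = sum(h * h for h in history) + eps
        # Normalized update: move the coefficients along the input direction.
        weights = [w + step * error * h / power
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights

random.seed(0)
echo_path = [0.6, -0.3, 0.1]    # hypothetical speaker-to-microphone response
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
# Microphone signal: far-end sound filtered by the echo path (no local talker).
microphone = [
    sum(echo_path[k] * far_end[n - k]
        for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]
residual, weights = nlms_echo_canceller(far_end, microphone, taps=3)
print([round(w, 3) for w in weights])
```

Once the coefficients match the echo path, the residual echo approaches zero. Adding a local talker or background noise to the microphone signal perturbs the coefficient update, which is the robustness problem described in (2).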

New technology to greatly improve audio performance

The high-performance echo canceller does not use an adaptive filter or Kalman filter, which are generally employed in conventional echo cancellers. Instead, the new echo canceller minimizes the influence of external disturbances*1 on the filter output.
*1 External disturbance: Errors in the initial filter coefficients; unknown fluctuations of the echo path; and environmental noise included in observed signals

The new technology has the following advantages regarding the issues with conventional technology:

(1) The new technology has higher convergence performance than the conventional technology for colored signals. Convergence is fast and echo attenuation is large. The new technology transmits voice clearly and makes it easy to talk.

(2) The highly robust system maintains its high convergence performance, effectively cancelling echoes even when there is a high level of environmental noise or when the listener speaks while the voice from the other end is being output from the speaker. The system allows natural conversations between two or more people talking simultaneously.

(3) The new technology allows transmission of high-frequency sounds even when the number of taps (the number of coefficients of the digital filter) is large in a wideband system. Even when there are many taps, the transfer function estimation of the new technology remains fast, with the convergence time kept almost as short as with fewer taps. The new technology enables high audio quality.
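
To see why the number of taps grows in a wideband system, note that a filter modelling an echo path of a given duration needs roughly one coefficient per sample over that duration. The sketch below works through the arithmetic; the sampling rates and the 100 ms echo duration are hypothetical values, not RICOH UCS specifications.

```python
def taps_for_echo_path(sample_rate_hz, echo_duration_ms):
    """Number of filter coefficients needed to span the echo duration."""
    return sample_rate_hz * echo_duration_ms // 1000

narrowband_taps = taps_for_echo_path(8000, 100)    # telephone-band sampling
wideband_taps = taps_for_echo_path(32000, 100)     # hypothetical wideband sampling
print(narrowband_taps, wideband_taps)
```

Under these assumptions, quadrupling the sampling rate quadruples the number of coefficients that must be estimated for the same room.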

Figure 2: Echo cancelling performance when both sides are talking simultaneously

*This technology was jointly developed with ARI Corporation, based on patents of the Japan Science and Technology Agency (J-Fast H∞ filter, devised by Professor Kiyoshi Nishiyama, Faculty of Engineering, Iwate University).

Other audio quality improvement technologies

In addition to the impulse noise reducer and high-performance echo canceller, Ricoh has developed many other technologies to improve audio quality and implemented them in the RICOH UCS, including a noise suppressor (to suppress environmental noise), automatic gain control (to regulate the level of the transmitted voice), an equalizer, and a dynamic range compressor/expander.
