When less is not more: Sound quality in remote interpreting

Two years into the COVID-19 pandemic, with its hard and soft lockdowns, more or less stringent travel restrictions, and varying limitations on indoor gatherings, the world of multilingual multilateral diplomacy, including that of the UN and its specialized agencies, is finally flirting with a return to normal – albeit a new normal. The integration of technologies enabling remote participation is likely to have evolved from an exceptional to a permanent feature in most multilingual conference environments. Such technologies boast numerous advantages: less travel means less time spent (or wasted) on planes, on trains, and in automobiles. This might not only boost efficiency, but more importantly, lower emissions and promote sustainability. It also holds the potential of making access more equitable for those who could not have afforded in-person attendance.

This new normal, however, does not come without challenges and a certain amount of disruption in an environment where meetings are not exceptional, but rather, a daily occurrence. It was not long before the first reports of so-called “Zoom fatigue” made the headlines, referring to the overall feeling of physiological and psychological exhaustion caused by various parameters linked to online meetings, including the type and quality of the audio and video signal. This sounded all too familiar to those who had been providing interpreting services for online multilingual events: conference interpreters, so it would appear, had been bearing the brunt of this technology’s drawbacks. Two years into the pandemic, large institutions and organizations are confronted with many of their staff interpreters on medical leave for issues related to their hearing, exhaustion, and burnout. Pinpointing the precise causes of these effects, therefore, has become a pressing priority for those providing and using conference interpreting services alike.

At the University of Geneva’s research laboratory, we have been experimentally studying the cognitive load involved in complex multilingual tasks, including simultaneous interpreting, since long before the COVID-19 pandemic. In fact, we did so at a time when many, including some scholars in the field, distanced themselves from what was perceived as a quest for purely theoretical knowledge with little or no potential to inform professional practice. Over the past couple of years, however, that has changed significantly: researchers are increasingly studying cognitive load phenomena related to simultaneous interpreting, many of them focusing on conference and remote settings. This is, of course, an encouraging development. Perhaps more importantly, interpreters themselves are showing a growing interest in this type of research, with complex constructs such as cognitive resources, capacity, and load having found their way into the discourse of professionals and their associations, as they explain the complexity of their job. At times, it feels like the pendulum has swung and that – if anything – the risk now is one of oversimplifying what are in fact very complex and largely still underexplored phenomena.

As participants in multilateral conferences will have noticed, there are many different possible technical setups for meetings. Fully online, and therefore virtual, meetings eliminate the physical context within which an event takes place, potentially altering communication. Hybrid meetings, on the other hand, can lead to a substantial divide between communication in the room and remote communication with the room. Finally, while passive remote participation might have little or no repercussion on the proceedings in the room, active remote participation is often perceived as disruptive. From the perspective of conference interpreters servicing these meetings, the principal – but far from the only – challenge is the availability of enough, and good enough, audio and video signal.

Transmitting sound over the internet is far from trivial. The sound waves generated by our articulatory apparatus are captured and transformed into a digital signal before being prepared to be transferred over a network constrained by considerable capacity limitations. While capturing this sound is mainly conditioned by the appropriate hardware, in other words microphones, the preparation and transmission of the signal is conditioned by software, in other words algorithms. They filter out noise and clip entire frequency bands to reduce the amount of data that needs to be transferred – a process often resulting in an impoverished signal, replete with artifacts, and of significantly lower quality than the original sound.
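To make the idea of clipped frequency bands concrete, here is a minimal sketch in Python. It is not how any particular conferencing platform processes audio: the sample rate, the two test tones, the 3,400 Hz cutoff and all function names are assumptions chosen purely for illustration. It mixes a low and a high tone, passes the mix through a simple low-pass filter, and measures how much of each tone survives.

```python
import math

SAMPLE_RATE = 16_000   # Hz; assumed for illustration only
CUTOFF_HZ = 3_400      # hypothetical narrow-band cutoff, not a platform spec
LOW_TONE, HIGH_TONE = 300, 6_000  # Hz; arbitrary test tones


def tone_mix(freqs_hz, n_samples, rate=SAMPLE_RATE):
    """Sum of unit-amplitude sine tones, one sample per list entry."""
    return [sum(math.sin(2 * math.pi * f * t / rate) for f in freqs_hz)
            for t in range(n_samples)]


def lowpass_kernel(cutoff_hz, n_taps=101, rate=SAMPLE_RATE):
    """Windowed-sinc FIR low-pass kernel (Hamming window), unity gain at 0 Hz."""
    fc = cutoff_hz / rate          # cutoff in cycles per sample
    mid = (n_taps - 1) / 2
    taps = []
    for i in range(n_taps):
        x = i - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (n_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, kernel):
    """Convolve, keeping only outputs where the kernel fully overlaps the signal."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[k - 1 - j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def amplitude_at(signal, freq_hz, rate=SAMPLE_RATE):
    """Amplitude of a single frequency component (one-bin discrete Fourier sum)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * t / rate) for t, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * t / rate) for t, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


# 1,700 input samples leave 1,600 after filtering with 101 taps, which is
# a whole number of periods for both test tones at this sample rate.
original = tone_mix([LOW_TONE, HIGH_TONE], 1_700)
filtered = fir_filter(original, lowpass_kernel(CUTOFF_HZ))

before = {f: amplitude_at(original[:1_600], f) for f in (LOW_TONE, HIGH_TONE)}
after = {f: amplitude_at(filtered, f) for f in (LOW_TONE, HIGH_TONE)}

for f in (LOW_TONE, HIGH_TONE):
    print(f"{f:>5} Hz: amplitude before {before[f]:.3f}, after {after[f]:.3f}")
```

In this toy example the low tone passes almost unchanged while the high tone all but disappears. Real codecs are far more elaborate, but the principle is the same: whatever lies above the cutoff never reaches the listener.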

The transmission of visual information comes with similar tradeoffs. Although the human eye best recognizes objects located in the field of central vision of around 60 degrees, our total field of vision extends approximately 120 degrees on the horizontal and the vertical planes. This means that it is almost impossible to capture the amount of visual information accessible to the human eye in a natural meeting room environment on an ordinary computer screen. Instead, depending on the setup, virtual participants in meetings might appear as an overcrowded array of close-ups showing little more than heads and torsos, or a series of names and aliases replacing the captures of turned-off cameras. Either way, this visual information bears little to no resemblance to what the human eye can identify in a physical meeting room, including gestures, body movements and visual aids.

Whereas some of these factors might be seen as a minor nuisance, others cause a lot of frustration and are perceived as making the interpreters’ work more difficult, if not impossible. Some might even have short- or long-term effects on interpreters’ health and well-being. Deteriorated sound quality is a case in point, and is currently under scrutiny as a potential culprit for recurring reports of interpreters experiencing hearing issues. Whereas extended exposure to so-called toxic sound may result in medically diagnosed physiological consequences, deteriorated sound might impact interpreters’ well-being long before physiological symptoms arise.

Against this background, we recently concluded a study on the effects of sound on cognitive load and fatigue in simultaneous interpreters. Although the study was commissioned prior to the outbreak of the pandemic and thus before many of the developments described, its results are very relevant in today’s context. What makes this study unique is the attempt at measuring the effects of remote interpretation in a controlled laboratory environment, and the sample size of over 80 participants, most likely making it the largest experimental study on load in professional conference interpreters to date.

The study specifically looked at whether the sound typical of online conferencing platforms (in terms of the actual frequencies transmitted) had a repercussion on the amount of load conference interpreters perceive as compared to the full frequency range recommended by AIIC (the International Association of Conference Interpreters) and ISO (the International Organization for Standardization). We furthermore measured interpreters’ electrodermal response as an indicator of cognitive and emotional engagement and asked a group of blind and independent judges to evaluate the quality of the interpretation across the two sound conditions.

The results point to a significant effect of sound on interpreters’ perceived load, driven both by a feeling of higher frustration and increased effort. Importantly, the independent and blind evaluations of the output show a significant drop in the quality of the interpretation with impoverished sound: while style and presentation do not seem to suffer, it is the content of what is said that is negatively affected.

Of course, our results provide but one piece of a puzzle which, as we have seen, is complex. What is starting to become clear, however, is that even a single factor like frequency response has a significant repercussion on interpreters and the service they provide.

INSIDE VIEW

Prof. Kilian G. Seeber

Prof. Kilian G. Seeber of the University of Geneva is the leading expert in the field of cognitive load in interpreting.