
A Broken Compass

Posted by Henrik Lundin
on February 2nd, 2010 in Technology

Browsing around the papers presented at the latest NOSSDAV workshop, I found “An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google Talk, and MSN Messenger”. Having worked extensively with GIPS’ jitter buffer algorithms, and having some knowledge of Google Talk, I was intrigued by the title. The paper had some interesting experiments, but also a few giant leaps to conclusions.

The paper’s authors have created a laboratory test bench for PC soft phones where they emulate different network conditions (delay, jitter and packet losses), and measure objective speech quality (PESQ) and the end-to-end delay. Then they apply a previously proposed hybrid between PESQ and the E-Model to arrive at a score which takes both measured speech quality and delay into account. The idea is that both audio quality and end-to-end delay contribute to the total conversation experience, which is an easily supportable proposition. Finally, they derive an optimal playout buffer delay for each network condition based on this hybrid measure. I will come back to this approach later.
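To make that kind of hybrid score concrete, here is a minimal sketch of how one could be computed. It is not the paper's exact formulation. It assumes the PESQ result is already on the MOS scale, uses the standard E-Model mapping between R-factor and MOS, and uses the well-known simplified delay impairment of Cole and Rosenbluth. The function names and the candidate list in `best_delay` are hypothetical, chosen only for illustration.

```python
def r_to_mos(r):
    """E-Model (ITU-T G.107) mapping from R-factor to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def mos_to_r(mos):
    """Invert r_to_mos by bisection.

    The mapping is only monotonic above R of roughly 6.5, so the search
    is restricted to [6.5, 100] and the input is clamped to [1.0, 4.5].
    """
    mos = min(max(mos, 1.0), 4.5)
    lo, hi = 6.5, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def delay_impairment(delay_ms):
    """Simplified delay impairment Id (assumption: Cole-Rosenbluth fit)."""
    extra = max(delay_ms - 177.3, 0.0)
    return 0.024 * delay_ms + 0.11 * extra


def hybrid_mos(pesq_mos, delay_ms):
    """Combine a PESQ score (assumed MOS scale) with end-to-end delay."""
    r = mos_to_r(pesq_mos) - delay_impairment(delay_ms)
    return r_to_mos(r)


def best_delay(candidates):
    """Pick the (delay_ms, pesq_mos) pair with the highest hybrid score."""
    return max(candidates, key=lambda c: hybrid_mos(c[1], c[0]))
```

Given a set of measured (delay, PESQ) pairs for one network condition, `best_delay` returns the operating point that such a hybrid would call "optimal". The result is only as trustworthy as the PESQ numbers fed into it.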

The experimental part of the paper, setting up the lab and examining the three clients, seems fine to me, even though I’m not sure that their delay estimation algorithm can really cope with the rapid delay changes that modern jitter buffers apply. They also make rather wild assumptions about coding, packetization, and soundcard delays. But those are minor issues. My problem is their use of the objective hybrid model as a guide to optimality. It is widely known that PESQ is rubbish when it comes to assessing agile jitter buffers, simply because it cannot follow the swift delay adaptation. Tacking on a delay impairment factor to obtain a total user experience number frankly doesn’t improve the situation.

The authors wrap up their work by comparing the measured delays of the three clients with the delay that yields the highest score in their hybrid measure under the same network conditions. The three clients all exhibit different behavior, which is not very surprising since they have different jitter buffers, but none of them follows what the authors claim to be optimal. Hence, their conclusion is that the user experience of all three VoIP clients could be vastly improved if only the “optimal” delay were applied. Allow me to disagree.

Surely these VoIP clients can be improved, but to distrust the man-years of design and implementation, and endless hours of in-house and customer tuning and testing, I need something more than the broken compass that is PESQ.