High Quality Flash Based VoIP – Please Wait

Posted by Jan Linden
on December 18th, 2008 in Technology

Over the last year or so, a number of services have emerged that offer free or low-cost calling directly from your PC browser without a download: GizmoCall, Wengo, Flashphone, and Zenon, to name a few. All are based on Adobe Flash technology.

In this post I will talk about why the quality of these solutions is not as good as desired. There have been high hopes that the newly released Flash Player 10 and the corresponding Flash Media Server would address these issues. However, from what I have seen thus far, this is unfortunately not the case.

The key requirement here is that it should be possible to place a call without downloading a plug-in or an application. Without this constraint it is very easy to build an ActiveX control that calls a DLL running natively on the computer; the DLL takes care of the actual media processing, and great quality can be achieved.

Building a solution that truly requires no download is practically impossible, because JavaScript cannot access audio and video captured on the PC. That is, only one-way streaming is possible for a JavaScript-based solution. There are also serious implementation concerns that rule out this approach.

The prevailing way to avoid the download is to exploit the capabilities of the Adobe Flash Player. Flash Player is installed on almost all Internet-connected computers worldwide, which means that a user rarely needs a new download to use the service. Strictly speaking, however, this is not a download-free solution either; it simply relies on the fact that the download has almost certainly already happened. A dedicated version of Flash Player exists for all significant computing platforms, including Flash Lite for mobile platforms.

Currently there are some serious limitations on the quality that can be achieved. The way the system has been designed, a Flash Media Server (FMS) is required on the other side of the conversation. This means that for a non-Flash endpoint (e.g. SIP-based PSTN termination) to talk with a Flash Player on the end-user side, transcoding needs to occur, both of the protocol and of the media. This will inevitably introduce delay and quality degradation.

Unfortunately, these are not the only limitations of this approach. The conversation quality is further limited by a subpar codec (the Nellymoser codec), significant processing delays, a poor jitter buffer, and echo cancellation issues. The Nellymoser codec is scalable in both bit rate and audio bandwidth. The most typical scenario uses about 33 kb/s on the network, with a 40 ms frame size. The audio bandwidth is roughly 5.5 kHz, i.e. somewhere between narrowband and wideband. Even at this fairly high bit rate there is clear distortion that is quite tiring to listen to.
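To put these figures in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the 33 kb/s, 40 ms and 5.5 kHz figures quoted above at face value. The function name frame_stats is only an illustration, and the sampling-rate value is just the Nyquist lower bound implied by the audio bandwidth, not a statement about what the codec actually uses.

```python
def frame_stats(bitrate_bps, frame_ms, audio_bandwidth_hz):
    """Derive per-frame numbers from a bit rate, frame size and audio bandwidth."""
    # Number of frames sent per second.
    frames_per_second = 1000 / frame_ms
    # Bytes carried per frame at the given bit rate (8 bits per byte).
    bytes_per_frame = bitrate_bps * frame_ms / 1000 / 8
    # Nyquist: the sampling rate must be at least twice the audio bandwidth.
    min_sample_rate_hz = 2 * audio_bandwidth_hz
    return {
        "frames_per_second": frames_per_second,
        "bytes_per_frame": bytes_per_frame,
        "min_sample_rate_hz": min_sample_rate_hz,
    }


# Figures from the paragraph above: about 33 kb/s, 40 ms frames, 5.5 kHz bandwidth.
stats = frame_stats(33_000, 40, 5_500)
print(stats)
```

With these inputs each frame carries roughly 165 bytes and 25 frames are sent per second. Note also that every frame has to be captured in full before it can be encoded, so the 40 ms frame size itself contributes to the delay.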

The delay is maybe the most annoying issue with Flash based VoIP solutions. There are several reasons for the very long delay. The most prominent ones are:

· Flash Player has internal buffers of 200–500 ms (see Adobe blog)

· The use of TCP as transport protocol (a UDP-based protocol is part of the new Flash)

· Inefficient jitter buffer that adds delay to compensate for network jitter

· Delay due to transcoding (additional jitter buffer, decoding and encoding)
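As a minimal sketch of how these contributions add up, the Python snippet below sums only the two components for which this post gives numbers: the 200–500 ms internal Flash Player buffers and the 40 ms codec frame mentioned earlier. The names KNOWN_DELAYS_MS and delay_range are made up for illustration. TCP, jitter buffer and transcoding delays are deliberately left out because I have no figures for them, so the result is a floor, not an estimate of the full delay.

```python
# Known one-way delay contributions in milliseconds, as (low, high) ranges.
KNOWN_DELAYS_MS = {
    # Internal Flash Player buffers, from the list above.
    "flash_player_buffers": (200, 500),
    # One 40 ms frame must be captured before it can be encoded.
    "codec_frame": (40, 40),
}


def delay_range(components):
    """Sum the low and high ends of each (low, high) delay range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


low, high = delay_range(KNOWN_DELAYS_MS)
print(f"known one-way delay: {low}-{high} ms, before network, jitter buffer and transcoding")
```

Further rows can be added to the dictionary once measurements for the remaining items are available.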

For example, today I tried GizmoCall from my PC to a PSTN phone. In the direction from my PC to the PSTN phone the delay was roughly 500 ms, which is very high but comparable to a cell phone to cell phone scenario. That is, it is possible to communicate, but the talkers will interrupt each other quite frequently. In the other direction, from the PSTN phone to the PC, the delay was much worse: almost 2 seconds, which made it impossible to carry on a normal conversation. To verify that it was not just bad luck I made a second call, but it was even worse. This time the delay was almost 2 seconds in both directions. Of course, my experience was worse than what is possible to achieve, but the theoretical one-way latency seems to be on the order of 500 ms, which is way too high for a reasonable conversational experience.
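For a rough sense of scale, the sketch below compares the delays from these test calls with the one-way delay guidance in ITU-T Recommendation G.114, which is not part of the tests above: up to about 150 ms is generally fine, and 400 ms is treated as the upper limit for general network planning. The function name, the labels and the rounding of "almost 2 seconds" to 2000 ms are assumptions made purely for illustration.

```python
# One-way delay thresholds in milliseconds, following ITU-T G.114 guidance.
GOOD_MS = 150
LIMIT_MS = 400


def rate_one_way_delay(delay_ms):
    """Classify a one-way mouth-to-ear delay against the G.114 thresholds."""
    if delay_ms <= GOOD_MS:
        return "good"
    if delay_ms <= LIMIT_MS:
        return "tolerable"
    return "too high"


# Approximate delays from the test calls described above.
MEASURED_MS = {
    "first call, PC to PSTN": 500,
    "first call, PSTN to PC": 2000,
    "second call, both directions": 2000,
}

for label, delay_ms in MEASURED_MS.items():
    print(f"{label}: {delay_ms} ms -> {rate_one_way_delay(delay_ms)}")
```

Even the best of the measured values falls outside the recommended range.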

In the recent release of Flash Player 10, a new codec (Speex) was introduced. Speex is an open source codec that offers reasonably good speech quality without any licensing fees. Many companies have previously shied away from this codec because of the uncertain licensing situation. Clearly Adobe must feel pretty confident that Speex does not infringe on any patents. Personally, I am not so sure, but to date there are no known claims against Speex.

The addition of Speex hardly solves the problems with Flash-based VoIP, but it at least introduces some flexibility and makes it possible to avoid transcoding. This assumes that an open implementation of RTMP (the transport protocol used), e.g. Red5, is used. Previously, this was not possible because of the need to support the Nellymoser codec. That said, this is only possible with Flash Player 10, and it will surely take a while before the majority of computers are upgraded to it. The legal situation concerning the use of code created by reverse engineering also hangs as a dark cloud over this solution.
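The codec reasoning above can be sketched as a small decision function. This is a hypothetical helper, not part of any Flash or Red5 API: pick_codec, the codec labels and the example far-end codec sets (including "pcmu" as a stand-in for a typical PSTN gateway codec) are all assumptions. It only encodes the rule described here, namely that Speex is available from Flash Player 10 onward and that media transcoding can be skipped when both sides share a codec.

```python
def pick_codec(player_major_version, far_end_codecs):
    """Return (codec, needs_transcoding) for a Flash Player talking to a far end.

    far_end_codecs is the set of codec labels the non-Flash side can handle.
    """
    # Every Flash Player offers Nellymoser; version 10 and later add Speex,
    # which is preferred when it is available.
    player_codecs = ["nellymoser"]
    if player_major_version >= 10:
        player_codecs.insert(0, "speex")

    # Use the first codec both sides understand, so no transcoding is needed.
    for codec in player_codecs:
        if codec in far_end_codecs:
            return codec, False

    # No common codec: the player's preferred codec must be transcoded.
    return player_codecs[0], True


print(pick_codec(10, {"speex", "pcmu"}))  # shared codec, no transcoding
print(pick_codec(9, {"speex", "pcmu"}))   # older player, transcoding needed
```

The second call shows why the installed base matters: until users upgrade to Flash Player 10, the far end still has to transcode from Nellymoser.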

The question is now: What can be done about this? In my mind the ball is in Adobe’s court. Only if they open up the environment for independent development or design a higher quality solution can the true potential of Flash based VoIP be attained.


4 Responses to “High Quality Flash Based VoIP – Please Wait”

  1. High Quality Flash Based VoIP - Please Wait | VoIPtel Blog Says:

    [...] In this post I will talk about why the quality of these solutions is not as good as desired. There have been big hopes that the newly released Flash Player 10 and the corresponding Flash Media Server should address these issues. However, from what I have seen this far this is unfortunately not the case.  Read entire article here [...]

  2. Mobile VoIP Java Client | Global IP Solutions Says:

    [...] addressing this topic I will comment on what I wrote in a recent blog entry where I talked about how to create a download free VoIP [...]

  3. Rick Says:

    Will Microsoft build Voip into Silverlight as well? Regardless, I think GIPS should develop a Voip browser plug-in platform for application developers. You can assemble the technical components of which you already have full ownership. This is no different than why Flash was originally built.

  4. Sajid Says:

    Hello,

    I really agree with your article. After a couple of weeks of developing a new web phone, I ran into these issues and couldn't satisfy some experienced VoIP clients, especially with DTMF and delay, but at least it's a good start for Flash development.
    Most other VoIP clients interact directly with the VoIP server. In the Flash case, they have to connect to a media server using the RTMP protocol, which in turn connects to the VoIP server as a client. So for a webphone client we have a server-based client as well, which is also causing issues.
    I wish I could connect my Flash phone directly to the server …
