Linux Audio Needs an Overhaul

With the newest versions of Ubuntu and Fedora, PulseAudio has replaced ESD as the default userspace sound server. But what can it really bring to the table that hasn’t been tried already? The number of audio systems is staggering, and each describes itself in much the same terms as the others: ALSA, OSS, ESD, aRts, JACK, and GStreamer, to name a few. Then come the mix-and-match bridge packages that make working with audio a pain: alsaplayer-esd, libesd-alsa, alsa-oss, alsaplayer-jack, gstreamer-alsa, gstreamer-esd, and there are even more!

When does the madness stop?

I would love to know, but I’m sure it won’t be for a while. The confusion seems to stem from two things: there is no clear documentation unless you’re a developer, and the projects’ goals overlap. Let’s take a look at the description of ALSA, copied from the ALSA site at http://www.alsa-project.org:

The Advanced Linux Sound Architecture (ALSA) provides audio and MIDI functionality to the Linux operating system. ALSA has the following significant features: Efficient support for all types of audio interfaces, from consumer sound cards to professional multichannel audio interfaces. Fully modularized sound drivers. SMP and thread-safe design. User space library (alsa-lib) to simplify application programming and provide higher level functionality. Support for the older Open Sound System (OSS) API, providing binary compatibility for most OSS programs.

OK, that sounds well and good, but what about ESD? Here is its overview from http://www.tux.org/~ricdude/overview.html:

The Enlightened Sound Daemon mixes several audio streams for playback by a single audio device. You can also pre-load samples, and play them back without having to send all the data for the sound. Network transparency is also built in, so you can play sounds on one machine, and listen to them on another.

There is more confusion to be had if you dare to make music on a Linux box. Most DAWs (Digital Audio Workstations) require JACK, and JACK can use ALSA, PortAudio, CoreAudio, FreeBoB, FFADO, or OSS as a backend. So what do we have so far? A backend sound system, a secondary program that the DAW talks to, and then the DAW itself. Why should we need that secondary program? *Nix has a standard for its graphics, and it’s X.Org. Why can’t we decide on one audio server?

The biggest source of confusion is that so many audio programs offer their own APIs: the userspace ALSA library, aRts, ESD, and GStreamer, for example. On top of the audio server there is often a secondary layer as well: SDL and OpenAL for games, Open Sound System (OSS) for legacy applications, and JACK for pro-level, low-latency work. Applications such as Xine and MPlayer don’t even use these external libraries and do everything internally. When does it end? So far we have three or four pieces of software running just to play an MP3. An application like Rhythmbox relies on GStreamer to decode sound files from compressed form into raw audio. GStreamer in turn passes the audio down to ESD, and ESD delivers it to the ALSA hardware driver. How does this make any sense?

PulseAudio is a replacement for ESD. However, at this point most apps are so reliant on ESD that many things are broken. For example, it is impossible for me to select a line-in input if I wish to use Skype or Audacity. Instead it selects my laptop’s built-in mic, and even with the gain set low it’s still loud, recording mostly my fan.

The whole situation is just absurd. Standardization of the Linux audio system is vital. It has been summed up best by Adobe: http://blogs.adobe.com/penguin.swf/linuxaudio.png

2 Comments

  1. Edward
    Posted March 24, 2009 at 4:01 pm | Permalink

    GStreamer is not an audio framework. It’s got plugins *for* audio frameworks.

    Its only goal is to spare developers from learning all those different APIs (imagine if you start porting to Mac/Windows…), whichever one the distro prefers or has installed.

    Whether it’s ALSA, Jack, OSS, OSS4, ESD, PulseAudio, CoreAudio, DirectAudio, iPhone, Android, S60, your GStreamer application will work the same way.

  2. BigBrother
    Posted March 24, 2009 at 6:55 pm | Permalink

    @Edward: Thanks for the reply and the correction. It looks like GStreamer is trying to standardize, like I said. However, at this point I think Linux audio is too much of a pain. All my music will be done on a PC or Mac.
