KLANG: A New Linux Audio System For The Kernel


  • #51
    Let me say up front that I'm likely to be rather confrontational in my early responses here. You are someone who has played no visible role in Linux audio over the last decade, you've said a bunch of stuff about digital audio that isn't true, and I've also watched your talk on desktop linux from last year in which you said a bunch of misleading or outright false stuff about that area as well. This doesn't mean that you're a bad or stupid person, but for me personally, it creates a barrier. I apologize for that.

    Originally posted by datenwolf View Post
    On the DLL based position estimation of CoreAudio

    This is in tune with what I do in KLANG, only that I don't use a DLL but a PLL, which is used as a frequency scaler from buffer interval interrupts to sample position. I come from experimental physics, and whenever we have to synchronize high frequency delay counters to low frequency triggers, the PLL module gets put into the NIM rack.
    unfortunately, DLLs have a better track record. PLLs make sense when you are trying to track phase, which is not the case here. on the other hand, had you actually been in the linux audio community at any point in the last 10 years, you'd probably be familiar with: http://kokkinizita.linuxaudio.org/papers/usingdll.pdf

    Also our sample rates in physics are several orders of magnitude higher than those found in audio.
    which you should know is precisely why a DLL is more suited for this purpose. but anyway, it's all a bit academic, because it's far more important what constants you use with either a PLL or a DLL than precisely which variant it is.
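
    (for anyone following along: a minimal C sketch of the second-order DLL from that paper, following Adriaensen's formulation. the names and constants here are mine, for illustration only - this is not JACK or KLANG source.)

    Code:

        #include <math.h>

        /* second-order delay-locked loop, after Adriaensen's "Using a DLL
           to filter time": estimate the time of the next period interrupt
           from noisy interrupt arrival times */
        typedef struct {
            double b, c;    /* loop gains, derived from the chosen bandwidth */
            double e2;      /* filtered period estimate (seconds) */
            double t0, t1;  /* filtered times of current and next period */
        } dll_t;

        void dll_init(dll_t *d, double now, double period, double bw_hz)
        {
            double w = 2.0 * M_PI * bw_hz * period;
            d->b  = sqrt(2.0) * w;
            d->c  = w * w;
            d->e2 = period;
            d->t0 = now;
            d->t1 = now + period;
        }

        /* call once per period interrupt with the measured arrival time */
        void dll_update(dll_t *d, double now)
        {
            double e = now - d->t1;     /* loop error */
            d->t0  = d->t1;
            d->t1 += d->b * e + d->e2;  /* corrected time of next period */
            d->e2 += d->c * e;          /* corrected period estimate */
        }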

    In the kernel of course, because for one, sample format conversion is a no brainer,
    oh really? see, this is another example of why i feel that you know just enough about digital audio to be dangerous, but not enough to get it right. Something you think is a "no brainer" is actually the source of significant disagreement: http://blog.bjornroche.com/2009/12/i...out-there.html

    your comments on floating point vs. fixed point are about a decade out of date, and simply wrong. i'm thinking back to your post at ardour.org in which you said:

    The main reason to use floating point numbers in audio is for space efficiency when storing large dynamic range audio.
    this just isn't true. the main use of non-integer formats is to avoid clipping due to overflow during summing. it has nothing to do with "large dynamic range".
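
    (a trivial demonstration of the summing problem - toy code, not from any real codebase: two near-full-scale 16 bit samples wrap when summed as int16, while a float32 sum merely exceeds 1.0 and can be attenuated later.)

    Code:

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
            int16_t a = 30000, b = 30000;    /* two near-full-scale samples */
            int16_t bad = (int16_t)(a + b);  /* wraps to -5536: audible garbage */

            float fa = 30000.0f / 32768.0f;  /* the same samples, normalized */
            float sum = fa + fa;             /* ~1.83: over full scale, but
                                                recoverable by later attenuation */

            printf("int16 sum: %d, float sum: %f\n", bad, sum);
            return 0;
        }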

    A lot of people, especially those coming from PC programming, have a rather low regard for fixed point stuff, for some reason I can't fathom.
    the choice between floating point and fixed point comes down to implementation availability and speed. on general purpose computers, which overwhelmingly dominate the systems on which linux runs, floating point is nearly always faster, except on systems which don't provide it at all. there is no widely accepted fixed point math library (people differ on the best ways to implement certain aspects).


    But if you look around in the high performance, high precision DSP business, most signal processing is done in fixed point there. And that's for good reasons; avoiding loss of precision is one of them.
    it's very, very far from the most important reason: at any point in time, fixed point DSP has historically offered a better $/computation ratio than floating point hardware, and thus people who need to sell gear used fixed point. whenever they do so, they can rightly say "it's faster, and cheaper". the problem is that this state of affairs lasts approximately 10-15 months, at which point the then-available general purpose systems have floating point that is as fast or faster than what the customer bought. this is why all serious audio software for windows, os x, linux and other systems uses floating point for audio, and not fixed point.

    You can do floating point in the Linux kernel, it's just frowned upon. kernel_fpu_begin and kernel_fpu_end allow for FPU register save, restore. But KLANG doesn't need it, because it does everything fixed point.
    so you plan to replace the cost of some context switches with the cost of converting to and from floating point (which will continue to be the overwhelmingly dominant format in user space, unless you manage to figure out how to get cross-platform developers to switch too). you cannot be serious. remember, the hardware will still be integer, and in some cases floating point.

    i'll continue on with some points from your post at ardour.org ...

    KLANG's internal stream format gives at least 8 bits of foot- and headroom for all samples in it. Gain/attenuation is applied by factoring the multiplier into the closest radix-2 power and a remainder. Then a bitshift is applied, followed by multiplication with the remainder.
    goodbye SSE/SSE2/SSE3. sigh.
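
    (for readers wondering what that quoted scheme might look like: one plausible reading, in C. this is my guess at the described method, not KLANG source; it assumes samples small enough that the 64 bit multiply cannot overflow, and arithmetic right shifts.)

    Code:

        #include <stdint.h>
        #include <math.h>

        /* apply gain g by factoring g = r * 2^k with r in [0.5, 1), then
           multiplying by a fixed-point encoding of r and shifting by k.
           hypothetical sketch; rounding at the very top of r's range is
           ignored for brevity */
        int64_t apply_gain(int64_t sample, double g)
        {
            int k;
            double r = frexp(g, &k);                          /* g = r * 2^k */
            int32_t r_q31 = (int32_t)lrint(r * 2147483648.0); /* r in Q1.31 */
            int64_t scaled = (sample * r_q31) >> 31;          /* remainder multiply */
            return k >= 0 ? scaled << k : scaled >> -k;       /* radix-2 shift */
        }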

    (from me) This makes it very inefficient to implement inter-application audio, since everything has to make extra transfers across the kernel/user space boundary.
    Sorry, but this is just FUD. How do you think data is exchanged between user space processes? You cross the userspace-kernel boundary twice doing so. Any sort of IPC always involves system calls. Even if it goes over shared memory. Because shared memory is actually shared address space and all sorts of "kernel-hell" breaks loose, touching it.
    You cannot be serious. It's hard to take anyone seriously who would claim such a thing. You think that two processes both touching an mmap'ed region in user space causes some kind of kernel hell? this is just ridiculous - you make it appear that you don't know how shared memory works at all! the address spaces have the region mapped. when each process touches part of the region NOTHING HAPPENS - it's a memory access.
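
    (if anyone wants to convince themselves: a minimal POSIX shared memory demo, error handling stripped. the setup calls are syscalls; the access itself is not.)

    Code:

        #include <fcntl.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int main(void)
        {
            /* setup: these ARE system calls, done once */
            int fd = shm_open("/shm_demo", O_CREAT | O_RDWR, 0600);
            ftruncate(fd, 4096);
            float *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);

            /* from here on, touching the region is a plain memory access;
               a second process mapping "/shm_demo" sees the data without
               any syscall on either side (modulo the first-touch fault) */
            buf[0] = 0.5f;

            munmap(buf, 4096);
            close(fd);
            shm_unlink("/shm_demo");
            return 0;
        }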

    But then Lennart Poettering (re-)discovered a rather old method (this is one of the few cases where I think he did something good) for how you could get low latency even when operating with large buffer sizes. This might sound impossible, but only as long as you assume a filled buffer to be untouchable. If you accept that one may actually perform updates on an already submitted buffer, just slightly ahead of where it's currently read from (for example in a DMA transfer), you can get latency down even with larger buffers. Lennart implemented this in PA when they were approaching very long buffers (256ms and longer) on mobile devices, but still needed low latency for audio events.
    Lennart's design (which he called glitch-free) is more or less identical to CoreAudio's. but it doesn't do ANYTHING to deal with your concerns about power consumption under low latency conditions, and doesn't substantively alter my point about low latency audio being the reason why power consumption goes up. Whether it's the audio interface interrupt that wakes up the CPU and/or the user space process that wants to write every 8 samples, something is keeping the CPU busy. end of story.

    Moving on to API design. There are two basic models for streaming media on any system. One of them is often called "push" and the other is called "pull". In the push model, applications are free to write/read what data they want, when they want to (they may behave in an isochronous manner, if they are smart, or they might read/write in semi-random bursts: either way, they get to make the choices about when and how much data to receive from/deliver to an endpoint). In the pull model, it is the endpoint (typically an actual audio interface) that determines how much data is to be processed, and when it needs to be done.

    Now, it is entirely possible to write code that uses the push model and actually behaves, over time, very similarly to what happens under a pull model. A few people do this. But it's not a coincidence that every serious audio API for low latency work (ASIO, CoreAudio, WaveRT, JACK and more) uses a pull model: the system decides how much data needs processing and when, and the apps just do what they are told.

    Why does this matter? It matters because although it's very easy to add a push API on top of a pull API (Just Add Buffering (TM): you're done), it is more or less impossible to add a pull API on top of a push API and simultaneously offer low latency. Several APIs on different platforms over the years have tried this, and they've all been absolutely useless (I'm looking at you, OpenAL).
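
    (to make "Just Add Buffering" concrete: a bare-bones sketch of a push-style writer layered over JACK's pull-style process callback. the ringbuffer calls are JACK's real API; the glue and names are illustrative. note that the ringbuffer is exactly where the extra latency comes from.)

    Code:

        #include <string.h>
        #include <jack/jack.h>
        #include <jack/ringbuffer.h>

        typedef struct {
            jack_port_t       *out;
            jack_ringbuffer_t *rb;
        } ctx_t;

        /* the app thread "pushes" whenever it feels like it */
        void push_audio(ctx_t *c, const float *data, jack_nframes_t n)
        {
            jack_ringbuffer_write(c->rb, (const char *)data, n * sizeof(float));
        }

        /* JACK "pulls" a fixed amount, at a time of its choosing */
        int process(jack_nframes_t nframes, void *arg)
        {
            ctx_t *c = arg;
            float *out = jack_port_get_buffer(c->out, nframes);
            size_t want = nframes * sizeof(float);

            if (jack_ringbuffer_read_space(c->rb) >= want)
                jack_ringbuffer_read(c->rb, (char *)out, want);
            else
                memset(out, 0, want);  /* underrun: the pusher fell behind */
            return 0;
        }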

    So, coming back to your beloved unix system calls: open/read/write/ioctl do not permit the creation of a pull API without increasing latency by at least one "period" (to use ALSA terminology). but then, moving on, they encourage developers to treat audio (or video) i/o as if it's file i/o - a process without time deadlines. read when you want, write when you want. now as i said, the developer doesn't have to do it that way - they can block sensibly and sort of model a poll/select call - but here's the reality: the overwhelming majority do not. most developers don't have a clue about realtime programming, and they need serious nudging to even go halfway in the right direction. this is no fault of their own - they shouldn't really have to know too much. but the unix system call API has absolutely no notion of timeliness, and it does nothing to encourage people to design their code around a pull model.

    That's why JACK (and CoreAudio, and ASIO, and WaveRT) do not allow developers to write in a push style unless they use some additional library that does buffering for them (there are at least two of these for JACK). You will note that one of the entire reasons why CoreAudio has been so successful is that it forces EVERY application on OS X, even a pathetic little email notification applet that wants to make a beep, to use the same pull-based, threaded audio API that Logic, Digital Performer and Cubase/Nuendo have to use. An API that looks absolutely nothing remotely like open/read/write/ioctl.

    But stepping back again, here's the meta problem with your entire KLANG proposal/idea. I've been preaching this stuff online and at conferences for nearly a decade now. I've discussed this stuff many times with many people on the linux audio developers mailing list, on the JACK mailing list, and back in the day, even on the ALSA development mailing list. I've been on forums explaining to people who know even less about digital audio than you how things work, and why things have converged on common solutions on all major OS platforms, and what the benefits and disadvantages of those common solutions are. And then one morning, someone on IRC points at your page, which contains a bunch of handwaving half-truths that appear to have formed completely in a vacuum, without any interaction with any of the people who really know about and understand digital audio (on linux, or any other platform for that matter).

    You could have shown up on LAD or the JACK ML, or a half dozen other places, and proposed some new ideas. People would have argued with you, and you would have changed their minds or they would have changed yours. At the end of the day, we'd either have an idea for a better new "grand design", or we'd be back to wondering how/if/when/why to take the next step forward.

    Instead, you've designed KLANG in a bubble, misunderstanding so many of the small details that are ultimately why it tends to take years to get one of these systems right, and you've handed the morons on reddit and slashdot who just want to froth at the mouth about how bad linux audio is another shiny new diamond. This is not helpful. Schemes that call for moving policy into the kernel, that require rewriting every single audio device driver, that replace cheap context switches with (relatively costly) format conversions (and I didn't even get into how the number of context switches is typically only 2 greater than without JACK anyway), that remove interposition by reverting to the unix system call API that was designed before realtime streaming was even a twinkle in dennis ritchie's eye ... i'm sorry, but this is so far from helpful that it's just offensive.



    • #52
      Originally posted by unknown2 View Post
      i don't get it.
      indeed.

      the issue isn't really about floating point or not - though this is an important issue. it's really about whether or not audio "services" get provided all in the kernel, or whether the device driver is in the kernel and the muxing of streams is in user space.

      windows no longer mixes in the kernel, and when it did, latency couldn't go below 30msec and was frequently around 100msec.

      OS X mixes in user space (certainly on Lion and Mountain Lion, and possibly before).

      the only system that mixes in the kernel these days is OSS4.

      i don't think that linux developers are so "superior" to be able to achieve the goal in userspace.
      well, except that we did (it's called JACK, and it can go as low latency as your hardware allows). and more: we're superior enough to provide OS X with the only inter-application audio routing system that anyone takes seriously there, and to offer the same general platform for Windows, BSD and Solaris too. that doesn't mean that we're geniuses, or even that we know half of what we should know, but it does mean that listening to the endless pontificating about this stuff by people who really don't understand the technical details gets kind of tiring. i'd rather be writing code.



      • #53
        Oh, one more thing to follow on from my very long post that is now in moderation:

        Originally posted by datenwolf View Post
        You can do floating point in the Linux kernel, it's just frowned upon. kernel_fpu_begin and kernel_fpu_end allow for FPU register save, restore. But KLANG doesn't need it, because it does everything fixed point.
        You cannot sleep while doing this. That can be quite a restriction on kernel code design. In addition, despite the existence of these calls within the kernel, the kernel development community remains antagonistic to adding more floating point to the kernel.
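
        (for the curious, the bracket looks something like this - a hypothetical x86 sketch; the header location varies by kernel version, and a real module would also need compiler flags that allow FP instructions in kernel code at all.)

        Code:

            #include <asm/fpu/api.h>  /* kernel_fpu_begin()/kernel_fpu_end() */

            /* hypothetical in-kernel float mixing: preemption is disabled
               between the two calls, so this section must be short and
               must not sleep */
            static void mix_float(float *dst, const float *src, int n)
            {
                int i;

                kernel_fpu_begin();   /* save user FPU/SIMD state */
                for (i = 0; i < n; i++)
                    dst[i] += src[i];
                kernel_fpu_end();     /* restore state, re-enable preemption */
            }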



        • #54
          Originally posted by unknown2 View Post
          i don't get it.
          good quality sound mixing requires floating point operation, but this is not allowed in kernel currently.
          Well, technically it is allowed, and the RAID checksumming code actually makes use of the FPU registers, which are what all the fuss is about.

          Originally posted by unknown2 View Post
          i don't get it.
          How can fixed-point operation achieve good quality?
          An often-found misconception is that floating point equals precision. What floating point gives you is a fixed number of significant (binary) digits, the mantissa, and a value scale defined by the exponent. In a floating point number you never get more precision than the mantissa can hold. The mantissa is usually normalized, which means its value is interpreted as being in the range 0…1 with the most significant bit shifted to the upper boundary. In addition you have a sign bit, which gives you an effective mantissa value range of -1…1, quantized by the integer range given by the mantissa bits. As long as the effective value of a floating point number stays within the range -1…1, the scaling exponent is <=0, and what you actually have is the equivalent of an integer with the same number of bits as the mantissa (divided by the integer value range).

          Of course with small absolute values, and hence a negative exponent, the number resolution increases. But you don't benefit from this, because you're dealing with a quantized signal, and eventually the signal goes into a DAC, which by principle cannot resolve less than one bit. DACs are inherently fixed point; that's the nature of their circuitry.

          So why use floating point at all, then? Say the absolute value is larger than what would fit into the integer range: an integer would either wrap or saturate (depending on the operations used), while a floating point number simply uses an exponent >0. But this means that you must now drop bits of your mantissa. If the exponent is 1, your mantissa resolution has been halved; if the exponent is 2, it's only 1/4 of the mantissa resolution, and so on.
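
          To make this tangible, a toy example (illustrative only): 2^24 + 1 cannot be represented in a 24 bit mantissa, so single precision silently drops the low bit.

          Code:

              #include <stdio.h>

              int main(void)
              {
                  float a = 16777216.0f;  /* 2^24 */
                  float b = a + 1.0f;     /* rounds back to 2^24 */
                  printf("%s\n", a == b ? "equal: precision lost" : "distinct");
                  return 0;
              }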

          In audio applications float is used because it allows you to mix several signals without fear of saturating or wrapping them. However, this means a loss in precision. It also allows you to apply a wide range of gain without thinking too much about it.

          But let's look at this from an engineering standpoint. An IEEE 754 32 bit floating point number has 23+1 bits of mantissa. That's the precision you get. Add to this 8 bits of exponent, allowing the value range to expand up to 2^127, which equates to some 20*log10(2^127) ≈ 765 dB of additional dynamic range (only usable if each signal were a square wave). You'll never find this situation in audio.

          Remember why you need that additional headroom? Because when you mix together several channels they may saturate. By how much? Well, let's assume every channel makes use of its full scale; then the total value range required is that of n channels. But every bit we add doubles the value range.

          Let's say we use a 32 bit integer value, but our audio signal is normalized to 24 bits. Then we have 2^8 = 256 times the value range to work with. Which means we can mix up to 256 channels of 24 bit audio without saturating and without loss of precision, because we don't subquantize our original value range. Eventually we'll have to attenuate this, but until then we don't lose information. And now think about it: you'll hardly find 256 audio sources playing at the same time. So in practice we can mix even more stuff without problems.
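
          In code, the above amounts to nothing more than this (illustrative sketch, not KLANG source):

          Code:

              #include <stdint.h>

              /* mix up to 256 channels of 24 bit samples (sign-extended
                 into int32) in a 32 bit accumulator: 8 bits of headroom,
                 no overflow, no precision lost before the final
                 attenuation stage */
              int32_t mix(const int32_t *ch, int nch /* <= 256 */)
              {
                  int32_t acc = 0;
                  int i;
                  for (i = 0; i < nch; i++)
                      acc += ch[i];
                  return acc;  /* attenuate/saturate later, before the DAC */
              }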

          And something similar applies to actually doing calculations on the data. What kind of calculations are these? Actually only one, namely resampling, which is implemented as a fast finite impulse response (FIR) filter. Wait, how can a filter resample, you ask? Well, I leave that as an exercise for the reader. Just a hint: if you reduce the sampling frequency, what kind of cutoff must you apply beforehand to avoid high frequency aliasing? A fixed-point FIR kernel is equally unspectacular; see the toy example below.
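
          A toy 4-tap FIR step with Q1.15 coefficients (made-up values, purely illustrative):

          Code:

              #include <stdint.h>

              #define NTAPS 4
              /* Q1.15 low-pass coefficients summing to 32768 (= 1.0) */
              static const int16_t h[NTAPS] = { 4096, 12288, 12288, 4096 };

              /* x points at the NTAPS most recent input samples */
              int32_t fir_step(const int16_t *x)
              {
                  int32_t acc = 0;
                  int i;
                  for (i = 0; i < NTAPS; i++)
                      acc += (int32_t)x[i] * h[i];  /* 16x16 -> 32 bit MAC */
                  return acc >> 15;                 /* back to Q15 scale */
              }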

          This is probably the number one key lesson taught in every numerics lecture (if not, it's a bad lecture): using floating point only makes sense if your numbers vary over a large range and are not of similar scale. In particular, never subtract floating point numbers of similar absolute value. If the numbers are all of similar value, or in a narrow range, and precision is paramount, never use floating point; use fixed point instead. If you are a pure computer science student you might not have learnt this, because computer science doesn't deal with such practical problems. But if you're in engineering or physics, then you must know how to work with digital numbers.

          So what does this mean for KLANG? KLANG can be compiled for either 40 bits or 48 bits per sample processing, with a footroom of 8 bits (this allows for reversible attenuation of up to 48 dB, which is plenty). For the typical desktop audio user the 40 bit configuration is already total overkill: for his el-cheapo 16 bit audio hardware this gives a headroom of 16 bits, and due to the footroom of 8 bits, calculations can be done with a precision of 24 bits. Even with 24 bit audio samples the precision is sufficient for most tasks. If desired, 48 bit processing can be enabled, but this is more snake oil than actually audible. But since KLANG is also meant to pass audio between applications, that additional precision might be required.



          • #55
            After hearing you talk about it, this seems to at least be an interesting endeavor. I do have my own questions about where you're trying to take this though.

            Your real competitors seem to be ALSA and OSS, not PulseAudio and JACK. Your goal is to play audio and have a sane, low-latency framework for doing so, with high-quality mixing done in the kernel. Especially with an API that is a superset of the OSS API, there should be both JACK and PulseAudio backends easily adapted from the OSS backends, perhaps taking advantage of new API calls. A lot of importance is still attributed to these userspace frameworks because they do a lot more than simply relay audio to sound cards, like routing between applications or the network and handling bluetooth. The big change that would need to be made in JACK and PulseAudio to properly leverage KLANG would be to make all mixing and resampling happen within KLANG rather than done by themselves in userspace. That brings up the question of how you're actually handling streams.

            - Would you support per-stream volume controls? Doing so would eat into bits that you would normally be able to associate with adding streams together when mixing, though it would be more accurate and consistent than forcing applications to do it.

            - Would you support flat volumes like PulseAudio, keeping with that example of dealing with application + system volumes well?

            - Would you be able to increase latency? Low latency is something that fits much better on a desktop, while using PulseAudio and forcing higher latencies saves power due to fewer wakeups compared to using ALSA directly, which is something that makes people with power-usage concerns happy and would position KLANG as a general audio system rather than a specialized one.

            Then I'd also want to know how high up you want to go.

            - Would you expect jack detection as a future goal if all goes well?

            - Will there be upmixing and downmixing? For example, stereo to 5.1 or 5.1 to 4.0.

            I'd say this has a long road ahead if it is planning to replace something like PulseAudio or JACK because of all the conveniences they provide.



            • #56
              Originally posted by Ferdinand View Post
              Shouldn't a Pentium 133 MHz be able to play an MP3 fluently?
              It does. It also plays an Ogg Vorbis file fluently. I know because I was doing that on a P133 :-)



              • #57
                I'm with BwackNinja here. This sounds like an interesting concept, and it would be good to get a proof of concept ready, but I don't see this replacing the currently used audio systems any time soon.



                • #58
                  Originally posted by GreatEmerald View Post
                  I'm with BwackNinja here. This sounds like an interesting concept, and it would be good to get a proof of concept ready, but I don't see this replacing the currently used audio systems any time soon.
                  If the problems can be solved in ALSA (with a bit of work), which is at least what I understand from PaulDavis' answer, then why do we need a completely new implementation?



                  • #59
                    Originally posted by datenwolf View Post
                    Well, technically it is allowed, and the RAID checksumming code actually makes use of the FPU registers, which are what all the fuss is about.

                    [full quote of post #54 trimmed]
                    Is this post in the Steam Linux thread, about a new audio project, from a developer of your group?



                    • #60
                      Originally posted by datenwolf View Post
                      Well, technically it is allowed, and the RAID checksumming code actually makes use of the FPU registers, which are what all the fuss is about.

                      [full quote of post #54 trimmed]
                      The theory seems OK for mixing sound, but can it be applied to resampling? If upsampling from 44.1 kHz to 48 kHz, is it feasible in fixed point?

