Written by 4:21 am Digital

Fumbling Towards DSP #3

Andy Schaub continues his exploration of digital signal processing…

Today’s topic is compression: What is it, why do we need it, and when does it do more harm than good? In the context of DSP, compression means reducing the size of the music file so that (a) it takes up less storage space and (b) it can be copied over a LAN or downloaded from the Internet more quickly. There are many kinds of compression, but they tend to fall into two categories: lossy and lossless. Lossy compression, as used in MP3 and AAC files, involves throwing out some of the musical information to make the files smaller, on the assumption that you won’t hear the difference anyway, which is rarely actually true. Lossless compression takes up more space than lossy compression, generally about 1/2 of the space required by an uncompressed file, but, given sufficient processing power, it allows you to recreate the uncompressed file perfectly. It just takes a little extra time and, yes, it is possible for errors to creep into the process.
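To make "recreate the uncompressed file perfectly" concrete, here is a minimal Python sketch. It assumes the standard zlib module as a stand-in for a real audio codec such as FLAC (which works differently inside), and the ramp signal is made up for illustration; real music will not shrink nearly as much.

```python
import zlib

# Hypothetical stand-in for uncompressed audio: one second of 16-bit samples
# forming a simple repeating ramp, packed as little-endian bytes.
samples = [(i % 256) * 64 for i in range(44100)]
raw = b"".join(s.to_bytes(2, "little") for s in samples)

packed = zlib.compress(raw)          # general-purpose lossless compression
restored = zlib.decompress(packed)   # rebuild the original bytes

print(len(raw), len(packed))         # compressed size is much smaller here
print(restored == raw)               # True: every byte comes back unchanged
```

The point is the final comparison: with lossless compression the restored data is bit-for-bit identical to what went in, which is exactly what a lossy format cannot promise.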

Lossy compression was created when storage cost a whole lot more and network bandwidth, or speed, was much slower than now. It’s really no longer necessary and it harms the fidelity of the music, but licensing agreements with record labels often only cover MP3 or AAC files, so supplying full CD resolution files from, say, the iTunes Store could be very costly. A notable exception is TIDAL Hi-Fi, where Jay-Z negotiated prices for both MP3 and FLAC files, a form of lossless compression, at the same time. So, for an extra $10.00 per month, TIDAL Hi-Fi gives you the ability to stream their whole well-curated music library with more or less the same fidelity you’d get from a CD.

You can hear the difference and, sooner or later, people will demand that all the music they buy or stream have at least 16/44.1 resolution (16 bits per sample at a 44.1 kHz sample rate, the CD standard) with no loss of “psychoacoustically irrelevant” information. Until that time, companies that supply music downloads and streaming are unlikely to change their business models, because greater quality usually costs more, although it doesn’t have to, and renegotiating all those licenses with music labels to get 16/44.1, or maybe even 24/48, files as opposed to MP3s or AACs seems unlikely to happen for free.

Politics and bean counting aside, some discussion of different techniques used for compression within the context of DSP is called for. In my own work as a software engineer, I’ve used three techniques for lossless compression, which can be combined: 

Repeated value compression (commonly known as run-length encoding): In cases where the same number frequently repeats itself over and over again in a contiguous sequence, you can create a sort of envelope where two numbers represent a much longer run. The first number is the value you want to repeat, like 128, and the second number is how many times you want it to repeat, like 32,768. Obviously, this saves a lot of space.
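The value-and-count envelope described above can be sketched in a few lines of Python. The function names rle_encode and rle_decode are made up for this illustration, and a real format would also have to decide how to pack the pairs into bytes.

```python
def rle_encode(values):
    """Collapse each run of equal values into a [value, count] pair."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # same value as before: extend the run
        else:
            runs.append([v, 1])       # new value: start a new run
    return runs

def rle_decode(runs):
    """Expand [value, count] pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

data = [128] * 32768 + [7, 7, 9]
runs = rle_encode(data)
print(runs)                           # [[128, 32768], [7, 2], [9, 1]]
print(rle_decode(runs) == data)       # True
```

Note the flip side: for data with few repeats, every single value still costs a pair, so this technique can make a file bigger rather than smaller.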

Another technique, called Huffman coding, involves symbolically representing the most frequently occurring numbers with the fewest bits. Suppose you have four numbers that occur frequently: 24, 36, 48, and 52. Rather than using a whole byte (8 bits) or word (16 bits) to store each of those numbers, you can use 2 bits corresponding to the values 0, 1, 2, and 3. Then you have a table that says 0 actually equals 24, 1 actually equals 36, and so forth. Strictly speaking, that fixed 2-bit table is a simplification: true Huffman coding assigns codes of varying length, so the most common number might get a 1-bit code while rarer ones get 2 or 3 bits. The table can be static or dynamic (i.e., the values in the table can change from record to record or frame to frame), and the codes get longer if you have lots of distinct numbers that normally take 8-16 bits, or more, each.
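Here is a small Python sketch of building a Huffman table for the four example numbers. The frequencies (24 occurring most often, 48 and 52 least) are invented for the illustration, and huffman_codes is a hypothetical helper name, not part of any library.

```python
import heapq
from collections import Counter

def huffman_codes(values):
    """Return {value: bit string}; more frequent values get shorter codes."""
    counts = Counter(values)
    if len(counts) == 1:                      # degenerate case: one symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie_breaker, {value: code_so_far})
    heap = [(n, i, {v: ""}) for i, (v, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)        # two least frequent groups
        n2, _, b = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in a.items()}
        merged.update({v: "1" + c for v, c in b.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

# Assumed frequencies: 24 is the most common value, 48 and 52 the rarest.
data = [24] * 8 + [36] * 4 + [48] * 2 + [52] * 2
codes = huffman_codes(data)
encoded = "".join(codes[v] for v in data)
print(codes)
print(len(encoded), "bits, versus", 8 * len(data), "bits at one byte each")
```

With these made-up counts the 16 values fit in 28 bits, slightly better than the 32 bits a fixed 2-bit table would need and far better than the 128 bits of one byte per value.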

One of the more sophisticated techniques for achieving lossless compression is the LZW algorithm, which was patented for many years (the patents have since expired). As with Huffman coding, you symbolically identify commonly occurring sequences of numbers, or sequences of sequences, such that very long repeating patterns can be stored in just a few bits (or just a few bytes). The algorithm builds its table of patterns recursively, in the sense that each new entry is an earlier entry plus one more number, which lets you fold patterns together over and over again to make the required storage smaller and smaller; but this folding can be complicated and CPU intensive, so using it in moderation is encouraged.
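A compact Python sketch of the textbook form of LZW follows. It is an illustration under simplifying assumptions: the table grows without limit, and the output is a plain list of integer codes rather than the packed bit stream a real file format would use.

```python
def lzw_compress(data):
    """Turn bytes into a list of integer codes using a growing table."""
    table = {bytes([i]): i for i in range(256)}   # start with single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                   # keep extending the match
        else:
            codes.append(table[current])          # emit longest known pattern
            table[candidate] = len(table)         # known pattern + one byte
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the bytes, recreating the same table along the way."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):                  # code defined by this step
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

data = b"ABABABABABABABAB" * 4
codes = lzw_compress(data)
print(len(data), "bytes became", len(codes), "codes")
print(lzw_decompress(codes) == data)              # True
```

Notice that the table itself is never stored: the decompressor rebuilds it from the codes as it goes, which is a large part of what makes the scheme attractive.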

Once you cross into lossy compression, you actually start to throw away information that’s thought to be musically irrelevant. Part of creating an MP3 file involves reducing the upper frequency limit for a musical waveform to about 15 kHz if the amplitude of the signal falls below a certain value (the exact cutoff depends on the encoder and bit rate). So some music goes all the way up to about 20 kHz, but a lot of it just gets cut off at 15 kHz with no attempt to smooth over or round the edges. The lower amplitude of these signals implies that they’re overtones or harmonics, which are sometimes thought to be irrelevant in the perception of music. However, even a 32 Hz bass note has overtones going as high as 20 kHz or more, and one’s perception of bass, both quantity and quality, is influenced by those very high frequency overtones. So cutting off the overtones at exactly 15 kHz can result in a subjective loss of bass, and no amount of bass boost can make up for the lost information. Plus, throwing away those overtones can alter one’s perception of the spatial relationships between sources, because you lose much of the reflected high-frequency information that allows your brain to perceive where those sources of music sit next to one another on a hypothetical sound stage.
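To show what a hard cutoff with no smoothing does, here is a rough Python sketch of a "brick wall" low-pass filter. It is not how an MP3 encoder is actually implemented; it uses a slow, naive Fourier transform on a made-up test signal (a 1 kHz tone plus a quiet 18 kHz overtone) purely to illustrate the idea. The function name and the test signal are assumptions for this example.

```python
import cmath
import math

def brick_wall_lowpass(samples, sample_rate_hz, cutoff_hz):
    """Zero every frequency component above cutoff_hz, then rebuild the signal."""
    n = len(samples)
    # Naive forward Fourier transform (fine for a few hundred samples).
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) * sample_rate_hz / n   # frequency of bin k
        if freq > cutoff_hz:
            spectrum[k] = 0                         # hard cut, no smoothing
    # Inverse transform back to samples.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

rate = 44100
n = 441                                             # 10 ms of audio
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(n)]
overtone = [0.1 * math.sin(2 * math.pi * 18000 * t / rate) for t in range(n)]
signal = [a + b for a, b in zip(tone, overtone)]

filtered = brick_wall_lowpass(signal, rate, 15000)
error = max(abs(f - a) for f, a in zip(filtered, tone))
print(error < 1e-6)                                 # True: the overtone is gone
```

After filtering, only the 1 kHz tone is left. The 18 kHz component has been removed entirely, and nothing in the filtered samples records that it was ever there.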

To summarize, compression is not a bad thing. It’s actually necessary given the current limits of storage space and download speeds at affordable prices. However, lossy compression is getting to be outdated. The problem is that you have generations of music lovers who’ve never heard anything besides an MP3 file and, put simply, don’t know what they’re missing. When someone goes into a stereo store with all of their music on a mobile phone using lossy compression, they’re starting with a major disadvantage: no expensive DAC and amplifier will make it sound all that much better, since you can’t squeeze blood from a rock, metaphorically speaking. You can’t put back what’s not there, at least not with 100% accuracy, and not given what most of the hardware and software on the market for playing these files does today.
