A Mind for Madness

Musings on art, philosophy, mathematics, and physics


Examining Pro’s Prose Part 3

Today, let’s turn to the master himself, John Cheever. As I said in the first post in the series, many say this modern “MFA” set of rules teaches people to write like Cheever. What might be surprising is just how often he doesn’t follow them. Today’s rule is the roughest of them all.

Rule 3: Avoid narrative summary.

More accurately this should be: minimize narrative summary. Narrative summary means you tell the reader something happened rather than let the reader experience it. Often this falls under “show don’t tell.”

Let’s do an example. Bob went to the store. That is narrative summary. We hit our first difficulty, because summary is not a binary concept. We could expand it to a paragraph. As Bob approached his car, he couldn’t help but be reminded of how desperately he needed a new one. At least the beat up, green ’82 Oldsmobile would get him to the store. The store sat only two miles away, so his frustration built at each successive red light. Kate’s cello lesson ended in a half hour, and he wouldn’t get to hear her play if he was late to pick her up.

There’s still a bunch of summary in there. This trip to the store could easily blow up into a 3000 word short story if you take the advice of Rule 3 too seriously. Despite the shining silver of the handle glinting in the sunlight, it never occurred to Bob to exercise some caution. His finger seared as he touched the handle of his ’82 Oldsmobile, and he pulled away with a quick, jerking motion before any serious damage could be done. His brow furrowed in annoyance as his superstition kicked in. I bet I’ll hit all the red lights with this luck.

To understand why people talk about this rule, it is important to first understand that this whole scale of summary exists, and summary can never be fully removed (nor would you want it to). The point of the rule is that the finer the detail of your description, the more you will pull a reader in. Narrative summary is a problem if it takes the reader out of the moment.

Summary is how we remember books, but the great authors only give an illusion of continuity by creating a sequence of scenes where the time between them can easily be inferred by the reader. New writers often don’t realize this and try to fill the gaps in explicitly, because they think that’s what their favorite authors did.

One of the most common examples of breaking the rule (in a bad way) occurs with backstory. This may be a flashback, or may be stray information. Either way, it is almost always better to turn it into a full-fledged scene that is not summarized or fit it in some more subtle way.

Consider this example. If the main character’s mother died when he was ten, you could say “Bob’s mother died when he was ten.” But if this is relevant information to the story, it will be apparent on its own through conversations or thoughts or interactions or whatever. There is no need to summarize it explicitly like that.

This is what the rule means. Avoid the summary. If it isn’t important enough for the reader to figure out, then it isn’t important enough to summarize. If it is important, then you are repeating the information needlessly and pulling the reader out of the moment.

All this being said, summary provides a moment of respite for the reader. It can be judiciously used to slow down or speed up the pacing. If you constantly describe every little detail of every minor, tiny thing that happens, you get a very intense experience that overwhelms the reader. There must be balance.

This is one of those “damned if you do, damned if you don’t” rules. If you follow the rule too literally, you mess up the pacing of your writing. Even when you are careful, if you break the rule (with good purpose!), critics/editors/reviewers have an easy target: look at this amateur, doesn’t even know to show and not tell.

Are you screwed on this rule? Kind of. The only thing you can do is read the prose of people you admire and really think about how they get their balance right. I imagine this is something the greatest writers struggle with even after decades of success. Maybe I’m wrong.

Let’s see how Cheever handles it in The Wapshot Chronicle. I’ll skip the first chapter, which is basically a creative way to get some quick information on each of the main characters. The literary world seems divided on this. Half of writers think you can break the rule in the first few pages to get the reader grounded somewhere. The other half wouldn’t be caught dead doing this. The second chapter is a full family history. This, again, used to be more common back in the ’50s when the novel came out. The third chapter gets to the first real scene (we’re only talking about 15 pages in).

Mr. Pincher’s horse galloped along Hill Street for about a hundred yards–maybe two–and then, her wind gone, she fell into a heavy-footed trot. Fatty Titus followed the float in his car, planning to rescue the charter members of the Woman’s Club, but when he reached them the picture was so tranquil–it looked like a hayride–that he backed his car around and returned to the village to see the rest of the parade. The danger had passed for everyone but Mr. Pincher’s mare. God knows what strains she had put on her heart and her lungs–even her will to live. Her name was Lady, she chewed tobacco and she was worth more to Mr. Pincher than Mrs. Wapshot and all her friends. He loved her sweet nature and admired her perseverance, and the indignity of having a firecracker exploded under her rump made him sore with anger. What was the world coming to? His heart seemed to go out to the old mare and his tender sentiments to spread over her broad back like a blanket.

“I ain’t going to stop her now,” Mr. Pincher said. “She’s had a lot more to put up with than the rest of you. She wants to get home now and I ain’t going to stop her.”

Mrs. Wapshot and her friends resigned themselves to the news of their captivity. After all, none of them had been hurt…

The summary all but disappears. The only place I see it sneak in is when Cheever outright tells the reader “…she was worth more to Mr. Pincher than Mrs. Wapshot and all her friends…” According to Rule 3 (and even Rule 1), this is an egregious error, because we get shown this fact soon after when he prioritizes getting his horse home over letting the group of women off the float. Not only is this needless repetition, but the summary gets shown a few sentences later.

One could argue that the only natural way to guide the reader’s attention properly is to put that summary in. It flows naturally as language and lets the conversation veer in a different direction. On the other hand, it feels like you haven’t tried hard enough if you can’t think of another way to do it. Cutting much of the rest of that first paragraph and continuing with the scene takes absolutely nothing from the novel and keeps the scene moving without repetition or narrative summary:

The danger had passed for everyone but Mr. Pincher’s mare. God knows what strains she had put on her heart and her lungs–even her will to live. Her name was Lady. Mr. Pincher loved her sweet nature and admired her perseverance, and the indignity of having a firecracker exploded under her rump made him sore with anger. What was the world coming to? His heart seemed to go out to the old mare and his tender sentiments to spread over her broad back like a blanket.

You might find this to be extremely nit-picky, but this is why it sometimes takes years to edit a book (and other, less meticulous, popular writers take less time). These changes are so minor that they don’t seem worth it. It’s true that a typical reader can read something carefully polished in this way and something less polished yet good enough to get published and not be able to articulate any of these differences. But over the course of a novel, there will be thousands of these tiny differences, and they add up to a very different reading experience.

The Wapshot Chronicle was John Cheever’s first novel, and my guess is it would be harder to find these subtle repetitions and unnecessary lines of summary in later ones.


Examining Pro’s Prose Part 2

Today I’ll pick a passage from Ian McEwan’s Black Dogs to talk about. I really wanted to do something from McEwan, because he is considered the quintessential example of clean, clear writing for our time. I’ve read many of his books (including the more famous ones), but I only own two of the lesser known ones, which is why I’ve chosen this book. Let’s get to our rule.

Rule 2: Simple past tense is better than past progressive (or even past perfect).

This builds upon Rule 1 (Avoid repetition). Recall that past progressive tense takes the form “was [verb]ing.” If you overuse this tense, you will naturally repeat the word “was” in every sentence. This gets monotonous and tedious to read. But there is another, possibly more important reason to minimize past progressive. It isn’t quite passive voice, but it makes all actions passive.

Mary was throwing the ball. And? It feels like there must be something else happening at the same time that is more important. Mary threw the ball. That shows the action and brings it into focus. This rule might be the loosest of all that we examine, because there are so many instances where avoiding a certain tense makes the prose awkward.

You must do what is necessary, but my suggestion is to edit every “was” out of the prose first and only put it back in if absolutely necessary. If you keep making excuses because you are too lazy to figure out how to edit it out, you can trick yourself into thinking the tense was necessary when it actually wasn’t. Don’t be lazy.

Now on to the passage:

I was the one who was startled. She was watching me, slightly amused, as I began to apologize for the interruption.

She said, ‘…’ She did not have the strength to move against my disbelief. The afternoon was at an end.

I was trying again to apologize for my rudeness, and she spoke over me. Her tone was light enough, but it could well have been that she was offended.

(I edited out the one sentence that is spoken with “…” because speech follows a different set of rules than prose.)

Notice that about half of the uses of “was” come from the past progressive tense and the other half are simple past tense where “was” is the verb. This makes it a bit trickier to analyze. Also note how grating the repetition of that word becomes by the end. In 6 sentences, the word “was” appears 7 times. The rule exists for exactly this reason.

I’ll first point out that both places where he uses past progressive seem necessary, because it is chained to another action happening simultaneously. But honestly, it is hard to imagine why converting it to simple past tense doesn’t just make it better: She watched me, slightly amused, as I began to apologize for the interruption. Or: I tried again to apologize for my rudeness, and she spoke over me.

Let’s say, for the sake of argument, the past progressive is truly important to emphasize the simultaneity. Then why keep “was” as the main verb in the simple past tense sentences? The middle part makes sense in terms of Golden Rule 1. He uses longer, complex sentences on either side, so he wants two simple declarative statements in the middle for contrast in flow.

For example, something like “Because the afternoon neared its end, she did not have the strength to move against my disbelief” flows too much like the other sentences for contrast and alters the meaning slightly. On the other hand, that one change breaks up the “was” enough that maybe all the other ones can stay.

Another simple fix is to change the last part: but it could well have been that I had offended her. The only downside I see to this is so subtle that it doesn’t outweigh the overwhelming repetition in my mind. One could argue that this last use of “was” keeps the style consistent because of the previous ones. If this is the case, change the earlier ones as well!

I hate to say this, but I think this is a passage that slipped through. When you write a book, there is too much to edit for editors and writers to catch everything. Even the best of the best have places that can be tweaked. My guess is that Black Dogs is around 75,000 words. There are probably 50 or so of these rules. You do the math.

In any case, I’ll write my change here in full, so you can decide for yourself. Maybe you like the original better.


I was the one who was startled. She watched me, slightly amused, as I began to apologize for the interruption.

She said, ‘…’ Because the afternoon neared its end, she did not have the strength to move against my disbelief.

I tried again to apologize for my rudeness, and she spoke over me. Her tone was light enough, but it could well have been that I had offended her.


Examining Pro’s Prose Part 1

If you read any modern book on writing or editing, you’ll find the same sets of rules to follow over and over. These rules come out of an aesthetic known as minimalism, and they are the type of thing you’ll be taught to do if you go to one of the big name MFA programs like the Iowa Writers’ Workshop.

The idea of these rules is to produce tight, clear writing. Some people go so far as to say they teach you to write like John Cheever (though I find this a bit unfair as Robert Coover was faculty at Iowa, and I don’t consider his style to be minimalistic at all).

The idea of this series is to take people famous for their excellent prose and look at whether they follow some of the most common rules. I’ll also try to pick writers from at least 1950 onward, because before “modernism” there were some factors which messed with prose (Dickens was a master, but when you’re paid by installment …).

Rule 1: Avoid repetition. This is vague, but it means at the word level (I saw a saw next to the seesaw), repetition of an idea a paragraph or two later, or repetition of themes/concepts across the whole book. The reasoning is often called “1 + 1 = 1/2.” A technique or word or idea is most effective when used once. The next time it appears, the reader has already seen it, and it loses its punch.

I’ll start with Michael Chabon. The Amazing Adventures of Kavalier & Clay won the Pulitzer in 2001, and that is the novel we’ll examine. It has “rapturous passages” (Entertainment Weekly) and “sharp language” (The New York Times Book Review). I don’t point this out sarcastically. The novel is excellent, and if I ever made a list of X books everyone should read, it would probably be on it.

I’ll admit, this is not the easiest rule to start with. A rule like “don’t use adverbs” or “don’t use passive voice” is much easier. Luckily, after some scanning, I think I found something. Here’s the start of Part II Chapter 4:

Sammy was thirteen when his father, the Mighty Molecule, came home. The Wertz vaudeville circuit had folded that spring, a victim of Hollywood, the Depression, mismanagement, bad weather, shoddy talent, philistinism, and a number of other scourges and furies whose names Sammy’s father would invoke, with incantatory rage, in the course of the long walks they took together that summer. At one time or another he assigned blame for his sudden joblessness, with no great coherence or logic, to bankers, unions, bosses, Clark Gable, Catholics, Protestants, theater owners, sister acts, poodle acts, monkey acts, Irish tenors, English Canadians, French Canadians, and Mr. Hugo Wertz himself.

As you might have guessed, the technique that gets repeated here is making a long list. The first list contains the reasons that the Wertz vaudeville circuit closed. The second list contains what Sammy’s father blamed losing his job on.

The intended effect is humor. There is no doubt, reading that second list brings a smile to my face as I visualize someone blaming such an absurd list of things. The way it morphs as it goes on is brilliant: sister acts, poodle acts, …

It is hard to see the first list as intended humor. It is more a statement of fact. One could argue that the repetition of the technique in such close proximity is fine here because it is being used for two different purposes, but I’m not so sure.

The first list primes you for the second. Imagine if the first list only contained “the Depression” or “a victim of Hollywood,” and then all the rest got thrown into the second so it morphed from more serious blame to the absurd. My guess is that this would increase the comic effect, since the reader wouldn’t have just seen a list.

This might seem nitpicky, and I’d agree. I wouldn’t have chosen this example if it ended there. The next paragraph:

The free and careless use of obscenity, like the cigars, the lyrical rage, the fondness for explosive gestures, the bad grammar, and the habit of referring to himself in the third person were wonderful to Sammy; until that summer of 1935, he had possessed few memories or distinct impressions of his father.

I think this is where it crosses the line. Yet another list right after those first two becomes tiresome. This is almost certainly what an editor would tell you if you wrote this book and got it professionally edited. The rule exists for a reason, and if you do a quick mental re-write, you’ll see that the passage becomes much tighter and easier to read with only one list.

But it’s kind of weird to blindly critique a passage out of context like this, so let’s talk about some reasons why such a great writer broke this rule. The narrator of the book has an erudite and exhaustive style. Part of the charm of the book is that it breaks from clean minimalism to present fascinating (but possibly unnecessary) details to create a rich texture surrounding the story.

In the context of the narrative style, these lists fit perfectly. The narrator is so concerned that the lists might not be exhaustive that he feels the need to qualify with a parenthetical, “And any of the above qualities (among several others his father possessed) would …”

This brings us to the first Golden Rule, a rule that supersedes all others. The problem with these exceptions is that you will often trick yourself into thinking you are allowed to break a rule when you aren’t. These are not excuses! If in doubt, follow the rule.

Golden Rule 1: You may break any other rule in order to create a unique and consistent narrative voice.

This is where the book shines. The narrative voice itself provides so much entertainment independent of the plot. Note that once you break a rule for this reason, you are locked in. You have a long, bumpy road ahead of you. It will take hundreds of times more effort to keep that voice consistent than to keep to the rules.

If it ever falters, you risk giving up the illusion and losing your readers. The end result can be spectacular if you pull it off. Go read The Amazing Adventures of Kavalier & Clay if you’re interested in an example where it succeeds.


Composers You Should Know Part 4

It’s been a while since the last “Composers You Should Know,” so let’s do another one. Recently Julia Wolfe’s piece Anthracite Fields won the 2015 Pulitzer Prize. I had been planning on including Wolfe in this series anyway, because she is a founder of one of the most important contemporary music collectives: Bang on a Can. If you don’t know it, Bang on a Can came about in the late ’80s in New York to put on contemporary music concerts and remains an important source of new music concerts around the world.

Wolfe has written a large number of pieces for basically every ensemble, but for the purposes of this post, I’ll go through three pieces in chronological order. Recordings of these pieces can be found for free at her website if you want to follow along. Wolfe has a very clear minimalist strain, but it could be said that a change happened in 1994 with her piece “Lick.”

Once the piece gets going, it almost feels like John Adams’ “Short Ride in a Fast Machine” with the style of minimalism it uses (as opposed to Reich, which is surprising considering the East coast/West coast divide in minimalism). But the important change is the introduction of pop culture elements, most prominently rock and funk.

The driving bass and drums simulate rock, and the guitar and sax introduce some funk riffs. All of this gets tied up in minimalism, but it isn’t that simple. Large sections of the piece lose all sense of time in a confusing mess. The work was groundbreaking and set the stage for how her style would progress in the following years.

In no way do I presume to speak for her or oversimplify anything, but we get a major change in the years after September 11, 2001. The next piece we will look at is “My Beautiful Scream,” which is a concerto for amplified string quartet. This piece is a direct response to the attacks and simulates a slow-motion scream. It almost completely throws off the driving rhythms in favor of building suspense through sustained dissonance.

It is a chilling and moving experience to listen to. The driving beat is part of her musical syntax, so it isn’t completely absent in this work. Here it feels more like pulses, quavers, and bouts of horror. Before, the technique was used to push the piece forward, which made the listener feel light, floating along. Here we get a pulse that struggles, as if trapped, trying to stay above the dense sustained notes engulfing it.

In general, her music had been getting more complicated and dissonant, but after 2003 there is a sense that the tie to “Lick” is all but severed. The evolution happened little-by-little to arrive at darker, more severe, and emotionally rich pieces. That driving rhythm remained, but its purpose changed. Listen to “Cruel Sister,” “Fuel,” and “Thirst,” and then compare to earlier works like “Lick” and “Believing.”

This brings us to the present day with “Anthracite Fields,” which is a study of the anthracite mines of Pennsylvania. It is a work for chorus and chamber ensemble. The choral parts are set to historical texts including lists of names of people who died mining. I’ve only heard the fourth movement in full from the website, but you can find pieces of other movements in the short documentary “The Making of Anthracite Fields.”

The piece is chilling at times and soaring and beautiful at others. There’s certainly some folk and Americana influence as well. I’m pretty excited to hear a recording. The work makes sense in her evolution as a composer and sounds like it is the most diverse and wide-ranging yet.

Overall, one of Julia Wolfe’s lasting achievements is her ability to blend and push the boundaries of rock and classical elements, but her finished products are so much more than that.


The 77 Cent Wage Gap Fallacy

I almost posted about this last month when “Equal Pay Day” happened. Instead, I sat back on the lookout for a good explanation of why the “fact” that “women only make 77 cents for every dollar a man makes” is meaningless. There were a ton of excellent takedowns pointing out all sorts of variables that weren’t controlled for. This is fine, but the reason the number is meaningless is so much more obvious.

Now, this blog talks about math and statistics a lot, so I felt somewhat obligated to point this out. Unfortunately, this topic is politically charged, and I’ve heard some very smart, well-intentioned people who should know better repeat this nonsense. This means bias is at work.

Let’s be clear before I start. I’m not saying there is no pay gap or no discrimination. This post is only about the most prominent figure that gets thrown around: 77 cents for every $1 and why it doesn’t mean what people want it to mean. This number is everywhere and still pops up in viral videos monthly (sometimes as “78” because they presume the gap has decreased?).

I include this video to be very clear that I am not misrepresenting the people who cite this number. They really propagate the idea that the number means a woman with the same experience and same job will tend to make 77% of what a man makes.

I did some digging and found the number comes from this outdated study. If you actually read it, you’ll find something shocking. This number refers to the median salary of a full-time, year round woman versus the median salary of a full-time, year round man. You read that right: median across everything!!

At this point, my guess is that all my readers immediately see the problem. In case someone stumbles on this who doesn’t, let’s do a little experiment where we control for everything so we know beyond all doubt that two groups of people have the exact same pay for the same work, but a median gap appears.

Company A is perfectly egalitarian. Every single employee gets $20 an hour, including the highest ranking people. This company also believes in uniforms, but gives the employees some freedom. They can choose blue or green. The company is a small start-up, so there are only 10 people: 8 choose blue and 2 choose green.

Company B likes the model of A, but can’t afford to pay as much. They pay every employee $15 an hour. In company B it turns out that 8 choose green and 2 choose blue.

It should be painfully obvious that there is no wage gap between blue- and green-uniformed people in any meaningful sense, because they are paid exactly the same as their coworkers with the same job. Pay is equal in exactly the sense that advocates of pay equality should want.

But, of course, the median blue uniform worker makes $20/hour whereas the green uniform worker only makes $15/hour. There is a uniform wage gap!
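The thought experiment is easy to check numerically. This is a hypothetical sketch: the lists below just encode the two companies described above, using only the standard library.

```python
from statistics import median

# Company A pays everyone $20/hr: 8 workers chose blue, 2 chose green.
# Company B pays everyone $15/hr: 2 workers chose blue, 8 chose green.
blue = [20] * 8 + [15] * 2
green = [20] * 2 + [15] * 8

print(median(blue), median(green))   # 20.0 15.0
print(median(green) / median(blue))  # 0.75 -- a "75-cent uniform wage gap"
```

A 25% median gap appears even though every worker is paid exactly the same as coworkers at the same company doing the same job.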

Here are some of the important factors to note from this example. It cannot come from discriminatory hiring practices, because the uniform was chosen after being hired. It cannot be that green-uniform people are picking lower paying jobs, because they picked the uniform after picking the job. It cannot come from green uniforms wanting to give up their careers to go have a family, because we’ll assume for the example that all the workers are single.

I’ll reiterate, it can’t be from anything, because no pay gap exists in the example! But it gets worse. Now suppose that both companies are headed by a person who likes green and gives a $1/hour raise to all green employees. This means both companies have discriminatory practices which favor green uniforms, but the pay gap would tell us that green are discriminated against!

This point can’t be stated enough. It is possible (though obviously not true based on other, narrower studies) that every company in the U.S. pays women more for equal work, yet we could still see the so-called “77 cent gender wage gap” calculated from medians. If you don’t believe this, then you haven’t understood the example I gave. Can we please stop pretending this number is meaningful?

Someone who uses a median across jobs and companies to say there is a pay gap has committed a statistical fallacy or is intentionally misleading you for political purposes. My guess is we’ll be seeing this pop up more and more as we get closer to the next election, and it will be perpetuated by both sides. It is a hard statistic to debunk in a small sound bite without sounding like you advocate unequal pay. I’ll leave you with a clip from a few weeks ago (see how many errors you spot).


Lossless Compression by Example Part 2: Huffman Coding

Last time we looked at some reasons why lossy compression is considered bad for music, and we looked at one possible quick and dirty way to compress. This time we’ll introduce the concept of lossless compression.

I’ll first point out that even if this seems like a paradoxical notion, everyone already believes it can be done. We use it all the time when we compress files on our computers by zipping them. Of course, this results in a smaller file, but no one thinks they will have lost information when they unzip it. This means that there must exist ways to do lossless compression.

Today’s example is a really simple and brilliant way of doing it. It will have nothing to do with music for now, but don’t think of this as merely a toy example. Huffman coding is actually used as a step in mp3 encoding, so it relates to what we’ve been discussing.

Here’s the general idea. Suppose you want to encode (into binary) text in the most naive way possible. You assign A to 0, B to 1, C to 10, D to 11, and so on, until Z gets 11001. Since the longest of these is 5 bits, you pad each one out and use 5 bits for every single letter. “CAT” would be 00010 00000 10011.

To encode “CAT” we did something dumb. We only needed 3 letters, so if we had chosen ahead of time a better encoding method, maybe C = 00, A = 01, T = 10, then we could encode the text as 00 01 10. In other words, we compress our data without losing any information by a clever choice of encoding 00010 00000 10011 -> 000110.
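Both encodings are a few lines of Python (naive_encode and the chosen table are just illustrative names I made up for this sketch):

```python
# Naive scheme: each letter's alphabet index, zero-padded to 5 bits.
def naive_encode(text):
    return " ".join(format(ord(c) - ord("A"), "05b") for c in text)

print(naive_encode("CAT"))  # 00010 00000 10011

# With a code chosen ahead of time for this particular text,
# 2 bits per letter suffice.
chosen = {"C": "00", "A": "01", "T": "10"}
print("".join(chosen[c] for c in "CAT"))  # 000110
```

Same text, same information, six bits instead of fifteen.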

I know your complaint already. Any sufficiently long text will contain every letter, so there is no way to do better than that original naive method. Well, you’re just not being clever enough!

Some letters will occur with more frequency than others. So if, for example, the letter “s” occurs with frequency 100 and then the next most frequent letter occurs 25 times, you will want to choose something like “01” for “s”. That way the smallest number of bits is used for the most frequent letters.

Ah, but the astute reader complains again. The reason we couldn’t do this before is because we won’t be able to tell the difference in a long string between two frequent letters: 10 01, and a single less-frequent letter: 1001. This was why we needed all 5 bits when we used the whole alphabet.

This is a uniqueness problem. What we do is not allow “01” to be a prefix of any other letter’s string once we’ve assigned it. This way, when we encounter 01, we stop. We know that is the letter “s” because no other letter’s string begins with “01”.

Of course, what ends up happening is that we have to go to much more than 5 bits for some letters, but the idea is that they will be used with such infrequency and the 2 and 3 bit letters used with such high frequency that it ends up saving way more space than if we stuck to 5.

Now you should be asking two questions: Is it provably smaller and is there some simple algorithm to figure out how to assign a letter to a bit sequence so that the uniqueness and smallness happens? Yes to both!

We won’t talk about proofs, since this is a series “by example.” But I think the algorithm to generate the symbol strings to encode is pretty neat.

Let’s generate the Huffman tree for “Titter Twitter Top” (just to get something with high frequency and several “repeat” frequencies).

First, make an ordered list of the letters and their frequencies: (T:7), (I:2), (E:2), (R:2), (W:1), (O:1), (P:1).

Now we will construct a binary tree with these as leaves. Start with the bottom two on the list as leaves and connect them to a parent node marked with a placeholder (*) and the sum of their frequencies. Then insert this new placeholder node into the correct place on the list and remove the two you used.

Now repeat the process with the bottom two on the list (if a node is already on the list, use it in the tree).

Keep repeating this process until you’ve exhausted the list, and you will get the full binary tree we will use.

Now to work out how to encode each letter, write a 0 on every left edge and a 1 on every right edge. Descend from the top to the letter you want and write the digits in order. This is the encoding. So T = 1, I = 000, R = 010, E = 011, W = 0011, O = 00101, and P = 00100. Test it out for yourself. You will find there is no ambiguity because each string of digits used for a letter never appears as a prefix of another letter.
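You can test the unambiguity claim mechanically. This sketch (the decode helper is mine, not from any library) greedily reads bits until the buffer matches a codeword; the prefix-free property guarantees the first match is the right letter:

```python
codes = {"T": "1", "I": "000", "R": "010", "E": "011",
         "W": "0011", "O": "00101", "P": "00100"}

def decode(bits, codes):
    inverse = {v: k for k, v in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        # Prefix-free property: the first time the buffer matches a
        # codeword, that codeword must be the intended letter.
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

encoded = "".join(codes[c] for c in "TITTERTWITTERTOP")
print(len(encoded))            # 39
print(decode(encoded, codes))  # TITTERTWITTERTOP
```

Round-tripping the whole phrase recovers it exactly, in 39 bits.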

Also, note that the letter that occurs with the highest frequency gets a single bit, and the codes get longer only as the frequency decreases. The encoding for Titter Twitter Top with this Huffman code is 39 bits whereas the naive encoding is 80. This compresses to half the space needed and loses no information!
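The tree-building procedure above can be sketched with a priority queue from the standard library. This is my own illustrative implementation (the name build_huffman is made up); ties in frequency may be merged in a different order than in the worked example, so individual codewords can differ, but every Huffman tree is optimal and the total comes out to the same 39 bits:

```python
import heapq
from collections import Counter

def build_huffman(text):
    """Build a Huffman code for the letters of `text` (spaces ignored)."""
    freqs = Counter(c for c in text.upper() if c.isalpha())
    # Heap entries are (frequency, tiebreaker, tree); a tree is either a
    # single letter or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # pop the two lowest-frequency
        f2, _, right = heapq.heappop(heap)  # nodes on the list...
        heapq.heappush(heap, (f1 + f2, tiebreak, (left, right)))  # ...merge
        tiebreak += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, str):
            codes[tree] = prefix or "0"
        else:
            walk(tree[0], prefix + "0")  # left edge = 0
            walk(tree[1], prefix + "1")  # right edge = 1
    walk(heap[0][2], "")
    return codes, freqs

codes, freqs = build_huffman("Titter Twitter Top")
total = sum(freqs[ch] * len(codes[ch]) for ch in freqs)
print(total)  # 39 bits, versus 16 letters * 5 bits = 80 for the naive code
```

T, with frequency 7, always ends up with a 1-bit code, and the resulting code table is always prefix-free.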

We won’t get into the tedious details of how computers actually store information to see that there are lots of subtleties we’ve ignored for executing this in practice (plus we have to store the conversion table as part of the data), but at least we’ve seen an example of lossless compression in theory. Also, there was nothing special about letters here. We could do this with basically any information (for example frequencies in a sound file).


Lossless Compression by Example Part 1: Lossy Methods

Since I’m into music, a growing trend often comes up in conversation: music is now sold both digitally and on vinyl. Sometimes I’ll hear people dismiss the vinyl trend as “retro” or “trendy” or “hip” or whatever. But if you actually ask someone why they prefer records, they’ll probably tell you the sound quality is better.

I thought I’d do a series on lossless compression and try to keep everything to general concepts and examples. Let’s start with the terminology. First, media files can be large, and back in the day when computers didn’t have basically infinite space, compression was an important tool for reducing the size of a media file.

Compression is basically an algorithm that takes a file and makes it smaller. The most obvious method for doing this is lossy compression, which just means you lose information. The goal of such an algorithm is to lose only information that is “unimportant” and “won’t be noticed.”

A far more surprising method of compression is called lossless. At first it seems paradoxical. How can you make the file size smaller, but not lose any information? Isn’t the file size basically the information? We won’t get to this in this post. Teaser for next time!

Now let’s talk about why people don’t like lossy compressed audio files. There is one quick and dirty thing you can do to immediately lose information and reduce the size of an audio file: dynamic range (DR) compression.

Think of a soundwave. The amplitude basically determines how loud it is. You can literally compress the wave to have a smaller amplitude without changing any other musical qualities. But this is terrible! One of the most important parts of music is the DR. A moving, soaring climax will not have the same effect if the entire build up to it is the same loudness.
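To see how mechanical this is, here is a crude, hypothetical hard-knee compressor in numpy. The function name, threshold, and ratio are all made up for illustration: any sample whose amplitude exceeds the threshold gets its excess divided down, flattening the loud peaks toward the quiet parts.

```python
import numpy as np

def crude_dr_compress(samples, threshold=0.5, ratio=4.0):
    """Shrink any amplitude excess above `threshold` by `ratio`."""
    out = samples.astype(float).copy()
    over = np.abs(out) > threshold
    excess = np.abs(out[over]) - threshold
    # Keep the sign, keep everything up to the threshold, and divide
    # the part sticking out above it.
    out[over] = np.sign(out[over]) * (threshold + excess / ratio)
    return out

wave = np.array([0.1, 0.3, 0.9, -0.8, 0.2])
print(crude_dr_compress(wave))  # quiet samples untouched, peaks pulled in
```

A real compressor works on a smoothed loudness envelope with attack and release times; this per-sample version is the bluntest possible variant, but the squashing effect on the waveform is the same idea.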

This is such a controversial compression technique that many people switch to vinyl purely for DR reasons. There is a whole, searchable online database of albums where you can look up the DR and whether it is considered good, acceptable, or bad. Go search for your favorite albums. It is kind of fun to find out how much has been squashed out even in lossless CD format vs vinyl! (e.g. System of a Down’s Toxicity is DR 11 [acceptable] on vinyl and DR 6 [truly bad] on lossless CD.)

The other most common lossy compression technique for audio is a bit more involved, but it actually changes the music, so it is worth thinking about. Let’s make a rough algorithm for doing it (much better and subtler forms of the following exist, but they amount to the same thing).

This is a bit of a silly example, but I went to http://www.wavsource.com to get a raw wav file to work with. I grabbed one of the first ones, an audio sample from the movie 2001: A Space Odyssey. Here is the data visualization of the sound waves and the actual clip:



One thing we can do is the Fast Fourier Transform. This will take these sound waves and get rid of the time component. Normally you’ll want to make a “moving window,” so you keep track of some time. For example, we can see that from 0.5 sec to 1.5 sec is one “packet.” We should probably transform that first, then move to the next.
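As a sketch of the moving-window idea (the `windowed_fft` helper and the 8 kHz toy signal are my own assumptions, not from the original clip), we can split the samples into fixed-size chunks and transform each chunk separately, so each spectrum stays attached to a rough position in time:

```python
import numpy as np

def windowed_fft(samples, window_size):
    """FFT the signal one non-overlapping window at a time."""
    spectra = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        chunk = samples[start:start + window_size]
        spectra.append(np.fft.rfft(chunk))
    return spectra

# A toy signal: one second of a 440 Hz tone at an 8 kHz sample rate.
fs = 8000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t)

# Quarter-second windows: four spectra, each localized in time.
spectra = windowed_fft(signal, fs // 4)
print(len(spectra))  # 4
```

Production codecs use overlapping, tapered windows to avoid artifacts at the chunk boundaries, but the principle is the same: transform a short stretch, then move to the next one.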

The FFT leaves us with just the frequencies that occur and how loud they are. I did this with Python’s scipy.fftpack:

import matplotlib.pyplot as plt
import scipy.fftpack as sfft
import numpy as np
from scipy.io import wavfile

# Read the wav file and rescale the 8-bit samples to [-1, 1].
fs, data = wavfile.read('daisy.wav')
b = [(ele / 2**8.) * 2 - 1 for ele in data]

# Transform to the frequency domain.
c = sfft.fft(b)
d = len(c) // 2  # only the first half of the spectrum is distinct

# Zero out every frequency whose magnitude falls below the cutoff.
compressed = np.asarray([ele if abs(ele) > 50 else 0 for ele in c])

# Inverse transform to hear the result.
e = sfft.ifft(compressed)


Ignore the scales, which were changed to make everything more visible rather than being normalized. The crudest thing we could do is set a cutoff and simply remove all the frequencies we assume would be inaudible anyway:


If we do this too much, we are going to destroy how natural the sound is. As I’ve explained before, all sounds occurring naturally have tons of subtle overtones. You often can’t explicitly hear these, so they will occur below the cutoff threshold. This will bring us towards a “pure” tone which will sound more synthetic or computer generated. This is probably why no one actually compresses this way. This example was just to give an idea of one way it could be done (to finish it off you can now just inverse FFT and write to wav).

A slightly better compression technique would be to take short time intervals and multiply the peak frequency by a bump function. This shrinks all the extraneous frequencies without completely removing the robustness of the sound, and it is closer to how some lossy compression is actually done. There are other, more fun approaches using wavelets, but those would take several posts to describe, and the goal here is to get to lossless compression.
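Here is one way the bump-function idea could look in numpy. This is a hypothetical sketch, not the actual algorithm any codec uses: find the dominant frequency in an interval, then multiply the spectrum by a Gaussian bump centered there, with a small floor so quieter overtones are shrunk rather than erased.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
# A 440 Hz tone with a quieter overtone and some noise, standing in
# for one short interval of a real recording.
signal = (np.sin(2 * np.pi * 440 * t)
          + 0.2 * np.sin(2 * np.pi * 880 * t)
          + 0.05 * rng.standard_normal(fs))

spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), 1 / fs)

# Center a Gaussian bump on the dominant frequency; everything far
# from the peak is shrunk (down to a 10% floor) instead of cut off.
peak = freqs[np.argmax(np.abs(spectrum))]
bump = np.exp(-((freqs - peak) / 300.0) ** 2)
shrunk = spectrum * (0.1 + 0.9 * bump)

# Inverse transform to get the smoothed-out time signal back.
smoothed = np.fft.irfft(shrunk, len(signal))
```

Unlike the hard cutoff, every frequency survives at some level, so the result keeps more of the natural overtone structure while still storing far less spectral energy.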

I hope that helps to see what lossy compression is, and that it can cause some serious harm when done without care. With care, you will still lose enough sound quality that many music aficionados avoid mp3 and digital downloads completely in favor of vinyl.

Next time we’ll tackle the seemingly paradoxical concept of lossless compression.

