A Mind for Madness

Musings on art, philosophy, mathematics, and physics


Composers You Should Know Part 4

It’s been a while since the last “Composers You Should Know,” so let’s do another one. Recently Julia Wolfe’s piece Anthracite Fields won the 2015 Pulitzer Prize. I had been planning on including Wolfe in this series anyway, because she is a founder of one of the most important contemporary music collectives: Bang on a Can. If you don’t know about it, the collective came about in the late ’80s in New York to put on contemporary music concerts, and it remains an important source of new music concerts around the world.

Wolfe has written a large number of pieces for basically every ensemble, but for the purposes of this post, I’ll go through three pieces in chronological order. Recordings of these pieces can be found for free at her website if you want to follow along. Wolfe has a very clear minimalist strain, but it could be said that a change happened in 1994 with her piece “Lick.”

Once the piece gets going, it almost feels like John Adams’ “Short Ride in a Fast Machine” with the style of minimalism it uses (as opposed to Reich, which is surprising considering the East coast/West coast divide in minimalism). But the important change is the introduction of pop culture elements, most prominently rock and funk.

The driving bass and drums simulate rock, and the guitar and sax introduce some funk riffs. All of this gets tied up in minimalism, but it isn’t that simple. Large sections of the piece lose all sense of time in a confusing mess. The work was groundbreaking and set the stage for how her style would progress in the following years.

In no way do I presume to speak for her or oversimplify anything, but we get a major change in the years after September 11, 2001. The next piece we will look at is “My Beautiful Scream,” which is a concerto for amplified string quartet. This piece is a direct response to the attacks and simulates a slow-motion scream. It almost completely throws off the driving rhythms in favor of building suspense through sustained dissonance.

It is a chilling and moving experience to listen to. The driving beat is part of her musical syntax, so it isn’t completely absent in this work. Here it feels more like pulses, quavers, and bouts of horror. Before, the technique was used to push the piece forward, which made the listener feel light and floating along. Here we get a pulse that struggles, as if trapped, trying to stay above the dense sustained notes engulfing it.

In general, her music had been getting more complicated and dissonant, but after 2003 there is a sense that the tie to “Lick” is all but severed. The evolution happened little-by-little to arrive at darker, more severe, and emotionally rich pieces. That driving rhythm remained, but its purpose changed. Listen to “Cruel Sister,” “Fuel,” and “Thirst,” and then compare to earlier works like “Lick” and “Believing.”

This brings us to present day with “Anthracite Fields,” which is a study of the anthracite mines of Pennsylvania. It is a work for chorus and chamber ensemble. The choral parts are set to historical texts including lists of names of people who died mining. I’ve only heard the fourth movement in full from the website, but you can find pieces of other movements in the short documentary “The Making of Anthracite Fields.”

The piece is chilling at times and soaring and beautiful at others. There’s certainly some folk and Americana influence as well. I’m pretty excited to hear a recording. The work makes sense in her evolution as a composer and sounds like it is the most diverse and wide-ranging yet.

Overall, one of Julia Wolfe’s lasting achievements is her ability to blend and push the boundaries of rock and classical elements, but her finished products are so much more than that.


The 77 Cent Wage Gap Fallacy

I almost posted about this last month when “Equal Pay Day” happened. Instead, I sat back on the lookout for a good explanation of why the “fact” that “women only make 77 cents for every dollar a man makes” is meaningless. There were a ton of excellent takedowns that pointed out all sorts of variables that weren’t controlled for. This is fine, but the reason the number is meaningless is so much more obvious.

Now, this blog talks about math and statistics a lot, so I felt somewhat obligated to point this out. Unfortunately, this topic is politically charged, and I’ve heard some very smart, well-intentioned people repeat this nonsense who should know better. This means bias is at work.

Let’s be clear before I start. I’m not saying there is no pay gap or no discrimination. This post is only about the most prominent figure that gets thrown around: 77 cents for every $1 and why it doesn’t mean what people want it to mean. This number is everywhere and still pops up in viral videos monthly (sometimes as “78” because they presume the gap has decreased?):

I include this video to be very clear that I am not misrepresenting the people who cite this number. They really propagate the idea that the number means a woman with the same experience and same job will tend to make 77% of what a man makes.

I did some digging and found the number comes from this outdated study. If you actually read it, you’ll find something shocking. This number compares the median salary of women who work full-time, year-round with the median salary of men who work full-time, year-round. You read that right: median across everything!!

At this point, my guess is that all my readers immediately see the problem. In case someone stumbles on this who doesn’t, let’s do a little experiment where we control for everything so we know beyond all doubt that two groups of people have the exact same pay for the same work, but a median gap appears.

Company A is perfectly egalitarian. Every single employee gets $20 an hour, including the highest ranking people. This company also believes in uniforms, but gives the employees some freedom. They can choose blue or green. The company is a small start-up, so there are only 10 people: 8 choose blue and 2 choose green.

Company B likes the model of A, but can’t afford to pay as much. They pay every employee $15 an hour. In company B it turns out that 8 choose green and 2 choose blue.

It should be painfully obvious that there is no wage gap between blue and green uniformed people in any meaningful sense, because they are paid exactly the same as their coworkers with the same job. Pay is equal in the sense that everyone who argues for pay equality should want.

But, of course, the median blue uniform worker makes $20/hour whereas the green uniform worker only makes $15/hour. There is a uniform wage gap!

Here are some of the important factors to note from this example. It cannot be from discriminatory hiring practices, because the uniform was chosen after being hired. It cannot be that green uniform people are picking lower paying jobs, because they picked the uniform after picking the job. It cannot be from green uniforms wanting to give up their careers to go have a family, because we’ll assume for the example that all the workers are single.

I’ll reiterate, it can’t be from anything, because no pay gap exists in the example! But it gets worse. Now suppose that both companies are headed by a person who likes green and gives a $1/hour raise to all green employees. This means both companies have discriminatory practices which favor green uniforms, but the pay gap would tell us that green are discriminated against!
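If you want to check the arithmetic yourself, here is a minimal sketch of the two-company example. The function name and the `green_bonus` parameter are just labels I made up for illustration; the numbers are the ones from the example above.

```python
from statistics import median

def uniform_medians(green_bonus=0.0):
    # Company A pays $20/hour: 8 blue, 2 green.
    # Company B pays $15/hour: 2 blue, 8 green.
    # green_bonus is the extra hourly pay both bosses hand to green uniforms.
    blue = [20.0] * 8 + [15.0] * 2
    green = [20.0 + green_bonus] * 2 + [15.0 + green_bonus] * 8
    return median(blue), median(green)

print(uniform_medians())     # (20.0, 15.0): identical pay for identical work
print(uniform_medians(1.0))  # (20.0, 16.0): green is favored everywhere
```

In both cases the median green worker comes out behind the median blue worker, even though no green worker is ever paid less than a blue coworker.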

This point can’t be stated enough. It is possible (though obviously not true based on other, narrower studies) that every company in the U.S. pays women more for equal work, yet we could still see the so-called “77 cent gender wage gap” calculated from medians. If you don’t believe this, then you haven’t understood the example I gave. Can we please stop pretending this number is meaningful?

Someone who uses a median across jobs and companies to say there is a pay gap has committed a statistical fallacy or is intentionally misleading you for political purposes. My guess is we’ll be seeing this pop up more and more as we get closer to the next election, and it will be perpetuated by both sides. It is a hard statistic to debunk in a small sound bite without sounding like you advocate unequal pay. I’ll leave you with a clip from a few weeks ago (see how many errors you spot).


Lossless Compression by Example Part 2: Huffman Coding

Last time we looked at some reasons why lossy compression is considered bad for music, and we looked at one possible quick and dirty way to compress. This time we’ll introduce the concept of lossless compression.

I’ll first point out that even if this seems like a paradoxical notion, everyone already believes it can be done. We use it all the time when we compress files on our computers by zipping them. Of course, this results in a smaller file, but no one thinks when they unzip they will have lost information. This means that there must exist ways to do lossless compression.

Today’s example is a really simple and brilliant way of doing it. It will have nothing to do with music for now, but don’t think of this as merely a toy example. Huffman coding is actually used as a step in mp3 encoding, so it relates to what we’ve been discussing.

Here’s the general idea. Suppose you want to encode (into binary) text in the most naive way possible. You assign A to 0, B to 1, C to 10, D to 11, etc. When you get to Z you’ll have 11001. This means that you have to use 5 bits for every single letter. “CAT” would be 00010 00000 10011.

To encode “CAT” we did something dumb. We only needed 3 letters, so if we had chosen ahead of time a better encoding method, maybe C = 00, A = 01, T = 10, then we could encode the text as 00 01 10. In other words, we compress our data without losing any information by a clever choice of encoding 00010 00000 10011 -> 000110.
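Here is a small sketch of both encodings of “CAT” so you can see the saving directly. The helper name `naive_encode` and the `custom` table are my own labels, not anything standard.

```python
def naive_encode(text, width=5):
    # A -> 0, B -> 1, ..., Z -> 25, each written as a fixed-width binary string
    return " ".join(format(ord(ch) - ord("A"), f"0{width}b") for ch in text.upper())

print(naive_encode("CAT"))  # 00010 00000 10011

# The hand-picked 2-bit table for a text that only uses C, A, and T
custom = {"C": "00", "A": "01", "T": "10"}
print("".join(custom[ch] for ch in "CAT"))  # 000110
```

The naive version spends 15 bits and the custom table spends 6, and both can be decoded back to exactly the same three letters.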

I know your complaint already. Any sufficiently long text will contain every letter, so there is no way to do better than that original naive method. Well, you’re just not being clever enough!

Some letters will occur with more frequency than others. So if, for example, the letter “s” occurs with frequency 100 and then the next most frequent letter occurs 25 times, you will want to choose something like “01” for “s”. That way the smallest number of bits is used for the most frequent letters.

Ah, but the astute reader complains again. The reason we couldn’t do this before is because we won’t be able to tell the difference in a long string between two frequent letters: 10 01, and a single less-frequent letter: 1001. This was why we needed all 5 bits when we used the whole alphabet.

This is a uniqueness problem. What we do is not allow “01” to be a prefix on an assigned string once we’ve assigned it. This way, when we encounter 01, we stop. We know that is the letter “s” because no other letter begins “01”.

Of course, what ends up happening is that we have to go to much more than 5 bits for some letters, but the idea is that they will be used with such infrequency and the 2 and 3 bit letters used with such high frequency that it ends up saving way more space than if we stuck to 5.

Now you should be asking two questions: Is it provably smaller and is there some simple algorithm to figure out how to assign a letter to a bit sequence so that the uniqueness and smallness happens? Yes to both!

We won’t talk about proofs, since this is a series “by example.” But I think the algorithm to generate the symbol strings to encode is pretty neat.

Let’s generate the Huffman tree for “Titter Twitter Top” (just to get something with high frequency and several “repeat” frequencies).

First, make an ordered list of the letters and their frequencies: (T:7), (I:2), (E:2), (R:2), (W:1), (O:1), (P:1).

Now we will construct a binary tree with these as leaves. Start with the bottom 2 as leaves and connect them to a parent with a placeholder (*) and the sum of the frequencies. Then insert this new placeholder into the correct place on the list and remove the two you used:

Now repeat the process with the bottom two on the list (if a node is on the list already, use it in the tree):

Keep repeating this process until you’ve exhausted the list and you will get the full binary tree we will use:
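The process above can be sketched in a few lines of Python. This is a rough version under my own assumptions: I use a heap instead of re-sorting a list by hand, I ignore spaces and case, and ties between equal frequencies are broken by insertion order. That means the individual codes it produces may differ from the tree in this post, but the total number of bits comes out the same.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(text):
    # Count letter frequencies, ignoring spaces and case
    freq = Counter(ch for ch in text.upper() if ch.isalpha())
    tiebreak = count()  # unique counter so the heap never compares tree nodes
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    # Repeatedly join the two least frequent nodes under a new parent
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        # Internal nodes are (left, right) tuples; leaves are single letters
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return freq, codes

freq, codes = huffman_code("Titter Twitter Top")
total_bits = sum(freq[s] * len(codes[s]) for s in freq)
print(codes)
print(total_bits)
```

For “Titter Twitter Top” the total is 39 bits, and T always ends up with a single-bit code because it is only merged at the very last step.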

Now to work out how to encode each letter, write a 0 on every left edge and a 1 on every right edge. Descend from the top to the letter you want and write the digits in order. This is the encoding. So T = 1, I = 000, R = 010, E = 011, W = 0011, O = 00101, and P = 00100. Test it out for yourself. You will find there is no ambiguity because each string of digits used for a letter never appears as a prefix of another letter.

Also, note that the letter that occurs with the highest frequency is a single bit, and the number of bits needed grows only as the frequency drops. The encoding for Titter Twitter Top with this Huffman code is 39 bits whereas the naive encoding is 80. This compresses to just under half the space needed and loses no information!
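To test the table for yourself, here is a minimal sketch that encodes the phrase with the exact codes listed above and then decodes it again. The function names are my own, and spaces are simply dropped since the table has no code for them.

```python
CODES = {"T": "1", "I": "000", "R": "010", "E": "011",
         "W": "0011", "O": "00101", "P": "00100"}

def encode(text):
    # Skip anything (like spaces) that has no code in the table
    return "".join(CODES[ch] for ch in text.upper() if ch in CODES)

def decode(bits):
    # Read bits left to right and emit a letter as soon as a full code is seen.
    # This works only because no code is a prefix of another code.
    lookup = {code: sym for sym, code in CODES.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)

bits = encode("Titter Twitter Top")
print(len(bits))     # 39
print(decode(bits))  # TITTERTWITTERTOP
```

The 16 letters cost 39 bits here versus 16 × 5 = 80 bits with the naive 5-bit method.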

We won’t get into the tedious details of how computers actually store information to see that there are lots of subtleties we’ve ignored for executing this in practice (plus we have to store the conversion table as part of the data), but at least we’ve seen an example of lossless compression in theory. Also, there was nothing special about letters here. We could do this with basically any information (for example frequencies in a sound file).


Lossless Compression by Example Part 1: Lossy Methods

Since I’m into music, a growing trend often comes up: music is sold both digitally and on vinyl. Sometimes I’ll hear people mistakenly call the vinyl trend “retro” or “trendy” or “hip” or whatever. But if you actually ask someone why they prefer records, they’ll probably tell you the sound quality is better.

I thought I’d do a series on lossless compression and try to keep everything to general concepts or examples. Let’s start with the terminology. First, media files can be large, and back in the day when computers didn’t have basically infinite space, compression was an important tool for reducing the size of a media file.

Compression is basically an algorithm that takes a file and makes it smaller. The most obvious method for doing this is lossy compression. This just means you lose information. The goal of such an algorithm is to only lose information that is “unimportant” and “won’t be noticed.”

A far more surprising method of compression is called lossless. At first it seems paradoxical. How can you make the file size smaller, but not lose any information? Isn’t the file size basically the information? We won’t get to this in this post. Teaser for next time!

Now let’s talk about why people don’t like lossy compressed audio files. There is one quick and dirty thing you can do to immediately lose information and reduce the size of an audio file. This is dynamic range (DR) compression.

Think of a soundwave. The amplitude basically determines how loud it is. You can literally compress the wave to have a smaller amplitude without changing any other musical qualities. But this is terrible! One of the most important parts of music is the DR. A moving, soaring climax will not have the same effect if the entire build up to it is the same loudness.
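Here is a toy sketch of what squashing the amplitude looks like on a made-up wave. Everything in it is an assumption for illustration: the `squash` function, the threshold of 0.5, and the 4:1 ratio are mine, and real compressors track a smoothed loudness envelope rather than acting on individual samples like this.

```python
import math

def squash(sample, threshold=0.5, ratio=4.0):
    # Leave quiet samples alone; shrink the part of a loud sample above the threshold
    size = abs(sample)
    if size <= threshold:
        return sample
    return math.copysign(threshold + (size - threshold) / ratio, sample)

# A tone that swells from quiet (amplitude 0.1) to loud (amplitude 1.0)
wave = [(0.1 + 0.9 * n / 999) * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
squashed = [squash(s) for s in wave]

print(max(abs(s) for s in wave), max(abs(s) for s in squashed))
```

The quiet opening passes through untouched while the loud ending is capped near 0.625, so the build up and the climax end up much closer in loudness.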

This is such a controversial compression technique that many people switch to vinyl purely for DR reasons. There is a whole, searchable online database of albums to find out the DR and whether it is considered good, acceptable, or bad. Go search for your favorite albums. It is kind of fun to find out how much has been squashed out even in lossless CD format vs vinyl! (e.g. System of a Down’s Toxicity is DR 11 [acceptable] on vinyl and DR 6 [truly bad] on lossless CD).

The other most common lossy compression technique for audio is a bit more involved, but it actually changes the music, so it is worth thinking about. Let’s actually make a rough algorithm for doing this (there currently exist much better and subtler forms of the following, but it amounts to the same thing).

This is a bit of a silly example, but I went to http://www.wavsource.com to get a raw wav file to work with. I grabbed one of the first ones, an audio sample from the movie 2001: A Space Odyssey. Here is the data visualization of the sound waves and the actual clip:



One thing we can do is the Fast Fourier Transform. This will take these sound waves and get rid of the time component. Normally you’ll want to make a “moving window,” so you keep track of some time. For example, we can see that from 0.5 sec to 1.5 sec is one “packet.” We should probably transform that first, then move to the next.

The FFT leaves us just with the frequencies that occur and how loud they are. I did this with Python’s scipy.fftpack:

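import matplotlib.pyplot as plt
import scipy.fftpack as sfft
import numpy as np
from scipy.io import wavfile

# Read the wav file and rescale the samples (assuming 8-bit data) to about [-1, 1)
fs, data = wavfile.read('daisy.wav')
b = [(ele / 2**8.) * 2 - 1 for ele in data]

# Transform to the frequency domain
c = sfft.fft(b)
d = len(c) // 2  # midpoint of the spectrum

# Crude cutoff. The loop body is an assumption (the listing breaks off after
# the "if"): keep frequencies louder than the cutoff and zero out the rest.
compressed = []
for ele in c:
    if abs(ele) > 50:
        compressed.append(ele)
    else:
        compressed.append(0)

compressed = np.asarray(compressed)

# Transform back to the time domain
e = sfft.ifft(compressed)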


Ignore the scales which were changed just to make everything more visible but not normalized. The most crude thing we could do is set a cutoff and just remove all frequencies that we assume will be inaudible anyway:


If we do this too much, we are going to destroy how natural the sound is. As I’ve explained before, all sounds occurring naturally have tons of subtle overtones. You often can’t explicitly hear these, so they will occur below the cutoff threshold. This will bring us towards a “pure” tone which will sound more synthetic or computer generated. This is probably why no one actually compresses this way. This example was just to give an idea of one way it could be done (to finish it off you can now just inverse FFT and write to wav).
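To see the “pure tone” effect without needing the wav file, here is a small sketch on a synthetic signal. The sample rate, the 440 Hz tone, and the faint 1320 Hz overtone are all made up for illustration, and I use numpy’s FFT rather than scipy.fftpack. The cutoff of 50 matches the listing above, but keep in mind that FFT magnitudes scale with the length of the clip, so that number means something different here.

```python
import numpy as np

fs = 8000                      # hypothetical sample rate
t = np.arange(fs) / fs         # one second of samples
# A loud tone plus a very faint overtone
signal = np.sin(2 * np.pi * 440 * t) + 0.01 * np.sin(2 * np.pi * 1320 * t)

spectrum = np.fft.fft(signal)
cutoff = 50
# Zero every frequency whose magnitude falls below the cutoff
kept = np.where(np.abs(spectrum) > cutoff, spectrum, 0)
restored = np.fft.ifft(kept).real

print(np.count_nonzero(kept), "of", spectrum.size, "coefficients kept")
print(np.max(np.abs(restored - signal)))
```

Only the two coefficients belonging to the 440 Hz tone survive, and what comes back out is a bare sine wave with the overtone gone.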

A slightly better compression technique would be to take short time intervals and multiply the peak frequency by a bump function. This will shrink all the extraneous frequencies without completely removing the robustness of the sound. This is how some lossy compression is actually done. There are other more fun things with wavelets which would take several posts to describe and the goal is to get to lossless compression.

I hope that helps to see what lossy compression is, and that it can cause some serious harm when done without care. With care, you will still lose enough sound quality that many music aficionados avoid mp3 and digital downloads completely in favor of vinyl.

Next time we’ll tackle the seemingly paradoxical concept of lossless compression.


Thoughts on ToME’s Adventure Mode

I’ve done several posts explaining why I think roguelikes are a great genre of game to play. It is probable that the most important feature of a roguelike for me is permadeath. For example, see this post for reasons why.

If you aren’t up on roguelikes, there are only a handful of games that stand out as the “giants” that most people have heard of. One of these is called ToME (aka ToME 4; aka Tales of Maj’Eyal). There are more interesting features in ToME than could fit in a single blog post. Someday I may come back and post about these.

I’ll fully admit that my views on permadeath have evolved a bit, possibly due to my age. I think the older someone gets, the more likely they are to view losing all progress in a game as too punishing to be worth it. You tend to grow out of liking the more extreme and hardcore elements of certain games.

Anyway, I stand by my original post. I’ll recall some key points. Permadeath is a great game mechanic, because it forces you to contemplate the consequences of your actions. It gives weight to the game. It makes you become better at it in order to win. You can’t just “save scum” until you get through a particularly difficult section.

Before you take this the wrong way, ToME is possibly the most well-balanced roguelike I’ve played. Every death feels like my own fault and not me getting screwed by the randomness. But when a game involves as much randomness as any of the great classic roguelikes, you are bound to get the occasional unavoidable death that is not your fault.

This becomes more and more pronounced as a game’s design is less thoroughly vetted for imbalances. Part of ToME’s excellent balance comes from people who have put in thousands of hours of play who can spot these things. The developer takes their opinions seriously which makes the game more fair.

ToME has three modes of play: roguelike, adventure, and explore. Roguelike has traditional permadeath. Once you die, you must start the entire game over. Adventure gives you five lives. Once those five are gone, you start the game over. Explore is deathless.

The main point I’ve been contemplating is whether Adventure mode ruins the permadeath experience of a roguelike. This will be a highly controversial statement, but I think it keeps all the original greatness of the mechanic and eliminates the negative aspects.

If you only have five lives, then each one is still precious. You’ll play the early game as if you only have one life, because if you waste one early, you will probably restart anyway. This makes the beginning just as intense as if you only had one life.

Let’s put it this way. If you don’t play as if you only have one life, then you will probably quickly lose them all anyway and revert to roguelike mode. So nothing really changes. In the middle and late game, if you are really good and don’t lose any lives, then it didn’t matter anyway. If you’re like me, you’ll probably be back to one life by that point and get all the benefits of roguelike mode.

It seems to me that Adventure mode merely serves to alleviate the annoyance and waste of time that comes from getting killed in one hit by some out of depth enemy that randomly appeared due to no fault of your own. It keeps all the intensity and pressure of permadeath, but gives some much needed buffer for the extreme amount of randomness of roguelikes.

I’d be quite happy to see some other roguelikes incorporate this as an option, but I’d also be totally understanding if they saw it as a compromise on the quality of the play experience.


Texas Sharpshooter Fallacy

In the world of big data that constantly bombards us with fancy graphics, the statistical fallacy that I think we are most likely to fall for is called the Texas Sharpshooter Fallacy. What makes this fallacy so dangerous is that it is propped up by solid, correct statistics which can be hard to argue against.

Here’s the idea. A person goes into the yard and shoots their rifle at random at their barn. Maybe even say the person is drunk, so the holes have no underlying pattern to them. The person then goes to the barn and figures out a way to draw a bullseye after the fact that makes it look like they are a competent sharpshooter.

The fallacy is that if you look at a large enough amount of data with good enough visualization tools, you will probably start to find patterns that aren’t actually there by strategically drawing artificial boundaries. Let’s make the example a bit more real.

Suppose you want to better understand the causes of Disease X, something just discovered that occurs naturally in 10% of the population. You plot the data of a nearby town of 10,000 to see if you can find a pattern.

Here is the plot (I used a uniform distribution so we know any clumps have no underlying cause):


Your eye gets drawn to an oddly dense clump of cases of Disease X. You circle it and then run a statistical test to see if the number of cases is significant. You’re shocked! Your properly run statistical test shows you the increased number of cases is significant and with 95% certainty you conclude it isn’t just a fluke.

So what do you do? You start looking for causes. Of course you’ll be able to find one. Maybe that clump of houses has a power station nearby, or they drink from the same well water source or whatever. When you are looking for something in common, you’ll be able to find it.

When this happens, you’ve committed the Texas Sharpshooter Fallacy. It might be okay to use this data exploration to look for a cause if you merely intend to turn it into a hypothesis to be tested. So you hypothesize that it is radon in the water that caused the spike of cases in that cluster.

Now do real science where you do a randomized controlled study to actually test your null hypothesis. Doing statistics on big data is risky business, because any clever person can construct correlations from a large enough data set that, first, may not actually be there and, second, are almost surely not causal.

Another way to think about why this is a fallacy is that when you test at 95% certainty, about 5 out of 100 times you will falsely find correlation where none exists. So if your data set is large enough to draw 100 different boundaries, then by random chance about 5 of those will have false correlations. When you allow your eye to catch the cluster, it is your brain being good at finding patterns. It probably rejected 100 non-clusters to find that one.
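You can watch this happen with a quick simulation. This is a rough sketch under my own assumptions: I carve the town of 10,000 into 100 fixed neighborhoods of 100 people (the names `RATE`, `SIZE`, and `NEIGHBORHOODS` are mine), give every person the disease independently with the 10% base rate, and run a one-sided exact binomial test at the 5% level on each neighborhood.

```python
import random
from math import comb

RATE, SIZE, NEIGHBORHOODS = 0.10, 100, 100  # 10% base rate, 100 neighborhoods of 100 people

def upper_tail(k, n=SIZE, p=RATE):
    # Exact probability that a Binomial(n, p) count is at least k
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Smallest case count that a one-sided test at the 5% level calls "significant"
CRITICAL = next(k for k in range(SIZE + 1) if upper_tail(k) <= 0.05)

def flagged_neighborhoods(rng):
    # Every person gets the disease independently, so no neighborhood has a real cause
    counts = [sum(rng.random() < RATE for _ in range(SIZE)) for _ in range(NEIGHBORHOODS)]
    return sum(c >= CRITICAL for c in counts)

rng = random.Random(0)
print("cases needed to look significant:", CRITICAL)
print("neighborhoods flagged by pure chance:", flagged_neighborhoods(rng))
```

Because case counts are whole numbers, the false alarm rate lands a bit under 5%, so a typical run flags a handful of neighborhoods even though there is nothing to find. Letting your eye pick the boundary after the fact only makes this worse.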

This is scary in today’s world, because lots of news articles do exactly this. They claim some crazy thing, and they use statistics people don’t understand to “prove” its legitimacy (numbers can’t lie don’t you know). But really it is just this fallacy at work. The media don’t want to double check it because “Cancer rate five times higher near power station” is going to get a lot of hits and interest.

Actually, cancer is particularly susceptible to this type of fallacy. Dozens of examples of these studies getting publicity despite no actual correlation (let alone causation!) are documented in George Johnson’s (excellent) The Cancer Chronicles and in an older New Yorker article called “The Cancer-Cluster Myth.”

So the next time you read about one of these public health outcries, you should pay careful attention in the article to see if this fallacy has been made. For example, the vaccination causes autism myth also originated this way.

Probably the most egregious example is The China Study, a highly praised vegan propaganda book. It takes the largest diet study ever done (367 variables) and pulls out the correlations that support the hypothesis “meat is bad.”

What the book doesn’t tell you is that the study found over 8000 statistically significant correlations, many contradicting the ones presented in the book. This is why large studies of observational epidemiology always have to be treated with caution. The larger the study, the more likely you will be able to find a way to support your hypothesis.

If you don’t believe me, and you want to protect marriage in Maine, then make sure you eat less margarine this year:



Thoughts on Roth’s American Pastoral

The first time I read Philip Roth’s American Pastoral, I had nothing but criticism for it. I’ll try to set the stage for my first reading. It was my early undergraduate days about 10 years ago.

I had had a fairly sheltered childhood. I grew up in a highly apolitical house. At that point, I had not been of age to vote during a major election, and so the extent of my political knowledge was the ability to name the president.

Despite this, I read the book at the height of my reading career. No offense to the university I attended, but I breezed through (a perfect 4.0 finishing GPA) with almost no work. This meant I supplemented my studies by reading a lot.

By this I mean I sometimes read 2 novels a week. I read Infinite Jest and Gravity’s Rainbow during this time. I wanted to read every book anyone had ever recommended to me or had said was “unreadable” (is that a description or a challenge?).

So what were my complaints? Well, it read like realism, yet nothing struck me as realistic in the book. It seemed filled with hyperbole and extreme character overreaction. Here are a few of the things I remember saying, but there were probably more:

1. How could anyone be so upset over politics to do something so extreme?
2. How could someone’s perception of someone else be so skewed?
3. How could one event cause someone to change so radically and suddenly?
4. The pacing is too slow.
5. The second half is too bizarrely different from the first to create something coherent.

Anyway, I decided to reread it and was shocked to find how much 10 years can change your perspective. The book is a delicate portrait of how a tragedy wrecked a family’s life.

What I originally perceived as too slow of pacing turned out to be a striking dive into the psyche of a man torn by conflicting and paradoxical emotions. It tries to answer the question: How does one grapple with continuing to love someone after they have done something horrible? It is heartbreaking to witness.

What I originally thought of as radical and sudden change of a character turned out to be a perfectly natural reaction of changing values and priorities. It’s happened to me. It’s happened to people I know. It happens to everyone. With a catalyst of such magnitude as happens in the book, it doesn’t seem at all extreme to me anymore.

I’ve learned a lot about bias and the human mind since last reading the book. Now the inconceivable false perception of someone strikes a chord of truth in me.

In fact, none of my initial criticisms ring true anymore. The book presents all of these complicated human interactions and emotions in a unified, compelling story.

The thing I most love about Roth’s style (at least in the second Zuckerman trilogy) is ever present in American Pastoral. He has the ability to lead you down a somewhat illogical, yet fully natural series of thoughts to land on a beautifully constructed gem of a sentence to contemplate.

It is hard to describe or give an example, because to pull the quote out of context removes how striking it is to read in real time. I often found myself having to stop and contemplate how illuminating the paragraph was. I could always relate to a time when I had a similar thought process. I first noticed this style in The Human Stain which drove me to the other Roth novels.

Needless to say, I loved this book. At a time when our politics seem to be more divided and more extreme than ever, and the outrage and violence surrounding them have become more public (the recent Ferguson protests come to mind), a book of such introspection on the topic has only grown in importance among the ranks of American literature of the past fifty years.

