# Chomsky

I’ve decided to start reading Chomsky’s pivotal book Syntactic Structures, since I’m into the philosophy of language thing and it was a really important work. This post is going to be silliness, but it was something I couldn’t get out of my head while attempting to read past the first couple of sentences.

“Each language has a finite number of phonemes (or letters in its alphabet) and each sentence is representable as a finite sequence of these phonemes (or letters), though there are infinitely many sentences.”

Now we encounter “free structures” all the time in math. It is perfectly legitimate to create something with an infinite number of elements just by stringing together a finite alphabet. In fact, you can impose lots of structure, as in the free group on two generators. This “language” has an alphabet of only two generators (plus their inverses), must satisfy the group axioms, and ignores triviality (words must be fully reduced), yet it still achieves an infinite number of words (not even sentences, but words).
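To make the free group example concrete, here is a minimal sketch that lists the fully reduced words of each length. The encoding is my own assumption for illustration: lowercase `a`, `b` are the two generators and uppercase `A`, `B` stand for their inverses, so a word is reduced when no letter sits next to its own inverse.

```python
# Assumed encoding: "a", "b" are the generators; "A", "B" are their inverses.
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def reduced_words(length):
    """Return all fully reduced words of the given length."""
    words = [""]  # the empty word is the identity element
    for _ in range(length):
        # Extend each word by any symbol that does not cancel its last letter.
        words = [
            w + s
            for w in words
            for s in INVERSE
            if not w or INVERSE[w[-1]] != s
        ]
    return words


# Number of reduced words of length 0, 1, 2, 3, 4.
counts = [len(reduced_words(n)) for n in range(5)]
print(counts)
```

Each reduced word can be extended in three ways (anything but the inverse of its last letter), so the count keeps growing with the length and never levels off, which is where the infinitely many words come from.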

I must argue, though, that there are not “infinitely many sentences.” I don’t think it would be controversial to claim that there are a finite number of words in a language. Take English, for example. Use the good old OED plus maybe a slang dictionary and throw in a couple thousand for good measure as an upper bound on the number of words in the language.

This number of words is huge, though finite. When we generate sentences, if we do so in the “free” way, then we clearly get an infinite number. Now, I’m not so much concerned with “grammatically” correct sentences as I am with imposing conditions on repetition. The sentence “the dog ran dog ran” is pointless. Because of repetition, I argue that there must be some upper bound on the length of the longest possible sentence (to continue the group analogy, this is like taking the free group and imposing relations, as in the presentation $\langle a \mid a^3=1 \rangle \cong \mathbb{Z}_3$).

To make this easier, let’s restrict ourselves to sentences that are not conjunctions of two complete sentences (if the former are finite in number, then so are the latter). Now a sentence can only be so long (non-conjunctively). Say you use basically every word in the language a couple of times (though I find it hard to believe that you would still have a “sentence” at that point). So now we have an upper bound of, I don’t know, a couple billion words in a sentence. This would give us on the order of a couple billion factorial sentences. This is absurdly large (and an absurd overestimate in my opinion), but still finite.
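As a rough sanity check on the finiteness claim, here is a small sketch that counts every string of words up to a maximum length, grammatical or not. It is a cruder count than the factorial guess above, not the same estimate: with $V$ words and at most $N$ words per sentence it gives $V + V^2 + \dots + V^N$. The function name and the toy numbers are made up for illustration.

```python
def count_bounded_sentences(vocab_size, max_length):
    """Count all word strings of length 1..max_length over vocab_size words."""
    return sum(vocab_size ** k for k in range(1, max_length + 1))


# Toy example: a 3-word vocabulary and sentences of at most 2 words.
# There are 3 one-word strings and 3 * 3 = 9 two-word strings.
toy_count = count_bounded_sentences(3, 2)
print(toy_count)
```

However large the vocabulary and the length cap are, the sum has finitely many terms, so the total stays finite; the whole argument therefore rests on whether such a length cap really exists.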

Despite this having zero relevance to your book, Mr. Chomsky, I must respectfully disagree with your opening lines. What does everyone else think?

## 3 thoughts on “Chomsky”

1. You make a good pointless. 🙂

Language is a tool for expressing concepts. Why study its capacity for nonsense? I guess it proves there must be a finite limit to the concepts a language can express.

Cheers,
jim

2. I think Chomsky meant that due to the recursive structure of language, there is in principle no upper bound on the length of a sentence, leaving aside the question of plausibility of a human saying or writing it. If someone lived for an infinitely long time, one can imagine his uttering an infinitely long sentence, with lots of embedded structures (“… the dog ran, and as the dog ran, he…”) and even meta-commentary on the sentence itself.

If you prefer, you can think of some of his theories as suggesting a kind of mathematical approximation to the structure of language, and hence approximate as all theories must be.

3. According to Wikipedia, the sentence (Buffalo)^n is grammatically correct for any $n \ge 1$, so I’m going to have to disagree with you there.