It Is Far Easier Just To Learn The Basics


In this post I am going to prove to you that it is better to learn a bit of many different languages than to learn one or a few languages to fluency, if what you care about is understanding the words being said in the world in general. There are a lot of statistics written about it but it’s just common sense. Let me illustrate it for you in a way that lets you check it for yourself.

Imagine a universe where all communication is done by using 8 simple words. Let’s say they are:

  • run
  • fight
  • eat
  • sex
  • pee
  • sleep
  • hide
  • wash

Well, that seems like something that most life forms on Earth do most of the time anyway.

Let’s just suppose that whenever one individual wants to do one of those things he just conveys this information by saying this word. Saying it could get you something (for example, food) and even if you don’t need something to do it, it is necessary to get the approval of the community to do all of these things thus you need to say it anyhow.

Let’s then look into an average individual. We would give him a name but that would mean having an extra word so we can’t. Unless we were to name him something like pee but that would add some confusion to the language. A day wouldn’t be a good time scale so we will be using a week. Let’s just say that he does these things:

  • he has to run somewhere once a day and that is 7 times a week
  • he has to fight someone about 1 time per week (pretty peaceful isn’t it?)
  • he does eat three times a day that would be three times seven or 21 times a week
  • he has sex say 2 times a week (nature still has to somehow trick those individuals into reproduction but doing it too often would leave less time for finding food or running from predators)
  • he does pee two times a day and that is two times seven or 14 times per week
  • he does sleep once a day so that is 7 times per week
  • he has to hide from predators 3 times a week
  • he has to wash himself almost every day but not quite so it comes out as 5 times per week

Let’s now add all of those together. Let’s see how many things he does and thus how many words he says per week. I’ll be completely transparent about my math: 7 + 1 + 21 + 2 + 14 + 7 + 3 + 5 = 60

Let’s now arrange our things in the order of frequency and count what percent of the whole words said every week they make:

  1. eat: 21 (21/60 * 100% = 35%)
  2. pee: 14 (14/60 * 100% ≈ 23.3%)
  3. sleep: 7 (7/60 * 100% ≈ 11.7%)
  4. run: 7 (7/60 * 100% ≈ 11.7%)
  5. wash: 5 (5/60 * 100% ≈ 8.3%)
  6. hide: 3 (3/60 * 100% = 5%)
  7. sex: 2 (2/60 * 100% ≈ 3.3%)
  8. fight: 1 (1/60 * 100% ≈ 1.7%)
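If you would rather check these numbers yourself than trust my arithmetic, here is a minimal Python sketch of the same calculation. The counts come straight from the list above; the names `weekly_counts`, `ranked` and `shares` are just ones I made up for illustration.

```python
# Weekly counts taken from the post; variable names are illustrative only.
weekly_counts = {
    "run": 7, "fight": 1, "eat": 21, "sex": 2,
    "pee": 14, "sleep": 7, "hide": 3, "wash": 5,
}

# Total number of words said per week.
total = sum(weekly_counts.values())

# Sort from the most used word to the least used word.
ranked = sorted(weekly_counts.items(), key=lambda item: item[1], reverse=True)

# Percentage of all weekly words that each word accounts for.
shares = {word: 100 * count / total for word, count in ranked}

for word, count in ranked:
    print(f"{word}: {count} ({shares[word]:.1f}%)")
```

Changing the counts in `weekly_counts` is an easy way to see how the ranking shifts for a different imaginary individual.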

If that’s true then we say the word eat 35% of the time while we say the word fight only approximately 1.7% of the time. Thus out of every hundred words any average individual says, thirty-five of them are eat and roughly two are fight.

Now imagine we would like to learn this language by learning one word at a time. Also imagine that you are lucky enough to have this data and you are smart enough to look at this data before learning the words so you learn them in order from the most used to the least used. Here I have graphed how much of the total vocabulary you know after having learned each individual word.

The percentages of acquired vocabulary: used versus total vocabulary

Look at it. The red line shows how much of the vocabulary you have learnt by its frequency and the blue line shows the number of the vocabulary learned by its count. The green bar shows the difference.

If you know the words eat and pee you know about 58.3%, that is over half, of the vocabulary as it is actually used in the language (while you only know 2/8 * 100% = 25%, that is only one fourth, of the total vocabulary)! The more words a language has, the slower the blue line grows, while that is not necessarily true for the red line.
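The two lines and the gap between them can be recomputed with a short Python sketch. It assumes you learn the words in order of frequency, as described above; the names `usage_coverage`, `vocab_coverage` and `gaps` are my own labels and not part of the original charts.

```python
from itertools import accumulate

# Weekly counts in learning order: eat, pee, sleep, run, wash, hide, sex, fight.
counts = [21, 14, 7, 7, 5, 3, 2, 1]
total = sum(counts)

# Percent of everything said that you understand after learning the first k words.
usage_coverage = [100 * c / total for c in accumulate(counts)]

# Percent of the vocabulary list you have learned after k words.
vocab_coverage = [100 * k / len(counts) for k in range(1, len(counts) + 1)]

# Difference between the two percentages after each word.
gaps = [used - vocab for used, vocab in zip(usage_coverage, vocab_coverage)]

for k, (used, vocab, gap) in enumerate(zip(usage_coverage, vocab_coverage, gaps), start=1):
    print(f"{k} words: {used:.1f}% of usage, {vocab:.1f}% of vocabulary, gap {gap:.1f}")
```

With these counts the gap is largest after the second word, and usage coverage first passes 80% at the fourth word, which is half of the vocabulary.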

Another interesting thing is that this graph does not follow the 80/20 rule. You know the rule: it suggests that you reap 80% of the benefits from only 20% of the work. Well, it is not quite true in this graph: here it should be the 80/50 rule instead, because it takes four of the eight words to pass 80% of what is said. However, it is true that you achieve the highest efficiency at around 20 percent (25% actually) so perhaps the rule does do some good.

Here is another chart with the difference between the two percentages:

The difference between the percentages of acquired vocabulary: used versus total vocabulary

Here you can see the same thing but more clearly: the graph goes up in the beginning, peaks at the second word and then goes down fast. It is conceivable that most human languages do not peak as early because there are a lot of words in these languages and the differences in the frequency of their usage are not as drastic as shown here. Here is the last graph to illustrate the changing returns directly to you:

Your returns of learning each extra word

You can again see how you get a big edge by learning the first word because it is used 35% of the time yet it constitutes only 12.5% of the vocabulary (35% – 12.5% = 22.5%), and you still keep that positive with the second word because it is used 23.3% of the time and still counts for another 12.5% of the vocabulary (23.3% – 12.5% = 10.8%). Yet you suddenly get negative returns from the third word because it is used 11.7% of the time and still takes up 12.5% of the vocabulary. Thus you can increase your edge (although not as drastically) with the second word and then you begin decreasing it with the third word. Ideally, if you were very worried about efficiency, you would only be learning the first two words of this imaginary language.
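The return on each extra word is just its share of usage minus the 12.5% of the vocabulary it takes up. Here is a small Python sketch of that subtraction; `marginal_edge` and `vocab_step` are names I invented for the example.

```python
# Words in learning order with their weekly counts from the post.
words = ["eat", "pee", "sleep", "run", "wash", "hide", "sex", "fight"]
counts = [21, 14, 7, 7, 5, 3, 2, 1]
total = sum(counts)

# Every word is one eighth of the vocabulary, i.e. 12.5 percentage points.
vocab_step = 100 / len(words)

# Edge gained (or lost) by learning each word: usage share minus vocabulary share.
marginal_edge = {
    word: 100 * count / total - vocab_step
    for word, count in zip(words, counts)
}

for word, edge in marginal_edge.items():
    print(f"{word}: {edge:+.1f} percentage points")
```

Only eat and pee come out positive; every word from sleep onward is used less often than its one-eighth share of the vocabulary.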

This all means that you only need to learn 12.5% of the vocabulary to move from understanding nothing to understanding 35% (that is roughly a third) of the spoken language, while it takes another 37.5% of the vocabulary to reach 81.7% (that is roughly 80%), and then it takes the remaining 50% of the vocabulary to understand the last 18.3% (roughly 20%) of the spoken language.

What is the conclusion? The conclusion is that it takes relatively little effort to learn to understand the first large share of any language you are learning, and it gets harder and harder after that.

Thus, take this hypothetical situation: there are four different tribes living in the world, with 4 different languages spoken and completely different words for each of these 8 things. Suppose that you have time to learn 8 words in total. Let’s compare two strategies you could take here. You could choose to learn all 8 words of the same language. If you do that, you would be able to understand 100% / 4 = 25% of the words said in the world (assuming that all four tribes are of the same size and speak with the same frequency…). Another strategy you could take would be to learn two words of each language. If you chose this strategy you would then be able to understand 58.3% * 4 / 4 = 58.3% of the words said in the world. In the first case, you would be able to understand only one fourth of what the whole population of the world says and in the second case you would get more than a half.
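Here is a minimal Python sketch of the two strategies. It relies on the same assumptions as the paragraph above (four tribes of equal size, each using its eight words with the same frequencies) and on always learning the most used words first. The function `world_coverage` and its parameters are hypothetical names of mine.

```python
# Weekly counts, most used first; assumed identical for every tribe.
counts = [21, 14, 7, 7, 5, 3, 2, 1]
n_tribes = 4   # tribes of equal size, one language each
budget = 8     # total number of words you have time to learn


def world_coverage(words_per_language, languages_learned):
    """Percent of all words said in the world that you would understand."""
    if words_per_language * languages_learned > budget:
        raise ValueError("this plan needs more words than the budget allows")
    # Share of one language's usage covered by its most frequent words.
    per_language = sum(counts[:words_per_language]) / sum(counts)
    # Each learned language contributes an equal slice of the world's speech.
    return 100 * per_language * languages_learned / n_tribes


one_language = world_coverage(words_per_language=8, languages_learned=1)
spread_out = world_coverage(words_per_language=2, languages_learned=4)

print(f"all 8 words of one language: {one_language:.1f}%")
print(f"2 words of each language:    {spread_out:.1f}%")
```

You could also try the in-between plan, `world_coverage(4, 2)`, to see how four words in each of two languages compares.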

Of course, it is true that this only counts if you care about understanding in general. If you live in one of those communities and you have to be able to communicate to get approval to do each of those things then your best bet is to learn all of the words of this community because otherwise you will not be able to, for example, get approval to sleep or whatnot. However, if you do not happen to live there and you just want to get to know them all a bit then the second strategy is clearly superior.

Thus if you are interested in understanding what the world is talking about in general it is a better strategy to learn a bit of a lot of different languages than to learn one or a few languages to fluency.

Finally, let’s answer one last question: does this apply to real languages spoken in the world as opposed to imaginary languages in imaginary universes? Well, the whole hypothesis is based on the notion that different words are not spoken with the same frequency. If this presupposition is true then, following logic, the conclusion must be true at least to some extent (the extent depending on how uneven the word frequencies are). Now if you believe that the word “have” and the word “syzygy” are used with different frequencies then you believe that our presupposition is true and that this conclusion also holds for the languages of the world! I rest my case.


5 Comments

  1. Anonymous
    ·

    You went through all that effort just to prove the retarded point that you should learn words like eat, pee, and fight (or any arbitrary number of high-frequency words) in many languages? Seriously? Don't take me wrong, I like your blog, but I already regret the 10 seconds it took me to skim through that article. How long did you write on that? Don't you have anything better to do? That comes off as pretty aggressive, sorry, but it's a stupid argument built on a stupid basis, and I'm used to better content on this blog.

  2. Max
    ·

    Wow, that must be the first time ever in the history of the internet, that such an aggressive post has been received so calmly and with such an open mind :)
    Again, I need to apologize for the aggressiveness, one shouldn't write when the blood is boiling ;)
    Keep up the good work!


  3. ·

    I don’t think this was dumb at all, though I think a really great opportunity was missed.

    The most important thing that should be said, of course, is that no language is made up of only 8 words… that’s just convenient for an example. No one is going to learn a handful of words and suddenly be useful in that language.

    However, it’s not hard to find frequency lists for vocabulary. In fact, it’s the use of a frequency list for Russian that I believe helped me the most on my goal to being fluent so quickly in such a difficult language.

    For those who aren’t familiar, a frequency list is simply a list of the most commonly used words in a language, ranked by how often they appear. The data is usually based on a broad sample of data including books, newspapers, transcribed speeches, etc.

    A frequency list for English, for instance, might start out with “the, a/an, it, he, my, I, you, where, how, love, us” etc.

    While you’re not going to be useful after learning even the first 10 words, I could make a strong case for the first 100. And it’s reasonable to say that knowing the most common 1,000 words in a language will make you conversational (within reason) and equip you to understand 40% or more of what you encounter on a daily basis.


  4. ·

    I agree with the premise, but not the method. What I would do is learn through context, by listening to songs and reading stuff and so on, and then think if the word is a word you’d likely use (for example, when I listen to a Dutch rap song I reason that ‘stang’ [bar] would be less useful than ‘boos’ [angry] because I do have emotions but am not a drinker). I did learn a list of the top few dozen Spanish verbs when I was towards the beginning of my Spanish study and it did help me quite a lot, but I would limit it to very early learning.

    I also wouldn’t agree that a superficial understanding of lots of languages is better than a deep understanding of a few. What if you want to read literature in a language? Or study at a university in the country? Or are marrying a speaker? There are very good reasons for in-depth language study. Furthermore, how do you know when to stop? How conversationally proficient is enough? Language skill exists on a continuum. I guess you’re advocating learning a lot of languages to B1 level, but even that requires quite a lot of work.


  5. ·

    I think all of your points are legit.

    The reality is that there are just a lot of possible reasons somebody might have for learning a language and there are a lot of different outcomes that they might wish for depending on the particular reason.

    Posts like mine are attempts to squeeze the complex spectrum of human experiences into a single categorical suggestion, and since life doesn’t work like that my suggestion shouldn’t be taken too seriously.
