Korean for Programmers


I was inspired to write this after reading German for Programmers.[0] It's quite likely that I'm just biased to see things a certain way because I work with programming languages for a living, but I often find myself drawing comparisons among Korean and programming.

Hangeul Basics

Before we can get into grammar, some explanation about Hangeul–the Korean alphabet–is needed. Korean is actually like English in that letters are put together, so you can read any Korean once you learn Hangeul. However, Korean letters join together in syllable blocks.

You may have seen the infamous "Learn how to read Hangeul in 10 minutes" graphic. While Hangeul is indeed very easy to learn, it does have exceptions–quite a few actually. So maybe you can read 독립 (independence), but the actual pronunciation is different from what's written due to ㄹ following ㄱ. These are called 받침[1] rules and one just has to memorize 'em.

Grid System

Unlike Chinese or Japanese, Korean syllable blocks take the Hangeul letters and make them into grids in a particular order.

Each block contains a vowel and it's always the second character. The block are constructed depending on whether the vowel is vertical (ㅣ ㅏ ㅓ) or horizontal (ㅡㅜㅗ). If you want to say "ah", ㅇ is used as a silent character in the first slot–which makes 아: + .

Some vowels can also be combined: is + , is + .

For Korean fonts, all finished blocks (e.g. 쀍: ㅃㅜㅔㄹㄱ) have to be included because it's possible to construct them via grid rules, but a fair amount are just gibberish. Hangeul is represented by 5 unicode blocks in total and I recommend reading about how they get composed together.[2]

Subject Elision, especially -you-

In English, we (almost) always need to specify the subject. I can't write just "Can't write." or "Ate an apple.". In Korean however, the subject isn't necessary in every sentence. In fact, always including it isn't natural at all! Korean is highly contextual, so if you say:

먹었어요 – literally: ate. --> interpreted as: "I ate"

In a lot of contexts, that would make most sense as (I) ate. However if someone asks you, "Did Charlie eat?", you can still say:

(네,) 먹었어요. – (yes,) ate. --> "Yes, Charlie ate."

Similarly if you ask someone else a question, it's usually obvious if they are the subject.

이 책을 읽었어요? – lit: this book read? --> "Did you read this book?"

It makes little sense to ask "Did I read this book?" in most situations. (If you want to muse to yourself like "ah, did I (already) read this book..?", that requires a specific grammar point.)

If you're wondering, people do clarify from time to time.

"먹었어요?" - ate? --> "(Did you) eat?"

"저요? / 저는요?" --> "(were you referring to) Me?" / "As for me?"


The subject/topic still has to be included when it changes or it's ambiguous. If you start off by saying "저는 ..." ("as for me, ..."), it's implied the following sentences will still be about you.

In this sense I guess one could say Korean leans toward being duck-ly typed–but that's a tenuous comparison. Anyway, it's different from English where the subject almost always must be specified (strongly typed...?), except in some spoken slang. Even very informal slang like "d'ya eat?" still includes the subject. We're playing a bit fast and loose here with the comparisons.


A word of caution: Using the direct words for "you" (너, 네가, 당신) is also quite rude unless you are close with someone. Papago and google translate often translate "you" to 너 or 네가. Don't use these until you know what you're doing. 당신 has different nuances and should also be avoided. It's really hard for English speakers to let go of "you", but you must.

Sentences & Grammar

Sentence Order

English is an SVO (Subject Verb Object) language. "Billy ate an apple". Korean is an SOV language. "Billy apple ate". This reversal makes thinking in Korean actually quite difficult once one gets to causation and more difficult sentences.

English: I went to the park because it was sunny out today. Korean: It was sunny out today so I went to the park.

English: I was so tired I could barely walk. Korean: To the extent that I could barely walk I was so tired.


From the perspective of a native English speaker, I find this reversal utterly fascinating. With English the last bit of the sentence usually has the important bit:

"Do you know where the bathroom is?"

You basically have to finish the whole sentence there before the other person can understand what you intend. Now with Korean:

화장실이 어디에 있는지 아세요?

I often can barely get this out in time before I start getting an answer.

Finally, a sentence

We made it! Let's start with a simple sample sentence in Korean.

저는 사과를 어제 먹었어요.

저(는) 사과(를) 어제 어요 .

subject + marker object + marker adverb verb stem past tense modifier politness conjugation .

There's already a lot going on here. The first difference you may notice is the markers. In English, one has to identify what the subject is. In Korean, topics and subjects are marked with 은/는 or 이/가, respectively. Objects get marked with 을/를. Just like the subject, these are not always necessary and get dropped in speech and texting all the time.

Markers

Korean has a number of postpositional particles that imbue meaning, some of which vary if the last character is a vowel or not. We can easily use a pure function to model this:

fn topic_marker(word: &str) -> &str {
  match word.last_character_type() {
    Vowel => "는",
    Consonant => "은"
  }
}

fn plural_marker() -> &str {
  "들"
}

Here are a few other markers (with simplified definitions). The text inside the parentheses is used if the last character is a consonant:

  • ~에서 => from
  • ~까지 => until
  • ~(으)로 => several meanings. roughly-speaking it shows how/via what method or material something is carried out, or "toward" a place if used with "to go", etc.

Pipelining Data Transformations

Where the previous sentence started to get interesting was at the end, with the verb, tense, and politeness. That's not all we can transform though. The next two sentences here are "I closed the door" (informal impolite, then informal polite), after is "My parents closed the door" (formal, polite).

나는 그것을 .

저는 그것을 어요 .

우리 부모님은 그것을 으시 습니다 . (Note that I'm showing here separately to break down the components but they would get merged to 셨)

The main verb here is 닫다–to close. All Korean verbs end in 다, so the first thing we need to do is get the stem by removing .

Now we need the honorific level. In the first two sentences I'm talking about myself, so I can't use honorifics. However when the actor is "my parents" it's common to use honorifics, which is ~(으)시.

Before we can add a tense, we need to determine what vowel to add, and if it should merge or not. 닫 ends in a consonant, so merging is not possible, but the last vowel is 아 so we append . (If the verb stem ends in a vowel, like , the following vowel would just get merged into )

One way to do past tense is ~ㅆ, which gets merged with the previous vowel. Other tenses can depend on the last character being a vowel or not, like future tense (~ㄹ/을).

Finally, we need to think about our relationship to the audience and append or merge a politeness/speech level. See politeness below for more details.

We can model this as a basic pipeline à la Clojure:

(defn conjugate-verb
  [subject verb speaker audience]
  (->> verb
    (remove-stem)
    (maybe-append-honorific subject)
    (append-or-merge-vowel)
    (append-or-merge-tense)
    (append-or-merge-politeness-level speaker audience)
  )
)

Note that there are even more transformations that we can apply depending on the grammar point and the nuance of what one is trying to say. Fun, isn't it? At least it's relatively consistent unlike English–even irregular verb rules are fairly regular.

Adding to the Stack

In linguistics, nominalization or nominalisation is the use of a word which is not a noun (e.g., a verb, an adjective or an adverb) as a noun, ... The term refers, for instance, to the process of producing a noun from another part of speech by adding a derivational affix (e.g., the noun legalization from the verb legalize).

In Korean, one can nominalize entire clauses and use them in other constructs! Korean lets you do this with the ~는 것 principle. 것 means thing, but any noun can be used in place of . Based on the tense, verb type, and whether the verb ends in a vowel, has variations like ~ㄴ, and . It can also combine with other grammar forms, like , which is 더~ + ~ㄴ/은. Digging into these would be beyond the scope of this post.

To give you a small example, one way to say "exit" in Korean is literally "going out place", or "a place that one goes out".

Source: Wikipedia 나가는 곳

나가다 means "to go out". means place. A 나가 is a going out place .

Note that 나가다 is a pure Korean word. To the right is 出口 (pronounced 출구), which also means exit, and is derived from classical Chinese. Just like English has Latin and Greek influences, Korean has Chinese and to a small extent, Japanese influences.

Sentence the First

So, let's take 'the girl walked to school'. In English and Korean this is straightfoward enough:

여자는 학교 걸어 갔어요 The girl walked to school

But what if you wanted to talk about that person? You could say "the girl who walked to school". In English, this these are known as relative causes. They can begin with who, which, that, where, etc, following the noun. Korean uses the ~는 것 nominalizer before the noun, which leads to:

학교 걸어 여자 !

Note that changed to . is 가다 (to go) + ~ㅆ (past tense). But the past tense nominalization form uses ~ㄴ (것). Instead of 것 (thing) we swapped it for another noun 여자 (woman).

Not that one would only say "the girl who walked to school" by itself, but we can now use the entire construct as a noun in other sentences:

저는 학교로 걸어 간 여자를 알았어요 I the girl who walked to school knew

Sentence the Second

We can try a more complex sentence now: "That's the place (that) I thought I went to!". First, we need to break it down in a sentence can that be nominalized. "I thought I went". Here we can use the grammar point ~ㄴ 줄 알다 which when used means the speaker thought something was true, but realized it wasn't–due to a lapse in judgement, etc.

제가 간 줄 알다 I thought (that) I went

You may have noticed that this grammar point itself uses ~ㄴ 것, but with instead of 것! This 줄 is a bound noun, meaning that it can only be described by a ~는 것 clause. Outside of ~ㄴ 줄 알다, 줄 can also be a regular noun meaning line/rope.

그곳은 제가 어디에 간 줄 알았 이야! That is the place I thought I went to!

Sentence the Third

Can we go even deeper?

[나는] ((( 상황이 억울하다고 말하는 불평불만 ) 만 하는 사람은 ) ( 별로 좋아하 지 않는다 ).

[I'm] ( not keen on ) ( people who only ( complain that things are unfair )).

This sentence doesn't really translate 1:1 to English, as is the case with most intermediate/advanced Korean sentences.


Nominalizing with ~는 것 is my favorite aspect of Korean because it's an important grammar point that blew my mind once I learned how it worked. It's quite commonly used as well. In day to day usage I might say something like the house (that) I used to live (in) –  제가 았던 , et cetera.

Language Tidbits

These are some cool traits about Korean, or things related to this post, that don't necessarily have to deal with programming.

Politeness / Formality

The Korean language conjugates differently based on the status of speaker and intended audience. For example, one of the simplest ways to conjugate any verb is to add ~어/아/여 to it. This is based on the last vowel, not the last character.

For example, you may have seen 감사합니다 before ("thank you", formal polite). This is 감사하다, merged with ~ㅂ니다 because ends in a vowel. 고마워요 is another way to say thank you(informal polite): 고맙다 + apply irregular ㅂ consonant ending filter + ~아/어/여요.

fn get_vowel_for_verb(verb: &str, formality: Formality) -> {
  // ha = 하
  if verb.stem_ends_with_ha() {
    "여"
  } else {
    match verb.last_vowel() {
      "아" => "아"
      "오" => "아"
      "어" => "어"
      "우" => "어"
      "이" => "어"
      "의" => "어"
      "위" => "어"
    }
  }
}

Korean has seven speech levels [4]. When learning Korean, the 아/어/여요 and ~ㅂ/습니다 levels are commonly used, in that order. Using 아/어/여 (no ) to anyone other than close friends (who have agreed to use lowered speech) or young kids is rude. Foreigners get a pass at first but it's still impolite.

Plain (sometimes known as diary) form is also used, such as in diaries and books/novels.

English lacks this concept, as we use the same conjugation for everyone – "the prisoner ate", "the king ate", and "a God ate". What English does have is different registers[5], such as when you text versus when you write an academic paper or a business email. This includes overly polite language like "Might you be interested in eating, sir?", but nevertheless the verb remains the same.

Quoting Statements

Quoting plain statements in Korean is very easy. All you need to do is take the sentence, conjugate the verb into plain form, and append ~고 (말)하다. For verbs, ~ㄴ/는다 is the plain form. For adjectives, it's just , or the base verb.

(저는) 먹었어요. (제가) 먹었 다고 말했다 – I ate. I said I ate.

Depending on the type of statement, different particles than 다 are used.

  • declarative =>
  • inquisitive =>
  • propositive (let's ...) =>
  • imperative or =>
  • declarative with 이다 (to be) as the verb =>

이것을 좋아하다고? – (You said that) you like this?

Since this is used so much in speech, the (말)하다 (to say) part is often omitted. If you're learning Korean, expect to hear this a lot from natives because Korean pronunciation is tough.

Lack of Romanization

Why does this article lack romanization? Because romanization is bad. English and Korean sounds do not map neatly to one another. The issue here is that Korean learners mentally map {some english sound} => {some hangeul letter} and it hurts their pronunciation skills immensely. For example, you may see this in beginner resources:

d = ㄷ

Except this is wildly wrong because only ㄷ is ㄷ. It is close to d, in the same sense that fresh water is close to salt water. Close enough, right? If you're learning Korean, listen to videos that teach the sound, not the most approximte English letter.

I also have seen people write things like anyeonghasaeyo jal jinaeyo? or similar, and it hurts my brain and heart trying to read it.

Furthermore, romanization systems can change over time. 조 used to be romanized as Cho, now it's Jo. So when I read older books that have romanized Korean it forces me to go and learn the older system as well. 조 isn't Jo or Cho anyway... it's somewhere in between.

Contributors

  • Article – Andrew Zah
  • Editing, sentence suggestions – 웁스
Have any observations, criticisms, or corrections? Feel free to email me. Have a nice day!