Golang Grammar Quiz vol.2: Map

Golang Grammar Quiz vol.2: Map

In the previous post, I presented a series of quizzes based on insights from the Jon Bodner's presentation at GopherCon 2018.

This time, let's dive into some intriguing grammar quizzes centered around the map type.

Are you ready? Let's get started.

Question 1

What do you think the output of the following code is?

package main

import "fmt"

func main() {
    var m map[string]int
    fmt.Println(m["foo"])
    m["foo"] = 5
    fmt.Println(m["foo"])
}

A. Does not compile

B. Panics immediately

C. Prints 0, then panics

D. Prints 0, 5

Answer and Explanation

The correct answer is C. Prints 0, then panics.

When a map is declared using the var statement, it is initialized to nil (referred to as a "nil map" by Jon Bodner).

Reading from a nil map returns the zero value of the map's value type, which, in this case is int, resulting in 0.

However, attempting to write to a nil map triggers a panic. The panic occurs because memory allocation for adding elements hasn't taken place.

Here, let's take a look at the source code of the Go compiler to see why it panics.

When the compiler is executed, the walkIndexMap function in /usr/local/go/src/cmd/compile/internal/gc/walk.go is invoked by walkExpr function that is defined in the same file. Then, walkIndexMap function checks if the map is being read (e.g. bar := m[k]) or assigned a value(e.g. m[k] = "buzz").

func walkIndexMap(n *ir.IndexExpr, init *ir.Nodes) ir.Node {
    // skip unnecessary lines
    var mapFn ir.Node
    switch {
    case n.Assigned:
        mapFn = mapfn(mapassign[fast], t, false)
    // edge case for assignment of large zero-sized value
    default:
        mapFn = mapfn(mapaccess1[fast], t, false)
    }
    // rest of function
}

In the code snippet, the walkIndexMap function calls the mapfn function with mapassign as the first argument when the map is being assigned a value. Conversely, it calls mapfn function with mapaccess1 as the first argument when the map is being read.

In /usr/local/go/src/runtime/map.go, the fucntions mapassign and mapaccess1 are defined as follows.

// Like mapaccess, but allocates a slot for the key if it is not present in the map.
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
    if h == nil {
        panic(plainError("assignment to entry in nil map"))
    }
    // rest of function
}

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
    // skip
    if h == nil || h.count == 0 {
        if t.HashMightPanic() {
            // edge case for using interface type as key
        }
        return unsafe.Pointer(&zeroVal[0])
    }
    // rest of function
}

The mapassign function examines whether the map is nil or not. If the map is nil, it triggers a panic. On the other hand, mapaccess1 essentially returns the zero value if the map is nil or empty.

This is the explanation Jon Bodner gave in his presentation.

Disclaimer: There have been changes in the Go source code since the time of Jon Bodner's presentation, so the code above is not identical to the code in his slide

Further Explanation

While exploring the reasons behind the panic, I came across a blog post by kotobuki providing additional insights into the "assignment to entry in nil map" panic.

The Jon Bodner's presentation shed light on the code-level reasons why the panic occurs, but I still didn't understand why it has to panic at all. The reason is quite simple: panic occurs because memory has not been allocated for adding elements.

When it comes to adding elements to a slice, memory does not have to be allocated because it creates a new slice with double the capacity when the capacity is not enough. And this is managed by append function.

So, how to avoid panic when adding a key-value pair to a map? The answer is to initialize the map with make function or initialize it with a map literal.

package main

import "fmt"

func main() {
    // var m map[string]int
    m := make(map[string]int) // or m := map[string]int{}
    fmt.Println(m["foo"])
    m["foo"] = 5
    fmt.Println(m["foo"])
}
// Output:
// 0
// 5

Question 2

What do you think the code below prints?

package main

import "fmt"

func main() {
    m := map[string]int{
        "D": 4, "F": 6,
        "B": 2, "A": 1,
        "C": 3, "E": 5,
    }
    var order []string
    for k, _ := range m {
        order = append(order, k)
    }
    fmt.Println(order)
}

A. Order inserted; [D F B A C E]

B. Alphabetical order; [A B C D E F]

C. Reverse alphabetical order; [F E D C B A]

D. Random order

Answer and Explanation

The correct answer is D. Random order.

Go's map type is implemented as hash table. Hash table hashes the key first, then stores the key-value pair in the bucket that corresponds to the hash value.

This way, the elements are stored based on hash values rather than the order of insertion. Consequently, the order of the elements is random.

Question 3

This one may be a bit tricky. How many different ways will Go iterate through the map in the code below?

package main

import "fmt"

func main() {
    m := map[string]int{
        "D": 4, "F": 6,
        "B": 2, "A": 1,
        "C": 3, "E": 5,
        "G": 7, "H": 8,
        "I": 9,
    }
    counts := make(map[string]int)
    for i := 0; i < 100_000; i++ {
        var order string
        for k, _ := range m {
            order += k
        }
        counts[order]++
    }
    fmt.Println(len(counts))
}

A. 1

B. between 1 and 20

C. between 20 and 10,000

D. more than 10,000

Answer and Explanation

The answer is B. between 1 and 20.

When I run the program, mostly it prints 10, and other times it can be 12 or 14

Jon Bodner explains that in the early versions of Go, the iteration order of maps was usually same as the insertion order. However, there were two problems.

  1. People started to rely on the iteration order, which is not even guaranteed.
  2. Vulnerability to Hash DoS attacks.

Hash Dos is a type of DoS attack that exploits the hash table. If you send a bunch of values that result in the same hash value, the key-value pairs will be stored in the same bucket of the table. Then, the time complexity of searching for a value in the bucket slows down to O(n) from O(1). This is called Hash DoS attack.

To solve these problems, two changes were made.

  • The runtime picks a random number as a seed every time a value gets hashed.
  • When a map is iterated in a for-range loop, it starts from a random element in a random bucket of the map.

This is the source code of the makemap function in /usr/local/go/src/runtime/map.go.

func makemap(t *maptype, hint int, h *hmap) *hmap {
    // skip
    // initialize Hmap
    if h == nil {
        h = new(hmap)
    }
    h.hash0 = fastrand()
    // rest of function

Note that hash0 is the hash seed for the map, and it is assigned a random number by the fastrand function. So, every time there is a read or write to the map, a new seed is generated.

When a map gets iterated in a for-range loop, the mapiterinit function in /usr/local/go/src/runtime/map.go is called and picks which element in which bucket to start iterating from.

func mapiterinit(t *maptype, h *hmap, it *hiter) {
    // skip
    // decide where to start
    var r uintptr
    if h.B > 31-bucketCntBits {
        r = uintptr(fastrand64()) // edge case handling for a large bucket to avoid overflow error
    } else {
        r = uintptr(fastrand())
    }
    it.startBucket = r & bucketMask(h.B)
    it.offset = uint8(r >> h.B & (bucketCnt - 1))
    // rest of function
}

This is why the possible number of different ways to iterate through the map is not the factorial of the number of elements in a map. It is just picking where to start iterating from randomly.

To wrap up this question, let's take a look at the actual iteration order of the map in the code above.

package main

import "fmt"

func main() {
    m := map[string]int{
        "D": 4, "F": 6,
        "B": 2, "A": 1,
        "C": 3, "E": 5,
        "G": 7, "H": 8,
        "I": 9,
    }
    counts := make(map[string]int)
    for i := 0; i < 100_000; i++ {
        var order string
        for k, _ := range m {
            order += k
        }
        counts[order]++
    }
    fmt.Println(len(counts))
    var i int                  // added
    for k, _ := range counts { // added
        i++                    // added
        fmt.Println(i, k)      // added
    }
}

// Output:
// 10
// 1 FACGDEHIB
// 2 ACGDFHIBE
// 3 IBEHCGDFA
// 4 BEHIGDFAC
// 5 EHIBFACGD
// 6 HIBEACGDF
// 7 CGDFAIBEH
// 8 GDFACBEHI
// 9 DFACGBEHI
// 10 BEHIDFACG

If you look closely at the output, you can see that the order is partly fixed by small chunks, like "BEHI" in the last three lines, and "ACGD" in line 1, 2, 5 and 6. This is how the iteration order is actually determined.

Question 4

This will be the last question of this post. What do you think the output of the code below is?

package main

import "fmt"

func main() {
    m := make(map[string]int)
    m["a"] = 1
    add(m)
    fmt.Println(m)
}

func add(m map[string]int) {
    m["b"] = 2
}

A. map[a:1 b:2]

B. map[a:1]

C. Does not compile

D. Panics immediately

Answer and Explanation

The correct answer is A. map[a:1 b:2].

It might feel a bit strange that the function receives the value of the map instead of the pointer, yet the map is modified after the function call.

If you have read the previous post about slice, you may well remember that slice can also be modified in a function, but we cannot add elements to it when it is the value that is passed to a function, not a pointer to the slice.

To understand this, let's compare the signatures of makemap and makeslice functions in runtime package.

func makemap(t *maptype, hint int, h *hmap) *hmap

func makeslice(et *_type, len, cap int) slice

As you can see, makemap returns a pointer to the map, while makeslice returns a slice value. This means that when the value of the map is passed to a function, it actually has access to the underlying data of the map and can modify and add elements to it.

Wrap up

We have covered some quizzes about the map type in Go in this post.

How many questions did you get right? Even if you missed all of them, it is a hundred percent fine as long as you have learned something new, which is the whole point of this post.

There are more quizzes about Go grammar from Jon Bodner's presentation, so just stay tuned for the next post. See you soon! ;)