The Slice in Golang Can Be Tricky

Always be cautious about the slice in Golang since it’s not so reference-like as a “reference type”.

Slice is a Reference Type

According to The Go Programming Language by Alan A. A. Donovan and Brian W. Kernighan, different from array which is an aggregate type, slice is a reference type because when you pass a Slice to a function, you may change its underlying array indirectly and it can be seen by other Slice variables sharing the same underlying array.

However, unlike the reference type Array in Java, if you change the length of a Slice by passing it to a function, the change of len and cap will not be seen out side of the function. It’s easy to understand if you look into the source code of it.

Source Code of Slice Type

Let’s look into the source code of Slice type in runtime/slice.go:

package runtime

import (
	"internal/abi"
	"internal/goarch"
	"runtime/internal/math"
	"runtime/internal/sys"
	"unsafe"
)

type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}

// ... ...

We can see the built-in type slice is defined as a struct with three fields: a pointer to the underlying array, and two integers len and cap.

One Case

package main

import "fmt"

func main() {
	s := []int{1, 2, 3, 2, 4}
	del(s, 2)
	fmt.Println(s)
}

func del(nums []int, target int) {
	for i := len(nums) - 1; i >= 0; i-- {
		if nums[i] == target {
			nums = append(nums[:i], nums[i+1:]...)
		}
	}
}

This program trys to delete targets from an integer slice.

The output is “[1 3 4 4 4]” however, since the underlying array was changed twice from {1, 2, 3, 2, 4} to {1, 2, 3, 4, 4} and then {1, 3, 4, 4, 4} while len(s) and cap(s) were never changed in main()

In order to make the function del() work as expected (outputing “[1, 3, 4]” in this case), one solution is to pass the pointer of the slice.

Solution 1:

package main

import "fmt"

func main() {
	s := []int{1, 2, 3, 2, 4}
	del(&s, 2)  // pass the pointer of the slice
	fmt.Println(s)
}

func del(nums *[]int, target int) {
	for i := len(*nums) - 1; i >= 0; i-- {
		if (*nums)[i] == target {
			*nums = append((*nums)[:i], (*nums)[i+1:]...)
		}
	}
}

Another solution is to return the updated slice and store it to the original slice variable.

Solution 2:

package main

import "fmt"

func main() {
	s := []int{1, 2, 3, 2, 4}
	s = del(s, 2)  // update the slice variable
	fmt.Println(s)
}

func del(nums []int, target int) []int {
	for i := len(nums) - 1; i >= 0; i-- {
		if nums[i] == target {
			nums = append(nums[:i], nums[i+1:]...)
		}
	}
	return nums
}

Be careful when you append()

In practice, the second solution above is more common and also more graceful.

The built-in function append() also takes the second way:

The append built-in function appends elements to the end of a slice. If it has sufficient capacity, the destination is resliced to accommodate the new elements. If it does not, a new underlying array will be allocated.

Append returns the updated slice. It is therefore necessary to store the result of append, often in the variable holding the slice itself:
slice = append(slice, elem1, elem2)
slice = append(slice, anotherSlice...)

As a special case, it is legal to append a string to a byte slice, like this:
slice = append([]byte("hello "), "world"...)

func append(slice []Type, elems ...Type) []Type

As the documentation says, if the slice has no enough capacity, a new underlying array will be allocated, otherwise the appended values will just be assigned to next vacancy of the original arrays with length +1 simply.

Therefore, never assume whether the underlying array is the same one or a different one after using append() or other function that may change the length or capacity of a slice or make it refer to a different underlying array.

We can’t assume that the original slice refers to the same array asthe resulting slice, nor that it refers to a different one. Similarly, we must not assume that operations on elements of the old slice will (or will not) be reflected in the new slice.

As a result, it’s usual to assign the result of a call to append to the same slice variable whose value we passed to append.

–– The Go Programming Language (Alan A. A. Donovan and Brian W. Kernighan)

Do not be assumptive about the underlying array of a slice.

Example (LeetCode No.39)

In the LeetCode Problem No.39 Combination Sum, we are asked to find all possible combinations of numbers that add up to a target number.

Here is a common solution using backtracking to find the combinations recursively, which contains a bug though:

func combinationSum(candidates []int, target int) [][]int {
    var ans [][]int
    sort.Ints(candidates)
    var dfs func(sum int, combination []int, candidates []int)
    dfs = func(sum int, combination []int, candidates []int) {
        if sum == target {
            ans = append(ans, combination)
            return
        }
        if sum > target {
            return
        }
        for i := range candidates {
            combination := append(combination, candidates[i])
            dfs(sum + candidates[i], combination, candidates[i:])
        }
    }
    dfs(0, nil, candidates)
    return ans
}

This solution failed to pass all the test cases on LeetCode, because the combination slice is changed in the recursive function dfs() and the change is sometimes reflected in the ans slice.

Solution 1

In the 14th line, make sure that it allocates a new slice with a new underlying array to pass to the recursive function dfs().

This is a simple way to assure a new underlying array with the same content.

newCombination := append([]int{}, combination...)

func combinationSum(candidates []int, target int) [][]int {
    var ans [][]int
    sort.Ints(candidates)
    var dfs func(sum int, combination []int, candidates []int)
    dfs = func(sum int, combination []int, candidates []int) {
        if sum == target {
            ans = append(ans, combination)
            return
        }
        if sum > target {
            return
        }
        for i := range candidates {
            newCombination := append([]int{}, combination...)
            newCombination = append(newCombination, candidates[i])
            dfs(sum + candidates[i], newCombination, candidates[i:])
        }
    }
    dfs(0, nil, candidates)
    return ans
}

Solution 2

In the 7th line, we can also use this method to make sure that the combination appended to the ans slice is a new one.

func combinationSum(candidates []int, target int) [][]int {
    var ans [][]int
    sort.Ints(candidates)
    var dfs func(sum int, combination []int, candidates []int)
    dfs = func(sum int, combination []int, candidates []int) {
        if sum == target {
            ans = append(ans, append([]int{}, combination...))
            return
        }
        if sum > target {
            return
        }
        for i := range candidates {
            combination := append(combination, candidates[i])
            dfs(sum + candidates[i], combination, candidates[i:])
        }
    }
    dfs(0, nil, candidates)
    return ans
}

Appendix: How Slice Grows

The slice growing strategy is in thegrowslice() function in runtime/slice.go:

// growslice allocates new backing store for a slice.
//
// arguments:
//
//	oldPtr = pointer to the slice's backing array
//	newLen = new length (= oldLen + num)
//	oldCap = original slice's capacity.
//	   num = number of elements being added
//	    et = element type
//
// return values:
//
//	newPtr = pointer to the new backing store
//	newLen = same value as the argument
//	newCap = capacity of the new backing store
//
// Requires that uint(newLen) > uint(oldCap).
// Assumes the original slice length is newLen - num
//
// A new backing store is allocated with space for at least newLen elements.
// Existing entries [0, oldLen) are copied over to the new backing store.
// Added entries [oldLen, newLen) are not initialized by growslice
// (although for pointer-containing element types, they are zeroed). They
// must be initialized by the caller.
// Trailing entries [newLen, newCap) are zeroed.
//
// growslice's odd calling convention makes the generated code that calls
// this function simpler. In particular, it accepts and returns the
// new length so that the old length is not live (does not need to be
// spilled/restored) and the new length is returned (also does not need
// to be spilled/restored).
func growslice(oldPtr unsafe.Pointer, newLen, oldCap, num int, et *_type) slice {
	oldLen := newLen - num
	if raceenabled {
		callerpc := getcallerpc()
		racereadrangepc(oldPtr, uintptr(oldLen*int(et.size)), callerpc, abi.FuncPCABIInternal(growslice))
	}
	if msanenabled {
		msanread(oldPtr, uintptr(oldLen*int(et.size)))
	}
	if asanenabled {
		asanread(oldPtr, uintptr(oldLen*int(et.size)))
	}

	if newLen < 0 {
		panic(errorString("growslice: len out of range"))
	}

	if et.size == 0 {
		// append should not create a slice with nil pointer but non-zero len.
		// We assume that append doesn't need to preserve oldPtr in this case.
		return slice{unsafe.Pointer(&zerobase), newLen, newLen}
	}

	newcap := oldCap
	doublecap := newcap + newcap
	if newLen > doublecap {
		newcap = newLen
	} else {
		const threshold = 256
		if oldCap < threshold {
			newcap = doublecap
		} else {
			// Check 0 < newcap to detect overflow
			// and prevent an infinite loop.
			for 0 < newcap && newcap < newLen {
				// Transition from growing 2x for small slices
				// to growing 1.25x for large slices. This formula
				// gives a smooth-ish transition between the two.
				newcap += (newcap + 3*threshold) / 4
			}
			// Set newcap to the requested cap when
			// the newcap calculation overflowed.
			if newcap <= 0 {
				newcap = newLen
			}
		}
	}

	var overflow bool
	var lenmem, newlenmem, capmem uintptr
	// Specialize for common values of et.size.
	// For 1 we don't need any division/multiplication.
	// For goarch.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
	// For powers of 2, use a variable shift.
	switch {
	case et.size == 1:
		lenmem = uintptr(oldLen)
		newlenmem = uintptr(newLen)
		capmem = roundupsize(uintptr(newcap))
		overflow = uintptr(newcap) > maxAlloc
		newcap = int(capmem)
	case et.size == goarch.PtrSize:
		lenmem = uintptr(oldLen) * goarch.PtrSize
		newlenmem = uintptr(newLen) * goarch.PtrSize
		capmem = roundupsize(uintptr(newcap) * goarch.PtrSize)
		overflow = uintptr(newcap) > maxAlloc/goarch.PtrSize
		newcap = int(capmem / goarch.PtrSize)
	case isPowerOfTwo(et.size):
		var shift uintptr
		if goarch.PtrSize == 8 {
			// Mask shift for better code generation.
			shift = uintptr(sys.TrailingZeros64(uint64(et.size))) & 63
		} else {
			shift = uintptr(sys.TrailingZeros32(uint32(et.size))) & 31
		}
		lenmem = uintptr(oldLen) << shift
		newlenmem = uintptr(newLen) << shift
		capmem = roundupsize(uintptr(newcap) << shift)
		overflow = uintptr(newcap) > (maxAlloc >> shift)
		newcap = int(capmem >> shift)
		capmem = uintptr(newcap) << shift
	default:
		lenmem = uintptr(oldLen) * et.size
		newlenmem = uintptr(newLen) * et.size
		capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
		capmem = roundupsize(capmem)
		newcap = int(capmem / et.size)
		capmem = uintptr(newcap) * et.size
	}

	// The check of overflow in addition to capmem > maxAlloc is needed
	// to prevent an overflow which can be used to trigger a segfault
	// on 32bit architectures with this example program:
	//
	// type T [1<<27 + 1]int64
	//
	// var d T
	// var s []T
	//
	// func main() {
	//   s = append(s, d, d, d, d)
	//   print(len(s), "\n")
	// }
	if overflow || capmem > maxAlloc {
		panic(errorString("growslice: len out of range"))
	}

	var p unsafe.Pointer
	if et.ptrdata == 0 {
		p = mallocgc(capmem, nil, false)
		// The append() that calls growslice is going to overwrite from oldLen to newLen.
		// Only clear the part that will not be overwritten.
		// The reflect_growslice() that calls growslice will manually clear
		// the region not cleared here.
		memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
	} else {
		// Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
		p = mallocgc(capmem, et, true)
		if lenmem > 0 && writeBarrier.enabled {
			// Only shade the pointers in oldPtr since we know the destination slice p
			// only contains nil pointers because it has been cleared during alloc.
			bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(oldPtr), lenmem-et.size+et.ptrdata)
		}
	}
	memmove(p, oldPtr, lenmem)
	return slice{p, newLen, newcap}
}

Licensed under CC BY-SA 4.0