Go 1.24
At first glance Go 1.24 does not look as significant as 1.23 (with its advent of iterators and the unique
package), but there are a lot of things that can help make your code (and builds) simpler and more efficient. Some of these things seem to be extensions coming out of Go 1.23, such as new weak pointers package and a better approach to finalizers.
For the build process there’s a new tools tool and related options added to other tools. This will allow for a more consistent handling of build-dependencies.
I love things that make my code faster, especially if I don’t have to do anything (except download the new compiler :). There have been some great performance improvements for maps and mutexes which I look at in detail below.
As always there are improvements for testing (which I also love). Tests (and benchmarks) have improvements that address a couple of issues that some of have encountered - eg support for a Context
for more control of test teardown. Benchmarking has also been simplified, though it has a slight gotcha (see below).
But the highlight is the new testing/syncTest
package which allows testing of timing dependent code in cooperating goroutines. This is an issue I have struggled with and spent a lot of time cobbling together my own solution. It’s great to have a well-implemented solution that comes included, though I have encountered at least one pitfall in my tests.
Background
General
Destructors vs Finalizers
Coming from a language with destructors (like C++) to a GC one it’s easy to think of finalizers as being destructors. This is a trap that is apparently common in Java (but I’ve never seen it in Go :). A finalizer is called when an object is garbage collected (freed to the heap) so it is not deterministic when a finalizer will be called - it may be called much much later or maybe never at all.
- destructor - called when freed (heap) or goes out of scope (stack)
- finalizer - called when an object is freed during garbage collection
In C++ destructors are always called (even when an exception is thrown). This feature is very important for cleaning up - such as unlocking a mutex, closing a file, deleting a temporary file, etc (see RAAI)
In garbage collected languages, like Java and Go, you should not use a finalizer for cleaning up. When closing a file you want it closed immediately since something else may depend on it not being held open- e.g. it might prevent the code from being able to reopen the same file.
Moreover, GC languages don’t always guarantee that a finalizer is ever called. (One exception is C#, unless GC.SuppressFinalize
has been called.) This is so the software can shutdown quickly - if there are a lot of objects on the heap then calling finalizers on all of them could prevent the program from closing quickly, perhaps taking many seconds or even minutes.
Without destructors objects need methods (often called Close
or Unlock
), so cleanup can be
properly (deterministically) controlled. But you have to remember to call them (unlike a destructor which can be
called automatically) - luckily Go has the defer
command which helps enormously (see Defer)
Weak Pointers in C++
Weak pointers have been around for a while. C++11 added weak pointers, but you must first have shared pointer(s).
Shared pointers are (like GC) a way to detect when objects are no longer used in order to free them (release their memory back to the heap). The difference with shared pointers (in contast to GC pointers) is that each user of a shared pointer has to indicate when it stops using it and the object is immediately freed as soon as the reference count drops to zero.
There are pros and cons of reference counting vs garbage collection (just Google it or see Reference Counting).
A weak pointer can be obtained from a C++ shared pointer, the difference being that a weak pointer does not increment the reference count.
This means that as soon as there are no more shared pointers to an object, the weak pointer is invalidated. I think this is a disadvantage compared to weak pointers in Go, which are not invalidated until the object is garbage collected (which may not happen for a long time for various reasons (see Go Finalizers Background->Go below).
Go
Go Finalizers
Most Gophers don’t use Go’s finalizers. A finalizer is run on an object before it is freed (ie, its memory released back to the heap). However, when it will be freed can depend on a lot of factors.
- obviously an object is not freed until there are no more pointers to it (or pointers to sub-objects inside it)
- won’t be free until the garbage collector runs (see Go Garbage Collector Guide)
- won’t ever be freed if GC is turned off (by calling
debug.SetGCPercent
or settingGOGC
to a negative value) - won’t be freed until the subsequent GC if the object has a finalizer
- may not be freed at all if the finalizer “resurrects” the object
- small objects may be combined into one memory block so none are freed until they are all unreferenced
Note: by finalizers I mean those set with runtime.SetFinalizer
not the new “finalizers” of Go 1.24 set with runtime.AddCleanup
.
How can you have a finalizer on a stack object?
The first question I had about finalizers is what happens if you use them on a local variable (ie, allocated on the stack). The simple (obvious?) answer is that as soon as you pass the address to runtime.SetFinalizer
the variable escapes to the heap (see Escape Analysis for more info).
Go 1.24 has a new (better) implementation of finalizers (as explained below). This probably doesn’t affect many people. The only thing (IMHO) that finalizers are good for is to check that an object has been properly cleaned up (file closed, mutex unlocked, …) before it was discarded.
Mutex vs Atomic
I used to prefer atomic operations (from sync/atomic
package) over sync.Mutex
for performance. In fact there are many blogs recommending this (just Google “mutex vs atomic”). Then I recently ran a benchmark and found that sync.Mutex
is mysteriously not really any slower. It turns out that sync.Mutex
is more of a hybrid of a mutex and a spin lock.
First, let me explain mutexes and spin locks.
You can use a bit somewhere in memory to indicate “mutual exclusion” (ie a mutex). A thread (goroutine) can “lock” the mutex by checking that the bit is zero and, if so, setting it to one. Since different threads might be trying to use the bit then their access to that bit must be synchronised by:
- using an atomic CPU instruction, otherwise if the scheduler interrupts the thread half-way through the operation then more than one thread could think they have obtained the lock
- ensuring all cached values are consistent, otherwise concurrent threads running on different CPUs (with their own caches) could both think they have obtained the lock (see Cache Coherence)
The term atomic originally referred a single uninterruptible CPU instruction (1), but has grown to include cache-coherency requirements (2). Functions in the sync/atomic
package encompass both.
Of course, the “bit” does not know who “locked” it. The code that sets the bit must remember that it “owns” the lock and (atomically) clear it when finished, otherwise the mutex will be locked forever.
The next question then is: What happens if the bit is already set?
You cannot keep looping trying to lock the mutex (checking for the bit to be cleared) since that ties up a thread doing nothing useful. You must ask the operating system to put your thread to sleep and wake it up when the mutex has been unlocked. But waking your thread involves the scheduler, and is relatively slow.
Spin Locks
Let’s take a step back.
What if you did keep looping, checking for the bit to be unset. (The check and set if not checked must be atomic which you can do in Go using something like atomic.CompareAndSwapInt32()
.) This is called a spin lock.
The advantage of a spin lock is that it has low-latency. That is, it can be much quicker to obtain a lock (just involving an atomic operation) than a mutex (which requires a thread to be woken). On the other hand spin locks have some disadvantages:
- waste CPU, especially if locks are typically held for a long time
- may cause many CPU stalls if many threads are spinning on the same lock
In other words, a spin-lock can improve performance if there are just two or a few threads locking and unlocking quickly. On the other hand, if they hold the lock for a long time this can cause a lot of spinning (wasted CPU) and be counterproductive. Further, if a lot of threads are spinning at the same time this can slow them all down as all the caches in the system much be updated (kept coherent).
sync.Mutex
Go’s sync.Mutex
is actually a hybrid between a spin lock and a “real” mutex. That is when sync.Mutex.Lock
is called and the lock is not available it briefly spins so if the lock is released quickly it can continue immediately. However, if the lock is not released after a short period of time it reverts to a real mutex, meaning the goroutine sleeps until waiting for the sheduler to awaken it when it has locked the mutex.
This implementation has the advantage that it behaves like a normal mutex but has lower latency for “quick” locking.
My advice is to favour using sync.Mutex
, rather than using operations from the sync/atomic
package, most of the time. There is probably little advantage to atomic
operations and the code is probably harder to understand. An exception would be for something very simple, where using atomic
would make the code simpler to understand.
Furthermore, sync.Mutex
got even better in Go 1.24 by addressing point (2) in Spin Locks above. See below.
Tools
Since the addition of modules (Go 1.11) there have been additions and improvements with almost every Go release that allow you to simplify and automate a reproducible build of your software. I don’t use most of these capabilities, but it is good to be aware of them.
New in Go 1.24 you can keep track of “tools” that are used in the building of your software. These can be things you have written especially for the project or 3rd party tools.
Note: here when we talk about “tools” we are not talking about the Go tools (the executables that come with Go distribution) but extra tools used for building and maintaining the project.
The go tool
There have been many approaches for running tools as part of the build process - from Makefiles, to setting up a dummy
tools project using //go:generate
to run the tools. A lot of teams run things manually which is error-prone.
The new tools additions provide a consistent and documented process for automating your builds.
Runtime
New Map Implementation
The runtime has a new implementation of maps that should be faster and use less memory. I think the speed is mainly due to better cache performance (important with modern hardware).
I did some simple benchmarks and the results for insertions and reads is more significant than the release notes imply - close to 50% faster.
There is a (slight) downside - deletions are slower, but unless you have specific performance requirements (like occasional mass deletions that need to be executed quickly) I think the new maps will improve overall performance a lot.
[Update (Feb 29): The Go Blog has an excellent explanation of the new map internals at Faster Go maps with Swiss Tables]
Finalizers
Finalizers allow you to attach a function to an object that is called when the object is garbage collected. Go 1.24 adds
an improved finalizer. You should now use runtime.AddCleanup
in preference to runtime.SetFinalizer
.
I discussed them in detail in the Background section above, but you should be wary of using (old or new) finalizers as it’s unpredictable when, or even if, they will be called. They might be useful for testing/debugging performance issues with the heap, or for something like “weak pointers” (see new weak package below).
The advantages of the new finalizers are:
- you can have more than one finalizer on the same object
- you can add a finalizer to an interior pointer such as a struct field
- they’re supposedly faster and better at handling cyclic references
- always deallocate on the next GC (except objects <= 16 bytes - below)
A disadvantage is that you can no longer resurrect an object, as you can with the old finalizers. However, I always found that capability dubious and I think we are better without it. (It also means that objects with finalizers do not require two GC cycles before being freed.)
I like the new finalizers, but I did find them tricky to use - in particular, the purpose of the 3rd parameter to AddCleanup
. Also, this simple test did not seem to work:
p := new(int)
*p = 42
//runtime.SetFinalizer(p, func(p2 *int) { println(*p2) }) // prints 42
runtime.AddCleanup(p, func(v int) { println(v) }, *p) // *** not called ***
runtime.GC()
runtime.GC() // old finalizers need 2 GC cycles
I believe this is due to an optimization of how GC works with small objects. I found a workaround in comments in the standard library such as in runtime_test.TestCleanupMultiple
- see mcleanup_test.go
//p := new(struct { dummy [0]int8; v int }) // not called
p := new(struct { dummy [16]int8; v int }) // called
p.v = 42
runtime.AddCleanup(&p.v, func(v int) { println(v) }, p.v)
runtime.GC()
Standard Library
Iterators support
Since iterators were added in Go 1.23, a lot of standard library functions that returns slices would be better served by returning iterators for better performance, scalability, etc.
Last time I questioned why this was not done in Go 1.23 (see Standard Library Iterator Support -> Missing?). Go 1.24 fixes this for much of the standard library including to strings.Split()
. You should now prefer strings.SplitSeq()
.
New weak package
The idea of a weak pointer is simply to hold on to an object on the heap without preventing the GC (garbage collector) from cleaning it up if it needs to.
This is a really cool addition, if only for some esoteric purposes. It’s simpler and more useful than the weak pointers I used in C++ (discussed in Background above).
An example use would be to create a cache which allows you to trade-off space vs time efficiency. That is, the cache keep weak pointers to the cached information to speed up operations but does not prevent the GC from releasing the memory if it is needed.
I won’t explain any more about it as there have been a lot of great articles about it already - e.g. Using Weak Pointers in Go.
Mutex (sync package)
As I explained above (in the Background->Go section) Go’s sync.Mutex
is a hybrid implementation where sync.Mutex.Lock()
briefly tries a spin lock before blocking. This has the advantages of both, being is low-latency for “quick” locking but not wasteful for “slow” locking.
However, it could suffer from excessive CPU stalls when there is a lot of contention on a “quick” lock.
When several goroutines call Lock()
on the same sync.Mutex
at similar times then there will be multiple threads (goroutines) spinning on the same lock. (See the 2nd disadvantage in Spin Locks already discussed above.)
This results in many CPU “stalls” as the hardware enforces cache coherency. These are not as bad as delays due to blocking the thread but if it happens on many CPUs then the whole system will slow down. This affects server software (for which Go is ideally suited) that uses a lot of goroutines.
Go 1.24 avoids this by limiting the number of goroutines simultaneously “spinning” on the same lock. See the original proposal for details.
Restricted Filesystem access (os package)
Go server software often has to give a user (read or write) access to files. The server software will be running as some sort of privileged user - at least not the user needing to access the files. For obvious (security) reasons you need restrictions on what the software does on behalf of the user - e.g. see Directory traversal attack.
Hence, Go 1.24 added the os.OpenRoot()
function. With the returned os.Root
object you can perform many filesystem operations safe in the knowledge that any external file name or path (eg typed in by a user) cannot access anything outside the location you provide.
For example, once you have opened an os.Root
you can open a file using os.Root.Open()
in the same way as os.Open()
. The only difference is you can’t open a file outside the root path.
root, err = os.OpenRoot("/tmp")
if err != nil {...}
defer root.Close()
f, err = root.Open("t.txt") // OK if t.txt exists within /tmp
if err != nil {...}
defer f.Close()
f, err = root.Open("../usr/andrew/.profile")// error: path escapes from parent
f, err = root.Open("/tmp/t.txt") // error: path escapes from parent
Warning: Don’t let os.OpenRoot()
give you a false sense of security. For example, just because a user can only upload files to a certain location does not stop them maliciously filling up the filesystem with large files, unless you use a separate filesystem or some sort of disk quota system.
Language
Generic Type Aliases
You already know how to create a new type.
type MySlice []int // MySlice is a new type
MySlice
can be used much like []int
but the compiler considers it to be a different type (with an underlying type of []int
). For example, you must use a type cast to convert between MySlice
and []int
.
Type aliases are very similar but allow you to create a new name for the same type. There is a very subtle syntax difference with an extra equals sign (=) in the declaration.
type MySlice = []int // alias
Now MySlice
is an alias. Whenever, you use it the compiler just substitutes []int
.
You should avoid using type aliases (less type safety) if possible.
In Go 1.24 you can have generic type aliases like this:
type MySlice[T any] = []T
Thence, when the compiler sees MySlice[int]
it will substitute []int
.
testing
Testing is one (of many) of Go’s strong points. Almost every release has new features and refinements. Go 1.24 has some excellent additions.
New Benchmarks
I do a lot of benchmarking. In the past I became very cautious due to inconsistent results and Dave Cheney’s warning.
I got in the habit of always assigning results to a global called TestGlobal
. Here is an example, that also includes setup and teardown code (sometimes necessary, and often time-consuming):
var TestGlobal any
func BenchmarkOld(b *testing.B) {
var s string
n := New(42) // setup
b.ResetTimer()
for i := 0; i < b.N; i++ {
s = n.String() // what we sre benchmarking
}
b.StopTimer()
n.Close() // teardown
TestGlobal = s
}
Another problem with benchmarking (before Go 1.24), also mentioned by Dave, is that people find it tempting to make use of the b.N
value or the loop variable ‘i’.
Go 1.24 addresses these issues so the above benchmark can now be simply written like this:
func BenchmarkNew(b *testing.B) {
n := New() // setup
for b.Loop() {
n.String()
}
n.Close() // teardown
}
There is no longer any need to reset the timer when you have lengthy setup code as the only thing that is measured is within the loop. Moreover, the setup/teardown code is only executed once which makes running benchmarks faster if the setup/teardown is expensive.
Further, the new benchmarking guarantees that code (such as the call to n.String()
) will not be optimised away by the compiler. You don’t need to assign results to a global.
Warning: the new benchmarking facility also prevents top-level function calls (like n.String()
above) from being inlined. This is probably not what you want. If the function is “inlineable” then you probably don’t want the function call overhead to be included in the timing.
For example, the above benchmarks give values of 1.8 ns/op (Old) and 2.1 ns/op (New). This is consistent with my estimate of function call overhead of about 0.3 ns on my system.
New synctest package
Go is great for concurrency and Go is great for testing. What is (now) obvious is that it sorely needed a way to simply, quickly, and accurately test concurrent code. This is what the synctest
package provides.
Note that synctest
is an “experiment” in Go 1.24. To enable it just add synctest
to your GOEXPERIMENT environment variable. You probably won’t need to do this in Go 1.25.
GoLand: In the Jetbrains editor I just add “synctest” to File/Settings/Build Tags [TAB]/Experiments [TEXT].
There’s afull description with examples see the ever-informatve Go Blog Testing concurrent code with testing/synctest, so I’ll give a brief description and some convincing reasons to use it.
In brief, it allows you to run your test code, including creating goroutines, in an isolated environment or “bubble”. Everything in the bubble uses a fake clock which controls timers, tickers, sleeps, etc.
More accurate tests
Testing concurrent code can be difficult because the scheduling of goroutines is non-deterministic. The Go runtime scheduler can’t make any guarantee of when your goroutine will run because of the effects of other concurrent goroutines. Moreover, the OS scheduler may starve the Go scheduler’s threads of CPU time if other threads in the system (perhaps with higher priority) are running.
This is why time.Sleep()
is specified to pause the current goroutine for at least the given duration. In other words, time.Sleep(time.Second)
will sleep for a minimum of 1 second but there is no prescribed maximum.
For testing, this means that you need to add a “fudge factor” to allow for scheduling non-determinism. Using the correct fudge factor relies on an understanding of how the scheduler works, how delays might happen, and the environment in which your tests will run.
If you make the fudge factor too big - eg making a sleep too long before checking for a timeout - then your tests will run too slow. But if too small then you run the risk of the test failing when it shouldn’t have.
The synctest
package is not susceptible to this problem as it does not use “real” time. When you call time.Sleep()
(or use tickers, timers, etc) you can be sure that the duration that you see with the fake time is exactly what you asked for.
Testing is more accurate. There is no need for fudge factors.
Simpler tests
The synctest
package provides just two functions: Run()
and Wait()
. You have to use synctest.Run()
to create the test “bubble”.
OTOH synctest.Wait()
is optional but useful as a way to avoid sleeps, mutexes, timers, channels, WaitGroups, etc as testing machinery. It waits until everything in the bubble is blocked.
See the Go Blog Post for an example of how it can simplify tests.
Faster tests
You should be aware that I have an unnatural fixation with automated module regression tests (common misnomer = unit tests). One problem with these tests is that if they take too long to run they might be manually disabled, or even automatically terminated. I don’t like it when that happens.
It starts when tests need to use the (above-mentioned) fudge factor. For example, testing a timeout will use time.Sleep()
to wait for a lot longer than required. The result is the tests use a lot more “real” time than they need to.
By using a fake time synctest
neatly sidesteps this problem. So instead of a test taking many milliseconds, or even seconds, it can run in negligible “real” time. You no longer have to worry about the trade-off between tests taking too long and generating false failures. Nobody has a reason to disable your tests!
Detect Goroutine leaks
Another, perhaps unforeseen, advantage of synctest
is that it allows tests to find goroutine leaks. This is a common problem that often goes undiagnosed. An example is a goroutine blocked reading from a channel but the writer(s) of the channel have themselves terminated, so the goroutine can never continue and terminate. This is a goroutine leak.
synctest.Run()
does not return until all the gorotines in the bubble have terminated. If there is a goroutine leak the test will fail.
Example
This simple example use synctest.Run()
to create a “bubble” that starts a goroutine that sleeps for 10 seconds, then itself sleeps for 1 second.
func TestSyncTest(t *testing.T) {
synctest.Run(func() {
go func() {
fmt.Println("A:", time.Now())
time.Sleep(10 * time.Second)
fmt.Println("B:", time.Now())
}()
time.Sleep(time.Second)
fmt.Println("C:", time.Now())
})
}
Running this test will generate something like:
A: 2000-01-01 11:00:00 +1100 AEDT m=+946499188.159015001
C: 2000-01-01 11:00:01 +1100 AEDT m=+946499189.159015001
B: 2000-01-01 11:00:10 +1100 AEDT m=+946499198.159015001
--- PASS: TestSyncTest (0.00s)
This demonstrates a few things:
synctext.Run()
creates a bubble with a fake time starting at 2000/1/1 00:00:00 UTC- sleeps (on the fake clock) take exactly the duration specified
- even though 10 seconds of fake time passed, the test took very little real time to run (3ms)
synctext.Run()
did not return until both goroutines ended
Conclusion
Go 1.24 has some nice performance improvements, such as great changes to maps and mutexes. The weak
package is also useful for things like caches. I was disappointed that no further PGO enhancements have been added AFAIK.
The new tools tool (and related toolset changes) allow for a more consistent handling of build-dependencies. I encourage you to adopt its use if you have such dependencies.
As many people may know, I am very keen on automated testing. Go does this far better than any language I have tried and almost every release has useful new capabilities. The new facilities, especially syncTest
make it even better.
There are further things in Go 1.24 that did not seem important to me but for more please see Go 1.24 Release Notes.
Comments