Kirsle.net

Globally Recover Panics in Go

June 22, 2018 by Noah

In the past, I had been tasked to update some Python web apps to have them send e-mails out whenever they run into an uncaught Exception in one of our endpoints. This way we could identify any runtime errors in production and fix them.

In Python it was pretty easy: our web framework, Flask, provides a way to catch any error that happens in any of your HTTP routes. Even if you didn't use Flask, you can set sys.excepthook to a function and catch exceptions globally whenever other code failed to.

But then we had an app written in Go and it needed to have this feature, too. If it were simply a web service, this wouldn't have been bad, but this particular service listened on both an HTTP port and a separate TCP port in another goroutine.

Catching HTTP panics in middleware (or the standard net/http library, which catches panics itself) doesn't do anything to help catch the panics thrown by the TCP server. So, like sys.excepthook, I needed to find a way to globally catch panics in my app.

This was surprisingly very hard to do.

tl;dr: the code is at the bottom.

Error Handling in Go

Go doesn't have the concept of Exceptions in the way that Python and other languages do. There is no try / catch syntax in Go to safely execute code that might throw an exception, because Go has no exceptions to catch.

Instead, Go code is expected to return an error value along with a result from a function. When you assign the results of a function call, you must accept all of them into variables: assigning a two-value result to a single variable is a compile-time error and you can't build your program.

package main

import (
	"errors"
	"fmt"
	"os"
)

// Divide takes two numbers and divides them.
func Divide(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("can not divide by zero")
	}
	return a / b, nil
}

// your code
func main() {
	// You must accept both the result and the error value when you assign
	// the function's results.
	result, err := Divide(12, 4)
	if err != nil {
		fmt.Printf("Got an error: %s\n", err)
		os.Exit(1)
	}
	fmt.Printf("Result is: %f\n", result)

	// This would've been a compile-time error, so it is commented out:
	// result := Divide(12, 4)  // not taking all results from function
}

If you really don't care about the error, and don't want to check whether it's nil or anything, you can assign it to the special variable _, like:

result, _ := Divide(5, 0)  // the error is accepted but thrown away with _

But when you type that _ character, you know you are deliberately choosing to ignore an error response. When you divide by zero, it will return 0 and you won't know there was an error because you threw it away.

Note: if a function returned an error, it's probably a good idea to treat the result variable as untrusted. In our example the function returns a result of 0 when there's an error. A lot of functions may return their zero-value, or they could return their object in an inconsistent state. Usually you'll do something with the error and then return up your stack and not try and use the result.
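To illustrate the note above, here's a minimal sketch of a caller that returns up the stack as soon as it sees an error and never touches the untrusted result. Average is a hypothetical function invented for this example; Divide is repeated so the snippet compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Divide is the same function from the example above.
func Divide(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("can not divide by zero")
	}
	return a / b, nil
}

// Average is a hypothetical caller. On error it returns immediately and
// never uses the result that came back alongside the error.
func Average(total float64, count int) (float64, error) {
	avg, err := Divide(total, float64(count))
	if err != nil {
		// Add some context and hand the error to our own caller.
		return 0, fmt.Errorf("average of %d items: %w", count, err)
	}
	return avg, nil
}

func main() {
	if _, err := Average(10, 0); err != nil {
		fmt.Println("Got an error:", err)
	}
}
```

Wrapping with %w keeps the original error reachable for callers further up, while the message says where things went wrong.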

Compared to languages that have Exceptions, I prefer Go's approach. You must accept and deal with the error (or visibly throw it away), so an unhandled failure is much less likely to surprise you in production.

You might say "well don't write shitty Python code that panics in production." This is sometimes hard to avoid. For example, you could log some data the user sent in their request, but your logger doesn't support Unicode and the user sent some: now you get a runtime exception that you didn't write a try/except for, and your app crashes. Maybe you didn't even consider Unicode, and there are a billion moving parts that let bugs like this slip through the cracks.

Exceptions can hide in any little place in your code, and any place could raise an exception for any reason. Even the documentation may not help you: the writer of a function might not even know that some dependency 3 call layers deep might raise an exception on Leap Years or something.

With Go, the error values are part of every function's signature and the compiler makes you acknowledge them, so a whole class of problems is eliminated in production applications.

Go Does Have Panics

The closest thing Go has to raising an exception is the panic() function, but this is a nuclear option of last resort that should almost never be used in production code. Panics are very hard to handle. Besides users calling panic() themselves, some runtime errors will cause panics (nil pointer dereferences or out-of-bounds indexes, for example).

When panic() is called, it bubbles up the call stack of the current goroutine, unwinding every function along the way (and running any deferred functions), until it reaches the top of that goroutine's stack. It does not continue into whichever goroutine started it; a goroutine has no parent stack to climb. At that point the runtime prints a stack trace and exits the entire program, even if the panic happened nowhere near your main() function.

If nothing recovers the panic along the way, it kills your application.

The only way to handle a panic is to do so in a deferred function. Deferred functions are run at the end of a function call, right before that function returns, and they're often used to clean up after yourself: closing open file handles, etc.

// Config is assumed to be defined elsewhere in your app.
func LoadConfig(filename string) (*Config, error) {
	// open the config file
	fh, err := os.Open(filename)
	if err != nil {  // handle those errors!
		return nil, err
	}

	// we may be doing a LOT of stuff in this function, but we want to make
	// sure we close the file when we're done, no matter when or how we
	// `return`, so defer it.
	defer fh.Close()

	// so if we want to make sure we handle any panics from this
	// function on down the call stack, we defer a recovery.
	defer func() {
		if r := recover(); r != nil {
			// if we're in here, we had a panic and have caught it
			fmt.Printf("we safely caught the panic: %v\n", r)
		}
	}()

	// do a bunch of stuff. whenever we `return` the file will be closed.
	// if anything raises a panic, we'll recover from it
	return &Config{}, nil // placeholder: the real parsing is left out here
}
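A recovery like the one above only prints the panic, and the caller never finds out. A related pattern, sketched here under my own assumptions rather than taken from the app described in this post, is to give the function named return values so the deferred function can turn the recovered panic into an ordinary error. SafeIndex is a hypothetical helper invented for the example.

```go
package main

import "fmt"

// SafeIndex is a hypothetical helper: it returns items[i], converting the
// runtime panic from an out-of-range index into an ordinary error.
func SafeIndex(items []int, i int) (value int, err error) {
	defer func() {
		// recover() returns nil when there was no panic.
		if r := recover(); r != nil {
			// The deferred function can still set the named result.
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return items[i], nil
}

func main() {
	v, err := SafeIndex([]int{1, 2, 3}, 1)
	fmt.Println(v, err)

	_, err = SafeIndex([]int{1, 2, 3}, 10)
	fmt.Println(err)
}
```

The first call prints the value with a nil error; the second prints an error describing the runtime panic instead of crashing the program.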

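So in a normal, single-goroutine application, if you wanted to catch all uncaught panics and prevent your app from crashing, you could probably install a recover() deferral somewhere near your main() function at the top of your call stack. Keep in mind that a deferred recover() only covers the goroutine it was installed in: a panic in any other goroutine will still crash the program unless that goroutine recovers it itself.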
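Since recover() only sees panics raised in its own goroutine, every goroutine you start needs its own deferral. Here's a minimal sketch of that idea; goSafely is a hypothetical helper name, not something from the standard library.

```go
package main

import "fmt"

// goSafely is a hypothetical helper: it runs fn in a new goroutine that has
// its own deferred recover, and returns a channel that receives the recovered
// value (nil when fn returned normally).
func goSafely(fn func()) <-chan interface{} {
	done := make(chan interface{}, 1)
	go func() {
		defer func() {
			done <- recover()
		}()
		fn()
	}()
	return done
}

func main() {
	// This deferral only protects the main goroutine. It would never see
	// a panic raised inside a goroutine started with a plain `go` statement.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("main recovered:", r)
		}
	}()

	r := <-goSafely(func() { panic("worker failed") })
	fmt.Println("worker goroutine recovered:", r)
}
```

This prints "worker goroutine recovered: worker failed" and exits normally. Replace the goSafely call with a bare `go func() { panic("worker failed") }()` and the program crashes, deferral in main() or not.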

But this doesn't work with Go web servers and it especially didn't work for my Go web/TCP server!

net/http recovers its own panics

If you're making a web server in Go, I have good news: the standard library net/http has its own panic recovery, so if one of your endpoints raises a panic it doesn't take down your entire server.

This does present a problem, though, if you want a global panic catcher in your app. You can't register one in your main() function that will catch your HTTP panics, because net/http will already catch them and prevent them from bubbling.

You can't register your own HTTP panic catcher and re-raise the panic with another call to panic(), because net/http will always sit between your main function and your panic-catching middleware. You can either catch your HTTP panics earlier than net/http to do your own thing with them, or else net/http will always catch them.

So at the very least, my app would need two redundant panic catchers: an HTTP middleware and one outside the HTTP server, in my main function. This wasn't very ideal either.
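For the HTTP half, a panic-catching middleware could look something like the sketch below. RecoverMiddleware and its report callback are hypothetical names for illustration; the example drives the handler in-process with net/http/httptest instead of opening a real port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// RecoverMiddleware is a hypothetical middleware: it recovers a panic from
// the wrapped handler before net/http can, hands it to report, and responds
// with a 500.
func RecoverMiddleware(next http.Handler, report func(recovered interface{})) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				report(rec)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// callPanickingEndpoint sends one request through the middleware to an
// endpoint that always panics, and returns what was reported plus the
// HTTP status code the client would have seen.
func callPanickingEndpoint() (reported interface{}, status int) {
	endpoint := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		panic("endpoint blew up")
	})
	handler := RecoverMiddleware(endpoint, func(recovered interface{}) {
		reported = recovered
	})

	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, httptest.NewRequest("GET", "/", nil))
	return reported, rec.Code
}

func main() {
	reported, status := callPanickingEndpoint()
	fmt.Println("reported:", reported, "status:", status)
}
```

The report callback is where the panic gets handed off to whatever does the e-mailing, so the middleware itself stays small.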

Emulate Python's excepthook With Channels

I eventually thought of a solution that would act like Python's sys.excepthook and would let me centrally handle exceptions, with only a minimal amount of boilerplate code that would need to be pasted around the codebase.

Using channels in Go, I could make a generic panic recovery function that writes the error and stack trace into a channel, and then have one central function reading it, to then send out e-mails or whatever.

Then I could sprinkle calls to defer errors.Defer() throughout my codebase in strategic positions to catch all sorts of panics. My web server had a panic middleware that forwarded them along this way, and I made clever use of it throughout the TCP server code: for example, recovering panics as close to the individual TCP clients as possible, so that the whole TCP server wasn't taken down whenever an error occurred.

Code: Globally Catch Panics in Go

// Package errors handles unexpected panics by emailing them out.
package errors

import (
	"log"
	"runtime"
)

// Signal is a channel that lets goroutines all throughout the app (outside of
// HTTP handler functions) catch their own panics and safely forward them to one
// central listener, who can then email them out.
var Signal = make(chan PanicInformation)

// PanicInformation sends a panic message and its stack trace up the Signal
// channel to be safely handled.
type PanicInformation struct {
	RecoveredPanic interface{}
	Stack          string
}

// Bubble sends panic information up the Signal channel. If the traceback is
// empty, this function will collect the stack information.
func Bubble(err interface{}, traceback ...string) {
	if len(traceback) == 0 {
		stack := make([]byte, 1024*8)
		stack = stack[:runtime.Stack(stack, false)]
		traceback = []string{string(stack)}
	}

	Signal <- PanicInformation{
		RecoveredPanic: err,
		Stack:          traceback[0],
	}
}

// Defer is a deferred function that recovers from a panic and Bubbles it
// through the Signal channel. Call it as `defer errors.Defer()` so that
// recover() runs directly in the deferred function.
func Defer() {
	if err := recover(); err != nil {
		log.Printf("errors.Defer(): caught a panic! %v", err)
		Bubble(err)
	}
}

// EXAMPLE FUNCTION to await panics from your main app. It lives in your main
// package (which imports this errors package), not in this file.
// AwaitPanics watches the errors.Signal channel for any panics caught by the
// sub-packages, to send the details via email.
func AwaitPanics() {
	for {
		pi := <-errors.Signal
		log.Printf("AwaitPanics: saw a panic! %v", pi.RecoveredPanic)
		// EmailPanicOrWhatever is a placeholder for your own notifier.
		EmailPanicOrWhatever(pi.RecoveredPanic, pi.Stack)
	}
}
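To see the pieces working together, here's a condensed single-file sketch. It inlines simplified stand-ins for the package above (panics for errors.Signal, deferRecover for errors.Defer) so it compiles on its own, and handleClient is a hypothetical per-connection handler rather than real TCP code.

```go
package main

import (
	"fmt"
	"runtime"
)

// PanicInformation mirrors the struct from the errors package above.
type PanicInformation struct {
	RecoveredPanic interface{}
	Stack          string
}

// panics stands in for errors.Signal.
var panics = make(chan PanicInformation)

// deferRecover stands in for errors.Defer: it recovers a panic, collects
// the stack, and forwards both to the central channel.
func deferRecover() {
	if r := recover(); r != nil {
		stack := make([]byte, 1024*8)
		stack = stack[:runtime.Stack(stack, false)]
		panics <- PanicInformation{RecoveredPanic: r, Stack: string(stack)}
	}
}

// handleClient is a hypothetical per-connection handler. The deferral sits
// right at the top, so one misbehaving client can't take down the others.
func handleClient(id int) {
	defer deferRecover()
	if id == 2 {
		panic(fmt.Sprintf("client %d sent bad data", id))
	}
}

func main() {
	for id := 1; id <= 3; id++ {
		go handleClient(id)
	}

	// Stand-in for AwaitPanics: read one panic from the channel and report it.
	pi := <-panics
	fmt.Println("saw a panic:", pi.RecoveredPanic)
}
```

Only client 2 panics, so the program prints "saw a panic: client 2 sent bad data" and exits normally. Note that the channel is unbuffered, so a panicking goroutine blocks in deferRecover until something reads from the channel; the central reader needs to be running before panics can be delivered.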


Comments

There is 1 comment on this page.

programmer04 posted on February 15, 2022 @ 15:13 UTC

Hello, thanks for the great post. Unfortunately in Go the sentence "So in a normal application, if you wanted to catch all uncaught panics and prevent your app from crashing, you could probably install a recover() deferral somewhere near your main() function at the top of your call stack." is not true in all cases. Please check this example and you can read the explanation here.

From my perspective, it's a drawback of Go, because I can't implement robust panic() handling (even just for logging) for an entire application in which goroutines are used. Of course, I can add a defer to every goroutine, but that is very inconvenient, and when an imported third-party library spawns goroutines that may panic(), it's impossible to write code that handles it properly.
