Mastering Regular Expressions in Go


Go ships with a built-in regexp package that implements regular expressions using RE2 syntax and semantics. That single design choice is what makes Go regexps safe, predictable, and production-ready.
The package guarantees matching time linear in the size of the input, which rules out catastrophic backtracking and the ReDoS vulnerabilities it causes, even when the input comes from untrusted users.
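
One practical consequence of this design is that constructs which require backtracking, such as backreferences and lookahead, are rejected at compile time. The sketch below illustrates that trade-off; the helper name compiles is hypothetical and exists only for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compiles reports whether Go's regexp package accepts the pattern.
// (Hypothetical helper, for illustration only.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Nested quantifiers are a classic ReDoS trigger in backtracking
	// engines. Here the pattern compiles and the search stays linear.
	nested := regexp.MustCompile(`^(a+)+$`)
	input := strings.Repeat("a", 50000) + "!"
	fmt.Println(nested.MatchString(input)) // false

	// Backreferences and lookahead are not part of the supported syntax.
	fmt.Println(compiles(`(a)\1`))      // false
	fmt.Println(compiles(`foo(?=bar)`)) // false
}
```

If a pattern ported from another language fails to compile in Go, an unsupported construct like these is a likely cause.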

1. Compile vs MustCompile
Before matching, a pattern must be compiled into a *regexp.Regexp. Two functions handle this:

// Compile → returns an error. Use for dynamic / user-supplied patterns.
re, err := regexp.Compile(`\d{3}-\d{4}`)
if err != nil {
    log.Fatalf("invalid pattern: %v", err)
}

// MustCompile → panics on invalid pattern. Use for known-good package-level vars.
var phoneRe = regexp.MustCompile(`\d{3}-\d{4}`)

// Invalid pattern example
_, badErr := regexp.Compile(`[invalid`)
// error parsing regexp: missing closing ]: `[invalid`

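For patterns that arrive at runtime, one possible approach is to wrap Compile in a small function that returns the error to the caller instead of exiting. This is a minimal sketch: compileUserPattern is a hypothetical name, not part of the regexp package. It also shows regexp.QuoteMeta, which is useful when user text should be matched literally.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileUserPattern is a hypothetical helper: it compiles a pattern that
// came from outside the program and wraps any syntax error with context.
func compileUserPattern(pattern string) (*regexp.Regexp, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return re, nil
}

func main() {
	if _, err := compileUserPattern(`[invalid`); err != nil {
		fmt.Println("rejected:", err)
	}

	// QuoteMeta escapes metacharacters so user text matches literally.
	literal := regexp.MustCompile(regexp.QuoteMeta("1+1=2"))
	fmt.Println(literal.MatchString("so 1+1=2 holds")) // true
}
```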

2. Basic Matching
Three levels of match output — boolean, first match, all matches:

var digitRe = regexp.MustCompile(`\d+`)

text := "Order #4821 placed on 2024-06-15, total: $99.99"

fmt.Println(digitRe.MatchString(text))      // true
fmt.Println(digitRe.FindString(text))       // "4821"
fmt.Println(digitRe.FindAllString(text, -1)) // [4821 2024 06 15 99 99]
fmt.Println(digitRe.FindAllString(text, 2))  // [4821 2024]  (first 2)

3. Positional (Index) Matching
Index methods return byte offsets instead of strings — useful when you need to reconstruct or replace around the match.

loc := digitRe.FindStringIndex(text)
// loc = [7, 11]  →  text[7:11] = "4821"

allLocs := digitRe.FindAllStringIndex(text, -1)
// [[7 11] [22 26] [27 29] [30 32] [42 44] [45 47]]

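To make the "reconstruct around the match" idea concrete, here is a small sketch that rebuilds a string from the byte offsets, wrapping each run of digits in brackets. The helper name bracketDigits is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var digitRe = regexp.MustCompile(`\d+`)

// bracketDigits is a hypothetical helper that wraps every run of digits
// in square brackets, copying the text between matches unchanged.
func bracketDigits(text string) string {
	var b strings.Builder
	last := 0
	for _, loc := range digitRe.FindAllStringIndex(text, -1) {
		b.WriteString(text[last:loc[0]]) // text before the match
		b.WriteString("[" + text[loc[0]:loc[1]] + "]")
		last = loc[1]
	}
	b.WriteString(text[last:]) // tail after the final match
	return b.String()
}

func main() {
	fmt.Println(bracketDigits("Order #4821 placed on 2024-06-15"))
	// Order #[4821] placed on [2024]-[06]-[15]
}
```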
4. Capture Groups (Numbered)
Use FindStringSubmatch to get the full match plus the contents of every capture group. Index 0 is always the full match.

var ipRe = regexp.MustCompile(`(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})`)

m := ipRe.FindStringSubmatch("Server: 192.168.1.42")
// m[0] = "192.168.1.42"   (full match)
// m[1] = "192"             (group 1)
// m[2] = "168"             (group 2)
// m[3] = "1"               (group 3)
// m[4] = "42"              (group 4)

// FindAllStringSubmatch returns every match together with its groups
input := "hosts: 10.0.0.1 and 172.16.0.5"
allMatches := ipRe.FindAllStringSubmatch(input, -1)
// allMatches[1][0] = "172.16.0.5"
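
The pattern above only checks the shape of an address, so 999.1.1.1 would match too. A sketch of one way to combine FindAllStringSubmatch with a range check on each group follows; validIPs is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var ipRe = regexp.MustCompile(`(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})`)

// validIPs is a hypothetical helper that returns the dotted quads in
// input whose four octets are all in the range 0..255.
func validIPs(input string) []string {
	var out []string
	for _, m := range ipRe.FindAllStringSubmatch(input, -1) {
		ok := true
		for _, octet := range m[1:] { // m[0] is the full match
			n, err := strconv.Atoi(octet)
			if err != nil || n > 255 {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, m[0])
		}
	}
	return out
}

func main() {
	fmt.Println(validIPs("up: 192.168.1.42, 10.0.0.1; bad: 999.1.1.1"))
	// [192.168.1.42 10.0.0.1]
}
```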

5. Named Capture Groups
The (?P<name>...) syntax gives a capture group a name, which makes code far more readable and lets lookups survive pattern refactoring.

var dateRe = regexp.MustCompile(
    `(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)

func ParseDate(s string) map[string]string {
    match := dateRe.FindStringSubmatch(s)
    if match == nil { return nil }

    result := make(map[string]string)
    for i, name := range dateRe.SubexpNames() {
        if name != "" { result[name] = match[i] }
    }
    return result
}

// ParseDate("2024-11-28") → map[year:2024 month:11 day:28]

// Direct index lookup by name (SubexpIndex requires Go 1.15+)
yearIdx := dateRe.SubexpIndex("year")  // 1
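
When only one field is needed, building the whole map is optional. The following sketch uses SubexpIndex to pull a single named group; the helper name field is hypothetical, and SubexpIndex returns -1 when the name is not in the pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)

// field is a hypothetical helper that returns one named group from the
// first match in s. It reports false when there is no match or the
// pattern has no group with that name.
func field(s, name string) (string, bool) {
	idx := dateRe.SubexpIndex(name) // -1 if the name is not in the pattern
	match := dateRe.FindStringSubmatch(s)
	if idx < 0 || match == nil {
		return "", false
	}
	return match[idx], true
}

func main() {
	fmt.Println(field("released 2024-11-28", "month")) // 11 true

	_, ok := field("released 2024-11-28", "hour")
	fmt.Println(ok) // false (no group named "hour")
}
```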

6. Replace Methods

// 1. ReplaceAllString: static template, supports $N back-references
wsRe := regexp.MustCompile(`\s+`)
wsRe.ReplaceAllString("foo   bar  baz", " ")  // "foo bar baz"

swapRe := regexp.MustCompile(`(\w+)=(\w+)`)
swapRe.ReplaceAllString("a=1 b=2", "$2=$1")  // "1=a 2=b"

// 2. ReplaceAllStringFunc: dynamic replacement via a function
digitRe.ReplaceAllStringFunc("item1 qty5", func(s string) string {
    n, _ := strconv.Atoi(s)
    return strconv.Itoa(n * 2)
})  // "item2 qty10"

// 3. ReplaceAllLiteralString: $ signs are NOT interpreted
litRe := regexp.MustCompile(`foo`)
litRe.ReplaceAllLiteralString("foobar", "$1")  // "$1bar"
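
Two details of the replacement template are easy to miss: named groups can be referenced as ${name}, and in the $name form the name is taken to be as long as possible. The sketch below shows both, using patterns chosen only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)
	wordRe = regexp.MustCompile(`(\w+)`)
)

func main() {
	// Named groups can be referenced as ${name} in the template.
	fmt.Println(dateRe.ReplaceAllString("2024-11-28", "${day}/${month}/${year}"))
	// 28/11/2024

	// $1_id is read as ${1_id}, a group that does not exist, so the
	// whole match is replaced with the empty string.
	fmt.Printf("%q\n", wordRe.ReplaceAllString("user", "$1_id")) // ""

	// Braces make the boundary explicit.
	fmt.Printf("%q\n", wordRe.ReplaceAllString("user", "${1}_id")) // "user_id"
}
```

Writing ${1} rather than $1 whenever the reference is followed by a letter, digit, or underscore avoids this surprise.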

7. Split

sepRe := regexp.MustCompile(`[,;\s]+`)

sepRe.Split("one,two; three  four", -1)
// ["one" "two" "three" "four"]

sepRe.Split("a,b,c,d", 3)  // ["a" "b" "c,d"]  (n=3 → at most 3 pieces)
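
A few edge cases of Split are worth knowing before relying on it for tokenizing. This short sketch reuses the separator pattern from above to show them.

```go
package main

import (
	"fmt"
	"regexp"
)

var sepRe = regexp.MustCompile(`[,;\s]+`)

func main() {
	// A leading separator produces an empty first element.
	fmt.Printf("%q\n", sepRe.Split(",a,b", -1)) // ["" "a" "b"]

	// n == 0 returns nil rather than an empty slice.
	fmt.Println(sepRe.Split("a,b", 0) == nil) // true

	// With no separator present, the input comes back as a single element.
	fmt.Printf("%q\n", sepRe.Split("abc", -1)) // ["abc"]
}
```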

8. Longest match

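By default Go returns the leftmost-first match, the one a backtracking engine would find first. Calling re.Longest() switches all future searches on that Regexp to leftmost-longest (POSIX) semantics, so call it once, right after compiling and before the first match.

altRe := regexp.MustCompile(`a(|b)`)
altRe.FindString("ab")  // "a"   (leftmost-first: the empty alternative wins)
altRe.Longest()
altRe.FindString("ab")  // "ab"  (leftmost-longest)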

9. LiteralPrefix
LiteralPrefix() returns a literal string that every match must begin with, plus a boolean that is true when that literal is the entire pattern. The engine uses the prefix to skip quickly over positions that cannot match.

// These patterns are unanchored; with a leading ^ the reported prefix may be empty.
urlRe := regexp.MustCompile(`https://`)
prefix, complete := urlRe.LiteralPrefix()
// prefix = "https://"   complete = true

partialRe := regexp.MustCompile(`https?://`)
p2, c2 := partialRe.LiteralPrefix()
// p2 = "http"   c2 = false  (the ? ends the literal run)


10. Real-World: Email pattern checker

package main

import (
    "fmt"
    "regexp"
)

func main() {
    email := "test@example.com"

    // Simple email regex pattern
    pattern := `^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$`

    re := regexp.MustCompile(pattern)

    if re.MatchString(email) {
        fmt.Println("Valid email")
    } else {
        fmt.Println("Invalid email")
    }
}
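
To see where this simple pattern draws the line, it can help to run it over a handful of inputs. The sketch below compiles the same pattern once at package level, following the MustCompile advice from section 1; isValidEmail is a hypothetical helper name, and the sample addresses are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as above, compiled once at package level.
var emailRe = regexp.MustCompile(`^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$`)

// isValidEmail is a hypothetical helper wrapping the match.
func isValidEmail(s string) bool {
	return emailRe.MatchString(s)
}

func main() {
	samples := []string{
		"test@example.com",                // accepted
		"first.last+tag@mail.example.org", // accepted
		"no-at-sign.example.com",          // rejected: no @
		"user@localhost",                  // rejected: no dot in the domain
		"user@@example.com",               // rejected: second @
	}
	for _, s := range samples {
		fmt.Printf("%-34s %v\n", s, isValidEmail(s))
	}
}
```

A pattern like this is a shape check only; it does not confirm that the address exists or can receive mail.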

When Should You Use RegExp?
Use it when:

Validating input (email, phone, etc.)
Parsing logs
Extracting structured data
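
As an illustration of the log-parsing and extraction cases, here is a minimal sketch built on named groups. The log line format, the entry type, and parseLine are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// logRe assumes lines shaped like "12:00:01 ERROR disk full".
// The format is an assumption for this example.
var logRe = regexp.MustCompile(`^(?P<time>\d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+) (?P<msg>.*)$`)

// entry and parseLine are hypothetical names used for illustration.
type entry struct {
	Time, Level, Msg string
}

// parseLine extracts the three named groups, or reports false when the
// line does not have the assumed shape.
func parseLine(line string) (entry, bool) {
	m := logRe.FindStringSubmatch(line)
	if m == nil {
		return entry{}, false
	}
	return entry{
		Time:  m[logRe.SubexpIndex("time")],
		Level: m[logRe.SubexpIndex("level")],
		Msg:   m[logRe.SubexpIndex("msg")],
	}, true
}

func main() {
	e, ok := parseLine("12:00:01 ERROR disk full")
	fmt.Println(e.Level, e.Msg, ok) // ERROR disk full true

	_, ok = parseLine("not a log line")
	fmt.Println(ok) // false
}
```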

Further Reading
Official docs: https://pkg.go.dev/regexp
RE2 syntax ref: https://pkg.go.dev/regexp/syntax
Run all examples: go run regexp_complete.go

Source: dev.to
