Go Profiling 101 — ดูว่าโค้ดเราช้าตรงไหน โดยไม่ต้องเดา

เวลาโค้ดช้า — สิ่งที่เรามักทำคือ เดา

"น่าจะเป็น query database นี่แหละที่ช้า"
"สงสัย loop ตรงนี้ inefficient"

แล้วก็ optimize ตามที่เดา — บางทีถูก บางทีผิด — แต่เราไม่มีทางรู้แน่ ๆ ว่าเสียเวลาไปกับอะไรกันแน่

Profiling คือเครื่องมือที่ตอบคำถามนี้แบบไม่ต้องเดา — มันวัดว่า CPU ใช้เวลาไปกับ function ไหน, จัดสรร memory เท่าไหร่, goroutine ทำงานยังไง — แล้วแสดงผลให้เราเห็นเป็นภาพ

Profiling คืออะไร — ภาษาชาวบ้าน

สมมติโค้ดคุณใช้เวลา 2 วิในการ process request — profiling จะบอกว่า:

CPU time:
  1.2s (60%) → json.Unmarshal
  0.5s (25%) → database query
  0.2s (10%) → business logic
  0.1s ( 5%) → everything else

แทนที่จะเดาว่า database ช้า — คุณรู้ทันทีว่า json.Unmarshal คือ culprit — ไป optimize ตรงนั้นก่อน

เริ่มต้น — 3 ขั้นตอน

1. เพิ่ม profiling เข้าไปในโค้ด

package main

import (
    "net/http"
    _ "net/http/pprof" // ← บรรทัดเดียว!
)

func main() {
    // เปิด HTTP server สำหรับ pprof บน port 6060
    go func() {
        http.ListenAndServe("localhost:6060", nil)
    }()

    // โค้ดปกติของคุณ...
    runYourServer()
}

import _ "net/http/pprof" — บรรทัดเดียว! — Go จะ expose endpoint ให้อัตโนมัติ:

Endpoint	วัดอะไร
`/debug/pprof/profile`	CPU — function ไหนใช้เวลาเยอะ
`/debug/pprof/heap`	Memory — จัดสรรไปเท่าไหร่ อะไรกิน memory
`/debug/pprof/goroutine`	Goroutine — มีกี่ตัว เกิดจากตรงไหน
`/debug/pprof/allocs`	Allocation — จัดสรร memory บ่อยแค่ไหน
`/debug/pprof/block`	Blocking — goroutine ติดรอตรงไหน
`/debug/pprof/mutex`	Mutex — แย่ง lock กันตรงไหน

2. เก็บข้อมูล

# CPU profile — เก็บ 30 วิ
go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30

# Memory profile
go tool pprof http://localhost:6060/debug/pprof/heap

# Goroutine profile
go tool pprof http://localhost:6060/debug/pprof/goroutine

หรือถ้าไม่มี server — เก็บเป็นไฟล์:

// ในโค้ด — เก็บ CPU profile เป็นไฟล์
f, _ := os.Create("cpu.prof")
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()

// ... โค้ดที่อยากวัด ...

go tool pprof cpu.prof

3. อ่านผล

พอเข้า go tool pprof แล้ว — คุณจะเห็น prompt:

(pprof)

คำสั่งสำคัญ:

คำสั่ง	ทำอะไร
`top`	แสดง function ที่ใช้ CPU/memory มากสุด 10 อันดับแรก
`top 20`	แสดง 20 อันดับ
`list functionName`	แสดง source code ของ function นั้น พร้อมกำกับว่าแต่ละบรรทัดใช้เวลาเท่าไหร่
`web`	สร้าง graph → เปิดในเบราว์เซอร์ (ต้องมี graphviz)
`pdf`	สร้าง PDF graph
`peek functionName`	ดูว่า function นี้ถูกเรียกจากใคร และเรียกใครต่อ

ตัวอย่าง — อ่าน CPU Profile

สมมติรัน top:

(pprof) top
Showing nodes accounting for 2.50s, 89.29% of 2.80s total
      flat  flat%   sum%        cum   cum%
     0.80s 28.57% 28.57%      1.20s 42.86%  encoding/json.(*decodeState).object
     0.50s 17.86% 46.43%      0.50s 17.86%  runtime.mallocgc
     0.30s 10.71% 57.14%      0.30s 10.71%  regexp.(*machine).match
     0.20s  7.14% 64.29%      0.80s 28.57%  json.Unmarshal

อ่านยังไง:

คอลัมน์	ความหมาย
`flat`	เวลาที่ใช้ใน function นี้โดยตรง (ไม่นับที่มันเรียก function อื่น)
`flat%`	คิดเป็นกี่ %
`sum%`	สะสม — 3 อันดับแรกใช้ไป 57.14%
`cum`	เวลารวม — function นี้ + ทุก function ที่มันเรียกต่อ
`cum%`	คิดเป็นกี่ %

ตีความ:

อันดับ 1: encoding/json.object — 0.80s (28.57%)
  → JSON decoding กินเวลาเยอะสุด! — ต้อง optimise ตรงนี้ก่อน

อันดับ 2: runtime.mallocgc — 0.50s (17.86%)
  → จัดสรร memory บ่อย — อาจเกิดจาก JSON decoding

อันดับ 3: regexp.match — 0.30s (10.71%)
  → ใช้ regex ที่ช้า — เช็คว่ามี regex ที่ compile ทุกรอบไหม

`list` — ดูรายบรรทัด

(pprof) list json.Unmarshal

จะเห็น source code พร้อมเวลากำกับ:

Total: 2.80s
ROUTINE ===== json.Unmarshal
     0.20s      0.80s (flat, cum)
     ...
     .          .   func Unmarshal(data []byte, v any) error {
    50ms      50ms       return d.unmarshal()    ← บรรทัดนี้ใช้ 50ms
    10ms     100ms       err := d.validate()     ← บรรทัดนี้ใช้ 10ms (แต่รวม 100ms เพราะเรียก validate)

ตัวอย่าง — อ่าน Memory Profile

go tool pprof http://localhost:6060/debug/pprof/heap

(pprof) top
      flat  flat%   sum%        cum   cum%
   32.50MB 65.00% 65.00%    32.50MB 65.00%  encoding/json.Marshal
   10.00MB 20.00% 85.00%    10.00MB 20.00%  io.ReadAll
    5.00MB 10.00% 95.00%     5.00MB 10.00%  bytes.Buffer.String

ตีความ:

json.Marshal กิน memory 32.5MB — 65%!
io.ReadAll อ่าน request body ทั้งก้อนเข้า memory — อาจเปลี่ยนเป็น streaming

`inuse_space` vs `alloc_space`

# inuse_space (default) — ตอนนี้ใช้ memory อยู่เท่าไหร่
go tool pprof -inuse_space heap.prof

# alloc_space — เคยจัดสรรไปทั้งหมดเท่าไหร่ (รวมที่คืนไปแล้ว)
go tool pprof -alloc_space heap.prof

inuse = memory leak or high baseline
alloc = จัดสรรเยอะ → GC ทำงานหนัก

ตัวอย่าง — Goroutine Profile

go tool pprof http://localhost:6060/debug/pprof/goroutine

(pprof) top
      flat  flat%   sum%        cum   cum%
       450 90.00% 90.00%        450 90.00%  runtime.gopark
        20  4.00% 94.00%         20  4.00%  net/http.(*conn).serve

ตีความ: มี 450 goroutines ติด gopark (กำลังรออะไรสักอย่าง) — อาจมี goroutine leak หรือ I/O slow

Graph — เห็นภาพ

(pprof) web

จะได้ graph แบบนี้ (ต้องมี graphviz):

         ┌─────────────┐
         │   main()    │
         │    0.05s    │
         └──────┬──────┘
                │
      ┌─────────┴──────────┐
      ▼                    ▼
┌───────────┐       ┌────────────┐
│ handleReq │       │  initDB()  │
│   0.20s   │       │   0.10s    │
└─────┬─────┘       └────────────┘
      │
      ▼
┌───────────┐
│json.Decode│  ← กล่องใหญ่สุด = ใช้เวลามากสุด
│   0.60s   │
└───────────┘

กล่องใหญ่ = ใช้เวลา/memory เยอะ → ควร optimise
เส้นหนา = เรียกบ่อย → อาจเรียกเกินจำเป็น

เทคนิค — Benchmark + Profile

อยากรู้ว่า function A vs B ใครเร็วกว่า — ใช้ benchmark + profile:

// xxx_test.go
func BenchmarkProcessOrder(b *testing.B) {
    for i := 0; i < b.N; i++ {
        ProcessOrder(testOrder)
    }
}

go test -bench=. -cpuprofile=cpu.prof -memprofile=mem.prof
go tool pprof cpu.prof

สรุป — 3 กฏง่าย ๆ

CPU profile: flat สูง → function นั้นช้า — optimise logic
Memory profile: flat สูง → จัดสรร memory เยอะ — ลด allocation
Goroutine profile: จำนวนเยอะ + โตไม่หยุด → leak — ใส่ context/timeout

อย่า optimise โดยไม่ profile — มันเหมือนขับรถโดยไม่ดูแผนที่ — คุณอาจจะไปถูกทาง แต่โอกาสหลงมีสูงกว่า

📚 อ่านต่อ:

Go Blog: Profiling Go Programs — official guide

pprof Documentation — all options

runtime/pprof Package Docs

Go Profiling 101 — ดูว่าโค้ดเราช้าตรงไหน โดยไม่ต้องเดา