Source file src/runtime/mgc.go
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Garbage collector (GC).
//
// The GC runs concurrently with mutator threads, is type accurate (aka precise), and allows multiple
// GC threads to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is
// non-generational and non-compacting. Allocation is done using size segregated per P allocation
// areas to minimize fragmentation while eliminating locks in the common case.
//
// The algorithm decomposes into several steps.
// This is a high level description of the algorithm being used. For an overview of GC a good
// place to start is Richard Jones' gchandbook.org.
//
// The algorithm's intellectual heritage includes Dijkstra's on-the-fly algorithm, see
// Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978.
// On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978),
// 966-975.
// For journal quality proofs that these steps are complete, correct, and terminate see
// Hudson, R., and Moss, J.E.B. Copying Garbage Collection without stopping the world.
// Concurrency and Computation: Practice and Experience 15(3-5), 2003.
//
// 1. GC performs sweep termination.
//
//    a. Stop the world. This causes all Ps to reach a GC safe-point.
//
//    b. Sweep any unswept spans. There will only be unswept spans if
//    this GC cycle was forced before the expected time.
//
// 2. GC performs the mark phase.
//
//    a. Prepare for the mark phase by setting gcphase to _GCmark
//    (from _GCoff), enabling the write barrier, enabling mutator
//    assists, and enqueueing root mark jobs. No objects may be
//    scanned until all Ps have enabled the write barrier, which is
//    accomplished using STW.
//
//    b. Start the world.
//    From this point, GC work is done by mark
//    workers started by the scheduler and by assists performed as
//    part of allocation. The write barrier shades both the
//    overwritten pointer and the new pointer value for any pointer
//    writes (see mbarrier.go for details). Newly allocated objects
//    are immediately marked black.
//
//    c. GC performs root marking jobs. This includes scanning all
//    stacks, shading all globals, and shading any heap pointers in
//    off-heap runtime data structures. Scanning a stack stops a
//    goroutine, shades any pointers found on its stack, and then
//    resumes the goroutine.
//
//    d. GC drains the work queue of grey objects, scanning each grey
//    object to black and shading all pointers found in the object
//    (which in turn may add those pointers to the work queue).
//
//    e. Because GC work is spread across local caches, GC uses a
//    distributed termination algorithm to detect when there are no
//    more root marking jobs or grey objects (see gcMarkDone). At this
//    point, GC transitions to mark termination.
//
// 3. GC performs mark termination.
//
//    a. Stop the world.
//
//    b. Set gcphase to _GCmarktermination, and disable workers and
//    assists.
//
//    c. Perform housekeeping like flushing mcaches.
//
// 4. GC performs the sweep phase.
//
//    a. Prepare for the sweep phase by setting gcphase to _GCoff,
//    setting up sweep state and disabling the write barrier.
//
//    b. Start the world. From this point on, newly allocated objects
//    are white, and allocating sweeps spans before use if necessary.
//
//    c. GC does concurrent sweeping in the background and in response
//    to allocation. See description below.
//
// 5. When sufficient allocation has taken place, replay the sequence
// starting with 1 above. See discussion of GC rate below.

// Concurrent sweep.
//
// The sweep phase proceeds concurrently with normal program execution.
// The heap is swept span-by-span both lazily (when a goroutine needs another span)
// and concurrently in a background goroutine (this helps programs that are not CPU bound).
// At the end of STW mark termination all spans are marked as "needs sweeping".
//
// The background sweeper goroutine simply sweeps spans one-by-one.
//
// To avoid requesting more OS memory while there are unswept spans, when a
// goroutine needs another span, it first attempts to reclaim that much memory
// by sweeping. When a goroutine needs to allocate a new small-object span, it
// sweeps small-object spans for the same object size until it frees at least
// one object. When a goroutine needs to allocate a large-object span from the heap,
// it sweeps spans until it frees at least that many pages into the heap. There is
// one case where this may not suffice: if a goroutine sweeps and frees two
// nonadjacent one-page spans to the heap, it will allocate a new two-page
// span, but there can still be other one-page unswept spans which could be
// combined into a two-page span.
//
// It's critical to ensure that no operations proceed on unswept spans (that would corrupt
// mark bits in GC bitmap). During GC all mcaches are flushed into the central cache,
// so they are empty. When a goroutine grabs a new span into mcache, it sweeps it.
// When a goroutine explicitly frees an object or sets a finalizer, it ensures that
// the span is swept (either by sweeping it, or by waiting for the concurrent sweep to finish).
// The finalizer goroutine is kicked off only when all spans are swept.
// When the next GC starts, it sweeps all not-yet-swept spans (if any).

// GC rate.
// Next GC is after we've allocated an extra amount of memory proportional to
// the amount already in use.
// The proportion is controlled by the GOGC environment variable
// (100 by default). If GOGC=100 and we're using 4M, we'll GC again when we get to 8M
// (this mark is computed by the gcController.heapGoal method). This keeps the GC cost in
// linear proportion to the allocation cost. Adjusting GOGC just changes the linear constant
// (and also the amount of extra memory used).

// Oblets
//
// In order to prevent long pauses while scanning large objects and to
// improve parallelism, the garbage collector breaks up scan jobs for
// objects larger than maxObletBytes into "oblets" of at most
// maxObletBytes. When scanning encounters the beginning of a large
// object, it scans only the first oblet and enqueues the remaining
// oblets as new scan jobs.

package runtime

import (
	"internal/cpu"
	"internal/goarch"
	"internal/goexperiment"
	"internal/runtime/atomic"
	"internal/runtime/gc"
	"unsafe"
)

const (
	_DebugGC = 0

	// concurrentSweep is a debug flag. Disabling this flag
	// ensures all spans are swept while the world is stopped.
	concurrentSweep = true

	// debugScanConservative enables debug logging for stack
	// frames that are scanned conservatively.
	debugScanConservative = false

	// sweepMinHeapDistance is a lower bound on the heap distance
	// (in bytes) reserved for concurrent sweeping between GC
	// cycles.
	sweepMinHeapDistance = 1024 * 1024
)

// heapObjectsCanMove always returns false in the current garbage collector.
// It exists for go4.org/unsafe/assume-no-moving-gc, which is an
// unfortunate idea that had an even more unfortunate implementation.
// Every time a new Go release happened, the package stopped building,
// and the authors had to add a new file with a new //go:build line, and
// then the entire ecosystem of packages with that as a dependency had to
// explicitly update to the new version. Many packages depend on
// assume-no-moving-gc transitively, through paths like
// inet.af/netaddr -> go4.org/intern -> assume-no-moving-gc.
// This was causing a significant amount of friction around each new
// release, so we added this bool for the package to //go:linkname
// instead. The bool is still unfortunate, but it's not as bad as
// breaking the ecosystem on every new release.
//
// If the Go garbage collector ever does move heap objects, we can set
// this to true to break all the programs using assume-no-moving-gc.
//
//go:linkname heapObjectsCanMove
func heapObjectsCanMove() bool {
	return false
}

func gcinit() {
	if unsafe.Sizeof(workbuf{}) != _WorkbufSize {
		throw("size of Workbuf is suboptimal")
	}
	// No sweep on the first cycle.
	sweep.active.state.Store(sweepDrainedMask)

	// Initialize GC pacer state.
	// Use the environment variable GOGC for the initial gcPercent value.
	// Use the environment variable GOMEMLIMIT for the initial memoryLimit value.
	gcController.init(readGOGC(), readGOMEMLIMIT())

	// Set up the cleanup block ptr mask.
	for i := range cleanupBlockPtrMask {
		cleanupBlockPtrMask[i] = 0xff
	}

	work.startSema = 1
	work.markDoneSema = 1
	work.spanSPMCs.list.init(unsafe.Offsetof(spanSPMC{}.allnode))
	lockInit(&work.sweepWaiters.lock, lockRankSweepWaiters)
	lockInit(&work.assistQueue.lock, lockRankAssistQueue)
	lockInit(&work.strongFromWeak.lock, lockRankStrongFromWeakQueue)
	lockInit(&work.wbufSpans.lock, lockRankWbufSpans)
	lockInit(&work.spanSPMCs.lock, lockRankSpanSPMCs)
	lockInit(&gcCleanups.lock, lockRankCleanupQueue)
}

// gcenable is called after the bulk of the runtime initialization,
// just before we're about to start letting user code run.
// It kicks off the background sweeper goroutine, the background
// scavenger goroutine, and enables GC.
func gcenable() {
	// Kick off sweeping and scavenging.
	c := make(chan int, 2)
	go bgsweep(c)
	go bgscavenge(c)
	<-c
	<-c
	memstats.enablegc = true // now that runtime is initialized, GC is okay
}

// Garbage collector phase.
// Indicates to the write barrier and synchronization which task to perform.
var gcphase uint32

// The compiler knows about this variable.
// If you change it, you must change builtin/runtime.go, too.
// If you change the first four bytes, you must also change the write
// barrier insertion code.
//
// writeBarrier should be an internal detail,
// but widely used packages access it using linkname.
// Notable members of the hall of shame include:
//   - github.com/bytedance/sonic
//
// Do not remove or change the type signature.
// See go.dev/issue/67401.
//
//go:linkname writeBarrier
var writeBarrier struct {
	enabled bool    // compiler emits a check of this before calling write barrier
	pad     [3]byte // compiler uses 32-bit load for "enabled" field
	alignme uint64  // guarantee alignment so that compiler can use a 32 or 64-bit load
}

// gcBlackenEnabled is 1 if mutator assists and background mark
// workers are allowed to blacken objects. This must only be set when
// gcphase == _GCmark.
var gcBlackenEnabled uint32

const (
	_GCoff             = iota // GC not running; sweeping in background, write barrier disabled
	_GCmark                   // GC marking roots and workbufs: allocate black, write barrier ENABLED
	_GCmarktermination        // GC mark termination: allocate black, P's help GC, write barrier ENABLED
)

//go:nosplit
func setGCPhase(x uint32) {
	atomic.Store(&gcphase, x)
	writeBarrier.enabled = gcphase == _GCmark || gcphase == _GCmarktermination
}

// gcMarkWorkerMode represents the mode that a concurrent mark worker
// should operate in.
//
// Concurrent marking happens through four different mechanisms. One
// is mutator assists, which happen in response to allocations and are
// not scheduled. The other three are variations in the per-P mark
// workers and are distinguished by gcMarkWorkerMode.
type gcMarkWorkerMode int

const (
	// gcMarkWorkerNotWorker indicates that the next scheduled G is not
	// starting work and the mode should be ignored.
	gcMarkWorkerNotWorker gcMarkWorkerMode = iota

	// gcMarkWorkerDedicatedMode indicates that the P of a mark
	// worker is dedicated to running that mark worker. The mark
	// worker should run without preemption.
	gcMarkWorkerDedicatedMode

	// gcMarkWorkerFractionalMode indicates that a P is currently
	// running the "fractional" mark worker.
	// The fractional worker
	// is necessary when GOMAXPROCS*gcBackgroundUtilization is not
	// an integer and using only dedicated workers would result in
	// utilization too far from the target of gcBackgroundUtilization.
	// The fractional worker should run until it is preempted and
	// will be scheduled to pick up the fractional part of
	// GOMAXPROCS*gcBackgroundUtilization.
	gcMarkWorkerFractionalMode

	// gcMarkWorkerIdleMode indicates that a P is running the mark
	// worker because it has nothing else to do. The idle worker
	// should run until it is preempted and account its time
	// against gcController.idleMarkTime.
	gcMarkWorkerIdleMode
)

// gcMarkWorkerModeStrings are the string labels of gcMarkWorkerModes
// to use in execution traces.
var gcMarkWorkerModeStrings = [...]string{
	"Not worker",
	"GC (dedicated)",
	"GC (fractional)",
	"GC (idle)",
}

// pollFractionalWorkerExit reports whether a fractional mark worker
// should self-preempt. It assumes it is called from the fractional
// worker.
func pollFractionalWorkerExit() bool {
	// This should be kept in sync with the fractional worker
	// scheduler logic in findRunnableGCWorker.
	now := nanotime()
	delta := now - gcController.markStartTime
	if delta <= 0 {
		return true
	}
	p := getg().m.p.ptr()
	selfTime := p.gcFractionalMarkTime.Load() + (now - p.gcMarkWorkerStartTime)
	// Add some slack to the utilization goal so that the
	// fractional worker isn't behind again the instant it exits.
	return float64(selfTime)/float64(delta) > 1.2*gcController.fractionalUtilizationGoal
}

var work workType

type workType struct {
	full  lfstack          // lock-free list of full blocks workbuf
	_     cpu.CacheLinePad // prevents false-sharing between full and empty
	empty lfstack          // lock-free list of empty blocks workbuf
	_     cpu.CacheLinePad // prevents false-sharing between empty and wbufSpans

	wbufSpans struct {
		lock mutex
		// free is a list of spans dedicated to workbufs, but
		// that don't currently contain any workbufs.
		free mSpanList
		// busy is a list of all spans containing workbufs on
		// one of the workbuf lists.
		busy mSpanList
	}
	_ cpu.CacheLinePad // prevents false-sharing between wbufSpans and spanqMask

	// spanqMask is a bitmap indicating which Ps have local work worth stealing.
	// Set or cleared by the owning P, cleared by stealing Ps.
	//
	// spanqMask is like a proxy for a global queue. An important invariant is that
	// forced flushing like gcw.dispose must set this bit on any P that has local
	// span work.
	spanqMask pMask
	_         cpu.CacheLinePad // prevents false-sharing between spanqMask and everything else

	// List of all spanSPMCs.
	//
	// Only used if goexperiment.GreenTeaGC.
	spanSPMCs struct {
		lock mutex
		list listHeadManual // *spanSPMC
	}

	// Restore 64-bit alignment on 32-bit.
	// _ uint32

	// bytesMarked is the number of bytes marked this cycle. This
	// includes bytes blackened in scanned objects, noscan objects
	// that go straight to black, objects allocated as black during
	// the cycle, and permagrey objects scanned by markroot during
	// the concurrent scan phase.
	//
	// This is updated atomically during the cycle. Updates may be batched
	// arbitrarily, since the value is only read at the end of the cycle.
	//
	// Because of benign races during marking, this number may not
	// be the exact number of marked bytes, but it should be very
	// close.
	//
	// Put this field here because it needs 64-bit atomic access
	// (and thus 8-byte alignment even on 32-bit architectures).
	bytesMarked uint64

	markrootNext atomic.Uint32 // next markroot job
	markrootJobs atomic.Uint32 // number of markroot jobs

	nproc  uint32
	tstart int64
	nwait  uint32

	// Number of roots of various root types. Set by gcPrepareMarkRoots.
	//
	// During normal GC cycle, nStackRoots == nMaybeRunnableStackRoots == len(stackRoots);
	// during goroutine leak detection, nMaybeRunnableStackRoots is the number of stackRoots
	// scheduled for marking.
	// In both variants, nStackRoots == len(stackRoots).
	nDataRoots, nBSSRoots, nSpanRoots, nStackRoots, nMaybeRunnableStackRoots int

	// The following fields monitor the GC phase of the current cycle during
	// goroutine leak detection.
	goroutineLeak struct {
		// Once set, it indicates that the GC will perform goroutine leak detection during
		// the next GC cycle; it is set by goroutineLeakGC and unset during gcStart.
		pending atomic.Bool
		// Once set, it indicates that the GC has started a goroutine leak detection run;
		// it is set during gcStart and unset during gcMarkTermination;
		//
		// Protected by STW.
		enabled bool
		// Once set, it indicates that the GC has performed goroutine leak detection during
		// the current GC cycle; it is set during gcMarkDone, right after goroutine leak detection,
		// and unset during gcMarkTermination;
		//
		// Protected by STW.
		done bool
		// The number of leaked goroutines during the last leak detection GC cycle.
		//
		// Write-protected by STW in findGoroutineLeaks.
		count int
	}

	// Base indexes of each root type. Set by gcPrepareMarkRoots.
	baseData, baseBSS, baseSpans, baseStacks, baseEnd uint32

	// stackRoots is a snapshot of all of the Gs that existed before the
	// beginning of concurrent marking. During goroutine leak detection, stackRoots
	// is partitioned into two sets; to the left of nMaybeRunnableStackRoots are stackRoots
	// of running / runnable goroutines and to the right of nMaybeRunnableStackRoots are
	// stackRoots of unmarked / not runnable goroutines
	// The stackRoots array is re-partitioned after each marking phase iteration.
	stackRoots []*g

	// Each type of GC state transition is protected by a lock.
	// Since multiple threads can simultaneously detect the state
	// transition condition, any thread that detects a transition
	// condition must acquire the appropriate transition lock,
	// re-check the transition condition and return if it no
	// longer holds or perform the transition if it does.
	// Likewise, any transition must invalidate the transition
	// condition before releasing the lock. This ensures that each
	// transition is performed by exactly one thread and threads
	// that need the transition to happen block until it has
	// happened.
	//
	// startSema protects the transition from "off" to mark or
	// mark termination.
	startSema uint32
	// markDoneSema protects transitions from mark to mark termination.
	markDoneSema uint32

	bgMarkDone uint32 // cas to 1 when at a background mark completion point
	// Background mark completion signaling

	// mode is the concurrency mode of the current GC cycle.
	mode gcMode

	// userForced indicates the current GC cycle was forced by an
	// explicit user call.
	userForced bool

	// initialHeapLive is the value of gcController.heapLive at the
	// beginning of this GC cycle.
	initialHeapLive uint64

	// assistQueue is a queue of assists that are blocked because
	// there was neither enough credit to steal nor enough work to
	// do.
	assistQueue struct {
		lock mutex
		q    gQueue
	}

	// sweepWaiters is a list of blocked goroutines to wake when
	// we transition from mark termination to sweep.
	sweepWaiters struct {
		lock mutex
		list gList
	}

	// strongFromWeak controls how the GC interacts with weak->strong
	// pointer conversions.
	strongFromWeak struct {
		// block is a flag set during mark termination that prevents
		// new weak->strong conversions from executing by blocking the
		// goroutine and enqueuing it onto q.
		//
		// Mutated only by one goroutine at a time in gcMarkDone,
		// with globally-synchronizing events like forEachP and
		// stopTheWorld.
		block bool

		// q is a queue of goroutines that attempted to perform a
		// weak->strong conversion during mark termination.
		//
		// Protected by lock.
		lock mutex
		q    gQueue
	}

	// cycles is the number of completed GC cycles, where a GC
	// cycle is sweep termination, mark, mark termination, and
	// sweep. This differs from memstats.numgc, which is
	// incremented at mark termination.
	cycles atomic.Uint32

	// Timing/utilization stats for this cycle.
	stwprocs, maxprocs                 int32
	tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start

	// pauseNS is the total STW time this cycle, measured as the time between
	// when stopping began (just before trying to stop Ps) and just after the
	// world started again.
	pauseNS int64

	// debug.gctrace heap sizes for this cycle.
	heap0, heap1, heap2 uint64

	// Cumulative estimated CPU usage.
	cpuStats
}

// GC runs a garbage collection and blocks the caller until the
// garbage collection is complete. It may also block the entire
// program.
func GC() {
	// We consider a cycle to be: sweep termination, mark, mark
	// termination, and sweep. This function shouldn't return
	// until a full cycle has been completed, from beginning to
	// end. Hence, we always want to finish up the current cycle
	// and start a new one. That means:
	//
	// 1. In sweep termination, mark, or mark termination of cycle
	// N, wait until mark termination N completes and transitions
	// to sweep N.
	//
	// 2. In sweep N, help with sweep N.
	//
	// At this point we can begin a full cycle N+1.
	//
	// 3. Trigger cycle N+1 by starting sweep termination N+1.
	//
	// 4. Wait for mark termination N+1 to complete.
	//
	// 5. Help with sweep N+1 until it's done.
	//
	// This all has to be written to deal with the fact that the
	// GC may move ahead on its own. For example, when we block
	// until mark termination N, we may wake up in cycle N+2.

	// Wait until the current sweep termination, mark, and mark
	// termination complete.
	n := work.cycles.Load()
	gcWaitOnMark(n)

	// We're now in sweep N or later. Trigger GC cycle N+1, which
	// will first finish sweep N if necessary and then enter sweep
	// termination N+1.
	gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})

	// Wait for mark termination N+1 to complete.
	gcWaitOnMark(n + 1)

	// Finish sweep N+1 before returning. We do this both to
	// complete the cycle and because runtime.GC() is often used
	// as part of tests and benchmarks to get the system into a
	// relatively stable and isolated state.
	for work.cycles.Load() == n+1 && sweepone() != ^uintptr(0) {
		Gosched()
	}

	// Callers may assume that the heap profile reflects the
	// just-completed cycle when this returns (historically this
	// happened because this was a STW GC), but right now the
	// profile still reflects mark termination N, not N+1.
	//
	// As soon as all of the sweep frees from cycle N+1 are done,
	// we can go ahead and publish the heap profile.
	//
	// First, wait for sweeping to finish. (We know there are no
	// more spans on the sweep queue, but we may be concurrently
	// sweeping spans, so we have to wait.)
	for work.cycles.Load() == n+1 && !isSweepDone() {
		Gosched()
	}

	// Now we're really done with sweeping, so we can publish the
	// stable heap profile. Only do this if we haven't already hit
	// another mark termination.
	mp := acquirem()
	cycle := work.cycles.Load()
	if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
		mProf_PostSweep()
	}
	releasem(mp)
}

// goroutineLeakGC runs a GC cycle that performs goroutine leak detection.
//
//go:linkname goroutineLeakGC runtime/pprof.runtime_goroutineLeakGC
func goroutineLeakGC() {
	// Set the pending flag to true, instructing the next GC cycle to
	// perform goroutine leak detection.
	work.goroutineLeak.pending.Store(true)

	// Spin GC cycles until the pending flag is unset.
	// This ensures that goroutineLeakGC waits for a GC cycle that
	// actually performs goroutine leak detection.
	//
	// This is needed in case multiple concurrent calls to GC
	// are simultaneously fired by the system, wherein some
	// of them are dropped.
	//
	// In the vast majority of cases, only one loop iteration is needed;
	// however, multiple concurrent calls to goroutineLeakGC could lead to
	// the execution of additional GC cycles.
	//
	// Examples:
	//
	//	pending? | G1                      | G2
	//	---------|-------------------------|-----------------------
	//	   -     | goroutineLeakGC()       | goroutineLeakGC()
	//	   -     | pending.Store(true)     | .
	//	   X     | for pending.Load()      | .
	//	   X     | GC()                    | .
	//	   X     | > gcStart()             | .
	//	   X     | pending.Store(false)    | .
	//	...
	//	   -     | > gcMarkDone()          | .
	//	   -     | .                       | pending.Store(true)
	//	...
	//	   X     | > gcMarkTermination()   | .
	//	   X     | ...
	//	   X     | < GC returns            | .
	//	   X     | for pending.Load        | .
	//	   X     | GC()                    | .
	//	   X     | .                       | for pending.Load()
	//	   X     | .                       | GC()
	//	...
	//	The first to pick up the pending flag will start a
	//	leak detection cycle.
	for work.goroutineLeak.pending.Load() {
		GC()
	}
}

// gcWaitOnMark blocks until GC finishes the Nth mark phase. If GC has
// already completed this mark phase, it returns immediately.
func gcWaitOnMark(n uint32) {
	for {
		// Disable phase transitions.
		lock(&work.sweepWaiters.lock)
		nMarks := work.cycles.Load()
		if gcphase != _GCmark {
			// We've already completed this cycle's mark.
			nMarks++
		}
		if nMarks > n {
			// We're done.
			unlock(&work.sweepWaiters.lock)
			return
		}

		// Wait until sweep termination, mark, and mark
		// termination of cycle N complete.
		work.sweepWaiters.list.push(getg())
		goparkunlock(&work.sweepWaiters.lock, waitReasonWaitForGCCycle, traceBlockUntilGCEnds, 1)
	}
}

// gcMode indicates how concurrent a GC cycle should be.
type gcMode int

const (
	gcBackgroundMode gcMode = iota // concurrent GC and sweep
	gcForceMode                    // stop-the-world GC now, concurrent sweep
	gcForceBlockMode               // stop-the-world GC now and STW sweep (forced by user)
)

// A gcTrigger is a predicate for starting a GC cycle. Specifically,
// it is an exit condition for the _GCoff phase.
type gcTrigger struct {
	kind gcTriggerKind
	now  int64  // gcTriggerTime: current time
	n    uint32 // gcTriggerCycle: cycle number to start
}

type gcTriggerKind int

const (
	// gcTriggerHeap indicates that a cycle should be started when
	// the heap size reaches the trigger heap size computed by the
	// controller.
	gcTriggerHeap gcTriggerKind = iota

	// gcTriggerTime indicates that a cycle should be started when
	// it's been more than forcegcperiod nanoseconds since the
	// previous GC cycle.
	gcTriggerTime

	// gcTriggerCycle indicates that a cycle should be started if
	// we have not yet started cycle number gcTrigger.n (relative
	// to work.cycles).
	gcTriggerCycle
)

// test reports whether the trigger condition is satisfied, meaning
// that the exit condition for the _GCoff phase has been met. The exit
// condition should be tested when allocating.
func (t gcTrigger) test() bool {
	if !memstats.enablegc || panicking.Load() != 0 || gcphase != _GCoff {
		return false
	}
	switch t.kind {
	case gcTriggerHeap:
		trigger, _ := gcController.trigger()
		return gcController.heapLive.Load() >= trigger
	case gcTriggerTime:
		if gcController.gcPercent.Load() < 0 {
			return false
		}
		lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))
		return lastgc != 0 && t.now-lastgc > forcegcperiod
	case gcTriggerCycle:
		// t.n > work.cycles, but accounting for wraparound.
		return int32(t.n-work.cycles.Load()) > 0
	}
	return true
}

// gcStart starts the GC. It transitions from _GCoff to _GCmark (if
// debug.gcstoptheworld == 0) or performs all of GC (if
// debug.gcstoptheworld != 0).
//
// This may return without performing this transition in some cases,
// such as when called on a system stack or with locks held.
func gcStart(trigger gcTrigger) {
	// Since this is called from malloc and malloc is called in
	// the guts of a number of libraries that might be holding
	// locks, don't attempt to start GC in non-preemptible or
	// potentially unstable situations.
	mp := acquirem()
	if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
		releasem(mp)
		return
	}
	releasem(mp)
	mp = nil

	if gp := getg(); gp.bubble != nil {
		// Disassociate the G from its synctest bubble while allocating.
		// This is less elegant than incrementing the group's active count,
		// but avoids any contamination between GC and synctest.
		bubble := gp.bubble
		gp.bubble = nil
		defer func() {
			gp.bubble = bubble
		}()
	}

	// Pick up the remaining unswept/not being swept spans concurrently
	//
	// This shouldn't happen if we're being invoked in background
	// mode since proportional sweep should have just finished
	// sweeping everything, but rounding errors, etc, may leave a
	// few spans unswept. In forced mode, this is necessary since
	// GC can be forced at any point in the sweeping cycle.
	//
	// We check the transition condition continuously here in case
	// this G gets delayed into the next GC cycle.
	for trigger.test() && sweepone() != ^uintptr(0) {
	}

	// Perform GC initialization and the sweep termination
	// transition.
	semacquire(&work.startSema)
	// Re-check transition condition under transition lock.
	if !trigger.test() {
		semrelease(&work.startSema)
		return
	}

	// In gcstoptheworld debug mode, upgrade the mode accordingly.
	// We do this after re-checking the transition condition so
	// that multiple goroutines that detect the heap trigger don't
	// start multiple STW GCs.
	mode := gcBackgroundMode
	if debug.gcstoptheworld == 1 {
		mode = gcForceMode
	} else if debug.gcstoptheworld == 2 {
		mode = gcForceBlockMode
	}

	// Ok, we're doing it! Stop everybody else
	semacquire(&gcsema)
	semacquire(&worldsema)

	// For stats, check if this GC was forced by the user.
	// Update it under gcsema to avoid gctrace getting wrong values.
	work.userForced = trigger.kind == gcTriggerCycle

	trace := traceAcquire()
	if trace.ok() {
		trace.GCStart()
		traceRelease(trace)
	}

	// Check and set up per-P state.
	for _, p := range allp {
		// Check that all Ps have finished deferred mcache flushes.
		if fg := p.mcache.flushGen.Load(); fg != mheap_.sweepgen {
			println("runtime: p", p.id, "flushGen", fg, "!= sweepgen", mheap_.sweepgen)
			throw("p mcache not flushed")
		}
		// Initialize ptrBuf if necessary.
		if goexperiment.GreenTeaGC && p.gcw.ptrBuf == nil {
			p.gcw.ptrBuf = (*[gc.PageSize / goarch.PtrSize]uintptr)(persistentalloc(gc.PageSize, goarch.PtrSize, &memstats.gcMiscSys))
		}
	}

	gcBgMarkStartWorkers()

	systemstack(gcResetMarkState)

	work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
	if work.stwprocs > numCPUStartup {
		// This is used to compute CPU time of the STW phases, so it
		// can't be more than the CPU count, even if GOMAXPROCS is.
		work.stwprocs = numCPUStartup
	}
	work.heap0 = gcController.heapLive.Load()
	work.pauseNS = 0
	work.mode = mode

	now := nanotime()
	work.tSweepTerm = now
	var stw worldStop
	systemstack(func() {
		stw = stopTheWorldWithSema(stwGCSweepTerm)
	})

	// Accumulate fine-grained stopping time.
	work.cpuStats.accumulateGCPauseTime(stw.stoppingCPUTime, 1)

	if goexperiment.RuntimeSecret {
		// The world is stopped, which means every M is either idle, blocked
		// in a syscall or this M that we are running on now.
		// The blocked Ms had any secret spill on their signal stacks erased
		// when they entered their respective states. Now we have to handle
		// this one.
		eraseSecretsSignalStk()
	}

	// Finish sweep before we start concurrent scan.
	systemstack(func() {
		finishsweep_m()
	})

	// clearpools before we start the GC. If we wait the memory will not be
	// reclaimed until the next GC cycle.
	clearpools()

	work.cycles.Add(1)

	// Assists and workers can start the moment we start
	// the world.
	gcController.startCycle(now, int(gomaxprocs), trigger)

	// Notify the CPU limiter that assists may begin.
	gcCPULimiter.startGCTransition(true, now)

	// In STW mode, disable scheduling of user Gs. This may also
	// disable scheduling of this goroutine, so it may block as
	// soon as we start the world again.
	if mode != gcBackgroundMode {
		schedEnableUser(false)
	}

	// If goroutine leak detection is pending, enable it for this GC cycle.
	if work.goroutineLeak.pending.Load() {
		work.goroutineLeak.enabled = true
		work.goroutineLeak.pending.Store(false)
		// Set all sync objects of blocked goroutines as untraceable
		// by the GC. Only set as traceable at the end of the GC cycle.
		setSyncObjectsUntraceable()
	}

	// Enter concurrent mark phase and enable
	// write barriers.
	//
	// Because the world is stopped, all Ps will
	// observe that write barriers are enabled by
	// the time we start the world and begin
	// scanning.
	//
	// Write barriers must be enabled before assists are
	// enabled because they must be enabled before
	// any non-leaf heap objects are marked. Since
	// allocations are blocked until assists can
	// happen, we want to enable assists as early as
	// possible.
	setGCPhase(_GCmark)

	gcBgMarkPrepare() // Must happen before assists are enabled.
	gcPrepareMarkRoots()

	// Mark all active tinyalloc blocks. Since we're
	// allocating from these, they need to be black like
	// other allocations. The alternative is to blacken
	// the tiny block on every allocation from it, which
	// would slow down the tiny allocator.
	gcMarkTinyAllocs()

	// At this point all Ps have enabled the write
	// barrier, thus maintaining the no white to
	// black invariant.
	// Enable mutator assists to
	// put back-pressure on fast allocating
	// mutators.
	atomic.Store(&gcBlackenEnabled, 1)

	// In STW mode, we could block the instant systemstack
	// returns, so make sure we're not preemptible.
	mp = acquirem()

	// Update the CPU stats pause time.
	//
	// Use maxprocs instead of stwprocs here because the total time
	// computed in the CPU stats is based on maxprocs, and we want them
	// to be comparable.
	work.cpuStats.accumulateGCPauseTime(nanotime()-stw.finishedStopping, work.maxprocs)

	// Concurrent mark.
	systemstack(func() {
		now = startTheWorldWithSema(0, stw)
		work.pauseNS += now - stw.startedStopping
		work.tMark = now

		// Release the CPU limiter.
		gcCPULimiter.finishGCTransition(now)
	})

	// Release the world sema before Gosched() in STW mode
	// because we will need to reacquire it later but before
	// this goroutine becomes runnable again, and we could
	// self-deadlock otherwise.
	semrelease(&worldsema)
	releasem(mp)

	// Make sure we block instead of returning to user code
	// in STW mode.
	if mode != gcBackgroundMode {
		Gosched()
	}

	semrelease(&work.startSema)
}

// gcMarkDoneFlushed counts the number of P's with flushed work.
//
// Ideally this would be a captured local in gcMarkDone, but forEachP
// escapes its callback closure, so it can't capture anything.
//
// This is protected by markDoneSema.
var gcMarkDoneFlushed uint32

// gcDebugMarkDone contains fields used to debug/test mark termination.
var gcDebugMarkDone struct {
	// spinAfterRaggedBarrier forces gcMarkDone to spin after it executes
	// the ragged barrier.
	spinAfterRaggedBarrier atomic.Bool

	// restartedDueTo27993 indicates that we restarted mark termination
	// due to the bug described in issue #27993.
	//
	// Protected by worldsema.
	restartedDueTo27993 bool
}

// gcMarkDone transitions the GC from mark to mark termination if all
// reachable objects have been marked (that is, there are no grey
// objects and can be no more in the future). Otherwise, it flushes
// all local work to the global queues where it can be discovered by
// other workers.
//
// All goroutines performing GC work must call gcBeginWork to signal
// that they're executing GC work. They must call gcEndWork when done.
// This should be called when all local mark work has been drained and
// there are no remaining workers. Specifically, when gcEndWork returns
// true.
//
// The calling context must be preemptible.
//
// Flushing local work is important because idle Ps may have local
// work queued. This is the only way to make that work visible and
// drive GC to completion.
//
// It is explicitly okay to have write barriers in this function. If
// it does transition to mark termination, then all reachable objects
// have been marked, so the write barrier cannot shade any more
// objects.
func gcMarkDone() {
	// Ensure only one thread is running the ragged barrier at a
	// time.
	semacquire(&work.markDoneSema)

top:
	// Re-check transition condition under transition lock.
	//
	// It's critical that this checks the global work queues are
	// empty before performing the ragged barrier. Otherwise,
	// there could be global work that a P could take after the P
	// has passed the ragged barrier.
	if !(gcphase == _GCmark && gcIsMarkDone()) {
		semrelease(&work.markDoneSema)
		return
	}

	// forEachP needs worldsema to execute, and we'll need it to
	// stop the world later, so acquire worldsema now.
	semacquire(&worldsema)

	// Prevent weak->strong conversions from generating additional
	// GC work.
	// forEachP will guarantee that it is observed globally.
	work.strongFromWeak.block = true

	// Flush all local buffers and collect flushedWork flags.
	gcMarkDoneFlushed = 0
	forEachP(waitReasonGCMarkTermination, func(pp *p) {
		// Flush the write barrier buffer, since this may add
		// work to the gcWork.
		wbBufFlush1(pp)

		// Flush the gcWork, since this may create global work
		// and set the flushedWork flag.
		//
		// TODO(austin): Break up these workbufs to
		// better distribute work.
		pp.gcw.dispose()

		// Collect the flushedWork flag.
		if pp.gcw.flushedWork {
			atomic.Xadd(&gcMarkDoneFlushed, 1)
			pp.gcw.flushedWork = false
		}
	})

	if gcMarkDoneFlushed != 0 {
		// More grey objects were discovered since the
		// previous termination check, so there may be more
		// work to do. Keep going. It's possible the
		// transition condition became true again during the
		// ragged barrier, so re-check it.
		semrelease(&worldsema)
		goto top
	}

	// For debugging/testing.
	for gcDebugMarkDone.spinAfterRaggedBarrier.Load() {
	}

	// There was no global work, no local work, and no Ps
	// communicated work since we took markDoneSema. Therefore
	// there are no grey objects and no more objects can be
	// shaded. Transition to mark termination.
	now := nanotime()
	work.tMarkTerm = now
	getg().m.preemptoff = "gcing"
	var stw worldStop
	systemstack(func() {
		stw = stopTheWorldWithSema(stwGCMarkTerm)
	})
	// The gcphase is _GCmark, it will transition to _GCmarktermination
	// below. The important thing is that the wb remains active until
	// all marking is complete. This includes writes made by the GC.

	// Accumulate fine-grained stopping time.
	work.cpuStats.accumulateGCPauseTime(stw.stoppingCPUTime, 1)

	// There is sometimes work left over when we enter mark termination due
	// to write barriers performed after the completion barrier above.
	// Detect this and resume concurrent mark. This is obviously
	// unfortunate.
	//
	// See issue #27993 for details.
	//
	// Switch to the system stack to call wbBufFlush1, though in this case
	// it doesn't matter because we're non-preemptible anyway.
	restart := false
	systemstack(func() {
		for _, p := range allp {
			wbBufFlush1(p)
			if !p.gcw.empty() {
				restart = true
				break
			}
		}
	})

	// Check whether we need to resume the marking phase because of issue #27993
	// or because of goroutine leak detection.
	if restart || (work.goroutineLeak.enabled && !work.goroutineLeak.done) {
		if restart {
			// Restart because of issue #27993.
			gcDebugMarkDone.restartedDueTo27993 = true
		} else {
			// Marking has reached a fixed-point. Attempt to detect goroutine leaks.
			//
			// If the returned value is true, then detection already concluded for this cycle.
			// Otherwise, more runnable goroutines were discovered, requiring additional mark work.
			work.goroutineLeak.done = findGoroutineLeaks()
		}

		getg().m.preemptoff = ""
		systemstack(func() {
			// Accumulate the time we were stopped before we had to start again.
			work.cpuStats.accumulateGCPauseTime(nanotime()-stw.finishedStopping, work.maxprocs)

			// Start the world again.
			now := startTheWorldWithSema(0, stw)
			work.pauseNS += now - stw.startedStopping
		})
		semrelease(&worldsema)
		goto top
	}

	gcComputeStartingStackSize()

	// Disable assists and background workers. We must do
	// this before waking blocked assists.
	atomic.Store(&gcBlackenEnabled, 0)

	// Notify the CPU limiter that GC assists will now cease.
	gcCPULimiter.startGCTransition(false, now)

	// Wake all blocked assists. These will run when we
	// start the world again.
	gcWakeAllAssists()

	// Wake all blocked weak->strong conversions. These will run
	// when we start the world again.
	work.strongFromWeak.block = false
	gcWakeAllStrongFromWeak()

	// Likewise, release the transition lock. Blocked
	// workers and assists will run when we start the
	// world again.
	semrelease(&work.markDoneSema)

	// In STW mode, re-enable user goroutines. These will be
	// queued to run after we start the world.
	schedEnableUser(true)

	// endCycle depends on all gcWork cache stats being flushed.
	// The termination algorithm above ensured that up to
	// allocations since the ragged barrier.
	gcController.endCycle(now, int(gomaxprocs))

	// Perform mark termination. This will restart the world.
	gcMarkTermination(stw)
}

// isMaybeRunnable checks whether a goroutine may still be semantically runnable.
// For goroutines which are semantically runnable, this will eventually return true
// as the GC marking phase progresses. It returns false for leaked goroutines, or for
// goroutines which are not yet computed as possibly runnable by the GC.
func (gp *g) isMaybeRunnable() bool {
	// Check whether the goroutine is actually in a waiting state first.
	if readgstatus(gp) != _Gwaiting {
		// If the goroutine is not waiting, then clearly it is maybe runnable.
		return true
	}

	switch gp.waitreason {
	case waitReasonSelectNoCases,
		waitReasonChanSendNilChan,
		waitReasonChanReceiveNilChan:
		// Select with no cases or communicating on nil channels
		// make goroutines unrunnable by definition.
		return false
	case waitReasonChanReceive,
		waitReasonSelect,
		waitReasonChanSend:
		// Cycle through all *sudog to check whether
		// the goroutine is waiting on a marked channel.
		for sg := gp.waiting; sg != nil; sg = sg.waitlink {
			if isMarkedOrNotInHeap(unsafe.Pointer(sg.c.get())) {
				return true
			}
		}
		return false
	case waitReasonSyncCondWait,
		waitReasonSyncWaitGroupWait,
		waitReasonSyncMutexLock,
		waitReasonSyncRWMutexLock,
		waitReasonSyncRWMutexRLock:
		// If waiting on mutexes, wait groups, or condition variables,
		// check if the synchronization primitive attached to the sudog is marked.
		if gp.waiting != nil {
			return isMarkedOrNotInHeap(gp.waiting.elem.get())
		}
	}
	return true
}

// findMaybeRunnableGoroutines checks to see if more blocked but maybe-runnable goroutines exist.
// If so, it adds them into root set and increments work.markrootJobs accordingly.
// Returns true if we need to run another phase of markroots; returns false otherwise.
func findMaybeRunnableGoroutines() (moreWork bool) {
	oldRootJobs := work.markrootJobs.Load()

	// To begin with we have a set of unchecked stackRoots between
	// vIndex and ivIndex. During the loop, anything < vIndex should be
	// valid stackRoots and anything >= ivIndex should be invalid stackRoots.
	// The loop terminates when the two indices meet.
	var vIndex, ivIndex int = work.nMaybeRunnableStackRoots, work.nStackRoots
	// Reorder goroutine list
	for vIndex < ivIndex {
		if work.stackRoots[vIndex].isMaybeRunnable() {
			vIndex = vIndex + 1
			continue
		}
		for ivIndex = ivIndex - 1; ivIndex != vIndex; ivIndex = ivIndex - 1 {
			if gp := work.stackRoots[ivIndex]; gp.isMaybeRunnable() {
				work.stackRoots[ivIndex] = work.stackRoots[vIndex]
				work.stackRoots[vIndex] = gp
				vIndex = vIndex + 1
				break
			}
		}
	}

	newRootJobs := work.baseStacks + uint32(vIndex)
	if newRootJobs > oldRootJobs {
		work.nMaybeRunnableStackRoots = vIndex
		work.markrootJobs.Store(newRootJobs)
	}
	return newRootJobs > oldRootJobs
}

// setSyncObjectsUntraceable scans allgs and sets the elem and c fields of all sudogs to
// an untraceable pointer. This prevents the GC from marking these objects as live in memory
// by following these pointers when running deadlock detection.
func setSyncObjectsUntraceable() {
	assertWorldStopped()

	forEachGRace(func(gp *g) {
		// Set as untraceable all synchronization objects of goroutines
		// blocked at concurrency operations that could leak.
		switch {
		case gp.waitreason.isSyncWait():
			// Synchronization primitives are reachable from the *sudog
			// via the elem field.
			for sg := gp.waiting; sg != nil; sg = sg.waitlink {
				sg.elem.setUntraceable()
			}
		case gp.waitreason.isChanWait():
			// Channels and select statements are reachable from the *sudog via the c field.
			for sg := gp.waiting; sg != nil; sg = sg.waitlink {
				sg.c.setUntraceable()
			}
		}
	})
}

// gcRestoreSyncObjects restores the elem and c fields of all sudogs to their original values.
// Should be invoked after the goroutine leak detection phase.
func gcRestoreSyncObjects() {
	assertWorldStopped()

	forEachGRace(func(gp *g) {
		for sg := gp.waiting; sg != nil; sg = sg.waitlink {
			sg.elem.setTraceable()
			sg.c.setTraceable()
		}
	})
}

// findGoroutineLeaks scans the remaining stackRoots and marks any which are
// blocked over exclusively unreachable concurrency primitives as leaked (deadlocked).
// Returns true if the goroutine leak check was performed (or unnecessary).
// Returns false if the GC cycle has not yet computed all maybe-runnable goroutines.
func findGoroutineLeaks() bool {
	assertWorldStopped()

	// Report goroutine leaks and mark them unreachable, and resume marking
	// we still need to mark these unreachable *g structs as they
	// get reused, but their stack won't get scanned
	if work.nMaybeRunnableStackRoots == work.nStackRoots {
		// nMaybeRunnableStackRoots == nStackRoots means that all goroutines are marked.
		return true
	}

	// Check whether any more maybe-runnable goroutines can be found by the GC.
	if findMaybeRunnableGoroutines() {
		// We found more work, so we need to resume the marking phase.
		return false
	}

	// For the remaining goroutines, mark them as unreachable and leaked.
	work.goroutineLeak.count = work.nStackRoots - work.nMaybeRunnableStackRoots

	for i := work.nMaybeRunnableStackRoots; i < work.nStackRoots; i++ {
		gp := work.stackRoots[i]
		casgstatus(gp, _Gwaiting, _Gleaked)

		// Add the primitives causing the goroutine leaks
		// to the GC work queue, to ensure they are marked.
		//
		// NOTE(vsaioc): these primitives should also be reachable
		// from the goroutine's stack, but let's play it safe.
		switch {
		case gp.waitreason.isChanWait():
			for sg := gp.waiting; sg != nil; sg = sg.waitlink {
				shade(sg.c.uintptr())
			}
		case gp.waitreason.isSyncWait():
			for sg := gp.waiting; sg != nil; sg = sg.waitlink {
				shade(sg.elem.uintptr())
			}
		}
	}

	// Do not report the main goroutine if it is waiting on select{}.
	//
	// NOTE: We still treat the main goroutine as leaked during the analysis,
	// but revert its status to _Gwaiting after the analysis to not include
	// it in the goroutine leak profile.
	// This preserves the effectiveness of goroutine leak detection
	// if the main goroutine holds references to concurrency primitives causing
	// other leaks.
	//
	// Example:
	//
	// ```go
	// func main() {
	//	ch := make(chan int)
	//	go func() {
	//		...
	//		<-ch // Leaks
	//	}()
	//
	//	select {}
	// }
	// ```
	//
	// The main goroutine is blocked by select{}, but holds a reference to "ch".
	// Not treating the main goroutine as leaked would cause the analysis to
	// miss the legitimate leak at the child goroutine.
	//
	// The main goroutine should always be allgs[0], but double check
	// in case that invariant changes in the future.
	if gp0 := allgs[0]; gp0.goid == 1 && gp0.waitreason == waitReasonSelectNoCases {
		casgstatus(gp0, _Gleaked, _Gwaiting)
	}

	// Put the remaining roots as ready for marking and drain them.
	work.markrootJobs.Add(int32(work.nStackRoots - work.nMaybeRunnableStackRoots))
	work.nMaybeRunnableStackRoots = work.nStackRoots
	return true
}

// World must be stopped and mark assists and background workers must be
// disabled.
func gcMarkTermination(stw worldStop) {
	// Start marktermination (write barrier remains enabled for now).
	setGCPhase(_GCmarktermination)

	work.heap1 = gcController.heapLive.Load()
	startTime := nanotime()

	mp := acquirem()
	mp.preemptoff = "gcing"
	mp.traceback = 2
	curgp := mp.curg
	// N.B. The execution tracer is not aware of this status
	// transition and handles it specially based on the
	// wait reason.
	casGToWaitingForSuspendG(curgp, _Grunning, waitReasonGarbageCollection)

	// Run gc on the g0 stack. We do this so that the g stack
	// we're currently running on will no longer change. Cuts
	// the root set down a bit (g0 stacks are not scanned, and
	// we don't need to scan gc's internal state). We also
	// need to switch to g0 so we can shrink the stack.
	systemstack(func() {
		gcMark(startTime)
		// Must return immediately.
		// The outer function's stack may have moved
		// during gcMark (it shrinks stacks, including the
		// outer function's stack), so we must not refer
		// to any of its variables. Return back to the
		// non-system stack to pick up the new addresses
		// before continuing.
	})

	var stwSwept bool
	systemstack(func() {
		work.heap2 = work.bytesMarked
		if debug.gccheckmark > 0 {
			runCheckmark(func(_ *gcWork) { gcPrepareMarkRoots() })
		}
		if debug.checkfinalizers > 0 {
			checkFinalizersAndCleanups()
		}

		// marking is complete so we can turn the write barrier off
		setGCPhase(_GCoff)
		stwSwept = gcSweep(work.mode)
	})

	mp.traceback = 0
	casgstatus(curgp, _Gwaiting, _Grunning)

	trace := traceAcquire()
	if trace.ok() {
		trace.GCDone()
		traceRelease(trace)
	}

	// all done
	mp.preemptoff = ""

	if gcphase != _GCoff {
		throw("gc done but gcphase != _GCoff")
	}

	// Record heapInUse for scavenger.
	memstats.lastHeapInUse = gcController.heapInUse.load()

	// Update GC trigger and pacing, as well as downstream consumers
	// of this pacing information, for the next cycle.
	systemstack(gcControllerCommit)

	// Update timing memstats
	now := nanotime()
	sec, nsec, _ := time_now()
	unixNow := sec*1e9 + int64(nsec)
	work.pauseNS += now - stw.startedStopping
	work.tEnd = now
	atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user
	atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us
	memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)
	memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)
	memstats.pause_total_ns += uint64(work.pauseNS)

	// Accumulate CPU stats.
	//
	// Use maxprocs instead of stwprocs for GC pause time because the total time
	// computed in the CPU stats is based on maxprocs, and we want them to be
	// comparable.
	//
	// Pass gcMarkPhase=true to accumulate so we can get all the latest GC CPU stats
	// in there too.
	work.cpuStats.accumulateGCPauseTime(now-stw.finishedStopping, work.maxprocs)
	work.cpuStats.accumulate(now, true)

	// Compute overall GC CPU utilization.
	// Omit idle marking time from the overall utilization here since it's "free".
	memstats.gc_cpu_fraction = float64(work.cpuStats.GCTotalTime-work.cpuStats.GCIdleTime) / float64(work.cpuStats.TotalTime)

	// Reset assist time and background time stats.
	//
	// Do this now, instead of at the start of the next GC cycle, because
	// these two may keep accumulating even if the GC is not active.
	scavenge.assistTime.Store(0)
	scavenge.backgroundTime.Store(0)

	// Reset idle time stat.
	sched.idleTime.Store(0)

	if work.userForced {
		memstats.numforcedgc++
	}

	// Bump GC cycle count and wake goroutines waiting on sweep.
	lock(&work.sweepWaiters.lock)
	memstats.numgc++
	injectglist(&work.sweepWaiters.list)
	unlock(&work.sweepWaiters.lock)

	// Increment the scavenge generation now.
	//
	// This moment represents peak heap in use because we're
	// about to start sweeping.
	mheap_.pages.scav.index.nextGen()

	// Release the CPU limiter.
	gcCPULimiter.finishGCTransition(now)

	// Finish the current heap profiling cycle and start a new
	// heap profiling cycle. We do this before starting the world
	// so events don't leak into the wrong cycle.
	mProf_NextCycle()

	// There may be stale spans in mcaches that need to be swept.
	// Those aren't tracked in any sweep lists, so we need to
	// count them against sweep completion until we ensure all
	// those spans have been forced out.
	//
	// If gcSweep fully swept the heap (for example if the sweep
	// is not concurrent due to a GODEBUG setting), then we expect
	// the sweepLocker to be invalid, since sweeping is done.
	//
	// N.B. Below we might duplicate some work from gcSweep; this is
	// fine as all that work is idempotent within a GC cycle, and
	// we're still holding worldsema so a new cycle can't start.
	sl := sweep.active.begin()
	if !stwSwept && !sl.valid {
		throw("failed to set sweep barrier")
	} else if stwSwept && sl.valid {
		throw("non-concurrent sweep failed to drain all sweep queues")
	}

	if work.goroutineLeak.enabled {
		// Restore the elem and c fields of all sudogs to their original values.
		gcRestoreSyncObjects()
	}

	var goroutineLeakDone bool
	systemstack(func() {
		// Pull the GC out of goroutine leak detection mode.
		work.goroutineLeak.enabled = false
		goroutineLeakDone = work.goroutineLeak.done
		work.goroutineLeak.done = false

		// The memstats updated above must be updated with the world
		// stopped to ensure consistency of some values, such as
		// sched.idleTime and sched.totaltime. memstats also include
		// the pause time (work.pauseNS), forcing computation of the
		// total pause time before the pause actually ends.
		//
		// Here we reuse the same now for starting the world so that the
		// time added to /sched/pauses/total/gc:seconds will be
		// consistent with the value in memstats.
		startTheWorldWithSema(now, stw)
	})

	// Flush the heap profile so we can start a new cycle next GC.
	// This is relatively expensive, so we don't do it with the
	// world stopped.
	mProf_Flush()

	// Prepare workbufs for freeing by the sweeper. We do this
	// asynchronously because it can take non-trivial time.
	prepareFreeWorkbufs()

	// Free stack spans. This must be done between GC cycles.
	systemstack(freeStackSpans)

	// Ensure all mcaches are flushed. Each P will flush its own
	// mcache before allocating, but idle Ps may not. Since this
	// is necessary to sweep all spans, we need to ensure all
	// mcaches are flushed before we start the next GC cycle.
	//
	// While we're here, flush the page cache for idle Ps to avoid
	// having pages get stuck on them. These pages are hidden from
	// the scavenger, so in small idle heaps a significant amount
	// of additional memory might be held onto.
	//
	// Also, flush the pinner cache, to avoid leaking that memory
	// indefinitely.
	if debug.gctrace > 1 {
		clear(memstats.lastScanStats[:])
	}
	forEachP(waitReasonFlushProcCaches, func(pp *p) {
		pp.mcache.prepareForSweep()
		if pp.status == _Pidle {
			systemstack(func() {
				lock(&mheap_.lock)
				pp.pcache.flush(&mheap_.pages)
				unlock(&mheap_.lock)
			})
		}
		if debug.gctrace > 1 {
			pp.gcw.flushScanStats(&memstats.lastScanStats)
		}
		pp.pinnerCache = nil
	})
	if sl.valid {
		// Now that we've swept stale spans in mcaches, they don't
		// count against unswept spans.
		//
		// Note: this sweepLocker may not be valid if sweeping had
		// already completed during the STW. See the corresponding
		// begin() call that produced sl.
		sweep.active.end(sl)
	}

	// Print gctrace before dropping worldsema. As soon as we drop
	// worldsema another cycle could start and smash the stats
	// we're trying to print.
	if debug.gctrace > 0 {
		util := int(memstats.gc_cpu_fraction * 100)

		var sbuf [24]byte
		printlock()
		print("gc ", memstats.numgc,
			" @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",
			util, "%")
		if goroutineLeakDone {
			print(" (checking for goroutine leaks)")
		}
		print(": ")
		prev := work.tSweepTerm
		for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {
			if i != 0 {
				print("+")
			}
			print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))
			prev = ns
		}
		print(" ms clock, ")
		for i, ns := range []int64{
			int64(work.stwprocs) * (work.tMark - work.tSweepTerm),
			gcController.assistTime.Load(),
			gcController.dedicatedMarkTime.Load() + gcController.fractionalMarkTime.Load(),
			gcController.idleMarkTime.Load(),
			int64(work.stwprocs) * (work.tEnd - work.tMarkTerm),
		} {
			if i == 2 || i == 3 {
				// Separate mark time components with /.
				print("/")
			} else if i != 0 {
				print("+")
			}
			print(string(fmtNSAsMS(sbuf[:], uint64(ns))))
		}
		print(" ms cpu, ",
			work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",
			gcController.lastHeapGoal>>20, " MB goal, ",
			gcController.lastStackScan.Load()>>20, " MB stacks, ",
			gcController.globalsScan.Load()>>20, " MB globals, ",
			work.maxprocs, " P")
		if work.userForced {
			print(" (forced)")
		}
		print("\n")

		if debug.gctrace > 1 {
			dumpScanStats()
		}
		printunlock()
	}

	// Print finalizer/cleanup queue length. Like gctrace, do this before the next GC starts.
	// The fact that the next GC might start is not that problematic here, but acts as a convenient
	// lock on printing this information (so it cannot overlap with itself from the next GC cycle).
	if debug.checkfinalizers > 0 {
		fq, fe := finReadQueueStats()
		fn := max(int64(fq)-int64(fe), 0)

		cq, ce := gcCleanups.readQueueStats()
		cn := max(int64(cq)-int64(ce), 0)

		println("checkfinalizers: queue:", fn, "finalizers +", cn, "cleanups")
	}

	// Set any arena chunks that were deferred to fault.
	lock(&userArenaState.lock)
	faultList := userArenaState.fault
	userArenaState.fault = nil
	unlock(&userArenaState.lock)
	for _, lc := range faultList {
		lc.mspan.setUserArenaChunkToFault()
	}

	// Enable huge pages on some metadata if we cross a heap threshold.
	if gcController.heapGoal() > minHeapForMetadataHugePages {
		systemstack(func() {
			mheap_.enableMetadataHugePages()
		})
	}

	semrelease(&worldsema)
	semrelease(&gcsema)
	// Careful: another GC cycle may start now.

	releasem(mp)
	mp = nil

	// now that gc is done, kick off finalizer thread if needed
	if !concurrentSweep {
		// give the queued finalizers, if any, a chance to run
		Gosched()
	}
}

// gcBgMarkStartWorkers prepares background mark worker goroutines. These
// goroutines will not run until the mark phase, but they must be started while
// the world is not stopped and from a regular G stack. The caller must hold
// worldsema.
func gcBgMarkStartWorkers() {
	// Background marking is performed by per-P G's. Ensure that each P has
	// a background GC G.
	//
	// Worker Gs don't exit if gomaxprocs is reduced. If it is raised
	// again, we can reuse the old workers; no need to create new workers.
	if gcBgMarkWorkerCount >= gomaxprocs {
		return
	}

	// Increment mp.locks when allocating. We are called within gcStart,
	// and thus must not trigger another gcStart via an allocation. gcStart
	// bails when allocating with locks held, so simulate that for these
	// allocations.
	//
	// TODO(prattmic): cleanup gcStart to use a more explicit "in gcStart"
	// check for bailing.
	mp := acquirem()
	ready := make(chan struct{}, 1)
	releasem(mp)

	for gcBgMarkWorkerCount < gomaxprocs {
		mp := acquirem() // See above, we allocate a closure here.
		go gcBgMarkWorker(ready)
		releasem(mp)

		// N.B. we intentionally wait on each goroutine individually
		// rather than starting all in a batch and then waiting once
		// afterwards. By running one goroutine at a time, we can take
		// advantage of runnext to bounce back and forth between
		// workers and this goroutine. In an overloaded application,
		// this can reduce GC start latency by prioritizing these
		// goroutines rather than waiting on the end of the run queue.
		<-ready
		// The worker is now guaranteed to be added to the pool before
		// its P's next findRunnableGCWorker.

		gcBgMarkWorkerCount++
	}
}

// gcBgMarkPrepare sets up state for background marking.
// Mutator assists must not yet be enabled.
func gcBgMarkPrepare() {
	// Background marking will stop when the work queues are empty
	// and there are no more workers (note that, since this is
	// concurrent, this may be a transient state, but mark
	// termination will clean it up). Between background workers
	// and assists, we don't really know how many workers there
	// will be, so we pretend to have an arbitrarily large number
	// of workers, almost all of which are "waiting". While a
	// worker is working it decrements nwait. If nproc == nwait,
	// there are no workers.
	work.nproc = ^uint32(0)
	work.nwait = ^uint32(0)
}

// gcBgMarkWorkerNode is an entry in the gcBgMarkWorkerPool. It points to a single
// gcBgMarkWorker goroutine.
type gcBgMarkWorkerNode struct {
	// Unused workers are managed in a lock-free stack. This field must be first.
	node lfnode

	// The g of this worker.
	gp guintptr

	// Release this m on park. This is used to communicate with the unlock
	// function, which cannot access the G's stack. It is unused outside of
	// gcBgMarkWorker().
	m muintptr
}
type gcBgMarkWorkerNodePadded struct {
	gcBgMarkWorkerNode
	pad [tagAlign - unsafe.Sizeof(gcBgMarkWorkerNode{}) - gcBgMarkWorkerNodeRedZoneSize]byte
}

const gcBgMarkWorkerNodeRedZoneSize = (16 << 2) * asanenabledBit // redZoneSize(512)

func gcBgMarkWorker(ready chan struct{}) {
	gp := getg()

	// We pass node to a gopark unlock function, so it can't be on
	// the stack (see gopark). Prevent deadlock from recursively
	// starting GC by disabling preemption.
	gp.m.preemptoff = "GC worker init"
	// TODO: This is technically not allowed in the heap. See comment in tagptr.go.
	//
	// It is kept alive simply by virtue of being used in the infinite loop
	// below. gcBgMarkWorkerPool keeps pointers to nodes that are not
	// GC-visible, so this must be kept alive indefinitely (even if
	// GOMAXPROCS decreases).
	node := &new(gcBgMarkWorkerNodePadded).gcBgMarkWorkerNode
	gp.m.preemptoff = ""

	node.gp.set(gp)

	node.m.set(acquirem())

	ready <- struct{}{}
	// After this point, the background mark worker is generally scheduled
	// cooperatively by gcController.findRunnableGCWorker. While performing
	// work on the P, preemption is disabled because we are working on
	// P-local work buffers. When the preempt flag is set, this puts itself
	// into _Gwaiting to be woken up by gcController.findRunnableGCWorker
	// at the appropriate time.
	//
	// When preemption is enabled (e.g., while in gcMarkDone), this worker
	// may be preempted and scheduled as a _Grunnable G from a runq. That is
	// fine; it will eventually gopark again for further scheduling via
	// findRunnableGCWorker.
	//
	// Since we disable preemption before notifying ready, we guarantee that
	// this G will be in the worker pool for the next findRunnableGCWorker.
	// This isn't strictly necessary, but it reduces latency between
	// _GCmark starting and the workers starting.

	for {
		// Go to sleep until woken by
		// gcController.findRunnableGCWorker.
		gopark(func(g *g, nodep unsafe.Pointer) bool {
			node := (*gcBgMarkWorkerNode)(nodep)

			if mp := node.m.ptr(); mp != nil {
				// The worker G is no longer running; release
				// the M.
				//
				// N.B. it is _safe_ to release the M as soon
				// as we are no longer performing P-local mark
				// work.
				//
				// However, since we cooperatively stop work
				// when gp.preempt is set, if we releasem in
				// the loop then the following call to gopark
				// would immediately preempt the G. This is
				// also safe, but inefficient: the G must
				// schedule again only to enter gopark and park
				// again. Thus, we defer the release until
				// after parking the G.
				releasem(mp)
			}

			// Release this G to the pool.
			gcBgMarkWorkerPool.push(&node.node)
			// Note that at this point, the G may immediately be
			// rescheduled and may be running.
			return true
		}, unsafe.Pointer(node), waitReasonGCWorkerIdle, traceBlockSystemGoroutine, 0)

		// Preemption must not occur here, or another G might see
		// p.gcMarkWorkerMode.

		// Disable preemption so we can use the gcw. If the
		// scheduler wants to preempt us, we'll stop draining,
		// dispose the gcw, and then preempt.
		node.m.set(acquirem())
		pp := gp.m.p.ptr() // P can't change with preemption disabled.

		if gcBlackenEnabled == 0 {
			println("worker mode", pp.gcMarkWorkerMode)
			throw("gcBgMarkWorker: blackening not enabled")
		}

		if pp.gcMarkWorkerMode == gcMarkWorkerNotWorker {
			throw("gcBgMarkWorker: mode not set")
		}

		startTime := nanotime()
		pp.gcMarkWorkerStartTime = startTime
		var trackLimiterEvent bool
		if pp.gcMarkWorkerMode == gcMarkWorkerIdleMode {
			trackLimiterEvent = pp.limiterEvent.start(limiterEventIdleMarkWork, startTime)
		}

		gcBeginWork()

		systemstack(func() {
			// Mark our goroutine preemptible so its stack can be scanned or observed
			// by the execution tracer. This, for example, lets two mark workers scan
			// each other (otherwise, they would deadlock).
			//
			// casGToWaitingForSuspendG marks the goroutine as ineligible for a
			// stack shrink, effectively pinning the stack in memory for the duration.
			//
			// N.B. The execution tracer is not aware of this status transition and
			// handles it specially based on the wait reason.
			casGToWaitingForSuspendG(gp, _Grunning, waitReasonGCWorkerActive)
			switch pp.gcMarkWorkerMode {
			default:
				throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
			case gcMarkWorkerDedicatedMode:
				gcDrainMarkWorkerDedicated(&pp.gcw, true)
				if gp.preempt {
					// We were preempted. This is
					// a useful signal to kick
					// everything out of the run
					// queue so it can run
					// somewhere else.
					if drainQ := runqdrain(pp); !drainQ.empty() {
						lock(&sched.lock)
						globrunqputbatch(&drainQ)
						unlock(&sched.lock)
					}
				}
				// Go back to draining, this time
				// without preemption.
				gcDrainMarkWorkerDedicated(&pp.gcw, false)
			case gcMarkWorkerFractionalMode:
				gcDrainMarkWorkerFractional(&pp.gcw)
			case gcMarkWorkerIdleMode:
				gcDrainMarkWorkerIdle(&pp.gcw)
			}
			casgstatus(gp, _Gwaiting, _Grunning)
		})

		// Account for time and mark us as stopped.
		now := nanotime()
		duration := now - startTime
		gcController.markWorkerStop(pp.gcMarkWorkerMode, duration)
		if trackLimiterEvent {
			pp.limiterEvent.stop(limiterEventIdleMarkWork, now)
		}
		if pp.gcMarkWorkerMode == gcMarkWorkerFractionalMode {
			pp.gcFractionalMarkTime.Add(duration)
		}

		// We'll releasem after this point and thus this P may run
		// something else. We must clear the worker mode to avoid
		// attributing the mode to a different (non-worker) G in
		// tracev2.GoStart.
		pp.gcMarkWorkerMode = gcMarkWorkerNotWorker

		// If this worker reached a background mark completion
		// point, signal the main GC goroutine.
		if gcEndWork() {
			// We don't need the P-local buffers here, allow
			// preemption because we may schedule like a regular
			// goroutine in gcMarkDone (block on locks, etc).
			releasem(node.m.ptr())
			node.m.set(nil)

			gcMarkDone()
		}
	}
}

// gcShouldScheduleWorker reports whether executing a mark worker
// on p is potentially useful. p may be nil.
func gcShouldScheduleWorker(p *p) bool {
	if p != nil && !p.gcw.empty() {
		return true
	}
	return gcMarkWorkAvailable()
}

// gcIsMarkDone reports whether the mark phase is (probably) done.
func gcIsMarkDone() bool {
	return work.nwait == work.nproc && !gcMarkWorkAvailable()
}

// gcBeginWork signals to the garbage collector that a new worker is
// about to process GC work.
func gcBeginWork() {
	decnwait := atomic.Xadd(&work.nwait, -1)
	if decnwait == work.nproc {
		println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
		throw("work.nwait was > work.nproc")
	}
}

// gcEndWork signals to the garbage collector that a worker has just finished
// its work. It reports whether it was the last worker and there's no more work
// to do. If it returns true, the caller must call gcMarkDone.
func gcEndWork() (last bool) {
	incnwait := atomic.Xadd(&work.nwait, +1)
	if incnwait > work.nproc {
		println("runtime: work.nwait=", incnwait, "work.nproc=", work.nproc)
		throw("work.nwait > work.nproc")
	}
	return incnwait == work.nproc && !gcMarkWorkAvailable()
}

// gcMark runs the mark (or, for concurrent GC, mark termination).
// All gcWork caches must be empty.
// STW is in effect at this point.
func gcMark(startTime int64) {
	if gcphase != _GCmarktermination {
		throw("in gcMark expecting to see gcphase as _GCmarktermination")
	}
	work.tstart = startTime

	// Check that there's no marking work remaining.
	if next, jobs := work.markrootNext.Load(), work.markrootJobs.Load(); work.full != 0 || next < jobs {
		print("runtime: full=", hex(work.full), " next=", next, " jobs=", jobs, " nDataRoots=", work.nDataRoots, " nBSSRoots=", work.nBSSRoots, " nSpanRoots=", work.nSpanRoots, " nStackRoots=", work.nStackRoots, "\n")
		panic("non-empty mark queue after concurrent mark")
	}

	if debug.gccheckmark > 0 {
		// This is expensive when there's a large number of
		// Gs, so only do it if checkmark is also enabled.
		gcMarkRootCheck()
	}

	// Drop allg snapshot. allgs may have grown, in which case
	// this is the only reference to the old backing store and
	// there's no need to keep it around.
	work.stackRoots = nil

	// Clear out buffers and double-check that all gcWork caches
	// are empty. This should be ensured by gcMarkDone before we
	// enter mark termination.
	//
	// TODO: We could clear out buffers just before mark if this
	// has a non-negligible impact on STW time.
	for _, p := range allp {
		// The write barrier may have buffered pointers since
		// the gcMarkDone barrier. However, since the barrier
		// ensured all reachable objects were marked, all of
		// these must be pointers to black objects. Hence we
		// can just discard the write barrier buffer.
		if debug.gccheckmark > 0 {
			// For debugging, flush the buffer and make
			// sure it really was all marked.
			wbBufFlush1(p)
		} else {
			p.wbBuf.reset()
		}

		gcw := &p.gcw
		if !gcw.empty() {
			printlock()
			print("runtime: P ", p.id, " flushedWork ", gcw.flushedWork)
			if gcw.wbuf1 == nil {
				print(" wbuf1=<nil>")
			} else {
				print(" wbuf1.n=", gcw.wbuf1.nobj)
			}
			if gcw.wbuf2 == nil {
				print(" wbuf2=<nil>")
			} else {
				print(" wbuf2.n=", gcw.wbuf2.nobj)
			}
			print("\n")
			throw("P has cached GC work at end of mark termination")
		}
		// There may still be cached empty buffers, which we
		// need to flush since we're going to free them. Also,
		// there may be non-zero stats because we allocated
		// black after the gcMarkDone barrier.
		gcw.dispose()
	}

	// Flush scanAlloc from each mcache since we're about to modify
	// heapScan directly. If we were to flush this later, then scanAlloc
	// might have incorrect information.
	//
	// Note that it's not important to retain this information; we know
	// exactly what heapScan is at this point via scanWork.
	for _, p := range allp {
		c := p.mcache
		if c == nil {
			continue
		}
		c.scanAlloc = 0
	}

	// Reset controller state.
	gcController.resetLive(work.bytesMarked)
}

// gcSweep must be called on the system stack because it acquires the heap
// lock. See mheap for details.
//
// Returns true if the heap was fully swept by this function.
//
// The world must be stopped.
//
//go:systemstack
func gcSweep(mode gcMode) bool {
	assertWorldStopped()

	if gcphase != _GCoff {
		throw("gcSweep being done but phase is not GCoff")
	}

	lock(&mheap_.lock)
	mheap_.sweepgen += 2
	sweep.active.reset()
	mheap_.pagesSwept.Store(0)
	mheap_.sweepArenas = mheap_.heapArenas
	mheap_.reclaimIndex.Store(0)
	mheap_.reclaimCredit.Store(0)
	unlock(&mheap_.lock)

	sweep.centralIndex.clear()

	if !concurrentSweep || mode == gcForceBlockMode {
		// Special case synchronous sweep.
		// Record that no proportional sweeping has to happen.
		lock(&mheap_.lock)
		mheap_.sweepPagesPerByte = 0
		unlock(&mheap_.lock)
		// Flush all mcaches.
		for _, pp := range allp {
			pp.mcache.prepareForSweep()
		}
		// Sweep all spans eagerly.
		for sweepone() != ^uintptr(0) {
		}
		// Free workbufs and span rings eagerly.
		prepareFreeWorkbufs()
		for freeSomeWbufs(false) {
		}
		freeDeadSpanSPMCs()
		// All "free" events for this mark/sweep cycle have
		// now happened, so we can make this profile cycle
		// available immediately.
		mProf_NextCycle()
		mProf_Flush()
		return true
	}

	// Background sweep.
	lock(&sweep.lock)
	if sweep.parked {
		sweep.parked = false
		ready(sweep.g, 0, true)
	}
	unlock(&sweep.lock)
	return false
}

// gcResetMarkState resets global state prior to marking (concurrent
// or STW) and resets the stack scan state of all Gs.
//
// This is safe to do without the world stopped because any Gs created
// during or after this will start out in the reset state.
//
// gcResetMarkState must be called on the system stack because it acquires
// the heap lock. See mheap for details.
//
//go:systemstack
func gcResetMarkState() {
	// This may be called during a concurrent phase, so lock to make sure
	// allgs doesn't change.
	forEachG(func(gp *g) {
		gp.gcscandone = false // set to true in gcphasework
		gp.gcAssistBytes = 0
	})

	// Clear page marks. This is just 1MB per 64GB of heap, so the
	// time here is pretty trivial.
	lock(&mheap_.lock)
	arenas := mheap_.heapArenas
	unlock(&mheap_.lock)
	for _, ai := range arenas {
		ha := mheap_.arenas[ai.l1()][ai.l2()]
		clear(ha.pageMarks[:])
	}

	work.bytesMarked = 0
	work.initialHeapLive = gcController.heapLive.Load()
}

// Hooks for other packages

var poolcleanup func()
var boringCaches []unsafe.Pointer // for crypto/internal/boring

// sync_runtime_registerPoolCleanup should be an internal detail,
// but widely used packages access it using linkname.
// Notable members of the hall of shame include:
//   - github.com/bytedance/gopkg
//   - github.com/songzhibin97/gkit
//
// Do not remove or change the type signature.
// See go.dev/issue/67401.
//
//go:linkname sync_runtime_registerPoolCleanup sync.runtime_registerPoolCleanup
func sync_runtime_registerPoolCleanup(f func()) {
	poolcleanup = f
}

//go:linkname boring_registerCache crypto/internal/boring/bcache.registerCache
func boring_registerCache(p unsafe.Pointer) {
	boringCaches = append(boringCaches, p)
}

func clearpools() {
	// clear sync.Pools
	if poolcleanup != nil {
		poolcleanup()
	}

	// clear boringcrypto caches
	for _, p := range boringCaches {
		atomicstorep(p, nil)
	}

	// Clear central sudog cache.
	// Leave per-P caches alone, they have strictly bounded size.
	// Disconnect cached list before dropping it on the floor,
	// so that a dangling ref to one entry does not pin all of them.
	lock(&sched.sudoglock)
	var sg, sgnext *sudog
	for sg = sched.sudogcache; sg != nil; sg = sgnext {
		sgnext = sg.next
		sg.next = nil
	}
	sched.sudogcache = nil
	unlock(&sched.sudoglock)

	// Clear central defer pool.
	// Leave per-P pools alone, they have strictly bounded size.
	lock(&sched.deferlock)
	// disconnect cached list before dropping it on the floor,
	// so that a dangling ref to one entry does not pin all of them.
	var d, dlink *_defer
	for d = sched.deferpool; d != nil; d = dlink {
		dlink = d.link
		d.link = nil
	}
	sched.deferpool = nil
	unlock(&sched.deferlock)
}

// Timing

// itoaDiv formats val/(10**dec) into buf.
func itoaDiv(buf []byte, val uint64, dec int) []byte {
	i := len(buf) - 1
	idec := i - dec
	for val >= 10 || i >= idec {
		buf[i] = byte(val%10 + '0')
		i--
		if i == idec {
			buf[i] = '.'
			i--
		}
		val /= 10
	}
	buf[i] = byte(val + '0')
	return buf[i:]
}

// fmtNSAsMS nicely formats ns nanoseconds as milliseconds.
func fmtNSAsMS(buf []byte, ns uint64) []byte {
	if ns >= 10e6 {
		// Format as whole milliseconds.
		return itoaDiv(buf, ns/1e6, 0)
	}
	// Format two digits of precision, with at most three decimal places.
	x := ns / 1e3
	if x == 0 {
		buf[0] = '0'
		return buf[:1]
	}
	dec := 3
	for x >= 100 {
		x /= 10
		dec--
	}
	return itoaDiv(buf, x, dec)
}

// Helpers for testing GC.

// gcTestMoveStackOnNextCall causes the stack to be moved on a call
// immediately following the call to this. It may not work correctly
// if any other work appears after this call (such as returning).
// Typically the following call should be marked go:noinline so it
// performs a stack check.
//
// In rare cases this may not cause the stack to move, specifically if
// there's a preemption between this call and the next.
func gcTestMoveStackOnNextCall() {
	gp := getg()
	gp.stackguard0 = stackForceMove
}

// gcTestIsReachable performs a GC and returns a bit set where bit i
// is set if ptrs[i] is reachable.
func gcTestIsReachable(ptrs ...unsafe.Pointer) (mask uint64) {
	// This takes the pointers as unsafe.Pointers in order to keep
	// them live long enough for us to attach specials. After
	// that, we drop our references to them.

	if len(ptrs) > 64 {
		panic("too many pointers for uint64 mask")
	}

	// Block GC while we attach specials and drop our references
	// to ptrs. Otherwise, if a GC is in progress, it could mark
	// them reachable via this function before we have a chance to
	// drop them.
	semacquire(&gcsema)

	// Create reachability specials for ptrs.
	specials := make([]*specialReachable, len(ptrs))
	for i, p := range ptrs {
		lock(&mheap_.speciallock)
		s := (*specialReachable)(mheap_.specialReachableAlloc.alloc())
		unlock(&mheap_.speciallock)
		s.special.kind = _KindSpecialReachable
		if !addspecial(p, &s.special, false) {
			throw("already have a reachable special (duplicate pointer?)")
		}
		specials[i] = s
		// Make sure we don't retain ptrs.
		ptrs[i] = nil
	}

	semrelease(&gcsema)

	// Force a full GC and sweep.
	GC()

	// Process specials.
	for i, s := range specials {
		if !s.done {
			printlock()
			println("runtime: object", i, "was not swept")
			throw("IsReachable failed")
		}
		if s.reachable {
			mask |= 1 << i
		}
		lock(&mheap_.speciallock)
		mheap_.specialReachableAlloc.free(unsafe.Pointer(s))
		unlock(&mheap_.speciallock)
	}

	return mask
}

// gcTestPointerClass returns the category of what p points to, one of:
// "heap", "stack", "data", "bss", "other". This is useful for checking
// that a test is doing what it's intended to do.
//
// This is nosplit simply to avoid extra pointer shuffling that may
// complicate a test.
//
//go:nosplit
func gcTestPointerClass(p unsafe.Pointer) string {
	p2 := uintptr(noescape(p))
	gp := getg()
	if gp.stack.lo <= p2 && p2 < gp.stack.hi {
		return "stack"
	}
	if base, _, _ := findObject(p2, 0, 0); base != 0 {
		return "heap"
	}
	for _, datap := range activeModules() {
		if datap.data <= p2 && p2 < datap.edata || datap.noptrdata <= p2 && p2 < datap.enoptrdata {
			return "data"
		}
		// All four ranges are half-open: the end address is exclusive.
		if datap.bss <= p2 && p2 < datap.ebss || datap.noptrbss <= p2 && p2 < datap.enoptrbss {
			return "bss"
		}
	}
	KeepAlive(p)
	return "other"
}
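The timing helpers itoaDiv and fmtNSAsMS in this file are easy to misread because they fill the buffer from its end and return a sub-slice of it. The following is a minimal standalone sketch that copies the two functions into a `package main` so their behavior can be observed outside the runtime. The 24-byte buffer size and the `main` driver are assumptions made for illustration, not something this section prescribes.

```go
package main

import "fmt"

// itoaDiv formats val/(10**dec) into buf, writing digits from the end
// of buf backwards and returning the used tail of buf.
func itoaDiv(buf []byte, val uint64, dec int) []byte {
	i := len(buf) - 1
	idec := i - dec
	for val >= 10 || i >= idec {
		buf[i] = byte(val%10 + '0')
		i--
		if i == idec {
			buf[i] = '.'
			i--
		}
		val /= 10
	}
	buf[i] = byte(val + '0')
	return buf[i:]
}

// fmtNSAsMS formats ns nanoseconds as milliseconds: whole milliseconds
// at 10ms and above, otherwise two significant digits with at most
// three decimal places.
func fmtNSAsMS(buf []byte, ns uint64) []byte {
	if ns >= 10e6 {
		return itoaDiv(buf, ns/1e6, 0)
	}
	x := ns / 1e3
	if x == 0 {
		buf[0] = '0'
		return buf[:1]
	}
	dec := 3
	for x >= 100 {
		x /= 10
		dec--
	}
	return itoaDiv(buf, x, dec)
}

func main() {
	// Hypothetical sample durations, in nanoseconds.
	for _, ns := range []uint64{12000000, 1500000, 250000, 5000, 500} {
		var buf [24]byte // assumed size; large enough for any uint64
		fmt.Printf("%d ns -> %s ms\n", ns, fmtNSAsMS(buf[:], ns))
	}
	// Prints 12, 1.5, 0.25, 0.005 and 0 as the millisecond values.
}
```

The returned slice aliases the caller's buffer, so it must be consumed before the buffer is reused. Values below one microsecond collapse to "0" because the helper truncates rather than rounds.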