"Fossies" - the Fresh Open Source Software Archive

Member "go/doc/go_mem.html" (26 Apr 2023, 26923 Bytes) of package /linux/misc/go1.20.4.src.tar.gz:



    1 <!--{
    2     "Title": "The Go Memory Model",
    3     "Subtitle": "Version of June 6, 2022",
    4     "Path": "/ref/mem"
    5 }-->
    6 
    7 <style>
    8 p.rule {
    9   font-style: italic;
   10 }
   11 </style>
   12 
   13 <h2 id="introduction">Introduction</h2>
   14 
   15 <p>
   16 The Go memory model specifies the conditions under which
   17 reads of a variable in one goroutine can be guaranteed to
   18 observe values produced by writes to the same variable in a different goroutine.
   19 </p>
   20 
   21 
   22 <h3 id="advice">Advice</h3>
   23 
   24 <p>
   25 Programs that modify data being simultaneously accessed by multiple goroutines
   26 must serialize such access.
   27 </p>
   28 
   29 <p>
   30 To serialize access, protect the data with channel operations or other synchronization primitives
   31 such as those in the <a href="/pkg/sync/"><code>sync</code></a>
   32 and <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> packages.
   33 </p>
   34 
   35 <p>
   36 If you must read the rest of this document to understand the behavior of your program,
   37 you are being too clever.
   38 </p>
   39 
   40 <p>
   41 Don't be clever.
   42 </p>
   43 
   44 <h3 id="overview">Informal Overview</h3>
   45 
   46 <p>
   47 Go approaches its memory model in much the same way as the rest of the language,
   48 aiming to keep the semantics simple, understandable, and useful.
   49 This section gives a general overview of the approach and should suffice for most programmers.
   50 The memory model is specified more formally in the next section.
   51 </p>
   52 
   53 <p>
   54 A <em>data race</em> is defined as
   55 a write to a memory location happening concurrently with another read or write to that same location,
   56 unless all the accesses involved are atomic data accesses as provided by the <code>sync/atomic</code> package.
   57 As noted already, programmers are strongly encouraged to use appropriate synchronization
   58 to avoid data races.
   59 In the absence of data races, Go programs behave as if all the goroutines
   60 were multiplexed onto a single processor.
   61 This property is sometimes referred to as DRF-SC: data-race-free programs
   62 execute in a sequentially consistent manner.
   63 </p>
   64 
   65 <p>
   66 While programmers should write Go programs without data races,
   67 there are limitations to what a Go implementation can do in response to a data race.
   68 An implementation may always react to a data race by reporting the race and terminating the program.
   69 Otherwise, each read of a single-word-sized or sub-word-sized memory location
   70 must observe a value actually written to that location (perhaps by a concurrently executing goroutine)
   71 and not yet overwritten.
   72 These implementation constraints make Go more like Java or JavaScript,
   73 in that most races have a limited number of outcomes,
   74 and less like C and C++, where the meaning of any program with a race
   75 is entirely undefined, and the compiler may do anything at all.
   76 Go's approach aims to make errant programs more reliable and easier to debug,
   77 while still insisting that races are errors and that tools can diagnose and report them.
   78 </p>
   79 
   80 <h2 id="model">Memory Model</h2>
   81 
   82 <p>
   83 The following formal definition of Go's memory model closely follows
   84 the approach presented by Hans-J. Boehm and Sarita V. Adve in
   85 “<a href="https://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf">Foundations of the C++ Concurrency Memory Model</a>”,
   86 published in PLDI 2008.
   87 The definition of data-race-free programs and the guarantee of sequential consistency
   88 for race-free programs are equivalent to the ones in that work.
   89 </p>
   90 
   91 <p>
   92 The memory model describes the requirements on program executions,
   93 which are made up of goroutine executions,
   94 which in turn are made up of memory operations.
   95 </p>
   96 
   97 <p>
   98 A <i>memory operation</i> is modeled by four details:
   99 </p>
  100 <ul>
  101 <li>its kind, indicating whether it is an ordinary data read, an ordinary data write,
  102 or a <i>synchronizing operation</i> such as an atomic data access,
  103 a mutex operation, or a channel operation,
  104 <li>its location in the program,
  105 <li>the memory location or variable being accessed, and
  106 <li>the values read or written by the operation.
  107 </ul>
  108 <p>
  109 Some memory operations are <i>read-like</i>, including read, atomic read, mutex lock, and channel receive.
  110 Other memory operations are <i>write-like</i>, including write, atomic write, mutex unlock, channel send, and channel close.
  111 Some, such as atomic compare-and-swap, are both read-like and write-like.
  112 </p>
  113 
  114 <p>
  115 A <i>goroutine execution</i> is modeled as a set of memory operations executed by a single goroutine.
  116 </p>
  117 
  118 <p>
  119 <b>Requirement 1</b>:
  120 The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine,
  121 given the values read from and written to memory.
  122 That execution must be consistent with the <i>sequenced before</i> relation,
  123 defined as the partial order requirements set out by the <a href="/ref/spec">Go language specification</a>
  124 for Go's control flow constructs as well as the <a href="/ref/spec#Order_of_evaluation">order of evaluation for expressions</a>.
  125 </p>
  126 
  127 <p>
  128 A Go <i>program execution</i> is modeled as a set of goroutine executions,
  129 together with a mapping <i>W</i> that specifies the write-like operation that each read-like operation reads from.
  130 (Multiple executions of the same program can have different program executions.)
  131 </p>
  132 
  133 <p>
  134 <b>Requirement 2</b>:
  135 For a given program execution, the mapping <i>W</i>, when limited to synchronizing operations,
  136 must be explainable by some implicit total order of the synchronizing operations
  137 that is consistent with sequencing and the values read and written by those operations.
  138 </p>
  139 
  140 <p>
  141 The <i>synchronized before</i> relation is a partial order on synchronizing memory operations,
  142 derived from <i>W</i>.
  143 If a synchronizing read-like memory operation <i>r</i>
  144 observes a synchronizing write-like memory operation <i>w</i>
  145 (that is, if <i>W</i>(<i>r</i>) = <i>w</i>),
  146 then <i>w</i> is synchronized before <i>r</i>.
  147 Informally, the synchronized before relation is a subset of the implied total order
  148 mentioned in the previous paragraph,
  149 limited to the information that <i>W</i> directly observes.
  150 </p>
  151 
  152 <p>
  153 The <i>happens before</i> relation is defined as the transitive closure of the
  154 union of the sequenced before and synchronized before relations.
  155 </p>
  156 
  157 <p>
  158 <b>Requirement 3</b>:
  159 For an ordinary (non-synchronizing) data read <i>r</i> on a memory location <i>x</i>,
  160 <i>W</i>(<i>r</i>) must be a write <i>w</i> that is <i>visible</i> to <i>r</i>,
  161 where visible means that both of the following hold:
  162 
  163 <ol>
  164 <li><i>w</i> happens before <i>r</i>.
  165 <li><i>w</i> does not happen before any other write <i>w'</i> (to <i>x</i>) that happens before <i>r</i>.
  166 </ol>
  167 
  168 <p>
  169 A <i>read-write data race</i> on memory location <i>x</i>
  170 consists of a read-like memory operation <i>r</i> on <i>x</i>
  171 and a write-like memory operation <i>w</i> on <i>x</i>,
  172 at least one of which is non-synchronizing,
  173 which are unordered by happens before
  174 (that is, neither <i>r</i> happens before <i>w</i>
  175 nor <i>w</i> happens before <i>r</i>).
  176 </p>
  177 
  178 <p>
  179 A <i>write-write data race</i> on memory location <i>x</i>
  180 consists of two write-like memory operations <i>w</i> and <i>w'</i> on <i>x</i>,
  181 at least one of which is non-synchronizing,
  182 which are unordered by happens before.
  183 </p>
  184 
  185 <p>
  186 Note that if there are no read-write or write-write data races on memory location <i>x</i>,
  187 then any read <i>r</i> on <i>x</i> has only one possible <i>W</i>(<i>r</i>):
  188 the single <i>w</i> that immediately precedes it in the happens before order.
  189 </p>
  190 
  191 <p>
  192 More generally, it can be shown that any Go program that is data-race-free,
  193 meaning it has no program executions with read-write or write-write data races,
  194 can only have outcomes explained by some sequentially consistent interleaving
  195 of the goroutine executions.
  196 (The proof is the same as Section 7 of Boehm and Adve's paper cited above.)
  197 This property is called DRF-SC.
  198 </p>
  199 
  200 <p>
  201 The intent of the formal definition is to match
  202 the DRF-SC guarantee provided to race-free programs
  203 by other languages, including C, C++, Java, JavaScript, Rust, and Swift.
  204 </p>
  205 
  206 <p>
  207 Certain Go language operations such as goroutine creation and memory allocation
  208 act as synchronization operations.
  209 The effect of these operations on the synchronized-before partial order
  210 is documented in the “Synchronization” section below.
  211 Individual packages are responsible for providing similar documentation
  212 for their own operations.
  213 </p>
  214 
  215 <h2 id="restrictions">Implementation Restrictions for Programs Containing Data Races</h2>
  216 
  217 <p>
  218 The preceding section gave a formal definition of data-race-free program execution.
  219 This section informally describes the semantics that implementations must provide
  220 for programs that do contain races.
  221 </p>
  222 
  223 <p>
  224 First, any implementation can, upon detecting a data race,
  225 report the race and halt execution of the program.
  226 Implementations using ThreadSanitizer
  227 (accessed with “<code>go</code> <code>build</code> <code>-race</code>”)
  228 do exactly this.
  229 </p>
  230 
  231 <p>
  232 Otherwise, a read <i>r</i> of a memory location <i>x</i>
  233 that is not larger than a machine word must observe
  234 some write <i>w</i> such that <i>r</i> does not happen before <i>w</i>
  235 and there is no write <i>w'</i> such that <i>w</i> happens before <i>w'</i>
  236 and <i>w'</i> happens before <i>r</i>.
  237 That is, each read must observe a value written by a preceding or concurrent write.
  238 </p>
  239 
  240 <p>
  241 Additionally, observation of acausal and “out of thin air” writes is disallowed.
  242 </p>
  243 
  244 <p>
  245 Reads of memory locations larger than a single machine word
  246 are encouraged but not required to meet the same semantics
  247 as word-sized memory locations,
  248 observing a single allowed write <i>w</i>.
  249 For performance reasons,
  250 implementations may instead treat larger operations
  251 as a set of individual machine-word-sized operations
  252 in an unspecified order.
  253 This means that races on multiword data structures
  254 can lead to inconsistent values not corresponding to a single write.
  255 When the values depend on the consistency
  256 of internal (pointer, length) or (pointer, type) pairs,
  257 as can be the case for interface values, maps,
  258 slices, and strings in most Go implementations,
  259 such races can in turn lead to arbitrary memory corruption.
  260 </p>
  261 
  262 <p>
  263 Examples of incorrect synchronization are given in the
  264 “Incorrect synchronization” section below.
  265 </p>
  266 
  267 <p>
  268 Examples of the limitations on implementations are given in the
  269 “Incorrect compilation” section below.
  270 </p>
  271 
  272 <h2 id="synchronization">Synchronization</h2>
  273 
  274 <h3 id="init">Initialization</h3>
  275 
  276 <p>
  277 Program initialization runs in a single goroutine,
  278 but that goroutine may create other goroutines,
  279 which run concurrently.
  280 </p>
  281 
  282 <p class="rule">
  283 If a package <code>p</code> imports package <code>q</code>, the completion of
  284 <code>q</code>'s <code>init</code> functions happens before the start of any of <code>p</code>'s.
  285 </p>
  286 
  287 <p class="rule">
  288 The completion of all <code>init</code> functions is synchronized before
  289 the start of the function <code>main.main</code>.
  290 </p>
  291 
  292 <h3 id="go">Goroutine creation</h3>
  293 
  294 <p class="rule">
  295 The <code>go</code> statement that starts a new goroutine
  296 is synchronized before the start of the goroutine's execution.
  297 </p>
  298 
  299 <p>
  300 For example, in this program:
  301 </p>
  302 
  303 <pre>
  304 var a string
  305 
  306 func f() {
  307     print(a)
  308 }
  309 
  310 func hello() {
  311     a = "hello, world"
  312     go f()
  313 }
  314 </pre>
  315 
  316 <p>
  317 calling <code>hello</code> will print <code>"hello, world"</code>
  318 at some point in the future (perhaps after <code>hello</code> has returned).
  319 </p>
  320 
  321 <h3 id="goexit">Goroutine destruction</h3>
  322 
  323 <p>
  324 The exit of a goroutine is not guaranteed to be synchronized before
  325 any event in the program.
  326 For example, in this program:
  327 </p>
  328 
  329 <pre>
  330 var a string
  331 
  332 func hello() {
  333     go func() { a = "hello" }()
  334     print(a)
  335 }
  336 </pre>
  337 
  338 <p>
  339 the assignment to <code>a</code> is not followed by
  340 any synchronization event, so it is not guaranteed to be
  341 observed by any other goroutine.
  342 In fact, an aggressive compiler might delete the entire <code>go</code> statement.
  343 </p>
  344 
  345 <p>
  346 If the effects of a goroutine must be observed by another goroutine,
  347 use a synchronization mechanism such as a lock or channel
  348 communication to establish a relative ordering.
  349 </p>
  350 
  351 <h3 id="chan">Channel communication</h3>
  352 
  353 <p>
  354 Channel communication is the main method of synchronization
  355 between goroutines.  Each send on a particular channel
  356 is matched to a corresponding receive from that channel,
  357 usually in a different goroutine.
  358 </p>
  359 
  360 <p class="rule">
  361 A send on a channel is synchronized before the completion of the
  362 corresponding receive from that channel.
  363 </p>
  364 
  365 <p>
  366 This program:
  367 </p>
  368 
  369 <pre>
  370 var c = make(chan int, 10)
  371 var a string
  372 
  373 func f() {
  374     a = "hello, world"
  375     c &lt;- 0
  376 }
  377 
  378 func main() {
  379     go f()
  380     &lt;-c
  381     print(a)
  382 }
  383 </pre>
  384 
  385 <p>
  386 is guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
  387 is sequenced before the send on <code>c</code>, which is synchronized before
  388 the corresponding receive on <code>c</code> completes, which is sequenced before
  389 the <code>print</code>.
  390 </p>
  391 
  392 <p class="rule">
  393 The closing of a channel is synchronized before a receive that returns a zero value
  394 because the channel is closed.
  395 </p>
  396 
  397 <p>
  398 In the previous example, replacing
  399 <code>c &lt;- 0</code> with <code>close(c)</code>
  400 yields a program with the same guaranteed behavior.
  401 </p>
  402 
  403 <p class="rule">
  404 A receive from an unbuffered channel is synchronized before the completion of
  405 the corresponding send on that channel.
  406 </p>
  407 
  408 <p>
  409 This program (as above, but with the send and receive statements swapped and
  410 using an unbuffered channel):
  411 </p>
  412 
  413 <pre>
  414 var c = make(chan int)
  415 var a string
  416 
  417 func f() {
  418     a = "hello, world"
  419     &lt;-c
  420 }
  421 
  422 func main() {
  423     go f()
  424     c &lt;- 0
  425     print(a)
  426 }
  427 </pre>
  428 
  429 <p>
  430 is also guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
  431 is sequenced before the receive on <code>c</code>, which is synchronized before
  432 the corresponding send on <code>c</code> completes, which is sequenced
  433 before the <code>print</code>.
  434 </p>
  435 
  436 <p>
  437 If the channel were buffered (e.g., <code>c = make(chan int, 1)</code>)
  438 then the program would not be guaranteed to print
  439 <code>"hello, world"</code>.  (It might print the empty string,
  440 crash, or do something else.)
  441 </p>
  442 
  443 <p class="rule">
  444 The <i>k</i>th receive from a channel with capacity <i>C</i> is synchronized before the completion of the <i>k</i>+<i>C</i>th send on that channel.
  445 </p>
  446 
  447 <p>
  448 This rule generalizes the previous rule to buffered channels.
  449 It allows a counting semaphore to be modeled by a buffered channel:
  450 the number of items in the channel corresponds to the number of active uses,
  451 the capacity of the channel corresponds to the maximum number of simultaneous uses,
  452 sending an item acquires the semaphore, and receiving an item releases
  453 the semaphore.
  454 This is a common idiom for limiting concurrency.
  455 </p>
  456 
  457 <p>
  458 This program starts a goroutine for every entry in the work list, but the
  459 goroutines coordinate using the <code>limit</code> channel to ensure
  460 that at most three are running work functions at a time.
  461 </p>
  462 
  463 <pre>
  464 var limit = make(chan int, 3)
  465 
  466 func main() {
  467     for _, w := range work {
  468         go func(w func()) {
  469             limit &lt;- 1
  470             w()
  471             &lt;-limit
  472         }(w)
  473     }
  474     select{}
  475 }
  476 </pre>
  477 
  478 <h3 id="locks">Locks</h3>
  479 
  480 <p>
  481 The <code>sync</code> package implements two lock data types,
  482 <code>sync.Mutex</code> and <code>sync.RWMutex</code>.
  483 </p>
  484 
  485 <p class="rule">
  486 For any <code>sync.Mutex</code> or <code>sync.RWMutex</code> variable <code>l</code> and <i>n</i> &lt; <i>m</i>,
  487 call <i>n</i> of <code>l.Unlock()</code> is synchronized before call <i>m</i> of <code>l.Lock()</code> returns.
  488 </p>
  489 
  490 <p>
  491 This program:
  492 </p>
  493 
  494 <pre>
  495 var l sync.Mutex
  496 var a string
  497 
  498 func f() {
  499     a = "hello, world"
  500     l.Unlock()
  501 }
  502 
  503 func main() {
  504     l.Lock()
  505     go f()
  506     l.Lock()
  507     print(a)
  508 }
  509 </pre>
  510 
  511 <p>
  512 is guaranteed to print <code>"hello, world"</code>.
  513 The first call to <code>l.Unlock()</code> (in <code>f</code>) is synchronized
  514 before the second call to <code>l.Lock()</code> (in <code>main</code>) returns,
  515 which is sequenced before the <code>print</code>.
  516 </p>
  517 
  518 <p class="rule">
  519 For any call to <code>l.RLock</code> on a <code>sync.RWMutex</code> variable <code>l</code>,
  520 there is an <i>n</i> such that the <i>n</i>th call to <code>l.Unlock</code>
  521 is synchronized before the return from <code>l.RLock</code>,
  522 and the matching call to <code>l.RUnlock</code> is synchronized before the return from call <i>n</i>+1 to <code>l.Lock</code>.
  523 </p>
  524 
  525 <p class="rule">
  526 A successful call to <code>l.TryLock</code> (or <code>l.TryRLock</code>)
  527 is equivalent to a call to <code>l.Lock</code> (or <code>l.RLock</code>).
  528 An unsuccessful call has no synchronizing effect at all.
  529 As far as the memory model is concerned,
  530 <code>l.TryLock</code> (or <code>l.TryRLock</code>)
  531 may be considered to be able to return false
  532 even when the mutex <i>l</i> is unlocked.
  533 </p>
  534 
  535 <h3 id="once">Once</h3>
  536 
  537 <p>
  538 The <code>sync</code> package provides a safe mechanism for
  539 initialization in the presence of multiple goroutines
  540 through the use of the <code>Once</code> type.
  541 Multiple goroutines can execute <code>once.Do(f)</code> for a particular <code>f</code>,
  542 but only one will run <code>f()</code>, and the other calls block
  543 until <code>f()</code> has returned.
  544 </p>
  545 
  546 <p class="rule">
  547 The completion of a single call of <code>f()</code> from <code>once.Do(f)</code>
  548 is synchronized before the return of any call of <code>once.Do(f)</code>.
  549 </p>
  550 
  551 <p>
  552 In this program:
  553 </p>
  554 
  555 <pre>
  556 var a string
  557 var once sync.Once
  558 
  559 func setup() {
  560     a = "hello, world"
  561 }
  562 
  563 func doprint() {
  564     once.Do(setup)
  565     print(a)
  566 }
  567 
  568 func twoprint() {
  569     go doprint()
  570     go doprint()
  571 }
  572 </pre>
  573 
  574 <p>
  575 calling <code>twoprint</code> will call <code>setup</code> exactly
  576 once.
  577 The <code>setup</code> function will complete before either call
  578 of <code>print</code>.
  579 The result will be that <code>"hello, world"</code> will be printed
  580 twice.
  581 </p>
  582 
  583 <h3 id="atomic">Atomic Values</h3>
  584 
  585 <p>
  586 The APIs in the <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a>
  587 package are collectively “atomic operations”
  588 that can be used to synchronize the execution of different goroutines.
  589 If the effect of an atomic operation <i>A</i> is observed by atomic operation <i>B</i>,
  590 then <i>A</i> is synchronized before <i>B</i>.
  591 All the atomic operations executed in a program behave as though executed
  592 in some sequentially consistent order.
  593 </p>
  594 
  595 <p>
  596 The preceding definition has the same semantics as C++’s sequentially consistent atomics
  597 and Java’s <code>volatile</code> variables.
  598 </p>
  599 
  600 <h3 id="finalizer">Finalizers</h3>
  601 
  602 <p>
  603 The <a href="/pkg/runtime/"><code>runtime</code></a> package provides
  604 a <code>SetFinalizer</code> function that adds a finalizer to be called when
  605 a particular object is no longer reachable by the program.
  606 A call to <code>SetFinalizer(x, f)</code> is synchronized before the finalization call <code>f(x)</code>.
  607 </p>
  608 
  609 <h3 id="more">Additional Mechanisms</h3>
  610 
  611 <p>
  612 The <code>sync</code> package provides additional synchronization abstractions,
  613 including <a href="/pkg/sync/#Cond">condition variables</a>,
  614 <a href="/pkg/sync/#Map">lock-free maps</a>,
  615 <a href="/pkg/sync/#Pool">allocation pools</a>,
  616 and
  617 <a href="/pkg/sync/#WaitGroup">wait groups</a>.
  618 The documentation for each of these specifies the guarantees it
  619 makes concerning synchronization.
  620 </p>
  621 
  622 <p>
  623 Other packages that provide synchronization abstractions
  624 should document the guarantees they make too.
  625 </p>
  626 
  627 
  628 <h2 id="badsync">Incorrect synchronization</h2>
  629 
  630 <p>
  631 Programs with races are incorrect and
  632 can exhibit non-sequentially consistent executions.
  633 In particular, note that a read <i>r</i> may observe the value written by any write <i>w</i>
  634 that executes concurrently with <i>r</i>.
  635 Even if this occurs, it does not imply that reads happening after <i>r</i>
  636 will observe writes that happened before <i>w</i>.
  637 </p>
  638 
  639 <p>
  640 In this program:
  641 </p>
  642 
  643 <pre>
  644 var a, b int
  645 
  646 func f() {
  647     a = 1
  648     b = 2
  649 }
  650 
  651 func g() {
  652     print(b)
  653     print(a)
  654 }
  655 
  656 func main() {
  657     go f()
  658     g()
  659 }
  660 </pre>
  661 
  662 <p>
  663 it can happen that <code>g</code> prints <code>2</code> and then <code>0</code>.
  664 </p>
  665 
  666 <p>
  667 This fact invalidates a few common idioms.
  668 </p>
  669 
  670 <p>
  671 Double-checked locking is an attempt to avoid the overhead of synchronization.
  672 For example, the <code>twoprint</code> program might be
  673 incorrectly written as:
  674 </p>
  675 
  676 <pre>
  677 var a string
  678 var done bool
  679 
  680 func setup() {
  681     a = "hello, world"
  682     done = true
  683 }
  684 
  685 func doprint() {
  686     if !done {
  687         once.Do(setup)
  688     }
  689     print(a)
  690 }
  691 
  692 func twoprint() {
  693     go doprint()
  694     go doprint()
  695 }
  696 </pre>
  697 
  698 <p>
  699 but there is no guarantee that, in <code>doprint</code>, observing the write to <code>done</code>
  700 implies observing the write to <code>a</code>.  This
  701 version can (incorrectly) print an empty string
  702 instead of <code>"hello, world"</code>.
  703 </p>
  704 
  705 <p>
  706 Another incorrect idiom is busy waiting for a value, as in:
  707 </p>
  708 
  709 <pre>
  710 var a string
  711 var done bool
  712 
  713 func setup() {
  714     a = "hello, world"
  715     done = true
  716 }
  717 
  718 func main() {
  719     go setup()
  720     for !done {
  721     }
  722     print(a)
  723 }
  724 </pre>
  725 
  726 <p>
  727 As before, there is no guarantee that, in <code>main</code>,
  728 observing the write to <code>done</code>
  729 implies observing the write to <code>a</code>, so this program could
  730 print an empty string too.
  731 Worse, there is no guarantee that the write to <code>done</code> will ever
  732 be observed by <code>main</code>, since there are no synchronization
  733 events between the two goroutines.  The loop in <code>main</code> is not
  734 guaranteed to finish.
  735 </p>
  736 
  737 <p>
  738 There are subtler variants on this theme, such as this program.
  739 </p>
  740 
  741 <pre>
  742 type T struct {
  743     msg string
  744 }
  745 
  746 var g *T
  747 
  748 func setup() {
  749     t := new(T)
  750     t.msg = "hello, world"
  751     g = t
  752 }
  753 
  754 func main() {
  755     go setup()
  756     for g == nil {
  757     }
  758     print(g.msg)
  759 }
  760 </pre>
  761 
  762 <p>
  763 Even if <code>main</code> observes <code>g != nil</code> and exits its loop,
  764 there is no guarantee that it will observe the initialized
  765 value for <code>g.msg</code>.
  766 </p>
  767 
  768 <p>
  769 In all these examples, the solution is the same:
  770 use explicit synchronization.
  771 </p>
  772 
  773 <h2 id="badcompiler">Incorrect compilation</h2>
  774 
  775 <p>
  776 The Go memory model restricts compiler optimizations as much as it does Go programs.
  777 Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs.
  778 In particular, a compiler must not introduce writes that do not exist in the original program,
  779 it must not allow a single read to observe multiple values,
  780 and it must not allow a single write to write multiple values.
  781 </p>
  782 
  783 <p>
  784 All the following examples assume that <code>*p</code> and <code>*q</code> refer to
  785 memory locations accessible to multiple goroutines.
  786 </p>
  787 
  788 <p>
  789 Not introducing data races into race-free programs means not moving
  790 writes out of conditional statements in which they appear.
  791 For example, a compiler must not invert the conditional in this program:
  792 </p>
  793 
  794 <pre>
  795 *p = 1
  796 if cond {
  797     *p = 2
  798 }
  799 </pre>
  800 
  801 <p>
  802 That is, the compiler must not rewrite the program into this one:
  803 </p>
  804 
  805 <pre>
  806 *p = 2
  807 if !cond {
  808     *p = 1
  809 }
  810 </pre>
  811 
  812 <p>
  813 If <code>cond</code> is false and another goroutine is reading <code>*p</code>,
  814 then in the original program, the other goroutine can only observe any prior value of <code>*p</code> and <code>1</code>.
  815 In the rewritten program, the other goroutine can observe <code>2</code>, which was previously impossible.
  816 </p>
  817 
  818 <p>
  819 Not introducing data races also means not assuming that loops terminate.
  820 For example, a compiler must in general not move the accesses to <code>*p</code> or <code>*q</code>
  821 ahead of the loop in this program:
  822 </p>
  823 
  824 <pre>
  825 n := 0
  826 for e := list; e != nil; e = e.next {
  827     n++
  828 }
  829 i := *p
  830 *q = 1
  831 </pre>
  832 
  833 <p>
  834 If <code>list</code> pointed to a cyclic list,
  835 then the original program would never access <code>*p</code> or <code>*q</code>,
  836 but the rewritten program would.
  837 (Moving <code>*p</code> ahead would be safe if the compiler can prove <code>*p</code> will not panic;
  838 moving <code>*q</code> ahead would also require the compiler proving that no other
  839 goroutine can access <code>*q</code>.)
  840 </p>
  841 
  842 <p>
  843 Not introducing data races also means not assuming that called functions
  844 always return or are free of synchronization operations.
  845 For example, a compiler must not move the accesses to <code>*p</code> or <code>*q</code>
  846 ahead of the function call in this program
  847 (at least not without direct knowledge of the precise behavior of <code>f</code>):
  848 </p>
  849 
  850 <pre>
  851 f()
  852 i := *p
  853 *q = 1
  854 </pre>
  855 
  856 <p>
  857 If the call never returned, then once again the original program
  858 would never access <code>*p</code> or <code>*q</code>, but the rewritten program would.
  859 And if the call contained synchronizing operations, then the original program
  860 could establish happens before edges preceding the accesses
  861 to <code>*p</code> and <code>*q</code>, but the rewritten program would not.
  862 </p>
  863 
  864 <p>
  865 Not allowing a single read to observe multiple values means
  866 not reloading local variables from shared memory.
  867 For example, a compiler must not discard <code>i</code> and reload it
  868 a second time from <code>*p</code> in this program:
  869 </p>
  870 
  871 <pre>
  872 i := *p
  873 if i &lt; 0 || i &gt;= len(funcs) {
  874     panic("invalid function index")
  875 }
  876 ... complex code ...
  877 // compiler must NOT reload i = *p here
  878 funcs[i]()
  879 </pre>
  880 
  881 <p>
  882 If the complex code needs many registers, a compiler for single-threaded programs
  883 could discard <code>i</code> without saving a copy and then reload
  884 <code>i = *p</code> just before
  885 <code>funcs[i]()</code>.
  886 A Go compiler must not, because the value of <code>*p</code> may have changed.
  887 (Instead, the compiler could spill <code>i</code> to the stack.)
  888 </p>
  889 
  890 <p>
  891 Not allowing a single write to write multiple values also means not using
  892 the memory where a local variable will be written as temporary storage before the write.
  893 For example, a compiler must not use <code>*p</code> as temporary storage in this program:
  894 </p>
  895 
  896 <pre>
  897 *p = i + *p/2
  898 </pre>
  899 
  900 <p>
  901 That is, it must not rewrite the program into this one:
  902 </p>
  903 
  904 <pre>
  905 *p /= 2
  906 *p += i
  907 </pre>
  908 
  909 <p>
  910 If <code>i</code> and <code>*p</code> start equal to 2,
  911 the original code does <code>*p = 3</code>,
  912 so a racing goroutine can read only 2 or 3 from <code>*p</code>.
  913 The rewritten code does <code>*p = 1</code> and then <code>*p = 3</code>,
  914 allowing a racing goroutine to read 1 as well.
  915 </p>
  916 
  917 <p>
  918 Note that all these optimizations are permitted in C/C++ compilers:
  919 a Go compiler sharing a back end with a C/C++ compiler must take care
  920 to disable optimizations that are invalid for Go.
  921 </p>
  922 
  923 <p>
  924 Note that the prohibition on introducing data races
  925 does not apply if the compiler can prove that the races
  926 do not affect correct execution on the target platform.
  927 For example, on essentially all CPUs, it is valid to rewrite
  928 </p>
  929 
  930 <pre>
  931 n := 0
  932 for i := 0; i &lt; m; i++ {
  933     n += *shared
  934 }
  935 </pre>
  936 
  937 <p>into:</p>
  938 
  939 <pre>
  940 n := 0
  941 local := *shared
  942 for i := 0; i &lt; m; i++ {
  943     n += local
  944 }
  945 </pre>
  946 
  947 <p>
  948 provided it can be proved that <code>*shared</code> will not fault on access,
  949 because the potential added read will not affect any existing concurrent reads or writes.
  950 On the other hand, the rewrite would not be valid in a source-to-source translator.
  951 </p>
  952 
  953 <h2 id="conclusion">Conclusion</h2>
  954 
  955 <p>
  956 Go programmers writing data-race-free programs can rely on
  957 sequentially consistent execution of those programs,
  958 just as in essentially all other modern programming languages.
  959 </p>
  960 
  961 <p>
  962 When it comes to programs with races,
  963 both programmers and compilers should remember the advice:
  964 don't be clever.
  965 </p>