Member "go/doc/articles/race_detector.html" (9 Sep 2020, 9811 Bytes) of package /windows/misc/go1.14.9.windows-386.zip:
<!--{
    "Title": "Data Race Detector",
    "Template": true
}-->

<h2 id="Introduction">Introduction</h2>

<p>
Data races are among the most common and hardest to debug types of bugs in concurrent systems.
A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
See <a href="/ref/mem/">The Go Memory Model</a> for details.
</p>

<p>
Here is an example of a data race that can lead to crashes and memory corruption:
</p>

<pre>
func main() {
    c := make(chan bool)
    m := make(map[string]string)
    go func() {
        m["1"] = "a" // First conflicting access.
        c &lt;- true
    }()
    m["2"] = "b" // Second conflicting access.
    &lt;-c
    for k, v := range m {
        fmt.Println(k, v)
    }
}
</pre>
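<p>
One way to remove this race, sketched here as a complete program, is to wait for the
goroutine before touching the map again. The function name <code>buildMap</code> and the
sorted output are additions for this illustration and are not part of the example above.
</p>

```go
package main

import (
	"fmt"
	"sort"
)

// buildMap fills the map from two goroutines without a data race:
// the main goroutine receives from the channel before it touches the map,
// so the two writes are ordered by the channel send and receive.
func buildMap() map[string]string {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // Only this goroutine touches m until it signals.
		c <- true
	}()
	<-c          // Wait for the goroutine before accessing m.
	m["2"] = "b" // Ordered after the first write by the receive above.
	return m
}

func main() {
	m := buildMap()
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // Map iteration order is unspecified; sort for stable output.
	for _, k := range keys {
		fmt.Println(k, m[k])
	}
}
```

<p>
The only change in substance is that the receive from <code>c</code> now happens before the
second write, so the accesses no longer overlap.
</p>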

<h2 id="Usage">Usage</h2>

<p>
To help diagnose such bugs, Go includes a built-in data race detector.
To use it, add the <code>-race</code> flag to the go command:
</p>

<pre>
$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package
</pre>

<h2 id="Report_Format">Report Format</h2>

<p>
When the race detector finds a data race in the program, it prints a report.
The report contains stack traces for conflicting accesses, as well as stacks where the involved goroutines were created.
Here is an example:
</p>

<pre>
WARNING: DATA RACE
Read by goroutine 185:
  net.(*pollServer).AddFD()
      src/net/fd_unix.go:89 +0x398
  net.(*pollServer).WaitWrite()
      src/net/fd_unix.go:247 +0x45
  net.(*netFD).Write()
      src/net/fd_unix.go:540 +0x4d4
  net.(*conn).Write()
      src/net/net.go:129 +0x101
  net.func·060()
      src/net/timeout_test.go:603 +0xaf

Previous write by goroutine 184:
  net.setWriteDeadline()
      src/net/sockopt_posix.go:135 +0xdf
  net.setDeadline()
      src/net/sockopt_posix.go:144 +0x9c
  net.(*conn).SetDeadline()
      src/net/net.go:161 +0xe3
  net.func·061()
      src/net/timeout_test.go:616 +0x3ed

Goroutine 185 (running) created at:
  net.func·061()
      src/net/timeout_test.go:609 +0x288

Goroutine 184 (running) created at:
  net.TestProlongTimeout()
      src/net/timeout_test.go:618 +0x298
  testing.tRunner()
      src/testing/testing.go:301 +0xe8
</pre>

<h2 id="Options">Options</h2>

<p>
The <code>GORACE</code> environment variable sets race detector options.
The format is:
</p>

<pre>
GORACE="option1=val1 option2=val2"
</pre>

<p>
The options are:
</p>

<ul>
<li>
<code>log_path</code> (default <code>stderr</code>): The race detector writes
its report to a file named <code>log_path.<em>pid</em></code>.
The special names <code>stdout</code>
and <code>stderr</code> cause reports to be written to standard output and
standard error, respectively.
</li>

<li>
<code>exitcode</code> (default <code>66</code>): The exit status to use when
exiting after a detected race.
</li>

<li>
<code>strip_path_prefix</code> (default <code>""</code>): Strip this prefix
from all reported file paths, to make reports more concise.
</li>

<li>
<code>history_size</code> (default <code>1</code>): The per-goroutine memory
access history is <code>32K * 2**history_size elements</code>.
Increasing this value can avoid a "failed to restore the stack" error in reports, at the
cost of increased memory usage.
</li>

<li>
<code>halt_on_error</code> (default <code>0</code>): Controls whether the program
exits after reporting the first data race.
</li>

<li>
<code>atexit_sleep_ms</code> (default <code>1000</code>): Number of milliseconds
to sleep in the main goroutine before exiting.
</li>
</ul>

<p>
Example:
</p>

<pre>
$ GORACE="log_path=/tmp/race/report strip_path_prefix=/my/go/sources/" go test -race
</pre>

<h2 id="Excluding_Tests">Excluding Tests</h2>

<p>
When you build with the <code>-race</code> flag, the <code>go</code> command defines an additional
<a href="/pkg/go/build/#hdr-Build_Constraints">build tag</a>, <code>race</code>.
You can use the tag to exclude some code and tests when running the race detector.
Some examples:
</p>

<pre>
// +build !race

package foo

// The test contains a data race. See issue 123.
func TestFoo(t *testing.T) {
    // ...
}

// The test fails under the race detector due to timeouts.
func TestBar(t *testing.T) {
    // ...
}

// The test takes too long under the race detector.
func TestBaz(t *testing.T) {
    // ...
}
</pre>

<h2 id="How_To_Use">How To Use</h2>

<p>
To start, run your tests using the race detector (<code>go test -race</code>).
The race detector only finds races that happen at runtime, so it can't find
races in code paths that are not executed.
If your tests have incomplete coverage,
you may find more races by running a binary built with <code>-race</code> under a realistic
workload.
</p>
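<p>
As a rough sketch of what such a workload can look like, the program below calls one method
from several goroutines at once, so the detector gets to observe concurrent executions.
The <code>Counter</code> type and the <code>hammer</code> and <code>runWorkload</code> helpers
are hypothetical names invented for this illustration; substitute the code you want to exercise.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical stand-in for the code under test.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter under the mutex.
func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Value returns the current count under the mutex.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer calls f from the given number of goroutines, iterations times each,
// and returns once all of them have finished.
func hammer(workers, iterations int, f func()) {
	var wg sync.WaitGroup
	wg.Add(workers)
	for w := 0; w < workers; w++ {
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				f()
			}
		}()
	}
	wg.Wait()
}

// runWorkload drives Counter.Inc concurrently and returns the final count.
func runWorkload() int {
	var c Counter
	hammer(8, 1000, c.Inc)
	return c.Value()
}

func main() {
	fmt.Println(runWorkload())
}
```

<p>
Run with <code>go run -race</code>, this version should finish without a report; removing the
mutex from <code>Inc</code> should cause the detector to flag the unsynchronized increment.
</p>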

<h2 id="Typical_Data_Races">Typical Data Races</h2>

<p>
Here are some typical data races.  All of them can be detected with the race detector.
</p>

<h3 id="Race_on_loop_counter">Race on loop counter</h3>

<pre>
func main() {
    var wg sync.WaitGroup
    wg.Add(5)
    for i := 0; i &lt; 5; i++ {
        go func() {
            fmt.Println(i) // Not the 'i' you are looking for.
            wg.Done()
        }()
    }
    wg.Wait()
}
</pre>

<p>
The variable <code>i</code> in the function literal is the same variable used by the loop, so
the read in the goroutine races with the loop increment.
(This program typically prints 55555, not 01234.)
The program can be fixed by making a copy of the variable:
</p>

<pre>
func main() {
    var wg sync.WaitGroup
    wg.Add(5)
    for i := 0; i &lt; 5; i++ {
        go func(j int) {
            fmt.Println(j) // Good. Read local copy of the loop counter.
            wg.Done()
        }(i)
    }
    wg.Wait()
}
</pre>
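<p>
Another common way to make the copy, shown here only as an alternative sketch, is to declare a
new variable inside the loop body. The <code>collect</code> helper, the result slice, and the
mutex that guards it are additions for this illustration so that the result can be checked.
</p>

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts n goroutines; each one records its own copy of the loop counter.
func collect(n int) []int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex // Guards out, which all goroutines append to.
		out []int
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		i := i // New variable per iteration; the closure captures this copy.
		go func() {
			defer wg.Done()
			mu.Lock()
			out = append(out, i)
			mu.Unlock()
		}()
	}
	wg.Wait()
	sort.Ints(out) // Goroutines finish in no particular order.
	return out
}

func main() {
	fmt.Println(collect(5))
}
```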

<h3 id="Accidentally_shared_variable">Accidentally shared variable</h3>

<pre>
// ParallelWrite writes data to file1 and file2, returns the errors.
func ParallelWrite(data []byte) chan error {
    res := make(chan error, 2)
    f1, err := os.Create("file1")
    if err != nil {
        res &lt;- err
    } else {
        go func() {
            // This err is shared with the main goroutine,
            // so the write races with the write below.
            _, err = f1.Write(data)
            res &lt;- err
            f1.Close()
        }()
    }
    f2, err := os.Create("file2") // The second conflicting write to err.
    if err != nil {
        res &lt;- err
    } else {
        go func() {
            _, err = f2.Write(data)
            res &lt;- err
            f2.Close()
        }()
    }
    return res
}
</pre>

<p>
The fix is to introduce new variables in the goroutines (note the use of <code>:=</code>):
</p>

<pre>
            ...
            _, err := f1.Write(data)
            ...
            _, err := f2.Write(data)
            ...
</pre>
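<p>
Put together, the corrected function might look like the sketch below. To keep the program
self-contained, this version takes the two file paths as parameters and adds a
<code>demo</code> helper that writes into a temporary directory; both are assumptions made
for this illustration and differ from the fixed file names used above.
</p>

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ParallelWrite writes data to path1 and path2 and returns a channel that
// delivers two results, one per file. The path parameters are an addition
// for this sketch.
func ParallelWrite(data []byte, path1, path2 string) chan error {
	res := make(chan error, 2)
	f1, err := os.Create(path1)
	if err != nil {
		res <- err
	} else {
		go func() {
			// := declares a new err that is local to this goroutine.
			_, err := f1.Write(data)
			res <- err
			f1.Close()
		}()
	}
	f2, err := os.Create(path2) // No longer conflicts with the goroutine above.
	if err != nil {
		res <- err
	} else {
		go func() {
			_, err := f2.Write(data)
			res <- err
			f2.Close()
		}()
	}
	return res
}

// demo is a hypothetical driver: it writes to two files in a temporary
// directory and returns the first error reported, if any.
func demo() error {
	dir, err := os.MkdirTemp("", "parallelwrite")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	res := ParallelWrite([]byte("hello"), filepath.Join(dir, "file1"), filepath.Join(dir, "file2"))
	for i := 0; i < 2; i++ {
		if err := <-res; err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println("both writes succeeded")
}
```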

<h3 id="Unprotected_global_variable">Unprotected global variable</h3>

<p>
If the following code is called from several goroutines, it leads to races on the <code>service</code> map.
Concurrent reads and writes of the same map are not safe:
</p>

<pre>
// The map must be initialized; writing to a nil map panics.
var service = make(map[string]net.Addr)

func RegisterService(name string, addr net.Addr) {
    service[name] = addr
}

func LookupService(name string) net.Addr {
    return service[name]
}
</pre>

<p>
To make the code safe, protect the accesses with a mutex:
</p>

<pre>
var (
    service   = make(map[string]net.Addr)
    serviceMu sync.Mutex
)

func RegisterService(name string, addr net.Addr) {
    serviceMu.Lock()
    defer serviceMu.Unlock()
    service[name] = addr
}

func LookupService(name string) net.Addr {
    serviceMu.Lock()
    defer serviceMu.Unlock()
    return service[name]
}
</pre>
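<p>
If lookups greatly outnumber registrations, a <code>sync.RWMutex</code> is one possible
variation, since it lets readers proceed in parallel. The sketch below is an alternative,
not the fix shown above; the <code>registerAll</code> helper, the service names, and the
address are hypothetical and exist only to exercise the two functions.
</p>

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

var (
	service   = make(map[string]net.Addr) // Initialized; a nil map would panic on write.
	serviceMu sync.RWMutex
)

// RegisterService takes the write lock, excluding all readers and writers.
func RegisterService(name string, addr net.Addr) {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	service[name] = addr
}

// LookupService takes the read lock, so lookups can run concurrently.
func LookupService(name string) net.Addr {
	serviceMu.RLock()
	defer serviceMu.RUnlock()
	return service[name]
}

// registerAll registers n hypothetical services concurrently and returns
// how many of them can be looked up afterwards.
func registerAll(n int) int {
	addr := &net.TCPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 8080}
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(id int) {
			defer wg.Done()
			RegisterService(fmt.Sprintf("svc%d", id), addr)
			_ = LookupService("svc0") // Concurrent read; svc0 may not exist yet.
		}(i)
	}
	wg.Wait()
	found := 0
	for i := 0; i < n; i++ {
		if LookupService(fmt.Sprintf("svc%d", i)) != nil {
			found++
		}
	}
	return found
}

func main() {
	fmt.Println(registerAll(4))
}
```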

<h3 id="Primitive_unprotected_variable">Primitive unprotected variable</h3>

<p>
Data races can happen on variables of primitive types as well (<code>bool</code>, <code>int</code>, <code>int64</code>, etc.),
as in this example:
</p>

<pre>
type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
    w.last = time.Now().UnixNano() // First conflicting access.
}

func (w *Watchdog) Start() {
    go func() {
        for {
            time.Sleep(time.Second)
            // Second conflicting access.
            if w.last &lt; time.Now().Add(-10*time.Second).UnixNano() {
                fmt.Println("No keepalives for 10 seconds. Dying.")
                os.Exit(1)
            }
        }
    }()
}
</pre>

<p>
Even such "innocent" data races can lead to hard-to-debug problems caused by
non-atomicity of the memory accesses,
interference with compiler optimizations,
or reordering issues accessing processor memory.
</p>

<p>
A typical fix for this race is to use a channel or a mutex.
To preserve the lock-free behavior, one can also use the
<a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> package.
</p>

<pre>
type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
    atomic.StoreInt64(&amp;w.last, time.Now().UnixNano())
}

func (w *Watchdog) Start() {
    go func() {
        for {
            time.Sleep(time.Second)
            if atomic.LoadInt64(&amp;w.last) &lt; time.Now().Add(-10*time.Second).UnixNano() {
                fmt.Println("No keepalives for 10 seconds. Dying.")
                os.Exit(1)
            }
        }
    }()
}
</pre>
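<p>
For comparison, the mutex-based fix mentioned above could be sketched as follows.
The <code>mu</code> field and the <code>expired</code> and <code>selfCheck</code> helpers are
names introduced for this illustration; they do not appear in the original example.
</p>

```go
package main

import (
	"fmt"
	"os"
	"sync"
	"time"
)

type Watchdog struct {
	mu   sync.Mutex
	last int64 // Guarded by mu.
}

// KeepAlive records the current time under the mutex.
func (w *Watchdog) KeepAlive() {
	w.mu.Lock()
	w.last = time.Now().UnixNano()
	w.mu.Unlock()
}

// expired reports whether the last keepalive is older than timeout.
// It is a helper introduced for this sketch.
func (w *Watchdog) expired(timeout time.Duration) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.last < time.Now().Add(-timeout).UnixNano()
}

// Start checks once per second and exits the program after 10 seconds
// without a keepalive.
func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			if w.expired(10 * time.Second) {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}

// selfCheck is a helper for this sketch: a fresh Watchdog counts as expired,
// and one that has just received a keepalive does not.
func selfCheck() bool {
	w := &Watchdog{}
	before := w.expired(10 * time.Second)
	w.KeepAlive()
	after := w.expired(10 * time.Second)
	return before && !after
}

func main() {
	w := &Watchdog{}
	w.KeepAlive()
	w.Start()
	time.Sleep(1500 * time.Millisecond) // Let the watchdog check once.
	fmt.Println("still alive")
}
```

<p>
Both accesses to <code>last</code> now happen while holding <code>mu</code>, at the cost of
giving up the lock-free behavior of the <code>sync/atomic</code> version.
</p>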

<h2 id="Supported_Systems">Supported Systems</h2>

<p>
  The race detector runs on
  <code>linux/amd64</code>, <code>linux/ppc64le</code>,
  <code>linux/arm64</code>, <code>freebsd/amd64</code>,
  <code>netbsd/amd64</code>, <code>darwin/amd64</code>,
  and <code>windows/amd64</code>.
</p>

<h2 id="Runtime_Overheads">Runtime Overhead</h2>

<p>
The cost of race detection varies by program, but for a typical program, memory
usage may increase by 5-10x and execution time by 2-20x.
</p>

<p>
The race detector currently allocates an extra 8 bytes per <code>defer</code>
and <code>recover</code> statement. Those extra allocations <a
href="https://golang.org/issue/26813">are not recovered until the goroutine
exits</a>. This means that if you have a long-running goroutine that is
periodically issuing <code>defer</code> and <code>recover</code> calls,
the program memory usage may grow without bound. These memory allocations
will not show up in the output of <code>runtime.ReadMemStats</code> or
<code>runtime/pprof</code>.
</p>