Multithreading in memcached *was* originally simple:

- One listener thread
- N "event worker" threads
- Some misc background threads

Each worker thread is assigned connections and runs its own epoll loop. The
central hash table, LRU lists, and some statistics counters are covered by
global locks. Protocol parsing and data transfer happen in the worker threads.
Data lookups and modifications happen under central locks.

That design has since changed:

- A secondary, smaller hash table of locks is used to lock an item by its hash
  value. This prevents multiple threads from acting on the same item at the
  same time.
- This secondary hash table is mapped to the central hash table's buckets. This
  allows multiple threads to access the hash table in parallel. Only one
  thread at a time may read or write against a particular hash table bucket.
- Atomic refcounts per item are used to manage garbage collection and
  mutability.

- When pulling an item off of the LRU tail for eviction or re-allocation, the
  system must attempt to lock the item's bucket, which is done with a trylock
  to avoid deadlocks. If a bucket is in use (and not by that thread), the
  thread walks up the LRU a little in an attempt to fetch a non-busy item.

- Each LRU (and each sub-LRU in newer modes) has an independent lock.

- Raw accesses to the slab class are protected by a global slabs_lock. This
  is a short lock which covers pushing and popping free memory.

- item_lock must be held while modifying an item.
- slabs_lock must be held while modifying the ITEM_SLABBED flag bit within an item.
- ITEM_LINKED must not be set before an item has a key copied into it.
- Items without ITEM_SLABBED set cannot have their memory zeroed out.
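
The striped item locks and the trylock walk up the LRU tail described above can
be sketched roughly as follows. This is a minimal illustration, not the actual
memcached code: the table size (LOCK_POWER), the item struct layout, and the
helpers lock_for() and lru_pull_tail() are all assumptions made for the
example. It also ignores the "already locked by this thread" case.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size: the lock table is a power of two, and is assumed to be
 * smaller than the main hash table so that many buckets share one lock. */
#define LOCK_POWER 10
#define LOCK_COUNT (1u << LOCK_POWER)

/* Hypothetical, stripped-down item: only what the sketch needs. */
typedef struct item {
    struct item *prev;   /* next item toward the LRU head */
    uint32_t hv;         /* hash value of the item's key */
    int id;
} item;

static pthread_mutex_t item_locks[LOCK_COUNT];

static void item_locks_init(void) {
    for (size_t i = 0; i < LOCK_COUNT; i++)
        pthread_mutex_init(&item_locks[i], NULL);
}

/* Map a hash value to its lock by masking down to the lock table size. */
static pthread_mutex_t *lock_for(uint32_t hv) {
    return &item_locks[hv & (LOCK_COUNT - 1)];
}

static void item_lock(uint32_t hv)   { pthread_mutex_lock(lock_for(hv)); }
static void item_unlock(uint32_t hv) { pthread_mutex_unlock(lock_for(hv)); }

/* Non-blocking variant: returns the lock if it was taken, NULL if busy. */
static pthread_mutex_t *item_trylock(uint32_t hv) {
    pthread_mutex_t *m = lock_for(hv);
    return pthread_mutex_trylock(m) == 0 ? m : NULL;
}

/* Starting at the LRU tail, look at up to max_tries items and return the
 * first one whose lock could be taken without waiting. On success the lock
 * is still held and returned through *held; the caller must unlock it.
 * The caller is assumed to already hold the lock for this LRU. */
static item *lru_pull_tail(item *tail, int max_tries, pthread_mutex_t **held) {
    for (item *it = tail; it != NULL && max_tries-- > 0; it = it->prev) {
        pthread_mutex_t *m = item_trylock(it->hv);
        if (m != NULL) {
            *held = m;
            return it;
        }
        /* Busy: skip this item and try the next one up the LRU. */
    }
    *held = NULL;
    return NULL;
}
```

Because the walk never blocks on an item lock, it cannot deadlock against a
thread that holds an item lock and is waiting for the LRU lock. The cost is
that the item actually returned may not be the very oldest one.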

Lock order (incomplete as of writing, sorry):

item_lock -> lru_lock -> slabs_lock

lru_lock -> item_trylock

Various stats_locks should never have other locks as dependencies.
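
One way to read the ordering above: a blocking acquire may only go "rightward"
(item, then LRU, then slabs), while a trylock on an item is fine under the LRU
lock because it never waits. The sketch below checks that rule with a small
per-thread rank stack. Everything in it (ranked_lock, the RANK_* values,
ranked_acquire() and friends) is hypothetical debugging scaffolding for
illustration, not something memcached provides. It uses a single item lock for
brevity and assumes locks are released in reverse order of acquisition.

```c
#include <pthread.h>
#include <stdbool.h>

/* Ranks follow the documented order: item_lock -> lru_lock -> slabs_lock. */
enum lock_rank { RANK_ITEM = 1, RANK_LRU = 2, RANK_SLABS = 3 };

typedef struct {
    pthread_mutex_t mutex;
    enum lock_rank rank;
} ranked_lock;

static ranked_lock item_l  = { PTHREAD_MUTEX_INITIALIZER, RANK_ITEM };
static ranked_lock lru_l   = { PTHREAD_MUTEX_INITIALIZER, RANK_LRU };
static ranked_lock slabs_l = { PTHREAD_MUTEX_INITIALIZER, RANK_SLABS };

#define MAX_HELD 8
static _Thread_local int held_rank[MAX_HELD]; /* ranks held by this thread */
static _Thread_local int depth;               /* number of locks held */

/* Blocking acquire. Refuses (returns false, takes nothing) if this thread
 * already holds a lock of equal or higher rank, since waiting here could
 * deadlock against a thread that follows the documented order. */
static bool ranked_acquire(ranked_lock *l) {
    if (depth >= MAX_HELD)
        return false;
    for (int i = 0; i < depth; i++)
        if (held_rank[i] >= (int)l->rank)
            return false;
    pthread_mutex_lock(&l->mutex);
    held_rank[depth++] = l->rank;
    return true;
}

/* Non-blocking acquire. Allowed in any order because it never waits;
 * returns false if the lock is busy. */
static bool ranked_tryacquire(ranked_lock *l) {
    if (depth >= MAX_HELD || pthread_mutex_trylock(&l->mutex) != 0)
        return false;
    held_rank[depth++] = l->rank;
    return true;
}

/* Release the most recently acquired lock (LIFO order assumed). */
static void ranked_release(ranked_lock *l) {
    depth--;
    pthread_mutex_unlock(&l->mutex);
}
```

With this in place, taking item_l, lru_l and slabs_l in that order succeeds,
a blocking ranked_acquire(&item_l) while holding lru_l is rejected, and
ranked_tryacquire(&item_l) under lru_l is allowed.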

Various locks exist for background threads. They can be used to pause the
thread execution or update settings while the threads are idle. The background
threads may take item or LRU locks.

A low priority issue:

- If you remove the per-thread stats lock, CPU usage goes down by less than a
  point of a percent, and it does not improve scalability.
- In my testing, the remaining global STATS_LOCK calls never seem to collide.

Yes, more stats can be moved to threads, and those locks can actually be
removed entirely on x86-64 systems. However, my tests haven't shown that as
beneficial so far, so I've prioritized other work.