qcow2 L2/refcount cache configuration
=====================================
Copyright (C) 2015, 2018-2020 Igalia, S.L.
Author: Alberto Garcia <berto@igalia.com>

This work is licensed under the terms of the GNU GPL, version 2 or
later. See the COPYING file in the top-level directory.

Introduction
------------
The QEMU qcow2 driver has two caches that can improve the I/O
performance significantly. However, setting the right cache sizes is
not a straightforward operation.

This document attempts to give an overview of the L2 and refcount
caches, and how to configure them.

Please refer to the docs/interop/qcow2.txt file for an in-depth
technical description of the qcow2 file format.


Clusters
--------
A qcow2 file is organized in units of constant size called clusters.

The cluster size is configurable, but it must be a power of two and
at least 512 bytes. QEMU currently defaults to 64 KB clusters, and it
does not support sizes larger than 2 MB.

The 'qemu-img create' command supports specifying the size using the
cluster_size option:

   qemu-img create -f qcow2 -o cluster_size=128K hd.qcow2 4G

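The cluster size constraints above can be checked with a few lines of
Python. This is only an illustrative sketch: the function and constant
names are made up for this example and are not part of QEMU.

```python
MIN_CLUSTER_SIZE = 512           # bytes, lower bound stated above
MAX_CLUSTER_SIZE = 2 * 1024**2   # bytes, 2 MB upper bound stated above


def is_valid_cluster_size(size: int) -> bool:
    """Return True if size is a power of two between 512 bytes and 2 MB."""
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and MIN_CLUSTER_SIZE <= size <= MAX_CLUSTER_SIZE


# 128K (as in the qemu-img example) is valid; 100000 is not a power of two.
for candidate in (512, 65536, 131072, 100000, 4 * 1024**2):
    print(candidate, is_valid_cluster_size(candidate))
```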

The L2 tables
-------------
The qcow2 format uses a two-level structure to map the virtual disk as
seen by the guest to the disk image in the host. These structures are
called the L1 and L2 tables.

There is one single L1 table per disk image. The table is small and is
always kept in memory.

There can be many L2 tables, depending on how much space has been
allocated in the image. Each table is one cluster in size. In order to
read or write data from the virtual disk, QEMU needs to read its
corresponding L2 table to find out where that data is located. Since
reading the table for each I/O operation can be expensive, QEMU keeps
an L2 cache in memory to speed up disk access.

The size of the L2 cache can be configured, and setting the right
value can improve the I/O performance significantly.

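To make the two-level mapping more concrete, the sketch below splits a
guest offset into an L1 index, an L2 index and an offset inside the
cluster. It assumes standard 8-byte (64-bit) L2 entries and one cluster
per L2 table; the function name is hypothetical and the code only
models the index arithmetic, not how QEMU reads the tables.

```python
L2_ENTRY_SIZE = 8  # bytes per standard (64-bit) L2 entry


def split_guest_offset(offset: int, cluster_size: int = 65536):
    """Split a guest byte offset into (l1_index, l2_index, offset_in_cluster)."""
    entries_per_l2_table = cluster_size // L2_ENTRY_SIZE
    cluster_index = offset // cluster_size
    l1_index = cluster_index // entries_per_l2_table   # which L2 table
    l2_index = cluster_index % entries_per_l2_table    # entry inside that table
    return l1_index, l2_index, offset % cluster_size


# With 64 KB clusters one L2 table holds 8192 entries and maps 512 MB,
# so guest offset 512 MB is the first entry of the second L2 table.
print(split_guest_offset(512 * 1024**2))  # (1, 0, 0)
```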

The refcount blocks
-------------------
The qcow2 format also maintains a reference count for each cluster.
Reference counts are used for cluster allocation and internal
snapshots. The data is stored in a two-level structure similar to the
L1/L2 tables described above.

The second-level structures are called refcount blocks. Like L2
tables, each one is one cluster in size, and their number is variable
and depends on the amount of allocated space.

Each block contains a number of refcount entries. Their size (in bits)
is a power of two and must not be higher than 64. It defaults to 16
bits, but a different value can be set using the refcount_bits option:

   qemu-img create -f qcow2 -o refcount_bits=8 hd.qcow2 4G

QEMU keeps a refcount cache to speed up I/O much like the
aforementioned L2 cache, and its size can also be configured.

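As a rough illustration of how refcount_bits affects a refcount block,
the following sketch computes how many entries fit in one block and
how much image space one block describes. The helper names are made up
for this example.

```python
def refcount_entries_per_block(cluster_size: int = 65536,
                               refcount_bits: int = 16) -> int:
    """Number of refcount entries that fit in one cluster-sized block."""
    if refcount_bits not in (1, 2, 4, 8, 16, 32, 64):
        raise ValueError("refcount_bits must be a power of two, at most 64")
    return cluster_size * 8 // refcount_bits


def bytes_covered_per_block(cluster_size: int = 65536,
                            refcount_bits: int = 16) -> int:
    """Image space described by one refcount block (one entry per cluster)."""
    return refcount_entries_per_block(cluster_size, refcount_bits) * cluster_size


print(refcount_entries_per_block())          # 32768 entries with the defaults
print(bytes_covered_per_block() // 1024**3)  # 2 (GB per refcount block)
```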

Choosing the right cache sizes
------------------------------
In order to choose the cache sizes we need to know how they relate to
the amount of allocated space.

The part of the virtual disk that can be mapped by the L2 and refcount
caches (in bytes) is:

   disk_size = l2_cache_size * cluster_size / 8
   disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits

With the default values for cluster_size (64 KB) and refcount_bits
(16), this becomes:

   disk_size = l2_cache_size * 8192
   disk_size = refcount_cache_size * 32768

So in order to cover disk_size_GB gigabytes of disk space with the
default values we need (in bytes):

   l2_cache_size = disk_size_GB * 131072
   refcount_cache_size = disk_size_GB * 32768

For example, 1 MB of L2 cache is needed to cover every 8 GB of the virtual
image size (given that the default cluster size is used):

   8 GB / 8192 = 1 MB

The refcount cache is 4 times the cluster size by default. With the default
cluster size of 64 KB, it is 256 KB (262144 bytes). This is sufficient for
8 GB of image size:

   262144 * 32768 = 8 GB

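The formulas above can be turned around to compute the cache sizes
needed for a given disk size. The following is a minimal sketch with
hypothetical function names; it rounds up when the division is not
exact and treats 1 GB as 2^30 bytes, as the rest of this document does.

```python
GB = 1024**3


def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def l2_cache_size_for(disk_size: int, cluster_size: int = 65536) -> int:
    """Bytes of L2 cache needed to map disk_size bytes (8-byte L2 entries)."""
    return ceil_div(disk_size * 8, cluster_size)


def refcount_cache_size_for(disk_size: int, cluster_size: int = 65536,
                            refcount_bits: int = 16) -> int:
    """Bytes of refcount cache needed to cover disk_size bytes."""
    return ceil_div(disk_size * refcount_bits, cluster_size * 8)


print(l2_cache_size_for(8 * GB))        # 1048576 (1 MB)
print(refcount_cache_size_for(8 * GB))  # 262144 (256 KB)
```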

How to configure the cache sizes
--------------------------------
Cache sizes can be configured using the -drive option on the
command line, or the 'blockdev-add' QMP command.

There are three options available, and all of them take bytes:

"l2-cache-size":         maximum size of the L2 table cache
"refcount-cache-size":   maximum size of the refcount block cache
"cache-size":            maximum size of both caches combined

There are a few things that need to be taken into account:

 - Both caches must have a size that is a multiple of the cluster size
   (or the cache entry size: see "Using smaller cache entries" below).

 - The maximum L2 cache size is 32 MB by default on Linux platforms (enough
   for full coverage of 256 GB images, with the default cluster size). This
   value can be modified using the "l2-cache-size" option. QEMU will not use
   more memory than needed to hold all of the image's L2 tables, regardless
   of this maximum.
   On non-Linux platforms the default maximum is smaller (8 MB). The
   difference exists because on Linux the cache can be cleaned
   periodically if needed, using the "cache-clean-interval" option (see
   "Reducing the memory usage" below).
   The minimum L2 cache size is 2 clusters (or 2 cache entries, see below).

 - The default (and minimum) refcount cache size is 4 clusters.

 - If only "cache-size" is specified then QEMU will assign as much
   memory as possible to the L2 cache before increasing the refcount
   cache size.

 - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
   can be set simultaneously.

Unlike L2 tables, refcount blocks are not used during normal I/O but
only during allocations and internal snapshots. In most cases they are
accessed sequentially (even during random guest I/O) so increasing the
refcount cache size won't have any measurable effect on performance
(this can change if you are using internal snapshots, so you may want
to think about increasing the cache size if you use them heavily).

Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
L2 cache size. This resulted in unnecessarily large caches, so now the
refcount cache is as small as possible unless overridden by the user.

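The rule for a lone "cache-size" option can be modelled as follows.
This is one possible reading of the points listed above, not a copy of
QEMU's code: it assumes the cache entry size equals the cluster size,
that "cache-size" is a multiple of the cluster size, and that the L2
cache never needs to exceed what full coverage of the image requires.
All names are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def split_combined_cache(cache_size: int, disk_size: int,
                         cluster_size: int = 65536):
    """Split "cache-size" into (l2_cache_size, refcount_cache_size)."""
    min_refcount = 4 * cluster_size   # default and minimum refcount cache
    min_l2 = 2 * cluster_size         # minimum L2 cache
    if cache_size < min_l2 + min_refcount:
        raise ValueError("cache-size is too small for both minimum caches")

    # L2 metadata needed for the whole image, rounded up to whole clusters.
    l2_table_bytes = ceil_div(disk_size, cluster_size) * 8
    full_l2 = max(min_l2, ceil_div(l2_table_bytes, cluster_size) * cluster_size)

    # Give the L2 cache as much as it can use; the refcount cache gets the rest.
    l2 = min(cache_size - min_refcount, full_l2)
    return l2, cache_size - l2


MB, GB = 1024**2, 1024**3
print(split_combined_cache(2 * MB, 8 * GB))    # (1048576, 1048576)
print(split_combined_cache(1 * MB, 64 * GB))   # (786432, 262144)
```

In the first call the L2 cache already covers the whole 8 GB image with
1 MB, so the remainder goes to the refcount cache. In the second call
the image would need 8 MB of L2 cache, so the refcount cache stays at
its 4-cluster minimum.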

Using smaller cache entries
---------------------------
The qcow2 L2 cache can store complete tables. This means that if QEMU
needs an entry from an L2 table then the whole table is read from disk
and is kept in the cache. If the cache is full then a complete table
needs to be evicted first.

This can be inefficient with large cluster sizes since it results in
more disk I/O and wastes more cache memory.

Since QEMU 2.12 you can change the size of the L2 cache entry and make
it smaller than the cluster size. This can be configured using the
"l2-cache-entry-size" parameter:

   -drive file=hd.qcow2,l2-cache-size=2097152,l2-cache-entry-size=4096

Since QEMU 4.0 the value of l2-cache-entry-size defaults to 4 KB (or
the cluster size if it's smaller).

Some things to take into account:

 - The L2 cache entry size has the same restrictions as the cluster
   size (power of two, at least 512 bytes).

 - Smaller entry sizes generally improve the cache efficiency and make
   disk I/O faster. This is particularly true with solid state drives
   so it's a good idea to reduce the entry size in those cases. With
   rotating hard drives the situation is a bit more complicated so you
   should test it first and stay with the default size if unsure.

 - Try different entry sizes to see which one performs best in your
   case. The block size of the host filesystem is generally a good
   default (usually 4096 bytes in the case of ext4, hence the
   default).

 - Only the L2 cache can be configured this way. The refcount cache
   always uses the cluster size as the entry size.

 - If the L2 cache is big enough to hold all of the image's L2 tables
   (as explained in the "Choosing the right cache sizes" and "How to
   configure the cache sizes" sections in this document) then none of
   this is necessary and you can omit the "l2-cache-entry-size"
   parameter altogether. In this case QEMU makes the entry size
   equal to the cluster size by default.

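To see what a given entry size means in practice, the sketch below
computes how many entries fit in the L2 cache and how much of the
virtual disk each entry maps. It assumes 8-byte L2 entries and an entry
size no larger than the cluster size; the function name is made up for
this example.

```python
def l2_cache_layout(l2_cache_size: int, entry_size: int = 4096,
                    cluster_size: int = 65536):
    """Return (number_of_entries, bytes_mapped_per_entry, bytes_mapped_total)."""
    if entry_size < 512 or entry_size & (entry_size - 1):
        raise ValueError("entry size must be a power of two, at least 512")
    if entry_size > cluster_size:
        raise ValueError("entry size must not exceed the cluster size")
    entries = l2_cache_size // entry_size
    mapped_per_entry = entry_size // 8 * cluster_size
    return entries, mapped_per_entry, entries * mapped_per_entry


# Values from the -drive example above: 2 MB cache, 4 KB entries.
entries, per_entry, total = l2_cache_layout(2097152, 4096)
print(entries, per_entry // 1024**2, total // 1024**3)  # 512 32 16
```

With these values the cache holds 512 entries of 4 KB, each mapping
32 MB of the virtual disk, instead of 32 entries of 64 KB each mapping
512 MB. The total coverage (16 GB) is the same in both cases.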

Reducing the memory usage
-------------------------
It is possible to clean unused cache entries in order to reduce the
memory usage during periods of low I/O activity.

The parameter "cache-clean-interval" defines an interval (in seconds),
after which all the cache entries that haven't been accessed during the
interval are removed from memory. Setting this parameter to 0 disables this
feature.

The following example removes all unused cache entries every 15 minutes:

   -drive file=hd.qcow2,cache-clean-interval=900

If unset, the default value for this parameter is 600 on platforms which
support this functionality, and is 0 (disabled) on other platforms.

This functionality currently relies on the MADV_DONTNEED argument for
madvise() to actually free the memory. This is a Linux-specific feature,
so cache-clean-interval is not supported on other systems.

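The effect of "cache-clean-interval" can be pictured with a toy model
that remembers when each cache entry was last accessed and drops the
ones that were idle for a whole interval. This is only a conceptual
sketch with made-up names; it does not reflect how QEMU tracks entries
or releases memory.

```python
class CacheCleanModel:
    """Toy model of periodic cache cleaning (times are in seconds)."""

    def __init__(self, clean_interval: int):
        self.clean_interval = clean_interval
        self.last_access = {}   # entry key -> time of last access

    def access(self, key, now: int) -> None:
        self.last_access[key] = now

    def clean(self, now: int) -> int:
        """Drop entries not accessed during the last interval; return count."""
        if self.clean_interval == 0:   # 0 disables the feature
            return 0
        stale = [key for key, t in self.last_access.items()
                 if now - t >= self.clean_interval]
        for key in stale:
            del self.last_access[key]
        return len(stale)


cache = CacheCleanModel(clean_interval=900)
cache.access("table-a", now=0)
cache.access("table-b", now=600)
print(cache.clean(now=900))        # 1: only "table-a" was idle for 900 s
print(sorted(cache.last_access))   # ['table-b']
```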

Extended L2 entries
-------------------
All numbers shown in this document are valid for qcow2 images with normal
64-bit L2 entries.

Images with extended L2 entries need twice as much L2 metadata, so the L2
cache size must be twice as large for the same disk space:

   disk_size = l2_cache_size * cluster_size / 16

i.e.

   l2_cache_size = disk_size * 16 / cluster_size

Refcount blocks are not affected by this.
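
The difference between normal and extended L2 entries can be expressed
as a single parameter in the sizing formula. The sketch below uses a
hypothetical helper that takes the entry size as 8 or 16 bytes
depending on the image type.

```python
def l2_cache_size_bytes(disk_size: int, cluster_size: int = 65536,
                        extended_l2: bool = False) -> int:
    """L2 cache size (bytes) needed to map disk_size bytes of virtual disk."""
    l2_entry_size = 16 if extended_l2 else 8   # bytes per L2 entry
    return disk_size * l2_entry_size // cluster_size


GB = 1024**3
print(l2_cache_size_bytes(8 * GB))                    # 1048576 (1 MB)
print(l2_cache_size_bytes(8 * GB, extended_l2=True))  # 2097152 (2 MB)
```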