Bitmap
Introduction
The Bitmap operator is used to build a bitmap that, based on a hash, represents which values may be present in a data flow. Due to the chance of hash collisions in the hash function used, the Bitmap process can produce false positives but not false negatives – so a match based on a bitmap is not guaranteed to be a match to the actual data, but a non-match based on a bitmap is guaranteed to not be a match in the actual data.
The generated bitmap is typically used in other operators to remove rows for which there is no match in the bitmap, and hence guaranteed no match in the original set of data processed by the Bitmap operator. The use of Bitmap operators is most common in execution plans for star join queries in large data warehouses. An example can be seen here.
Visual appearance in execution plans
Depending on the tool being used, a Bitmap operator is displayed in a graphical execution plan as shown below:
SSMS and ADS |
Legacy SSMS |
Plan Explorer |
Paste The Plan |
Algorithm
The basic algorithm for the Bitmap operator is as shown below:
Note that this flowchart is a simplification. It doesn’t show that execution stops whenever a row is returned, and resumes where it was upon next being called.
Bitmap implementation details
It is unknown how exactly the bitmap is stored. I assume that it uses a contiguous array of N bytes, with N being the range of the hash function used divided by 8. I also assume that all bits are initialized to 0 (for false). Then when a row is processed, the hash function results in a number which is divided by 8 to find the byte position (dividend) and bit position within the byte (remainder); that bit is then set to 1 (for true). If it is already 1, it is probably just set to 1 again as testing before writing would be more expensive than overwriting a value with the same value. However, all of the above are speculations (and none are crucial to understanding how the Bitmap operator works).
Row mode vs batch mode
The bitmap operator is only used in row mode (sections of) execution plans, and the bitmap produced is what Paul White calls a “row -mode bitmap”. Since SQL Server 2012, execution plans can also use “batch-mode bitmaps”, but those are created by the Batch Hash Table Build operator in SQL Server 2012, and by Hash Match and Adaptive Join in SQL Server 2014 and later.
Operator properties
The properties below are specific to the Bitmap operator, or have a specific meaning when appearing on it. For all other properties, see Common properties. Properties that are included on the Common properties page but are also included below for their specific meaning for the Bitmap operator are marked with a *.
Property name | Description |
---|---|
Defined Values * | This property lists the name of the created bitmap. This is especially important in execution plans that use multiple Bitmap operators. There is no definition explicitly given for the listed column (not needed, it is always a bitmap). Unlike other operators that have a Defined Values property, the defined column is not added to the rows returned and hence not included in the Output List property. The bitmap is instead stored in a centrally accessible location in memory and stores information about all rows processed. |
Hash Keys | The property lists the column(s) that are hashed in order to find the location of a bit within the bitmap. |
Implicit properties
This table below lists the behavior of the implicit properties for the Bitmap operator.
Property name | Description |
---|---|
Batch Mode enabled | The Bitmap operator supports row mode execution only. In batch mode execution plans in SQL Server 2012, Batch Hash Table Build operator fulfills a similar role. As of SQL Server 2014, batch mode execution plans can now build hash tables while processing a Hash Match or Adaptive Join operator, by setting their BitmapCreator property to True. |
Blocking | The Bitmap operator is non-blocking. |
Memory requirement | The Bitmap operator does not have any special memory requirement, except for the memory to store the bitmap itself (which arguably can be considered part of the plan as a whole, not of this operator only). |
Order-preserving | The Bitmap operator is fully order-preserving. |
Parallelism aware | The Bitmap operator is only used in parallel execution plans. However, its operation is not parallelism aware. There is a separate instance of the bitmap for each thread. However, the PROBE function (see Scalar Functions) that checks data in the bitmap can access the bitmaps for all threads, so the data does not have to be partition aligned with Parallelism operators (Distribute Streams or Repartition Streams) to force all rows to the “correct” thread to ensure that the bitmaps work as expected. |
Segment aware | The Bitmap operator is not segment aware. |