# pg_roaringbitmap RoaringBitmap extension for PostgreSQL. It's initial based on https://github.com/zeromax007/gpdb-roaringbitmap. # Introduction Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. In some instances, roaring bitmaps can be hundreds of times faster and they often offer significantly better compression. They can even be faster than uncompressed bitmaps. More information https://github.com/RoaringBitmap/CRoaring . # Build ## Requirements - PostgreSQL 10+ - Other Requirements from https://github.com/RoaringBitmap/CRoaring ## Build su - postgres make sudo make install psql -c "create extension roaringbitmap" Note:You can use `make -f Makefile_native` instead of` make` to let the compiler use the SIMD instructions if it's supported by your CPU. In some scenarios, it may double the performance. But if you copy the `pg_roaringbitmap` binary which builded on machine with SIMD support to other machine without SIMD and run, you could get a SIGILL crash. ## Test make installcheck # Build on PostgreSQL 9.x or Greenplum 6.0 Parallel execution is not supported in PostgreSQL 9.5 and earlier. If you want to compile on these early PostgreSQL versions or Greenplum 6.0(based on PostgreSQL 9.4), you need to remove the `PARALLEL` keyword from these SQL files. cd pg_roaringbitmap sed 's/PARALLEL SAFE//g' -i roaringbitmap--*.sql sed -z 's/,[ \r\n]*PARALLEL = SAFE//g' -i roaringbitmap--*.sql Then refer to [Build] above for building, such as the steps to build on Greenplum 6.0: ## Build su - gpadmin make make install psql -c "create extension roaringbitmap" ## Test sudo yum install 'perl(Data::Dumper)' make installcheck Since the expected output is based on PostgreSQL 10+, this test will not pass. Check the difference in the output file. If the execution results are the same, only the execution plan or other content that is not related to pg_roaringbitmap` is different, the test can be considered OK. diff results/roaringbitmap.out expected/roaringbitmap_gpdb6.out # Usage ## about roaringbitmap data type Logically, you could think of roaringbitmap data type as `bit(4294967296)`, and it should be noted that the integers added to bitmaps is considered to be unsigned. Within bitmaps, numbers are ordered according to uint32. We order the numbers like 0, 1, ..., 2147483647, -2147483648, -2147483647,..., -1. But we use bigint to reference the range of these integers, that is [0 4294967296). ## input and ouput Support two kind of input/output syntax 'array' and 'bytea', The default output format is 'bytea'. postgres=# select roaringbitmap('{1,100,10}'); roaringbitmap ------------------------------------------------ \x3a30000001000000000002001000000001000a006400 (1 row) or postgres=# select '\x3a30000001000000000002001000000001000a006400'::roaringbitmap; roaringbitmap ------------------------------------------------ \x3a30000001000000000002001000000001000a006400 (1 row) output format can changed by `roaringbitmap.output_format` postgres=# set roaringbitmap.output_format='bytea'; SET postgres=# select '{1}'::roaringbitmap; roaringbitmap ---------------------------------------- \x3a3000000100000000000000100000000100 (1 row) postgres=# set roaringbitmap.output_format='array'; SET postgres=# select '{1}'::roaringbitmap; roaringbitmap --------------- {1} (1 row) ## Use bitmap as type of column CREATE TABLE t1 (id integer, bitmap roaringbitmap); ## Build bitmap from integers INSERT INTO t1 SELECT 1,rb_build(ARRAY[1,2,3,4,5,6,7,8,9,200]); INSERT INTO t1 SELECT 2,rb_build_agg(e) FROM generate_series(1,100) e; ## Bitmap Calculation (OR, AND, XOR, ANDNOT) SELECT roaringbitmap('{1,2,3}') | roaringbitmap('{3,4,5}'); SELECT roaringbitmap('{1,2,3}') & roaringbitmap('{3,4,5}'); SELECT roaringbitmap('{1,2,3}') # roaringbitmap('{3,4,5}'); SELECT roaringbitmap('{1,2,3}') - roaringbitmap('{3,4,5}'); ## Bitmap Aggregate (OR, AND, XOR, BUILD) SELECT rb_or_agg(bitmap) FROM t1; SELECT rb_and_agg(bitmap) FROM t1; SELECT rb_xor_agg(bitmap) FROM t1; SELECT rb_build_agg(e) FROM generate_series(1,100) e; ## Calculate cardinality SELECT rb_cardinality('{1,2,3}'); ## Convert bitmap to integer array SELECT rb_to_array(bitmap) FROM t1 WHERE id = 1; ## Convert bitmap to SET of integers SELECT unnest(rb_to_array('{1,2,3}'::roaringbitmap)); or SELECT rb_iterate('{1,2,3}'::roaringbitmap); ## Opperator List
Opperator Input Output Desc Example Result
& roaringbitmap,roaringbitmap roaringbitmap bitwise AND roaringbitmap('{1,2,3}') & roaringbitmap('{3,4,5}') {3}
| roaringbitmap,roaringbitmap roaringbitmap bitwise OR roaringbitmap('{1,2,3}') | roaringbitmap('{3,4,5}') {1,2,3,4,5}
| roaringbitmap,integer roaringbitmap add element to roaringbitmap roaringbitmap('{1,2,3}') | 6 {1,2,3,6}
| integer,roaringbitmap roaringbitmap add element to roaringbitmap 6 | roaringbitmap('{1,2,3}') {1,2,3,6}
# roaringbitmap,roaringbitmap roaringbitmap bitwise XOR roaringbitmap('{1,2,3}') # roaringbitmap('{3,4,5}') {1,2,4,5}
<< roaringbitmap,bigint roaringbitmap bitwise shift left roaringbitmap('{1,2,3}') << 2 {0,1}
>> roaringbitmap,bigint roaringbitmap bitwise shift right roaringbitmap('{1,2,3}') >> 3 {4,5,6}
- roaringbitmap,roaringbitmap roaringbitmap difference(bitwise ANDNOT) roaringbitmap('{1,2,3}') - roaringbitmap('{3,4,5}') {1,2}
- roaringbitmap,integer roaringbitmap remove element from roaringbitmap roaringbitmap('{1,2,3}') - 3 {1,2}
@> roaringbitmap,roaringbitmap bool contains roaringbitmap('{1,2,3}') @> roaringbitmap('{3,4,5}') f
@> roaringbitmap,integer bool contains roaringbitmap('{1,2,3,4,5}') @> 3 t
<@ roaringbitmap,roaringbitmap bool is contained by roaringbitmap('{1,2,3}') <@ roaringbitmap('{3,4,5}') f
<@ integer,roaringbitmap bool is contained by 3 <@ roaringbitmap('{3,4,5}') t
&& roaringbitmap,roaringbitmap bool overlap (have elements in common) roaringbitmap('{1,2,3}') && roaringbitmap('{3,4,5}') t
= roaringbitmap,roaringbitmap bool equal roaringbitmap('{1,2,3}') = roaringbitmap('{3,4,5}') f
<> roaringbitmap,roaringbitmap bool not equal roaringbitmap('{1,2,3}') <> roaringbitmap('{3,4,5}') t
## Function List
Function Input Output Desc Example Result
rb_build integer[] roaringbitmap Create roaringbitmap from integer array rb_build('{1,2,3,4,5}') {1,2,3,4,5}
rb_index roaringbitmap,integer bigint Return the 0-based index of element in this roaringbitmap, or -1 if do not exsits rb_index('{1,2,3}',3) 2
rb_cardinality roaringbitmap bigint Return cardinality of the roaringbitmap rb_cardinality('{1,2,3,4,5}') 5
rb_and_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the AND of two roaringbitmaps rb_and_cardinality('{1,2,3}','{3,4,5}') 1
rb_or_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the OR of two roaringbitmaps rb_or_cardinality('{1,2,3}','{3,4,5}') 5
rb_xor_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the XOR of two roaringbitmaps rb_xor_cardinality('{1,2,3}','{3,4,5}') 4
rb_andnot_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the ANDNOT of two roaringbitmaps rb_andnot_cardinality('{1,2,3}','{3,4,5}') 2
rb_is_empty roaringbitmap boolean Check if roaringbitmap is empty. rb_is_empty('{1,2,3,4,5}') t
rb_fill roaringbitmap,range_start bigint,range_end bigint roaringbitmap Fill the specified range (not include the range_end) rb_fill('{1,2,3}',5,7) {1,2,3,5,6}
rb_clear roaringbitmap,range_start bigint,range_end bigint roaringbitmap Clear the specified range (not include the range_end) rb_clear('{1,2,3}',2,3) {1,3}
rb_flip roaringbitmap,range_start bigint,range_end bigint roaringbitmap Negative the specified range (not include the range_end) rb_flip('{1,2,3}',2,10) {1,4,5,6,7,8,9}
rb_range roaringbitmap,range_start bigint,range_end bigint roaringbitmap Return new set with specified range (not include the range_end) rb_range('{1,2,3}',2,3) {2}
rb_range_cardinality roaringbitmap,range_start bigint,range_end bigint bigint Return the cardinality of specified range (not include the range_end) rb_range_cardinality('{1,2,3}',2,3) 1
rb_min roaringbitmap integer Return the smallest offset in roaringbitmap. Return NULL if the bitmap is empty rb_min('{1,2,3}') 1
rb_max roaringbitmap integer Return the greatest offset in roaringbitmap. Return NULL if the bitmap is empty rb_max('{1,2,3}') 3
rb_rank roaringbitmap,integer bigint Return the number of elements that are smaller or equal to the specified offset rb_rank('{1,2,3}',3) 3
rb_jaccard_dist roaringbitmap,roaringbitmap double precision Return the jaccard distance(or the Jaccard similarity coefficient) of two bitmaps rb_jaccard_dist('{1,2,3}','{3,4}') 0.25
rb_select roaringbitmap,bitset_limit bigint,bitset_offset bigint=0,reverse boolean=false,range_start bigint=0,range_end bigint=4294967296 roaringbitmap Return subset [bitset_offset,bitset_offset+bitset_limit) of bitmap between range [range_start,range_end) rb_select('{1,2,3,4,5,6,7,8,9}',5,2) {3,4,5,6,7}
rb_to_array roaringbitmap integer[] Convert roaringbitmap to integer array rb_to_array(roaringbitmap('{1,2,3}')) {1,2,3}
rb_iterate roaringbitmap SET of integer Return set of integer from a roaringbitmap data.
rb_iterate(roaringbitmap('{1,2,3}'))
1
2
3
## Aggregation List
Function Input Output Desc Example Result
rb_build_agg integer roaringbitmap Build a roaringbitmap from a integer set
select rb_build_agg(id)
    from (values (1),(2),(3)) t(id)
{1,2,3}
rb_or_agg roaringbitmap roaringbitmap AND Aggregate calculations from a roaringbitmap set
select rb_or_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
{1,2,3,4}
rb_and_agg roaringbitmap roaringbitmap AND Aggregate calculations from a roaringbitmap set
select rb_and_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
{2,3}
rb_xor_agg roaringbitmap roaringbitmap XOR Aggregate calculations from a roaringbitmap set
select rb_xor_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
{1,4}
rb_or_cardinality_agg roaringbitmap bigint OR Aggregate calculations from a roaringbitmap set, return cardinality.
select rb_or_cardinality_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
4
rb_and_cardinality_agg roaringbitmap bigint AND Aggregate calculations from a roaringbitmap set, return cardinality
select rb_and_cardinality_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
2
rb_xor_cardinality_agg roaringbitmap bigint XOR Aggregate calculations from a roaringbitmap set, return cardinality
select rb_xor_cardinality_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
2
# Cloud Vendor Support `pg_roaringbitmap` is supported by the following cloud vendors - Alibaba Cloud RDS PostgreSQL: https://www.alibabacloud.com/help/doc-detail/142340.htm - Huawei Cloud RDS PostgreSQL: https://support.huaweicloud.com/usermanual-rds/rds_09_0045.html - Tencent Cloud RDS PostgreSQL: https://cloud.tencent.com/document/product/409/67299 To request support for `pg_roaringbitmap` from other cloud vendors, please see the following: - Amazon RDS: send an email to rds-postgres-extensions-request@amazon.com with the extension name and use case ([docs](https://aws.amazon.com/rds/postgresql/faqs/)) - Google Cloud SQL: comment on [this issue](https://issuetracker.google.com/u/1/issues/207403722) - DigitalOcean Managed Databases: comment on [this idea](https://ideas.digitalocean.com/app-framework-services/p/postgres-extension-request-pgroaringbitmap) - Azure Database for PostgreSQL: comment on [this post](https://feedback.azure.com/d365community/idea/e6f5ff90-da4b-ec11-a819-0022484bf651)