PKG-895: Allow citus and timescaledb to compile#600
Conversation
The original part of the version is kept as is, and Percona specific information is added after. For example, psql displays the following after this commit: SELECT version(); PostgreSQL 17.6 (PostgreSQL) - Percona Server for PostgreSQL 17.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit Where the addition is the extra version at the end: " - Percona Server for PostgreSQL 17.4.1" with the ".1" being the Percona version number. Similarly, commands show the same information: bin/pg_ctl --version pg_ctl (PostgreSQL) 17.6 - Percona Server for PostgreSQL 17.6.1
There are various reasons why one would want to create their own implementation of a storage manager, among which are block-level compression, encryption and offloading to cold storage. This patch is a first patch that allows extensions to register their own SMgr. Note, however, that this SMgr is not yet used - only the first SMgr to register is used, and this is currently the md.c smgr. Future commits will include facilities to select an SMgr for each tablespace.
With this change, mdcreate receives the old relfilelocator along with the new for operations that create a new file for an existing relation. This is required for tde_heap in pg_tde.
and allow extensions to override it For now, it extends on `pread` and `pwrite` from/into segment files. This is the minimum we need for full XLog encryption with pg_de.
This commit modifies the build scripts so that `llvmjit.so` links LLVM statically instead of dynamically. No other binary affected, the change shouldn't be user visible at all. The memory footprint of postgres is also unaffected: only the `llvmjit.so` dynamic library is affected. If postgres doesn't use JIT, the LLVM library isn't loaded, as before. If it uses JIT, it is loaded as part of `llvmjit.so`, sa before. The only change is the binary size of `llvmjit.so`.
Instead of checking for the oid of the AM we check for if the access method of the relation uses the routine struct as the heap does. This allows the tools to be support all access methods which are simple copies of heap. I am uncertain if this is something which the project would be interested in but it decreases the size of our diff against upstream.
Need to modify the existing license to include the portions of code unique to Percona
The pg_tde extension's custom storage manager needs to add its own AIO callback to decrypt pages before the buffer manager's callback is called but since the set of AIO callbacks is fixed we need to expose this API so that extensions can register their own types of callbacks. This new function for registering a callback is supposed to be called when intializing shared memory. We try to keep the number of callbacks low, setting it to 15, so that we have the 4 built-in callbacks and 11 slots that extensions can use.
This commit reintroduces the old method signatures of some methods we changed to support pg_tde. The changed method signatures still exists with a "2" suffix, and all internal code calls the "2" methods. The reintroduced original methods also check a global flag which can disable them - if this happens, they report a FATAL error instead. This variable is activated by pg_tde, which means that if something tries to use these while pg_tde is loaded, and with that, possibly ignores its file tracking mechanism, it reports an error. In practice this means that timescaledb, or the columnar storage in citus won't work if pg_tde is loaded in the shared_preload_libraries, but they can be used with our distribution without pg_tde enabled. Also verified that both citus and timescaledb do compile with these changes, and their basic test suite can be run againts our fork.
jeltz
left a comment
There was a problem hiding this comment.
Apaart from some naming and style feedback I have one big question:
It feels a bit scary that we error out (now with FATAL but I think it could just be ERROR, not that that would help that much) not when Timescale or Citus are loaded but instead later when something uses them. But I can't really think of a better solution. We could of course have pg_tde warn if we see them loaded but we do not want to prevent them for using their own patched Citus so we can jsut warn which people probably will miss.
Maybe this is the least bad, hmm.
| *minmulti = GetOldestMultiXactId(); | ||
|
|
||
| srel = RelationCreateStorage(oldlocator, *newrlocator, persistence, true); | ||
| srel = RelationCreateStorage2(oldlocator, *newrlocator, persistence, true); |
There was a problem hiding this comment.
I wonder if we should not name it RelationCreateStoragePSP or RelationCreateStoragePercona instead to make it totally clear it is our function.
| bool is_internal, | ||
| Oid *constraintId) | ||
| { | ||
| if(!allow_upstream_smgr_api) { |
There was a problem hiding this comment.
This does not follow the PostgreSQL code style.
| Oid *constraintId) | ||
| { | ||
| if(!allow_upstream_smgr_api) { | ||
| elog(FATAL, "An extension is trying to use the traditional index_create method, while another loaded extension (pg_tde) requires the new API."); |
There was a problem hiding this comment.
Should this be FATAL or ERROR? I am leaning towards ERROR. If the error happens in a critical section won't it be fatal anyway?
| char relpersistence, | ||
| bool register_delete) | ||
| { | ||
| if(!allow_upstream_smgr_api) { |
There was a problem hiding this comment.
This also does not follow PG's C style.
|
|
||
| SMgrId storage_manager_id; | ||
|
|
||
| bool allow_upstream_smgr_api = true; |
|
|
||
| void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo) | ||
| { | ||
| if(!allow_upstream_smgr_api) { |
There was a problem hiding this comment.
This too does not follow code style. :)
Let's talk about this patch again!
Same as before, only for the PSP18 branch for now, 17 needs the same changes of course.
The pg_tde repo will need a matching
change if we go with this patch as-is.
This commit reintroduces the old method signatures of some methods we changed to support pg_tde. The changed method signatures still exists with a "2" suffix, and all internal code calls the "2" methods.
The reintroduced original methods also check a global flag which can disable them - if this happens, they report a FATAL error instead. This variable is activated by pg_tde, which means that if something tries to use these while pg_tde is loaded, and with that, possibly ignores its file tracking mechanism, it reports an error.
In practice this means that timescaledb, or the columnar storage in citus won't work if pg_tde is loaded in the shared_preload_libraries, but they can be used with our distribution without pg_tde enabled.
Also verified that both citus and timescaledb do compile with these changes, and their basic test suite can be run againts our fork.