Conversation
There was a problem hiding this comment.
Hi, @zhangyue1818 welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!
10a0611 to
ae89fd0
Compare
ae89fd0 to
29bcd9b
Compare
29bcd9b to
22c2bf4
Compare
22c2bf4 to
8415716
Compare
8415716 to
a97400c
Compare
This patch aims to fix the memory leak issue #10314 . Previously, the IC history table got pruned only after begining a new distributed transaction (dtx). As a result, when a transaction contains many commands, or when a session never begins any distributed transaction, the history will never get pruned. This can cause memory leak. This patch fixes this issue by reverting this part of code to the last version before dependency on distributed transaction was introduced. After that, we will only a constant-length IC history. This ensures that - The history table takes only constant amount of memory, and - The history is long enough to handle mismatched packets.
This modernizes and refactors the UDP IC socket setup code somewhat, drawing inspiration from pgstat_init(), where similar UDP socket setup is performed. Significant changes: * Prefer pg_getaddrinfo_all() to getaddrinfo(). * Consistently use guarded logging for address retries. * Simplify interface for sizing UDP IC socket send/receive buffers.
…ion (#16458) Previous PR #13355 fixed issue #10314. But it introduced a new bug: IC-UDP may hang forever in some scenarios (e.g. lots of IC instances in single one UDF), please see the discussion in #13411. Enhancing it in this PR: * for the non-read-only transaction, keep the previous logic (before PR-13355) to prevent the new bug. * for the read-only transaction, introduce gp_interconnect_cursor_ic_table_size to config the size of Cursor History Table as a workaround.
`SendDummyPacket` eventually calls `getaddrinfo` (which is a reentrant), however, `getaddrinfo` is not an async-signal-safe function. `getaddrinfo` internally calls `malloc`, which is strongly advised to not do within a signal handler as it may cause deadlocks. Cache the accepted socket information for the listener, so that it can be reused in `SendDummyPacket()`. The purpose of `SendDummyPacket` is to exit more quickly; it circumvents the polling that happens, which eventually times out after 250ms. Without `SendDummyPacket()`, there will be multiple test failures since some tests expects the backend connection to terminate almost immediately. To view all the async-signal-safe functions, please view the signal-safety(7) — Linux manual page. Reviewed-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com> Reviewed-by: Andrew Repp <reppa@vmware.com>
Previously on commit 70306db18e2, we removed pg_getaddrinfo_all for signal handlers. However, in doing so, the capability of supporting both AF_INET6 and AF_INET was lost; this responsibility must now be handled by us. The commit mentioned above fixed the issue for AF_INET (IPv4), but not for AF_INET6 (IPv6). This commit addresses the situation for both AF_INET and AF_INET6. Reviewed-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
`sendControlMessage()` method has no retry attempts, this refactor abstracted a `sendto` system call wrapper with retry enabled. Co-authored-by: zwenlin <zwenlin@vmware.com>
…nd packet" (#17164) * Add hint message for MTU settings when IC reports ERROR "Failed to send packet".
contrib/interconnect/udp/ic_udpifc.c
Outdated
| MemoryContext old; | ||
| CursorICHistoryEntry *p; | ||
| uint32 index = icId % CURSOR_IC_TABLE_SIZE; | ||
| uint32 index = icId % t->size; |
Add Assert(t->size > 0) before modulo operations on t->size in CursorICHistoryTable functions to guard against potential divide-by-zero: addCursorIcEntry, updateCursorIcEntry, getCursorIcEntry, and pruneCursorIcEntry.
The hash table index variables in updateCursorIcEntry, getCursorIcEntry, pruneCursorIcEntry and purgeCursorIcEntry were declared as uint8, which overflows when t->size > 255. Since Gp_interconnect_cursor_ic_table_size can be set up to 102400, the modulo result may exceed 255, causing lookups, updates and prunes to operate on wrong hash buckets. Change index type from uint8 to uint32 to match t->size, consistent with addCursorIcEntry which already uses uint32.
|
|
||
| static void ConvertToIPv4MappedAddr(struct sockaddr_storage *sockaddr, socklen_t *o_len); | ||
| #if defined(__darwin__) | ||
| #define s6_addr32 __u6_addr.__u6_addr32 |
There was a problem hiding this comment.
Are you tried to build on macos ?
There was a problem hiding this comment.
The __darwin__ macro is already widely used throughout the codebase (defined in src/include/port/darwin.h) for macOS-specific compatibility — e.g. in pqcomm.c, ps_status.c, elog.c, sysv_shmem.c, etc. This is not about building on macOS specifically, but about ensuring cross-platform compatibility.
Why s6_addr32 needs a macro:
macOS's <netinet6/in6.h> does not directly expose s6_addr32 as a member of struct in6_addr. On Linux it is available directly, but on macOS it must be accessed via __u6_addr.__u6_addr32. This macro bridges that difference so the code in ConvertToIPv4MappedAddr (e.g. in6_new->sin6_addr.s6_addr32[3] = in->sin_addr.s_addr) compiles on both platforms.
Both the macro and the function are guarded by #if defined(__darwin__) so they have no effect on Linux builds.
Fixes #ISSUE_Number
What does this PR do?
Type of Change
Breaking Changes
Test Plan
make installcheckmake -C src/test installcheck-cbdb-parallelImpact
Performance:
User-facing changes:
Dependencies:
Checklist
Additional Context
CI Skip Instructions