b2b_logic: fix duplicate out_sdp serialization in entity_storage #3772
dirkbro wants to merge 1 commit into OpenSIPS:master
When packing B2B entities for cluster replication in
b2bl_entity_pack(), the out_sdp field is currently serialized twice:
    // Around line 133 (packing)
    if (event_type == B2B_EVENT_CREATE) {
        ...
        bin_push_str(storage, &entity->hdrs);
        bin_push_str(storage, &entity->out_sdp);  // First time
        bin_push_str(storage, &entity->dlginfo->callid);
        bin_push_str(storage, &entity->dlginfo->fromtag);
        bin_push_str(storage, &entity->dlginfo->totag);
    }
    bin_push_str(storage, &entity->out_sdp);  // Second time - remove this
However, in receive_entity_create(), out_sdp is deserialized only once, with the expected sequence being:
    // Around line 344 (unpacking)
    bin_pop_str(storage, &hdrs);
    bin_pop_str(storage, &sdp);
    ...
    bin_pop_str(storage, &dlginfo.callid);
    bin_pop_str(storage, &dlginfo.fromtag);
    bin_pop_str(storage, &dlginfo.totag);
Because the pack side writes out_sdp twice but the unpack side reads it only once, the binary stream becomes misaligned and all subsequent fields are read from the wrong offset. In clustered deployments this corrupts the reconstructed entity, including entity->no, which leads to errors such as:
    ERROR:b2b_logic:receive_entity_create: Bad entity bridge no [21349] for tuple [549.0]
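The misalignment can be reproduced in isolation. The following is a minimal sketch and not OpenSIPS code: toy_buf and the toy_* helpers are hypothetical stand-ins for a length-prefixed binary packet, and they do not mirror the real bin_push_str() / bin_pop_str() signatures or wire format. The sketch packs headers, an SDP, optionally the SDP a second time, and an entity number, then unpacks headers, one SDP and the number.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy length-prefixed buffer (hypothetical; NOT the OpenSIPS bin_packet_t). */
typedef struct {
    unsigned char data[256];
    size_t wr;  /* write offset */
    size_t rd;  /* read offset */
} toy_buf;

/* Appends a 16-bit length followed by the string bytes. */
static void toy_push_str(toy_buf *b, const char *s)
{
    uint16_t len = (uint16_t)strlen(s);
    assert(b->wr + sizeof len + len <= sizeof b->data);
    memcpy(b->data + b->wr, &len, sizeof len);
    b->wr += sizeof len;
    memcpy(b->data + b->wr, s, len);
    b->wr += len;
}

/* Appends a raw 32-bit integer. */
static void toy_push_int(toy_buf *b, int32_t v)
{
    assert(b->wr + sizeof v <= sizeof b->data);
    memcpy(b->data + b->wr, &v, sizeof v);
    b->wr += sizeof v;
}

/* Reads the next string into out (NUL-terminated) and returns its length. */
static size_t toy_pop_str(toy_buf *b, char *out, size_t cap)
{
    uint16_t len;
    assert(b->rd + sizeof len <= b->wr);
    memcpy(&len, b->data + b->rd, sizeof len);
    b->rd += sizeof len;
    assert(b->rd + len <= b->wr && len < cap);
    memcpy(out, b->data + b->rd, len);
    out[len] = '\0';
    b->rd += len;
    return len;
}

/* Reads the next 4 bytes as an integer, whatever they actually are. */
static int32_t toy_pop_int(toy_buf *b)
{
    int32_t v;
    assert(b->rd + sizeof v <= b->wr);
    memcpy(&v, b->data + b->rd, sizeof v);
    b->rd += sizeof v;
    return v;
}

/* Packs hdrs, sdp, (optionally sdp again), entity number;
 * unpacks hdrs, sdp, entity number. Returns the number as read back. */
static int32_t toy_roundtrip(int duplicate_sdp, int32_t entity_no)
{
    toy_buf b = {0};
    char tmp[64];

    toy_push_str(&b, "X-Hdr: 1");
    toy_push_str(&b, "v=0");
    if (duplicate_sdp)
        toy_push_str(&b, "v=0");   /* the extra push */
    toy_push_int(&b, entity_no);

    toy_pop_str(&b, tmp, sizeof tmp);  /* hdrs */
    toy_pop_str(&b, tmp, sizeof tmp);  /* sdp, read only once */
    return toy_pop_int(&b);            /* lands on the duplicate if present */
}
```

Calling toy_roundtrip(0, 1) returns 1. With toy_roundtrip(1, 1) the integer is instead assembled from the length prefix and first bytes of the duplicated string, so the caller gets an unrelated value, the same kind of garbage as the entity number in the log above.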

I am afraid this is not entirely OK: you no longer push an SDP after the headers, but in create it is still popped. Thus, a complete fix would also need to pop the SDP after the
Thanks!

I've just pushed b88203c to address this. Please give it a try and let us know if you're still having the initial issue. If not, I will backport everything to the supported versions.
Best regards,

Hi Răzvan, I’ve just tested with
So from my side the original issue is resolved. Please go ahead and backport to the supported branches.
Thanks, and best regards,

Summary
This PR fixes a bug in B2B entity replication (clusterer + b2b_logic) where the out_sdp field is pushed twice into the binary packet during packing, but consumed only once during unpacking. The mismatch corrupts the subsequent fields in the packet, including entity->no, and breaks B2B tuple synchronization across cluster nodes.
After this change:
- pack_entity() and b2bl_entity_unpack() have matching field sequences
- entity->no is correctly restored as 0 or 1
- Bad entity bridge no [...] errors no longer occur in normal operation
- opensips-cli -x mi b2b_list reports a consistent view of tuples across all nodes.
Details
Environment:
- proto=bin) with B2B state replication enabled
- b2b_logic
- b2b_entities
- clusterer
Problem description:
When packing B2B entities for cluster replication in pack_entity(), the out_sdp field is currently serialized twice:
    // Around line 133 - entity_storage.c : pack_entity()
However, in receive_entity_create(), out_sdp is deserialized only once, with the expected sequence being:
Because the pack side writes out_sdp twice but the unpack side reads it only once, the binary stream becomes misaligned and all subsequent fields are read from the wrong offset. In clustered deployments this corrupts the reconstructed entity, including entity->no, which leads to errors such as:
On the “good” node (the one where the tuple was originally created), the B2B state looks correct, for example:
On the receiving nodes (after replication), the same tuple fails to deserialize correctly and receive_entity_create() logs an invalid entity->no (e.g. 21349), with the tuple subsequently missing in b2b_list / b2be_list due to the failed creation.
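The rejection on the receiving side can be pictured as a simple range check. This is a hedged sketch only: toy_check_entity_no() and TOY_MAX_BRIDGE_ENTITIES are hypothetical names, and the assumption that the only valid values are 0 and 1 is taken from the summary above rather than from the b2b_logic sources.

```c
#include <stdio.h>

/* Assumption: a bridge holds two entities, so valid slots are 0 and 1. */
#define TOY_MAX_BRIDGE_ENTITIES 2

/* Returns 0 if `no` is a usable bridge slot, -1 (after logging) otherwise. */
static int toy_check_entity_no(int no, const char *tuple_key)
{
    if (no < 0 || no >= TOY_MAX_BRIDGE_ENTITIES) {
        fprintf(stderr, "Bad entity bridge no [%d] for tuple [%s]\n",
                no, tuple_key);
        return -1;
    }
    return 0;
}
```

A value read from a misaligned offset, such as 21349, fails this kind of check, so the entity is never attached to the tuple on the receiving node.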
This behaviour is reproducible with:
Solution
The fix is to remove the duplicate serialization of out_sdp in pack_entity(), so that the packed fields exactly match the unpacked fields in receive_entity_create().
Testing
- Built b2b_logic with the fix and deployed it to a 3‑node SBC cluster
- No "Bad entity bridge no [...]" errors in logs
- opensips-cli -x mi b2b_list shows the same tuples on all nodes
- opensips-cli -x mi b2be_list shows consistent dialog / entity state across the cluster
Compatibility
- The field sequence between pack_entity() and receive_entity_create() is restored to a consistent state.
Closing issues
Fixes issue: #3707