Skip to content

Add support for websockets#4579

Open
lucasdrufva wants to merge 1 commit intosecdev:masterfrom
lucasdrufva:master
Open

Add support for websockets#4579
lucasdrufva wants to merge 1 commit intosecdev:masterfrom
lucasdrufva:master

Conversation

@lucasdrufva
Copy link

@lucasdrufva lucasdrufva commented Nov 3, 2024

Adds support for websockets.

HTTP dissector has been updated to support omitting the reason phrase as this is usually done when upgrading to websocket connection.

fixes #4578

@gpotter2
Copy link
Member

gpotter2 commented Nov 3, 2024

Hi ! This looks pretty cool !

@codecov
Copy link

codecov bot commented Nov 6, 2024

Codecov Report

❌ Patch coverage is 98.07692% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.74%. Comparing base (8e08cbf) to head (60a13b7).
⚠️ Report is 187 commits behind head on master.

Files with missing lines Patch % Lines
scapy/contrib/websocket.py 97.97% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4579      +/-   ##
==========================================
- Coverage   81.62%   78.74%   -2.89%     
==========================================
  Files         358      334      -24     
  Lines       85652    80602    -5050     
==========================================
- Hits        69915    63470    -6445     
- Misses      15737    17132    +1395     
Files with missing lines Coverage Δ
scapy/layers/http.py 80.54% <100.00%> (-1.90%) ⬇️
scapy/contrib/websocket.py 97.97% <97.97%> (ø)

... and 284 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lucasdrufva lucasdrufva marked this pull request as ready for review November 8, 2024 17:37
@Nemoumbra
Copy link

Is there anything that's currently stopping the PR from being merged?

@polybassa
Copy link
Contributor

This branch has a conflict with http.py

Would you be able to resolve it?

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new WebSocket dissector as a Scapy contrib layer and updates HTTP parsing to better handle WebSocket upgrade traffic (notably responses that omit the reason phrase), with accompanying pcaps and unit tests to exercise the new functionality.

Changes:

  • Add scapy.contrib.websocket WebSocket layer with TCP stream reassembly support and optional permessage-deflate handling.
  • Update HTTP request/response first-line dissection to tolerate missing components (e.g., omitted reason phrase).
  • Add new test pcaps and a contrib .uts test suite for WebSocket parsing/building.

Reviewed changes

Copilot reviewed 3 out of 5 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
scapy/contrib/websocket.py New WebSocket layer implementation (framing, masking, TCP reassembly, compression hooks).
scapy/layers/http.py Adjust HTTP start-line parsing to allow missing reason phrase (WebSocket upgrade responses).
test/contrib/websocket.uts New unit tests validating WebSocket frame build/dissect and TCPSession behavior using pcaps.
test/pcaps/websocket_compressed_session.pcap Fixture pcap for TCPSession + permessage-deflate coverage.
test/pcaps/websocket_invalid_handshakes.pcap Fixture pcap for invalid/partial upgrade sequences coverage.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

else:
return http_data

if not http_data.Upgrade.lower() == b"websocket":
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the SERVER_OPEN handshake branch, http_data.Upgrade.lower() is called without checking that Upgrade is present. For malformed/partial responses this will raise AttributeError and break TCP stream parsing. Mirror the request-side guard (if not http_data.Upgrade or ...).

Suggested change
if not http_data.Upgrade.lower() == b"websocket":
if not http_data.Upgrade or http_data.Upgrade.lower() != b"websocket":

Copilot uses AI. Check for mistakes.
self.setfieldval('Http_Version', HTTPVersion)
self.setfieldval('Status_Code', Status)
self.setfieldval('Reason_Phrase', Reason)
version_status_reason = re.split(br"\s+", first_line, maxsplit=2) + [None]
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For responses without a reason phrase, Reason_Phrase is set to None. Downstream code (e.g. summaries/pretty-printing) expects bytes-like values for HTTP header fields; using None can lead to odd representations ("None") and makes round-tripping harder. Consider setting an explicit empty bytes value (or leaving the field unset) when the reason phrase is omitted.

Suggested change
version_status_reason = re.split(br"\s+", first_line, maxsplit=2) + [None]
version_status_reason = re.split(br"\s+", first_line, maxsplit=2) + [b""]

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +101
if("permessage-deflate" in pkt.extensions):
try:
payloadData = pkt.decoder[0](payloadData + b"\x00\x00\xff\xff")
except Exception:
logging.debug("Failed to decompress payload", payloadData)
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decompression is attempted whenever "permessage-deflate" in pkt.extensions, regardless of whether the frame is actually compressed (RFC6455 permessage-deflate uses RSV1 to indicate compression). This can silently corrupt payloads if the decoder happens to accept uncompressed data. Consider conditioning decompression on the RSV1 bit (e.g. pkt.rsv), and also handle decoder being unset (currently pkt.decoder[0] will fail).

Suggested change
if("permessage-deflate" in pkt.extensions):
try:
payloadData = pkt.decoder[0](payloadData + b"\x00\x00\xff\xff")
except Exception:
logging.debug("Failed to decompress payload", payloadData)
extensions = getattr(pkt, "extensions", None)
decoder = getattr(pkt, "decoder", None)
# RFC6455 permessage-deflate uses RSV1 to indicate compression.
if (extensions
and "permessage-deflate" in extensions
and getattr(pkt, "rsv", 0) & 0x4
and decoder
and callable(decoder[0])):
try:
payloadData = decoder[0](payloadData + b"\x00\x00\xff\xff")
except zlib.error:
logging.debug("Failed to decompress payload: %r", payloadData)

Copilot uses AI. Check for mistakes.
# TODO: handle or Logg wrong accept key
pass

if b"Sec-WebSocket-Extensions" in http_data.Unknown_Headers:
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if b"Sec-WebSocket-Extensions" in http_data.Unknown_Headers: assumes Unknown_Headers is a dict. When the response is malformed and Unknown_Headers is None (which is possible per HTTP dissector), this will raise TypeError. Guard with if http_data.Unknown_Headers and ... before doing membership checks.

Suggested change
if b"Sec-WebSocket-Extensions" in http_data.Unknown_Headers:
if http_data.Unknown_Headers and b"Sec-WebSocket-Extensions" in http_data.Unknown_Headers:

Copilot uses AI. Check for mistakes.
Comment on lines +255 to +260
if "original" not in metadata:
return

if "permessage-deflate" in session["extensions"]:
is_server = True if metadata["original"][TCP].sport == session["server-port"] else False
ws = WebSocket(bytes(data), extensions=session["extensions"], decoder = session["server-decoder"] if is_server else session["client-decoder"])
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the handshake, the code returns early when "original" not in metadata. In TCPSession, subsequent parsing of leftover bytes (padding) calls tcp_reassemble with a cleared metadata dict, so this guard can prevent dissecting multiple WebSocket frames that arrive in the same TCP segment. Consider storing the needed direction info in session (or metadata) during the first call and allowing parsing even when metadata["original"] is missing.

Suggested change
if "original" not in metadata:
return
if "permessage-deflate" in session["extensions"]:
is_server = True if metadata["original"][TCP].sport == session["server-port"] else False
ws = WebSocket(bytes(data), extensions=session["extensions"], decoder = session["server-decoder"] if is_server else session["client-decoder"])
if "permessage-deflate" in session["extensions"]:
# Determine direction (server vs client). When metadata["original"] is
# not available (e.g., leftover bytes in the same TCP segment),
# fall back to the last known direction stored in the session.
if "original" in metadata:
is_server = metadata["original"][TCP].sport == session["server-port"]
session["last-direction-is-server"] = is_server
else:
is_server = session.get("last-direction-is-server", False)
ws = WebSocket(
bytes(data),
extensions=session["extensions"],
decoder=session["server-decoder"] if is_server else session["client-decoder"],
)

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +66
self.size = 7
return super().addfield(pkt, s, val)
elif val <= 0xFFFF:
self.size = 7+16
s, _, masked = s
return s + struct.pack("!BH", 126 | masked, val)
elif val <= 0xFFFFFFFFFFFFFFFF:
self.size = 7+64
s, _, masked = s
return s + struct.pack("!BQ", 127 | masked, val)
else:
raise Scapy_Exception("%s: Payload length too large" %
self.__class__.__name__)
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PayloadLenField.addfield() manually builds the second byte for extended payload lengths by unpacking the in-progress bitfield tuple (s, _, masked = s) and OR-ing masked into 126/127. This drops the partial bitfield state and also sets the mask bit in the wrong position (it should be 0x80, not 0x01), so masked frames with payload length > 125 will be serialized incorrectly. Consider relying on Scapy's bitfield packing for the mask/payloadLen byte and adding separate conditional fields for the 16/64-bit extended length.

Suggested change
self.size = 7
return super().addfield(pkt, s, val)
elif val <= 0xFFFF:
self.size = 7+16
s, _, masked = s
return s + struct.pack("!BH", 126 | masked, val)
elif val <= 0xFFFFFFFFFFFFFFFF:
self.size = 7+64
s, _, masked = s
return s + struct.pack("!BQ", 127 | masked, val)
else:
raise Scapy_Exception("%s: Payload length too large" %
self.__class__.__name__)
# Simple 7-bit length; let BitFieldLenField handle packing.
self.size = 7
return super().addfield(pkt, s, val)
elif val <= 0xFFFF:
# Extended 16-bit length. First pack the MASK + 7-bit length
# indicator (126) using the standard bitfield machinery, then
# append the 16-bit length to the raw bytes.
self.size = 7
s = super().addfield(pkt, s, 126)
raw, bitstate = s
raw += struct.pack("!H", val)
return raw, bitstate
elif val <= 0xFFFFFFFFFFFFFFFF:
# Extended 64-bit length. Same pattern as above, with indicator 127.
self.size = 7
s = super().addfield(pkt, s, 127)
raw, bitstate = s
raw += struct.pack("!Q", val)
return raw, bitstate
else:
raise Scapy_Exception("%s: Payload length too large" %
self.__class__.__name__)

Copilot uses AI. Check for mistakes.
Comment on lines +90 to +95
if pkt.mask:
key = struct.pack("I", pkt.maskingKey)[::-1]
data_int = int.from_bytes(payloadData, 'big')
mask_repeated = key * (len(payloadData) // 4) + key[: len(payloadData) % 4]
mask_int = int.from_bytes(mask_repeated, 'big')
payloadData = (data_int ^ mask_int).to_bytes(len(payloadData), 'big')
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PayloadField uses struct.pack("I", pkt.maskingKey)[::-1] to derive the masking key bytes. Using native-endian packing plus manual reversing is platform-dependent and hard to reason about. Prefer deterministic network-order packing (e.g. !I) or directly constructing the 4 key bytes to match the on-wire order used by XNBytesField.

Copilot uses AI. Check for mistakes.
try:
payloadData = pkt.decoder[0](payloadData + b"\x00\x00\xff\xff")
except Exception:
logging.debug("Failed to decompress payload", payloadData)
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception handler calls logging.debug("Failed to decompress payload", payloadData). Passing payloadData as a second argument without a % placeholder will cause a TypeError in Python's logging (it tries to apply % formatting). Use a format placeholder (e.g. %r) or log with exc_info=True so decompression failures don't crash dissection.

Suggested change
logging.debug("Failed to decompress payload", payloadData)
logging.debug("Failed to decompress payload: %r", payloadData)

Copilot uses AI. Check for mistakes.
PayloadField("wsPayload", lengthFrom="payloadLen")
]

def __init__(self, pkt=None, extensions=[], decoder=None, *args, **fields):
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WebSocket.__init__ uses a mutable default argument (extensions=[]). This can leak state between instances if the list is mutated. Use None as the default and create a new list/dict per instance.

Suggested change
def __init__(self, pkt=None, extensions=[], decoder=None, *args, **fields):
def __init__(self, pkt=None, extensions=None, decoder=None, *args, **fields):
if extensions is None:
extensions = []

Copilot uses AI. Check for mistakes.
super().__init__(_pkt=pkt, *args, **fields)

def extract_padding(self, s):
return '', s
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract_padding() returns '' (a str) rather than b'' (bytes). Packet.extract_padding is expected to return bytes, and returning a string can break dissection and TCP reassembly logic. Return b"" for the payload portion.

Suggested change
return '', s
return b'', s

Copilot uses AI. Check for mistakes.
@gpotter2 gpotter2 self-assigned this Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Websocket support

5 participants