Improvements to Google Parser #375
Open
AndriusV4 wants to merge 5 commits into networktocode:develop from AndriusV4:feature/improve-google-parser
+830
−17
5121470 Adding subject parser for Google, improving status and no-end-date pa…
3bb17fa Added changelog fragment
38d9268 Fixing changelog fragment PR num
5ff26ce Removing incorrectly named changelog fragment
bebb52e Fixing changelog fragment name
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| Added Google subject parser with support for multiple status types and notifications without end times |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -4,7 +4,7 @@ | |||||
| import re | ||||||
| from datetime import datetime | ||||||
|
|
||||||
| from circuit_maintenance_parser.parser import CircuitImpact, Html, Impact, Status | ||||||
| from circuit_maintenance_parser.parser import CircuitImpact, EmailSubjectParser, Html, Impact, Status | ||||||
|
|
||||||
| # pylint: disable=too-many-nested-blocks, too-many-branches | ||||||
|
|
||||||
|
|
@@ -18,7 +18,7 @@ def parse_html(self, soup): | |||||
| """Execute parsing.""" | ||||||
| data = {} | ||||||
| data["circuits"] = [] | ||||||
| data["status"] = Status.CONFIRMED | ||||||
| end_time_explicit = False | ||||||
|
|
||||||
| for span in soup.find_all("span"): | ||||||
| if span.string is None: | ||||||
|
|
@@ -29,6 +29,7 @@ def parse_html(self, soup): | |||||
| elif span.string.strip() == "End Time:": | ||||||
| dt_str = span.next_sibling.string.strip() | ||||||
| data["end"] = self.dt2ts(datetime.strptime(dt_str, "%Y-%m-%d %H:%M:%S %z UTC")) | ||||||
| end_time_explicit = True | ||||||
| elif span.string.strip() == "Peer ASN:": | ||||||
| data["account"] = span.parent.next_sibling.string.strip() | ||||||
| elif span.string.strip() == "Google Neighbor Address(es):": | ||||||
|
|
@@ -37,9 +38,42 @@ def parse_html(self, soup): | |||||
| cid = googleaddr + "-" + span.parent.next_sibling.string.strip() | ||||||
| data["circuits"].append(CircuitImpact(circuit_id=cid, impact=Impact.OUTAGE)) | ||||||
|
|
||||||
| summary = list(soup.find("div").find("div").strings)[-1].strip() | ||||||
| match = re.search(r" - Reference (.*)$", summary) | ||||||
| data["summary"] = summary | ||||||
| data["maintenance_id"] = match[1] | ||||||
| # Google sometimes sends notifications without an End Time specified | ||||||
|
|
||||||
| if not end_time_explicit and data["start"]: | ||||||
| # Since start and end times cannot be equal, manufacturing end date by adding 1hr to start date | ||||||
| end_time_delta = 3600 | ||||||
| data["end"] = data["start"] + end_time_delta | ||||||
|
|
||||||
| return [data] | ||||||
|
|
||||||
|
|
||||||
| class SubjectParserGoogle1(EmailSubjectParser): | ||||||
| """Subject Parser for Google notifications.""" | ||||||
|
|
||||||
| def parse_subject(self, subject): | ||||||
| """Parse the subject line.""" | ||||||
| data = {} | ||||||
|
|
||||||
| # Example subject format - "[Scheduled] Google Planned Network Maintenance Notification - Reference PCR/123456" | ||||||
| # Group 1: Status (e.g., Scheduled, Completed, Canceled) | ||||||
| # Group 2: Maintenance ID (e.g., PCR/123456) | ||||||
| match = re.search(r"(\[\S+\]).*Reference\s+(\S+)", subject, re.IGNORECASE | re.DOTALL) | ||||||
| match_2 = re.search(r"\[\S+\]\s+(.*)", subject, re.IGNORECASE | re.DOTALL) | ||||||
|
|
||||||
| if match: | ||||||
| status_str = match.group(1).upper() | ||||||
| data["maintenance_id"] = match.group(2).strip() | ||||||
| if "COMPLETED" in status_str: | ||||||
| data["status"] = Status.COMPLETED | ||||||
| # To handle both Cancelled and Canceled spelling options just in case | ||||||
| elif "CANCEL" in status_str: | ||||||
| data["status"] = Status.CANCELLED | ||||||
| elif "SCHEDULED" in status_str: | ||||||
| data["status"] = Status.CONFIRMED | ||||||
| # If unable to match, we fall back to the default confirmed status | ||||||
| else: | ||||||
| data["status"] = Status.CONFIRMED | ||||||
| if match_2: | ||||||
| data["summary"] = match_2.group(1) | ||||||
|
|
||||||
| return [data] | ||||||
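
The subject-parsing logic in the diff above can be sketched in isolation as follows. This is a minimal illustration, not the library's implementation: the regular expressions are copied from the diff, but `STATUS_BY_KEYWORD` and the standalone `parse_subject` function are hypothetical, and plain strings stand in for the `Status` enum so that the block runs without `circuit_maintenance_parser` installed.

```python
import re

# Hypothetical stand-ins for the library's Status enum values.
STATUS_BY_KEYWORD = (
    ("COMPLETED", "COMPLETED"),
    ("CANCEL", "CANCELLED"),  # covers both "Canceled" and "Cancelled"
    ("SCHEDULED", "CONFIRMED"),
)


def parse_subject(subject: str) -> dict:
    """Extract status, maintenance ID, and summary from a subject line."""
    data = {}

    # Group 1: bracketed status tag, Group 2: maintenance ID after "Reference".
    match = re.search(r"(\[\S+\]).*Reference\s+(\S+)", subject, re.IGNORECASE | re.DOTALL)
    if match:
        tag = match.group(1).upper()
        data["maintenance_id"] = match.group(2).strip()
        # Unknown tags fall back to CONFIRMED, mirroring the diff.
        data["status"] = next(
            (status for keyword, status in STATUS_BY_KEYWORD if keyword in tag),
            "CONFIRMED",
        )

    # Everything after the bracketed tag becomes the summary.
    summary = re.search(r"\[\S+\]\s+(.*)", subject, re.DOTALL)
    if summary:
        data["summary"] = summary.group(1)

    return data


result = parse_subject(
    "[Canceled] Google Planned Network Maintenance Notification - Reference PCR/123456"
)
print(result)
```

Matching on the substring `CANCEL` rather than the full word is what lets a single branch accept both spellings of the status tag.
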
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| [Scheduled] Google Planned Network Maintenance Notification - Reference PCR/123456 |
7 changes: 7 additions & 0 deletions
tests/unit/data/google/google2_scheduled_subject_parser_result.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| [ | ||
| { | ||
| "maintenance_id": "PCR/123456", | ||
| "status": "CONFIRMED", | ||
| "summary": "Google Planned Network Maintenance Notification - Reference PCR/123456" | ||
| } | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| [Completed] Google Planned Network Maintenance Notification - Reference PCR/123456 |
7 changes: 7 additions & 0 deletions
tests/unit/data/google/google3_completed_subject_parser_result.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| [ | ||
| { | ||
| "maintenance_id": "PCR/123456", | ||
| "status": "COMPLETED", | ||
| "summary": "Google Planned Network Maintenance Notification - Reference PCR/123456" | ||
| } | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| [Canceled] Google Planned Network Maintenance Notification - Reference PCR/123456 |
7 changes: 7 additions & 0 deletions
tests/unit/data/google/google4_canceled_subject_parser_result.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| [ | ||
| { | ||
| "maintenance_id": "PCR/123456", | ||
| "status": "CANCELLED", | ||
| "summary": "Google Planned Network Maintenance Notification - Reference PCR/123456" | ||
| } | ||
| ] |
File renamed without changes.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,152 @@ | ||
| Date: Tue, 15 Jul 2025 13:09:19 +0000 | ||
| Subject: [Completed] Google Planned Network Maintenance Notification - | ||
| Reference PCR/123456 | ||
| From: Google NOC <noc-noreply@google.com> | ||
| To: sample@example.net | ||
| Content-Type: multipart/alternative; boundary="000000000000ecbeea0639f77b7f" | ||
|
|
||
| --000000000000ecbeea0639f77b7f | ||
| Content-Type: text/plain; charset="UTF-8"; format=flowed; delsp=yes | ||
|
|
||
| Google Network Maintenance Notification - Reference PCR/123456 - Completed | ||
|
|
||
|
|
||
| This message was sent to: [sample@example.net] | ||
| These addresses were selected from the Google ISP Portal. You can update | ||
| your preferences by navigating to "Configuration > Contacts" and adjusting | ||
| Email Subscriptions for each user. You can find additional details about | ||
| Contacts here. | ||
|
|
||
|
|
||
| Details: | ||
| Start Time: 2025-07-15 03:55:00 +0000 UTC | ||
| End Time: 2025-07-15 14:05:00 +0000 UTC | ||
| Duration: 10h10m0s | ||
|
|
||
|
|
||
| Google ASN: 15169 Peer ASN: 65001 | ||
|
|
||
| Google Neighbor Address(es): 2001:db8::1 Peer Neighbor | ||
| Address(es): 2001:db8:: | ||
|
|
||
| Google Neighbor Address(es): 192.0.2.1 Peer Neighbor Address(es): | ||
| 192.0.2.0 | ||
|
|
||
|
|
||
| Please note that the maintenance is completed, Google systems will | ||
| automatically re-route traffic back. | ||
|
|
||
| Questions about this event can be sent to noc@google.com. | ||
|
|
||
| Thank you for peering with Google! | ||
|
|
||
|
|
||
|
|
||
| PRIVILEGE AND CONFIDENTIALITY NOTICE: The information in this e-mail | ||
| communication and any attached documents may be privileged, confidential or | ||
| otherwise protected from disclosure and is intended only for the use of the | ||
| designated recipient(s). If the reader is neither the intended recipient | ||
| nor an employee or agent thereof who is responsible for delivering it to | ||
| the intended recipient, you are hereby notified that any review, | ||
| dissemination, distribution, copying or other use of this communication is | ||
| strictly prohibited. If you have received this communication in error, | ||
| please immediately notify us by return e-mail and promptly delete the | ||
| original electronic e-mail communication and any attached documentation. | ||
| Receipt by anyone other than the intended recipient is not a waiver of any | ||
| attorney-client or work-product privilege. | ||
|
|
||
|
|
||
| --000000000000ecbeea0639f77b7f | ||
| Content-Type: text/html; charset="UTF-8" | ||
| Content-Transfer-Encoding: quoted-printable | ||
|
|
||
|
|
||
| <div style=3D"font-family: 'Open Sans', sans-serif;"> | ||
| <div style=3D"font-weight: bold; font-size: larger;"><span style=3D"color= | ||
| : rgb(51, 105, 232);">G</span><span style=3D"color: rgb(213, 15, 37);">o</s= | ||
| pan><span style=3D"color: rgb(238, 178, 17);">o</span><span style=3D"color:= | ||
| rgb(51, 105, 232);">g</span><span style=3D"color: rgb(0, 153, 37);">l</spa= | ||
| n><span style=3D"color: rgb(213, 15, 37);">e</span> Network Maintenance No= | ||
| tification - Reference PCR/123456 - <span style=3D"color: rgb(0, 153, 37);"= | ||
| >Completed</span> </div> <hr /> | ||
| =20 | ||
| <div style=3D"font-size: smaller;"> | ||
| This message was sent to: [sample@example.net]<b= | ||
| r /> | ||
| These addresses were selected from the <a href=3D"http://peering.google.c= | ||
| om/">Google ISP Portal</a>. You can update your preferences by navigating t= | ||
| o "Configuration > Contacts" and adjusting Email Subscriptions for each use= | ||
| r. You can find additional details about Contacts <a href=3D"https://suppor= | ||
| t.google.com/interconnect/answer/7658597?hl=3Den&ref_topic=3D7650153">here<= | ||
| /a>. | ||
| </div> | ||
|
|
||
| <br /> | ||
| <br /> | ||
| =20 | ||
| <span style=3D"font-weight: bold;">Details:</span><br /> | ||
| <span style=3D"font-weight: bold;">Start Time:</span> 2025-07-15 03:55:00= | ||
| +0000 UTC<br /> | ||
| <span style=3D"font-weight: bold;">End Time:</span> 2025-07-15 14:05:00 &= | ||
| #43;0000 UTC<br /> | ||
| <span style=3D"font-weight: bold;">Duration:</span> 10h10m0s<br /> | ||
| <table> | ||
| <tr> | ||
| <td><span style=3D"font-weight: bold;">Google ASN:</span></td><td>151= | ||
| 69</td> | ||
| <td><span style=3D"font-weight: bold;">Peer ASN:</span></td><td>65001= | ||
| </td> | ||
| </tr> | ||
| =20 | ||
| <tr style=3D"vertical-align: top;"> | ||
| <td><span style=3D"font-weight: bold;">Google Neighbor Address(es):</= | ||
| span></td><td>2001:db8::1</td> | ||
| <td><span style=3D"font-weight: bold;">Peer Neighbor Address(es):</sp= | ||
| an></td><td>2001:db8::</td> | ||
| </tr> | ||
| =20 | ||
| <tr style=3D"vertical-align: top;"> | ||
| <td><span style=3D"font-weight: bold;">Google Neighbor Address(es):</= | ||
| span></td><td>192.0.2.1</td> | ||
| <td><span style=3D"font-weight: bold;">Peer Neighbor Address(es):</sp= | ||
| an></td><td>192.0.2.0</td> | ||
| </tr> | ||
| =20 | ||
| </table> | ||
|
|
||
| <p> | ||
| Please note that the maintenance is completed, Google systems will automa= | ||
| tically re-route traffic back. | ||
| </p> | ||
| <p> | ||
| Questions about this event can be sent to noc@google.com. | ||
| </p> | ||
| <p> | ||
| Thank you for peering with <span style=3D"color: rgb(51, 105, 232);">G</s= | ||
| pan><span style=3D"color: rgb(213, 15, 37);">o</span><span style=3D"color: = | ||
| rgb(238, 178, 17);">o</span><span style=3D"color: rgb(51, 105, 232);">g</sp= | ||
| an><span style=3D"color: rgb(0, 153, 37);">l</span><span style=3D"color: rg= | ||
| b(213, 15, 37);">e</span>!<br /> | ||
| </p> | ||
| <hr /> | ||
| <br /> | ||
| <div style=3D"font-size: smaller; color: silver;"> | ||
| PRIVILEGE AND CONFIDENTIALITY NOTICE: The information in this | ||
| e-mail communication and any attached documents may be privileged, | ||
| confidential or otherwise protected from disclosure and is intended onl= | ||
| y | ||
| for the use of the designated recipient(s). If the reader is neither th= | ||
| e | ||
| intended recipient nor an employee or agent thereof who is responsible | ||
| for delivering it to the intended recipient, you are hereby notified | ||
| that any review, dissemination, distribution, copying or other use of | ||
| this communication is strictly prohibited. If you have received this | ||
| communication in error, please immediately notify us by return e-mail | ||
| and promptly delete the original electronic e-mail communication and | ||
| any attached documentation. Receipt by anyone other than the intended | ||
| recipient is not a waiver of any attorney-client or work-product | ||
| privilege. | ||
| </div> | ||
| </div> | ||
|
|
||
| --000000000000ecbeea0639f77b7f-- |
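
The HTML part of this fixture is quoted-printable encoded, so `=3D` stands for `=`, a trailing `=` marks a soft line break, and the `+` of the UTC offset arrives as the entity `&#43;`. The sketch below shows how the End Time value could be recovered with the standard library alone. It is an illustration only: the `raw` sample is copied from the fixture, while the regular expression is a hypothetical shortcut and not how the parser in this PR walks the `span` elements.

```python
import html
import quopri
import re

# Two physical lines from the fixture, joined by a quoted-printable soft break.
raw = (
    b'<span style=3D"font-weight: bold;">End Time:</span> 2025-07-15 14:05:00 &=\n'
    b"#43;0000 UTC<br />"
)

# Undo the transfer encoding first, then resolve HTML entities such as &#43;.
decoded = quopri.decodestring(raw).decode("utf-8")
text = html.unescape(decoded)

# Hypothetical shortcut: take the text between the label span and the next tag.
end_time = re.search(r"End Time:</span>\s*([^<]+)<", text).group(1).strip()
print(end_time)
```
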
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| [ | ||
| { | ||
| "account": "65001", | ||
| "circuits": [ | ||
| { | ||
| "circuit_id": "2001:db8::1-2001:db8::", | ||
| "impact": "OUTAGE" | ||
| }, | ||
| { | ||
| "circuit_id": "192.0.2.1-192.0.2.0", | ||
| "impact": "OUTAGE" | ||
| } | ||
| ], | ||
| "end": 1752588300, | ||
| "maintenance_id": "PCR/123456", | ||
| "organizer": "noc-noreply@google.com", | ||
| "provider": "google", | ||
| "sequence": 1, | ||
| "stamp": 1752584959, | ||
| "start": 1752551700, | ||
| "status": "COMPLETED", | ||
| "summary": "Google Planned Network Maintenance Notification - Reference PCR/123456", | ||
| "uid": "0" | ||
| } | ||
| ] |
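
The `start` and `end` values in this expected result follow from the Start Time and End Time strings in the email fixture. The sketch below reproduces that conversion and the one-hour fallback that the diff applies when a notification has no End Time. The format string and the 3600-second delta come from the diff; `to_timestamp` and `maintenance_window` are hypothetical helpers, and `to_timestamp` only approximates what the library's `dt2ts` does.

```python
from datetime import datetime

# Format string taken from the diff; the literal " UTC" suffix follows the offset.
GOOGLE_TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z UTC"
# One hour, matching end_time_delta in the diff.
DEFAULT_DURATION_SECONDS = 3600


def to_timestamp(value: str) -> int:
    """Convert a Google-style time string into a Unix timestamp."""
    return int(datetime.strptime(value, GOOGLE_TIME_FORMAT).timestamp())


def maintenance_window(start_str, end_str=None):
    """Return (start, end), manufacturing an end when none is given."""
    start = to_timestamp(start_str)
    end = to_timestamp(end_str) if end_str else start + DEFAULT_DURATION_SECONDS
    return start, end


explicit = maintenance_window(
    "2025-07-15 03:55:00 +0000 UTC", "2025-07-15 14:05:00 +0000 UTC"
)
fallback = maintenance_window("2025-07-15 03:55:00 +0000 UTC")
print(explicit, fallback)
```

With both times present, the result matches the `start` and `end` fields above. Without an End Time, the end is simply the start plus one hour, which keeps the two values distinct.
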