Commit 8d9cb03

Add Tumblr error handling, response validation, and typed SocialPostingVertical (#83)
Closes #22
Closes #45
Closes #62

The Tumblr integration returned raw JSON without validating what Tumblr sent back, so any API failure surfaced as a cryptic Python error instead of something useful. This brings it up to the same standard as Strava.

- Added `TumblrAPIError` so Tumblr failures are identifiable and have useful messages. Before, a 401 from Tumblr would just be a generic `HTTPError`, and a malformed response would crash with something like `AttributeError: 'NoneType' object has no attribute 'get'`. Now we get `TumblrAPIError: Tumblr API error: ...` with a description of what went wrong and, for HTTP failures, the status code.
- Both `user/info` and `user/dashboard` responses are now checked for the expected structure before anything tries to read from them. If Tumblr returns 200 but the payload is malformed (missing keys, wrong types), `TumblrAPIError` is raised right away instead of silently producing garbage data.
- Added `parse_social_posting_vertical`, which turns a raw Tumblr post into a `SocialPostingVertical` model by mapping post ID, URL, timestamp, tags, note count, text blocks, and media URLs into the standard schema. Same approach Strava already uses.
- `fetch_social_posting_vertical` now returns `(parsed_verticals, raw_posts)` instead of just a list of dicts. This matches Strava's pattern, so consumers get both structured data and raw JSON.
- The updated test suite covers error wrapping, validation, and parsing.

Pagination (#8) is out of scope here.

---

These new features don't change anything in the demo approach.
```python
from pardner.services.tumblr import TumblrTransferService
from core.models import DonatedPost, ServiceAccount
import json

# Get the most recent donation
sa = ServiceAccount.objects.order_by('-completed_donation_at').first()
post = DonatedPost.objects.filter(service_account=sa).first()

# Parse through the new parser
svc = TumblrTransferService('x', 'x', 'http://localhost')
vertical = svc.parse_social_posting_vertical(post.raw_data)

# See the structured output
print(json.dumps(vertical.model_dump(), indent=2, default=str))
```

<details>
<summary>Structured output example</summary>

```json
{
  "pardner_object_id": "e6971577c78b49c6963cabdf6cebf529",
  "service_object_id": "810233790391877632",
  "creator_user_id": "t:iE-yd_-VKiQ4tRh4kkzLeg",
  "data_owner_id": "",
  "service": "Tumblr",
  "vertical_name": "social_posting",
  "created_at": "2026-03-05 08:25:56",
  "url": "https://angelswouldnthelpyou.tumblr.com/post/810233790391877632/david-lynch-and-sheryl-lee-twin-peaks-fire-walk",
  "abstract": "David Lynch and Sheryl Lee \nTwin Peaks Fire Walk With Me 1992",
  "associated_media": [
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s1280x1920/ae567fc16c7d32e5359142ead8c14c8c43225c70.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s640x960/82e3c61df944a7acdbee8da566ad5001411912b8.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s540x810/f1a2d05bade2e5d0a9e6cc17050fac2475214b49.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s500x750/d6fbe2d7c7d4746aab542717af43e8aeaabd84dc.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s400x600/9528b99630774c979fd9afe77f41a09f27eef657.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s250x400/c8b9576de6f9f79b52c5fcf5952a42115b8d2031.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s100x200/62762543123ff8cad1d4467159d8d73ad827ed05.jpg"
    },
    {
      "media_type": "image",
      "url": "https://64.media.tumblr.com/d020bdb6ba1834edc05117b2f7ef3f84/37d16fd99f65962b-60/s75x75_c1/1e79db2eb2321ea7f1bdcdd309634206d38819b4.jpg"
    }
  ],
  "interaction_count": 16,
  "keywords": [
    "twin peaks fire walk with me",
    "david lynch",
    "sheryl lee",
    "laura palmer",
    "twin peaks",
    "cinema"
  ],
  "shared_content": [],
  "status": "public",
  "text": "David Lynch and Sheryl Lee \n\nTwin Peaks Fire Walk With Me 1992",
  "title": null
}
```

</details>

> The parsed `SocialPostingVertical` models are not being saved to the database right now. This is intentional.
>
> 1. The `DonatedPost` model only has a `raw_data` JSONField. There's no column or table for structured vertical data
> 2. I wanted to preserve the site's access to raw Tumblr JSON and follow the Strava pattern of returning both
> 3. The site currently does `_, raw_posts = ...`: the `_` discards the parsed verticals and only `raw_posts` gets stored
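Because `fetch_social_posting_vertical` now returns a `(parsed_verticals, raw_posts)` pair, a caller can keep storing the raw JSON and still inspect the structured models. A minimal sketch of the unpacking, assuming a hypothetical `fake_fetch` stand-in (not part of pardner) that mimics the return shape with plain dicts in place of `SocialPostingVertical` objects:

```python
from typing import Any, Optional


def fake_fetch() -> tuple[list[Optional[dict[str, Any]]], list[Any]]:
    """Hypothetical stand-in mimicking the new (parsed, raw_posts) return shape."""
    raw_posts: list[Any] = [
        {'id': 1, 'post_url': 'https://example.tumblr.com/post/1'},
        'not-a-dict',  # a malformed post: the real parser returns None for it
    ]
    parsed = [
        {'service_object_id': str(p['id']), 'url': p['post_url']}
        if isinstance(p, dict)
        else None
        for p in raw_posts
    ]
    return parsed, raw_posts


# What the site does today, per the note above: keep only the raw JSON.
_, raw_only = fake_fetch()

# What a consumer could do instead: pair each parsed model with its raw post
# and skip the entries that could not be parsed.
parsed, raw_posts = fake_fetch()
usable = [
    (vertical, raw) for vertical, raw in zip(parsed, raw_posts) if vertical is not None
]
```

The two lists are index-aligned, so `zip` keeps each parsed model next to the raw post it came from.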
1 parent 2ec9a6b commit 8d9cb03

5 files changed

Lines changed: 396 additions & 39 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
```diff
@@ -186,7 +186,7 @@ cython_debug/
 # that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
 # and can be added to the global gitignore or merged into this file. However, if you prefer,
 # you could uncomment the following to ignore the entire vscode folder
-# .vscode/
+.vscode/
 
 # Ruff stuff:
 .ruff_cache/
```

src/pardner/exceptions.py

Lines changed: 22 additions & 0 deletions
```diff
@@ -1,3 +1,5 @@
+from typing import Any
+
 from pardner.verticals import Vertical
 
 
@@ -26,3 +28,23 @@ def __init__(self, *unsupported_verticals: Vertical, service_name: str) -> None:
 class UnsupportedRequestException(Exception):
     def __init__(self, service_name: str, message: str):
         super().__init__(f'Cannot fetch data from {service_name}: {message}')
+
+
+class TumblrAPIError(Exception):
+    """
+    Raised when the Tumblr API returns a non-OK HTTP status or a success response
+    whose payload is missing expected structure
+    """
+
+    def __init__(
+        self,
+        message: str,
+        status_code: int | None = None,
+        raw_response: Any = None,
+    ) -> None:
+        detail = f'Tumblr API error: {message}'
+        if status_code is not None:
+            detail += f' (HTTP {status_code})'
+        super().__init__(detail)
+        self.status_code = status_code
+        self.raw_response = raw_response
```
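To make the message format concrete, here is a small self-contained sketch that repeats the exception from this file (with `Optional[int]` in place of `int | None`, otherwise the same behaviour) and constructs it with and without a status code. The 401 and the payload are illustrative values, not captured Tumblr output.

```python
from typing import Any, Optional


class TumblrAPIError(Exception):
    """Repeats the exception added in this commit so the sketch runs on its own."""

    def __init__(
        self,
        message: str,
        status_code: Optional[int] = None,
        raw_response: Any = None,
    ) -> None:
        detail = f'Tumblr API error: {message}'
        if status_code is not None:
            detail += f' (HTTP {status_code})'
        super().__init__(detail)
        self.status_code = status_code
        self.raw_response = raw_response


# An HTTP failure carries the status code in both the message and the attribute.
http_failure = TumblrAPIError('Failed to fetch user/info', status_code=401)

# A malformed 200 payload has no status code, but keeps the offending payload.
bad_payload = TumblrAPIError(
    "user/info response is missing a 'response' object", raw_response={'meta': {}}
)

try:
    raise http_failure
except TumblrAPIError as err:
    caught_message = str(err)
```

Callers can branch on `status_code is None` to tell a transport failure from a payload that arrived but had the wrong shape.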

src/pardner/services/tumblr.py

Lines changed: 215 additions & 24 deletions
```diff
@@ -1,9 +1,13 @@
 import json
-from typing import Any, Iterable, Optional, override
+from datetime import datetime, timezone
+from typing import Any, Iterable, Literal, Optional, override
 
-from pardner.exceptions import UnsupportedRequestException
+from requests import HTTPError
+
+from pardner.exceptions import TumblrAPIError, UnsupportedRequestException
 from pardner.services import BaseTransferService
 from pardner.verticals import SocialPostingVertical, Vertical
+from pardner.verticals.sub_verticals import AssociatedMediaSubVertical
 
 
 class TumblrTransferService(BaseTransferService):
```
```diff
@@ -63,6 +67,166 @@ def fetch_token(
         include_client_id: bool = True,
     ) -> dict[str, Any]:
         return super().fetch_token(code, authorization_response, include_client_id)
+
+    def _validate_user_info_response(self, data: Any) -> dict[str, Any]:
+        """
+        Validates the shape of a ``user/info`` JSON payload.
+
+        :param data: the parsed JSON dict returned by ``user/info``.
+        :returns: the ``response`` sub-dict if validation passes.
+        :raises: :class:`TumblrAPIError` if required keys are absent or have the wrong type.
+        """
+        if not isinstance(data, dict):
+            raise TumblrAPIError(
+                'user/info response is not a JSON object', raw_response=data
+            )
+        response = data.get('response')
+        if not isinstance(response, dict):
+            raise TumblrAPIError(
+                "user/info response is missing a 'response' object", raw_response=data
+            )
+        user = response.get('user')
+        if not isinstance(user, dict):
+            raise TumblrAPIError(
+                'user/info response.user is missing or not an object', raw_response=data
+            )
+        blogs = user.get('blogs')
+        if not isinstance(blogs, list):
+            raise TumblrAPIError(
+                'user/info response.user.blogs is missing or not a list',
+                raw_response=data,
+            )
+        return response
+
+    def _validate_dashboard_response(self, data: Any) -> list[Any]:
+        """
+        Validates the shape of a ``user/dashboard`` JSON payload.
+
+        :param data: the parsed JSON dict returned by ``user/dashboard``.
+        :returns: the ``posts`` list if validation passes.
+        :raises: :class:`TumblrAPIError` if required keys are absent or have the wrong type.
+        """
+        if not isinstance(data, dict):
+            raise TumblrAPIError(
+                'user/dashboard response is not a JSON object', raw_response=data
+            )
+        response = data.get('response')
+        if not isinstance(response, dict):
+            raise TumblrAPIError(
+                "user/dashboard response is missing a 'response' object",
+                raw_response=data,
+            )
+        posts = response.get('posts')
+        if not isinstance(posts, list):
+            raise TumblrAPIError(
+                'user/dashboard response.posts is missing or not a list',
+                raw_response=data,
+            )
+        return posts
```
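Both validators follow the same pattern: walk down the expected keys and fail at the first level whose type is wrong. A generic sketch of that idea, assuming a hypothetical `require_path` helper and a simplified stand-in exception (neither is part of this commit):

```python
from typing import Any


class TumblrAPIError(Exception):
    """Simplified stand-in for the exception added in this commit."""


def require_path(data: Any, endpoint: str, path: list[tuple[str, type]]) -> Any:
    """Hypothetical helper: follow ``path`` through nested dicts, checking types.

    ``path`` is a list of (key, expected_type) pairs; every entry except the
    last is expected to be a dict. Returns the value at the end of the path,
    or raises TumblrAPIError at the first mismatch.
    """
    if not isinstance(data, dict):
        raise TumblrAPIError(f'{endpoint} response is not a JSON object')
    current: Any = data
    walked: list[str] = []
    for key, expected_type in path:
        walked.append(key)
        value = current.get(key)
        if not isinstance(value, expected_type):
            raise TumblrAPIError(
                f"{endpoint} {'.'.join(walked)} is missing or not a {expected_type.__name__}"
            )
        current = value
    return current


DASHBOARD_PATH = [('response', dict), ('posts', list)]

# A well-formed payload yields the posts list.
posts = require_path({'response': {'posts': [{'id': 1}]}}, 'user/dashboard', DASHBOARD_PATH)

# A 200 response whose payload has the wrong shape fails fast.
try:
    require_path({'response': {'posts': None}}, 'user/dashboard', DASHBOARD_PATH)
    error_message = ''
except TumblrAPIError as err:
    error_message = str(err)
```

The commit keeps two explicit validators rather than a shared helper like this one, which allows hand-written messages per endpoint and passing `raw_response` along with each error.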
126+
127+
def _map_tumblr_state(
128+
self, state: str | None
129+
) -> Literal['public', 'private', 'draft', 'restricted'] | None:
130+
"""Maps a Tumblr post ``state`` string to the vertical status literal."""
131+
mapping: dict[str, Literal['public', 'private', 'draft', 'restricted']] = {
132+
'published': 'public',
133+
'private': 'private',
134+
'draft': 'draft',
135+
'queued': 'restricted',
136+
'queue': 'restricted',
137+
}
138+
return mapping.get(state or '', None)
139+
140+
def parse_social_posting_vertical(
141+
self, raw_data: Any
142+
) -> SocialPostingVertical | None:
143+
"""
144+
Given a single raw Tumblr post dict, creates a
145+
:class:`SocialPostingVertical` model object, if possible.
146+
147+
Maps stable NPF fields: ``id``, ``post_url``, ``timestamp``,
148+
``summary``, ``note_count``, ``tags``, ``state``, NPF content
149+
text blocks, and media blocks.
150+
151+
:param raw_data: a single post dict from the Tumblr dashboard response.
152+
:returns: :class:`SocialPostingVertical` or ``None`` if ``raw_data``
153+
is not a dict.
154+
"""
155+
if not isinstance(raw_data, dict):
156+
return None
157+
158+
# identity / location
159+
service_object_id: str | None = str(raw_data['id']) if 'id' in raw_data else None
160+
post_url: str | None = raw_data.get('post_url') or raw_data.get('short_url')
161+
162+
created_at: datetime | None = None
163+
timestamp = raw_data.get('timestamp')
164+
if isinstance(timestamp, (int, float)):
165+
created_at = datetime.fromtimestamp(timestamp, tz=timezone.utc).replace(
166+
tzinfo=None
167+
)
168+
169+
blog = raw_data.get('blog') or {}
170+
creator_user_id: str | None = (
171+
blog.get('uuid') or blog.get('name') or raw_data.get('blog_name')
172+
)
173+
data_owner_id: str = self.primary_blog_id or ''
174+
175+
interaction_count: int | None = raw_data.get('note_count')
176+
keywords: list[str] = raw_data.get('tags') or []
177+
178+
status = self._map_tumblr_state(raw_data.get('state'))
179+
180+
# NPF content blocks
181+
content_blocks: list[dict[str, Any]] = raw_data.get('content') or []
182+
text_parts: list[str] = []
183+
associated_media: list[AssociatedMediaSubVertical] = []
184+
185+
for block in content_blocks:
186+
if not isinstance(block, dict):
187+
continue
188+
block_type = block.get('type', '')
189+
if block_type == 'text':
190+
text_value = block.get('text')
191+
if isinstance(text_value, str) and text_value:
192+
text_parts.append(text_value)
193+
elif block_type in ('image', 'video', 'audio'):
194+
media_type_map: dict[str, Literal['audio', 'image', 'video']] = {
195+
'image': 'image',
196+
'video': 'video',
197+
'audio': 'audio',
198+
}
199+
media_type = media_type_map.get(block_type)
200+
media_entries: list[dict[str, Any]] = block.get('media') or []
201+
if isinstance(media_entries, list):
202+
for entry in media_entries:
203+
if isinstance(entry, dict) and entry.get('url'):
204+
associated_media.append(
205+
AssociatedMediaSubVertical(
206+
media_type=media_type, url=entry['url']
207+
)
208+
)
209+
elif isinstance(media_entries, dict) and media_entries.get('url'):
210+
associated_media.append(
211+
AssociatedMediaSubVertical(
212+
media_type=media_type, url=media_entries['url']
213+
)
214+
)
215+
216+
return SocialPostingVertical(
217+
creator_user_id=creator_user_id,
218+
data_owner_id=data_owner_id,
219+
service_object_id=service_object_id,
220+
service=self._service_name,
221+
created_at=created_at,
222+
url=post_url,
223+
abstract=raw_data.get('summary'),
224+
interaction_count=interaction_count,
225+
keywords=keywords,
226+
status=status,
227+
text='\n\n'.join(text_parts) if text_parts else None,
228+
associated_media=associated_media,
229+
)
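Two steps in the parser are easy to misread in diff form: the timestamp is converted to a naive UTC `datetime`, and NPF `content` blocks are split into text and media URLs. A reduced sketch of just those two steps, using plain tuples instead of `AssociatedMediaSubVertical` and handling only the list form of `media`; the function names and the sample post are made up for illustration:

```python
from datetime import datetime, timezone
from typing import Any, Optional


def to_naive_utc(timestamp: Any) -> Optional[datetime]:
    """Convert a Unix timestamp to a timezone-naive datetime expressed in UTC."""
    if not isinstance(timestamp, (int, float)):
        return None
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).replace(tzinfo=None)


def split_npf_blocks(content: Any) -> tuple[Optional[str], list[tuple[str, str]]]:
    """Return (joined text, [(media_type, url), ...]) from NPF content blocks."""
    if not isinstance(content, list):
        content = []
    text_parts: list[str] = []
    media: list[tuple[str, str]] = []
    for block in content:
        if not isinstance(block, dict):
            continue  # ignore anything that is not a block object
        block_type = block.get('type', '')
        if block_type == 'text':
            text_value = block.get('text')
            if isinstance(text_value, str) and text_value:
                text_parts.append(text_value)
        elif block_type in ('image', 'video', 'audio'):
            for entry in block.get('media') or []:
                if isinstance(entry, dict) and entry.get('url'):
                    media.append((block_type, entry['url']))
    return ('\n\n'.join(text_parts) if text_parts else None), media


sample_post = {
    # Built from a known UTC instant so the round trip is easy to check.
    'timestamp': datetime(2026, 3, 5, 8, 25, 56, tzinfo=timezone.utc).timestamp(),
    'content': [
        {'type': 'text', 'text': 'David Lynch and Sheryl Lee '},
        {'type': 'image', 'media': [{'url': 'https://example.com/a.jpg'}]},
        {'type': 'text', 'text': 'Twin Peaks Fire Walk With Me 1992'},
    ],
}
created_at = to_naive_utc(sample_post['timestamp'])
text, media = split_npf_blocks(sample_post['content'])
```

Text blocks are joined with a blank line between them, which is why the `text` field in the structured output example contains `\n\n`.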
```diff
 
     def fetch_primary_blog_id(self) -> str:
         """
@@ -75,13 +239,27 @@ def fetch_primary_blog_id(self) -> str:
 
         :returns: the primary blog id.
 
-        :raises: :class:`ValueError`: if the primary blog ID could not be extracted from
-            the response.
+        :raises: :class:`TumblrAPIError`: if the Tumblr API returns a non-OK response
+            or a malformed success payload.
+        :raises: :class:`ValueError`: if the response is structurally valid but no
+            primary blog with a UUID was found.
         """
         if self.primary_blog_id:
             return self.primary_blog_id
-        user_info = self._get_resource_from_path('user/info').json().get('response', {})
-        for blog_info in user_info.get('user', {}).get('blogs', []):
+
+        try:
+            raw_response = self._get_resource_from_path('user/info')
+        except HTTPError as exc:
+            raise TumblrAPIError(
+                'Failed to fetch user/info',
+                status_code=exc.response.status_code if exc.response is not None else None,
+                raw_response=exc.response,
+            ) from exc
+
+        user_info_data = raw_response.json()
+        response = self._validate_user_info_response(user_info_data)
+
+        for blog_info in response['user']['blogs']:
             if (
                 isinstance(blog_info, dict)
                 and blog_info.get('primary')
@@ -95,17 +273,17 @@ def fetch_primary_blog_id(self) -> str:
             'Failed to fetch primary blog id. Either manually set the _primary_blog_id '
             'attribute or verify all the client credentials '
             'and permissions are correct. Response from Tumblr: '
-            f'{json.dumps(user_info, indent=2)}'
+            f'{json.dumps(user_info_data, indent=2)}'
         )
 
     def fetch_social_posting_vertical(
         self,
         request_params: dict[str, Any] = {},
         count: int = 20,
         text_only: bool = True,
-    ) -> list[Any]:
+    ) -> tuple[list[SocialPostingVertical | None], list[Any]]:
         """
-        Fetches posts from Tumblr feed for user account whose token was
+        Fetches posts from Tumblr feed for the user account whose token was
         obtained using the Tumblr API.
 
         :param count: number of posts to request.
```
```diff
@@ -115,21 +293,34 @@ def fetch_social_posting_vertical(
             to the endpoint. Depending on the parameters passed, this could override
             ``count`` and ``text_only``.
 
-        :returns: a list of dictionary objects with information for the posts in a feed.
+        :returns: a two-element tuple: the first element is a list of
+            :class:`SocialPostingVertical` objects (``None`` for posts that could
+            not be parsed); the second element is the raw list of post dicts as
+            returned by the API.
 
-        :raises: :class:`UnsupportedRequestException` if the request is unable to be
-            made.
+        :raises: :class:`UnsupportedRequestException` if ``count`` exceeds 20.
+        :raises: :class:`TumblrAPIError` if the API returns a non-OK response
+            or a malformed success payload.
         """
-        if count <= 20:
-            params: dict[str, Any] = {'limit': count, 'npf': True, **request_params}
-            if text_only:
-                params['type'] = 'text'
-            dashboard_response = self._get_resource_from_path(
-                'user/dashboard',
-                params,
+        if count > 20:
+            raise UnsupportedRequestException(
+                self._service_name,
+                'can only make a request for at most 20 posts at a time.',
             )
-            return list(dashboard_response.json().get('response').get('posts'))
-        raise UnsupportedRequestException(
-            self._service_name,
-            'can only make a request for at most 20 posts at a time.',
-        )
+
+        params: dict[str, Any] = {'limit': count, 'npf': True, **request_params}
+        if text_only:
+            params['type'] = 'text'
+
+        try:
+            dashboard_response = self._get_resource_from_path('user/dashboard', params)
+        except HTTPError as exc:
+            raise TumblrAPIError(
+                'Failed to fetch user/dashboard',
+                status_code=exc.response.status_code if exc.response is not None else None,
+                raw_response=exc.response,
+            ) from exc
+
+        raw_posts = self._validate_dashboard_response(dashboard_response.json())
+        parsed = [self.parse_social_posting_vertical(post) for post in raw_posts]
+        return parsed, raw_posts
```
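The `try` / `except ... from exc` wrapping used in both fetch methods can be tried without network access. The sketch below swaps `requests.HTTPError` for hypothetical `FakeHTTPError` and `FakeResponse` stand-ins and uses a simplified `TumblrAPIError`, so it shows only the wrapping logic and is not the service code itself:

```python
from typing import Any, Optional


class FakeResponse:
    """Hypothetical stand-in for a requests.Response; only status_code is modelled."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code


class FakeHTTPError(Exception):
    """Hypothetical stand-in for requests.HTTPError, which exposes ``.response``."""

    def __init__(self, response: Optional[FakeResponse] = None) -> None:
        super().__init__('HTTP error')
        self.response = response


class TumblrAPIError(Exception):
    """Simplified stand-in for the exception added in this commit."""

    def __init__(self, message: str, status_code: Optional[int] = None) -> None:
        detail = f'Tumblr API error: {message}'
        if status_code is not None:
            detail += f' (HTTP {status_code})'
        super().__init__(detail)
        self.status_code = status_code


def get_dashboard(fail_with: Optional[FakeHTTPError]) -> dict[str, Any]:
    """Pretend transport call: raises the given error or returns a payload."""
    if fail_with is not None:
        raise fail_with
    return {'response': {'posts': []}}


def fetch(fail_with: Optional[FakeHTTPError] = None) -> dict[str, Any]:
    try:
        return get_dashboard(fail_with)
    except FakeHTTPError as exc:
        # The error may carry no response, so check before reading the status.
        status = exc.response.status_code if exc.response is not None else None
        # "from exc" keeps the original error reachable via __cause__.
        raise TumblrAPIError('Failed to fetch user/dashboard', status_code=status) from exc


wrapped: Optional[TumblrAPIError] = None
try:
    fetch(FakeHTTPError(FakeResponse(401)))
except TumblrAPIError as err:
    wrapped = err
```

Chaining with `from exc` means a traceback still shows the underlying HTTP error beneath the friendlier `TumblrAPIError` message.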
