Right now, our code for handling HTML vs text content is a bit of a mess. For example, the logic that extracts links, mentions, hashtags, and other entities from HTML content and converts them into `tags` is spread across `Source.postprocess_object` (mentions), `bluesky.from_as1` (links and hashtags?), and some other places too.
We should centralize all of this around a new AS1 field for plain text content: ideally something that already exists in AS1, but if not, maybe a new `textContent` field. `content` would then always be HTML, with everything expressed in markup there. `tags` with `startIndex`/`length` would always reference `textContent`, we could reliably convert plain text whitespace (eg newlines) to `<br>`s, and we could get rid of `content_is_html`.
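
To make the proposal concrete, here's a rough sketch of how HTML `content` could be derived from `textContent` plus indexed `tags`. It's only an illustration, not existing granary code: `render_content` is a hypothetical helper, `textContent` is the proposed field, and it assumes that indices are code point offsets into `textContent`, that each indexed tag has a `url`, and that tagged ranges don't overlap.

```python
import html


def render_content(obj):
    """Hypothetical helper: build HTML content from textContent and tags.

    Assumes startIndex/length are code point offsets into textContent and
    that tagged ranges don't overlap. Not part of granary's current API.
    """
    text = obj.get('textContent', '')
    # only tags that carry an index range can be turned into inline links
    tags = sorted(
        (t for t in obj.get('tags', []) if 'startIndex' in t and 'length' in t),
        key=lambda t: t['startIndex'],
    )

    parts = []
    pos = 0
    for tag in tags:
        start = tag['startIndex']
        end = start + tag['length']
        if start < pos or end > len(text):
            continue  # skip overlapping or out-of-range tags
        parts.append(html.escape(text[pos:start]))
        parts.append('<a href="%s">%s</a>' % (
            html.escape(tag.get('url', ''), quote=True),
            html.escape(text[start:end]),
        ))
        pos = end
    parts.append(html.escape(text[pos:]))

    # convert newlines after escaping, so the <br>s themselves aren't escaped
    return ''.join(parts).replace('\n', '<br>')


note = {
    'objectType': 'note',
    'textContent': 'hi @alice\nsee #granary',
    'tags': [
        {'objectType': 'mention', 'url': 'https://example.com/alice',
         'startIndex': 3, 'length': 6},
        {'objectType': 'hashtag', 'url': 'https://example.com/tags/granary',
         'startIndex': 14, 'length': 8},
    ],
}
print(render_content(note))
```

With something like this in one place, converters could read `textContent` and `tags` directly when they need plain text plus ranges, and `content` when they need HTML, instead of each re-parsing markup on its own.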
Background: #675, snarfed/bridgy-fed#990, etc