Posted to dev@annotator.apache.org by GitBox <gi...@apache.org> on 2020/05/28 15:48:05 UTC

[GitHub] [incubator-annotator] danielweck edited a comment on issue #80: highlight range: "not a perfect undo: split text nodes are not merged again"

danielweck edited a comment on issue #80:
URL: https://github.com/apache/incubator-annotator/issues/80#issuecomment-635428423

> However it would not be a perfect undo function either, as it would also merge nodes that were already split beforehand (for some reason).

My thoughts exactly. Even if non-normalized (empty or adjacent) text nodes are a "weird" edge case to begin with, the assertion that the DOM will be restored to its exact original form is not 100% bulletproof, which can be problematic in usage scenarios where the DOM must be the source of truth for further selections / annotations. A pragmatic workaround is to reload the document from scratch instead of walking the tree to revert each individual DOM mutation, but this may not be practical in usage scenarios where some context must persist.
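
To make the edge case concrete, here is a minimal sketch that models an element's adjacent text nodes as a plain array of strings. It is an illustration only, not Apache Annotator code: `splitTextNode` and `normalize` are hypothetical helpers that mimic what `Text.splitText()` and `Node.normalize()` do to adjacent text nodes.

```typescript
// Simplified model: the adjacent text nodes of one element, as strings.
// (Assumption for illustration; a real DOM also contains elements, comments, etc.)
type TextNodes = string[];

// Split the node at `index` into two nodes at `offset`,
// mirroring the effect of Text.splitText(offset).
function splitTextNode(nodes: TextNodes, index: number, offset: number): TextNodes {
  const target = nodes[index];
  return [
    ...nodes.slice(0, index),
    target.slice(0, offset),
    target.slice(offset),
    ...nodes.slice(index + 1),
  ];
}

// Merge all adjacent text nodes and drop empty ones,
// mirroring the effect of Node.normalize() on adjacent text nodes.
function normalize(nodes: TextNodes): TextNodes {
  const merged = nodes.join("");
  return merged.length > 0 ? [merged] : [];
}

// The document was already split into two text nodes before any highlighting.
const original: TextNodes = ["Hello ", "world"];

// Highlighting splits the second node (the wrapper element is omitted here).
const highlighted = splitTextNode(original, 1, 3);

// Undoing via normalization merges the new split AND the pre-existing one.
const undone = normalize(highlighted);
```

The text content is preserved, but `undone` has one node where `original` had two, so any reference expressed as "text node index + offset" against the original structure no longer resolves.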

As always, there are pros and cons, and interspersing DOM elements in order to "mark" text ranges no doubt has its merits. However, there are evident computational costs in "wrapping" + "unwrapping" DOM mutations, which may also trigger expensive render/layout passes in the web browser. In my experience though, performance degradation only becomes perceptible in stress-test cases, for example when highlighting many thousands of search results inside a large document. I realize that the vast majority of use cases probably do not require such a level of scrutiny.

Just a bit of background to explain why I am paying attention to Apache Annotator, Hypothesis, etc.:

I work on a Readium project where document "annotations" are rendered using pixel-perfect overlays (i.e. shape "drawings" made with SVG or HTML elements + CSS), which are displayed above text characters using CSS `mix-blend-mode: multiply` in order to preserve text contrast, and with a translucent background color to "paint" the actual highlighting shape.

DOM mutations are batched and inserted at the end of the original document (using DocumentFragments), which is a more efficient technique than tree-walking and interspersing individual mutations. The primary computational cost is therefore the calculation of "client rectangles" (i.e. 2D shape coordinates at the character level), but this is relatively cheap if no re-renders are triggered in the web browser engine. The arithmetic needed to optimize the bounding shapes is cheap too (i.e. elimination of duplicates, overlapping regions, thin areas, etc.).
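
As a rough illustration of that kind of rectangle clean-up, here is a minimal sketch. It is not the Readium implementation: the `Rect` shape, the `cleanRects` name and the `minSize` threshold are assumptions made for this example, and it only handles exact duplicates, fully contained rectangles and thin slivers (partial overlaps are left alone).

```typescript
// Hypothetical rectangle shape, similar in spirit to the entries
// returned by Range.getClientRects().
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// True when `inner` lies entirely within `outer` (identical rects count).
function contains(outer: Rect, inner: Rect): boolean {
  return (
    inner.left >= outer.left &&
    inner.top >= outer.top &&
    inner.left + inner.width <= outer.left + outer.width &&
    inner.top + inner.height <= outer.top + outer.height
  );
}

// Drop thin slivers, exact duplicates and rectangles covered by another one.
// `minSize` is an assumed threshold in CSS pixels.
function cleanRects(rects: Rect[], minSize = 1): Rect[] {
  const thick = rects.filter((r) => r.width >= minSize && r.height >= minSize);
  const kept: Rect[] = [];
  for (const rect of thick) {
    // Skip rects already covered by a kept one (this includes duplicates).
    if (kept.some((k) => contains(k, rect))) continue;
    // Remove previously kept rects that the new one fully covers.
    for (let i = kept.length - 1; i >= 0; i--) {
      if (contains(rect, kept[i])) kept.splice(i, 1);
    }
    kept.push(rect);
  }
  return kept;
}
```

Each comparison is a handful of additions and comparisons per pair of rectangles, with no DOM access and therefore no layout work.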

Bundling DOM "annotations" together at the end of the original DOM makes it easy to blacklist "foreign" document artefacts when calculating / matching references to the unadulterated HTML (simply speaking, we can just ignore whole subtrees when computing CSS selectors or when resolving DOM Ranges, for example).
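
Here is a minimal sketch of what "ignoring whole subtrees" can look like when computing a structural reference. It uses a tiny stand-in tree type rather than real DOM nodes, and the `annotation-overlays` id, `SketchNode` and `pathTo` are hypothetical names chosen for this example only.

```typescript
// Minimal stand-in for a DOM element (assumption for illustration).
interface SketchNode {
  tag: string;
  id?: string;
  children: SketchNode[];
}

// Hypothetical id of the single container that holds all overlays.
const OVERLAY_ID = "annotation-overlays";

function isForeign(node: SketchNode): boolean {
  return node.id === OVERLAY_ID;
}

// Child-index path from `root` to `target`, counting only original
// (non-overlay) children. Returns null when the target is not reachable
// without entering the overlay subtree.
function pathTo(root: SketchNode, target: SketchNode): number[] | null {
  if (root === target) return [];
  const own = root.children.filter((c) => !isForeign(c));
  for (let i = 0; i < own.length; i++) {
    const sub = pathTo(own[i], target);
    if (sub !== null) return [i, ...sub];
  }
  return null;
}

// Example tree: <body><p/><p/><div id="annotation-overlays"><div/></div></body>
const p1: SketchNode = { tag: "p", children: [] };
const p2: SketchNode = { tag: "p", children: [] };
const shape: SketchNode = { tag: "div", children: [] };
const overlays: SketchNode = { tag: "div", id: OVERLAY_ID, children: [shape] };
const body: SketchNode = { tag: "body", children: [p1, p2, overlays] };

const pathToP2 = pathTo(body, p2);
const pathToShape = pathTo(body, shape);
```

Because everything "foreign" lives under one container, a single check at the subtree root is enough; nothing inside the overlay container can ever become part of a reference.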

We use `pointer-events: none` so that annotation overlays are effectively "transparent" to user interactions with the underlying document, which in turn makes it possible to create overlapping text selections / highlights. That being said, annotation overlays remain "tactile" in the sense that we simulate direct interaction using event delegation, for example to implement hit-testing with the mouse cursor, for hover and click actions at the level of each individual highlight.
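
The hit-testing part can be sketched as a pure function over the stored rectangles. This is a simplified illustration, not our actual code: `Box`, `Highlight` and `hitTest` are hypothetical names, and the coordinates are assumed to be in the same space as the pointer position (e.g. `clientX` / `clientY` from a single delegated listener on the document).

```typescript
// Hypothetical shapes for this sketch.
interface Box {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Highlight {
  id: string;
  boxes: Box[]; // one highlight usually spans several boxes (one per line)
}

// Return the ids of every highlight that has at least one box under (x, y).
// Several ids can be returned because highlights are allowed to overlap.
function hitTest(highlights: Highlight[], x: number, y: number): string[] {
  return highlights
    .filter((h) =>
      h.boxes.some(
        (b) =>
          x >= b.left &&
          x <= b.left + b.width &&
          y >= b.top &&
          y <= b.top + b.height
      )
    )
    .map((h) => h.id);
}

// In a browser, one delegated listener would feed this function, e.g.:
//   document.addEventListener("click", (ev) => {
//     const hits = hitTest(allHighlights, ev.clientX, ev.clientY);
//     ...
//   });
// (`allHighlights` is an assumed application-level collection.)
```

Since the overlays themselves never receive pointer events, the underlying text stays selectable, and the application decides what a "hover" or "click" on a highlight means.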

As a final note, I will say that out-of-band overlays (versus inline / interspersed DOM mutations) introduced non-trivial implementation challenges related to the different display modes our application supports. This is in the EPUB context, so we have the traditional vertical "webby" scrolling view, a paginated mode (using CSS Columns), and fixed-layout (i.e. document root scaled with a zoom factor to fit entirely into the visible viewport). Once the idiosyncrasies of the CSS box model with `position` `absolute` vs. `fixed` in these different rendering contexts were figured out, we were good to go :)

Sorry to bombard you with slightly unrelated information; I hope this is a useful exchange nonetheless. Keep up the good work in Apache Annotator and related projects (CC @tilgovi too).

PS: our project will at some point visit the problem space of creating interoperable annotations, likely relying on the W3C standard(s), so I will continue to watch your projects.
