Posted to dev@annotator.apache.org by GitBox <gi...@apache.org> on 2021/07/09 16:00:34 UTC

[GitHub] [incubator-annotator] Treora commented on issue #112: TextQuoteSelector selector yielding an infinite number of matches

Treora commented on issue #112:
URL: https://github.com/apache/incubator-annotator/issues/112#issuecomment-877289041


   For now, I just added warnings to the documentation. Longer term, I’d really like to fix this, but dealing with search in a changing DOM is not trivial. However, while robustness to arbitrary DOM changes may be hard, it gets easier if we restrict ourselves to a few kinds of changes. In any case, it seems fair to limit our effort to DOM changes that keep the scope’s `textContent` unchanged. Among those changes, we could consider:
   
   1. changes that split text nodes (as done by `highlightText`)
   2. changes that merge text nodes
   3. changes that wrap text nodes in elements, or unwrap them (also done by `highlightText`; but I suppose this is not a problem for us anyway, as we simply iterate through text nodes)
   4. changes that replace a node with another that has the same text content.
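
To make the "`textContent` unchanged" constraint concrete, here is a minimal sketch of changes 1 and 2. It does not touch the real DOM: text nodes are modelled as a plain array of strings, and `splitNode`, `mergeNodes` and `textContent` are hypothetical helpers made up for this illustration, not part of the library.

```typescript
// Hypothetical model: the text nodes of a scope, in document order.
type TextNodes = string[];

// The concatenated text, standing in for the scope's `textContent`.
const textContent = (nodes: TextNodes): string => nodes.join("");

// Kind 1: split node `i` in two at `offset` (roughly what Text.splitText does).
function splitNode(nodes: TextNodes, i: number, offset: number): TextNodes {
  const copy = [...nodes];
  const node = copy[i] ?? "";
  copy.splice(i, 1, node.slice(0, offset), node.slice(offset));
  return copy;
}

// Kind 2: merge node `i` with the node that follows it.
function mergeNodes(nodes: TextNodes, i: number): TextNodes {
  const copy = [...nodes];
  copy.splice(i, 2, copy.slice(i, i + 2).join(""));
  return copy;
}

const original: TextNodes = ["Hello ", "world"];
const afterSplit = splitNode(original, 1, 2); // "world" becomes "wo" + "rld"
const afterMerge = mergeNodes(afterSplit, 0); // "Hello " + "wo" become one node

// The node boundaries move, but the concatenated text stays the same.
console.log(textContent(afterSplit) === textContent(original));
console.log(textContent(afterMerge) === textContent(original));
```

The point is only that the node boundaries shift while the concatenated text stays identical, so a position expressed relative to the whole text remains meaningful even when the (node, offset) pair it was derived from does not.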
   
   Besides these kinds of changes, we could distinguish between changes to the nodes the search is currently looking at, changes before them, and changes ahead of them. Perhaps changes before and ahead need not be a problem either, but we’d probably more often deal with changes to the nodes currently being searched (for highlighting and such).
   
   I would probably start by changing the approach of the abstract [text quote matcher implementation](https://github.com/apache/incubator-annotator/blob/main/packages/selector/src/text/match-text-quote.ts). Currently, it manually walks through chunks, but I think it could be simpler if it used a `TextSeeker` for this. Then perhaps we can make it robust to split nodes/chunks in there, e.g. by always checking whether `this.offsetInChunk < chunk.length` at the start of any function, and walking to the correct chunk if not. Perhaps there will be more problems though.
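
A rough sketch of that last check, to show what I mean. This is not the actual `TextSeeker` code: `Chunk`, `splitChunk` and `Position` are hypothetical stand-ins invented for the example, and it only handles a split of the current chunk (merges or replaced nodes would need more).

```typescript
// Hypothetical stand-in for a text node or chunk: mutable text in a linked list.
interface Chunk {
  data: string;
  next: Chunk | null;
}

// Split `chunk` in place at `index`; the tail becomes a new chunk right after it.
function splitChunk(chunk: Chunk, index: number): Chunk {
  const tail: Chunk = { data: chunk.data.slice(index), next: chunk.next };
  chunk.data = chunk.data.slice(0, index);
  chunk.next = tail;
  return tail;
}

class Position {
  constructor(public chunk: Chunk, public offsetInChunk: number) {}

  // If the current chunk was shortened by a split, carry the overflow
  // forward until the offset falls inside a chunk again.
  private normalize(): void {
    while (
      this.offsetInChunk >= this.chunk.data.length &&
      this.chunk.next !== null
    ) {
      this.offsetInChunk -= this.chunk.data.length;
      this.chunk = this.chunk.next;
    }
  }

  // Every public method re-validates the position before using it.
  peek(): string | undefined {
    this.normalize();
    return this.chunk.data[this.offsetInChunk];
  }
}

const first: Chunk = { data: "hello world", next: null };
const position = new Position(first, 6); // points at the "w"
const before = position.peek();

splitChunk(first, 3); // now "hel" followed by "lo world"
const after = position.peek(); // the position walks on into the second chunk

console.log(before === after);
```

The stored offset is treated as "characters past the start of the chunk I was in", which stays valid when that chunk is split because the text that follows it is unchanged. A change before the current position inside the same chunk would presumably work the same way, but I have not thought through every case.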


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@annotator.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org