You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@annotator.apache.org by GitBox <gi...@apache.org> on 2021/06/25 00:57:42 UTC

[GitHub] [incubator-annotator] ziodave opened a new issue #112: refinedBy selector yielding an infinite number of matches

ziodave opened a new issue #112:
URL: https://github.com/apache/incubator-annotator/issues/112


   Hello,
   
   Can you help me out, I tried to understand what's going on here: I created the following, which indeed works, i.e. the text is highlighted in DOM.
   
   However it enters an infinite loop in `for await (const match of matches)` and I couldn't understand the cause:
   
   ```
   async function highlightMatcher(matcher) {
     const matches = matcher(document.body);
     for await (const match of matches) {
       highlightRange(match); //    <----- this is called an infinite number of times
     }
   }
   
   const createMatcher = makeRefinable((selector) => {
     const innerCreateMatcher = {
       TextQuoteSelector: createTextQuoteSelectorMatcher,
       TextPositionSelector: createTextPositionSelectorMatcher,
       CssSelector: createCssSelectorMatcher,
       RangeSelector: makeCreateRangeSelectorMatcher(createMatcher)
     }[selector.type];
     if (!innerCreateMatcher) {
       throw new Error(`Unsupported selector type: ${selector.type}`);
     }
     return innerCreateMatcher(selector);
   });
   
   highlightMatcher(createMatcher({
     value: "#my-div",
     type: "CssSelector",
     refinedBy: {
       type: "TextQuoteSelector",
       exact: "My Text"
     }
   }));
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-annotator] Treora commented on issue #112: TextQuoteSelector selector yielding an infinite number of matches

Posted by GitBox <gi...@apache.org>.
Treora commented on issue #112:
URL: https://github.com/apache/incubator-annotator/issues/112#issuecomment-877289041


   For now, I just added warnings to the documentation. Longer term, I’d really like to fix this, but it is not trivial to deal with search in a changing DOM. However, while robustness to arbitrary DOM changes may be hard, it gets easier if we limit ourselves to limited changes. In any case, it seems fair to limit our effort to DOM changes that keep the scope’s `textContent` unchanged. Within these changes, we could consider:
   
   1. changes that split text nodes (as done by `highlightText`)
   2. changes that merge text nodes
   3. changes that wrap text nodes in elements, or unwrap them (also done in highlightText; but I suppose this is not a problem for us anyway, as we simply iterate through text nodes)
   4. changes that replace a node with another that has the same text content.
   
   Besides these kinds of changes, we could distinguish between changes on the nodes where the search is currently looking, changes before it, and changes ahead of it. Perhaps changes before and after it need not be a problem either, but we’d probably more often deal with changes at the nodes that are currently searched in (for highlighting and such).
   
   I would probably start with changing the approach of the abstract [text quote matcher implementation](https://github.com/apache/incubator-annotator/blob/main/packages/selector/src/text/match-text-quote.ts). Currently, it manually walks through chunks, but I think it could be simpler by using a TextSeeker to do this. Then, perhaps we can create robustness to split nodes/chunks in there, by e.g. always checking if `this.offsetInChunk < chunk.length` at the start of any function, and walking to the correct chunk if not. Perhaps there will be more problems though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@annotator.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-annotator] Treora commented on issue #112: refinedBy selector yielding an infinite number of matches

Posted by GitBox <gi...@apache.org>.
Treora commented on issue #112:
URL: https://github.com/apache/incubator-annotator/issues/112#issuecomment-868443072


   Can reproduce. The issue is that if the DOM is changed (by `highlightRange`/`highlightText`) while the TextQuoteSelector is being matched, it can end up finding the quote again and again. (note that the `refinedBy` part is not relevant here, the TextQuoteSelector could be passed to `createMatcher` directly)
   
   I would like our text matching to be resistant to DOM changes that do not modify the text content. I am not sure how easy or hard this may be to achieve. If we cannot fix this, it seems unhelpful to return an (async) iterable, and we’d better just return an array with the matches.
   
   For now, best is to avoid changing the DOM while running the matcher, and first collect the matches in an array. For example, we do this in the demo:
   
   ```
     const matchAll = createMatcher(selector);
     const ranges = [];
   
     for await (const range of matchAll(target)) {
       ranges.push(range);
     }
   
     for (const range of ranges) {
       const removeHighlight = highlightText(range);
       moduleState.cleanupFunctions.push(removeHighlight);
     }
   ```
   
   I see now that I gave the simpler but broken example in the [getting started](https://annotator.apache.org/docs/getting-started/) page. I will add update and a note!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org