Posted to commits@annotator.apache.org by ge...@apache.org on 2021/01/08 15:13:59 UTC

[incubator-annotator] branch master updated (c4b5598 -> 1126e14)

This is an automated email from the ASF dual-hosted git repository.

gerben pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git.


    from c4b5598  lint
     add 149c7b3  Generate less minimal prefixes&suffixes
     add 8dd8995  Update tests
     new 1126e14  Merge pull request #99: Generate less minimal prefixes&suffixes

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 packages/dom/src/text-quote/describe.ts           |   7 +-
 packages/dom/test/text-quote/describe-cases.ts    | 298 +++++++++++++++++++++-
 packages/dom/test/text-quote/describe.test.ts     |  61 ++++-
 packages/selector/src/text/describe-text-quote.ts | 209 ++++++++++++---
 packages/selector/src/text/seeker.ts              |  39 ++-
 web/demo/index.js                                 |   2 +-
 6 files changed, 552 insertions(+), 64 deletions(-)


[incubator-annotator] 01/01: Merge pull request #99: Generate less minimal prefixes&suffixes

Posted by ge...@apache.org.

gerben pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 1126e1418ced3e3aedfad2256867227a0228b9b2
Merge: c4b5598 8dd8995
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Jan 8 16:13:49 2021 +0100

    Merge pull request #99: Generate less minimal prefixes&suffixes
    
    A TextQuoteSelector can include as much prefix and suffix as desired. Until now, we added only as much prefix and suffix as was strictly necessary to disambiguate the target from other occurrences of the exact same text in the same document. When an annotation should still anchor on a modified version of the document, it can help to add a little more context, to be robust against the ambiguity that would result if, after such a modification, the quoted text appears in more pl [...]
    
    Also, it seems neat to have the prefix and suffix contain whole words instead of stopping halfway inside a word. This makes them more pleasant to read when user interfaces expose the prefix&suffix. It also brings the implementation closer to being compatible with the WICG TextFragments spec (see #60).
    
    This PR thus adds two ways to generate less minimal prefixes&suffixes:
    
        - Round them up to the next whitespace.
        - Optionally add prefix&suffix around a short quote even if it is not
        ambiguous.
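
The first of these, rounding up to the next whitespace, could look roughly like the sketch below. It is a minimal illustration on plain strings, not the code from this PR: the function names, the character-offset parameters, and the choice to add nothing when no context is needed are all assumptions made for the example.

```typescript
// Hypothetical helpers, not the library's API. Offsets are character
// positions in `text`; `minimalLength` is the context length that was
// strictly needed for disambiguation.

// Extend a suffix starting at `quoteEnd` until it stops at whitespace
// or at the end of the text.
function roundSuffixToWhitespace(
  text: string,
  quoteEnd: number,
  minimalLength: number,
): string {
  // Assumption of this sketch: if no context was needed, add none.
  if (minimalLength <= 0) return "";
  let end = Math.min(text.length, quoteEnd + minimalLength);
  while (end < text.length && !/\s/.test(text.charAt(end))) end++;
  return text.slice(quoteEnd, end);
}

// Extend a prefix ending at `quoteStart` backwards until it starts right
// after whitespace or at the start of the text.
function roundPrefixToWhitespace(
  text: string,
  quoteStart: number,
  minimalLength: number,
): string {
  if (minimalLength <= 0) return "";
  let start = Math.max(0, quoteStart - minimalLength);
  while (start > 0 && !/\s/.test(text.charAt(start - 1))) start--;
  return text.slice(start, quoteStart);
}
```

For the text "the quick brown fox jumps" and the quote "brown" (offsets 10 to 15), a minimal context of two characters on each side would be rounded up to the prefix "quick " and the suffix " fox".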
    
    I made rounding up to whitespace the default behaviour, while the previous behaviour can still be obtained using the option minimalContext. For the context around short quotes I do not know what a good default would be (it might depend on use case and document length?), so I left it at 0 for now, i.e. the feature is turned off by default.
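
To illustrate the second feature and its off-by-default setting, here is a rough sketch of padding a short quote with surrounding text. The option name `minimumQuoteLength`, the interface names, and the rule that prefix + quote + suffix together should reach that length are assumptions made for this example; the actual option names and semantics are defined in the PR's code, not here.

```typescript
// Hypothetical options shape; `minimumQuoteLength` is an assumed name.
interface ShortQuoteOptions {
  minimumQuoteLength?: number;
}

interface QuoteWithContext {
  prefix: string;
  exact: string;
  suffix: string;
}

// Pad the quote spanning [start, end) with surrounding characters until
// prefix + exact + suffix reaches minimumQuoteLength or the text runs out.
function padShortQuote(
  text: string,
  start: number,
  end: number,
  options: ShortQuoteOptions = {},
): QuoteWithContext {
  // A default of 0 means the feature is off: nothing is ever padded.
  const { minimumQuoteLength = 0 } = options;
  let from = start;
  let to = end;
  while (to - from < minimumQuoteLength && (from > 0 || to < text.length)) {
    // Grow alternately on both sides so the context stays balanced.
    if (from > 0) from--;
    if (to - from < minimumQuoteLength && to < text.length) to++;
  }
  return {
    prefix: text.slice(from, start),
    exact: text.slice(start, end),
    suffix: text.slice(end, to),
  };
}
```

Passing the setting in an options object as the last parameter keeps existing calls unchanged: omitting the argument gives an empty prefix and suffix, as before.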
    
    This PR also refactors the implementation a bit, reusing the seekers instead of creating new ones on every match.
    
    To pass options, I added an options object as the last function parameter. I thought we might want to move the scope parameter into this options object too, but scope is specific to the DOM implementation, so I’m not sure if that is desirable.
    
    I added options for anything that would otherwise feel like we’re hardcoding a ‘magic number’, but of course quite a few choices about how exactly the algorithm works are hardcoded opinions too. I wavered between a few variations, but thought this one the most straightforward, with (I hope) generally sensible results. That remains to be seen in practice, I guess.
    
    I added basic tests for each of the new behaviours. Currently these tests are still in the dom package, but they should be refactored and moved into the selector package, as the actual algorithms being tested reside there.
