Posted to commits@annotator.apache.org by ge...@apache.org on 2020/11/20 21:20:58 UTC

[incubator-annotator] branch import-dom-seek updated (f58fdb8 -> 2ff0e4e)

This is an automated email from the ASF dual-hosted git repository.

gerben pushed a change to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git.


 discard f58fdb8  Rename chunker.ts→text-node-chunker.ts
 discard 5dd5105  Move abstract code into @annotator/selector
 discard d9ed301  Linting
 discard 436b3a0  Tweak seeker
 discard d27ba4e  tweak comments
 discard 9ccb88c  Compare *extra* pre/suffix lengths (ignore sunk costs)
 discard 8b7d302  Refactor pre/suffix disambiguation
 discard a174e4c  Refactor clip range to scope
 discard f37e397  This is what do–while was invented for :)
 discard 2da2f50  Require all Chunkers to be non-empty
 discard 36bb0b8  Add note on fragility. May need to rethink approach.
 discard 2fbc7bf  fix type of matchers
 discard 953f0d4  Make demo more challenging.
 discard 213c49e  Export describeTextPosition & use it in demo
     new 0d4d66f  Export describeTextPosition & use it in demo
     new 940984e  Make demo more challenging.
     new 9dc1e5e  fix type of matchers
     new 68f05f4  Add note on fragility. May need to rethink approach.
     new e8500bb  Require all Chunkers to be non-empty
     new 6bab278  This is what do–while was invented for :)
     new 2f97989  Refactor clip range to scope
     new 91d2459  Refactor pre/suffix disambiguation
     new 8459b0e  Compare *extra* pre/suffix lengths (ignore sunk costs)
     new f15fe34  tweak comments
     new d285714  Tweak seeker
     new 14e92f4  Linting
     new 6ee4ff8  Move abstract code into @annotator/selector
     new 2ff0e4e  Rename chunker.ts→text-node-chunker.ts

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user force-pushes a change (git push --force) and produces a
repository history like this:

 * -- * -- B -- O -- O -- O   (f58fdb8)
            \
             N -- N -- N   refs/heads/import-dom-seek (2ff0e4e)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 14 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 web/demo/index.js | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[incubator-annotator] 11/14: Tweak seeker

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit d28571401e5b181f2f745bc0be289b63eb40f134
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 13:07:16 2020 +0100

    Tweak seeker
---
 packages/dom/src/seek.ts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/packages/dom/src/seek.ts b/packages/dom/src/seek.ts
index 75627e9..a1f52c8 100644
--- a/packages/dom/src/seek.ts
+++ b/packages/dom/src/seek.ts
@@ -83,7 +83,7 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
 
   private _readOrSeekToChunk(read: true, target: TChunk, offset?: number): string
   private _readOrSeekToChunk(read: false, target: TChunk, offset?: number): void
-  private _readOrSeekToChunk(read: boolean, target: TChunk, offset: number = 0): string {
+  private _readOrSeekToChunk(read: boolean, target: TChunk, offset: number = 0): string | void {
     const oldPosition = this.position;
     let result = '';
 
@@ -124,8 +124,8 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
         this.seekTo(oldPosition);
         result = this.readTo(targetPosition);
       }
+      return result;
     }
-    return result;
   }
 
   private _readOrSeekTo(read: true, target: number, roundUp?: boolean): string
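
Note on the change above: the overload signatures promise a string when
read is true and nothing when read is false, so the implementation
signature is widened to string | void and the result is only returned in
the reading case. A minimal sketch of the same pattern, using a
hypothetical free function readOrSkip (not part of the repository):

```typescript
// Hypothetical helper, not from the repository: one implementation serves
// both a "read" overload (returns text) and a "skip" overload (returns nothing).
function readOrSkip(read: true, text: string, length: number): string;
function readOrSkip(read: false, text: string, length: number): void;
function readOrSkip(
  read: boolean,
  text: string,
  length: number,
): string | void {
  const consumed = text.slice(0, length);
  // Only the read mode hands a result back; skip mode falls through
  // and therefore yields undefined.
  if (read) return consumed;
}

const readResult: string = readOrSkip(true, 'annotator', 4); // typed as string
const skipResult: void = readOrSkip(false, 'annotator', 4); // typed as void
```

Callers only ever see the two overloads; the wider implementation
signature is not callable directly.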


[incubator-annotator] 04/14: Add note on fragility. May need to rethink approach.

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 68f05f42ca9e9b44906c619a888a88a3381e4e72
Author: Gerben <ge...@treora.com>
AuthorDate: Thu Nov 19 18:16:10 2020 +0100

    Add note on fragility. May need to rethink approach.
---
 packages/dom/src/text-quote/describe.ts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 1a7941d..4f0c70f 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -79,6 +79,7 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
     const matches = abstractTextQuoteSelectorMatcher(tentativeSelector)(scope());
     let nextMatch = await matches.next();
 
+    // XXX This test is fragile: nextMatch and target are assumed to be normalised.
     if (!nextMatch.done && chunkRangeEquals(nextMatch.value, target)) {
       // This match is the intended one, ignore it.
       nextMatch = await matches.next();
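
Note on the fragility mentioned above: chunkRangeEquals compares chunks
and indices field by field, so two ranges that cover the same text can
still compare as unequal if one of them is not normalised. The sketch
below assumes a simplified shape for Chunk and ChunkRange (the real
interfaces are not shown in this email) and reuses the comparison logic
from chunker.ts:

```typescript
// Assumed, simplified shapes; the real interfaces live in chunker.ts.
interface Chunk<TData> {
  readonly data: TData;
  equals?(other: this): boolean;
}

interface ChunkRange<TChunk extends Chunk<any>> {
  startChunk: TChunk;
  startIndex: number;
  endChunk: TChunk;
  endIndex: number;
}

function chunkEquals(chunk1: Chunk<any>, chunk2: Chunk<any>): boolean {
  return chunk1.equals ? chunk1.equals(chunk2) : chunk1 === chunk2;
}

function chunkRangeEquals(
  range1: ChunkRange<any>,
  range2: ChunkRange<any>,
): boolean {
  return (
    chunkEquals(range1.startChunk, range2.startChunk) &&
    chunkEquals(range1.endChunk, range2.endChunk) &&
    range1.startIndex === range2.startIndex &&
    range1.endIndex === range2.endIndex
  );
}

const firstChunk: Chunk<string> = { data: 'Hello ' };
const secondChunk: Chunk<string> = { data: 'world' };

// Both ranges cover the text "world", but one starts at the very end of the
// first chunk and the other at the very start of the second chunk.
const unnormalised: ChunkRange<Chunk<string>> = {
  startChunk: firstChunk,
  startIndex: 6,
  endChunk: secondChunk,
  endIndex: 5,
};
const normalised: ChunkRange<Chunk<string>> = {
  startChunk: secondChunk,
  startIndex: 0,
  endChunk: secondChunk,
  endIndex: 5,
};

const sameRange = chunkRangeEquals(unnormalised, normalised);
const selfEqual = chunkRangeEquals(normalised, normalised);
```

Here sameRange is false although both ranges denote the same text, which
is why the test only works if both sides use the same boundary convention.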


[incubator-annotator] 03/14: fix type of matchers

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 9dc1e5e44b270efcbf40c5cd9e6f41fe21c54c56
Author: Gerben <ge...@treora.com>
AuthorDate: Thu Nov 19 17:48:40 2020 +0100

    fix type of matchers
---
 packages/dom/src/text-position/match.ts | 12 +++---------
 packages/dom/src/text-quote/match.ts    | 11 +++--------
 2 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/packages/dom/src/text-position/match.ts b/packages/dom/src/text-position/match.ts
index aa5fe49..53bfae3 100644
--- a/packages/dom/src/text-position/match.ts
+++ b/packages/dom/src/text-position/match.ts
@@ -26,9 +26,7 @@ import { Chunk, ChunkRange, TextNodeChunker, PartialTextNode } from '../chunker'
 export function createTextPositionSelectorMatcher(
   selector: TextPositionSelector,
 ): Matcher<Range, Range> {
-
-  const abstractMatcher: AbstractMatcher<PartialTextNode> =
-    abstractTextPositionSelectorMatcher(selector);
+  const abstractMatcher = abstractTextPositionSelectorMatcher(selector);
 
   return async function* matchAll(scope) {
     const textChunks = new TextNodeChunker(scope);
@@ -43,13 +41,9 @@ export function createTextPositionSelectorMatcher(
   };
 }
 
-type AbstractMatcher<TChunk extends Chunk<any>> =
-  Matcher<NonEmptyChunker<TChunk>, ChunkRange<TChunk>>
-
-export function abstractTextPositionSelectorMatcher<TChunk extends Chunk<string>>(
+export function abstractTextPositionSelectorMatcher(
   selector: TextPositionSelector,
-): AbstractMatcher<TChunk> {
-
+): <TChunk extends Chunk<any>>(scope: NonEmptyChunker<TChunk>) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
   const { start, end } = selector;
 
   return async function* matchAll<TChunk extends Chunk<string>>(textChunks: NonEmptyChunker<TChunk>) {
diff --git a/packages/dom/src/text-quote/match.ts b/packages/dom/src/text-quote/match.ts
index 37a75ba..38e09d5 100644
--- a/packages/dom/src/text-quote/match.ts
+++ b/packages/dom/src/text-quote/match.ts
@@ -24,9 +24,7 @@ import { TextNodeChunker, Chunk, Chunker, ChunkRange, PartialTextNode } from '..
 export function createTextQuoteSelectorMatcher(
   selector: TextQuoteSelector,
 ): Matcher<Range, Range> {
-
-  const abstractMatcher: AbstractMatcher<PartialTextNode> =
-    abstractTextQuoteSelectorMatcher(selector);
+  const abstractMatcher = abstractTextQuoteSelectorMatcher(selector);
 
   return async function* matchAll(scope) {
     const textChunks = new TextNodeChunker(scope);
@@ -37,12 +35,9 @@ export function createTextQuoteSelectorMatcher(
   }
 }
 
-type AbstractMatcher<TChunk extends Chunk<any>> =
-  Matcher<Chunker<TChunk>, ChunkRange<TChunk>>
-
-export function abstractTextQuoteSelectorMatcher<TChunk extends Chunk<string>>(
+export function abstractTextQuoteSelectorMatcher(
   selector: TextQuoteSelector,
-): AbstractMatcher<TChunk> {
+): <TChunk extends Chunk<any>>(scope: Chunker<TChunk>) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
   return async function* matchAll<TChunk extends Chunk<string>>(textChunks: Chunker<TChunk>) {
     const exact = selector.exact;
     const prefix = selector.prefix || '';
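
Note on the type change above: moving the type parameter from the outer
factory onto the returned function lets TypeScript infer TChunk at the
point where the matcher is applied, so callers need no annotation. A
minimal sketch under simplifying assumptions: makeSubstringMatcher and
NumberedChunk are hypothetical, and plain arrays stand in for the
chunkers and async generators of the real code.

```typescript
interface Chunk<TData> {
  readonly data: TData;
}

interface ChunkRange<TChunk extends Chunk<any>> {
  startChunk: TChunk;
  startIndex: number;
  endChunk: TChunk;
  endIndex: number;
}

// Hypothetical factory: the outer function is not generic; the function it
// returns is, so the chunk type is inferred when the matcher is called.
function makeSubstringMatcher(
  needle: string,
): <TChunk extends Chunk<string>>(chunks: TChunk[]) => ChunkRange<TChunk>[] {
  return function matchAll<TChunk extends Chunk<string>>(chunks: TChunk[]) {
    const matches: ChunkRange<TChunk>[] = [];
    for (const chunk of chunks) {
      // Report only the first occurrence of the needle within each chunk.
      const index = chunk.data.indexOf(needle);
      if (index !== -1) {
        matches.push({
          startChunk: chunk,
          startIndex: index,
          endChunk: chunk,
          endIndex: index + needle.length,
        });
      }
    }
    return matches;
  };
}

interface NumberedChunk extends Chunk<string> {
  readonly id: number;
}

const matcher = makeSubstringMatcher('an');
const numbered: NumberedChunk[] = [
  { id: 1, data: 'banana' },
  { id: 2, data: 'cherry' },
];
const found = matcher(numbered);
// startChunk is typed as NumberedChunk, so .id is available without a cast.
const firstId: number = found[0].startChunk.id;
```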


[incubator-annotator] 07/14: Refactor clip range to scope

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 2f97989bc21f3fcf188c47976fbc1de0247a655d
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 12:09:15 2020 +0100

    Refactor clip range to scope
---
 packages/dom/src/chunker.ts                | 9 +++++++++
 packages/dom/src/text-position/describe.ts | 7 -------
 packages/dom/src/text-quote/describe.ts    | 7 -------
 3 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/packages/dom/src/chunker.ts b/packages/dom/src/chunker.ts
index bb71857..8c28f78 100644
--- a/packages/dom/src/chunker.ts
+++ b/packages/dom/src/chunker.ts
@@ -114,6 +114,15 @@ export class TextNodeChunker implements Chunker<PartialTextNode> {
   }
 
   rangeToChunkRange(range: Range): ChunkRange<PartialTextNode> {
+    range = range.cloneRange();
+
+    // Take the part of the range that falls within the scope.
+    if (range.compareBoundaryPoints(Range.START_TO_START, this.scope) === -1)
+      range.setStart(this.scope.startContainer, this.scope.startOffset);
+    if (range.compareBoundaryPoints(Range.END_TO_END, this.scope) === 1)
+      range.setEnd(this.scope.endContainer, this.scope.endOffset);
+
+    // Ensure it starts and ends at text nodes.
     const textRange = normalizeRange(range, this.scope);
 
     const startChunk = this.nodeToChunk(textRange.startContainer);
diff --git a/packages/dom/src/text-position/describe.ts b/packages/dom/src/text-position/describe.ts
index a711410..d4099a9 100644
--- a/packages/dom/src/text-position/describe.ts
+++ b/packages/dom/src/text-position/describe.ts
@@ -38,13 +38,6 @@ export async function describeTextPosition(
     scope.selectNodeContents(document);
   }
 
-  // Take the part of the range that falls within the scope.
-  range = range.cloneRange();
-  if (range.compareBoundaryPoints(Range.START_TO_START, scope) === -1)
-    range.setStart(scope.startContainer, scope.startOffset);
-  if (range.compareBoundaryPoints(Range.END_TO_END, scope) === 1)
-    range.setEnd(scope.endContainer, scope.endOffset);
-
   const textChunks = new TextNodeChunker(scope);
   if (textChunks.currentChunk === null)
     throw new RangeError('Range does not contain any Text nodes.');
diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 2e4693e..688089f 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -38,13 +38,6 @@ export async function describeTextQuote(
     scope.selectNodeContents(document);
   }
 
-  // Take the part of the range that falls within the scope.
-  range = range.cloneRange();
-  if (range.compareBoundaryPoints(Range.START_TO_START, scope) === -1)
-    range.setStart(scope.startContainer, scope.startOffset);
-  if (range.compareBoundaryPoints(Range.END_TO_END, scope) === 1)
-    range.setEnd(scope.endContainer, scope.endOffset);
-
   const chunker = new TextNodeChunker(scope);
 
   return await abstractDescribeTextQuote(
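
Note on the refactoring above: the clipping step copies the range and then
pulls each boundary back inside the scope if it lies outside. The sketch
below shows the same idea on plain numeric offsets rather than DOM Ranges;
Span and clipToScope are hypothetical names used only for illustration,
and the sketch does not handle a range that lies entirely outside the
scope.

```typescript
// Hypothetical numeric stand-in for a DOM Range: offsets into some text.
interface Span {
  start: number;
  end: number;
}

function clipToScope(range: Span, scope: Span): Span {
  // Work on a copy so the caller's range is left untouched,
  // mirroring range.cloneRange() in the diff.
  const clipped: Span = { start: range.start, end: range.end };

  // Take the part of the range that falls within the scope.
  if (clipped.start < scope.start) clipped.start = scope.start;
  if (clipped.end > scope.end) clipped.end = scope.end;

  return clipped;
}

const originalSpan: Span = { start: 2, end: 40 };
const scopeSpan: Span = { start: 10, end: 30 };
const clippedSpan = clipToScope(originalSpan, scopeSpan);
```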


[incubator-annotator] 05/14: Require all Chunkers to be non-empty

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit e8500bb000e262e1bf24a4828b8d9c07834d2478
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 11:55:57 2020 +0100

    Require all Chunkers to be non-empty
---
 packages/dom/src/chunker.ts                | 27 ++++++++++++++++++++-------
 packages/dom/src/seek.ts                   |  6 +-----
 packages/dom/src/text-position/describe.ts |  8 ++++----
 packages/dom/src/text-position/match.ts    | 12 +++++-------
 packages/dom/src/text-quote/describe.ts    |  8 ++++----
 packages/dom/src/text-quote/match.ts       | 12 ++++++++++--
 6 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/packages/dom/src/chunker.ts b/packages/dom/src/chunker.ts
index f671b5e..bb71857 100644
--- a/packages/dom/src/chunker.ts
+++ b/packages/dom/src/chunker.ts
@@ -53,8 +53,8 @@ export function chunkRangeEquals(range1: ChunkRange<any>, range2: ChunkRange<any
 // It is inspired by, and similar to, the DOM’s NodeIterator. (but unlike
 // NodeIterator, it has no concept of being ‘before’ or ‘after’ a chunk)
 export interface Chunker<TChunk extends Chunk<any>> {
-  // currentChunk is null only if it contains no chunks at all.
-  readonly currentChunk: TChunk | null;
+  // The chunk currently being pointed at.
+  readonly currentChunk: TChunk;
 
   // Move currentChunk to the chunk following it, and return that chunk.
   // If there are no chunks following it, keep currentChunk unchanged and return null.
@@ -74,14 +74,22 @@ export interface PartialTextNode extends Chunk<string> {
   readonly endOffset: number;
 }
 
+export class EmptyScopeError extends TypeError {
+  constructor(message?: string) {
+    super(message || 'Scope contains no text nodes.');
+  }
+}
+
 export class TextNodeChunker implements Chunker<PartialTextNode> {
 
   private iter: NodeIterator;
 
   get currentChunk() {
     const node = this.iter.referenceNode;
-    if (!isText(node))
-      return null;
+
+    // This test should not actually be needed, but it keeps TypeScript happy.
+    if (!isText(node)) throw new EmptyScopeError();
+
     return this.nodeToChunk(node);
   }
 
@@ -131,6 +139,9 @@ export class TextNodeChunker implements Chunker<PartialTextNode> {
     return range;
   }
 
+  /**
+   * @param scope A Range that overlaps with at least one text node.
+   */
   constructor(private scope: Range) {
     this.iter = ownerDocument(scope).createNodeIterator(
       scope.commonAncestorContainer,
@@ -146,9 +157,11 @@ export class TextNodeChunker implements Chunker<PartialTextNode> {
 
     // Move the iterator to after the start (= root) node.
     this.iter.nextNode();
-    // If the start node is not a text node, move it to the first text node (if any).
-    if (!isText(this.iter.referenceNode))
-      this.iter.nextNode();
+    // If the start node is not a text node, move it to the first text node.
+    if (!isText(this.iter.referenceNode)) {
+      const nextNode = this.iter.nextNode();
+      if (nextNode === null) throw new EmptyScopeError();
+    }
   }
 
   nextChunk() {
diff --git a/packages/dom/src/seek.ts b/packages/dom/src/seek.ts
index 3832b07..75627e9 100644
--- a/packages/dom/src/seek.ts
+++ b/packages/dom/src/seek.ts
@@ -22,10 +22,6 @@ import { Chunk, Chunker, chunkEquals } from "./chunker";
 
 const E_END = 'Iterator exhausted before seek ended.';
 
-export interface NonEmptyChunker<TChunk extends Chunk<any>> extends Chunker<TChunk> {
-  readonly currentChunk: TChunk;
-}
-
 export interface Seeker<T extends Iterable<any> = string> {
   readonly position: number;
   read(length?: number, roundUp?: boolean): T;
@@ -56,7 +52,7 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
   // The current text position (measured in code units)
   get position() { return this.currentChunkPosition + this.offsetInChunk; }
 
-  constructor(protected chunker: NonEmptyChunker<TChunk>) {
+  constructor(protected chunker: Chunker<TChunk>) {
     // Walk to the start of the first non-empty chunk inside the scope.
     this.seekTo(0);
   }
diff --git a/packages/dom/src/text-position/describe.ts b/packages/dom/src/text-position/describe.ts
index 8baff8d..a711410 100644
--- a/packages/dom/src/text-position/describe.ts
+++ b/packages/dom/src/text-position/describe.ts
@@ -20,9 +20,9 @@
 
 import type { TextPositionSelector } from '@annotator/selector';
 import { ownerDocument } from '../owner-document';
-import { Chunk, Chunker, ChunkRange, TextNodeChunker, PartialTextNode } from '../chunker';
+import { Chunk, Chunker, ChunkRange, TextNodeChunker } from '../chunker';
 import { CodePointSeeker } from '../code-point-seeker';
-import { TextSeeker, NonEmptyChunker } from '../seek';
+import { TextSeeker } from '../seek';
 
 export async function describeTextPosition(
   range: Range,
@@ -51,13 +51,13 @@ export async function describeTextPosition(
 
   return await abstractDescribeTextPosition(
     textChunks.rangeToChunkRange(range),
-    textChunks as NonEmptyChunker<PartialTextNode>,
+    textChunks,
   );
 }
 
 async function abstractDescribeTextPosition<TChunk extends Chunk<string>>(
   target: ChunkRange<TChunk>,
-  scope: NonEmptyChunker<TChunk>,
+  scope: Chunker<TChunk>,
 ): Promise<TextPositionSelector> {
   const codeUnitSeeker = new TextSeeker(scope);
   const codePointSeeker = new CodePointSeeker(codeUnitSeeker);
diff --git a/packages/dom/src/text-position/match.ts b/packages/dom/src/text-position/match.ts
index 53bfae3..cc8044e 100644
--- a/packages/dom/src/text-position/match.ts
+++ b/packages/dom/src/text-position/match.ts
@@ -19,9 +19,9 @@
  */
 
 import type { Matcher, TextPositionSelector } from '@annotator/selector';
-import { TextSeeker, NonEmptyChunker } from '../seek';
+import { TextSeeker } from '../seek';
 import { CodePointSeeker } from '../code-point-seeker';
-import { Chunk, ChunkRange, TextNodeChunker, PartialTextNode } from '../chunker';
+import { Chunk, ChunkRange, TextNodeChunker, Chunker } from '../chunker';
 
 export function createTextPositionSelectorMatcher(
   selector: TextPositionSelector,
@@ -31,9 +31,7 @@ export function createTextPositionSelectorMatcher(
   return async function* matchAll(scope) {
     const textChunks = new TextNodeChunker(scope);
 
-    if (textChunks.currentChunk === null)
-      throw new RangeError('Range does not contain any Text nodes.');
-    const matches = abstractMatcher(textChunks as NonEmptyChunker<PartialTextNode>);
+    const matches = abstractMatcher(textChunks);
 
     for await (const abstractMatch of matches) {
       yield textChunks.chunkRangeToRange(abstractMatch);
@@ -43,10 +41,10 @@ export function createTextPositionSelectorMatcher(
 
 export function abstractTextPositionSelectorMatcher(
   selector: TextPositionSelector,
-): <TChunk extends Chunk<any>>(scope: NonEmptyChunker<TChunk>) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
+): <TChunk extends Chunk<any>>(scope: Chunker<TChunk>) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
   const { start, end } = selector;
 
-  return async function* matchAll<TChunk extends Chunk<string>>(textChunks: NonEmptyChunker<TChunk>) {
+  return async function* matchAll<TChunk extends Chunk<string>>(textChunks: Chunker<TChunk>) {
     const codeUnitSeeker = new TextSeeker(textChunks);
     const codePointSeeker = new CodePointSeeker(codeUnitSeeker);
 
diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 4f0c70f..2e4693e 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -22,7 +22,7 @@ import type { TextQuoteSelector } from '@annotator/selector';
 import { ownerDocument } from '../owner-document';
 import { Chunk, Chunker, ChunkRange, TextNodeChunker, chunkRangeEquals } from '../chunker';
 import { abstractTextQuoteSelectorMatcher } from '.';
-import { TextSeeker, NonEmptyChunker } from '../seek';
+import { TextSeeker } from '../seek';
 
 export async function describeTextQuote(
   range: Range,
@@ -57,7 +57,7 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
   target: ChunkRange<TChunk>,
   scope: () => Chunker<TChunk>,
 ): Promise<TextQuoteSelector> {
-  const seeker = new TextSeeker(scope() as NonEmptyChunker<TChunk>);
+  const seeker = new TextSeeker(scope());
 
   // Read the target’s exact text.
   seeker.seekToChunk(target.startChunk, target.startIndex);
@@ -95,8 +95,8 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
 
     // We’ll have to add more prefix/suffix to disqualify this unintended match.
     const unintendedMatch = nextMatch.value;
-    const seeker1 = new TextSeeker(scope() as NonEmptyChunker<TChunk>);
-    const seeker2 = new TextSeeker(scope() as NonEmptyChunker<TChunk>);
+    const seeker1 = new TextSeeker(scope());
+    const seeker2 = new TextSeeker(scope());
 
     // Count how many characters we’d need as a prefix to disqualify this match.
     seeker1.seekToChunk(target.startChunk, target.startIndex - prefix.length);
diff --git a/packages/dom/src/text-quote/match.ts b/packages/dom/src/text-quote/match.ts
index 38e09d5..dea1f68 100644
--- a/packages/dom/src/text-quote/match.ts
+++ b/packages/dom/src/text-quote/match.ts
@@ -19,7 +19,7 @@
  */
 
 import type { Matcher, TextQuoteSelector } from '@annotator/selector';
-import { TextNodeChunker, Chunk, Chunker, ChunkRange, PartialTextNode } from '../chunker';
+import { Chunk, Chunker, ChunkRange, TextNodeChunker, EmptyScopeError } from '../chunker';
 
 export function createTextQuoteSelectorMatcher(
   selector: TextQuoteSelector,
@@ -27,7 +27,15 @@ export function createTextQuoteSelectorMatcher(
   const abstractMatcher = abstractTextQuoteSelectorMatcher(selector);
 
   return async function* matchAll(scope) {
-    const textChunks = new TextNodeChunker(scope);
+    let textChunks;
+    try {
+      textChunks = new TextNodeChunker(scope);
+    } catch (err) {
+      if (err instanceof EmptyScopeError)
+        return; // An empty range contains no matches.
+      else
+        throw err;
+    }
 
     for await (const abstractMatch of abstractMatcher(textChunks)) {
       yield textChunks.chunkRangeToRange(abstractMatch);
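
Note on the change above: instead of letting currentChunk be null, the
chunker now refuses to be constructed over an empty scope, and callers
that can tolerate emptiness catch the dedicated error. A minimal sketch of
that pattern; ArrayChunker and matchFirstChunk are hypothetical, and only
EmptyScopeError follows the diff (the setPrototypeOf call is an extra
precaution added for this sketch).

```typescript
class EmptyScopeError extends TypeError {
  constructor(message?: string) {
    super(message || 'Scope contains no text nodes.');
    // Precaution for this sketch: keeps instanceof working when the code
    // is compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical chunker over an array of strings; like TextNodeChunker in the
// diff, it cannot exist without at least one chunk.
class ArrayChunker {
  constructor(private chunks: string[]) {
    if (chunks.length === 0) throw new EmptyScopeError();
  }

  // Never null: the constructor guarantees there is a first chunk.
  get currentChunk(): string {
    return this.chunks[0];
  }
}

// Hypothetical caller: an empty scope simply contains no matches.
function matchFirstChunk(chunks: string[], needle: string): string[] {
  let chunker: ArrayChunker;
  try {
    chunker = new ArrayChunker(chunks);
  } catch (err) {
    if (err instanceof EmptyScopeError) return [];
    throw err; // Anything else is a genuine failure.
  }
  const chunk = chunker.currentChunk;
  return chunk.indexOf(needle) !== -1 ? [chunk] : [];
}

const noMatches = matchFirstChunk([], 'b');
const oneMatch = matchFirstChunk(['abc'], 'b');
```

The trade-off is that every consumer of the constructor must decide
whether an empty scope is an error or just an empty result.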


[incubator-annotator] 01/14: Export describeTextPosition & use it in demo

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 0d4d66ff96ea62da3f68d18ec77a1a729b18e4a2
Author: Gerben <ge...@treora.com>
AuthorDate: Wed Nov 18 19:46:19 2020 +0100

    Export describeTextPosition & use it in demo
---
 packages/dom/src/index.ts |  1 +
 web/demo/index.html       |  9 +++++++++
 web/demo/index.js         | 12 ++++++++++--
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/packages/dom/src/index.ts b/packages/dom/src/index.ts
index 3d7ca58..cd7d2ea 100644
--- a/packages/dom/src/index.ts
+++ b/packages/dom/src/index.ts
@@ -21,4 +21,5 @@
 export * from './css';
 export * from './range';
 export * from './text-quote';
+export * from './text-position';
 export * from './highlight-range';
diff --git a/web/demo/index.html b/web/demo/index.html
index ad2d0a2..3ed0961 100644
--- a/web/demo/index.html
+++ b/web/demo/index.html
@@ -72,6 +72,15 @@ under the License.
           Upon a change of selection, a
           <a rel="external" href="https://www.w3.org/TR/2017/REC-annotation-model-20170223/#text-quote-selector" target="_blank">TextQuoteSelector</a>
           will be created, that describes what was selected.</p>
+          <form id="form">
+            The selector can work either
+            <br/>
+            <input type="radio" name="describeMode" value="TextQuote" id="describeModeTextQuote" checked>
+            <label for="describeModeTextQuote">by quoting the selected text</label>; or
+            <br/>
+            <input type="radio" name="describeMode" value="TextPosition" id="describeModeTextPosition">
+            <label for="describeModeTextPosition">by counting the selected characters’ position in the text</label>.
+          </form>
       </div>
       <div class="column">
         <h2>Text is found here</h2>
diff --git a/web/demo/index.js b/web/demo/index.js
index cb96a40..d513252 100644
--- a/web/demo/index.js
+++ b/web/demo/index.js
@@ -18,12 +18,14 @@
  * under the License.
  */
 
-/* global info, module, source, target */
+/* global info, module, source, target, form */
 
 import {
   makeCreateRangeSelectorMatcher,
   createTextQuoteSelectorMatcher,
   describeTextQuote,
+  createTextPositionSelectorMatcher,
+  describeTextPosition,
   highlightRange,
 } from '@annotator/dom';
 import { makeRefinable } from '@annotator/selector';
@@ -95,6 +97,7 @@ function cleanup() {
 const createMatcher = makeRefinable((selector) => {
   const innerCreateMatcher = {
     TextQuoteSelector: createTextQuoteSelectorMatcher,
+    TextPositionSelector: createTextPositionSelectorMatcher,
     RangeSelector: makeCreateRangeSelectorMatcher(createMatcher),
   }[selector.type];
 
@@ -126,12 +129,16 @@ async function anchor(selector) {
 
 async function onSelectionChange() {
   cleanup();
+  const describeMode = form.describeMode.value;
   const scope = document.createRange();
   scope.selectNodeContents(source);
   const selection = document.getSelection();
   for (let i = 0; i < selection.rangeCount; i++) {
     const range = selection.getRangeAt(i);
-    const selector = await describeTextQuote(range, scope);
+    const selector =
+      describeMode === 'TextPosition'
+        ? await describeTextPosition(range, scope)
+        : await describeTextQuote(range, scope);
     await anchor(selector);
   }
 }
@@ -146,6 +153,7 @@ function onSelectorExampleClick(event) {
 }
 
 document.addEventListener('selectionchange', onSelectionChange);
+form.addEventListener('change', onSelectionChange);
 document.addEventListener('click', onSelectorExampleClick);
 
 if (module.hot) {
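
Note on the demo change above: the matcher is chosen by the selector's
type field, so supporting TextPositionSelector amounts to registering one
more entry. The sketch below illustrates that dispatch on plain strings.
It uses a switch instead of the demo's lookup table, counts positions in
code units rather than code points, and all function names are
hypothetical.

```typescript
interface TextQuoteSelector {
  type: 'TextQuoteSelector';
  exact: string;
}

interface TextPositionSelector {
  type: 'TextPositionSelector';
  start: number;
  end: number;
}

type Selector = TextQuoteSelector | TextPositionSelector;

interface Match {
  start: number;
  end: number;
}

// Find every occurrence of the quoted text.
function matchQuote(selector: TextQuoteSelector, text: string): Match[] {
  const matches: Match[] = [];
  if (selector.exact === '') return matches; // Avoid looping on empty quotes.
  let from = text.indexOf(selector.exact);
  while (from !== -1) {
    matches.push({ start: from, end: from + selector.exact.length });
    from = text.indexOf(selector.exact, from + 1);
  }
  return matches;
}

// A position selector matches at most once, if it fits inside the text.
function matchPosition(selector: TextPositionSelector, text: string): Match[] {
  const fits = selector.start >= 0 && selector.end <= text.length;
  return fits ? [{ start: selector.start, end: selector.end }] : [];
}

// Dispatch on the selector's type, as the demo does with its lookup table.
function matchSelector(selector: Selector, text: string): Match[] {
  switch (selector.type) {
    case 'TextQuoteSelector':
      return matchQuote(selector, text);
    case 'TextPositionSelector':
      return matchPosition(selector, text);
  }
}

const quoteMatches = matchSelector(
  { type: 'TextQuoteSelector', exact: 'ab' },
  'abcab',
);
const positionMatches = matchSelector(
  { type: 'TextPositionSelector', start: 1, end: 3 },
  'abcab',
);
```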


[incubator-annotator] 12/14: Linting

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 14e92f4bc0ca2fad099b77636afc7a0e8043e066
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 15:16:12 2020 +0100

    Linting
---
 .eslintrc.js                                     |   2 +
 packages/dom/src/chunker.ts                      |  79 +++++++++--------
 packages/dom/src/code-point-seeker.ts            |  68 ++++++++++-----
 packages/dom/src/normalize-range.ts              |  57 +++++++------
 packages/dom/src/range/cartesian.ts              |   2 +-
 packages/dom/src/seek.ts                         | 103 +++++++++++++++--------
 packages/dom/src/text-position/describe.ts       |   5 +-
 packages/dom/src/text-position/match.ts          |  15 ++--
 packages/dom/src/text-quote/describe.ts          |  41 ++++++---
 packages/dom/src/text-quote/match.ts             |  68 ++++++++++-----
 packages/dom/test/text-position/describe.test.ts |   5 +-
 packages/dom/test/text-position/match-cases.ts   |  17 ++--
 packages/dom/test/text-position/match.test.ts    |   1 -
 packages/dom/test/text-quote/match-cases.ts      |  77 +++++++++--------
 packages/dom/test/text-quote/match.test.ts       |   2 +-
 packages/selector/src/index.ts                   |   7 +-
 16 files changed, 337 insertions(+), 212 deletions(-)

diff --git a/.eslintrc.js b/.eslintrc.js
index 165813d..598d4ab 100644
--- a/.eslintrc.js
+++ b/.eslintrc.js
@@ -55,6 +55,7 @@ module.exports = {
       },
     ],
     'import/unambiguous': 'error',
+    'no-constant-condition': 'off',
     'prettier/prettier': [
       'error',
       {
@@ -111,6 +112,7 @@ module.exports = {
       plugins: ['@typescript-eslint'],
       rules: {
         '@typescript-eslint/consistent-type-imports': 'error',
+        '@typescript-eslint/no-explicit-any': 'off',
         '@typescript-eslint/no-unused-vars': [
           'error',
           { argsIgnorePattern: '^_' },
diff --git a/packages/dom/src/chunker.ts b/packages/dom/src/chunker.ts
index 8c28f78..8c56924 100644
--- a/packages/dom/src/chunker.ts
+++ b/packages/dom/src/chunker.ts
@@ -18,8 +18,8 @@
  * under the License.
  */
 
-import { normalizeRange } from "./normalize-range";
-import { ownerDocument } from "./owner-document";
+import { normalizeRange } from './normalize-range';
+import { ownerDocument } from './owner-document';
 
 // A Chunk represents a fragment (typically a string) of some document.
 // Subclasses can add further attributes to map the chunk to its position in the
@@ -40,12 +40,15 @@ export function chunkEquals(chunk1: Chunk<any>, chunk2: Chunk<any>): boolean {
   return chunk1.equals ? chunk1.equals(chunk2) : chunk1 === chunk2;
 }
 
-export function chunkRangeEquals(range1: ChunkRange<any>, range2: ChunkRange<any>) {
+export function chunkRangeEquals(
+  range1: ChunkRange<any>,
+  range2: ChunkRange<any>,
+): boolean {
   return (
-    chunkEquals(range1.startChunk, range2.startChunk)
-    && chunkEquals(range1.endChunk, range2.endChunk)
-    && range1.startIndex === range2.startIndex
-    && range1.endIndex === range2.endIndex
+    chunkEquals(range1.startChunk, range2.startChunk) &&
+    chunkEquals(range1.endChunk, range2.endChunk) &&
+    range1.startIndex === range2.startIndex &&
+    range1.endIndex === range2.endIndex
   );
 }
 
@@ -80,11 +83,19 @@ export class EmptyScopeError extends TypeError {
   }
 }
 
-export class TextNodeChunker implements Chunker<PartialTextNode> {
+export class OutOfScopeError extends TypeError {
+  constructor(message?: string) {
+    super(
+      message ||
+        'Cannot convert node to chunk, as it falls outside of chunker’s scope.',
+    );
+  }
+}
 
+export class TextNodeChunker implements Chunker<PartialTextNode> {
   private iter: NodeIterator;
 
-  get currentChunk() {
+  get currentChunk(): PartialTextNode {
     const node = this.iter.referenceNode;
 
     // This test should not actually be needed, but it keeps TypeScript happy.
@@ -94,10 +105,13 @@ export class TextNodeChunker implements Chunker<PartialTextNode> {
   }
 
   nodeToChunk(node: Text): PartialTextNode {
-    if (!this.scope.intersectsNode(node))
-      throw new Error('Cannot convert node to chunk, as it falls outside of chunker’s scope.');
-    const startOffset = (node === this.scope.startContainer) ? this.scope.startOffset : 0;
-    const endOffset = (node === this.scope.endContainer) ? this.scope.endOffset : node.length;
+    if (!this.scope.intersectsNode(node)) throw new OutOfScopeError();
+
+    const startOffset =
+      node === this.scope.startContainer ? this.scope.startOffset : 0;
+    const endOffset =
+      node === this.scope.endContainer ? this.scope.endOffset : node.length;
+
     return {
       node,
       startOffset,
@@ -105,12 +119,12 @@ export class TextNodeChunker implements Chunker<PartialTextNode> {
       data: node.data.substring(startOffset, endOffset),
       equals(other) {
         return (
-          other.node === this.node
-          && other.startOffset === this.startOffset
-          && other.endOffset === this.endOffset
+          other.node === this.node &&
+          other.startOffset === this.startOffset &&
+          other.endOffset === this.endOffset
         );
       },
-    }
+    };
   }
 
   rangeToChunkRange(range: Range): ChunkRange<PartialTextNode> {
@@ -173,28 +187,27 @@ export class TextNodeChunker implements Chunker<PartialTextNode> {
     }
   }
 
-  nextChunk() {
+  nextChunk(): PartialTextNode | null {
     // Move the iterator to after the current node, so nextNode() will cause a jump.
-    if (this.iter.pointerBeforeReferenceNode)
-      this.iter.nextNode();
-    if (this.iter.nextNode())
-      return this.currentChunk;
-    else
-      return null;
+    if (this.iter.pointerBeforeReferenceNode) this.iter.nextNode();
+
+    if (this.iter.nextNode()) return this.currentChunk;
+    else return null;
   }
 
-  previousChunk() {
-    if (!this.iter.pointerBeforeReferenceNode)
-      this.iter.previousNode();
-    if (this.iter.previousNode())
-      return this.currentChunk;
-    else
-      return null;
+  previousChunk(): PartialTextNode | null {
+    if (!this.iter.pointerBeforeReferenceNode) this.iter.previousNode();
+
+    if (this.iter.previousNode()) return this.currentChunk;
+    else return null;
   }
 
-  precedesCurrentChunk(chunk: PartialTextNode) {
+  precedesCurrentChunk(chunk: PartialTextNode): boolean {
     if (this.currentChunk === null) return false;
-    return !!(this.currentChunk.node.compareDocumentPosition(chunk.node) & Node.DOCUMENT_POSITION_PRECEDING);
+    return !!(
+      this.currentChunk.node.compareDocumentPosition(chunk.node) &
+      Node.DOCUMENT_POSITION_PRECEDING
+    );
   }
 }
 
diff --git a/packages/dom/src/code-point-seeker.ts b/packages/dom/src/code-point-seeker.ts
index b97089e..40f19b9 100644
--- a/packages/dom/src/code-point-seeker.ts
+++ b/packages/dom/src/code-point-seeker.ts
@@ -18,49 +18,58 @@
  * under the License.
  */
 
-import { ChunkSeeker } from "./seek";
-import { Chunk } from "./chunker";
+import type { Chunk } from './chunker';
+import type { ChunkSeeker } from './seek';
 
-export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TChunk, string[]> {
+export class CodePointSeeker<TChunk extends Chunk<string>>
+  implements ChunkSeeker<TChunk, string[]> {
   position = 0;
 
   constructor(public readonly raw: ChunkSeeker<TChunk>) {}
 
-  seekBy(length: number) {
+  seekBy(length: number): void {
     this.seekTo(this.position + length);
   }
 
-  seekTo(target: number) {
+  seekTo(target: number): void {
     this._readOrSeekTo(false, target);
   }
 
-  read(length: number, roundUp?: boolean) {
+  read(length: number, roundUp?: boolean): string[] {
     return this.readTo(this.position + length, roundUp);
   }
 
-  readTo(target: number, roundUp?: boolean) {
+  readTo(target: number, roundUp?: boolean): string[] {
     return this._readOrSeekTo(true, target, roundUp);
   }
 
-  get currentChunk() {
+  get currentChunk(): TChunk {
     return this.raw.currentChunk;
   }
 
-  get offsetInChunk() {
+  get offsetInChunk(): number {
     return this.raw.offsetInChunk;
   }
 
-  seekToChunk(target: TChunk, offset: number = 0) {
+  seekToChunk(target: TChunk, offset = 0): void {
     this._readOrSeekToChunk(false, target, offset);
   }
 
-  readToChunk(target: TChunk, offset: number = 0) {
+  readToChunk(target: TChunk, offset = 0): string[] {
     return this._readOrSeekToChunk(true, target, offset);
   }
 
-  private _readOrSeekToChunk(read: true, target: TChunk, offset?: number): string[]
-  private _readOrSeekToChunk(read: false, target: TChunk, offset?: number): void
-  private _readOrSeekToChunk(read: boolean, target: TChunk, offset: number = 0) {
+  private _readOrSeekToChunk(
+    read: true,
+    target: TChunk,
+    offset?: number,
+  ): string[];
+  private _readOrSeekToChunk(
+    read: false,
+    target: TChunk,
+    offset?: number,
+  ): void;
+  private _readOrSeekToChunk(read: boolean, target: TChunk, offset = 0) {
     const oldRawPosition = this.raw.position;
 
     let s = this.raw.readToChunk(target, offset);
@@ -75,7 +84,7 @@ export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeke
       s = s.slice(1);
     }
 
-    let result = [...s];
+    const result = [...s];
 
     this.position = movedForward
       ? this.position + result.length
@@ -84,9 +93,17 @@ export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeke
     if (read) return result;
   }
 
-  private _readOrSeekTo(read: true, target: number, roundUp?: boolean): string[];
+  private _readOrSeekTo(
+    read: true,
+    target: number,
+    roundUp?: boolean,
+  ): string[];
   private _readOrSeekTo(read: false, target: number, roundUp?: boolean): void;
-  private _readOrSeekTo(read: boolean, target: number, roundUp: boolean = false): string[] | void {
+  private _readOrSeekTo(
+    read: boolean,
+    target: number,
+    roundUp = false,
+  ): string[] | void {
     let result: string[] = [];
 
     if (this.position < target) {
@@ -96,7 +113,7 @@ export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeke
         let s = unpairedSurrogate + this.raw.read(1, true);
         if (endsWithinCharacter(s)) {
           unpairedSurrogate = s.slice(-1); // consider this half-character part of the next string.
-          s = s.slice(0,-1);
+          s = s.slice(0, -1);
         } else {
           unpairedSurrogate = '';
         }
@@ -107,11 +124,14 @@ export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeke
       if (unpairedSurrogate) this.raw.seekBy(-1); // align with the last complete character.
       if (!roundUp && this.position > target) {
         const overshootInCodePoints = this.position - target;
-        const overshootInCodeUnits = characters.slice(-overshootInCodePoints).join('').length;
+        const overshootInCodeUnits = characters
+          .slice(-overshootInCodePoints)
+          .join('').length;
         this.position -= overshootInCodePoints;
         this.raw.seekBy(-overshootInCodeUnits);
       }
-    } else { // Nearly equal to the if-block, but moving backward in the text.
+    } else {
+      // Nearly equal to the if-block, but moving backward in the text.
       let unpairedSurrogate = '';
       let characters: string[] = [];
       while (this.position > target) {
@@ -129,7 +149,9 @@ export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeke
       if (unpairedSurrogate) this.raw.seekBy(1);
       if (!roundUp && this.position < target) {
         const overshootInCodePoints = target - this.position;
-        const overshootInCodeUnits = characters.slice(0, overshootInCodePoints).join('').length;
+        const overshootInCodeUnits = characters
+          .slice(0, overshootInCodePoints)
+          .join('').length;
         this.position += overshootInCodePoints;
         this.raw.seekBy(overshootInCodeUnits);
       }
@@ -141,10 +163,10 @@ export class CodePointSeeker<TChunk extends Chunk<string>> implements ChunkSeeke
 
 function endsWithinCharacter(s: string) {
   const codeUnit = s.charCodeAt(s.length - 1);
-  return (0xD800 <= codeUnit && codeUnit <= 0xDBFF)
+  return 0xd800 <= codeUnit && codeUnit <= 0xdbff;
 }
 
 function startsWithinCharacter(s: string) {
   const codeUnit = s.charCodeAt(0);
-  return (0xDC00 <= codeUnit && codeUnit <= 0xDFFF)
+  return 0xdc00 <= codeUnit && codeUnit <= 0xdfff;
 }
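
Editor's note on the two helpers at the end of code-point-seeker.ts: they test whether a string has been cut in the middle of a surrogate pair. The sketch below copies both helpers from the diff and exercises them on a string containing one astral character. The name codePointToCodeUnitOffset is a hypothetical helper added for illustration only; it is not part of the commit. It mirrors the overshoot computation in _readOrSeekTo, which joins code points back together to measure them in code units.

```typescript
// Copied from the diff: true if the last code unit is a high (leading)
// surrogate, meaning the string stops halfway through a character.
function endsWithinCharacter(s: string): boolean {
  const codeUnit = s.charCodeAt(s.length - 1);
  return 0xd800 <= codeUnit && codeUnit <= 0xdbff;
}

// Copied from the diff: true if the first code unit is a low (trailing)
// surrogate, meaning the string starts halfway through a character.
function startsWithinCharacter(s: string): boolean {
  const codeUnit = s.charCodeAt(0);
  return 0xdc00 <= codeUnit && codeUnit <= 0xdfff;
}

// Hypothetical helper (not in the commit): convert an offset counted in
// code points into the equivalent offset counted in UTF-16 code units.
function codePointToCodeUnitOffset(text: string, codePoints: number): number {
  return Array.from(text).slice(0, codePoints).join('').length;
}

// 'a', then U+1F600 written as its two surrogate code units, then 'b'.
const sample = 'a\uD83D\uDE00b';
const firstHalf = sample.slice(0, 2); // cuts the pair in half
const secondHalf = sample.slice(2);
```

Here sample has four code units but only three code points, which is why CodePointSeeker has to step back by one code unit whenever a read stops inside a pair.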
diff --git a/packages/dom/src/normalize-range.ts b/packages/dom/src/normalize-range.ts
index a4a758e..30c1e37 100644
--- a/packages/dom/src/normalize-range.ts
+++ b/packages/dom/src/normalize-range.ts
@@ -18,7 +18,7 @@
  * under the License.
  */
 
-import { ownerDocument } from "./owner-document";
+import { ownerDocument } from './owner-document';
 
 // TextRange is a Range that guarantees to always have Text nodes as its start
 // and end nodes. To ensure the type remains correct, it also restricts usage
@@ -57,19 +57,18 @@ export interface TextRange extends Range {
 // after). If the document does not contain any text nodes, an error is thrown.
 export function normalizeRange(range: Range, scope?: Range): TextRange {
   const document = ownerDocument(range);
-  const walker = document.createTreeWalker(
-    document,
-    NodeFilter.SHOW_TEXT,
-    {
-      acceptNode(node: Text) {
-        return (!scope || scope.intersectsNode(node))
-          ? NodeFilter.FILTER_ACCEPT
-          : NodeFilter.FILTER_REJECT;
-      },
+  const walker = document.createTreeWalker(document, NodeFilter.SHOW_TEXT, {
+    acceptNode(node: Text) {
+      return !scope || scope.intersectsNode(node)
+        ? NodeFilter.FILTER_ACCEPT
+        : NodeFilter.FILTER_REJECT;
     },
-  );
+  });
 
-  let [ startContainer, startOffset ] = snapBoundaryPointToTextNode(range.startContainer, range.startOffset);
+  let [startContainer, startOffset] = snapBoundaryPointToTextNode(
+    range.startContainer,
+    range.startOffset,
+  );
 
   // If we point at the end of a text node, move to the start of the next one.
   // The step is repeated to skip over empty text nodes.
@@ -82,7 +81,10 @@ export function normalizeRange(range: Range, scope?: Range): TextRange {
   // Set the range’s start; note this might move its end too.
   range.setStart(startContainer, startOffset);
 
-  let [ endContainer, endOffset ] = snapBoundaryPointToTextNode(range.endContainer, range.endOffset);
+  let [endContainer, endOffset] = snapBoundaryPointToTextNode(
+    range.endContainer,
+    range.endOffset,
+  );
 
   // If we point at the start of a text node, move to the end of the previous one.
   // The step is repeated to skip over empty text nodes.
@@ -103,9 +105,11 @@ export function normalizeRange(range: Range, scope?: Range): TextRange {
 // - otherwise the first boundary point after it whose node is a text node, if any;
 // - otherwise, the last boundary point before it whose node is a text node.
 // If the document has no text nodes, it throws an error.
-function snapBoundaryPointToTextNode(node: Node, offset: number): [Text, number] {
-  if (isText(node))
-    return [node, offset];
+function snapBoundaryPointToTextNode(
+  node: Node,
+  offset: number,
+): [Text, number] {
+  if (isText(node)) return [node, offset];
 
   // Find the node at or right after the boundary point.
   let curNode: Node;
@@ -116,26 +120,27 @@ function snapBoundaryPointToTextNode(node: Node, offset: number): [Text, number]
   } else {
     curNode = node;
     while (curNode.nextSibling === null) {
-      if (curNode.parentNode === null) // Boundary point is at end of document
+      if (curNode.parentNode === null)
+        // Boundary point is at end of document
         throw new Error('not implemented'); // TODO
       curNode = curNode.parentNode;
     }
     curNode = curNode.nextSibling;
   }
 
-  if (isText(curNode))
-    return [curNode, 0];
+  if (isText(curNode)) return [curNode, 0];
 
   // Walk to the next text node, or the last if there is none.
-  const document = node.ownerDocument ?? node as Document;
+  const document = node.ownerDocument ?? (node as Document);
   const walker = document.createTreeWalker(document, NodeFilter.SHOW_TEXT);
   walker.currentNode = curNode;
-  if (walker.nextNode() !== null)
+  if (walker.nextNode() !== null) {
     return [walker.currentNode as Text, 0];
-  else if (walker.previousNode() !== null)
+  } else if (walker.previousNode() !== null) {
     return [walker.currentNode as Text, (walker.currentNode as Text).length];
-  else
+  } else {
     throw new Error('Document contains no text nodes.');
+  }
 }
 
 function isText(node: Node): node is Text {
@@ -144,8 +149,8 @@ function isText(node: Node): node is Text {
 
 function isCharacterData(node: Node): node is CharacterData {
   return (
-    node.nodeType === Node.PROCESSING_INSTRUCTION_NODE
-    || node.nodeType === Node.COMMENT_NODE
-    || node.nodeType === Node.TEXT_NODE
+    node.nodeType === Node.PROCESSING_INSTRUCTION_NODE ||
+    node.nodeType === Node.COMMENT_NODE ||
+    node.nodeType === Node.TEXT_NODE
   );
 }
diff --git a/packages/dom/src/range/cartesian.ts b/packages/dom/src/range/cartesian.ts
index 37e9876..060a27b 100644
--- a/packages/dom/src/range/cartesian.ts
+++ b/packages/dom/src/range/cartesian.ts
@@ -76,7 +76,7 @@ export async function* cartesian<T>(
 
     // Synchronously compute and yield tuples of the partial product.
     yield* scratch.reduce(
-      (a, b) => a.flatMap((v) => b.map((w) => [...v, w])),
+      (acc, next) => acc.flatMap((v) => next.map((w) => [...v, w])),
       [[]] as T[][],
     );
   }
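
Editor's note on the reduce call renamed above (a, b became acc, next): it folds a list of arrays into their cartesian product. The sketch below isolates that expression in a synchronous function. The name cartesianSync is hypothetical; the real cartesian() in this file is an async generator that applies the same fold to the partial results gathered so far.

```typescript
// Hypothetical synchronous wrapper around the reduce expression in the diff.
// Starting from a single empty tuple, each step extends every tuple built so
// far (acc) with every value of the next list.
function cartesianSync<T>(lists: T[][]): T[][] {
  return lists.reduce<T[][]>(
    (acc, next) => acc.flatMap((v) => next.map((w) => [...v, w])),
    [[]],
  );
}

const pairs = cartesianSync([
  [1, 2],
  [3, 4],
]);
```

With no input lists the result is a single empty tuple, and one empty input list makes the whole product empty.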
diff --git a/packages/dom/src/seek.ts b/packages/dom/src/seek.ts
index a1f52c8..b27848f 100644
--- a/packages/dom/src/seek.ts
+++ b/packages/dom/src/seek.ts
@@ -18,7 +18,8 @@
  * under the License.
  */
 
-import { Chunk, Chunker, chunkEquals } from "./chunker";
+import type { Chunk, Chunker } from './chunker';
+import { chunkEquals } from './chunker';
 
 const E_END = 'Iterator exhausted before seek ended.';
 
@@ -30,16 +31,20 @@ export interface Seeker<T extends Iterable<any> = string> {
   seekTo(target: number): void;
 }
 
-export interface ChunkSeeker<TChunk extends Chunk<any>, T extends Iterable<any> = string> extends Seeker<T> {
+export interface ChunkSeeker<
+  TChunk extends Chunk<any>,
+  T extends Iterable<any> = string
+> extends Seeker<T> {
   readonly currentChunk: TChunk;
   readonly offsetInChunk: number;
   seekToChunk(chunk: TChunk, offset?: number): void;
   readToChunk(chunk: TChunk, offset?: number): T;
 }
 
-export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TChunk> {
+export class TextSeeker<TChunk extends Chunk<string>>
+  implements ChunkSeeker<TChunk> {
   // The chunk containing our current text position.
-  get currentChunk() {
+  get currentChunk(): TChunk {
     return this.chunker.currentChunk;
   }
 
@@ -50,57 +55,71 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
   offsetInChunk = 0;
 
   // The current text position (measured in code units)
-  get position() { return this.currentChunkPosition + this.offsetInChunk; }
+  get position(): number {
+    return this.currentChunkPosition + this.offsetInChunk;
+  }
 
   constructor(protected chunker: Chunker<TChunk>) {
     // Walk to the start of the first non-empty chunk inside the scope.
     this.seekTo(0);
   }
 
-  read(length: number, roundUp: boolean = false) {
+  read(length: number, roundUp = false): string {
     return this.readTo(this.position + length, roundUp);
   }
 
-  readTo(target: number, roundUp: boolean = false) {
+  readTo(target: number, roundUp = false): string {
     return this._readOrSeekTo(true, target, roundUp);
   }
 
-  seekBy(length: number) {
+  seekBy(length: number): void {
     this.seekTo(this.position + length);
   }
 
-  seekTo(target: number) {
+  seekTo(target: number): void {
     this._readOrSeekTo(false, target);
   }
 
-  seekToChunk(target: TChunk, offset: number = 0) {
+  seekToChunk(target: TChunk, offset = 0): void {
     this._readOrSeekToChunk(false, target, offset);
   }
 
-  readToChunk(target: TChunk, offset: number = 0): string {
+  readToChunk(target: TChunk, offset = 0): string {
     return this._readOrSeekToChunk(true, target, offset);
   }
 
-  private _readOrSeekToChunk(read: true, target: TChunk, offset?: number): string
-  private _readOrSeekToChunk(read: false, target: TChunk, offset?: number): void
-  private _readOrSeekToChunk(read: boolean, target: TChunk, offset: number = 0): string | void {
+  private _readOrSeekToChunk(
+    read: true,
+    target: TChunk,
+    offset?: number,
+  ): string;
+  private _readOrSeekToChunk(
+    read: false,
+    target: TChunk,
+    offset?: number,
+  ): void;
+  private _readOrSeekToChunk(
+    read: boolean,
+    target: TChunk,
+    offset = 0,
+  ): string | void {
     const oldPosition = this.position;
     let result = '';
 
     // Walk to the requested chunk.
-    if (!this.chunker.precedesCurrentChunk(target)) { // Search forwards.
+    if (!this.chunker.precedesCurrentChunk(target)) {
+      // Search forwards.
       while (!chunkEquals(this.currentChunk, target)) {
         const [data, nextChunk] = this._readToNextChunk();
         if (read) result += data;
-        if (nextChunk === null)
-          throw new RangeError(E_END);
+        if (nextChunk === null) throw new RangeError(E_END);
       }
-    } else { // Search backwards.
+    } else {
+      // Search backwards.
       while (!chunkEquals(this.currentChunk, target)) {
         const [data, previousChunk] = this._readToPreviousChunk();
         if (read) result = data + result;
-        if (previousChunk === null)
-          throw new RangeError(E_END);
+        if (previousChunk === null) throw new RangeError(E_END);
       }
     }
 
@@ -114,8 +133,7 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
       if (targetPosition >= this.position) {
         // Read further until the target.
         result += this.readTo(targetPosition);
-      }
-      else if (targetPosition >= oldPosition) {
+      } else if (targetPosition >= oldPosition) {
         // We passed by our target position: step back.
         this.seekTo(targetPosition);
         result = result.slice(0, targetPosition - oldPosition);
@@ -128,14 +146,20 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
     }
   }
 
-  private _readOrSeekTo(read: true, target: number, roundUp?: boolean): string
-  private _readOrSeekTo(read: false, target: number, roundUp?: boolean): void
-  private _readOrSeekTo(read: boolean, target: number, roundUp: boolean = false): string | void {
+  private _readOrSeekTo(read: true, target: number, roundUp?: boolean): string;
+  private _readOrSeekTo(read: false, target: number, roundUp?: boolean): void;
+  private _readOrSeekTo(
+    read: boolean,
+    target: number,
+    roundUp = false,
+  ): string | void {
     let result = '';
 
     if (this.position <= target) {
       while (true) {
-        if (this.currentChunkPosition + this.currentChunk.data.length <= target) {
+        const endOfChunk =
+          this.currentChunkPosition + this.currentChunk.data.length;
+        if (endOfChunk <= target) {
           // The target is beyond the current chunk.
           // (we use < not ≤: if the target is *at* the end of the chunk, possibly
           // because the current chunk is empty, we prefer to take the next chunk)
@@ -143,15 +167,19 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
           const [data, nextChunk] = this._readToNextChunk();
           if (read) result += data;
           if (nextChunk === null) {
-            if (this.position === target)
-              break;
-            else
-              throw new RangeError(E_END);
+            if (this.position === target) break;
+            else throw new RangeError(E_END);
           }
         } else {
           // The target is within the current chunk.
-          const newOffset = roundUp ? this.currentChunk.data.length : target - this.currentChunkPosition;
-          if (read) result += this.currentChunk.data.substring(this.offsetInChunk, newOffset);
+          const newOffset = roundUp
+            ? this.currentChunk.data.length
+            : target - this.currentChunkPosition;
+          if (read)
+            result += this.currentChunk.data.substring(
+              this.offsetInChunk,
+              newOffset,
+            );
           this.offsetInChunk = newOffset;
 
           // If we finish end at the end of the chunk, seek to the start of the next non-empty node.
@@ -161,19 +189,22 @@ export class TextSeeker<TChunk extends Chunk<string>> implements ChunkSeeker<TCh
           break;
         }
       }
-    } else { // Similar to the if-block, but moving backward in the text.
+    } else {
+      // Similar to the if-block, but moving backward in the text.
       while (this.position > target) {
         if (this.currentChunkPosition <= target) {
           // The target is within the current chunk.
           const newOffset = roundUp ? 0 : target - this.currentChunkPosition;
-          if (read) result = this.currentChunk.data.substring(newOffset, this.offsetInChunk) + result;
+          if (read)
+            result =
+              this.currentChunk.data.substring(newOffset, this.offsetInChunk) +
+              result;
           this.offsetInChunk = newOffset;
           break;
         } else {
           const [data, previousChunk] = this._readToPreviousChunk();
           if (read) result = data + result;
-          if (previousChunk === null)
-            throw new RangeError(E_END);
+          if (previousChunk === null) throw new RangeError(E_END);
         }
       }
     }
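
Editor's note on the forward branch of TextSeeker._readOrSeekTo: the loop compares the end of the current chunk with the target and either consumes the rest of the chunk or stops inside it. The sketch below reproduces only that forward logic over a plain array of strings. The class StringChunkSeeker and its fields are hypothetical simplifications; the real class works through the Chunker interface and also seeks backward.

```typescript
const E_END = 'Iterator exhausted before seek ended.';

// Hypothetical, forward-only analogue of TextSeeker over an array of strings.
class StringChunkSeeker {
  private chunkIndex = 0;
  private currentChunkPosition = 0; // position of the current chunk's start
  offsetInChunk = 0;

  constructor(private readonly chunks: string[]) {
    if (chunks.length === 0) throw new RangeError('Need at least one chunk.');
  }

  get currentChunk(): string {
    return this.chunks[this.chunkIndex];
  }

  // The current text position, measured in code units.
  get position(): number {
    return this.currentChunkPosition + this.offsetInChunk;
  }

  readTo(target: number): string {
    if (target < this.position)
      throw new RangeError('This sketch only reads forward.');
    let result = '';
    while (true) {
      const endOfChunk = this.currentChunkPosition + this.currentChunk.length;
      if (endOfChunk <= target) {
        // The target is at or beyond the end of this chunk: take the rest.
        result += this.currentChunk.substring(this.offsetInChunk);
        if (this.chunkIndex + 1 < this.chunks.length) {
          this.currentChunkPosition = endOfChunk;
          this.chunkIndex += 1;
          this.offsetInChunk = 0;
        } else {
          // No next chunk: acceptable only if we landed exactly on target.
          this.offsetInChunk = this.currentChunk.length;
          if (this.position === target) break;
          throw new RangeError(E_END);
        }
      } else {
        // The target is within the current chunk.
        const newOffset = target - this.currentChunkPosition;
        result += this.currentChunk.substring(this.offsetInChunk, newOffset);
        this.offsetInChunk = newOffset;
        break;
      }
    }
    return result;
  }
}
```

Because the comparison is endOfChunk <= target, a target that falls exactly on a chunk boundary leaves the seeker at offset 0 of the next chunk rather than at the end of the previous one, as the comment in the diff describes.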
diff --git a/packages/dom/src/text-position/describe.ts b/packages/dom/src/text-position/describe.ts
index d4099a9..5f7f9a3 100644
--- a/packages/dom/src/text-position/describe.ts
+++ b/packages/dom/src/text-position/describe.ts
@@ -19,9 +19,10 @@
  */
 
 import type { TextPositionSelector } from '@annotator/selector';
-import { ownerDocument } from '../owner-document';
-import { Chunk, Chunker, ChunkRange, TextNodeChunker } from '../chunker';
+import type { Chunk, Chunker, ChunkRange } from '../chunker';
+import { TextNodeChunker } from '../chunker';
 import { CodePointSeeker } from '../code-point-seeker';
+import { ownerDocument } from '../owner-document';
 import { TextSeeker } from '../seek';
 
 export async function describeTextPosition(
diff --git a/packages/dom/src/text-position/match.ts b/packages/dom/src/text-position/match.ts
index cc8044e..becd957 100644
--- a/packages/dom/src/text-position/match.ts
+++ b/packages/dom/src/text-position/match.ts
@@ -19,9 +19,10 @@
  */
 
 import type { Matcher, TextPositionSelector } from '@annotator/selector';
-import { TextSeeker } from '../seek';
+import type { Chunk, ChunkRange, Chunker } from '../chunker';
+import { TextNodeChunker } from '../chunker';
 import { CodePointSeeker } from '../code-point-seeker';
-import { Chunk, ChunkRange, TextNodeChunker, Chunker } from '../chunker';
+import { TextSeeker } from '../seek';
 
 export function createTextPositionSelectorMatcher(
   selector: TextPositionSelector,
@@ -41,10 +42,14 @@ export function createTextPositionSelectorMatcher(
 
 export function abstractTextPositionSelectorMatcher(
   selector: TextPositionSelector,
-): <TChunk extends Chunk<any>>(scope: Chunker<TChunk>) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
+): <TChunk extends Chunk<any>>(
+  scope: Chunker<TChunk>,
+) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
   const { start, end } = selector;
 
-  return async function* matchAll<TChunk extends Chunk<string>>(textChunks: Chunker<TChunk>) {
+  return async function* matchAll<TChunk extends Chunk<string>>(
+    textChunks: Chunker<TChunk>,
+  ) {
     const codeUnitSeeker = new TextSeeker(textChunks);
     const codePointSeeker = new CodePointSeeker(codeUnitSeeker);
 
@@ -56,5 +61,5 @@ export function abstractTextPositionSelectorMatcher(
     const endIndex = codeUnitSeeker.offsetInChunk;
 
     yield { startChunk, startIndex, endChunk, endIndex };
-  }
+  };
 }
diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 756df1e..3dfa45e 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -19,10 +19,12 @@
  */
 
 import type { TextQuoteSelector } from '@annotator/selector';
+import type { Chunk, Chunker, ChunkRange } from '../chunker';
+import { TextNodeChunker, chunkRangeEquals } from '../chunker';
 import { ownerDocument } from '../owner-document';
-import { Chunk, Chunker, ChunkRange, TextNodeChunker, chunkRangeEquals } from '../chunker';
+import type { Seeker } from '../seek';
+import { TextSeeker } from '../seek';
 import { abstractTextQuoteSelectorMatcher } from '.';
-import { TextSeeker, Seeker } from '../seek';
 
 export async function describeTextQuote(
   range: Range,
@@ -67,9 +69,11 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
       exact,
       prefix,
       suffix,
-    }
+    };
 
-    const matches = abstractTextQuoteSelectorMatcher(tentativeSelector)(scope());
+    const matches = abstractTextQuoteSelectorMatcher(tentativeSelector)(
+      scope(),
+    );
     let nextMatch = await matches.next();
 
     // If this match is the intended one, no need to act.
@@ -95,21 +99,32 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
 
     // Count how many characters we’d need as a prefix to disqualify this match.
     seeker1.seekToChunk(target.startChunk, target.startIndex - prefix.length);
-    seeker2.seekToChunk(unintendedMatch.startChunk, unintendedMatch.startIndex - prefix.length);
+    seeker2.seekToChunk(
+      unintendedMatch.startChunk,
+      unintendedMatch.startIndex - prefix.length,
+    );
     const extraPrefix = readUntilDifferent(seeker1, seeker2, true);
 
     // Count how many characters we’d need as a suffix to disqualify this match.
     seeker1.seekToChunk(target.endChunk, target.endIndex + suffix.length);
-    seeker2.seekToChunk(unintendedMatch.endChunk, unintendedMatch.endIndex + suffix.length);
+    seeker2.seekToChunk(
+      unintendedMatch.endChunk,
+      unintendedMatch.endIndex + suffix.length,
+    );
     const extraSuffix = readUntilDifferent(seeker1, seeker2, false);
 
     // Use either the prefix or suffix, whichever is shortest.
-    if (extraPrefix !== undefined && (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)) {
+    if (
+      extraPrefix !== undefined &&
+      (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)
+    ) {
       prefix = extraPrefix + prefix;
     } else if (extraSuffix !== undefined) {
       suffix = suffix + extraSuffix;
     } else {
-      throw new Error('Target cannot be disambiguated; how could that have happened‽');
+      throw new Error(
+        'Target cannot be disambiguated; how could that have happened‽',
+      );
     }
   }
 }
@@ -127,18 +142,16 @@ function readUntilDifferent(
     } catch (err) {
       return undefined; // Start/end of text reached: cannot expand result.
     }
-    result = reverse
-      ? nextCharacter + result
-      : result + nextCharacter;
+    result = reverse ? nextCharacter + result : result + nextCharacter;
 
     // Check if the newly added character makes the result differ from the second seeker.
     let comparisonCharacter: string | undefined;
     try {
       comparisonCharacter = seeker2.read(reverse ? -1 : 1);
-    } catch (err) { // A RangeError would merely mean seeker2 is exhausted.
+    } catch (err) {
+      // A RangeError would merely mean seeker2 is exhausted.
       if (!(err instanceof RangeError)) throw err;
     }
-    if (nextCharacter !== comparisonCharacter)
-      return result;
+    if (nextCharacter !== comparisonCharacter) return result;
   }
 }
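
Editor's note on readUntilDifferent: it extends the prefix or suffix one character at a time until the intended match and the unintended match diverge. The sketch below shows the forward (suffix) case on a single plain string, using indices in place of the two seekers. The function extraSuffixNeeded and its parameters are hypothetical and exist only to illustrate the loop; the real code reads through Seeker objects and treats a RangeError from the second seeker as "exhausted".

```typescript
// Hypothetical string-based analogue of readUntilDifferent (forward case).
// targetEnd: index just after the intended match (plus any suffix so far).
// otherEnd: the same index for the unintended match.
function extraSuffixNeeded(
  text: string,
  targetEnd: number,
  otherEnd: number,
): string | undefined {
  let result = '';
  let i = 0;
  while (true) {
    const nextCharacter: string | undefined = text[targetEnd + i];
    // End of text reached: the suffix cannot be expanded any further.
    if (nextCharacter === undefined) return undefined;
    result += nextCharacter;

    // Past the end of the text this is undefined, which always differs.
    const comparisonCharacter: string | undefined = text[otherEnd + i];
    if (nextCharacter !== comparisonCharacter) return result;
    i += 1;
  }
}

// 'the cat' occurs at indices 0..7 and 13..20.
const text = 'the cat sat; the cat ran';
const suffixForFirst = extraSuffixNeeded(text, 7, 20);
const suffixForSecond = extraSuffixNeeded(text, 20, 7);
```

In this example two extra characters are needed either way, since both occurrences are followed by a space before they diverge.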
diff --git a/packages/dom/src/text-quote/match.ts b/packages/dom/src/text-quote/match.ts
index dd69227..9e09990 100644
--- a/packages/dom/src/text-quote/match.ts
+++ b/packages/dom/src/text-quote/match.ts
@@ -19,7 +19,8 @@
  */
 
 import type { Matcher, TextQuoteSelector } from '@annotator/selector';
-import { Chunk, Chunker, ChunkRange, TextNodeChunker, EmptyScopeError } from '../chunker';
+import type { Chunk, Chunker, ChunkRange } from '../chunker';
+import { TextNodeChunker, EmptyScopeError } from '../chunker';
 
 export function createTextQuoteSelectorMatcher(
   selector: TextQuoteSelector,
@@ -31,22 +32,25 @@ export function createTextQuoteSelectorMatcher(
     try {
       textChunks = new TextNodeChunker(scope);
     } catch (err) {
-      if (err instanceof EmptyScopeError)
-        return; // An empty range contains no matches.
-      else
-        throw err;
+      if (err instanceof EmptyScopeError) return;
+      // An empty range contains no matches.
+      else throw err;
     }
 
     for await (const abstractMatch of abstractMatcher(textChunks)) {
       yield textChunks.chunkRangeToRange(abstractMatch);
     }
-  }
+  };
 }
 
 export function abstractTextQuoteSelectorMatcher(
   selector: TextQuoteSelector,
-): <TChunk extends Chunk<any>>(scope: Chunker<TChunk>) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
-  return async function* matchAll<TChunk extends Chunk<string>>(textChunks: Chunker<TChunk>) {
+): <TChunk extends Chunk<any>>(
+  scope: Chunker<TChunk>,
+) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
+  return async function* matchAll<TChunk extends Chunk<string>>(
+    textChunks: Chunker<TChunk>,
+  ) {
     const exact = selector.exact;
     const prefix = selector.prefix || '';
     const suffix = selector.suffix || '';
@@ -78,7 +82,8 @@ export function abstractTextQuoteSelectorMatcher(
 
         // If the current chunk contains the start and/or end of the match, record these.
         if (partialMatch.endChunk === undefined) {
-          const charactersUntilMatchEnd = prefix.length + exact.length - charactersMatched;
+          const charactersUntilMatchEnd =
+            prefix.length + exact.length - charactersMatched;
           if (charactersUntilMatchEnd <= chunkValue.length) {
             partialMatch.endChunk = chunk;
             partialMatch.endIndex = charactersUntilMatchEnd;
@@ -87,20 +92,29 @@ export function abstractTextQuoteSelectorMatcher(
         if (partialMatch.startChunk === undefined) {
           const charactersUntilMatchStart = prefix.length - charactersMatched;
           if (
-            charactersUntilMatchStart < chunkValue.length
-            || partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
+            charactersUntilMatchStart < chunkValue.length ||
+            partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
           ) {
             partialMatch.startChunk = chunk;
             partialMatch.startIndex = charactersUntilMatchStart;
           }
         }
 
-        const charactersUntilSuffixEnd = searchPattern.length - charactersMatched;
+        const charactersUntilSuffixEnd =
+          searchPattern.length - charactersMatched;
         if (charactersUntilSuffixEnd <= chunkValue.length) {
-          if (chunkValue.startsWith(searchPattern.substring(charactersMatched))) {
+          if (
+            chunkValue.startsWith(searchPattern.substring(charactersMatched))
+          ) {
             yield partialMatch as ChunkRange<TChunk>; // all fields are certainly defined now.
           }
-        } else if (chunkValue === searchPattern.substring(charactersMatched, charactersMatched + chunkValue.length)) {
+        } else if (
+          chunkValue ===
+          searchPattern.substring(
+            charactersMatched,
+            charactersMatched + chunkValue.length,
+          )
+        ) {
           // The chunk is too short to complete the match; comparison has to be completed in subsequent chunks.
           partialMatch.charactersMatched += chunkValue.length;
           remainingPartialMatches.push(partialMatch);
@@ -112,12 +126,19 @@ export function abstractTextQuoteSelectorMatcher(
       if (searchPattern.length <= chunkValue.length) {
         let fromIndex = 0;
         while (fromIndex <= chunkValue.length) {
-          const patternStartIndex = chunkValue.indexOf(searchPattern, fromIndex);
+          const patternStartIndex = chunkValue.indexOf(
+            searchPattern,
+            fromIndex,
+          );
           if (patternStartIndex === -1) break;
           fromIndex = patternStartIndex + 1;
 
           // Handle edge case: an empty searchPattern would already have been yielded at the end of the last chunk.
-          if (patternStartIndex === 0 && searchPattern.length === 0 && !isFirstChunk)
+          if (
+            patternStartIndex === 0 &&
+            searchPattern.length === 0 &&
+            !isFirstChunk
+          )
             continue;
 
           yield {
@@ -131,11 +152,15 @@ export function abstractTextQuoteSelectorMatcher(
 
       // 3. Check if this chunk ends with a partial match (or even multiple partial matches).
       let newPartialMatches: number[] = [];
-      const searchStartPoint = Math.max(chunkValue.length - searchPattern.length + 1, 0);
+      const searchStartPoint = Math.max(
+        chunkValue.length - searchPattern.length + 1,
+        0,
+      );
       for (let i = searchStartPoint; i < chunkValue.length; i++) {
         const character = chunkValue[i];
         newPartialMatches = newPartialMatches.filter(
-          partialMatchStartIndex => (character === searchPattern[i - partialMatchStartIndex])
+          (partialMatchStartIndex) =>
+            character === searchPattern[i - partialMatchStartIndex],
         );
         if (character === searchPattern[0]) newPartialMatches.push(i);
       }
@@ -146,11 +171,12 @@ export function abstractTextQuoteSelectorMatcher(
         };
         if (charactersMatched >= prefix.length + exact.length) {
           partialMatch.endChunk = chunk;
-          partialMatch.endIndex = partialMatchStartIndex + prefix.length + exact.length;
+          partialMatch.endIndex =
+            partialMatchStartIndex + prefix.length + exact.length;
         }
         if (
-          charactersMatched > prefix.length
-          || partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
+          charactersMatched > prefix.length ||
+          partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
         ) {
           partialMatch.startChunk = chunk;
           partialMatch.startIndex = partialMatchStartIndex + prefix.length;
diff --git a/packages/dom/test/text-position/describe.test.ts b/packages/dom/test/text-position/describe.test.ts
index 2eefd38..9bc9957 100644
--- a/packages/dom/test/text-position/describe.test.ts
+++ b/packages/dom/test/text-position/describe.test.ts
@@ -35,7 +35,10 @@ describe('createTextPositionSelectorMatcher', () => {
         const doc = domParser.parseFromString(html, 'text/html');
         const scope = doc.createRange();
         scope.selectNodeContents(doc);
-        const result = await describeTextPosition(hydrateRange(range, doc), scope);
+        const result = await describeTextPosition(
+          hydrateRange(range, doc),
+          scope,
+        );
         assert.deepEqual(result, selector);
       });
     }
diff --git a/packages/dom/test/text-position/match-cases.ts b/packages/dom/test/text-position/match-cases.ts
index 0916446..6152a1c 100644
--- a/packages/dom/test/text-position/match-cases.ts
+++ b/packages/dom/test/text-position/match-cases.ts
@@ -109,8 +109,7 @@ export const testCases: {
     ],
   },
   'text inside <head>': {
-    html:
-      '<head><title>l😃rem ipsum dolor amet</title></head><b>yada yada</b>',
+    html: '<head><title>l😃rem ipsum dolor amet</title></head><b>yada yada</b>',
     selector: {
       type: 'TextPositionSelector',
       start: 18,
@@ -132,11 +131,13 @@ export const testCases: {
       start: 3,
       end: 3,
     },
-    expected: [{
-      startContainerXPath: '//b/text()',
-      startOffset: 4,
-      endContainerXPath: '//b/text()',
-      endOffset: 4,
-    }],
+    expected: [
+      {
+        startContainerXPath: '//b/text()',
+        startOffset: 4,
+        endContainerXPath: '//b/text()',
+        endOffset: 4,
+      },
+    ],
   },
 };
diff --git a/packages/dom/test/text-position/match.test.ts b/packages/dom/test/text-position/match.test.ts
index 1acaed0..ac9c31f 100644
--- a/packages/dom/test/text-position/match.test.ts
+++ b/packages/dom/test/text-position/match.test.ts
@@ -83,7 +83,6 @@ describe('createTextPositionSelectorMatcher', () => {
     // console.log([...textNode.parentNode.childNodes].map(node => node.textContent))
     // → [ '', 'l😃rem ipsum ', '', 'dolor', '', ' am', '', 'et yada yada', '' ]
 
-
     await testMatcher(doc, scope, selector, [
       {
         startContainerXPath: '//b/text()[4]', // "dolor"
diff --git a/packages/dom/test/text-quote/match-cases.ts b/packages/dom/test/text-quote/match-cases.ts
index 3145b51..5d2866d 100644
--- a/packages/dom/test/text-quote/match-cases.ts
+++ b/packages/dom/test/text-quote/match-cases.ts
@@ -307,45 +307,44 @@ export const testCases: {
       type: 'TextQuoteSelector',
       exact: '',
     },
-    expected:
-      [
-        {
-          startContainerXPath: '//b/text()[1]',
-          startOffset: 0,
-          endContainerXPath: '//b/text()[1]',
-          endOffset: 0,
-        },
-        {
-          startContainerXPath: '//b/text()[1]',
-          startOffset: 1,
-          endContainerXPath: '//b/text()[1]',
-          endOffset: 1,
-        },
-        {
-          startContainerXPath: '//i/text()',
-          startOffset: 1,
-          endContainerXPath: '//i/text()',
-          endOffset: 1,
-        },
-        {
-          startContainerXPath: '//i/text()',
-          startOffset: 2,
-          endContainerXPath: '//i/text()',
-          endOffset: 2,
-        },
-        {
-          startContainerXPath: '//b/text()[2]',
-          startOffset: 1,
-          endContainerXPath: '//b/text()[2]',
-          endOffset: 1,
-        },
-        {
-          startContainerXPath: '//b/text()[2]',
-          startOffset: 2,
-          endContainerXPath: '//b/text()[2]',
-          endOffset: 2,
-        },
-      ],
+    expected: [
+      {
+        startContainerXPath: '//b/text()[1]',
+        startOffset: 0,
+        endContainerXPath: '//b/text()[1]',
+        endOffset: 0,
+      },
+      {
+        startContainerXPath: '//b/text()[1]',
+        startOffset: 1,
+        endContainerXPath: '//b/text()[1]',
+        endOffset: 1,
+      },
+      {
+        startContainerXPath: '//i/text()',
+        startOffset: 1,
+        endContainerXPath: '//i/text()',
+        endOffset: 1,
+      },
+      {
+        startContainerXPath: '//i/text()',
+        startOffset: 2,
+        endContainerXPath: '//i/text()',
+        endOffset: 2,
+      },
+      {
+        startContainerXPath: '//b/text()[2]',
+        startOffset: 1,
+        endContainerXPath: '//b/text()[2]',
+        endOffset: 1,
+      },
+      {
+        startContainerXPath: '//b/text()[2]',
+        startOffset: 2,
+        endContainerXPath: '//b/text()[2]',
+        endOffset: 2,
+      },
+    ],
   },
   'empty quote, with prefix': {
     html: '<b>lorem ipsum dolor amet yada yada</b>',
diff --git a/packages/dom/test/text-quote/match.test.ts b/packages/dom/test/text-quote/match.test.ts
index 8a68cec..97f5c3c 100644
--- a/packages/dom/test/text-quote/match.test.ts
+++ b/packages/dom/test/text-quote/match.test.ts
@@ -194,7 +194,7 @@ async function testMatcher(
   const matcher = createTextQuoteSelectorMatcher(selector);
   const matches = [];
   for await (const value of matcher(scope)) matches.push(value);
-  assert.equal(matches.length, expected.length, "Wrong number of matches.");
+  assert.equal(matches.length, expected.length, 'Wrong number of matches.');
   matches.forEach((match, i) => {
     const expectedRange = expected[i];
     const expectedStartContainer = evaluateXPath(
diff --git a/packages/selector/src/index.ts b/packages/selector/src/index.ts
index ffab70b..73caa05 100644
--- a/packages/selector/src/index.ts
+++ b/packages/selector/src/index.ts
@@ -21,7 +21,12 @@
 import type { Matcher, Selector } from './types';
 
 export type { Matcher, Selector } from './types';
-export type { CssSelector, RangeSelector, TextPositionSelector, TextQuoteSelector } from './types';
+export type {
+  CssSelector,
+  RangeSelector,
+  TextPositionSelector,
+  TextQuoteSelector,
+} from './types';
 
 export function makeRefinable<
   // Any subtype of Selector can be made refinable; but note we limit the value


[incubator-annotator] 08/14: Refactor pre/suffix disambiguation

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 91d245958ef43d092ecbd2ad777af9794c40c516
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 12:59:29 2020 +0100

    Refactor pre/suffix disambiguation
---
 packages/dom/src/text-quote/describe.ts | 79 ++++++++++++++-------------------
 1 file changed, 34 insertions(+), 45 deletions(-)

diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 688089f..ae79ad0 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -22,7 +22,7 @@ import type { TextQuoteSelector } from '@annotator/selector';
 import { ownerDocument } from '../owner-document';
 import { Chunk, Chunker, ChunkRange, TextNodeChunker, chunkRangeEquals } from '../chunker';
 import { abstractTextQuoteSelectorMatcher } from '.';
-import { TextSeeker } from '../seek';
+import { TextSeeker, Seeker } from '../seek';
 
 export async function describeTextQuote(
   range: Range,
@@ -94,54 +94,14 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
     // Count how many characters we’d need as a prefix to disqualify this match.
     seeker1.seekToChunk(target.startChunk, target.startIndex - prefix.length);
     seeker2.seekToChunk(unintendedMatch.startChunk, unintendedMatch.startIndex - prefix.length);
-    let sufficientPrefix: string | undefined = prefix;
-    while (true) {
-      let previousCharacter: string;
-      try {
-        previousCharacter = seeker1.read(-1);
-      } catch (err) {
-        sufficientPrefix = undefined; // Start of text reached.
-        break;
-      }
-      sufficientPrefix = previousCharacter + sufficientPrefix;
-
-      // Break if the newly added character makes the prefix unambiguous.
-      try {
-        const unintendedMatchPreviousCharacter = seeker2.read(-1);
-        if (previousCharacter !== unintendedMatchPreviousCharacter) break;
-      } catch (err) {
-        if (err instanceof RangeError)
-          break;
-        else
-          throw err;
-      }
-    }
+    const extraPrefix = readUntilDifferent(seeker1, seeker2, true);
+    let sufficientPrefix = extraPrefix !== undefined ? extraPrefix + prefix : undefined;
 
     // Count how many characters we’d need as a suffix to disqualify this match.
     seeker1.seekToChunk(target.endChunk, target.endIndex + suffix.length);
     seeker2.seekToChunk(unintendedMatch.endChunk, unintendedMatch.endIndex + suffix.length);
-    let sufficientSuffix: string | undefined = suffix;
-    while (true) {
-      let nextCharacter: string;
-      try {
-        nextCharacter = seeker1.read(1);
-      } catch (err) {
-        sufficientSuffix = undefined; // End of text reached.
-        break;
-      }
-      sufficientSuffix += nextCharacter;
-
-      // Break if the newly added character makes the suffix unambiguous.
-      try {
-        const unintendedMatchNextCharacter = seeker2.read(1);
-        if (nextCharacter !== unintendedMatchNextCharacter) break;
-      } catch (err) {
-        if (err instanceof RangeError)
-          break;
-        else
-          throw err;
-      }
-    }
+    const extraSuffix = readUntilDifferent(seeker1, seeker2, false);
+    let sufficientSuffix = extraSuffix !== undefined ? suffix + extraSuffix : undefined;
 
     // Use either the prefix or suffix, whichever is shortest.
     if (sufficientPrefix !== undefined && (sufficientSuffix === undefined || sufficientPrefix.length <= sufficientSuffix.length)) {
@@ -154,3 +114,32 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
     }
   }
 }
+
+function readUntilDifferent(
+  seeker1: Seeker,
+  seeker2: Seeker,
+  reverse: boolean,
+): string | undefined {
+  let result = '';
+  while (true) {
+    let nextCharacter: string;
+    try {
+      nextCharacter = seeker1.read(reverse ? -1 : 1);
+    } catch (err) {
+      return undefined; // Start/end of text reached: cannot expand result.
+    }
+    result = reverse
+      ? nextCharacter + result
+      : result + nextCharacter;
+
+    // Check if the newly added character makes the result differ from the second seeker.
+    let comparisonCharacter: string | undefined;
+    try {
+      comparisonCharacter = seeker2.read(reverse ? -1 : 1);
+    } catch (err) { // A RangeError would merely mean seeker2 is exhausted.
+      if (!(err instanceof RangeError)) throw err;
+    }
+    if (nextCharacter !== comparisonCharacter)
+      return result;
+  }
+}
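
Editor's note: the helper introduced by this commit can be tried in isolation. The sketch below is an assumption-laden illustration, not the project's code: StringSeeker is a hypothetical stand-in for the Seeker type from '../seek', reading from a plain string and throwing a RangeError at the text boundaries, which is the behaviour the diff above appears to rely on.

```typescript
// Hypothetical stand-in for the Seeker from '../seek': it walks a plain string.
class StringSeeker {
  private readonly text: string;
  position: number;

  constructor(text: string, position: number) {
    this.text = text;
    this.position = position;
  }

  // Read forward (length > 0) or backward (length < 0) and move the position.
  // Throws a RangeError if the read would pass the start or end of the text.
  read(length: number): string {
    const target = this.position + length;
    if (target < 0 || target > this.text.length) {
      throw new RangeError('Read beyond the text boundaries.');
    }
    const result = this.text.substring(
      Math.min(this.position, target),
      Math.max(this.position, target),
    );
    this.position = target;
    return result;
  }
}

// Same logic as the readUntilDifferent in the diff, typed against StringSeeker.
function readUntilDifferent(
  seeker1: StringSeeker,
  seeker2: StringSeeker,
  reverse: boolean,
): string | undefined {
  let result = '';
  while (true) {
    let nextCharacter: string;
    try {
      nextCharacter = seeker1.read(reverse ? -1 : 1);
    } catch {
      return undefined; // Start/end of text reached: cannot expand result.
    }
    result = reverse ? nextCharacter + result : result + nextCharacter;

    // An exhausted seeker2 leaves comparisonCharacter undefined, which differs
    // from any real character and therefore ends the loop.
    let comparisonCharacter: string | undefined;
    try {
      comparisonCharacter = seeker2.read(reverse ? -1 : 1);
    } catch (err) {
      if (!(err instanceof RangeError)) throw err;
    }
    if (nextCharacter !== comparisonCharacter) return result;
  }
}

// Two occurrences of "cat": at index 2 (the target) and at index 15.
const sampleText = 'a cat sat; the cat ran';
const extraPrefix = readUntilDifferent(
  new StringSeeker(sampleText, 2),
  new StringSeeker(sampleText, 15),
  true,
);
console.log(JSON.stringify(extraPrefix));
```

In this sample the characters just before both occurrences are spaces, so one extra character is not enough; the second character read backwards ('a' versus 'e') is what tells the two matches apart.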


[incubator-annotator] 13/14: Move abstract code into @annotator/selector

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 6ee4ff8992c68366d8c0d3120945a94a45298d88
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 16:40:19 2020 +0100

    Move abstract code into @annotator/selector
---
 packages/dom/src/chunker.ts                        |  51 +------
 packages/dom/src/text-position/describe.ts         |  22 +--
 packages/dom/src/text-position/match.ts            |  28 +---
 packages/dom/src/text-quote/describe.ts            | 115 +-------------
 packages/dom/src/text-quote/match.ts               | 148 +-----------------
 packages/selector/src/index.ts                     |   1 +
 packages/selector/src/text/chunker.ts              |  69 +++++++++
 .../src => selector/src/text}/code-point-seeker.ts |   0
 .../src/text/describe-text-position.ts}            |  34 +----
 packages/selector/src/text/describe-text-quote.ts  | 134 ++++++++++++++++
 packages/selector/src/text/index.ts                |   5 +
 .../src/text/match-text-position.ts}               |  27 +---
 packages/selector/src/text/match-text-quote.ts     | 168 +++++++++++++++++++++
 packages/{dom/src => selector/src/text}/seek.ts    |   0
 14 files changed, 392 insertions(+), 410 deletions(-)

diff --git a/packages/dom/src/chunker.ts b/packages/dom/src/chunker.ts
index 8c56924..bc5c2c1 100644
--- a/packages/dom/src/chunker.ts
+++ b/packages/dom/src/chunker.ts
@@ -18,59 +18,10 @@
  * under the License.
  */
 
+import type { Chunk, Chunker, ChunkRange } from '@annotator/selector';
 import { normalizeRange } from './normalize-range';
 import { ownerDocument } from './owner-document';
 
-// A Chunk represents a fragment (typically a string) of some document.
-// Subclasses can add further attributes to map the chunk to its position in the
-// data structure it came from (e.g. a DOM node).
-export interface Chunk<TData> {
-  readonly data: TData;
-  equals?(otherChunk: this): boolean;
-}
-
-export interface ChunkRange<TChunk extends Chunk<any>> {
-  startChunk: TChunk;
-  startIndex: number;
-  endChunk: TChunk;
-  endIndex: number;
-}
-
-export function chunkEquals(chunk1: Chunk<any>, chunk2: Chunk<any>): boolean {
-  return chunk1.equals ? chunk1.equals(chunk2) : chunk1 === chunk2;
-}
-
-export function chunkRangeEquals(
-  range1: ChunkRange<any>,
-  range2: ChunkRange<any>,
-): boolean {
-  return (
-    chunkEquals(range1.startChunk, range2.startChunk) &&
-    chunkEquals(range1.endChunk, range2.endChunk) &&
-    range1.startIndex === range2.startIndex &&
-    range1.endIndex === range2.endIndex
-  );
-}
-
-// A Chunker lets one walk through the chunks of a document.
-// It is inspired by, and similar to, the DOM’s NodeIterator. (but unlike
-// NodeIterator, it has no concept of being ‘before’ or ‘after’ a chunk)
-export interface Chunker<TChunk extends Chunk<any>> {
-  // The chunk currently being pointed at.
-  readonly currentChunk: TChunk;
-
-  // Move currentChunk to the chunk following it, and return that chunk.
-  // If there are no chunks following it, keep currentChunk unchanged and return null.
-  nextChunk(): TChunk | null;
-
-  // Move currentChunk to the chunk preceding it, and return that chunk.
-  // If there are no preceding chunks, keep currentChunk unchanged and return null.
-  previousChunk(): TChunk | null;
-
-  // Test if a given chunk is before the current chunk.
-  precedesCurrentChunk(chunk: TChunk): boolean;
-}
-
 export interface PartialTextNode extends Chunk<string> {
   readonly node: Text;
   readonly startOffset: number;
diff --git a/packages/dom/src/text-position/describe.ts b/packages/dom/src/text-position/describe.ts
index 5f7f9a3..992a633 100644
--- a/packages/dom/src/text-position/describe.ts
+++ b/packages/dom/src/text-position/describe.ts
@@ -19,11 +19,9 @@
  */
 
 import type { TextPositionSelector } from '@annotator/selector';
-import type { Chunk, Chunker, ChunkRange } from '../chunker';
+import { describeTextPosition as abstractDescribeTextPosition } from '@annotator/selector';
 import { TextNodeChunker } from '../chunker';
-import { CodePointSeeker } from '../code-point-seeker';
 import { ownerDocument } from '../owner-document';
-import { TextSeeker } from '../seek';
 
 export async function describeTextPosition(
   range: Range,
@@ -48,21 +46,3 @@ export async function describeTextPosition(
     textChunks,
   );
 }
-
-async function abstractDescribeTextPosition<TChunk extends Chunk<string>>(
-  target: ChunkRange<TChunk>,
-  scope: Chunker<TChunk>,
-): Promise<TextPositionSelector> {
-  const codeUnitSeeker = new TextSeeker(scope);
-  const codePointSeeker = new CodePointSeeker(codeUnitSeeker);
-
-  codePointSeeker.seekToChunk(target.startChunk, target.startIndex);
-  const start = codePointSeeker.position;
-  codePointSeeker.seekToChunk(target.endChunk, target.endIndex);
-  const end = codePointSeeker.position;
-  return {
-    type: 'TextPositionSelector',
-    start,
-    end,
-  };
-}
diff --git a/packages/dom/src/text-position/match.ts b/packages/dom/src/text-position/match.ts
index becd957..3dccede 100644
--- a/packages/dom/src/text-position/match.ts
+++ b/packages/dom/src/text-position/match.ts
@@ -19,10 +19,8 @@
  */
 
 import type { Matcher, TextPositionSelector } from '@annotator/selector';
-import type { Chunk, ChunkRange, Chunker } from '../chunker';
+import { textPositionSelectorMatcher as abstractTextPositionSelectorMatcher } from '@annotator/selector';
 import { TextNodeChunker } from '../chunker';
-import { CodePointSeeker } from '../code-point-seeker';
-import { TextSeeker } from '../seek';
 
 export function createTextPositionSelectorMatcher(
   selector: TextPositionSelector,
@@ -39,27 +37,3 @@ export function createTextPositionSelectorMatcher(
     }
   };
 }
-
-export function abstractTextPositionSelectorMatcher(
-  selector: TextPositionSelector,
-): <TChunk extends Chunk<any>>(
-  scope: Chunker<TChunk>,
-) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
-  const { start, end } = selector;
-
-  return async function* matchAll<TChunk extends Chunk<string>>(
-    textChunks: Chunker<TChunk>,
-  ) {
-    const codeUnitSeeker = new TextSeeker(textChunks);
-    const codePointSeeker = new CodePointSeeker(codeUnitSeeker);
-
-    codePointSeeker.seekTo(start);
-    const startChunk = codeUnitSeeker.currentChunk;
-    const startIndex = codeUnitSeeker.offsetInChunk;
-    codePointSeeker.seekTo(end);
-    const endChunk = codeUnitSeeker.currentChunk;
-    const endIndex = codeUnitSeeker.offsetInChunk;
-
-    yield { startChunk, startIndex, endChunk, endIndex };
-  };
-}
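
Editor's note: the matcher above wraps a TextSeeker (named codeUnitSeeker) in a CodePointSeeker and seeks to the selector's start and end through the latter, which suggests that positions are counted in Unicode code points while chunk indices are UTF-16 code units. The conversion can be sketched on a plain string as follows. The function names are hypothetical and are not part of the package; this is not how CodePointSeeker itself is implemented.

```typescript
function isHighSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff;
}

function isLowSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xdc00 && codeUnit <= 0xdfff;
}

// Number of code units (1 or 2) used by the code point starting at index.
function codePointWidth(text: string, index: number): number {
  return isHighSurrogate(text.charCodeAt(index)) &&
    isLowSurrogate(text.charCodeAt(index + 1))
    ? 2
    : 1;
}

// Convert a position counted in code points to a code unit index.
function codePointToCodeUnitIndex(text: string, position: number): number {
  let index = 0;
  for (let seen = 0; seen < position; seen++) {
    if (index >= text.length) {
      throw new RangeError('Position lies beyond the end of the text.');
    }
    index += codePointWidth(text, index);
  }
  return index;
}

// Convert a code unit index to a position counted in code points.
function codeUnitToCodePointPosition(
  text: string,
  codeUnitIndex: number,
): number {
  let position = 0;
  let index = 0;
  while (index < codeUnitIndex) {
    index += codePointWidth(text, index);
    position++;
  }
  return position;
}

// The emoji takes two code units, so later characters shift by one.
const emojiText = 'l😃rem ipsum';
console.log(codePointToCodeUnitIndex(emojiText, 2)); // index of "r"
console.log(codeUnitToCodePointPosition(emojiText, 3));
```

This is why the test cases in this series use strings such as 'l😃rem ipsum': any confusion between the two ways of counting shows up as an offset that is off by one after the emoji.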
diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 3dfa45e..3ca3f0b 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -19,12 +19,9 @@
  */
 
 import type { TextQuoteSelector } from '@annotator/selector';
-import type { Chunk, Chunker, ChunkRange } from '../chunker';
-import { TextNodeChunker, chunkRangeEquals } from '../chunker';
+import { describeTextQuote as abstractDescribeTextQuote } from '@annotator/selector';
+import { TextNodeChunker } from '../chunker';
 import { ownerDocument } from '../owner-document';
-import type { Seeker } from '../seek';
-import { TextSeeker } from '../seek';
-import { abstractTextQuoteSelectorMatcher } from '.';
 
 export async function describeTextQuote(
   range: Range,
@@ -47,111 +44,3 @@ export async function describeTextQuote(
     () => new TextNodeChunker(scope),
   );
 }
-
-async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
-  target: ChunkRange<TChunk>,
-  scope: () => Chunker<TChunk>,
-): Promise<TextQuoteSelector> {
-  const seeker = new TextSeeker(scope());
-
-  // Read the target’s exact text.
-  seeker.seekToChunk(target.startChunk, target.startIndex);
-  const exact = seeker.readToChunk(target.endChunk, target.endIndex);
-
-  // Starting with an empty prefix and suffix, we search for matches. At each unintended match
-  // we encounter, we extend the prefix or suffix just enough to ensure it will no longer match.
-  let prefix = '';
-  let suffix = '';
-
-  while (true) {
-    const tentativeSelector: TextQuoteSelector = {
-      type: 'TextQuoteSelector',
-      exact,
-      prefix,
-      suffix,
-    };
-
-    const matches = abstractTextQuoteSelectorMatcher(tentativeSelector)(
-      scope(),
-    );
-    let nextMatch = await matches.next();
-
-    // If this match is the intended one, no need to act.
-    // XXX This test is fragile: nextMatch and target are assumed to be normalised.
-    if (!nextMatch.done && chunkRangeEquals(nextMatch.value, target)) {
-      nextMatch = await matches.next();
-    }
-
-    // If there are no more unintended matches, our selector is unambiguous!
-    if (nextMatch.done) return tentativeSelector;
-
-    // Possible optimisation: A subsequent search could safely skip the part we
-    // already processed, instead of starting from the beginning again. But we’d
-    // need the matcher to start at the seeker’s position, instead of searching
-    // in the whole current chunk. Then we could just seek back to just after
-    // the start of the prefix: seeker.seekBy(-prefix.length + 1); (don’t forget
-    // to also correct for any changes in the prefix we will make below)
-
-    // We’ll have to add more prefix/suffix to disqualify this unintended match.
-    const unintendedMatch = nextMatch.value;
-    const seeker1 = new TextSeeker(scope());
-    const seeker2 = new TextSeeker(scope());
-
-    // Count how many characters we’d need as a prefix to disqualify this match.
-    seeker1.seekToChunk(target.startChunk, target.startIndex - prefix.length);
-    seeker2.seekToChunk(
-      unintendedMatch.startChunk,
-      unintendedMatch.startIndex - prefix.length,
-    );
-    const extraPrefix = readUntilDifferent(seeker1, seeker2, true);
-
-    // Count how many characters we’d need as a suffix to disqualify this match.
-    seeker1.seekToChunk(target.endChunk, target.endIndex + suffix.length);
-    seeker2.seekToChunk(
-      unintendedMatch.endChunk,
-      unintendedMatch.endIndex + suffix.length,
-    );
-    const extraSuffix = readUntilDifferent(seeker1, seeker2, false);
-
-    // Use either the prefix or suffix, whichever is shortest.
-    if (
-      extraPrefix !== undefined &&
-      (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)
-    ) {
-      prefix = extraPrefix + prefix;
-    } else if (extraSuffix !== undefined) {
-      suffix = suffix + extraSuffix;
-    } else {
-      throw new Error(
-        'Target cannot be disambiguated; how could that have happened‽',
-      );
-    }
-  }
-}
-
-function readUntilDifferent(
-  seeker1: Seeker,
-  seeker2: Seeker,
-  reverse: boolean,
-): string | undefined {
-  let result = '';
-  while (true) {
-    let nextCharacter: string;
-    try {
-      nextCharacter = seeker1.read(reverse ? -1 : 1);
-    } catch (err) {
-      return undefined; // Start/end of text reached: cannot expand result.
-    }
-    result = reverse ? nextCharacter + result : result + nextCharacter;
-
-    // Check if the newly added character makes the result differ from the second seeker.
-    let comparisonCharacter: string | undefined;
-    try {
-      comparisonCharacter = seeker2.read(reverse ? -1 : 1);
-    } catch (err) {
-      // A RangeError would merely mean seeker2 is exhausted.
-      if (!(err instanceof RangeError)) throw err;
-    }
-    if (nextCharacter !== comparisonCharacter) return result;
-  }
-}
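
Editor's note: the describe loop that moves out of this file (start with an empty prefix and suffix, search, and extend whichever side needs fewer extra characters to rule out an unintended match) can be illustrated on a single plain string. The sketch below is a simplification under stated assumptions: it uses character offsets into one string instead of Chunkers and Seekers, and the names findMatches, extraContext and describeQuote are hypothetical.

```typescript
interface TextQuote {
  exact: string;
  prefix: string;
  suffix: string;
}

// Return the start offset of `exact` for every occurrence of prefix+exact+suffix.
function findMatches(text: string, quote: TextQuote): number[] {
  const pattern = quote.prefix + quote.exact + quote.suffix;
  const matches: number[] = [];
  let fromIndex = 0;
  while (fromIndex <= text.length) {
    const index = text.indexOf(pattern, fromIndex);
    if (index === -1) break;
    matches.push(index + quote.prefix.length);
    fromIndex = index + 1;
  }
  return matches;
}

// Read characters next to position1 until they differ from those next to
// position2. Returns undefined if the text runs out at position1 first.
function extraContext(
  text: string,
  position1: number,
  position2: number,
  reverse: boolean,
): string | undefined {
  let result = '';
  while (true) {
    const index1 = reverse ? --position1 : position1++;
    const index2 = reverse ? --position2 : position2++;
    if (index1 < 0 || index1 >= text.length) return undefined;
    const character = text.charAt(index1);
    result = reverse ? character + result : result + character;
    // charAt returns '' when index2 is out of range, which never equals a character.
    if (character !== text.charAt(index2)) return result;
  }
}

function describeQuote(text: string, start: number, end: number): TextQuote {
  const quote: TextQuote = {
    exact: text.substring(start, end),
    prefix: '',
    suffix: '',
  };
  while (true) {
    const unintended = findMatches(text, quote).filter(
      (index) => index !== start,
    )[0];
    // No unintended matches left: the selector is unambiguous.
    if (unintended === undefined) return quote;

    const extraPrefix = extraContext(
      text,
      start - quote.prefix.length,
      unintended - quote.prefix.length,
      true,
    );
    const extraSuffix = extraContext(
      text,
      end + quote.suffix.length,
      unintended + quote.exact.length + quote.suffix.length,
      false,
    );

    // Extend whichever side needs the fewest extra characters.
    if (
      extraPrefix !== undefined &&
      (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)
    ) {
      quote.prefix = extraPrefix + quote.prefix;
    } else if (extraSuffix !== undefined) {
      quote.suffix = quote.suffix + extraSuffix;
    } else {
      throw new Error('Target cannot be disambiguated.');
    }
  }
}

// Describe the second "cat" (offsets 15 to 18).
console.log(JSON.stringify(describeQuote('a cat sat; the cat ran', 15, 18)));
```

Comparing only the extra lengths, rather than the total prefix and suffix lengths, follows the "ignore sunk costs" commit earlier in this series.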
diff --git a/packages/dom/src/text-quote/match.ts b/packages/dom/src/text-quote/match.ts
index 9e09990..0c16b8a 100644
--- a/packages/dom/src/text-quote/match.ts
+++ b/packages/dom/src/text-quote/match.ts
@@ -19,7 +19,7 @@
  */
 
 import type { Matcher, TextQuoteSelector } from '@annotator/selector';
-import type { Chunk, Chunker, ChunkRange } from '../chunker';
+import { textQuoteSelectorMatcher as abstractTextQuoteSelectorMatcher } from '@annotator/selector';
 import { TextNodeChunker, EmptyScopeError } from '../chunker';
 
 export function createTextQuoteSelectorMatcher(
@@ -42,149 +42,3 @@ export function createTextQuoteSelectorMatcher(
     }
   };
 }
-
-export function abstractTextQuoteSelectorMatcher(
-  selector: TextQuoteSelector,
-): <TChunk extends Chunk<any>>(
-  scope: Chunker<TChunk>,
-) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
-  return async function* matchAll<TChunk extends Chunk<string>>(
-    textChunks: Chunker<TChunk>,
-  ) {
-    const exact = selector.exact;
-    const prefix = selector.prefix || '';
-    const suffix = selector.suffix || '';
-    const searchPattern = prefix + exact + suffix;
-
-    // The code below runs a loop with three steps:
-    // 1. Continue checking any partial matches from the previous chunk(s).
-    // 2. Try find the whole pattern in the chunk (possibly multiple times).
-    // 3. Check if this chunk ends with a partial match (or even multiple partial matches).
-
-    interface PartialMatch {
-      startChunk?: TChunk;
-      startIndex?: number;
-      endChunk?: TChunk;
-      endIndex?: number;
-      charactersMatched: number;
-    }
-    let partialMatches: PartialMatch[] = [];
-
-    let isFirstChunk = true;
-    do {
-      const chunk = textChunks.currentChunk;
-      const chunkValue = chunk.data;
-
-      // 1. Continue checking any partial matches from the previous chunk(s).
-      const remainingPartialMatches: typeof partialMatches = [];
-      for (const partialMatch of partialMatches) {
-        const charactersMatched = partialMatch.charactersMatched;
-
-        // If the current chunk contains the start and/or end of the match, record these.
-        if (partialMatch.endChunk === undefined) {
-          const charactersUntilMatchEnd =
-            prefix.length + exact.length - charactersMatched;
-          if (charactersUntilMatchEnd <= chunkValue.length) {
-            partialMatch.endChunk = chunk;
-            partialMatch.endIndex = charactersUntilMatchEnd;
-          }
-        }
-        if (partialMatch.startChunk === undefined) {
-          const charactersUntilMatchStart = prefix.length - charactersMatched;
-          if (
-            charactersUntilMatchStart < chunkValue.length ||
-            partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
-          ) {
-            partialMatch.startChunk = chunk;
-            partialMatch.startIndex = charactersUntilMatchStart;
-          }
-        }
-
-        const charactersUntilSuffixEnd =
-          searchPattern.length - charactersMatched;
-        if (charactersUntilSuffixEnd <= chunkValue.length) {
-          if (
-            chunkValue.startsWith(searchPattern.substring(charactersMatched))
-          ) {
-            yield partialMatch as ChunkRange<TChunk>; // all fields are certainly defined now.
-          }
-        } else if (
-          chunkValue ===
-          searchPattern.substring(
-            charactersMatched,
-            charactersMatched + chunkValue.length,
-          )
-        ) {
-          // The chunk is too short to complete the match; comparison has to be completed in subsequent chunks.
-          partialMatch.charactersMatched += chunkValue.length;
-          remainingPartialMatches.push(partialMatch);
-        }
-      }
-      partialMatches = remainingPartialMatches;
-
-      // 2. Try find the whole pattern in the chunk (possibly multiple times).
-      if (searchPattern.length <= chunkValue.length) {
-        let fromIndex = 0;
-        while (fromIndex <= chunkValue.length) {
-          const patternStartIndex = chunkValue.indexOf(
-            searchPattern,
-            fromIndex,
-          );
-          if (patternStartIndex === -1) break;
-          fromIndex = patternStartIndex + 1;
-
-          // Handle edge case: an empty searchPattern would already have been yielded at the end of the last chunk.
-          if (
-            patternStartIndex === 0 &&
-            searchPattern.length === 0 &&
-            !isFirstChunk
-          )
-            continue;
-
-          yield {
-            startChunk: chunk,
-            startIndex: patternStartIndex + prefix.length,
-            endChunk: chunk,
-            endIndex: patternStartIndex + prefix.length + exact.length,
-          };
-        }
-      }
-
-      // 3. Check if this chunk ends with a partial match (or even multiple partial matches).
-      let newPartialMatches: number[] = [];
-      const searchStartPoint = Math.max(
-        chunkValue.length - searchPattern.length + 1,
-        0,
-      );
-      for (let i = searchStartPoint; i < chunkValue.length; i++) {
-        const character = chunkValue[i];
-        newPartialMatches = newPartialMatches.filter(
-          (partialMatchStartIndex) =>
-            character === searchPattern[i - partialMatchStartIndex],
-        );
-        if (character === searchPattern[0]) newPartialMatches.push(i);
-      }
-      for (const partialMatchStartIndex of newPartialMatches) {
-        const charactersMatched = chunkValue.length - partialMatchStartIndex;
-        const partialMatch: PartialMatch = {
-          charactersMatched,
-        };
-        if (charactersMatched >= prefix.length + exact.length) {
-          partialMatch.endChunk = chunk;
-          partialMatch.endIndex =
-            partialMatchStartIndex + prefix.length + exact.length;
-        }
-        if (
-          charactersMatched > prefix.length ||
-          partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
-        ) {
-          partialMatch.startChunk = chunk;
-          partialMatch.startIndex = partialMatchStartIndex + prefix.length;
-        }
-        partialMatches.push(partialMatch);
-      }
-
-      isFirstChunk = false;
-    } while (textChunks.nextChunk() !== null);
-  };
-}
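
Editor's note: the matcher above has to find patterns that straddle chunk boundaries, which is what its bookkeeping of partial matches is for. The sketch below shows the same idea in a deliberately simpler form. It is a swapped-in technique, not the three-step loop of the original: it advances every candidate match one character at a time instead of using indexOf within a chunk, it takes an array of strings instead of a Chunker, and it assumes a non-empty pattern. All names are hypothetical.

```typescript
interface Match {
  startChunk: number;
  startIndex: number;
  endChunk: number;
  endIndex: number;
}

interface Candidate {
  startChunk: number;
  startIndex: number;
  matched: number; // how many pattern characters have matched so far
}

// Find every occurrence of a non-empty pattern in the concatenated chunks,
// reporting positions relative to the chunks in which the match starts and ends.
function findAcrossChunks(chunks: string[], pattern: string): Match[] {
  const matches: Match[] = [];
  let candidates: Candidate[] = [];

  chunks.forEach((chunk, chunkIndex) => {
    for (let i = 0; i < chunk.length; i++) {
      const character = chunk.charAt(i);

      // A new match may start at any character.
      candidates.push({ startChunk: chunkIndex, startIndex: i, matched: 0 });

      // Advance the candidates that still agree with the pattern; drop the rest.
      const remaining: Candidate[] = [];
      for (const candidate of candidates) {
        if (pattern.charAt(candidate.matched) !== character) continue;
        candidate.matched += 1;
        if (candidate.matched === pattern.length) {
          matches.push({
            startChunk: candidate.startChunk,
            startIndex: candidate.startIndex,
            endChunk: chunkIndex,
            endIndex: i + 1,
          });
        } else {
          remaining.push(candidate);
        }
      }
      candidates = remaining;
    }
  });

  return matches;
}

// "ipsum" occurs once across the first two chunks and once inside the third.
console.log(
  JSON.stringify(findAcrossChunks(['lorem ip', 'sum ', 'dolor ipsum'], 'ipsum')),
);
```

Candidates that survive the end of a chunk play the role of the partialMatches array in the original, which carries charactersMatched over to the next chunk.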
diff --git a/packages/selector/src/index.ts b/packages/selector/src/index.ts
index 73caa05..adefd04 100644
--- a/packages/selector/src/index.ts
+++ b/packages/selector/src/index.ts
@@ -27,6 +27,7 @@ export type {
   TextPositionSelector,
   TextQuoteSelector,
 } from './types';
+export * from './text';
 
 export function makeRefinable<
   // Any subtype of Selector can be made refinable; but note we limit the value
diff --git a/packages/selector/src/text/chunker.ts b/packages/selector/src/text/chunker.ts
new file mode 100644
index 0000000..eb0d970
--- /dev/null
+++ b/packages/selector/src/text/chunker.ts
@@ -0,0 +1,69 @@
+/**
+ * @license
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+// A Chunk represents a fragment (typically a string) of some document.
+// Subclasses can add further attributes to map the chunk to its position in the
+// data structure it came from (e.g. a DOM node).
+export interface Chunk<TData> {
+  readonly data: TData;
+  equals?(otherChunk: this): boolean;
+}
+
+export function chunkEquals(chunk1: Chunk<any>, chunk2: Chunk<any>): boolean {
+  return chunk1.equals ? chunk1.equals(chunk2) : chunk1 === chunk2;
+}
+
+export interface ChunkRange<TChunk extends Chunk<any>> {
+  startChunk: TChunk;
+  startIndex: number;
+  endChunk: TChunk;
+  endIndex: number;
+}
+
+export function chunkRangeEquals(
+  range1: ChunkRange<any>,
+  range2: ChunkRange<any>,
+): boolean {
+  return (
+    chunkEquals(range1.startChunk, range2.startChunk) &&
+    chunkEquals(range1.endChunk, range2.endChunk) &&
+    range1.startIndex === range2.startIndex &&
+    range1.endIndex === range2.endIndex
+  );
+}
+
+// A Chunker lets one walk through the chunks of a document.
+// It is inspired by, and similar to, the DOM’s NodeIterator (but unlike
+// NodeIterator, it has no concept of being ‘before’ or ‘after’ a chunk).
+export interface Chunker<TChunk extends Chunk<any>> {
+  // The chunk currently being pointed at.
+  readonly currentChunk: TChunk;
+
+  // Move currentChunk to the chunk following it, and return that chunk.
+  // If there are no chunks following it, keep currentChunk unchanged and return null.
+  nextChunk(): TChunk | null;
+
+  // Move currentChunk to the chunk preceding it, and return that chunk.
+  // If there are no preceding chunks, keep currentChunk unchanged and return null.
+  previousChunk(): TChunk | null;
+
+  // Test if a given chunk is before the current chunk.
+  precedesCurrentChunk(chunk: TChunk): boolean;
+}
diff --git a/packages/dom/src/code-point-seeker.ts b/packages/selector/src/text/code-point-seeker.ts
similarity index 100%
rename from packages/dom/src/code-point-seeker.ts
rename to packages/selector/src/text/code-point-seeker.ts
diff --git a/packages/dom/src/text-position/describe.ts b/packages/selector/src/text/describe-text-position.ts
similarity index 58%
copy from packages/dom/src/text-position/describe.ts
copy to packages/selector/src/text/describe-text-position.ts
index 5f7f9a3..99485b0 100644
--- a/packages/dom/src/text-position/describe.ts
+++ b/packages/selector/src/text/describe-text-position.ts
@@ -19,37 +19,11 @@
  */
 
 import type { TextPositionSelector } from '@annotator/selector';
-import type { Chunk, Chunker, ChunkRange } from '../chunker';
-import { TextNodeChunker } from '../chunker';
-import { CodePointSeeker } from '../code-point-seeker';
-import { ownerDocument } from '../owner-document';
-import { TextSeeker } from '../seek';
+import type { Chunk, Chunker, ChunkRange } from './chunker';
+import { CodePointSeeker } from './code-point-seeker';
+import { TextSeeker } from './seek';
 
-export async function describeTextPosition(
-  range: Range,
-  maybeScope?: Range,
-): Promise<TextPositionSelector> {
-  // Default to search in the whole document.
-  let scope: Range;
-  if (maybeScope !== undefined) {
-    scope = maybeScope;
-  } else {
-    const document = ownerDocument(range);
-    scope = document.createRange();
-    scope.selectNodeContents(document);
-  }
-
-  const textChunks = new TextNodeChunker(scope);
-  if (textChunks.currentChunk === null)
-    throw new RangeError('Range does not contain any Text nodes.');
-
-  return await abstractDescribeTextPosition(
-    textChunks.rangeToChunkRange(range),
-    textChunks,
-  );
-}
-
-async function abstractDescribeTextPosition<TChunk extends Chunk<string>>(
+export async function describeTextPosition<TChunk extends Chunk<string>>(
   target: ChunkRange<TChunk>,
   scope: Chunker<TChunk>,
 ): Promise<TextPositionSelector> {
diff --git a/packages/selector/src/text/describe-text-quote.ts b/packages/selector/src/text/describe-text-quote.ts
new file mode 100644
index 0000000..c97a4f8
--- /dev/null
+++ b/packages/selector/src/text/describe-text-quote.ts
@@ -0,0 +1,134 @@
+/**
+ * @license
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import type { TextQuoteSelector } from '@annotator/selector';
+import type { Chunk, Chunker, ChunkRange } from './chunker';
+import { chunkRangeEquals } from './chunker';
+import type { Seeker } from './seek';
+import { TextSeeker } from './seek';
+import { textQuoteSelectorMatcher } from '.';
+
+export async function describeTextQuote<TChunk extends Chunk<string>>(
+    target: ChunkRange<TChunk>,
+    scope: () => Chunker<TChunk>,
+  ): Promise<TextQuoteSelector> {
+    const seeker = new TextSeeker(scope());
+
+    // Read the target’s exact text.
+    seeker.seekToChunk(target.startChunk, target.startIndex);
+    const exact = seeker.readToChunk(target.endChunk, target.endIndex);
+
+    // Starting with an empty prefix and suffix, we search for matches. At each unintended match
+    // we encounter, we extend the prefix or suffix just enough to ensure it will no longer match.
+    let prefix = '';
+    let suffix = '';
+
+    while (true) {
+      const tentativeSelector: TextQuoteSelector = {
+        type: 'TextQuoteSelector',
+        exact,
+        prefix,
+        suffix,
+      };
+
+      const matches = textQuoteSelectorMatcher(tentativeSelector)(
+        scope(),
+      );
+      let nextMatch = await matches.next();
+
+      // If this match is the intended one, no need to act.
+      // XXX This test is fragile: nextMatch and target are assumed to be normalised.
+      if (!nextMatch.done && chunkRangeEquals(nextMatch.value, target)) {
+        nextMatch = await matches.next();
+      }
+
+      // If there are no more unintended matches, our selector is unambiguous!
+      if (nextMatch.done) return tentativeSelector;
+
+      // Possible optimisation: A subsequent search could safely skip the part we
+      // already processed, instead of starting from the beginning again. But we’d
+      // need the matcher to start at the seeker’s position, instead of searching
+      // in the whole current chunk. Then we could just seek back to just after
+      // the start of the prefix: seeker.seekBy(-prefix.length + 1); (don’t forget
+      // to also correct for any changes in the prefix we will make below)
+
+      // We’ll have to add more prefix/suffix to disqualify this unintended match.
+      const unintendedMatch = nextMatch.value;
+      const seeker1 = new TextSeeker(scope());
+      const seeker2 = new TextSeeker(scope());
+
+      // Count how many characters we’d need as a prefix to disqualify this match.
+      seeker1.seekToChunk(target.startChunk, target.startIndex - prefix.length);
+      seeker2.seekToChunk(
+        unintendedMatch.startChunk,
+        unintendedMatch.startIndex - prefix.length,
+      );
+      const extraPrefix = readUntilDifferent(seeker1, seeker2, true);
+
+      // Count how many characters we’d need as a suffix to disqualify this match.
+      seeker1.seekToChunk(target.endChunk, target.endIndex + suffix.length);
+      seeker2.seekToChunk(
+        unintendedMatch.endChunk,
+        unintendedMatch.endIndex + suffix.length,
+      );
+      const extraSuffix = readUntilDifferent(seeker1, seeker2, false);
+
+      // Use either the prefix or suffix, whichever is shortest.
+      if (
+        extraPrefix !== undefined &&
+        (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)
+      ) {
+        prefix = extraPrefix + prefix;
+      } else if (extraSuffix !== undefined) {
+        suffix = suffix + extraSuffix;
+      } else {
+        throw new Error(
+          'Target cannot be disambiguated; how could that have happened‽',
+        );
+      }
+    }
+  }
+
+  function readUntilDifferent(
+    seeker1: Seeker,
+    seeker2: Seeker,
+    reverse: boolean,
+  ): string | undefined {
+    let result = '';
+    while (true) {
+      let nextCharacter: string;
+      try {
+        nextCharacter = seeker1.read(reverse ? -1 : 1);
+      } catch (err) {
+        return undefined; // Start/end of text reached: cannot expand result.
+      }
+      result = reverse ? nextCharacter + result : result + nextCharacter;
+
+      // Check if the newly added character makes the result differ from the second seeker.
+      let comparisonCharacter: string | undefined;
+      try {
+        comparisonCharacter = seeker2.read(reverse ? -1 : 1);
+      } catch (err) {
+        // A RangeError would merely mean seeker2 is exhausted.
+        if (!(err instanceof RangeError)) throw err;
+      }
+      if (nextCharacter !== comparisonCharacter) return result;
+    }
+  }
diff --git a/packages/selector/src/text/index.ts b/packages/selector/src/text/index.ts
new file mode 100644
index 0000000..ca88b66
--- /dev/null
+++ b/packages/selector/src/text/index.ts
@@ -0,0 +1,5 @@
+export * from './describe-text-quote';
+export * from './match-text-quote';
+export * from './describe-text-position';
+export * from './match-text-position';
+export * from './chunker';
diff --git a/packages/dom/src/text-position/match.ts b/packages/selector/src/text/match-text-position.ts
similarity index 66%
copy from packages/dom/src/text-position/match.ts
copy to packages/selector/src/text/match-text-position.ts
index becd957..c00f619 100644
--- a/packages/dom/src/text-position/match.ts
+++ b/packages/selector/src/text/match-text-position.ts
@@ -18,29 +18,12 @@
  * under the License.
  */
 
-import type { Matcher, TextPositionSelector } from '@annotator/selector';
-import type { Chunk, ChunkRange, Chunker } from '../chunker';
-import { TextNodeChunker } from '../chunker';
-import { CodePointSeeker } from '../code-point-seeker';
-import { TextSeeker } from '../seek';
+import type { TextPositionSelector } from '@annotator/selector';
+import type { Chunk, ChunkRange, Chunker } from './chunker';
+import { CodePointSeeker } from './code-point-seeker';
+import { TextSeeker } from './seek';
 
-export function createTextPositionSelectorMatcher(
-  selector: TextPositionSelector,
-): Matcher<Range, Range> {
-  const abstractMatcher = abstractTextPositionSelectorMatcher(selector);
-
-  return async function* matchAll(scope) {
-    const textChunks = new TextNodeChunker(scope);
-
-    const matches = abstractMatcher(textChunks);
-
-    for await (const abstractMatch of matches) {
-      yield textChunks.chunkRangeToRange(abstractMatch);
-    }
-  };
-}
-
-export function abstractTextPositionSelectorMatcher(
+export function textPositionSelectorMatcher(
   selector: TextPositionSelector,
 ): <TChunk extends Chunk<any>>(
   scope: Chunker<TChunk>,
diff --git a/packages/selector/src/text/match-text-quote.ts b/packages/selector/src/text/match-text-quote.ts
new file mode 100644
index 0000000..8e90a2f
--- /dev/null
+++ b/packages/selector/src/text/match-text-quote.ts
@@ -0,0 +1,168 @@
+/**
+ * @license
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import type { TextQuoteSelector } from '@annotator/selector';
+import type { Chunk, Chunker, ChunkRange } from './chunker';
+
+export function textQuoteSelectorMatcher(
+    selector: TextQuoteSelector,
+  ): <TChunk extends Chunk<any>>(
+    scope: Chunker<TChunk>,
+  ) => AsyncGenerator<ChunkRange<TChunk>, void, void> {
+    return async function* matchAll<TChunk extends Chunk<string>>(
+      textChunks: Chunker<TChunk>,
+    ) {
+      const exact = selector.exact;
+      const prefix = selector.prefix || '';
+      const suffix = selector.suffix || '';
+      const searchPattern = prefix + exact + suffix;
+
+      // The code below runs a loop with three steps:
+      // 1. Continue checking any partial matches from the previous chunk(s).
+      // 2. Try to find the whole pattern in the chunk (possibly multiple times).
+      // 3. Check if this chunk ends with a partial match (or even multiple partial matches).
+
+      interface PartialMatch {
+        startChunk?: TChunk;
+        startIndex?: number;
+        endChunk?: TChunk;
+        endIndex?: number;
+        charactersMatched: number;
+      }
+      let partialMatches: PartialMatch[] = [];
+
+      let isFirstChunk = true;
+      do {
+        const chunk = textChunks.currentChunk;
+        const chunkValue = chunk.data;
+
+        // 1. Continue checking any partial matches from the previous chunk(s).
+        const remainingPartialMatches: typeof partialMatches = [];
+        for (const partialMatch of partialMatches) {
+          const charactersMatched = partialMatch.charactersMatched;
+
+          // If the current chunk contains the start and/or end of the match, record these.
+          if (partialMatch.endChunk === undefined) {
+            const charactersUntilMatchEnd =
+              prefix.length + exact.length - charactersMatched;
+            if (charactersUntilMatchEnd <= chunkValue.length) {
+              partialMatch.endChunk = chunk;
+              partialMatch.endIndex = charactersUntilMatchEnd;
+            }
+          }
+          if (partialMatch.startChunk === undefined) {
+            const charactersUntilMatchStart = prefix.length - charactersMatched;
+            if (
+              charactersUntilMatchStart < chunkValue.length ||
+              partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
+            ) {
+              partialMatch.startChunk = chunk;
+              partialMatch.startIndex = charactersUntilMatchStart;
+            }
+          }
+
+          const charactersUntilSuffixEnd =
+            searchPattern.length - charactersMatched;
+          if (charactersUntilSuffixEnd <= chunkValue.length) {
+            if (
+              chunkValue.startsWith(searchPattern.substring(charactersMatched))
+            ) {
+              yield partialMatch as ChunkRange<TChunk>; // all fields are certainly defined now.
+            }
+          } else if (
+            chunkValue ===
+            searchPattern.substring(
+              charactersMatched,
+              charactersMatched + chunkValue.length,
+            )
+          ) {
+            // The chunk is too short to complete the match; comparison has to be completed in subsequent chunks.
+            partialMatch.charactersMatched += chunkValue.length;
+            remainingPartialMatches.push(partialMatch);
+          }
+        }
+        partialMatches = remainingPartialMatches;
+
+        // 2. Try to find the whole pattern in the chunk (possibly multiple times).
+        if (searchPattern.length <= chunkValue.length) {
+          let fromIndex = 0;
+          while (fromIndex <= chunkValue.length) {
+            const patternStartIndex = chunkValue.indexOf(
+              searchPattern,
+              fromIndex,
+            );
+            if (patternStartIndex === -1) break;
+            fromIndex = patternStartIndex + 1;
+
+            // Handle edge case: an empty searchPattern would already have been yielded at the end of the last chunk.
+            if (
+              patternStartIndex === 0 &&
+              searchPattern.length === 0 &&
+              !isFirstChunk
+            )
+              continue;
+
+            yield {
+              startChunk: chunk,
+              startIndex: patternStartIndex + prefix.length,
+              endChunk: chunk,
+              endIndex: patternStartIndex + prefix.length + exact.length,
+            };
+          }
+        }
+
+        // 3. Check if this chunk ends with a partial match (or even multiple partial matches).
+        let newPartialMatches: number[] = [];
+        const searchStartPoint = Math.max(
+          chunkValue.length - searchPattern.length + 1,
+          0,
+        );
+        for (let i = searchStartPoint; i < chunkValue.length; i++) {
+          const character = chunkValue[i];
+          newPartialMatches = newPartialMatches.filter(
+            (partialMatchStartIndex) =>
+              character === searchPattern[i - partialMatchStartIndex],
+          );
+          if (character === searchPattern[0]) newPartialMatches.push(i);
+        }
+        for (const partialMatchStartIndex of newPartialMatches) {
+          const charactersMatched = chunkValue.length - partialMatchStartIndex;
+          const partialMatch: PartialMatch = {
+            charactersMatched,
+          };
+          if (charactersMatched >= prefix.length + exact.length) {
+            partialMatch.endChunk = chunk;
+            partialMatch.endIndex =
+              partialMatchStartIndex + prefix.length + exact.length;
+          }
+          if (
+            charactersMatched > prefix.length ||
+            partialMatch.endChunk !== undefined // handles an edge case: an empty quote at the end of a chunk.
+          ) {
+            partialMatch.startChunk = chunk;
+            partialMatch.startIndex = partialMatchStartIndex + prefix.length;
+          }
+          partialMatches.push(partialMatch);
+        }
+
+        isFirstChunk = false;
+      } while (textChunks.nextChunk() !== null);
+    };
+  }
diff --git a/packages/dom/src/seek.ts b/packages/selector/src/text/seek.ts
similarity index 100%
rename from packages/dom/src/seek.ts
rename to packages/selector/src/text/seek.ts
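
The Chunker interface added in packages/selector/src/text/chunker.ts above is abstract, so it may help to see what a non-DOM implementation could look like. The following is only a minimal sketch, not code from the repository: `IndexedChunk`, `ArrayChunker` and `readAll` are hypothetical names, the interfaces are re-declared locally so the example is self-contained, and throwing a RangeError on empty input is an assumption about how the "non-empty" requirement might be enforced.

```typescript
// Local copies of the interfaces from chunker.ts, so this sketch stands alone.
interface Chunk<TData> {
  readonly data: TData;
  equals?(otherChunk: this): boolean;
}

interface Chunker<TChunk extends Chunk<any>> {
  readonly currentChunk: TChunk;
  nextChunk(): TChunk | null;
  previousChunk(): TChunk | null;
  precedesCurrentChunk(chunk: TChunk): boolean;
}

// Hypothetical chunk type: a string plus its index in the backing array.
interface IndexedChunk extends Chunk<string> {
  readonly index: number;
}

// Hypothetical Chunker over a non-empty array of strings.
class ArrayChunker implements Chunker<IndexedChunk> {
  private readonly chunks: IndexedChunk[];
  private position = 0;

  constructor(strings: string[]) {
    // Assumption: an empty chunker is rejected up front.
    if (strings.length === 0) {
      throw new RangeError('A chunker needs at least one chunk.');
    }
    this.chunks = strings.map((data, index) => ({ data, index }));
  }

  get currentChunk(): IndexedChunk {
    return this.chunks[this.position];
  }

  // Moves forward if possible; otherwise stays put and returns null.
  nextChunk(): IndexedChunk | null {
    if (this.position + 1 >= this.chunks.length) return null;
    this.position += 1;
    return this.currentChunk;
  }

  // Moves backward if possible; otherwise stays put and returns null.
  previousChunk(): IndexedChunk | null {
    if (this.position === 0) return null;
    this.position -= 1;
    return this.currentChunk;
  }

  precedesCurrentChunk(chunk: IndexedChunk): boolean {
    return chunk.index < this.position;
  }
}

// Concatenates the data of the current chunk and all chunks after it.
// Because currentChunk is never null, a do-while loop fits naturally.
function readAll<TChunk extends Chunk<string>>(
  chunker: Chunker<TChunk>,
): string {
  let text = '';
  do {
    text += chunker.currentChunk.data;
  } while (chunker.nextChunk() !== null);
  return text;
}
```

With such a chunker, the abstract matchers and describers in this package could in principle be exercised on plain strings, without any DOM.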


[incubator-annotator] 02/14: Make demo more challenging.

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 940984e42d9117967f701c1249eb7eb990fddc29
Author: Gerben <ge...@treora.com>
AuthorDate: Thu Nov 19 16:46:24 2020 +0100

    Make demo more challenging.
---
 web/demo/index.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/web/demo/index.html b/web/demo/index.html
index 3ed0961..6c86b4b 100644
--- a/web/demo/index.html
+++ b/web/demo/index.html
@@ -67,7 +67,7 @@ under the License.
     <div class="columns full-width">
       <div class="column">
         <h2>Select text here</h2>
-        <p id="source">Hello, annotated world! To annotate, or not to annotate, that is the question.</p>
+        <p id="source" contenteditable>Hello, <em>annotated world!</em> 🙂 <b>To annotate, or <em>not</em> to annotate</b>, that is the question.</p>
         <p>Try selecting some text in this paragraph above.
           Upon a change of selection, a
           <a rel="external" href="https://www.w3.org/TR/2017/REC-annotation-model-20170223/#text-quote-selector" target="_blank">TextQuoteSelector</a>
@@ -84,7 +84,7 @@ under the License.
       </div>
       <div class="column">
         <h2>Text is found here</h2>
-        <p id="target" contenteditable>Hello, annotated world! To annotate, or not to annotate, that is the question.</p>
+        <p id="target" contenteditable><em>Hello, annotated</em> world! 🙂 To annotate, or not to annotate, <b><em>that</em> is the question.</b></p>
         <p>The selector is ‘anchored’ here: the segment it describes is found and highlighted.</p>
       </div>
       <div class="column" style="min-width: 20em;">


[incubator-annotator] 09/14: Compare *extra* pre/suffix lengths (ignore sunk costs)

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 8459b0e7a0e455f6bd6c124d7a8f45479a13bb59
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 13:02:16 2020 +0100

    Compare *extra* pre/suffix lengths (ignore sunk costs)
---
 packages/dom/src/text-quote/describe.ts | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index ae79ad0..81cc4fa 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -95,20 +95,17 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
     seeker1.seekToChunk(target.startChunk, target.startIndex - prefix.length);
     seeker2.seekToChunk(unintendedMatch.startChunk, unintendedMatch.startIndex - prefix.length);
     const extraPrefix = readUntilDifferent(seeker1, seeker2, true);
-    let sufficientPrefix = extraPrefix !== undefined ? extraPrefix + prefix : undefined;
 
     // Count how many characters we’d need as a suffix to disqualify this match.
     seeker1.seekToChunk(target.endChunk, target.endIndex + suffix.length);
     seeker2.seekToChunk(unintendedMatch.endChunk, unintendedMatch.endIndex + suffix.length);
     const extraSuffix = readUntilDifferent(seeker1, seeker2, false);
-    let sufficientSuffix = extraSuffix !== undefined ? suffix + extraSuffix : undefined;
 
     // Use either the prefix or suffix, whichever is shortest.
-    if (sufficientPrefix !== undefined && (sufficientSuffix === undefined || sufficientPrefix.length <= sufficientSuffix.length)) {
-      prefix = sufficientPrefix;
-      // seeker.seekBy(sufficientPrefix.length - prefix.length) // Would be required if we’d skip the processed part.
-    } else if (sufficientSuffix !== undefined) {
-      suffix = sufficientSuffix;
+    if (extraPrefix !== undefined && (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)) {
+      prefix = extraPrefix + prefix;
+    } else if (extraSuffix !== undefined) {
+      suffix = suffix + extraSuffix;
     } else {
       throw new Error('Target cannot be disambiguated; how could that have happened‽');
     }
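
To make the effect of the change above concrete, here is a small sketch that isolates the decision rule before and after this commit. It is not the repository's code: `Affixes`, `extendByTotal` and `extendByExtra` are hypothetical names, and the seekers are left out so that only the comparison itself remains.

```typescript
// Hypothetical container for the prefix and suffix chosen so far.
interface Affixes {
  prefix: string;
  suffix: string;
}

// Old rule: compare the total lengths the prefix and suffix would have.
function extendByTotal(
  current: Affixes,
  extraPrefix: string | undefined,
  extraSuffix: string | undefined,
): Affixes {
  const sufficientPrefix =
    extraPrefix !== undefined ? extraPrefix + current.prefix : undefined;
  const sufficientSuffix =
    extraSuffix !== undefined ? current.suffix + extraSuffix : undefined;
  if (
    sufficientPrefix !== undefined &&
    (sufficientSuffix === undefined ||
      sufficientPrefix.length <= sufficientSuffix.length)
  ) {
    return { prefix: sufficientPrefix, suffix: current.suffix };
  } else if (sufficientSuffix !== undefined) {
    return { prefix: current.prefix, suffix: sufficientSuffix };
  }
  throw new Error('Target cannot be disambiguated.');
}

// New rule: compare only the characters that would have to be added now.
function extendByExtra(
  current: Affixes,
  extraPrefix: string | undefined,
  extraSuffix: string | undefined,
): Affixes {
  if (
    extraPrefix !== undefined &&
    (extraSuffix === undefined || extraPrefix.length <= extraSuffix.length)
  ) {
    return { prefix: extraPrefix + current.prefix, suffix: current.suffix };
  } else if (extraSuffix !== undefined) {
    return { prefix: current.prefix, suffix: current.suffix + extraSuffix };
  }
  throw new Error('Target cannot be disambiguated.');
}

// A five-character prefix is already in place. One more prefix character
// or two suffix characters would each rule out the unintended match.
const current: Affixes = { prefix: 'lorem', suffix: '' };
const byTotal = extendByTotal(current, 'x', 'yz');
const byExtra = extendByExtra(current, 'x', 'yz');
```

In this example the old rule picks the suffix (total length 2 versus 6) and adds two characters, while the new rule picks the prefix and adds only one, since the five characters already spent on the prefix no longer count against it.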


[incubator-annotator] 06/14: This is what do–while was invented for :)

Posted by ge...@apache.org.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 6bab278edede89032e958df3477aeaec9f1ef328
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 12:03:45 2020 +0100

    This is what do–while was invented for :)
---
 packages/dom/src/text-quote/match.ts | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/packages/dom/src/text-quote/match.ts b/packages/dom/src/text-quote/match.ts
index dea1f68..dd69227 100644
--- a/packages/dom/src/text-quote/match.ts
+++ b/packages/dom/src/text-quote/match.ts
@@ -66,9 +66,9 @@ export function abstractTextQuoteSelectorMatcher(
     }
     let partialMatches: PartialMatch[] = [];
 
-    let chunk: TChunk | null;
     let isFirstChunk = true;
-    while (chunk = textChunks.currentChunk) {
+    do {
+      const chunk = textChunks.currentChunk;
       const chunkValue = chunk.data;
 
       // 1. Continue checking any partial matches from the previous chunk(s).
@@ -158,10 +158,7 @@ export function abstractTextQuoteSelectorMatcher(
         partialMatches.push(partialMatch);
       }
 
-      if (textChunks.nextChunk() === null)
-        break;
-
       isFirstChunk = false;
-    }
+    } while (textChunks.nextChunk() !== null);
   };
 }
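
The loop restructured above ends each iteration by checking whether the current chunk ends with the beginning of the search pattern, so that the comparison can continue in the next chunk. That step can be sketched in isolation on plain strings. The function name `findPartialMatchStarts` is hypothetical; the body follows the logic of step 3 of the matcher, and it compares UTF-16 code units just as string indexing does.

```typescript
// Returns the indices in chunkValue at which a partial match begins, that
// is, every index i such that chunkValue.substring(i) is a proper, non-empty
// prefix of searchPattern.
function findPartialMatchStarts(
  chunkValue: string,
  searchPattern: string,
): number[] {
  let starts: number[] = [];
  // A partial match must be shorter than the pattern, so it cannot begin
  // earlier than this index.
  const searchStartPoint = Math.max(
    chunkValue.length - searchPattern.length + 1,
    0,
  );
  for (let i = searchStartPoint; i < chunkValue.length; i++) {
    const character = chunkValue[i];
    // Drop candidates that stop matching at this character.
    starts = starts.filter(
      (start) => character === searchPattern[i - start],
    );
    // Any occurrence of the pattern's first character opens a new candidate.
    if (character === searchPattern[0]) starts.push(i);
  }
  return starts;
}

// 'banana' ends with 'na', which is how 'nanny' begins.
const bananaStarts = findPartialMatchStarts('banana', 'nanny');
// 'aaa' ends with both 'aa' and 'a', which are both prefixes of 'aab'.
const overlappingStarts = findPartialMatchStarts('aaa', 'aab');
```

Several candidates can survive at once, as the second call shows, which is presumably why the matcher keeps a list of partial matches rather than a single one.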


[incubator-annotator] 14/14: Rename chunker.ts→text-node-chunker.ts

Posted by ge...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 2ff0e4edd7f49a652251f5c7592e06f7029a9382
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 16:44:18 2020 +0100

    Rename chunker.ts→text-node-chunker.ts
---
 packages/dom/src/{chunker.ts => text-node-chunker.ts} | 0
 packages/dom/src/text-position/describe.ts            | 2 +-
 packages/dom/src/text-position/match.ts               | 2 +-
 packages/dom/src/text-quote/describe.ts               | 2 +-
 packages/dom/src/text-quote/match.ts                  | 2 +-
 5 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/packages/dom/src/chunker.ts b/packages/dom/src/text-node-chunker.ts
similarity index 100%
rename from packages/dom/src/chunker.ts
rename to packages/dom/src/text-node-chunker.ts
diff --git a/packages/dom/src/text-position/describe.ts b/packages/dom/src/text-position/describe.ts
index 992a633..3183983 100644
--- a/packages/dom/src/text-position/describe.ts
+++ b/packages/dom/src/text-position/describe.ts
@@ -20,7 +20,7 @@
 
 import type { TextPositionSelector } from '@annotator/selector';
 import { describeTextPosition as abstractDescribeTextPosition } from '@annotator/selector';
-import { TextNodeChunker } from '../chunker';
+import { TextNodeChunker } from '../text-node-chunker';
 import { ownerDocument } from '../owner-document';
 
 export async function describeTextPosition(
diff --git a/packages/dom/src/text-position/match.ts b/packages/dom/src/text-position/match.ts
index 3dccede..aab4c79 100644
--- a/packages/dom/src/text-position/match.ts
+++ b/packages/dom/src/text-position/match.ts
@@ -20,7 +20,7 @@
 
 import type { Matcher, TextPositionSelector } from '@annotator/selector';
 import { textPositionSelectorMatcher as abstractTextPositionSelectorMatcher } from '@annotator/selector';
-import { TextNodeChunker } from '../chunker';
+import { TextNodeChunker } from '../text-node-chunker';
 
 export function createTextPositionSelectorMatcher(
   selector: TextPositionSelector,
diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 3ca3f0b..45b8adc 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -20,7 +20,7 @@
 
 import type { TextQuoteSelector } from '@annotator/selector';
 import { describeTextQuote as abstractDescribeTextQuote } from '@annotator/selector';
-import { TextNodeChunker } from '../chunker';
+import { TextNodeChunker } from '../text-node-chunker';
 import { ownerDocument } from '../owner-document';
 
 export async function describeTextQuote(
diff --git a/packages/dom/src/text-quote/match.ts b/packages/dom/src/text-quote/match.ts
index 0c16b8a..8e74c2e 100644
--- a/packages/dom/src/text-quote/match.ts
+++ b/packages/dom/src/text-quote/match.ts
@@ -20,7 +20,7 @@
 
 import type { Matcher, TextQuoteSelector } from '@annotator/selector';
 import { textQuoteSelectorMatcher as abstractTextQuoteSelectorMatcher } from '@annotator/selector';
-import { TextNodeChunker, EmptyScopeError } from '../chunker';
+import { TextNodeChunker, EmptyScopeError } from '../text-node-chunker';
 
 export function createTextQuoteSelectorMatcher(
   selector: TextQuoteSelector,


[incubator-annotator] 10/14: tweak comments

Posted by ge...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gerben pushed a commit to branch import-dom-seek
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit f15fe3447da6bae21504ad778d5698f2d9839319
Author: Gerben <ge...@treora.com>
AuthorDate: Fri Nov 20 13:06:30 2020 +0100

    tweak comments
---
 packages/dom/src/text-quote/describe.ts | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/packages/dom/src/text-quote/describe.ts b/packages/dom/src/text-quote/describe.ts
index 81cc4fa..756df1e 100644
--- a/packages/dom/src/text-quote/describe.ts
+++ b/packages/dom/src/text-quote/describe.ts
@@ -72,19 +72,21 @@ async function abstractDescribeTextQuote<TChunk extends Chunk<string>>(
     const matches = abstractTextQuoteSelectorMatcher(tentativeSelector)(scope());
     let nextMatch = await matches.next();
 
+    // If this match is the intended one, no need to act.
     // XXX This test is fragile: nextMatch and target are assumed to be normalised.
     if (!nextMatch.done && chunkRangeEquals(nextMatch.value, target)) {
-      // This match is the intended one, ignore it.
       nextMatch = await matches.next();
     }
 
     // If there are no more unintended matches, our selector is unambiguous!
     if (nextMatch.done) return tentativeSelector;
 
-    // A subsequent search could safely skip the part we already processed,
-    // we’d need the matcher to start at the seeker’s position, instead of
-    // searching in the whole current chunk.
-    // seeker.seekBy(-prefix.length + 1);
+    // Possible optimisation: A subsequent search could safely skip the part we
+    // already processed, instead of starting from the beginning again. But we’d
+    // need the matcher to start at the seeker’s position, instead of searching
+    // in the whole current chunk. Then we could just seek back to just after
+    // the start of the prefix: seeker.seekBy(-prefix.length + 1); (don’t forget
+    // to also correct for any changes in the prefix we will make below)
 
     // We’ll have to add more prefix/suffix to disqualify this unintended match.
     const unintendedMatch = nextMatch.value;