Update splittext plan with CSS, a11y, SEO, BiDi, nesting, and line detection amendments by tombigel · Pull Request #132 · wix/interact

tombigel · 2026-02-23T12:12:19Z

Add base CSS strategy (injectStyles, space handling, direction detection)
Simplify ARIA to container-level only
Add SEO strategy with preserveText visually-hidden duplicate
Add BiDi/shaping injection options (bidiResolver, shaper)
Add nested option (flatten/preserve/depth) for DOM structure control
Make line detection opt-in and note binary-search algorithm improvement
Add Safari whitespace normalization test results and Playwright test case

Copilot

Pull request overview

Updates the SplitText package implementation plan to incorporate new strategies for base CSS injection, accessibility/SEO handling, BiDi/shaping extensibility, nested DOM handling, and opt-in line detection (including Safari/WebKit whitespace quirks and a Playwright test case).

Changes:

Makes line detection explicitly opt-in and documents a more efficient binary-search-based approach.
Adds plan-level API/options for CSS injection, preserveText for SEO/a11y, nested DOM handling, and BiDi/shaping injection points.
Expands testing/documentation sections (Safari whitespace normalization notes + Playwright test case, wrapper/CSS guidance).

Comments suppressed due to low confidence (1)

.cursor/plans/text_splitter_package_1eeee927.plan.md:850

SplitTextResult is typed as returning HTMLSpanElement[], but the implementation sketch (cache + getters + _performSplit) uses HTMLElement[]. This should be consistent (ideally HTMLSpanElement[] everywhere, since wrappers are spans), otherwise the public API typing and internals will diverge.

  private _cache: {
    chars?: HTMLElement[];
    words?: HTMLElement[];
    lines?: HTMLElement[];
    sentences?: HTMLElement[];
  } = {};

  constructor(element: HTMLElement, options?: SplitTextOptions) {
    this._element = element;
    this._originalHTML = element.innerHTML;

    // Eager split if type is provided; track whether 'lines' was requested (lines are opt-in and expensive)
    this._linesRequested = false;
    if (options?.type) {
      const types = Array.isArray(options.type) ? options.type : [options.type];
      this._linesRequested = types.includes('lines');
      for (const type of types) {
        this._performSplit(type);
      }
    }
  }

  // Lazy getter - split on first access, return cached thereafter
  get chars(): HTMLElement[] {
    if (!this._cache.chars) {
      this._cache.chars = this._performSplit('chars');
    }
    return this._cache.chars;
  }

  get words(): HTMLElement[] {
    if (!this._cache.words) {
      this._cache.words = this._performSplit('words');
    }
    return this._cache.words;
  }

  get lines(): HTMLElement[] {
    if (!this._linesRequested) return []; // or throw with message: "Lines not requested; pass type: 'lines' or type: [..., 'lines']"
    if (!this._cache.lines) {
      this._cache.lines = this._performSplit('lines');
    }
    return this._cache.lines;
  }

  get sentences(): HTMLElement[] {
    if (!this._cache.sentences) {
      this._cache.sentences = this._performSplit('sentences');
    }
    return this._cache.sentences;
  }

  private _performSplit(type: 'chars' | 'words' | 'lines' | 'sentences'): HTMLElement[] {
    // Actual splitting logic - creates wrapper elements in DOM
    // Returns array of created HTMLElements
  }

.cursor/plans/text_splitter_package_1eeee927.plan.md

Co-authored-by: Cursor <cursoragent@cursor.com>

…iv, HTMLSpanElement types, doc numbering Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 20 comments.

.cursor/plans/text_splitter_package_1eeee927.plan.md

…requirements
- Fix revert() prose, test locator, _handleResize options, remove splitBy/preserveWhitespace
- Add Intl.Segmenter browser requirement + polyfill note
- BiDi: external plugin API; remove shaper; shaped-languages note
- Line detection: binary search pseudocode, naive example, heightTracker fix
- Nested: clarify flatten deeper than N with example
- Minor: aria inner div, line detection string[] note, numbered lists, Prettier
Co-authored-by: Cursor <cursoragent@cursor.com>

Co-authored-by: Cursor <cursoragent@cursor.com>

.cursor/plans/text_splitter_package_1eeee927.plan.md

- Remove lines opt-in special-casing; .lines is now lazy like all other getters
- Add display: contents to aria-hidden wrapper div to prevent layout breakage
- Remove redundant .split-space class (base CSS handles space preservation)
- Assert withoutNorm in Safari whitespace test
- Add Segmenter Polyfill API section with contract and compatible polyfills
- Trim repetitive integration examples and redundant code subsections
- Tag all code snippets with target file paths
Made-with: Cursor

ydaniv

Reviewed until Line Detection Algorithm

ydaniv · 2026-03-04T10:41:48Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

+  // Base CSS (inline-block, white-space, etc.)
+  injectStyles?: boolean;  // default: true - auto-inject minimal base stylesheet (deduplicated via data-splittext)


Why do we need this exposed? What happens if it's false?

Then we don't add the base CSS. Not a must, but a nice option if the styles are bundled somewhere else.

Let's clarify this feature a bit:

We don't want to specify this per split. This is a global setting.

We can generate the global CSS and inject it via adoptedStyleSheets, since we operate in the client anyway.

If we're adding these styles globally it's enough for now, and we don't need the implementation to inject inline styles.
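A minimal sketch of the global injection described above, assuming a single shared constructable stylesheet that is adopted at most once per root. The names `BASE_CSS`, `ensureBaseStyles`, `SheetLike`, `StyleRoot`, and the `.split-char`/`.split-word` selectors are hypothetical and not taken from the plan; the structural interfaces only exist so the sketch can run outside a browser.

```typescript
// Hypothetical base rules; the real selectors and declarations are up to the plan.
const BASE_CSS = '.split-char, .split-word { display: inline-block; }';

// Structural subsets of CSSStyleSheet and Document/ShadowRoot (assumed shapes).
interface SheetLike {
  replaceSync(css: string): void;
}
interface StyleRoot {
  adoptedStyleSheets: SheetLike[];
}

// One sheet shared by every root that adopts the base styles.
let sharedSheet: SheetLike | undefined;

function ensureBaseStyles(root: StyleRoot, createSheet: () => SheetLike): SheetLike {
  if (!sharedSheet) {
    sharedSheet = createSheet();
    sharedSheet.replaceSync(BASE_CSS);
  }
  // Adopt only if this root does not already hold the shared sheet.
  if (root.adoptedStyleSheets.indexOf(sharedSheet) === -1) {
    // Reassign instead of mutating, so existing sheets are kept and the setter is used.
    root.adoptedStyleSheets = [...root.adoptedStyleSheets, sharedSheet];
  }
  return sharedSheet;
}

// In a browser this would be called once, for example:
//   ensureBaseStyles(document, () => new CSSStyleSheet());
```

Because the call is idempotent, it can run on every split without a per-split option and without inline styles.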

ydaniv · 2026-03-04T10:45:24Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

+  injectStyles?: boolean;  // default: true - auto-inject minimal base stylesheet (deduplicated via data-splittext)
+
+  // DOM structure
+  nested?: 'flatten' | 'preserve' | number;  // default: 'flatten'


Why do we need this exposed? What are the use-cases for flattening or providing a number?

If I remember correctly, your original plan is set to preserve the HTML structure. That is not always what we want; sometimes we just want the textContent of a styled text section, hence "flatten". The "number" option is a suggestion from Claude that I adopted to allow setting nesting levels: if I know my structure and all I want to preserve are top-level bold and italic, I can set this to 1.
Again, not a must, but a nice advanced feature.

But when do we need this "flatten"? Do we have a real use-case?
Do we have a real use-case for "number"?

Use case for flatten? Of course.
First of all, this is de facto what everybody else is doing; we should at least have the option.
But generally, for HTML structures generated by text editors, splitting can get messy, especially word and line: you can get to a point where you need two passes or look-ahead... I don't remember, I did it too many years ago (Think of a structure like  some text then some other text), and that's before we get to <ul><li> markers and counters. We might decide we also don't want to deal with it.

As for the number option, it's a cool feature I can see value in, but not a must.
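To make the three `nested` modes concrete, here is a rough sketch over a simplified node model rather than real DOM nodes. It assumes that a number N keeps elements up to depth N and collapses anything deeper into plain text; `NodeLike`, `textOf`, and `applyNested` are hypothetical names used only for illustration.

```typescript
type TextNodeLike = { kind: 'text'; value: string };
type ElementLike = { kind: 'element'; tag: string; children: NodeLike[] };
type NodeLike = TextNodeLike | ElementLike;
type Nested = 'flatten' | 'preserve' | number;

// Concatenated text of a node, comparable to textContent.
function textOf(node: NodeLike): string {
  return node.kind === 'text' ? node.value : node.children.map(textOf).join('');
}

// Keeps elements down to the allowed depth and collapses deeper ones into text.
// Top-level children are at depth 1. Adjacent text nodes are not merged here.
function applyNested(children: NodeLike[], nested: Nested, depth = 1): NodeLike[] {
  const maxDepth = nested === 'flatten' ? 0 : nested === 'preserve' ? Infinity : nested;
  return children.map((child): NodeLike => {
    if (child.kind === 'text') return child;
    if (depth > maxDepth) return { kind: 'text', value: textOf(child) };
    return { ...child, children: applyNested(child.children, nested, depth + 1) };
  });
}
```

With `nested: 1`, a top-level bold element survives while an italic element inside it is reduced to its text, which matches the "preserve only top-level bold and italic" case mentioned in the thread.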

ydaniv · 2026-03-04T10:47:22Z

.cursor/plans/text_splitter_package_1eeee927.plan.md


 ```css
-/* Example: Typewriter effect */
+/* Typewriter effect using staggered animation-delay via --index custom property */


And how/where is that --index property applied?

Mm, nice catch, I missed this example.
It's a good feature though, what do you say? We tell the splitter to add hardcoded --letter-index, --word-index, etc. to each element on split?

Yes, if we could apply --index per part and use that in the effect it would be nice. But these things are hard to manage, so need to see what we end up with and see if it makes sense.
Maybe something like --c-idx, --w-idx, etc.

Done. Used the long form (--letter-index) and also added another prop, partIndex, so this can be disabled if needed for some reason.
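For reference, a sketch of how per-part index properties could be written during a split. The property names follow the `--char-index`/`--word-index`/`--line-index`/`--sentence-index` form from the later commit message (the thread also mentions `--letter-index`), and `applyPartIndex`, `StylableLike`, and the `enabled` flag are hypothetical stand-ins for whatever the option ends up being called.

```typescript
// Structural subset of HTMLElement, so the sketch can run without a DOM.
interface StylableLike {
  style: { setProperty(name: string, value: string): void };
}

type PartType = 'char' | 'word' | 'line' | 'sentence';

// Writes a zero-based index custom property on each part, unless disabled.
function applyPartIndex(parts: StylableLike[], type: PartType, enabled = true): void {
  if (!enabled) return;
  parts.forEach((part, index) => {
    part.style.setProperty(`--${type}-index`, String(index));
  });
}
```

A stylesheet could then stagger an effect with something like `animation-delay: calc(var(--char-index) * 50ms)`, which is what the typewriter example relies on.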

.cursor/plans/text_splitter_package_1eeee927.plan.md

ydaniv · 2026-03-04T10:53:16Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

+### Base CSS Strategy

-For transforms to work correctly on spans, `display: inline-block` is often required. Users can set this via `wrapperStyle`:
+When `injectStyles` is true (default), the package injects a minimal base stylesheet once per document via a `<style data-splittext>` tag (deduplicated). This ensures transforms and spacing work without requiring users to add CSS manually.


Is this a static file or generated?
Also, better to not assume this implementation. Could be different depending on use-case.

The base should be static, I guess. I agree that the implementation is described in too much detail. Will change.

.cursor/plans/text_splitter_package_1eeee927.plan.md

ydaniv · 2026-03-11T08:14:29Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

+### Unicode/Emoji Handling & Text Segmentation

-Use `Intl.Segmenter` for proper character segmentation (with fallback for older browsers):
+Use `Intl.Segmenter` (native or via the `segmenter` option — see **Segmenter Polyfill API**) for all text segmentation — characters, words, and sentences. This provides locale-aware splitting that correctly handles emoji, multi-codepoint grapheme clusters, CJK text without spaces, and language-specific word/sentence boundaries. Because `Intl.Segmenter` handles these concerns natively, custom `splitBy` or whitespace-handling options are unnecessary.


This is a lot of cruft; the output is very verbose and not necessary.
Please revert, unless you can point to anything that's really needed.

You didn't change anything.
Anyway the list below is also not ideal, we can probably revise that as well.

done for realz

ydaniv · 2026-03-11T08:16:17Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

-const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
-const chars = [...segmenter.segment(text)].map((s) => s.segment);
+// Characters (grapheme clusters — handles emoji, combining marks, etc.)
+const charSegmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
+const chars = [...charSegmenter.segment(text)].map((s) => s.segment);
+
+// Words (locale-aware — works for CJK, languages without spaces, etc.)
+const wordSegmenter = new Intl.Segmenter('en', { granularity: 'word' });
+const words = [...wordSegmenter.segment(text)].filter((s) => s.isWordLike).map((s) => s.segment);
+
+// Sentences
+const sentenceSegmenter = new Intl.Segmenter('en', { granularity: 'sentence' });
+const sentences = [...sentenceSegmenter.segment(text)].map((s) => s.segment);


Not necessary. Basically the LLM knows how to use the API.

You didn't change anything here

really done this time

.cursor/plans/text_splitter_package_1eeee927.plan.md

…, fix gaps
- Remove untested binary search line detection; restore Ben Nadel's proven getClientRects() approach as primary algorithm
- Remove redundant cross-reference line, stale binary-search perf note
- Simplify Base CSS section (static predefined stylesheet, less prescriptive)
- Add .sr-only to base CSS required styles
- Fix broken markdown formatting on SplitTextResultImpl heading
- Remove orphaned masking todo (no plan section existed)
- Remove redundant range.selectNodeContents line in height-tracking code
- Fix markdown lint warnings (double blank lines)
Made-with: Cursor

ydaniv · 2026-03-12T10:32:45Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

-  return lines;
-}
-```
+Instead of tracking rect count, track `getBoundingClientRect().height` on a range anchored at the text node start. When height increases, a new line has been reached. Same O(n) iteration but uses a single bounding rect per step instead of a rect array, which may be cheaper for long text.


I think this doesn't make sense, but also the whole algo above seems very inefficient. Increasing the range node by node and calling getBoundingClientRect() each time.

Mm, it added that without me asking; I missed it.

ydaniv · 2026-03-12T10:34:13Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

+1. **No pre-wrapping required** — detect lines from original text nodes
+2. **Accurate to browser rendering** — uses actual layout, not approximated positions
+3. **Measure first, wrap second** — original text stays intact during detection


You can revert this

ydaniv · 2026-03-12T10:35:56Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

-**Primary Approach: Range API with `getClientRects()`**
-
-Use the DOM Range API to detect line breaks from text nodes _before_ creating wrapper elements. This avoids unnecessary DOM manipulation and provides accurate line detection based on the browser's actual rendering:
-
-```typescript
-function detectLines(textNode: Text): string[] {
-  const range = document.createRange();
-  const text = textNode.textContent || '';
-  const lines: string[][] = [];
-  let lineChars: string[] = [];
-
-  // Normalize whitespace (Safari compatibility)
-  textNode.textContent = text.trim().replace(/\s+/g, ' ');
+Line detection is lazy like all other split types — it runs on first access of `.lines` (or eagerly if `type` includes `'lines'`). It uses the DOM `Range` API to detect line breaks from text nodes *before* DOM manipulation, avoiding unnecessary wrapper creation during measurement.

-  for (let i = 0; i < text.length; i++) {
-    range.setStart(textNode, 0);
-    range.setEnd(textNode, i + 1);
+**Primary approach: Range API with `getClientRects()`** → `src/lineDetection.ts`

-    // getClientRects() returns one rect per rendered line
-    const lineIndex = range.getClientRects().length - 1;
+Incrementally expand a `Range` one character at a time through the text node. At each position, `getClientRects()` returns one rect per visual line the range spans — so `getClientRects().length - 1` gives the line index of the last character. Group characters by their line index to extract rendered lines. (Technique from Ben Nadel, blog #4310; tested across Chrome, Firefox, Edge, Safari.)

-    if (!lines[lineIndex]) {
-      lines.push((lineChars = []));
-    }
-    lineChars.push(text.charAt(i));
-  }
-
-  return lines.map((chars) => chars.join('').trim());
-}
-```
+Whitespace must be normalized before detection (`textNode.textContent = text.trim().replace(/\s+/g, ' ')`) — Safari returns rects based on markup structure rather than rendered layout when raw whitespace is present (see Browser Compatibility section).


Please revert these, I think the original text with the algo itself is more explicit and clear
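For readers following the thread, the grouping step of the removed `detectLines()` can be sketched separately from the layout query. This is an illustrative variant, not the plan's code: `groupByLine` and `lineIndexAt` are hypothetical names, and the callback stands in for the `range.getClientRects().length - 1` measurement so the logic can run without a DOM.

```typescript
// Groups the characters of `text` into lines, given a function that reports
// the zero-based visual line index of the character at each position.
function groupByLine(text: string, lineIndexAt: (charIndex: number) => number): string[] {
  const lines: string[][] = [];
  for (let i = 0; i < text.length; i++) {
    const lineIndex = lineIndexAt(i);
    if (!lines[lineIndex]) lines[lineIndex] = [];
    lines[lineIndex].push(text.charAt(i));
  }
  return lines.map((chars) => chars.join('').trim());
}

// In a browser, assuming `textNode` already holds the whitespace-normalized text:
//   const range = document.createRange();
//   const lines = groupByLine(textNode.textContent ?? '', (i) => {
//     range.setStart(textNode, 0);
//     range.setEnd(textNode, i + 1);
//     return range.getClientRects().length - 1;
//   });
```

The browser callback still performs one layout query per character, so it keeps the cost raised elsewhere in this review. The text passed in should be the normalized string that is actually in the node, otherwise offsets and characters can drift apart.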

ydaniv · 2026-03-12T10:50:38Z

.cursor/plans/text_splitter_package_1eeee927.plan.md

-// Example 1: Lazy evaluation - no splitting happens yet
-const result = splitText('.headline');
-
-// Splitting happens on first access, result is cached
-const chars = result.chars; // Splits into chars NOW, caches result
-const chars2 = result.chars; // Returns cached result (no re-split)
-
-// Lines are split separately when accessed
-const lines = result.lines; // Splits into lines NOW, caches result
-
-// Example 2: Eager split with type option
-const eagerResult = splitText('.headline', { type: 'words' });
-// Words are split immediately on invocation
-
-// Other types still use lazy evaluation
-const lines2 = eagerResult.lines; // Splits into lines on access
-
-// Example 3: Multiple types eager
-const multiResult = splitText('.headline', { type: ['chars', 'words'] });
-// Both chars and words split immediately
-
-// Example 4: With animation library


This is the part that explains the lazy/eager splitting. You removed it without reading and the LLM missed it.

Restored the relevant examples (I think).
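The lazy/eager behavior those examples describe can be modeled without a DOM. The sketch below is a simplified stand-in that splits strings instead of creating wrapper elements; `LazySplit`, `resolve`, and `splitCount` are hypothetical names, and the string splitting is only a placeholder for the real segmentation.

```typescript
type SplitType = 'chars' | 'words';

class LazySplit {
  // Counts how many real splits happened, to make caching observable.
  splitCount = 0;
  private cache: Partial<Record<SplitType, string[]>> = {};

  constructor(private text: string, options?: { type?: SplitType | SplitType[] }) {
    const requested = options?.type;
    if (requested) {
      // Eager path: split the requested types now, through the same cache.
      const types = Array.isArray(requested) ? requested : [requested];
      for (const type of types) this.resolve(type);
    }
  }

  get chars(): string[] {
    return this.resolve('chars');
  }

  get words(): string[] {
    return this.resolve('words');
  }

  // Splits on first request for a type and returns the cached array afterwards.
  private resolve(type: SplitType): string[] {
    let parts = this.cache[type];
    if (!parts) {
      this.splitCount++;
      parts = type === 'chars' ? this.text.split('') : this.text.split(/\s+/).filter(Boolean);
      this.cache[type] = parts;
    }
    return parts;
  }
}
```

Constructing without `type` does no work until a getter is read; passing `type` does the work up front, and types that were not requested stay lazy.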

…code, trim verbosity
- Remove injectStyles from per-split options; CSS now injected globally via adoptedStyleSheets
- Add partIndexing option with --char-index/--word-index/--line-index/--sentence-index CSS custom properties
- Restore original detectLines() function code (reverts compact description per reviewer request)
- Remove height-tracking alternative from line detection section
- Trim verbose Intl.Segmenter section and remove redundant code examples
- Restore lazy/eager integration examples that explain core API behavior
- Update doc references from data-index to CSS custom property indexing
Made-with: Cursor

tombigel requested review from Copilot and ydaniv February 23, 2026 12:13

Copilot started reviewing on behalf of tombigel February 23, 2026 12:13

Copilot AI reviewed Feb 23, 2026

tombigel and others added 2 commits February 23, 2026 14:18

Merge master: resolve text splitter plan conflicts (accept ours)

6671091

Co-authored-by: Cursor <cursoragent@cursor.com>

Address PR Copilot feedback: .lines returns [], aria-hidden wrapper d…

65d878f

…iv, HTMLSpanElement types, doc numbering Co-authored-by: Cursor <cursoragent@cursor.com>

tombigel requested a review from Copilot February 23, 2026 13:31

Copilot started reviewing on behalf of tombigel February 23, 2026 13:31

Copilot AI reviewed Feb 23, 2026

tombigel and others added 2 commits February 24, 2026 11:39

Remove bold around code for Prettier compatibility

0cc3473

Co-authored-by: Cursor <cursoragent@cursor.com>

tombigel requested a review from ameerabuf February 27, 2026 18:28

Merge branch 'master' into tombigel/splittext-plan-revisions

7eeade4

ameerabuf reviewed Mar 2, 2026

ydaniv requested changes Mar 3, 2026

tombigel requested review from ameerabuf and ydaniv March 3, 2026 14:35

ydaniv requested changes Mar 4, 2026

ydaniv requested changes Mar 11, 2026

tombigel requested a review from ydaniv March 11, 2026 14:29

tombigel added 3 commits March 11, 2026 16:37

compact the explenation about the safari whitespaces quirk

bc1a2cc

readded fallback line

9a32bd3

Merge branch 'master' into tombigel/splittext-plan-revisions

3d4536b

ydaniv requested changes Mar 12, 2026

tombigel requested a review from ydaniv March 12, 2026 12:35
