|
| 1 | +# Emoji Corruption Issue - Root Cause Analysis |
| 2 | + |
| 3 | +## Summary |
| 4 | + |
| 5 | +The emoji corruption issue (`🧩` → `ðŸ§©`) occurs during the transmission of data from the plugin's JavaScript context to the WebView's JavaScript context via `HTMLView.runJavaScript`. Once corrupted data is stored in the WebView, it propagates through all subsequent data operations, creating a self-perpetuating cycle of corruption. |
| 6 | + |
| 7 | +## Evidence from Logs |
| 8 | + |
| 9 | +### 1. Data Generation is Correct ✅ |
| 10 | + |
| 11 | +All logs from the plugin's data generation pipeline show the emoji is **correct**: |
| 12 | +- `makeDashboardParas`: `"Dashboard Plugin 🧩"` (charCodes=55358,56809) ✅ |
| 13 | +- `createSectionOpenItemsFromParas`: `"Dashboard Plugin 🧩"` (charCodes=55358,56809) ✅ |
| 14 | +- `getTodaySectionData`: `"Dashboard Plugin 🧩"` (charCodes=55358,56809) ✅ |
| 15 | +- `getSomeSectionsData`: `"Dashboard Plugin 🧩"` (charCodes=55358,56809) ✅ |
| 16 | + |
| 17 | +**Conclusion**: The plugin's JavaScript context correctly handles Unicode emojis as UTF-16 surrogate pairs. |
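
The `charCodes` values quoted in these logs are UTF-16 code units. As a minimal sketch of how such a value can be produced (the `charCodes` helper below is a hypothetical name for illustration, not the plugin's actual logging code), the puzzle-piece emoji yields exactly the pair shown above:

```javascript
// Hypothetical helper, not taken from the plugin source: returns the UTF-16
// code units of a string, i.e. the kind of values the logs report as "charCodes".
function charCodes(str) {
  const codes = [];
  for (let i = 0; i < str.length; i++) {
    codes.push(str.charCodeAt(i));
  }
  return codes;
}

// The puzzle-piece emoji, written as an escape so the example itself cannot
// be damaged by an encoding problem.
const piece = '\u{1F9E9}';

console.log(charCodes(piece)); // 55358 and 56809 (0xD83E, 0xDDE9)
console.log(piece.length); // 2, because one emoji occupies two code units
console.log(piece.codePointAt(0).toString(16)); // 1f9e9
```

Any log line showing two code units in the 0xD800-0xDFFF range for this emoji therefore indicates an intact surrogate pair.
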
| 18 | + |
| 19 | +### 2. Corruption Occurs During Transmission ❌ |
| 20 | + |
| 21 | +The first appearance of corrupted data is in the WebView's JavaScript context: |
| 22 | +- `Root/onMessageReceived`: `"Dashboard Plugin ðŸ§©"` (charCodes=240,376,167,169) ❌ |
| 23 | + |
| 24 | +**Timeline**: |
| 25 | +1. Plugin generates correct data: `🧩` (UTF-16: 55358,56809) |
| 26 | +2. Plugin calls `sendToHTMLWindow` → `HTMLView.runJavaScript` |
| 27 | +3. JavaScript string is transmitted from plugin's JavaScriptCore to WebView's JavaScriptCore |
| 28 | +4. **Corruption occurs during this transmission** |
| 29 | +5. WebView receives corrupted data: `ðŸ§©` (UTF-8 bytes decoded as Windows-1252: 240,376,167,169) |
| 30 | + |
| 31 | +**Conclusion**: The corruption happens in the `HTMLView.runJavaScript` bridge mechanism itself. |
| 32 | + |
| 33 | +### 3. Corruption Propagates Through System 🔄 |
| 34 | + |
| 35 | +Once corrupted data is stored in the WebView's `globalSharedData`: |
| 36 | +- Every subsequent `getGlobalSharedData` call returns corrupted data |
| 37 | +- `setPluginData` reads corrupted data from WebView and merges it with new correct data |
| 38 | +- The merge operation (`mergeSections`) combines corrupted existing data with correct new data |
| 39 | +- The corrupted data "wins" because it's already in the WebView's storage |
| 40 | +- This creates a self-perpetuating cycle: corrupted data → stored in WebView → read back → merged with new data → sent again → corrupted again |
| 41 | + |
| 42 | +**Conclusion**: The corruption is not just a one-time issue, but a systemic problem that propagates through all data operations. |
| 43 | + |
| 44 | +## Root Cause |
| 45 | + |
| 46 | +The corruption occurs in the **`HTMLView.runJavaScript` bridge** when transmitting JavaScript strings from the plugin's JavaScriptCore environment to the WebView's JavaScriptCore environment. |
| 47 | + |
| 48 | +### Technical Details |
| 49 | + |
| 50 | +1. **Correct State (Plugin JavaScript Context)**: |
| 51 | + - Emoji: `🧩` |
| 52 | + - UTF-16 surrogate pair: `55358, 56809` (0xD83E, 0xDDE9) |
| 53 | + - JavaScript string representation: Correct UTF-16 |
| 54 | + |
| 55 | +2. **Transmission (HTMLView.runJavaScript)**: |
| 56 | + - The JavaScript string containing the JSON payload is transmitted |
| 57 | + - During transmission, the UTF-16 surrogate pair is incorrectly converted |
| 58 | + - The emoji's UTF-8 bytes are decoded with a single-byte encoding (the resulting code units point to Windows-1252, often loosely called Latin-1) |
| 59 | + |
| 60 | +3. **Corrupted State (WebView JavaScript Context)**: |
| 61 | + - Emoji: `ðŸ§©` |
| 62 | + - UTF-16 code units: `240, 376, 167, 169` (0x00F0, 0x0178, 0x00A7, 0x00A9) |
| 63 | + - These are the UTF-8 bytes `240, 159, 167, 169` (0xF0, 0x9F, 0xA7, 0xA9) decoded as Windows-1252, where byte 0x9F maps to `Ÿ` (U+0178 = 376) |
| 64 | + |
| 65 | +### Why This Happens |
| 66 | + |
| 67 | +The `HTMLView.runJavaScript` mechanism appears to: |
| 68 | +1. Convert the JavaScript string to a byte sequence (likely UTF-8) |
| 69 | +2. Transmit those bytes to the WebView |
| 70 | +3. Reconstruct the string in the WebView's JavaScript context |
| 71 | +4. **But somewhere in this process, the byte encoding is misinterpreted** |
| 72 | + |
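| 73 | +The UTF-8 bytes for the emoji (`240, 159, 167, 169`) are being decoded one byte per character, producing the code units `240, 376, 167, 169`, where: |
| 74 | +- `159` (0x9F) becomes `376` (0x178, `Ÿ`). This is the key corruption, and it identifies the decoder as Windows-1252: strict ISO-8859-1 (Latin-1) would leave the value at `159` |
| 75 | +- The other bytes (`240`, `167`, `169`) keep their numeric values, because Windows-1252 and Latin-1 agree outside the 0x80-0x9F range |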
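
The byte arithmetic above can be reproduced outside the bridge. The following is a sketch only: it assumes a runtime with a global `TextEncoder` (Node.js or a browser), and `decodeAsWindows1252`, `decodeAsLatin1` and `codeUnits` are hypothetical names introduced for illustration. It says nothing about how `HTMLView.runJavaScript` is implemented; it only shows that decoding the emoji's UTF-8 bytes as Windows-1252 gives the logged values, while strict Latin-1 does not.

```javascript
// Code points for Windows-1252 bytes 0x80-0x9F. Bytes that Windows-1252
// leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) fall back to the C1
// control with the same value, as the WHATWG Encoding Standard does.
const CP1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Decode bytes as Windows-1252: one character per byte, with the special
// mapping for 0x80-0x9F.
function decodeAsWindows1252(bytes) {
  let out = '';
  for (const b of bytes) {
    const codePoint = b >= 0x80 && b <= 0x9f ? CP1252_HIGH[b - 0x80] : b;
    out += String.fromCharCode(codePoint);
  }
  return out;
}

// Decode bytes as strict ISO-8859-1: every byte maps to the same code point.
function decodeAsLatin1(bytes) {
  let out = '';
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

// UTF-16 code units of a string, comparable to the "charCodes" in the logs.
function codeUnits(str) {
  const codes = [];
  for (let i = 0; i < str.length; i++) {
    codes.push(str.charCodeAt(i));
  }
  return codes;
}

const utf8Bytes = Array.from(new TextEncoder().encode('\u{1F9E9}'));
const viaCp1252 = codeUnits(decodeAsWindows1252(utf8Bytes));
const viaLatin1 = codeUnits(decodeAsLatin1(utf8Bytes));

console.log(utf8Bytes); // 240, 159, 167, 169
console.log(viaCp1252); // 240, 376, 167, 169 (matches the WebView logs)
console.log(viaLatin1); // 240, 159, 167, 169 (does not match the logs)
```

Only the Windows-1252 path turns `159` into `376`, which is why the logged code units are a useful fingerprint when reporting the bug.
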
| 76 | + |
| 77 | +## Impact |
| 78 | + |
| 79 | +1. **Initial Corruption**: First `sendToHTMLWindow` call corrupts the emoji during transmission |
| 80 | +2. **Storage**: Corrupted data is stored in WebView's `globalSharedData` |
| 81 | +3. **Propagation**: All subsequent operations read corrupted data and merge it with new correct data |
| 82 | +4. **Persistence**: The corruption persists across all data refresh cycles |
| 83 | + |
| 84 | +## Solution Required |
| 85 | + |
| 86 | +The fix must be implemented in the **`HTMLView.runJavaScript` bridge mechanism** (likely in Swift/Objective-C code that we don't have access to). The bridge needs to: |
| 87 | + |
| 88 | +1. Properly handle Unicode characters when transmitting JavaScript strings |
| 89 | +2. Ensure UTF-8 encoding is correctly interpreted on both sides |
| 90 | +3. Preserve UTF-16 surrogate pairs through the transmission |
| 91 | + |
| 92 | +## Workaround Attempts (Unsuccessful) |
| 93 | + |
| 94 | +We attempted several workarounds in JavaScript: |
| 95 | +- Manual Unicode escaping |
| 96 | +- Double `JSON.stringify` |
| 97 | +- Manual emoji encoding/decoding |
| 98 | + |
| 99 | +None of these worked because the corruption occurs **during the transmission itself**, before the JavaScript code in the WebView even executes. |
| 100 | + |
| 101 | +## Next Steps |
| 102 | + |
| 103 | +1. **Report to Eduard**: This is a bug in the `HTMLView.runJavaScript` mechanism that requires a fix in the native Swift/Objective-C code |
| 104 | +2. **Temporary Mitigation**: Consider avoiding emojis in note titles or implementing a workaround at the data source level (not recommended, as it limits functionality) |
| 105 | +3. **Investigation**: Eduard needs to investigate how `HTMLView.runJavaScript` handles Unicode strings when bridging between JavaScriptCore environments |
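
For the temporary mitigation, one option not among the attempts listed above would be to repair strings on the receiving side, after the corruption has already happened. The sketch below assumes the damage is exactly "UTF-8 bytes decoded as Windows-1252" and that the WebView provides `TextDecoder`; `repairMojibake` is a hypothetical name and the function has not been run against the real bridge.

```javascript
// Code points that Windows-1252 assigns to bytes 0x80-0x9F (unassigned
// bytes fall back to the C1 control with the same value).
const CP1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Hypothetical receive-side repair: map each character back to the
// Windows-1252 byte it came from, then decode those bytes as UTF-8.
// Returns the input unchanged if it cannot be a mojibake string.
function repairMojibake(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    const highIndex = CP1252_HIGH.indexOf(code);
    if (highIndex !== -1) {
      bytes[i] = 0x80 + highIndex;
    } else if (code <= 0xff) {
      bytes[i] = code;
    } else {
      // A character with no Windows-1252 byte: the string was not produced
      // by this kind of corruption, so leave it alone.
      return str;
    }
  }
  try {
    // fatal: true makes invalid UTF-8 throw instead of yielding U+FFFD.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (err) {
    return str;
  }
}

const corrupted = 'Dashboard Plugin \u00f0\u0178\u00a7\u00a9';
console.log(repairMojibake(corrupted) === 'Dashboard Plugin \u{1F9E9}'); // true
console.log(repairMojibake('caf\u00e9') === 'caf\u00e9'); // true, left untouched
```

A repair like this would have to run before the data is stored in `globalSharedData`, otherwise the propagation cycle described earlier continues. It treats a string as all-or-nothing, so a title mixing corrupted emoji with characters outside Windows-1252 would be left as received.
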
| 106 | + |
| 107 | +## Files Involved |
| 108 | + |
| 109 | +- `helpers/HTMLView.js` - Contains `sendToHTMLWindow` which calls `HTMLView.runJavaScript` |
| 110 | +- `np.Shared/src/react/Root.jsx` - Receives the corrupted data via `postMessage` |
| 111 | +- `jgclark.Dashboard/src/dashboardHelpers.js` - `setPluginData` and `mergeSections` propagate the corruption |
| 112 | +- `jgclark.Dashboard/src/refreshClickHandlers.js` - `refreshSomeSections` merges corrupted and correct data |
| 113 | + |
| 114 | +## Logging Evidence |
| 115 | + |
| 116 | +All logging shows: |
| 117 | +- ✅ Plugin context: Correct emoji (charCodes=55358,56809) |
| 118 | +- ❌ WebView context: Corrupted emoji (charCodes=240,376,167,169) |
| 119 | +- 🔄 Propagation: Corrupted data from WebView merges with correct new data |
| 120 | + |
| 121 | +The corruption is **definitively** occurring in the `HTMLView.runJavaScript` transmission bridge. |
| 122 | + |
| 123 | +--- |
| 124 | + |
| 125 | +## Succinct Version for Discord |
| 126 | + |
| 127 | +**Emoji Corruption Bug in `HTMLView.runJavaScript`** |
| 128 | + |
| 129 | +Emojis in note titles (e.g., `🧩`) are being corrupted (`ðŸ§©`) when transmitted from plugin JS to WebView JS via `HTMLView.runJavaScript`. |
| 130 | + |
| 131 | +**Evidence:** |
| 132 | +- Plugin side: Emoji is correct `🧩` (UTF-16: 55358,56809) |
| 133 | +- After `HTMLView.runJavaScript`: Corrupted `ðŸ§©` (charCodes: 240,376,167,169) |
| 134 | +- The UTF-8 bytes `240,159,167,169` are being decoded as Windows-1252, giving `240,376,167,169` (byte `159` → `Ÿ`, code point `376`) |
| 135 | + |
| 136 | +**Root Cause:** |
| 137 | +The corruption occurs in the native bridge code during string transmission between JavaScriptCore environments. The UTF-8 encoding is being incorrectly interpreted as a single-byte encoding (Windows-1252, often loosely called Latin-1). |
| 138 | + |
| 139 | +**Impact:** |
| 140 | +Once corrupted, the data propagates through all operations because it's stored in WebView's `globalSharedData` and merged with new correct data. |
| 141 | + |
| 142 | +**Fix Required:** |
| 143 | +The `HTMLView.runJavaScript` bridge needs to properly handle Unicode/UTF-8 encoding when transmitting JavaScript strings. This requires a fix in the native Swift/Objective-C code. |
| 144 | + |
| 145 | +**Logs show the corruption happens between:** |
| 146 | +- Plugin: `sendToHTMLWindow` → `HTMLView.runJavaScript` (correct) |
| 147 | +- WebView: `Root/onMessageReceived` (corrupted) |
| 148 | + |
| 149 | +JavaScript workarounds don't work because the corruption occurs during the native bridge transmission itself. |