Decode category/tag strings before grouping in report layout#85603
Transactions with the same category but different HTML encoding (e.g., `Uber &amp; car washes` vs `Uber & car washes`) were being grouped separately in the expense report view, despite displaying the same group name. This happened because the raw category string was used as the Map key for grouping, while the displayed name was HTML-decoded.

Now we decode category/tag strings before using them as grouping keys, matching how OldDot already handles this in lib_report.js.

Related Expensify/Expensify#612034

Co-authored-by: Lydia Barclay <lydiabarclay@users.noreply.github.com>
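As a rough illustration of the approach described above, the sketch below groups transactions by a decoded key. `htmlDecodeSketch`, `SketchTransaction`, and `groupByDecodedCategory` are hypothetical names made up for this example (the PR itself relies on `getDecodedCategoryName` and `Str.htmlDecode`), and only a handful of entities are handled.

```typescript
// Hypothetical stand-in for Str.htmlDecode from expensify-common; it only
// handles the few named entities needed for this illustration.
const ENTITIES: Record<string, string> = {
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#39;': "'",
};

function htmlDecodeSketch(value: string): string {
    return value.replace(/&(?:amp|lt|gt|quot|#39);/g, (entity) => ENTITIES[entity] ?? entity);
}

// Simplified transaction shape, assumed for the example.
type SketchTransaction = {transactionID: string; category: string};

function groupByDecodedCategory(transactions: SketchTransaction[]): Map<string, SketchTransaction[]> {
    const groups = new Map<string, SketchTransaction[]>();
    for (const transaction of transactions) {
        // Decode before keying so both encodings land in the same group.
        const key = htmlDecodeSketch(transaction.category);
        const group = groups.get(key) ?? [];
        group.push(transaction);
        groups.set(key, group);
    }
    return groups;
}

const sample: SketchTransaction[] = [
    {transactionID: '1', category: 'Uber &amp; car washes'},
    {transactionID: '2', category: 'Uber & car washes'},
];
const grouped = groupByDecodedCategory(sample);
console.log(grouped.size); // 1
```

Keying on the raw `category` string instead would produce two groups here, which matches the symptom reported in the linked issue.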
|
The failing Root cause: The Grails Maven repository ( Evidence: This PR only modifies TypeScript files ( Recommendation: Re-run the failed |
|
@abzokhattab Please copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button] |
  for (const transaction of transactions) {
      const tag = getTag(transaction);
-     const tagKey = isTagMissing(tag) ? '' : tag;
+     const tagKey = isTagMissing(tag) ? '' : Str.htmlDecode(tag);
❌ CONSISTENCY-3 (docs)
The category grouping uses getDecodedCategoryName(category) (a dedicated wrapper from CategoryUtils), but the tag grouping uses Str.htmlDecode(tag) directly. These perform the same operation (HTML-decoding a string before using it as a grouping key), but use inconsistent approaches. This inconsistency also introduces a new import {Str} from 'expensify-common' that would be unnecessary if a wrapper were used.
For consistency, create a getDecodedTagName helper in TagUtils.ts (mirroring getDecodedCategoryName in CategoryUtils.ts) and use it here instead of calling Str.htmlDecode directly:
// In src/libs/TagUtils.ts
function getDecodedTagName(tagName: string) {
    return Str.htmlDecode(tagName);
}

// In src/libs/ReportLayoutUtils.ts
const tagKey = isTagMissing(tag) ? '' : getDecodedTagName(tag);

This also allows the import {Str} from 'expensify-common' to be removed from ReportLayoutUtils.ts.
Please rate this suggestion with 👍 or 👎 to help us improve! Reactions are used to monitor reviewer efficiency.
|
Hey @lydiabarclay, I have been trying to execute the steps locally, but they are not clear enough. Is it easily reproducible without unit tests?
trjExpensify
left a comment
I agree @abzokhattab, the reproduction steps aren't clear on how to get to this state.
|
The failing Analysis: The Android Gradle build failed because the external repository Evidence: This PR only modifies The other failing check ( |
|
Hey @MelvinBot can you provide some more clear reproduction steps on how to reproduce this:
|
|
The encoding discrepancy comes from how the backend stores the
How the encoding mismatch happens
Policy categories are stored with HTML-encoded
Two code paths write
Both display identically in the UI (the display layer always HTML-decodes), but the raw strings differ, so the grouping function treated them as different categories.
Concrete QA reproduction steps
Setup:
Create the two differently-encoded expenses:
Observe the bug:
Simplest way to verify
If controlling MCC categorization is difficult, the linked issue references a real report (ID
|
@trjExpensify do those Melv-generated QA steps check out? |
|
@MelvinBot can't seem to reproduce following your steps.
2026-04-02_01-46-19.mp4
For good measure, I also created a distance expense on Classic and switched to NewDot to view it, and there is still no split of the groupings:
2026-04-02_01-50-17.mp4
Account: tom+ampcode1@trj.chat
|
|
Thanks for the thorough testing,
Why reproduction is difficult
After tracing the code paths, I found that on the client side, both MCC group category and distance default category are set from the same source: the HTML-encoded
The encoding mismatch in the original customer's data (
Since the backend PHP code (Web-Expensify) isn't available in this workspace, I can't pinpoint the exact server-side code path that produces the mismatch. But the existence of
The fix is still correct
The fix is defensive normalization: it decodes category strings before using them as grouping keys, which is exactly what OldDot already does with
Suggested verification approach
Instead of trying to create the mismatch from scratch, you could verify the fix against the original customer's data:
Alternatively, you could verify by directly inspecting the Onyx data in your test account's browser console to see if both transaction categories have identical encoding. If they do, that confirms the mismatch doesn't occur through the current client-side paths, but the fix still protects against backend-originated mismatches.
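One way to carry out the Onyx inspection suggested above is a small helper that reports names stored under more than one raw encoding. This is only a sketch: `decodeEntities` is a hypothetical stand-in for `Str.htmlDecode`, `findEncodingVariants` is not an app function, and the input is assumed to be a plain list of raw category strings copied from the transactions.

```typescript
// Hypothetical stand-in for Str.htmlDecode; handles only a few entities.
function decodeEntities(value: string): string {
    return value
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&#39;/g, "'")
        .replace(/&amp;/g, '&'); // decode &amp; last so "&amp;lt;" becomes "&lt;", not "<"
}

// Returns each decoded name that appears under more than one raw encoding,
// mapped to the sorted list of raw strings that produced it.
function findEncodingVariants(rawCategories: string[]): Record<string, string[]> {
    const rawByDecoded = new Map<string, Set<string>>();
    for (const raw of rawCategories) {
        const decoded = decodeEntities(raw);
        const rawForms = rawByDecoded.get(decoded) ?? new Set<string>();
        rawForms.add(raw);
        rawByDecoded.set(decoded, rawForms);
    }
    const result: Record<string, string[]> = {};
    rawByDecoded.forEach((rawForms, decoded) => {
        if (rawForms.size > 1) {
            result[decoded] = Array.from(rawForms).sort();
        }
    });
    return result;
}

const mismatches = findEncodingVariants(['One &amp; Two', 'One & Two', 'Travel', 'Travel']);
console.log(mismatches);
```

An empty result would suggest the account's data does not contain the mismatch, which is consistent with the failed reproduction attempts in this thread.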
|
@MelvinBot I can see in the customer account that two different expenses have a variance of the
So I think we need to try and isolate where and why the 2026-04-02_02-05-06.mp4 |
|
@abzokhattab I tried, but that's all I've got for now... It's not ideal to make changes when we can't reproduce how the situation is created in the first place. Do you have any other bright ideas? 🤷
|
Great find,
Root Cause: The client writes HTML-encoded category names to transactions
The chain:
Card transactions don't have this problem because the backend sets their categories server-side (via MCC auto-categorization), presumably using the decoded value.
This is NOT distance-specific
Manual category selection (
The PR fix is correct but partial
The
That said, the grouping fix in this PR is still valuable as a defensive measure and directly addresses the customer's symptom. Whether to also fix the write-side encoding is a separate decision.
|
Didn't have time to look into it. I am working on this today.
Reviewer Checklist
- I have verified the author checklist is complete (all boxes are checked off).
- I verified the correct issue is linked in the `### Fixed Issues` section above
- I verified testing steps are clear and they cover the changes made in this PR
    - I verified the steps for local testing are in the `Tests` section
    - I verified the steps for Staging and/or Production testing are in the `QA steps` section
    - I verified the steps cover any possible failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    - I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
- I checked that screenshots or videos are included for tests on all platforms
- I included screenshots or videos for tests on all platforms
- I verified that the composer does not automatically focus or open the keyboard on mobile unless explicitly intended. This includes checking that returning the app from the background does not unexpectedly open the keyboard.
- I verified tests pass on all platforms & I tested again on:
    - Android: HybridApp
    - Android: mWeb Chrome
    - iOS: HybridApp
    - iOS: mWeb Safari
    - MacOS: Chrome / Safari
- If there are any errors in the console that are unrelated to this PR, I either fixed them (preferred) or linked to where I reported them in Slack
- I verified there are no new alerts related to the `canBeMissing` param for `useOnyx`
- I verified proper code patterns were followed (see Reviewing the code)
    - I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. `toggleReport` and not `onIconClick`).
    - I verified that comments were added to code that is not self explanatory
    - I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    - I verified any copy / text shown in the product is localized by adding it to `src/languages/*` files and using the translation method
    - I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
    - I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
    - I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    - I verified the JSDocs style guidelines (in `STYLE.md`) were followed
- If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
- I verified that this PR follows the guidelines as stated in the Review Guidelines
- I verified other components that can be impacted by these changes have been tested, and I retested again (i.e. if the PR modifies a shared library or component like `Avatar`, I verified the components using `Avatar` have been tested & I retested again)
- I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
- I verified any variables that can be defined as constants (ie. in CONST.ts or at the top of the file that uses the constant) are defined as such
- If a new component is created I verified that:
    - A similar component doesn't exist in the codebase
    - All props are defined accurately and each prop has a `/** comment above it */`
    - The file is named correctly
    - The component has a clear name that is non-ambiguous and the purpose of the component can be inferred from the name alone
    - The only data being stored in the state is data necessary for rendering and nothing else
    - For Class Components, any internal methods passed to components event handlers are bound to `this` properly so there are no scoping issues (i.e. for `onClick={this.submit}` the method `this.submit` should be bound to `this` in the constructor)
    - Any internal methods bound to `this` are necessary to be bound (i.e. avoid `this.submit = this.submit.bind(this);` if `this.submit` is never passed to a component event handler like `onClick`)
    - All JSX used for rendering exists in the render method
    - The component has the minimum amount of code necessary for its purpose, and it is broken down into smaller components in order to separate concerns and functions
- If any new file was added I verified that:
    - The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
- If a new CSS style is added I verified that:
    - A similar style doesn't already exist
    - The style can't be created with an existing StyleUtils function (i.e. `StyleUtils.getBackgroundAndBorderStyle(theme.componentBG)`)
- If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
- If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like `Avatar` is modified, I verified that `Avatar` is working as expected in all cases)
- If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
- If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
- If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    - I verified that all the inputs inside a form are aligned with each other.
    - I added `Design` label and/or tagged `@Expensify/design` so the design team can review the changes.
- If a new page is added, I verified it's using the `ScrollView` component to make it scrollable when more elements are added to the page.
- For any bug fix or new feature in this PR, I verified that sufficient unit tests are included to prevent regressions in this flow.
- If the `main` branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the `Test` steps.
- I have checked off every checkbox in the PR reviewer checklist, including those that don't apply to this PR.
Screenshots/Videos
Android: HybridApp
Android: mWeb Chrome
iOS: HybridApp
iOS: mWeb Safari
MacOS: Chrome / Safari
sqlite> select transactionid, category, tag from transactions where transactionID in (7273776609767507467, 642949550980543372);
transactionID category tag
------------------- ------------- -------------
642949550980543372 One & Two uno & dos
7273776609767507467 One & Two uno & dos
After:
Screen.Recording.2026-05-18.at.01.06.44.mov
|
The failing Root cause: The external Maven repository Evidence: The error The other failing check ( Recommendation: Re-run the failed |
|
can you please check this comment @cead22 @lydiabarclay #85603 (comment) |
|
@MelvinBot can you adapt the changes requested here #85603 (comment) |
Addresses review feedback on PR #85603:
- cead22: tag column showed raw HTML-encoded value (e.g. `uno &amp; dos`) while the group-by-tag dropdown decoded it. TagCell now decodes the display value, mirroring how CategoryCell already handles categories.
- github-actions CONSISTENCY-3: extract `getDecodedTagName` into TagUtils (mirroring `getDecodedCategoryName` in CategoryUtils) and use it from ReportLayoutUtils instead of importing `Str` directly. Keeps the tag / category decoding pattern symmetric across the codebase.

Adds unit tests for `getDecodedTagName`.
|
Cherry-picked
All unit tests and typecheck pass. |
|
The failing Failure: Evidence: This PR only modifies |
|
@MelvinBot the grouping + tag column fix is solid, but while testing this branch I found the same encoding leaks into other tag display surfaces that this PR doesn't cover — most visibly the tag picker itself, which still renders Repro on this branch: in a report with a tag stored as
All of these go through
Proposed change
Fold the decode into
// src/libs/PolicyUtils.ts
/**
 * Cleans up escaping of colons (used to create multi-level tags, e.g. "Parent: Child")
 * and HTML-decodes the result so tags stored with encoded entities (e.g. `R&amp;D`) render as `R&D`.
 * Mirrors how `getDecodedCategoryName` normalizes category display in CategoryUtils.
 */
function getCleanedTagName(tag: string) {
    return Str.htmlDecode(tag?.replaceAll('\\:', CONST.COLON) ?? '');
}
Why it's safe
Every callsite I audited follows the
|
Add Str.htmlDecode to getCleanedTagName so all tag display surfaces (picker, workspace list, edit screens, search filters, group headers, modified-expense messages, sidebar previews) render decoded entities like `R&D` instead of raw `R&amp;D`. Mirrors how getDecodedCategoryName handles categories in CategoryUtils.

Co-authored-by: Abdelrahman Khattab <abzokhattab@users.noreply.github.com>
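To make the combined behaviour concrete, here is a minimal, self-contained sketch of a colon cleanup followed by an HTML decode. `decodeTagEntities` is a hypothetical stand-in for `Str.htmlDecode`, `COLON` is assumed to match `CONST.COLON`, and `getCleanedTagNameSketch` is an illustrative name rather than the function in `PolicyUtils.ts`.

```typescript
// Assumed to match CONST.COLON in the app.
const COLON = ':';

// Hypothetical stand-in for Str.htmlDecode; handles only a few entities.
function decodeTagEntities(value: string): string {
    return value
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&amp;/g, '&'); // decode &amp; last to avoid double-decoding
}

function getCleanedTagNameSketch(tag: string | undefined): string {
    // Unescape "\:" (the multi-level tag separator escape), then decode entities.
    const unescaped = (tag ?? '').split('\\:').join(COLON);
    return decodeTagEntities(unescaped);
}

console.log(getCleanedTagNameSketch('R&amp;D')); // R&D
console.log(getCleanedTagNameSketch('R&D')); // R&D
console.log(getCleanedTagNameSketch('Parent\\: Child')); // Parent: Child
```

An already-decoded tag such as `R&D` passes through unchanged, so encoded and decoded inputs converge on the same display string. A tag whose literal text is an entity (for example `&lt;`) would also be decoded, which is worth keeping in mind when auditing callsites.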
|
@abzokhattab conflicts |
|
@MelvinBot resolve the conflicts |
Resolve conflict in TagCell.tsx by keeping main's EditableCell/TagPickerModal restructuring while preserving the PR's getDecodedTagName decode wrapper.

Co-authored-by: Abdelrahman Khattab <abzokhattab@users.noreply.github.com>
|
Conflicts resolved — merged `main` into this branch. The only conflict was in |
cead22
left a comment
Let's clean up these AI generated comments
};

const tagForDisplay = getTagForDisplay(transactionItem);
// Decode HTML entities so tags stored with encoding (e.g. `uno &amp; dos`) display as `uno & dos`,
- // Decode HTML entities so tags stored with encoding (e.g. `uno &amp; dos`) display as `uno & dos`,
+ // Decode HTML entities so tags stored with encoding are displayed properly (e.g. `uno &amp; dos` display as `uno & dos`)
Done — simplified the comment in bc02597.
 * Cleans up escaping of colons (used to create multi-level tags, e.g. "Parent: Child") in the tag name we receive from the backend,
 * and HTML-decodes the result so tags stored with encoded entities (e.g. `R&amp;D`) render as `R&D`.
- * Cleans up escaping of colons (used to create multi-level tags, e.g. "Parent: Child") in the tag name we receive from the backend,
- * and HTML-decodes the result so tags stored with encoded entities (e.g. `R&amp;D`) render as `R&D`.
+ * Cleans up escaping of colons used to create multi-level tags (e.g. "Parent: Child"),
+ * and HTML-decodes the result so tags stored with encoded entities display correctly (e.g. `R&amp;D`, renders as `R&D`)
 * HTML-decodes a tag name so values stored with different encodings (e.g. `R&amp;D` vs `R&D`)
 * resolve to the same string. Mirrors getDecodedCategoryName in CategoryUtils.
- * HTML-decodes a tag name so values stored with different encodings (e.g. `R&amp;D` vs `R&D`)
- * resolve to the same string. Mirrors getDecodedCategoryName in CategoryUtils.
+ * HTML-decodes a tag name so values stored with different encodings are displayed correctly (e.g. `R&amp;D` vs `R&D`)
+ * Mirrors getDecodedCategoryName in CategoryUtils.
|
@MelvinBot make the requested changes |
Simplify comment wording in TagCell.tsx, PolicyUtils.ts, and TagUtils.ts to be more concise and less verbose.

Co-authored-by: Abdelrahman Khattab <abzokhattab@users.noreply.github.com>
|
done @cead22 |
|
@cead22 looks like this was merged without a test passing. Please add a note explaining why this was done and remove the |
|
🚧 @cead22 has triggered a test Expensify/App build. You can view the workflow run here. |
|
🧪🧪 Use the links below to test this adhoc build on Android, iOS, and Web. Happy testing! 🧪🧪
|
|
✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release. |
|
🚀 Deployed to staging by https://github.com/cead22 in version: 9.3.76-0 🚀
|
|
No help site changes are required for this PR. This is a bug fix that corrects how transactions with HTML-encoded category/tag names (e.g., |
|
🚀 Deployed to production by https://github.com/roryabraham in version: 9.3.77-3 🚀
|
Explanation of Change
When expenses share the same category but are stored with different HTML encodings (e.g., `Uber &amp; car washes` vs `Uber & car washes`), the report view in New Expensify displays them as two separate category groups, even though they look identical. This happens because `groupTransactionsByCategory` in `ReportLayoutUtils.ts` uses the raw category string as the `Map` key for grouping, but displays the HTML-decoded name. Different code paths (MCC auto-categorization vs. distance default category) can produce the same category name with different HTML encoding, particularly for the `&` character.

The fix decodes the category string before using it as the grouping key, so that `&amp;` and `&` resolve to the same group. The same fix is applied to `groupTransactionsByTag` for consistency. This matches how OldDot already handles this in `lib_report.js` with `_.unescape()`.

Fixed Issues
$ https://github.com/Expensify/Expensify/issues/612034
Tests
`&amp;` in the category string and the other has a literal `&`
Offline tests
N/A — this change is purely about how transactions are grouped in memory for display. Offline behavior is unchanged.
QA Steps
Support login into the account and go to the report shown in the linked issue, and confirm the issue is gone
PR Author Checklist
Screenshots/Videos
Android: Native
N/A — no UI changes, logic-only fix in grouping utility
Android: mWeb Chrome
N/A — no UI changes, logic-only fix in grouping utility
iOS: Native
N/A — no UI changes, logic-only fix in grouping utility
iOS: mWeb Safari
N/A — no UI changes, logic-only fix in grouping utility
MacOS: Chrome / Safari
N/A — no UI changes, logic-only fix in grouping utility