Skip to content

New entity grouping endpoint changes#98

Open
joeferraratonic wants to merge 8 commits intomainfrom
entity_linking_changes
Open

New entity grouping endpoint changes#98
joeferraratonic wants to merge 8 commits intomainfrom
entity_linking_changes

Conversation

@joeferraratonic
Copy link
Contributor

No description provided.

for entity_data in group.get("entities", []):
# Convert camelCase fields to snake_case for Replacement
group_entities.append(Replacement(
group_entities.append(SingleDetectionResult(

This comment was marked as outdated.

groups.append(group_entities)

return GroupResponse(groups=groups)
return groups
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The function now returns a raw list instead of a GroupResponse object, which will cause an AttributeError in client code expecting a .groups attribute.
Severity: HIGH

Suggested Fix

Wrap the returned groups list in a GroupResponse object before returning, to match the function's original contract and the expectations of its callers. Change return groups to return GroupResponse(groups=groups).

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: tonic_textual/redact_api.py#L554

Potential issue: The `group_entities` function's return type was changed from a
`GroupResponse` object to a raw `list[list[SingleDetectionResult]]`. However, client
code and associated tests expect the original `GroupResponse` object, which has a
`.groups` attribute. The new implementation returns a `list`, which lacks this
attribute. Any attempt to access `response.groups` will result in an `AttributeError:
'list' object has no attribute 'groups'`, breaking the function's contract with its
callers. This bug will be masked by a more critical `TypeError` that occurs earlier in
the function.

Did we get this right? 👍 / 👎 to inform future reviews.

"pythonEnd": replacement["end"], # Python index
"label": replacement["label"],
"text": replacement["text"],
"score": replacement["score"],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The function replacement_to_grouping_entity will crash with a KeyError if a dict input is provided without a score key, despite the type hint allowing it.
Severity: MEDIUM

Suggested Fix

To prevent a KeyError, use safe dictionary access for the score key. Instead of replacement["score"], use replacement.get("score"). This change would make the function more robust and align with defensive programming practices seen elsewhere in the codebase.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: tonic_textual/generator_utils.py#L389

Potential issue: The function `replacement_to_grouping_entity` accepts a `Union` type
that includes a plain `dict`. However, the code unconditionally accesses the `score` key
using `replacement["score"]`. If a caller, such as `group_entities()` or
`group_synthesis()`, passes a dictionary that lacks a `score` key, the application will
raise a `KeyError`. While the type signature allows for this input, the implementation
does not handle it safely, leading to a runtime crash. Existing tests do not cover this
failure case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant