feat(search): bring query implementation back in-repo #3550

Merged: Mpdreamz merged 3 commits into main from feature/bring-query-implementation-back on Jun 22, 2026

Conversation

@Mpdreamz

Member

Summary

  • Reverses b38a552, which extracted the search query implementation into the Elastic.Internal.Search.Elasticsearch NuGet package that comes from website-search-data
  • Brings all query code back in-repo as Elastic.Documentation.Search.* — the right long-term home, since it's tightly coupled to docs-builder's index structure
  • Introduces SearchQueryConfiguration (a lean four-field record) to replace the DTO from the package
  • Removes the Elastic.Internal.Search.Elasticsearch package reference entirely
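
For readers who have not opened the diff, here is a minimal sketch of what a four-field configuration record of this kind could look like. Only the four concerns (synonyms, diminish terms, ruleset name, semantic flag) come from this PR; the property names and types below are assumptions for illustration, not the actual definition in Configuration/SearchQueryConfiguration.cs.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical shape: property names and types are assumed, not taken from the PR.
public sealed record SearchQueryConfiguration(
    IReadOnlyDictionary<string, string[]> Synonyms,   // term -> equivalent terms
    IReadOnlyCollection<string> DiminishTerms,        // terms whose matches should rank lower
    string? RulesetName,                              // query ruleset to apply, if any
    bool SemanticEnabled);                            // false means lexical-only search
```

Because it is a record, a caller could derive a lexical-only variant with `config with { SemanticEnabled = false }` without mutating the shared instance.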

What moved in

| New file | Purpose |
| --- | --- |
| DefaultSearchService.cs | Hybrid lex+semantic search and autocomplete, implements ISearchService<T> |
| Configuration/SearchQueryConfiguration.cs | Synonyms, diminish terms, ruleset name, semantic flag |
| Query/SearchQueryBuilder.cs | Lexical / semantic / diminish / rule-query construction |
| Query/QueryFieldNames.cs | Centralised ES field-name constants |
| Highlighting/SearchResultProcessor.cs | Hit → SearchResultItem<T>, merges ES highlight fragments |
| Highlighting/StringHighlightExtensions.cs | Post-process <mark> over un-highlighted occurrences |
| Highlighting/HighlightOptions.cs | Options record for highlight post-processing |
| Diagnostics/SearchExplainExtensions.cs | _explain API helpers for relevance tests |
| SourceGenerationContext.cs | AOT-safe JSON source-gen for RuleQueryMatchCriteria |
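
To illustrate what "post-process <mark> over un-highlighted occurrences" can mean in practice, the sketch below wraps occurrences of a token that Elasticsearch left unmarked while leaving existing <mark> spans untouched. This is an assumption about the general technique, not the code in StringHighlightExtensions.cs; the class and method names (`MarkSketch`, `MarkRemaining`) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; not the implementation shipped in this PR.
public static class MarkSketch
{
    public static string MarkRemaining(string text, string token)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrWhiteSpace(token))
            return text;

        // The first alternative consumes spans that are already highlighted so they are
        // skipped; the named group only matches the token outside of those spans.
        var pattern = "<mark>.*?</mark>|(?<hit>" + Regex.Escape(token) + ")";

        return Regex.Replace(
            text,
            pattern,
            m => m.Groups["hit"].Success ? "<mark>" + m.Value + "</mark>" : m.Value,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

Matching the already-highlighted span first is what prevents nested <mark> tags; a production version would likely also respect word boundaries and the settings carried by HighlightOptions.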

Features stripped pending a Contract update

Three features from the website-search-data source require types not yet published in Elastic.Internal.Search.Contract 0.9.2. They are stripped and marked with // NOTE: comments so they can be restored later:

  • ElasticsearchTookMs / IsValidResponse on SearchResponse<T> / AutocompleteResponse<T>
  • Probe-mode SearchAsync branch (requires SearchQueryComponents)

A companion PR in website-search-data will remove Elastic.Internal.Search.Elasticsearch and rework essc to validate instead of publishing synonym sets (since docs-builder owns those resources).

Test plan

  • dotnet build — 0 errors, 0 warnings
  • ./build.sh unit-test — 1761 + 314 tests pass (14 pre-existing scoped-FS failures unrelated to this change)
  • grep -rn "Elastic.Internal.Search.Elasticsearch" src tests tests-integration → zero hits
  • StringHighlightExtensionsTests (91 API infra tests) — all pass
  • Integration tests require a live ES cluster — run in CI

🤖 Generated with Claude Code

Mpdreamz and others added 2 commits June 22, 2026 11:10
…mentation.Search

Reverses commit b38a552 which extracted the search query implementation into
the Elastic.Internal.Search.Elasticsearch NuGet package. The query code now
lives in-repo as part of Elastic.Documentation.Search, which is the right
long-term home — it's tightly coupled to docs-builder's index structure and
rarely consumed outside this repo.

Changes:
- Add SearchQueryConfiguration record (replaces the lean DTO from the package)
- Add DefaultSearchService<TDocument> (hybrid lex+semantic, query builder, autocomplete)
- Add SearchQueryBuilder (lexical/semantic/diminish/rule-query construction)
- Add QueryFieldNames (centralised ES field-name constants)
- Add Highlighting: SearchResultProcessor, StringHighlightExtensions, HighlightOptions
- Add Diagnostics: SearchExplainExtensions (_explain API for relevance tests)
- Add SourceGenerationContext (AOT-safe JSON source-gen for RuleQueryMatchCriteria)
- Remove Elastic.Internal.Search.Elasticsearch package reference and pin

NOTE: three features present in the website-search-data source are stripped because
they depend on types not yet published in Contract 0.9.2:
  - ElasticsearchTookMs / IsValidResponse on SearchResponse/AutocompleteResponse
  - probe-mode SearchAsync (requires SearchQueryComponents)
Restore when Contract publishes a version with these types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Mpdreamz Mpdreamz requested a review from a team as a code owner June 22, 2026 09:13
@Mpdreamz Mpdreamz requested a review from cotti June 22, 2026 09:13
@Mpdreamz Mpdreamz temporarily deployed to integration-tests June 22, 2026 09:13 — with GitHub Actions Inactive
@reakaleek

Member

Is this a true revert? Or are we keeping like attribute names?

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Contributor

Warning

Review limit reached

@Mpdreamz, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 38 minutes and 40 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 134dd622-7280-4e31-8330-9189cf386e8c

📥 Commits

Reviewing files that changed from the base of the PR and between a2ae4e8 and b091278.

📒 Files selected for processing (3)
  • src/services/search/Elastic.Documentation.Search/DefaultSearchService.cs
  • src/services/search/Elastic.Documentation.Search/Diagnostics/SearchExplainExtensions.cs
  • src/services/search/Elastic.Documentation.Search/Highlighting/SearchResultProcessor.cs
📝 Walkthrough

Walkthrough

The PR removes the Elastic.Internal.Search.Elasticsearch NuGet package dependency from both Directory.Packages.props and the project file, and inlines equivalent functionality directly into Elastic.Documentation.Search. New additions include: SearchQueryConfiguration record, QueryFieldNames field-name constants, SearchQueryBuilder for lexical/semantic query composition, HighlightOptions/StringHighlightExtensions/SearchResultProcessor for token highlighting, DefaultSearchService<TDocument> implementing ISearchService<TDocument> with autocomplete and full-text search, SearchExplainExtensions diagnostics, and a SourceGenerationContext for JSON source generation. DI wiring in ServicesExtension, integration tests, and the unit test namespace import are updated to reference the new local types.

Possibly Related PRs

  • elastic/docs-builder#3364: Directly related — both PRs modify the Elastic.Internal.Search.Elasticsearch entry in Directory.Packages.props (version pinning vs. removal).
  • elastic/docs-builder#3462: Directly related — both PRs modify ElasticsearchClientJsonResolver.cs to change the JsonTypeInfoResolver composition for RuleQueryMatchCriteria-related serialization contexts.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 18.42% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: reintegrating the search query implementation from an external package back into the repository. |
| Description check | ✅ Passed | The description clearly explains the reversion of a prior commit, lists all moved files with their purposes, documents temporarily stripped features with restoration guidance, and provides a comprehensive test plan. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/services/search/Elastic.Documentation.Search/Highlighting/SearchResultProcessor.cs (1)

38-42: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Hardcoded field names should use QueryFieldNames constants.

The highlight field names "stripped_body" and "title" are hardcoded here, but QueryFieldNames (from the query contracts layer) centralizes these constants. Using the constants would ensure consistency if field names change.

Suggested fix
-		if (highlights.TryGetValue("stripped_body", out var bodyHighlights) && bodyHighlights.Count > 0)
+		if (highlights.TryGetValue(QueryFieldNames.StrippedBody, out var bodyHighlights) && bodyHighlights.Count > 0)
			highlightedBody = string.Join(". ", bodyHighlights.Select(h => h.Trim(['|', ' ', '.', '-'])));

-		if (highlights.TryGetValue("title", out var titleHighlights) && titleHighlights.Count > 0)
+		if (highlights.TryGetValue(QueryFieldNames.Title, out var titleHighlights) && titleHighlights.Count > 0)
			highlightedTitle = string.Join(". ", titleHighlights.Select(h => h.Trim(['|', ' ', '.', '-'])));
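
As a self-contained illustration of the suggestion above, the following sketch pairs a constants class with the same fragment-merging expression used in the diff. The two constant names and their string values are taken from the diff; everything else (the `HighlightMergeSketch` helper and its signature) is a hypothetical stand-in, and the real QueryFieldNames class may contain more members.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Constant names and values as they appear in the suggested fix.
public static class QueryFieldNames
{
    public const string StrippedBody = "stripped_body";
    public const string Title = "title";
}

// Hypothetical helper that mirrors the Trim + Join logic from the diff.
public static class HighlightMergeSketch
{
    private static readonly char[] TrimChars = { '|', ' ', '.', '-' };

    // Returns the merged fragments for a field, or null when the field has no highlights.
    public static string? Merge(
        IReadOnlyDictionary<string, IReadOnlyCollection<string>> highlights,
        string field) =>
        highlights.TryGetValue(field, out var fragments) && fragments.Count > 0
            ? string.Join(". ", fragments.Select(h => h.Trim(TrimChars)))
            : null;
}
```

With the constants in place, a renamed index field only needs a change in one file, and a typo in a field name becomes a compile error rather than a silently missing highlight.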
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/services/search/Elastic.Documentation.Search/Highlighting/SearchResultProcessor.cs`
around lines 38 - 42, In the SearchResultProcessor class, replace the hardcoded
field names "stripped_body" and "title" in the highlights.TryGetValue calls with
the corresponding constants from QueryFieldNames. This ensures field name
consistency across the codebase and makes future maintenance easier if field
names need to be updated in one centralized location rather than scattered
throughout the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/services/search/Elastic.Documentation.Search/DefaultSearchService.cs`:
- Around lines 184-188: The nested aggregation named "applies_to_type" with inner
terms "types" is being extracted at lines 227-228 but is never defined in the
.Aggregations() builder at lines 184-187. Add the missing aggregation definition
to the .Aggregations() builder using .Add("applies_to_type", ...) with the
appropriate nested Terms aggregation for the types, or alternatively remove the
extraction code at lines 227-228 if the deployment type facet is not currently
needed.

In
`@src/services/search/Elastic.Documentation.Search/Diagnostics/SearchExplainExtensions.cs`:
- Around line 38-41: All await expressions in this library file must include
.ConfigureAwait(false) to prevent ambient context capture and deadlock risks.
Locate the six await sites mentioned (lines 38, 56, 89, 101, 104-105) and add
.ConfigureAwait(false) to each await call, including the
service.Client.SearchAsync await shown in the diff and any other async
operations throughout the file. This applies to all async method calls to ensure
proper library code behavior.
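
For context, the pattern being requested looks roughly like the sketch below. It does not reproduce SearchExplainExtensions; `ExplainSketch`, `ExplainAsync`, and `FetchExplanationAsync` are hypothetical names, and `Task.Delay` stands in for the real Elasticsearch call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical library-style helper; names are illustrative only.
public static class ExplainSketch
{
    // Stand-in for an I/O-bound call such as a search or _explain request.
    private static async Task<string> FetchExplanationAsync(string query, CancellationToken ct)
    {
        await Task.Delay(10, ct).ConfigureAwait(false);
        return "explain:" + query;
    }

    public static async Task<string> ExplainAsync(string query, CancellationToken ct = default)
    {
        // ConfigureAwait(false): the continuation does not need the caller's
        // synchronization context, so it is not captured.
        var raw = await FetchExplanationAsync(query, ct).ConfigureAwait(false);
        return raw.ToUpperInvariant();
    }
}
```

The benefit shows up when a caller blocks on the task from a context-bound thread (for example a UI thread): without the captured context, the continuation can complete on a thread-pool thread instead of waiting for the blocked one.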
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 408c7acf-537e-41b7-a2ab-0c5125e8cdcf

📥 Commits

Reviewing files that changed from the base of the PR and between 654fcce and a2ae4e8.

📒 Files selected for processing (17)
  • Directory.Packages.props
  • src/services/search/Elastic.Documentation.Search/Common/ElasticsearchClientJsonResolver.cs
  • src/services/search/Elastic.Documentation.Search/Configuration/SearchQueryConfiguration.cs
  • src/services/search/Elastic.Documentation.Search/DefaultSearchService.cs
  • src/services/search/Elastic.Documentation.Search/Diagnostics/SearchExplainExtensions.cs
  • src/services/search/Elastic.Documentation.Search/Elastic.Documentation.Search.csproj
  • src/services/search/Elastic.Documentation.Search/Highlighting/HighlightOptions.cs
  • src/services/search/Elastic.Documentation.Search/Highlighting/SearchResultProcessor.cs
  • src/services/search/Elastic.Documentation.Search/Highlighting/StringHighlightExtensions.cs
  • src/services/search/Elastic.Documentation.Search/NavigationSearchService.cs
  • src/services/search/Elastic.Documentation.Search/Query/QueryFieldNames.cs
  • src/services/search/Elastic.Documentation.Search/Query/SearchQueryBuilder.cs
  • src/services/search/Elastic.Documentation.Search/ServicesExtension.cs
  • src/services/search/Elastic.Documentation.Search/SourceGenerationContext.cs
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests-integration/Search.IntegrationTests/SearchRelevanceTests.cs
  • tests/Elastic.Documentation.Api.Infrastructure.Tests/Adapters/Search/StringHighlightExtensionsTests.cs
💤 Files with no reviewable changes (2)
  • src/services/search/Elastic.Documentation.Search/Elastic.Documentation.Search.csproj
  • Directory.Packages.props

@Mpdreamz Mpdreamz added the chore label Jun 22, 2026
- Remove dead DeploymentType extraction in DefaultSearchService — the
  'applies_to_type' nested aggregation was never defined in the query
  builder, so ExtractNestedTermsAggregation always returned []; the
  default on SearchAggregations covers this
- Add .ConfigureAwait(false) to all six await sites in
  SearchExplainExtensions (library code requirement)
- Replace hardcoded "stripped_body" / "title" strings in
  SearchResultProcessor with QueryFieldNames.StrippedBody / .Title

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Mpdreamz Mpdreamz temporarily deployed to integration-tests June 22, 2026 09:34 — with GitHub Actions Inactive
@Mpdreamz

Member Author

@reakaleek this does not touch the attributes.

@Mpdreamz Mpdreamz merged commit c372f06 into main Jun 22, 2026
25 checks passed
@Mpdreamz Mpdreamz deleted the feature/bring-query-implementation-back branch June 22, 2026 09:59