Skip to content

Conversation

@brandur
Copy link
Contributor

@brandur brandur commented Mar 7, 2025

This one aims to give us a workable resolution to one of our most common
problems with sqlc. Namely, that although it allows substitution for
parameters that work with a prepared query, it can't replace arbitrary
parts of a SQL query, leading to operations that aren't possible so that
we either don't do them or end up degrading to raw SQL that's only
checked at runtime.

Here, we add a sqlctemplate package that's designed to be run from
inside custom implementations of sqlc's DBTX interface so that it it
runs after sqlc's generated code but before the query goes to Postgres.

In sqlc code, templates look like this:

-- name: JobCountByState :one
SELECT count(*)
FROM /* TEMPLATE: schema */river_job
WHERE state = @state;

The template replacement is modeled as a comment so that it doesn't
interfere with with sqlc's parsing of SQL syntax. The above is valid SQL
with or without the template, but with it, sqlctemplate can add an
arbitrary schema name to the queried table.

It also supports a form of syntax where a value is required for SQL to
be valid. For example, WHERE and ORDER BY clauses both require a
value for them to be valid. Here, a stand in value is provide between
template tags. It's processed by sqlc's parser, but then replace by the
template engine before the SQL is executed:

-- name: JobList :many
SELECT *
FROM river_job
WHERE /* TEMPLATE_BEGIN: where_clause */ 1 /* TEMPLATE_END */
ORDER BY /* TEMPLATE_BEGIN: order_by_clause */ id /* TEMPLATE_END */
LIMIT @max::int;

Template values are injected via context (don't love this, but there's
no other way in getting information down to a layer below DBTX):

ctx = sqlctemplate.WithReplacements(ctx, map[string]sqlctemplate.Replacement{
    "order_by_clause": {Value: params.OrderByClause},
    "where_clause":    {Value: params.WhereClause},
}, params.NamedArgs)

jobs, err := dbsqlc.New().JobList(ctx, e.dbtx, params.Max)

The template engine is written to be root out as many error as possible
by noticing if a template replacement is passed that doesn't have an
equivalent template in SQL, or if a template in SQL is present for which
there's no replacement.

Named args are support in templates similar to how sqlc supports them.
This allows pgx's prepared statement cache to continue to operate as it
did before, thereby keeping everything fast.

Lastly, I should note that templates are meant as a utility of last
resort. All effort should be made to resolve problems via mainstream
sqlc, and only bring in templates when there's no other option.

@brandur brandur marked this pull request as draft March 7, 2025 04:05
@brandur brandur force-pushed the brandur-sqlc-templates-basic branch 3 times, most recently from 6325450 to ae13b6c Compare March 7, 2025 04:34
@brandur
Copy link
Contributor Author

brandur commented Mar 7, 2025

@bgentry Alright, so this one's the same as #792 except that I'e reverted the changes to job listing + use of injected schema, and added a test suite for the templating package.

If you're okay with it I'd propose:

  1. Reviewing and getting this merged.
  2. Cut release of River for other packages to use an updated rivershared.
  3. In parallel:
    1. Start updating the concurrency query.
    2. I can go back and update JobList based on Add sqlc templates for arbitrary text substitution #792. An explicit schema will be a little more work (all queries would need to be updated), but we could look into that from there.

@brandur brandur requested a review from bgentry March 7, 2025 04:39
@brandur brandur marked this pull request as ready for review March 7, 2025 04:40
@brandur brandur force-pushed the brandur-sqlc-templates-basic branch from ae13b6c to f335397 Compare March 7, 2025 04:44
This one aims to give us a workable resolution to one of our most common
problems with sqlc. Namely, that although it allows substitution for
parameters that work with a prepared query, it can't replace arbitrary
parts of a SQL query, leading to operations that aren't possible so that
we either don't do them or end up degrading to raw SQL that's only
checked at runtime.

Here, we add a `sqlctemplate` package that's designed to be run from
inside custom implementations of sqlc's `DBTX` interface so that it it
runs after sqlc's generated code but before the query goes to Postgres.

In sqlc code, templates look like this:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The template replacement is modeled as a comment so that it doesn't
interfere with with sqlc's parsing of SQL syntax. The above is valid SQL
with or without the template, but with it, `sqlctemplate` can add an
arbitrary schema name to the queried table.

It also supports a form of syntax where a value is required for SQL to
be valid. For example, `WHERE` and `ORDER BY` clauses both require a
value for them to be valid. Here, a stand in value is provide between
template tags. It's processed by sqlc's parser, but then replace by the
template engine before the SQL is executed:

    -- name: JobList :many
    SELECT *
    FROM river_job
    WHERE /* TEMPLATE_BEGIN: where_clause */ 1 /* TEMPLATE_END */
    ORDER BY /* TEMPLATE_BEGIN: order_by_clause */ id /* TEMPLATE_END */
    LIMIT @max::int;

Template values are injected via context (don't love this, but there's
no other way in getting information down to a layer below `DBTX`):

    ctx = sqlctemplate.WithReplacements(ctx, map[string]sqlctemplate.Replacement{
        "order_by_clause": {Value: params.OrderByClause},
        "where_clause":    {Value: params.WhereClause},
    }, params.NamedArgs)

    jobs, err := dbsqlc.New().JobList(ctx, e.dbtx, params.Max)

The template engine is written to be root out as many error as possible
by noticing if a template replacement is passed that doesn't have an
equivalent template in SQL, or if a template in SQL is present for which
there's no replacement.

Named args are support in templates similar to how sqlc supports them.
This allows pgx's prepared statement cache to continue to operate as it
did before, thereby keeping everything fast.

Lastly, I should note that templates are meant as a utility of last
resort. All effort should be made to resolve problems via mainstream
sqlc, and only bring in templates when there's no other option.
@brandur brandur force-pushed the brandur-sqlc-templates-basic branch from f335397 to 433cd72 Compare March 7, 2025 04:46
Copy link
Contributor

@bgentry bgentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! :shipit:

@brandur
Copy link
Contributor Author

brandur commented Mar 8, 2025

Excellent, thanks. Let me get this in and start work on formally moving JobList over.

@brandur brandur merged commit ab892c3 into master Mar 8, 2025
10 checks passed
@brandur brandur deleted the brandur-sqlc-templates-basic branch March 8, 2025 04:34
brandur added a commit that referenced this pull request Mar 8, 2025
Here, reimplement the job list API's internals using sqlc using the new
sqlc template system introduced in #794. Admittedly this one isn't quite
as big of a win as we expect the templating to be in some cases, but it
gets rid of a case of SQL code in a raw string, and lets us drop the
`JobListFields` driver interface function.

We don't add tests because the APIs are relatively well tested already,
but make some small modifications to the driver `JobList` test cases so
they're a little more thorough.
brandur added a commit that referenced this pull request Mar 8, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.
brandur added a commit that referenced this pull request Mar 8, 2025
Follows up the addition of `sqlctemplate` in #794. I noticed while
adding functionality in with templates that it was quite easy to (1) add
a `/* TEMPLATE` tag to `river_job.sql`, (2) put in context parameters to
a driver, but then (3) forget to run `make generate`. The context
parameters are injected, but `sqlctemplate` no ops with a fast short
circuit because there's no `/* TEMPLATE` tag present in the generated
Go code that the driver is executing. This leads to confusion.

Here, add a few more error conditions:

* If a context container is present without any `/* TEMPLATE` tags,
  error.

* If any `/* TEMPLATE` tags are present without a context container,
  error.

This makes dumb bugs easier to catch because we get an explicit error
instead of them failing silently. Tests are updated to check for them.
Comment on lines +32 to +39
// Be careful not to place a template on a line by itself because sqlc will
// strip any lines that start with a comment. For example, this does NOT work:
//
// -- name: JobList :many
// SELECT *
// FROM river_job
// /* TEMPLATE_BEGIN: where_clause */
// LIMIT @max::int;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually true @brandur? I have some comments in another branch that I would actually love to be stripped out of the final query, but they're not, regardless of whether they're on a -- or /* */ comment line all by themselves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it seems to only happen when the line doesn't have any leading whitespace, i.e. solely the comment and nothing else (similar to comment lines that denote a query). But if you've indented even a single space, the comment is maintained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, sounds about right. TBH, I find the behavior kind of confusing, and especially given use of prepared statements with pgx, I doubt the comment stripping is really doing that much perf wise and it probably would've been better if they kept everything in.

Here's the sqlc issue on the subject FWIW. No activity in a while:

sqlc-dev/sqlc#2207

brandur added a commit that referenced this pull request Mar 9, 2025
Follows up the addition of `sqlctemplate` in #794. I noticed while
adding functionality in with templates that it was quite easy to (1) add
a `/* TEMPLATE` tag to `river_job.sql`, (2) put in context parameters to
a driver, but then (3) forget to run `make generate`. The context
parameters are injected, but `sqlctemplate` no ops with a fast short
circuit because there's no `/* TEMPLATE` tag present in the generated
Go code that the driver is executing. This leads to confusion.

Here, add a few more error conditions:

* If a context container is present without any `/* TEMPLATE` tags,
  error.

* If any `/* TEMPLATE` tags are present without a context container,
  error.

This makes dumb bugs easier to catch because we get an explicit error
instead of them failing silently. Tests are updated to check for them.
brandur added a commit that referenced this pull request Mar 9, 2025
…799)

Follows up the addition of `sqlctemplate` in #794. I noticed while
adding functionality in with templates that it was quite easy to (1) add
a `/* TEMPLATE` tag to `river_job.sql`, (2) put in context parameters to
a driver, but then (3) forget to run `make generate`. The context
parameters are injected, but `sqlctemplate` no ops with a fast short
circuit because there's no `/* TEMPLATE` tag present in the generated
Go code that the driver is executing. This leads to confusion.

Here, add a few more error conditions:

* If a context container is present without any `/* TEMPLATE` tags,
  error.

* If any `/* TEMPLATE` tags are present without a context container,
  error.

This makes dumb bugs easier to catch because we get an explicit error
instead of them failing silently. Tests are updated to check for them.
brandur added a commit that referenced this pull request Mar 9, 2025
Here, reimplement the job list API's internals using sqlc using the new
sqlc template system introduced in #794. Admittedly this one isn't quite
as big of a win as we expect the templating to be in some cases, but it
gets rid of a case of SQL code in a raw string, and lets us drop the
`JobListFields` driver interface function.

We don't add tests because the APIs are relatively well tested already,
but make some small modifications to the driver `JobList` test cases so
they're a little more thorough.
brandur added a commit that referenced this pull request Mar 9, 2025
Here, reimplement the job list API's internals using sqlc using the new
sqlc template system introduced in #794. Admittedly this one isn't quite
as big of a win as we expect the templating to be in some cases, but it
gets rid of a case of SQL code in a raw string, and lets us drop the
`JobListFields` driver interface function.

We don't add tests because the APIs are relatively well tested already,
but make some small modifications to the driver `JobList` test cases so
they're a little more thorough.
brandur added a commit that referenced this pull request Mar 9, 2025
A few more small follow ups for #794. I started renaming things late in
the process while refactoring it, and a few of the renames didn't end up
getting bubbled all the way through.

Try to maintain roughly this vernacular:

* **Template:** A template itself like `/* TEMPLATE: ...` along with its
  assigned template name.

* **Replacement:** A replacement value for a template, possibly
  including some metadata like `Stable`.

Get rid of the use of `tmpl` in favor of `replacement` for locals.
brandur added a commit that referenced this pull request Mar 9, 2025
A few more small follow ups for #794. I started renaming things late in
the process while refactoring it, and a few of the renames didn't end up
getting bubbled all the way through.

Try to maintain roughly this vernacular:

* **Template:** A template itself like `/* TEMPLATE: ...` along with its
  assigned template name.

* **Replacement:** A replacement value for a template, possibly
  including some metadata like `Stable`.

Get rid of the use of `tmpl` in favor of `replacement` for locals.
brandur added a commit that referenced this pull request Mar 9, 2025
A few more small follow ups for #794. I started renaming things late in
the process while refactoring it, and a few of the renames didn't end up
getting bubbled all the way through.

Try to maintain roughly this vernacular:

* **Template:** A template itself like `/* TEMPLATE: ...` along with its
  assigned template name.

* **Replacement:** A replacement value for a template, possibly
  including some metadata like `Stable`.

Get rid of the use of `tmpl` in favor of `replacement` for locals.
brandur added a commit that referenced this pull request Mar 9, 2025
This one follows up #794 again. That change included a template cache,
with the idea being that we could reuse a rendered sqlc query when all
input values were expected to be stable. For example, when replacing
something like a schema name, we'd presumably always be replacing the
template value with the same schema over and over again, so it'd be
better to save on the work.

But that caching system hadn't adequately been thought through, and
wouldn't really work. I realized when I was putting #798 (explicit
schemas) together that if you injected two schema values from two
different clients then you'd end up using the same cache value, which is
wrong.

Here, we tool out the cache layer a little more so that it considers
input values and named args, which all must be consistent for a cached
value to be returned. We also add tests to make sure of it.

Building a cache key unfortunately has to rely on concatenating some
strings together and presorting map keys for stability, but even with
the extra work involved, a cache hit still comes out to be quite a bit
faster than a miss (~2.5x faster), so it seems worth doing:

    $ go test ./rivershared/sqlctemplate -bench=.
    goos: darwin
    goarch: arm64
    pkg: github.com/riverqueue/river/rivershared/sqlctemplate
    cpu: Apple M4
    BenchmarkReplacer/WithCache-10           1626988               735.7 ns/op
    BenchmarkReplacer/WithoutCache-10         695517              1710 ns/op
    PASS
    ok      github.com/riverqueue/river/rivershared/sqlctemplate    3.419s
brandur added a commit that referenced this pull request Mar 9, 2025
This one follows up #794 again. That change included a template cache,
with the idea being that we could reuse a rendered sqlc query when all
input values were expected to be stable. For example, when replacing
something like a schema name, we'd presumably always be replacing the
template value with the same schema over and over again, so it'd be
better to save on the work.

But that caching system hadn't adequately been thought through, and
wouldn't really work. I realized when I was putting #798 (explicit
schemas) together that if you injected two schema values from two
different clients then you'd end up using the same cache value, which is
wrong.

Here, we tool out the cache layer a little more so that it considers
input values and named args, which all must be consistent for a cached
value to be returned. We also add tests to make sure of it.

Building a cache key unfortunately has to rely on concatenating some
strings together and presorting map keys for stability, but even with
the extra work involved, a cache hit still comes out to be quite a bit
faster than a miss (~2.5x faster), so it seems worth doing:

    $ go test ./rivershared/sqlctemplate -bench=.
    goos: darwin
    goarch: arm64
    pkg: github.com/riverqueue/river/rivershared/sqlctemplate
    cpu: Apple M4
    BenchmarkReplacer/WithCache-10           1626988               735.7 ns/op
    BenchmarkReplacer/WithoutCache-10         695517              1710 ns/op
    PASS
    ok      github.com/riverqueue/river/rivershared/sqlctemplate    3.419s
brandur added a commit that referenced this pull request Mar 9, 2025
This one follows up #794 again. That change included a template cache,
with the idea being that we could reuse a rendered sqlc query when all
input values were expected to be stable. For example, when replacing
something like a schema name, we'd presumably always be replacing the
template value with the same schema over and over again, so it'd be
better to save on the work.

But that caching system hadn't adequately been thought through, and
wouldn't really work. I realized when I was putting #798 (explicit
schemas) together that if you injected two schema values from two
different clients then you'd end up using the same cache value, which is
wrong.

Here, we tool out the cache layer a little more so that it considers
input values and named args, which all must be consistent for a cached
value to be returned. We also add tests to make sure of it.

Building a cache key unfortunately has to rely on concatenating some
strings together and presorting map keys for stability, but even with
the extra work involved, a cache hit still comes out to be quite a bit
faster than a miss (~2.5x faster), so it seems worth doing:

    $ go test ./rivershared/sqlctemplate -bench=.
    goos: darwin
    goarch: arm64
    pkg: github.com/riverqueue/river/rivershared/sqlctemplate
    cpu: Apple M4
    BenchmarkReplacer/WithCache-10           1626988               735.7 ns/op
    BenchmarkReplacer/WithoutCache-10         695517              1710 ns/op
    PASS
    ok      github.com/riverqueue/river/rivershared/sqlctemplate    3.419s
brandur added a commit that referenced this pull request Mar 9, 2025
This one follows up #794 again. That change included a template cache,
with the idea being that we could reuse a rendered sqlc query when all
input values were expected to be stable. For example, when replacing
something like a schema name, we'd presumably always be replacing the
template value with the same schema over and over again, so it'd be
better to save on the work.

But that caching system hadn't adequately been thought through, and
wouldn't really work. I realized when I was putting #798 (explicit
schemas) together that if you injected two schema values from two
different clients then you'd end up using the same cache value, which is
wrong.

Here, we tool out the cache layer a little more so that it considers
input values and named args, which all must be consistent for a cached
value to be returned. We also add tests to make sure of it.

Building a cache key unfortunately has to rely on concatenating some
strings together and presorting map keys for stability, but even with
the extra work involved, a cache hit still comes out to be quite a bit
faster than a miss (~2.5x faster), so it seems worth doing:

    $ go test ./rivershared/sqlctemplate -bench=.
    goos: darwin
    goarch: arm64
    pkg: github.com/riverqueue/river/rivershared/sqlctemplate
    cpu: Apple M4
    BenchmarkReplacer/WithCache-10           1626988               735.7 ns/op
    BenchmarkReplacer/WithoutCache-10         695517              1710 ns/op
    PASS
    ok      github.com/riverqueue/river/rivershared/sqlctemplate    3.419s
brandur added a commit that referenced this pull request Mar 9, 2025
This one follows up #794 again. That change included a template cache,
with the idea being that we could reuse a rendered sqlc query when all
input values were expected to be stable. For example, when replacing
something like a schema name, we'd presumably always be replacing the
template value with the same schema over and over again, so it'd be
better to save on the work.

But that caching system hadn't adequately been thought through, and
wouldn't really work. I realized when I was putting #798 (explicit
schemas) together that if you injected two schema values from two
different clients then you'd end up using the same cache value, which is
wrong.

Here, we tool out the cache layer a little more so that it considers
input values and named args, which all must be consistent for a cached
value to be returned. We also add tests to make sure of it.

Building a cache key unfortunately has to rely on concatenating some
strings together and presorting map keys for stability, but even with
the extra work involved, a cache hit still comes out to be quite a bit
faster than a miss (~2.5x faster), so it seems worth doing:

    $ go test ./rivershared/sqlctemplate -bench=.
    goos: darwin
    goarch: arm64
    pkg: github.com/riverqueue/river/rivershared/sqlctemplate
    cpu: Apple M4
    BenchmarkReplacer/WithCache-10           1626988               735.7 ns/op
    BenchmarkReplacer/WithoutCache-10         695517              1710 ns/op
    PASS
    ok      github.com/riverqueue/river/rivershared/sqlctemplate    3.419s
brandur added a commit that referenced this pull request Apr 13, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.
brandur added a commit that referenced this pull request Apr 13, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.
brandur added a commit that referenced this pull request Apr 14, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 14, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 14, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 14, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 14, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 14, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 15, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 15, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
brandur added a commit that referenced this pull request Apr 15, 2025
Here, follow up #794 to proof out the injection of an explicit schema
into SQL queries so that putting River in a custom schema doesn't
require the use of `search_path`. We use the newly introduced sqlc
template package to prefix table names like so:

    -- name: JobCountByState :one
    SELECT count(*)
    FROM /* TEMPLATE: schema */river_job
    WHERE state = @State;

The main advantage of this approach compared to `search_path` is that
connections no longer require that their `search_path` be set with an
explicit value to work. `search_path` can be set or misset, but because
table names are prefixed with the right schema name already, queries
still go through. This is especially useful in the context of PgBouncer,
where a valid `search_path` setting can't be guaranteed.

Notably, this change doesn't bring in enough new testing to prove that
River really works with explicit schema injection, so for the time being
this setting remains completely internal. What I'd like to try and do is
get some basic infrastructure like this in place first, then prove it
out by starting to move the test suite over to schema-specific tests. By
virtue of doing that we'd be putting the entire load of the River test
suite on the new paths, which should give us high confidence that it's
all working as intended.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants