Merged
73 changes: 70 additions & 3 deletions .github/workflows/functional.yml
@@ -12,7 +12,7 @@ on:
jobs:
functional:
runs-on: ubuntu-latest
- timeout-minutes: 20
+ timeout-minutes: 30

steps:
- uses: actions/checkout@v4
@@ -77,17 +77,84 @@ jobs:
- name: Run failover tests
run: bash tests/functional/test-failover.sh

# ---- PostgreSQL functional tests ----
- name: Start PostgreSQL containers
working-directory: tests/functional
run: |
docker compose up -d pgprimary
echo "Waiting for pgprimary to be healthy..."
timeout 120 bash -c '
while true; do
STATUS=$(docker compose ps pgprimary --format json 2>/dev/null | python3 -c "
import json, sys
for line in sys.stdin:
svc = json.loads(line)
if \"healthy\" in svc.get(\"Status\",\"\").lower():
print(\"healthy\")
sys.exit(0)
print(\"waiting\")
" 2>/dev/null || echo "waiting")
if [ "$STATUS" = "healthy" ]; then
echo "pgprimary healthy"
exit 0
fi
sleep 2
done
' || { echo "Timeout waiting for pgprimary"; docker compose logs pgprimary --tail=30; exit 1; }
docker compose up -d pgstandby1
echo "Waiting for pgstandby1 to be healthy..."
timeout 120 bash -c '
while true; do
STATUS=$(docker compose ps pgstandby1 --format json 2>/dev/null | python3 -c "
import json, sys
for line in sys.stdin:
svc = json.loads(line)
if \"healthy\" in svc.get(\"Status\",\"\").lower():
print(\"healthy\")
sys.exit(0)
print(\"waiting\")
" 2>/dev/null || echo "waiting")
if [ "$STATUS" = "healthy" ]; then
echo "pgstandby1 healthy"
exit 0
fi
sleep 2
done
' || { echo "Timeout waiting for pgstandby1"; docker compose logs pgstandby1 --tail=30; exit 1; }

- name: Start PostgreSQL orchestrator
working-directory: tests/functional
run: |
docker compose up -d orchestrator-pg
echo "Waiting for PostgreSQL orchestrator to be ready..."
timeout 120 bash -c '
while true; do
if curl -sf http://localhost:3098/api/clusters > /dev/null 2>&1; then
echo "PostgreSQL orchestrator ready"
exit 0
fi
sleep 2
done
' || { echo "PostgreSQL orchestrator not ready"; docker compose logs orchestrator-pg --tail=50; exit 1; }

- name: Run PostgreSQL tests
run: bash tests/functional/test-postgresql.sh

- name: Collect orchestrator logs
if: always()
working-directory: tests/functional
- run: docker compose logs orchestrator > /tmp/orchestrator-test.log 2>&1 || true
+ run: |
+ docker compose logs orchestrator > /tmp/orchestrator-test.log 2>&1 || true
+ docker compose logs orchestrator-pg > /tmp/orchestrator-pg-test.log 2>&1 || true

- name: Upload orchestrator logs
if: always()
uses: actions/upload-artifact@v4
with:
name: orchestrator-test-logs
- path: /tmp/orchestrator-test.log
+ path: |
+ /tmp/orchestrator-test.log
+ /tmp/orchestrator-pg-test.log

- name: Collect all docker logs on failure
if: failure()
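
The pgprimary and pgstandby1 wait loops in the workflow above are identical apart from the service name, and their substring test (`"healthy" in Status`) would also match a status of `unhealthy`. A minimal sketch of a shared helper follows, assuming it lives in a script the workflow steps can source. `wait_until` and `is_healthy` are hypothetical names and are not part of this PR.

```shell
# Hypothetical helper (not in the PR): retry a command until it succeeds
# or the timeout elapses.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
wait_until() {
  local timeout_s="$1"
  local interval_s="$2"
  shift 2
  local deadline
  deadline=$(( $(date +%s) + timeout_s ))
  until "$@"; do
    # Give up once the deadline has passed.
    [ "$(date +%s)" -lt "$deadline" ] || return 1
    sleep "$interval_s"
  done
}

# Hypothetical check (not in the PR): succeeds only when the container's
# health status is exactly "healthy", so "unhealthy" is not accepted.
# Must run from the directory that holds docker-compose.yml.
is_healthy() {
  local cid
  cid="$(docker compose ps -q "$1")" || return 1
  [ -n "$cid" ] || return 1
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$cid" 2>/dev/null)" = "healthy" ]
}
```

With this in place, each inline loop could shrink to something like `wait_until 120 2 is_healthy pgprimary || { docker compose logs pgprimary --tail=30; exit 1; }`.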
75 changes: 75 additions & 0 deletions tests/functional/docker-compose.yml
@@ -92,6 +92,51 @@ services:
aliases:
- proxysql

pgprimary:
image: postgres:17
hostname: pgprimary
environment:
POSTGRES_PASSWORD: testpass
POSTGRES_USER: postgres
volumes:
- ./postgres/init-primary.sh:/docker-entrypoint-initdb.d/init.sh
ports:
- "15432:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 3s
retries: 30
networks:
orchnet:
aliases:
- pgprimary

pgstandby1:
image: postgres:17
hostname: pgstandby1
environment:
POSTGRES_PASSWORD: testpass
PGUSER: postgres
PGPASSWORD: repl_pass
Copilot AI Apr 3, 2026

In the pgstandby1 service the libpq environment variables are inconsistent (PGUSER: postgres but PGPASSWORD: repl_pass). This can break readiness checks / any libpq client defaults, and it makes the intended authentication unclear. Set PGUSER/PGPASSWORD to the same account you expect to use (e.g., PGUSER: repl with PGPASSWORD: repl_pass, or keep postgres with testpass).

Suggested change
- PGPASSWORD: repl_pass
+ PGPASSWORD: testpass

Copilot uses AI. Check for mistakes.
volumes:
- ./postgres/init-standby.sh:/init-standby.sh
entrypoint: ["/bin/bash", "/init-standby.sh"]
depends_on:
pgprimary:
condition: service_healthy
ports:
- "15433:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 3s
retries: 30
networks:
orchnet:
aliases:
- pgstandby1

Comment on lines +115 to +139
Copilot AI Apr 3, 2026

The PR description / linked issue mentions a primary plus two streaming replicas, but this compose file defines only one standby (pgstandby1). If the intent is to close #68 as written, add the second standby service (and extend the tests accordingly); otherwise, update the PR/issue linkage so expectations match what’s implemented.

orchestrator:
image: ubuntu:24.04
hostname: orchestrator
@@ -122,6 +167,36 @@ services:
aliases:
- orchestrator

orchestrator-pg:
image: ubuntu:24.04
hostname: orchestrator-pg
volumes:
- ../../bin/orchestrator:/usr/local/bin/orchestrator:ro
- ../../resources:/orchestrator/resources:ro
- ./orchestrator-pg-test.conf.json:/orchestrator/orchestrator.conf.json:ro
command: >
bash -c "
apt-get update -qq && apt-get install -y -qq curl sqlite3 > /dev/null 2>&1 &&
medium

Running apt-get update and apt-get install every time the container starts is inefficient and makes the tests dependent on external network connectivity and package repository availability. It is recommended to use a custom Dockerfile or a pre-built image that already includes curl and sqlite3 to speed up the test execution and improve reliability.

rm -f /tmp/orchestrator-pg-test.sqlite3 &&
cd /orchestrator &&
orchestrator -config orchestrator.conf.json http
"
ports:
- "3098:3098"
depends_on:
pgprimary:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:3098/api/clusters"]
interval: 5s
timeout: 3s
retries: 60
start_period: 15s
networks:
orchnet:
aliases:
- orchestrator-pg

networks:
orchnet:
driver: bridge
19 changes: 19 additions & 0 deletions tests/functional/orchestrator-pg-test.conf.json
@@ -0,0 +1,19 @@
{
"Debug": true,
"ListenAddress": ":3098",
"ProviderType": "postgresql",
"PostgreSQLTopologyUser": "orchestrator",
"PostgreSQLTopologyPassword": "orch_pass",
"PostgreSQLSSLMode": "disable",
"MySQLOrchestratorHost": "",
"MySQLOrchestratorPort": 0,
"BackendDB": "sqlite",
"SQLite3DataFile": "/tmp/orchestrator-pg-test.sqlite3",
"DiscoverByShowSlaveHosts": false,
"InstancePollSeconds": 5,
"RecoveryPeriodBlockSeconds": 10,
"RecoverMasterClusterFilters": [".*"],
"RecoverIntermediateMasterClusterFilters": [".*"],
"AutoPseudoGTID": false,
"PrometheusEnabled": true
}
36 changes: 36 additions & 0 deletions tests/functional/postgres/init-primary.sh
@@ -0,0 +1,36 @@
#!/bin/bash
# Initialize PostgreSQL primary for functional tests.
# This script runs inside the postgres Docker entrypoint (initdb phase).

set -e

# ---- WAL / replication settings ----
cat >> "$PGDATA/postgresql.conf" <<EOF
wal_level = replica
max_wal_senders = 10
wal_keep_size = 256MB
hot_standby = on
listen_addresses = '*'
EOF

# ---- pg_hba: allow replication and orchestrator connections ----
cat >> "$PGDATA/pg_hba.conf" <<EOF
# Allow replication connections from any host in the Docker network
host replication repl all md5
# Allow orchestrator monitoring user
host all orchestrator all md5
EOF

# ---- Create users ----
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname postgres <<-EOSQL
-- Replication user
CREATE ROLE repl WITH REPLICATION LOGIN PASSWORD 'repl_pass';

-- Orchestrator monitoring user
CREATE ROLE orchestrator WITH LOGIN PASSWORD 'orch_pass';
GRANT pg_monitor TO orchestrator;
-- Allow orchestrator to promote standbys and reload config
ALTER ROLE orchestrator SUPERUSER;
EOSQL
Comment on lines +29 to +34
Copilot AI Apr 3, 2026

init-primary.sh grants orchestrator SUPERUSER and pg_hba.conf allows password auth from all addresses. Combined with the host port publishing in docker-compose.yml, this creates a broad superuser entry point during CI runs. If possible, reduce privileges to what orchestrator needs for tests (e.g., pg_monitor + pg_signal_backend for pg_promote) and/or restrict pg_hba.conf to the Docker network CIDR / remove host port publishing for Postgres services.


echo "Primary initialization complete."
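
The SUPERUSER grant flagged in the review comment above could in principle be narrowed. The sketch below assumes orchestrator only needs monitoring, standby promotion, and config reload. The PR does not confirm that, and operations such as `ALTER SYSTEM` would need further grants. `least_privilege_sql` is a hypothetical helper name.

```shell
# Hypothetical alternative to `ALTER ROLE orchestrator SUPERUSER` (not in the PR).
# Prints the SQL so it can be reviewed or piped into psql.
least_privilege_sql() {
  cat <<'EOSQL'
CREATE ROLE orchestrator WITH LOGIN PASSWORD 'orch_pass';
GRANT pg_monitor TO orchestrator;
-- pg_promote() and pg_reload_conf() are superuser-only by default,
-- but EXECUTE on them can be granted to other roles.
GRANT EXECUTE ON FUNCTION pg_promote(boolean, integer) TO orchestrator;
GRANT EXECUTE ON FUNCTION pg_reload_conf() TO orchestrator;
EOSQL
}

# Possible use inside init-primary.sh (commented out so the sketch has no side effects):
# least_privilege_sql | psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname postgres
```

Whether the functional tests still pass with these grants alone would need to be checked against what orchestrator actually executes on the PostgreSQL nodes.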
37 changes: 37 additions & 0 deletions tests/functional/postgres/init-standby.sh
@@ -0,0 +1,37 @@
#!/bin/bash
# Initialize a PostgreSQL standby via pg_basebackup from the primary.
# This script replaces the default entrypoint for standby containers.

set -e

PGDATA="/var/lib/postgresql/data"
PRIMARY_HOST="pgprimary"
PRIMARY_PORT=5432

echo "Waiting for primary to accept connections..."
until pg_isready -h "$PRIMARY_HOST" -p "$PRIMARY_PORT" -U postgres; do
Copilot AI Apr 3, 2026

init-standby.sh waits for the primary using pg_isready -U postgres, but this container’s environment sets PGPASSWORD to the replication password. If the primary requires password auth, this readiness loop may never succeed. Use the replication user (or ensure PGPASSWORD matches the user used in pg_isready).

Suggested change
- until pg_isready -h "$PRIMARY_HOST" -p "$PRIMARY_PORT" -U postgres; do
+ until pg_isready -h "$PRIMARY_HOST" -p "$PRIMARY_PORT" -U repl; do
sleep 1
done

echo "Running pg_basebackup from $PRIMARY_HOST..."
rm -rf "$PGDATA"/*
pg_basebackup \
-h "$PRIMARY_HOST" \
-p "$PRIMARY_PORT" \
-U repl \
-D "$PGDATA" \
-Fp -Xs -P -R

# pg_basebackup with -R creates standby.signal and sets primary_conninfo
# in postgresql.auto.conf. Override to ensure correct connection string.
cat > "$PGDATA/postgresql.auto.conf" <<EOF
primary_conninfo = 'host=$PRIMARY_HOST port=$PRIMARY_PORT user=repl password=repl_pass'
EOF

touch "$PGDATA/standby.signal"

# Fix permissions (pg_basebackup may set them correctly, but be safe)
chmod 0700 "$PGDATA"

echo "Starting PostgreSQL in standby mode..."
exec postgres -D "$PGDATA"
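
Once both containers are up, one way to confirm that init-standby.sh really produced a standby is to ask each node whether it is in recovery. The sketch below assumes `psql` is installed on the host, that the host ports published in docker-compose.yml (15432 and 15433) are reachable, and that the `postgres` / `testpass` credentials work on both nodes. `pg_role_from_flag` and `pg_role` are hypothetical names, not part of this PR.

```shell
# Map the output of `SELECT pg_is_in_recovery()` (psql -At prints t or f)
# to a role name.
pg_role_from_flag() {
  case "$1" in
    t) echo standby ;;
    f) echo primary ;;
    *) echo unknown; return 1 ;;
  esac
}

# Hypothetical wrapper (not in the PR): query a node through its published host port.
pg_role() {
  local port="$1"
  local flag
  flag="$(PGPASSWORD=testpass psql -h localhost -p "$port" -U postgres -d postgres \
    -Atc 'SELECT pg_is_in_recovery()')" || return 1
  pg_role_from_flag "$flag"
}
```

Under those assumptions, `pg_role 15432` should report `primary` and `pg_role 15433` should report `standby`.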