Pawel plesniak/static port fix#741

Open
PawelPlesniak wants to merge 17 commits into develop from PawelPlesniak/StaticPortFix

@PawelPlesniak
Collaborator

PawelPlesniak commented Dec 10, 2025

Description

Fixes #702

Includes basic checks for the root controller and local connectivity service applications, and points the user to the commands that should be run to address this.

Changelog

Before booting, the process manager driver validates that the port numbers requested for the local connectivity service and the root controller are free. If they are free, nothing changes. If a port is occupied and the configuration file is modifiable, the port number is changed in the configuration file. If a port is occupied and the configuration file is not modifiable (e.g. the file lives on CVMFS), the DAL is updated but the configuration file is not. The session can then still run, but the consolidated JSON will contain port numbers that differ from those in the configuration file. This is a compromise between updating the port numbers and keeping the DAL as up to date as possible. I do not think the port number used is a vital piece of information that must be kept for the offline database analysis.
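
The decision flow described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `port_is_free`, `resolve_port`, the body of `find_free_port`, and the use of `os.access` to decide whether the configuration file is modifiable are all hypothetical. Only the `find_free_port` name and the 30000 to 32767 range appear in the review snippet of this PR.

```python
import os
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding fails when another socket already holds the address.
            return False
    return True


def find_free_port(low: int, high: int, host: str = "127.0.0.1") -> int:
    """Return the first free port in [low, high] (hypothetical implementation)."""
    for port in range(low, high + 1):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"No free port found in range {low}-{high}")


def resolve_port(requested: int, config_path: str) -> tuple[int, bool]:
    """Return (port_to_use, write_back_to_file).

    write_back_to_file is True only when the port had to change and the
    configuration file is writable. Otherwise only the in-memory
    configuration (the DAL) would carry the new port.
    """
    if port_is_free(requested):
        return requested, False  # nothing changes
    new_port = find_free_port(30000, 32767)
    return new_port, os.access(config_path, os.W_OK)
```

A caller would apply the returned port to the in-memory service object in every case, and rewrite the configuration file only when the second element is `True`.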

Suggested testing methods

On the same physical host run

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config FirstInstanceOfSession
boot

then in a separate tty run

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config SecondInstanceOfSession
boot

This will give you logs such as

Found free Kubernetes NodePort: 30510
Updated RC Controller Service 'root-rccontroller_control' to use port 30510
Successfully configured RC controller port for session 'local-1x1-config'.

and

Found free Kubernetes NodePort: 32030
Updated Connectivity Service 'local-connectivity-service' to use port 32030
Updated runtime environment variable 'local-env-connectivity-port' to '32030'
Successfully configured connectivity service port for session 'local-1x1-config'.

Note - if you attempt to use two identical session IDs with the EHN1 connectivity service, this will fail: the root controller port is remapped, but the "new" session is pointed to the "old" session in the connectivity service storage. Adding an endpoint to check against existing session IDs has been proposed here.

All the integration tests have passed.

Type of change

  • Documentation (non-breaking change that adds or improves the documentation)
  • New feature (non-breaking change which adds functionality)
  • Optimization (non-breaking, back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (whatever its nature)

Key checklist

  • All tests pass (eg. python -m pytest)
  • Pre-commit hooks run successfully (eg. pre-commit run --all-files)

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added or an issue has been opened to tackle that in the future.
    (Indicate issue here: # (issue))

@PawelPlesniak
Collaborator Author

Note - this has not been marked as "Ready for review" because the architecture behind the integration tests that start the LCS separately is not yet understood, so all integration tests that start their own LCS fail. This will be corrected, but until then this PR is on hold.

Base automatically changed from prep-release/fddaq-v5.5.0 to develop December 15, 2025 17:19
@PawelPlesniak
Collaborator Author

In one of the Run Control technical meetings, it was decided that the run control will be able to change the port number provided in the configuration. Confirmed with @mroda88: this is a high priority item.

@PawelPlesniak
Collaborator Author

Note - this PR is currently in progress as testing with EHN1 configurations fails the root controller address checks

@PawelPlesniak
Collaborator Author

This is now addressed, requires testing only

@PawelPlesniak
Collaborator Author

Integration tests pass

@PawelPlesniak
Collaborator Author

Note - in commit 190fec5, I have removed all log handlers from the root logger, as the daqconf scripts still configure logging with the old logging.basicConfig style, as opposed to the more modern daqpytools implementation.
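
For readers unfamiliar with the pattern, removing the root logger's handlers can be done with the standard `logging` module along these lines. This is an illustrative sketch, not the contents of the commit, and `clear_root_handlers` is a hypothetical name.

```python
import logging


def clear_root_handlers() -> int:
    """Detach every handler from the root logger and return how many were removed."""
    root = logging.getLogger()
    removed = 0
    # Iterate over a copy, because removeHandler mutates root.handlers.
    for handler in list(root.handlers):
        root.removeHandler(handler)
        removed += 1
    return removed
```

Handlers installed earlier through `logging.basicConfig` are attached to the root logger, so after this call only handlers attached to named loggers would still emit records.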

@PawelPlesniak
Collaborator Author

When testing, it was found that without KUBE_CONFIG set, one gets the following error

[2026/02/17 14:20:23 UTC] ERROR      unified_shell.py:15                      drunc.unified_shell         
🔥🔥 Exception thrown 🔥🔥
[2026/02/17 14:20:23 UTC] ERROR      unified_shell.py:16                      drunc.unified_shell         
HTTPSConnectionPool(host='10.73.136.40', port=6443): Max retries exceeded with url: /api/v1/services 
(Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify 
failed: unable to get local issuer certificate (_ssl.c:1010)')))
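
One possible way to fail early with a clearer message, instead of reaching the SSL error above, is to check the variable before any Kubernetes API call is made. This is only a sketch of such a guard: `require_kube_config` and `MissingKubeConfigError` are hypothetical names, and the variable name `KUBE_CONFIG` is taken from the comment above rather than from the code.

```python
import os
from typing import Mapping


class MissingKubeConfigError(RuntimeError):
    """Raised when the Kubernetes configuration variable is unset (hypothetical)."""


def require_kube_config(env: Mapping[str, str], var_name: str = "KUBE_CONFIG") -> str:
    """Return the value of var_name, or raise a descriptive error if it is unset or empty."""
    value = env.get(var_name, "").strip()
    if not value:
        raise MissingKubeConfigError(
            f"{var_name} is not set; skipping the Kubernetes NodePort lookup"
        )
    return value
```

In practice the function would be called as `require_kube_config(os.environ)` before the port lookup, with the caller deciding whether to exit or to fall back to a purely local port check.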

@PawelPlesniak
Collaborator Author

This needs a warning and an exit if the configuration being tested is remote, and the full integration test still needs to be run.

@PawelPlesniak
Collaborator Author

Note - the failing pytests have been addressed in a separate PR

@PawelPlesniak
Collaborator Author

PawelPlesniak commented Feb 27, 2026

Integration tests passed

if config_is_read_only:
    new_port = find_free_port(30000, 32767)
    root_controller_service.port = new_port
    self.log.debug(

Collaborator Author

Increase log level to info

@emmuhamm force-pushed the PawelPlesniak/StaticPortFix branch from ae477f6 to e6d1837 on March 5, 2026 17:03

Development

Successfully merging this pull request may close these issues.

[Bug]: root-controller always points to port 30006

2 participants