Fix Hibernate ORM vs SQL schema discrepancies and document analysis #406
…ment Co-authored-by: rvosa <[email protected]>
…sults Co-authored-by: rvosa <[email protected]>
…tches Co-authored-by: rvosa <[email protected]>
@copilot, I need a comprehensive analysis of how character-state matrix data is entered into the database. What I noticed is that there are impedance mismatches when a nexus file is read by headless Mesquite and its contents are inserted into the database. I don't know if this is because there is a discrepancy between Hibernate ORM and the schema, as it is instantiated using the SQL patches, or something else. Based on your analysis in this PR, it is likely something else. The problem manifests specifically when character state matrix data is inserted.

Thinking back to the history of TreeBASE (this is 15 years ago), I recall that the initial database design was highly normalized, such that each individual cell in a character state matrix was a tuple. This was not scalable at all (e.g. a single matrix of 100 taxa with 1000 characters is 100k records, and that's still a small table by today's standards), so we worked on optimising this through denormalisation and possibly also by bypassing some of the ORM. The denormalisation consisted of inserting the entire matrix row as a single "symbol string". Whether we inserted this via Hibernate or a lower-level SQL operation, I don't remember.

I need you to perform an in-depth analysis of the workflow for inserting Nexus data into the database so we can identify the root cause of the stack trace below, which appears in the GUI when a Nexus file upload is performed. We have been stumped by this issue for a few days now, so this needs a thorough scan of all the code and code paths to figure this out. Clearly, it's not a direct discrepancy between Hibernate and PostgreSQL, so it's something else.
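To make the scaling argument in this comment concrete, here is a minimal sketch of the two storage layouts: one record per matrix cell versus one "symbol string" per matrix row. The class and method names are hypothetical and are not taken from the TreeBASE code base; the sketch only reproduces the arithmetic and the row-collapsing idea described above.

```java
/** Hypothetical illustration; none of these names come from the TreeBASE code base. */
public class SymbolStringSketch {

    /** Records needed when every matrix cell is stored as its own tuple. */
    static long normalizedRecordCount(int taxa, int characters) {
        return (long) taxa * characters;
    }

    /** Records needed when each taxon's states are stored as one symbol string. */
    static long denormalizedRecordCount(int taxa) {
        return taxa;
    }

    /** Collapses one matrix row (one state symbol per character) into a single string. */
    static String toSymbolString(char[] states) {
        return new String(states);
    }

    public static void main(String[] args) {
        // 100 taxa x 1000 characters, the example from the comment above.
        System.out.println(normalizedRecordCount(100, 1000));
        System.out.println(denormalizedRecordCount(100));
        // A short made-up row with gap and missing-data symbols.
        System.out.println(toSymbolString(new char[] {'A', 'C', '-', '?', 'T'}));
    }
}
```

With the row-level layout the number of records grows with the number of taxa only, while the length of each string grows with the number of characters, which is why the column holding it has to cope with large text values.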
Co-authored-by: rvosa <[email protected]>
I've identified and fixed the root cause of the MatrixRow loading failure. The issue was a Hibernate @Lob type mismatch:
Problem: The
The Code Path:
Solution (commit f9f4c13):
This is a classic Hibernate/PostgreSQL impedance mismatch when mixing ORM and direct JDBC for large text fields.
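The kind of mismatch discussed in this thread can be illustrated with a toy simulation. It is only a sketch under stated assumptions: it is not Hibernate or PostgreSQL JDBC driver code, all names are hypothetical, and it simply models a column whose value was written as plain text but is later read as if it were a numeric large-object reference (OID).

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of a write-as-text / read-as-LOB mismatch.
 * NOT driver or ORM code; every name here is hypothetical.
 */
public class LobMismatchSketch {

    /** Stand-in for a table column: row id mapped to the raw stored value. */
    private final Map<Long, String> column = new HashMap<>();

    /** Write path: the text itself is stored, as a setString()-style insert would do. */
    void writeAsText(long rowId, String symbols) {
        column.put(rowId, symbols);
    }

    /** Read path A: treat the stored value as plain text. */
    String readAsText(long rowId) {
        return column.get(rowId);
    }

    /** Read path B: treat the stored value as a numeric large-object reference. */
    long readAsLobReference(long rowId) {
        // Throws NumberFormatException when the stored value is the text itself.
        return Long.parseLong(column.get(rowId));
    }

    public static void main(String[] args) {
        LobMismatchSketch db = new LobMismatchSketch();
        db.writeAsText(1L, "ACGT-?ACGT");
        System.out.println(db.readAsText(1L));
        try {
            db.readAsLobReference(1L);
        } catch (NumberFormatException e) {
            System.out.println("LOB-style read failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that both sides must agree on what the column holds: if the writer stores text, the reader has to be mapped as text as well, which is the direction the fix in this PR takes.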
Database Schema Analysis: Hibernate ORM vs SQL Scripts
Summary
This PR provides a comprehensive analysis of discrepancies between the Hibernate ORM data model and SQL-based PostgreSQL schema instantiation, along with fixes and recommendations.
Critical Fix: MatrixRow @Lob Type Mismatch
The Error:
Root Cause: The @Lob annotation on MatrixRow.symbolString caused Hibernate to use OID-based CLOB handling, but the data is inserted via direct JDBC using setString(). When Hibernate tried to read back using getClob(), it interpreted text data as a CLOB OID.
Solution: Removed @Lob from MatrixRow.symbolString and PhyloTree.newickString, replaced with @Column(columnDefinition = "text").
Changes Made
Phase 1: Fix schema patches
init_db_uptodate.pg
Phase 2: Fix LOB type mismatch
@Lob from MatrixRow.symbolString
@Lob from PhyloTree.newickString
@Column(columnDefinition = "text") for PostgreSQL TEXT type
Phase 3: Documentation
/doc/technical-notes/HIBERNATE_VS_SQL_SCHEMA_ANALYSIS.md
Files Changed
treebase-core/db/schema/init_db_uptodate.pg - Added patch 0011
treebase-core/db/schema/patches/0010_fix-taxonlabel-linked-column-type.sql - Made idempotent
treebase-core/db/schema/patches/0011_increase-citation-column-lengths.sql - Made idempotent
treebase-core/src/main/java/org/cipres/treebase/domain/matrix/MatrixRow.java - Fixed @Lob issue
treebase-core/src/main/java/org/cipres/treebase/domain/tree/PhyloTree.java - Fixed @Lob issue
doc/technical-notes/HIBERNATE_VS_SQL_SCHEMA_ANALYSIS.md - Updated with findings
Test Results
createVerified