
Device UKS/GKS Implementation #91

Merged
wavefunction91 merged 81 commits into wavefunction91:master from mikovtun:UKS_device
Jul 30, 2024

@mikovtun
Contributor

@mikovtun mikovtun commented Nov 28, 2023

This PR adds device (CUDA) support for XC calculations with Unrestricted and Generalized Kohn-Sham references. LDA and GGA functionals are supported.

Owner

@wavefunction91 wavefunction91 left a comment

Quick once-over: this looks like a good start. We definitely want to get UKS GGA plus unification of the RKS/UKS drivers (e.g. the local_work bits). I will do a more thorough review once it's more complete!

Comment thread src/xc_integrator/local_work_driver/device/common/uvvars.hpp Outdated
Comment thread src/xc_integrator/local_work_driver/device/common/zmat_vxc.hpp Outdated
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/uvvars.cu Outdated
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/uvvars.cu Outdated
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/uvvars.cu Outdated
Comment thread src/xc_integrator/local_work_driver/device/local_device_work_driver.hpp Outdated
Comment thread src/xc_integrator/xc_data/device/xc_device_aos_data.cxx Outdated
Comment thread src/xc_integrator/xc_data/device/xc_device_data.hpp Outdated
Comment thread src/xc_integrator/xc_data/device/xc_device_data.hpp Outdated
Comment thread src/xc_integrator/xc_data/device/xc_device_data.hpp Outdated
@wavefunction91 wavefunction91 changed the title Device LDA UKS Device UKS Implementation Nov 29, 2023
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/uvvars.cu Outdated
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/zmat_vxc.cu Outdated
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/zmat_vxc.cu Outdated
Comment thread src/xc_integrator/xc_data/device/xc_device_data.hpp Outdated
Comment thread src/xc_integrator/xc_data/device/xc_device_data.hpp Outdated
Comment thread tests/xc_integrator.cxx Outdated
@wavefunction91 wavefunction91 changed the title Device UKS Implementation Device UKS/GKS Implementation May 8, 2024
Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/uvvars.cu
Owner

@wavefunction91 wavefunction91 left a comment

Please add a screenshot or terminal output showing no performance degradation for Ubiquitin/DZ/PBE between current master and this PR, plus the miscellaneous minor changes noted, and this is good to go.

Sorry for the delay.

Comment thread src/xc_integrator/local_work_driver/device/cuda/kernels/uvvars.cu

if (mtemp > dtolsq) {
  const double inv_mnorm = rsqrt(mtemp);
  mnorm = 1./inv_mnorm;
Owner

For future note, this is likely the culprit for #134.
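
For readers following the snippet above: it recovers the norm as the reciprocal of CUDA's `rsqrt`, guarded by a squared tolerance. The thread does not say what #134 is or how it was resolved, so the following is only a minimal host-side C++ sketch of the same guarded pattern, written with a direct square root instead. The names `GuardedNorm`, `guarded_norm`, `m_sq`, and `tol_sq` are hypothetical stand-ins (for `mtemp` and `dtolsq`), and returning zeros below the threshold is an assumption, since the snippet does not show that branch.

```cpp
#include <cmath>

// Hypothetical result type: the norm and its reciprocal.
struct GuardedNorm {
  double norm;      // sqrt(m_sq), or 0 when m_sq is at or below the threshold
  double inv_norm;  // 1 / norm, or 0 when m_sq is at or below the threshold
};

// m_sq stands in for `mtemp` (a squared norm) and tol_sq for `dtolsq`.
// Assumption: below the threshold both outputs are set to zero.
inline GuardedNorm guarded_norm(double m_sq, double tol_sq) {
  GuardedNorm out{0.0, 0.0};
  if (m_sq > tol_sq) {
    out.norm = std::sqrt(m_sq);     // take the square root directly
    out.inv_norm = 1.0 / out.norm;  // derive the reciprocal from the norm
  }
  return out;
}
```

Calling `guarded_norm(25.0, 1e-24)` yields a norm of 5 and a reciprocal of 0.2, while an input below the tolerance yields zeros for both. Whether a change along these lines is appropriate for the device kernel is a question for the discussion in #134.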

Comment thread src/xc_integrator/shell_batched/shell_batched_replicated_xc_integrator_exx.hpp Outdated
@mikovtun
Contributor Author

To demonstrate that this PR doesn't degrade RKS performance, here are the timings (from standalone_driver) for Ubiquitin/DZ/PBE on a single A100 on Perlmutter (master here is 1a47c11 and UKS_device is 448bec8).
[Image: Screen Shot 2024-07-30 at 1 12 24 PM]

Owner

@wavefunction91 wavefunction91 left a comment

LGTM

@wavefunction91 wavefunction91 merged commit 2e489d4 into wavefunction91:master Jul 30, 2024