Adjust error bound usage #51

Merged: treigerm merged 3 commits into main from adjust-bounds on Aug 6, 2025

@treigerm
Member

This PR is a follow-up to #43.

I would like to adjust which variables use absolute error bounds and which use relative ones. There are a couple of reasons for this:

  • For the CAMS no2 data, the absolute error bound computation leads to many artifacts for most compressors, as I showed in the meeting last week. After some more experimentation I found that this is mostly because the error bound is not tight enough, not because of numerical issues. Using a relative error bound now makes the output look reasonable. The specific error bounds I used here are based on the number of keepbits computed with the real information approach from https://www.nature.com/articles/s43588-021-00156-2.
  • After the switch, only 3 of the 9 variables in the benchmark have absolute error bounds associated with them. Additionally, 2 of these 3 variables (ta, tos) have NaNs in their fields, so many of the compressors fail on them. That leaves msl (mean sea level pressure) as the only variable that uses an absolute error bound and that all compressors can easily handle. Overall, I think this would skew the emphasis too much towards relative error bounds.
  • Proposed switch: change rlut (outgoing longwave radiation) and the wind variables back to using absolute error bounds. From the plots in #43 (Use ERA5 ensemble bounds), absolute error bounds still seem sensible. Additionally, the automated error bounds for the wind variables are on the same order of magnitude as those chosen by the experts, so the error bounds should be reasonable.
  • Overall, this leaves agb (biomass), pr (precipitation), and no2 as the variables with relative error bounds. I think each of them has a clear reason for using a relative error bound. agb: contains lots of 0s and has a large range [0, 400]; pr: many values at different magnitudes close to 0; no2: many values at different magnitudes close to 0.
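
To illustrate the last point, here is a minimal sketch of why an absolute bound can be too loose for fields whose values span several magnitudes close to 0. The helper names, the sample values, and the bounds are all hypothetical and are not taken from the benchmark code.

```python
import math

def within_abs_bound(original, decoded, eps_abs):
    """True if every non-NaN value satisfies |x - x_hat| <= eps_abs."""
    return all(
        abs(x - y) <= eps_abs
        for x, y in zip(original, decoded)
        if not math.isnan(x)
    )

def within_rel_bound(original, decoded, eps_rel):
    """True if every non-NaN value satisfies |x - x_hat| <= eps_rel * |x|.

    Note that an exact 0 must be reproduced exactly under this definition.
    """
    return all(
        abs(x - y) <= eps_rel * abs(x)
        for x, y in zip(original, decoded)
        if not math.isnan(x)
    )

# Hypothetical field with values spread over several magnitudes near 0,
# loosely imitating pr or no2 (illustrative numbers only).
field = [1e-9, 1e-7, 1e-5, 1e-3]

# A hypothetical decoder that flushes everything below 1e-6 to zero.
flushed = [0.0 if x < 1e-6 else x for x in field]

# The flushed field satisfies the absolute bound even though the two
# smallest values are lost completely; the relative bound rejects it.
abs_ok = within_abs_bound(field, flushed, eps_abs=1e-6)
rel_ok = within_rel_bound(field, flushed, eps_rel=1e-2)
```

In this sketch `abs_ok` is true and `rel_ok` is false, which is the kind of behaviour that makes a relative bound the safer choice for such variables.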

Let me know what you think @juntyr!

@juntyr
Collaborator

juntyr commented Jul 30, 2025

I’m ok with the change in error bounds and would be interested to know whether the combined table, which also includes some bit information computations, contains what you need.

Perhaps @milankl can have a look at the math details here?

@treigerm
Member Author

At least for nitrogen dioxide there will be no match in the ERA5 variables, which is why I had to resort to the computations from the BitInformation paper.
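
For reference, a rough sketch of how a keepbits count can be related to a relative error bound. It assumes float32 data and round-to-nearest on the mantissa, in which case the relative error of a normal (non-subnormal) value is at most 2^-(keepbits+1). The function names and the keepbits value of 7 are illustrative assumptions; the conversion actually used in this PR may differ.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def round_keepbits(x, keepbits):
    """Round a float32 value to `keepbits` explicit mantissa bits.

    Adds half of the discarded range and then masks the trailing bits
    (ties round away from zero). Subnormals and overflow near the
    float32 maximum are not handled in this sketch.
    """
    if not 0 < keepbits < 23:
        raise ValueError("keepbits must be between 1 and 22 for float32")
    shift = 23 - keepbits  # number of mantissa bits to discard
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits = (bits + (1 << (shift - 1))) & ~((1 << shift) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def rel_bound_from_keepbits(keepbits):
    """Worst-case relative error under the assumptions stated above."""
    return 2.0 ** -(keepbits + 1)

keepbits = 7  # hypothetical value, not the one used for no2
bound = rel_bound_from_keepbits(keepbits)  # 2**-8 == 0.00390625

# Illustrative values at different magnitudes, first snapped to float32.
samples = [to_f32(v) for v in (1.2345e-9, 3.14159, 101325.0)]
rel_errors = [abs(round_keepbits(v, keepbits) - v) / abs(v) for v in samples]
all_within = all(err <= bound for err in rel_errors)
```

Under these assumptions, every entry of `rel_errors` stays within `bound`, independent of the magnitude of the value.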

@treigerm
Member Author

treigerm commented Aug 5, 2025

@juntyr I will merge this tomorrow, now that we have discussed the math with @milankl, if there are no further objections!

treigerm merged commit 10d3704 into main on Aug 6, 2025
3 checks passed
treigerm deleted the adjust-bounds branch on August 6, 2025 09:19
3 participants