Skip to content

Add new template functions for internal error/warning/message to ease translation#5056

Merged
mattdowle merged 10 commits intomasterfrom
template-translation
Jul 8, 2021
Merged

Add new template functions for internal error/warning/message to ease translation#5056
mattdowle merged 10 commits intomasterfrom
template-translation

Conversation

@MichaelChirico
Copy link
Copy Markdown
Member

@MichaelChirico MichaelChirico commented Jun 29, 2021

potools invocation helping us out here (I'll write this in CRAN_release once potools itself lands on CRAN):

potools::translate_package(custom_translation_functions = list(
  R = c("stopf:fmt|1", "warningf:fmt|1", "messagef:fmt|1", "packageStartupMessagef:fmt|1"),
  src = c("STOP:1", "DTWARN:1", "Error:1", "DTPRINT:1")
))

A few thoughts

  1. We could just replace all stop/warning/message usages with stopf/warningf/messagef to keep a consistent interface & not worry about switching between stop/stopf based on if there are templates.
  2. We could consider another function (nstopf()?) to wrap the somewhat ugly stop(domain=NA, sprintf(ngettext(n, msg1, msg2), ...)) case (and of course nwarningf/nmessagef)
  3. (for another day) there's definitely a lot of inconsistency internally with punctuation... whether we wrap columns/types/data.table names with single/double/no quotes, etc. could consider a pass over messages to come up with a consistent taxonomy there. potools::get_message_data() will facilitate this by getting a data set of messages.

@MichaelChirico MichaelChirico added the translation issues/PRs related to message translation projects label Jun 29, 2021
bad.counts = name.tab[1 < name.tab]
if (length(bad.counts)) {
stop(err.what, " should be uniquely named, problems: ", paste(names(bad.counts), collapse=","))
stopf("%s should be uniquely named, problems: %s", err.what, brackify(names(bad.counts)))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tdhock FYI I think in a follow-up to this PR we'll want to revisit error messages here, it's not very translation-friendly at the moment (not a high priority)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure that is fine with me. what exactly does it mean to be translation-friendly?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case, we would have to have translators translate sentence fragments in isolation. the way to combine those fragments as we do here in English is likely to be different in other languages.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that makes sense, so in this case err.what has three possible values, "measured columns", "elements of fun.list", "... arguments to measure" which we should convert to three separate calls to stop, for example the first one becomes "measured columns should be uniquely named, problems: "

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds good... full sentences are usually good

R/fread.R Outdated
ngettext(
sum(idx_type),
'colClasses dictated by user input and those read from YAML header are in conflict (specifically, for column [%s]); the proceeding assumes the user input was an intentional override and will ignore the types implied by the YAML header; please exclude this column from colClasses if this was unintentional.',
'colClasses dictated by user input and those read from YAML header are in conflict (specifically, for columns [%s]); the proceeding assumes the user input was an intentional override and will ignore the types implied by the YAML header; please exclude these columns from colClasses if this was unintentional.'
Copy link
Copy Markdown
Member

@mattdowle mattdowle Jul 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be reduced to one string using "column(s)" and "these columns" always? I'm thinking "these columns" still reads ok when there's just one, because "column(s)" occurs earlier in the message. Then this could be one call to messagef(), simpler and without repeating the long message string.
Or perhaps "column(s)" is hard to translate?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're safe for now because Chinese doesn't really have pluralization, and I only see maybe extending to a few romance languages (which will have similar pluralization to English), so I agree it's nice to reduce the overhead here given how clunky ngettext() is. we can revisit in the future on request.

i also just wrote a message elsewhere that is like start(s): %d; end(s); %d, which, IINM, would require nested ngettext()` calls to "properly" pluralize. yuck.

@mattdowle mattdowle added this to the 1.14.1 milestone Jul 8, 2021
@mattdowle
Copy link
Copy Markdown
Member

mattdowle commented Jul 8, 2021

Awesome PR. This solves this area very nicely.

Re 1-3 thoughts in top comment :

  1. yes agree
  2. using the colClasses dictated by user input ... message I commented about above as an example of this, I'm wondering if we can move to generic messages that cover both cases of single and plural. I know I was the one who started doing if (plural) "s" else "" as I thought at the time it was going the extra mile, but in cases like this the extra mile feels like it's now costing too much in complexity; e.g. repeating the whole long message string. When it comes to future maintenance of that message we'll have to make sure changes are made to both and that both strings are translated twice. And we'll need to test and cover the plural switch too: that we didn't in this case resulted in the missed part of the message when plural. Faced with these potential burdens, I'm now leaning towards using generalizations like "column(s): ...". Then we wouldn't need nstopf(). WDYT?
  3. agree

@mattdowle
Copy link
Copy Markdown
Member

mattdowle commented Jul 8, 2021

I realized what I was suggesting was a bit tricky so I had a go in 0e45b19.
I found the 'unless ...' at the end of this helpful : https://www.chicagomanualofstyle.org/qanda/data/faq/topics/Plurals/faq0003.html
Doing this revealed that types also needed attention, so I changed that to types(s) too.
8 lines reduced to 2, uses messagef(), and the long string is no longer repeated.

@mattdowle
Copy link
Copy Markdown
Member

Since pluralization will need a future PR anyway either for this one or the other similar ones, merging now to make progress, so we can start using stopf etc.

@mattdowle mattdowle merged commit 1759f3c into master Jul 8, 2021
@mattdowle mattdowle deleted the template-translation branch July 8, 2021 03:40
@MichaelChirico
Copy link
Copy Markdown
Member Author

Sounds good, working on a follow-up now.

mattdowle added a commit that referenced this pull request Jul 8, 2021
@jangorecki jangorecki removed this from the 1.14.9 milestone Oct 29, 2023
@jangorecki jangorecki added this to the 1.15.0 milestone Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

translation issues/PRs related to message translation projects

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants