Standardize the output format of the test functions

We have ambitious goals to make experiments with different test functions extremely flexible. For example:

* Enable `sim_gs_n()` to apply a different test at each cut, and even potentially multiple tests per cut
* Enable `maxcombo()` to accept an arbitrary number of test functions and return the best result
* Enable `multitest()` to run multiple tests on a given data set and return the results (currently works by returning a list, but a data frame would be easier for downstream analysis)

Currently this flexibility isn't possible. While the test functions all accept a data frame of trial data as their first positional argument, the outputs are not uniform: the returned columns differ between tests, and even between weight functions within `wlr()`. This prevents combining the results into a single data frame for downstream analysis. I also don't see how `maxcombo()` could be extended to accept any number of arbitrary tests without a common quantity to compare them on (though I admit I may not fully understand the future plans for `maxcombo()`).

The example below demonstrates the diversity of the outputs:

```R
remotes::install_github("Merck/simtrial@2c015f6", upgrade = FALSE)
library("simtrial")
packageVersion("simtrial")
## [1] ‘0.3.2.13’

trial_data <- sim_pw_surv()
trial_data_cut <- cut_data_by_event(trial_data, 150)

wlr(trial_data_cut, weight = fh(rho = 0, gamma = 0))
##   rho gamma         z
## 1   0     0 -1.392586
wlr(trial_data_cut, weight = fh(rho = 0, gamma = 0.5))
##   rho gamma         z
## 1   0   0.5 -1.650691
wlr(trial_data_cut, weight = mb(delay = 3))
##           z
## 1 -1.425919
rmst(trial_data_cut, tau = 20)
##   rmst_arm1 rmst_arm0 rmst_diff         z
## 1  11.16504  10.20869 0.9563463 0.6412476
milestone(trial_data_cut, ms_time = 10)
##      method          z ms_time surv_ctrl surv_exp surv_diff std_err_ctrl std_err_exp
## 1 milestone 0.04013879      10      0.46     0.48      0.02   0.07048404  0.07065409
maxcombo(trial_data_cut, rho = c(0, 0), gamma = c(0, 0.5))
##      p_value
## 1 0.06348763
```
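To make the problem concrete, here is a minimal base R sketch that does not need simtrial. It rebuilds two of the outputs above by hand (the data frames `res_fh` and `res_mb` are stand-ins I typed in from the printed results, not objects returned by the package) and shows that `rbind()` refuses to combine them:

```R
# Hand-built stand-ins for two of the outputs printed above
res_fh <- data.frame(rho = 0, gamma = 0, z = -1.392586)  # wlr() with fh()
res_mb <- data.frame(z = -1.425919)                      # wlr() with mb()

# rbind() requires matching columns, so this errors;
# capture the message instead of stopping the script
combined <- tryCatch(
  rbind(res_fh, res_mb),
  error = function(e) conditionMessage(e)
)
combined
```

Any code that wants to stack results from several tests currently has to special-case each output shape.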

I propose that each test function return a named vector with 3 elements:

* `description`: brief description of the test performed
* `test statistic`: whatever makes sense to return from the test
* `p-value`: to facilitate comparing across tests with different test statistics
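A rough sketch of that shape, using the `z` value from the first `wlr()` call above. The helper `as_test_result()` is hypothetical, and the one-sided `pnorm(z)` p-value is an assumption for illustration only, not necessarily how simtrial would compute it. One thing the sketch makes visible: an atomic vector in R holds a single type, so mixing a character description with numbers coerces every element to character.

```R
# Hypothetical helper, not part of simtrial
as_test_result <- function(description, statistic, p_value) {
  c(
    description = description,
    `test statistic` = statistic,
    `p-value` = p_value
  )
}

z <- -1.392586
# Assumption: one-sided p-value from a standard normal
res <- as_test_result("WLR: FH(rho = 0, gamma = 0)", z, pnorm(z))

names(res)
typeof(res)  # "character", because of the description element

# Numeric elements have to be converted back before use
as.numeric(res[["p-value"]])
```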

If this is too simplistic, we could include more information (e.g. the input parameters) by either:

* Converting from a named vector to a named list. We could then add more complex elements like `parameters`, which could itself be a named vector, and convert the results to a data frame using a list column for `parameters`
* Switching to OOP and defining an S3 class for each test result. The object could then carry as much data as we like, while an `as.data.frame()` method would extract a minimal set to combine the results of many tests into a single data frame
* Packing more information into the `attributes` of the named vector. This isn't really friendly for downstream analysis, though, and the attributes would likely be lost in functions like `sim_gs_n()` that return a data frame
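The first two options can be sketched together in base R. Everything here is hypothetical: the class name `test_result`, the constructor `new_test_result()`, the element names, and the one-sided `pnorm(z)` p-values are assumptions for illustration, with the `z` values copied from the `wlr()` output above.

```R
# Hypothetical constructor: a named list with an S3 class attached
new_test_result <- function(description, statistic, p_value, parameters = list()) {
  structure(
    list(
      description = description,
      statistic = statistic,
      p_value = p_value,
      parameters = parameters
    ),
    class = "test_result"
  )
}

# Extract a minimal one-row data frame; parameters go into a list column
as.data.frame.test_result <- function(x, ...) {
  df <- data.frame(
    description = x$description,
    statistic = x$statistic,
    p_value = x$p_value,
    stringsAsFactors = FALSE
  )
  df$parameters <- list(x$parameters)
  df
}

z_values <- c(-1.392586, -1.650691)
results <- list(
  new_test_result("WLR FH(0, 0)", z_values[1], pnorm(z_values[1]),
                  list(rho = 0, gamma = 0)),
  new_test_result("WLR FH(0, 0.5)", z_values[2], pnorm(z_values[2]),
                  list(rho = 0, gamma = 0.5))
)

# Stack the per-test rows into a single data frame
rows <- lapply(results, function(r) as.data.frame(r))
combined <- do.call(rbind, rows)
combined[, c("description", "statistic", "p_value")]
```

With this layout the scalar columns stay numeric, and test-specific inputs remain available via `combined$parameters[[i]]` without forcing every test to share the same parameter columns.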

xref: #215 
