…LI to manage input data
test/workflows/automatic_job_grouping/inputs_files-strings.yaml
class TransformationSubmissionModel(BaseModel):
    """Transformation definition sent to the router."""

    # Allow arbitrary types to be passed to the model
    model_config = ConfigDict(arbitrary_types_allowed=True)

    task: CommandLineTool | Workflow | ExpressionTool
    input_data: Optional[list[str | File] | None] = None
As we are going to integrate input sandbox within transformations (#92), it would be interesting to see if we could reuse the JobInputModel (renamed as InputModel?)
Regarding @arrabito comments:
- I agree that we don't need to have input sandbox for now, so it can't be local files.
- I don't remember how we will add support for sandboxes in the transformation system. For simplicity, I would keep just LFN paths for now.
- As said before, in my opinion there is no need to support/create sandboxes for now.
Should I still make this change in this PR, or would it be better to do it in a (future) sandbox PR? Maybe I misunderstood what you meant here.
Let's make this change in a future sandbox PR I would say
Thinking a little bit further, we may also want to allow local file paths, but only to be used for Local execution (without adding them to SB).
So if the submission is local we allow only local paths, while if the submission is to DIRAC we allow only LFN paths.
In this way, we could also execute transformations locally.
Later on, we will also allow local file paths for DIRAC submission (adding them to the ISB).
@aldbr what do you think?
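To make the proposal above concrete, here is a minimal sketch of how the check could look. Everything in it is hypothetical: the `SubmissionMode` enum, the `validate_input_paths` helper, and in particular the convention that an LFN is recognized by an `lfn:` prefix are assumptions for illustration, not existing code in this repository.

```python
from enum import Enum


class SubmissionMode(Enum):
    """Hypothetical submission targets discussed in this thread."""

    LOCAL = "local"
    DIRAC = "dirac"


# Assumption: LFNs are marked with this prefix; the real convention may differ.
LFN_PREFIX = "lfn:"


def is_lfn(path: str) -> bool:
    """Return True if the path looks like an LFN under the assumed convention."""
    return path.lower().startswith(LFN_PREFIX)


def validate_input_paths(paths: list[str], mode: SubmissionMode) -> list[str]:
    """Accept only local paths for local runs and only LFNs for DIRAC runs."""
    for path in paths:
        if mode is SubmissionMode.LOCAL and is_lfn(path):
            raise ValueError(f"LFN not allowed for local submission: {path}")
        if mode is SubmissionMode.DIRAC and not is_lfn(path):
            raise ValueError(f"Local path not allowed for DIRAC submission: {path}")
    return list(paths)
```

Once local files can be added to the input sandbox, the DIRAC branch could be relaxed instead of raising.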
@aldbr Regarding this part of the code: dirac-cwl/src/dirac_cwl_proto/transformation/__init__.py Lines 130 to 163 in 72956d5 Are we planning on keeping it? Just so I un-comment it and make the changes related to the |
Waiting on #66 (comment) and #95 (comment) approval about what we're doing, and then, PR should be ready to be fully reviewed (and potentially merged 🙏). |
Yes we want to keep it. A transformation should either get inputs from the CLI, or from a |
I’m also not sure whether the Also, the If you have any ideas. |
Since |
As far as I see, I'm not sure that any input_name is needed anymore. In the current QueryBasedPlugin, input_name is just used to build the LFN path, see: Probably we could just change get_input_query to not take any argument and just build LFN path as:
instead of:
Then, I guess that the group_size in yaml file should be specified as: instead of: @aldbr do you agree? (Maybe some other changes are needed that I haven't thought). |
test/test_integration.py
task_file = job_wrapper.job_path / "task.cwl"
task_file.unlink(missing_ok=True)
I see that you are using these 2 lines in several tests.
Can you create a fixture, or reuse an existing one, in conftest?
In general you can yield from the fixture and put these 2 lines after the yield, so that they run once the test has finished. Example: https://docs.pytest.org/en/6.2.x/fixture.html#yield-fixtures-recommended
I tried to add a fixture like this:

@pytest.fixture
def cleanup_wrapper():
    job_wrapper = JobWrapper()
    yield job_wrapper
    task_file = job_wrapper.job_path / "task.cwl"
    task_file.unlink(missing_ok=True)

That way I could use the yielded JobWrapper in the tests and clean up afterwards (because I need access to the job_path value).
But I kept getting errors about the create_sandbox method. This also happens when the fixture is not used in any test: just having it in conftest makes the tests fail:

self = <[AttributeError("'pathlib._local.PosixPath' object has no attribute '_raw_paths'") raised in repr()] PosixPath object at 0x1070f0c80>
args = (<coroutine object create_sandbox at 0x1070ebb40>,), paths = [], arg = <coroutine object create_sandbox at 0x1070ebb40>
path = <coroutine object create_sandbox at 0x1070ebb40>
E TypeError: argument should be a str or an os.PathLike object where __fspath__ returns a str, not 'coroutine'

I spent a lot of time on this yesterday and I still don't understand why it occurs, which is why I just added the lines that were working.
If you have any ideas on why it doesn't work, please let me know.
I haven't looked at your code, but I had a similar error when the DIRAC_PROTO_LOCAL variable was not correctly set.
So maybe you can check lines in your code with:
os.environ["DIRAC_PROTO_LOCAL"]
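A quick way to check this, as a sketch only: `os.environ["DIRAC_PROTO_LOCAL"]` raises `KeyError` when the variable is unset, while `os.environ.get` returns `None`, so a small helper can report which case you are in. The helper name `describe_env_flag` is made up for illustration, and it deliberately does not assume which value the project expects.

```python
import os

ENV_VAR = "DIRAC_PROTO_LOCAL"


def describe_env_flag(name: str = ENV_VAR) -> str:
    """Report whether the variable is set, without assuming an expected value."""
    value = os.environ.get(name)  # returns None instead of raising KeyError
    if value is None:
        return f"{name} is not set"
    return f"{name}={value!r}"


# Print the current state, e.g. at the top of conftest.py while debugging.
print(describe_env_flag())
```

Printing this once at import time of conftest should show whether the variable is already wrong before any fixture runs.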
Yes I agree. In any case, this is going to be revised at some point with the hints proposed in #69 |
Current PR status:
… 1st Transformation input_files
DRAFT PR
cc @aldbr @arrabito @natthan-pigoux
Closes: #66
Related to: #61
Changes:
- input_data: list[str | File] to TransformationSubmissionModel
- inputs-file parameter to Transformation CLI: dirac-cwl transformation submit file.cwl --inputs-file file.yaml
- parameter-path to input_files: list[str] in Job CLI: dirac-cwl job submit file.cwl --input-files file1.yaml file2.yaml ...
- group_size executionHooksHint to Transformation Workflows, such as:
- group_size determines, in submit_transformation_router, the number of jobs to be created and how many input files each will contain. By default it equals 1, which means a job will be created for each input in the inputs file. Once the list of jobs is created, it is sent to the job_router and processed.
- JobWrapper related tests: task.cwl was created during post_process but never cleared after running tests. I couldn't manage to create a fixture to do that (I had strange errors), so it can probably be done more cleanly.
Comments:
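To illustrate the group_size behaviour described in the changes, here is a minimal sketch of the grouping step. It is not the code in submit_transformation_router: the `group_inputs` function is hypothetical, and letting the last group be smaller when the inputs do not divide evenly is an assumption.

```python
def group_inputs(inputs: list[str], group_size: int = 1) -> list[list[str]]:
    """Split the inputs into consecutive groups, one group per job."""
    if group_size < 1:
        raise ValueError("group_size must be >= 1")
    # Assumption: a trailing remainder forms its own, smaller group.
    return [inputs[i : i + group_size] for i in range(0, len(inputs), group_size)]


# Default group_size=1: one job per input.
print(group_inputs(["a.txt", "b.txt", "c.txt"]))
# group_size=2 over five inputs: three jobs, the last with a single input.
print(group_inputs(["a", "b", "c", "d", "e"], group_size=2))
```

Each inner list would then correspond to the input files of one job sent to the job_router.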
TODO after this PR: