-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Converted inputs to sets + fixed inputs #87
Conversation
I don't think it is wise to convert the input to lists. The order of the input (e.g. in the SOF) file is important. As in that it affects things in the ESO stack. Also, I would like to (at some point) derive the workflow directly from the recipes, or at least experiment with that. And then the order can be very important. So I (strongly) oppose to this change. |
Plus, can we fix the CI before adding new things? |
So the "set of files" is not actually a set, but a list? 🤔 |
Not sure about the ordering issue, but losing this info might get us tangled down the line, compared to what we gain by going to sets? So, we want 'ordered sets'? Something like |
The source of this information is the order in which they are defined in the class hierarchy (by us). So for instance in But with sets it is easier for me to prevent developer errors: if I derive from |
Technically the order should not matter for the current way things are organized. But making the input a set will make it harder to move your beautiful abstraction layer into the direction I would like. For example, I'd like all the information that is in the EDPS tasks to be also present in the recipe itself. To the extend that the EDPS task can be derived from the recipe. (FWIW, this also means I personally like to have 3 or 4 recipes for the dark instead of 1.) However, I don't think the order of the So I would like a way forward in which this ordering is kept. Hmm, thinking about it more, I do think that it should be possible to infer the ordering in case it would matter. So maybe go ahead and do this. To summarize a case where the ordering would matter in the EDPS workflow is when you'd want your standard star reduction to use the same flat as is used for the science data reduction. Then you need to first determine the flat for the science data before you can decide how to do the standard star reduction (which is indirectly also input to the science task). But if such a selection mechanism is indeed needed, then it must be encoded. E.g. through a query like Also, you are right that this change does not directly affect the SOF files, but indirectly it might. For example if we create documentation from the recipe directly and that people create their SOF based on the documentation. (And you are also right that the order in the SOF files should not matter, but it can.) So I've reduced my strong opposition to a mild one, and perhaps I now even think that the ordering should not matter. Furthermore, there are quite a lot of hypotheticals in the story above, that might never come true, so you should not worry about. (Another hypothetical: say you actually want the same input twice? Not as a list-of-two, but as two distinct inputs. Now we can't do that anymore! But try as I might, all scenario's I can come up with would be better served with either a list, or by having different classes for the input data so they are not duplicates anymore.) Concluding, I think the benefits outweigh the negatives, so good idea! |
Maybe a data structure with a deterministic ordering like OrderedSet would indeed be beneficial. Because just this morning I fixed a bug in the ScopeSim tests because |
@hugobuddel Thanks for the stream of consciousness! I had a throbbing headache at night and decided not to touch anything, now with a clear head I think I understand. I see no downsides in using an ordered set, but then we need to make sure it matches DRLD / workflows everywhere. On the other hand, I cannot find any ordered set implementation in the standard library, only If an input file is required twice for whatever reason, it is perfectly possible to use two separate |
From Python 3.7 onwards a keys-only dict should mostly work like an ordered set? Depends on the operations we expect on it after it is created, though. |
Then I am already using that. But I get
which is exactly what I want, but apparently not ordered. |
I like to take a step back and think about what we are trying to solve. You claim that this is necessary for the "used frames", that is #85 (if I understand correctly), because there you say
However, I'm not sure that #85 solves the problem as set out in #79, which I think it is aimed to do. So transitively, I'm not sure this P.R. is necessary. We might still merge this because it seems useful on its own, but we should do so for the right reasons. The main reason to distinguish between "all frames" and "used frames" is when a recipe has multiple output products, and the different output products use different input. The two main cases are:
In both cases I don't see how having the input as a set would help. But I could be misunderstanding what we are trying to achieve. Also, we can still merge this either way. |
I see. No, for now I only wanted to compare frames present in the SOF and those that are also loaded. If it should mean tracking which inputs were actually used for which product, it does not attempt to solve that. Currently I am only trying to ensure that all inputs are present in the SOF and read (unless not required) and vice versa, that there are no extra / unused files in the SOF (I silently assume that the files are correct and whenever something is missing, it is the fault of whoever wrote the recipe). So it still is a tiny step in the right direction, but for validating the But I guess we will have to add something similar to the Again, it seems to me that this will be achieved more easily with sets rather than lists, but I will gladly defer this discussion to the future PR. |
I have found the error, it is the discrepancy between what However that makes other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sesquideus , let's merge this and get everything working again!
Some thoughts:
- I think we should write our recipes such that they will do the best they can even if input data is missing (or weird). E.g.
metis_det_detlin
might still be able to do something sensible if there is noLM_WCU_OFF_RAW
. Or maybe depending on some flag that can be set. Such a 'just do it' flag might be especially useful for AIT because people might deliberately want to skip some calibration steps. (Persistence should always be optional though.) - The recipes should emit a warning on missing input so it is immediately clear what is wrong.
- If a change in the recipes require a change in the workflow, then we can just make that change in the workflow too.
jb/ifu_rsrf_dev
inputs
to sets to avoid duplicates and facilitate comparisons