
no commands listed #3363

Open

sjanssen2 opened this issue Feb 16, 2024 · 11 comments

Comments

@sjanssen2
Contributor

I came across the following situation: I configured and started plugins on top of the qiita_test system. Adding a new study with a new 16S prep and uploading per_sample_fastq files led to an empty combo box when processing the per_sample_fastq artifact:

[screenshot: empty processing combo box]

I tracked this down to the fact that the set of active command ids (cids) and the commands from the default workflows for 16S (cmds) had zero overlap. This might be an edge case, but would it be worth warning the user about this situation, i.e. printing to the log that the subsetting resulted in an empty command list?

cids = cmds & cids
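The suggested warning could be sketched roughly as follows; `filter_commands` is a hypothetical helper, not part of the Qiita codebase, shown only to illustrate where a log message would help:

```python
import logging

logger = logging.getLogger(__name__)

def filter_commands(active_cids, workflow_cmds):
    """Intersect the active command ids with those referenced by the
    matching default workflows, warning when the result is empty.
    Hypothetical helper for illustration only."""
    cids = workflow_cmds & active_cids
    if active_cids and not cids:
        logger.warning(
            "default-workflow filtering removed all %d active commands; "
            "the processing combo box will be empty", len(active_cids))
    return cids
```

With such a message in the log, the empty combo box would at least be traceable without stepping through `artifact.py` in a debugger.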

@sjanssen2
Contributor Author

Am I right that qiita-env clean-test does not clear the database of test entries, but rather restores the test data should they have been messed up by some test? If so, how could I best remove / update the defined "recommended workflows" such that their command_ids match those provided by the active plugins?

@sjanssen2
Contributor Author

I found qp_target_gene/support_files/patches/171029_QIIME_v191_to_QIIMEq2_v191.sql, which I guess is the source of the plugin renaming (QIIME -> QIIMEq2), but this patch does not cover workflows. Probably because they did not exist at that time?

@antgonza
Member

Just to answer your questions:

  1. Yes, qiita-env clean-test cleans/restores the qiita-test database so it is in the same state for new tests to run. This is helpful when you need to test something and cleaning the database manually is not possible. Note that the decorator @qiita_test_checker also runs that code at the end of each of the qiita-test objects, and that the system needs to be in test mode for this to apply.
  2. QIIME -> QIIMEq2 was done many (many ...) moons before the workflow idea was applied.

Now, in general, plugin testing should happen via the testing framework (you can see any plugin for examples), which means the plugins are not normally shown in the GUI - mainly because the database is reset at the end of testing.

Hope this helps.

@antgonza
Member

Closing for now, please reopen if you have more questions.

@sjanssen2
Contributor Author

I am coming back to this filtering, as I do not fully understand the rationale here. I set up a fresh Qiita instance in test mode and installed the following plugins:

[screenshot: installed plugins]

Next, I created a new "study" with one "preparation", with sample- and prep-metadata and 3 per-sample fastq files:

[screenshot: study and preparation]

As I am in the test environment, these are the available "recommended workflows" (note that none contains e.g. a "qp-deblur" command):

[screenshot: available recommended workflows]

When processing the initial "per_sample_fastq" artifact, I can compose a workflow like this (the screenshot automatically closed the combo box, but deblur is among the selectable items):

[screenshot: workflow composition]

The workflow executes nicely and the results are available:

[screenshot: workflow results]

Should I then decide to delete the two deblur artifacts and re-create them from the existing "demultiplex" artifact, the combo box does not list deblur (again, the screenshot closed the box before the shot was taken :-/):

[screenshot: combo box without deblur]

I trace this back to the above-mentioned filter:

qiita/qiita_db/artifact.py

Lines 1659 to 1676 in 58e15a4

if self.analysis is None:
    sql += " AND is_analysis = False"
    # get the workflows that match this artifact so we can filter
    # the available commands based on the commands in the workflows
    # for that artifact - except if the artifact_type == 'BIOM'
    if self.artifact_type != 'BIOM':
        dws = [w for w in qdb.software.DefaultWorkflow.iter()
               if self.data_type in w.data_type]
else:
    sql += " AND is_analysis = True"
qdb.sql_connection.TRN.add(sql, [self.id])
cids = set(qdb.sql_connection.TRN.execute_fetchflatten())
if dws:
    cmds = {n.default_parameter.command.id
            for w in dws for n in w.graph.nodes}
    cids = cmds & cids
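The effect of that filter can be reproduced in isolation. With hypothetical id values standing in for the database rows, any active command that is not named in a matching default workflow silently disappears from the candidate list:

```python
# Hypothetical ids standing in for the database values.
active_cids = {10, 11, 12}   # commands registered by active plugins
workflow_cmds = {11, 20}     # commands referenced by matching default workflows

# Mirrors `cids = cmds & cids` above: commands 10 and 12 (e.g. a deblur
# command absent from every default workflow) are dropped without notice.
cids = workflow_cmds & active_cids
print(sorted(cids))  # [11]
```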

In effect, that means (a) there is different behavior depending on whether the user creates a complete processing workflow or continues a partially processed one, which IMHO is quite confusing, and (b) the "recommended" workflows partially limit the options for combining processing steps and are thus more an "enforcement" of options than a "recommendation".

Can you elaborate on why it is useful to limit user options to those defined in the recommended workflows?

@antgonza
Member

antgonza commented Mar 9, 2025

The original intent was to reduce or minimize mistakes by naive users; for example, not using a 16S command (like split libraries) on a WGS preparation. However, at this point I'm thinking it might be better if the plugins defined what kind of data they can work on, vs. the workflow defining it. What do you think?

@antgonza antgonza reopened this Mar 9, 2025
@sjanssen2
Contributor Author

Hm, this is a kind of semantic type-checking mechanism, right? I just wonder whether the maintainer of a plugin is really the right person to have the overview of how his/her plugin should reasonably be used within the Qiita universe; e.g. split-libraries was developed long before WGS processing was available in Qiita. Therefore, I tend to vote against offloading this burden onto individual plugins.

On the other hand, if you restrict options too harshly, we might not need to offer individual processing steps at all, but only a very limited set of recommended workflows.

What would currently happen if the user applies "split-libraries" to a WGS artifact? Will it fail and thus raise user awareness, OR produce results which the user might wrongly interpret as biological truth? We should definitely protect users from the latter, but I'd say the first kind of misuse is OK if you don't mind wasting some compute.

@antgonza
Member

Thank you for the feedback. I think the main issue with allowing a command to run against the incorrect data type (MTX, WGS, 16S, 18S, etc.) is that there is no assurance it will fail - meaning it can produce a result that does not make sense. Now, I agree that the plugin developer might not be the best person to do this. So, what about adding to the plugin page the possibility for an admin to "directly" allow commands to run on a specific data type, thus skipping the need to be part of a workflow?
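One way to model the proposed per-command allowance is a small admin-maintained whitelist mapping command ids to permitted data types. All names, ids, and data types below are hypothetical, purely to illustrate the idea:

```python
# Hypothetical admin-maintained whitelist: command id -> allowed data types.
ADMIN_ALLOWED = {
    42: {"16S", "18S"},    # e.g. a split-libraries-style command
    57: {"Metagenomic"},   # e.g. a WGS-only command
}

def command_allowed(command_id, data_type):
    """True when an admin explicitly whitelisted this command for the
    given data type, independently of default-workflow membership."""
    return data_type in ADMIN_ALLOWED.get(command_id, set())
```

Such a check could run alongside (or instead of) the workflow-based intersection, so a command like deblur would stay selectable even when no default workflow references it.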

@sjanssen2
Contributor Author

> Thus, what about adding to the plugin page the possibility of an admin "direct" allowing commands to run on a specific data type

Do you mean on the software page, like here? Adding another interactive column for admins to add lists of input artifacts per command? Maybe it is less effort to maintain a blacklist instead of a whitelist?

[screenshot: software page]

> thus skipping the need to be part of a workflow?

This I like :-)

@antgonza
Member

Yes, there; and I think a whitelist is better because, in my experience, most active commands are actually used as part of the default workflows.

@sjanssen2
Contributor Author

Aha, you want to keep the workflow-dependent filtering AND let admins whitelist additional commands. I previously thought you wanted to remove the workflow filtering and operate only with whitelists.
In that case, I agree: the number of extra steps should be much smaller with whitelists than with blacklists.
