Inflation in peptide IDs after group specific FDR #522
Comments
Please first use one of our prebuilt workflows. It looks like you made some changes: I see you have --prot 1 (i.e. no protein FDR) and you changed to the 2D filter. These two settings are incompatible. We have some tutorials on our FragPipe page; please follow them without making any changes.
Dear Alexey, Thank you for your response. I am using the default nonspecific HLA workflow and followed the tutorial to set up the group FDR. Does this mean that group FDR is not compatible with the nonspecific HLA workflow?
It is compatible, but as I said, something is not right. I see you made at least some changes to the workflows. Please provide more details on how you annotated sequences in the database with PE numbers, what workflow you loaded, and what other changes you made.
I loaded the default nonspecific HLA workflow and removed the nQ, NE and cysteinylation modifications for both searches. For the large database search, I increased the split database option to 100 and enabled the group FDR option as described in the tutorial. The database was annotated with PE=1 for canonical sequences and PE=2 for non-canonical sequences using a Python script. We ran the large database search with FragPipe in headless mode on Linux. I don't recall making any other changes to the workflow. I am attaching the workflows for your review. Thank you.
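For readers following along, the PE annotation step described above could look roughly like the sketch below. This is not the script used in this thread. It assumes UniProt-style FASTA headers, and the `looks_canonical` rule (treating `>sp|` and `>tr|` entries as canonical) is a hypothetical placeholder that would need to match how your own database is built.

```python
import re
from typing import Callable, Iterable, Iterator

PE_FIELD = re.compile(r"\bPE=\d+")   # existing PE annotation, if any
SV_FIELD = re.compile(r"\s+SV=\d+")  # UniProt sequence-version field


def set_pe(header: str, pe: int) -> str:
    """Return the header with its PE field set to `pe`."""
    tag = f"PE={pe}"
    if PE_FIELD.search(header):
        # Overwrite the existing PE value.
        return PE_FIELD.sub(tag, header, count=1)
    sv = SV_FIELD.search(header)
    if sv:
        # Keep the UniProt field order: PE goes before SV.
        return f"{header[:sv.start()]} {tag}{header[sv.start():]}"
    return f"{header} {tag}"


def looks_canonical(header: str) -> bool:
    """Hypothetical rule: UniProt entries count as canonical."""
    return header.startswith((">sp|", ">tr|"))


def annotate(
    lines: Iterable[str],
    is_canonical: Callable[[str], bool] = looks_canonical,
) -> Iterator[str]:
    """Yield FASTA lines with PE=1 (canonical) or PE=2 (non-canonical)."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield set_pe(line, 1 if is_canonical(line) else 2)
        else:
            yield line
```

Running `annotate` over the lines of the input FASTA and writing the result to a new file gives an annotated database; checking a few headers from each group by hand before searching is a cheap sanity check.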
Dear Alexey and the FragPipe team, Just following up to see if you have any suggestions regarding the workflow and the reason behind the increase in the number of peptide identifications after applying group-specific FDR. Thank you.
Sorry, this is hard for me to debug via GitHub. Perhaps you can send more details by email.
Thank you, Alexey. Sure, I will send you an email with more details.
Hello FragPipe team,
I am running a nonspecific HLA search against a large database, with the group-specific FDR option enabled, to identify peptides derived from non-canonical genomic regions. For the same sample, the large database search identifies more peptides than a search against the standard canonical human proteome database (33,709 vs. 21,924), and all of the identified peptides come from canonical proteins only.
I would expect that if the group FDR is functioning correctly, both searches should yield a similar number of peptides from canonical protein sequences. Could you confirm if this assumption is correct? If so, what might be causing this discrepancy?
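To make the expectation above concrete, here is a toy sketch of group-wise target-decoy filtering. It is not how FragPipe or Philosopher implement group FDR. The `PSM` tuple, the group labels, and the simple decoys/targets estimate are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class PSM(NamedTuple):
    score: float    # higher is better (assumed)
    is_decoy: bool
    group: str      # e.g. "canonical" or "noncanonical" (hypothetical labels)


def filter_at_fdr(psms: Iterable[PSM], alpha: float = 0.01) -> List[PSM]:
    """Keep targets in the longest score-ranked prefix with decoys/targets <= alpha."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = cutoff = 0
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= alpha:
            cutoff = i  # deepest rank at which the estimate still passes
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def filter_by_group(psms: Iterable[PSM], alpha: float = 0.01) -> Dict[str, List[PSM]]:
    """Apply the same FDR filter separately within each group."""
    groups: Dict[str, List[PSM]] = defaultdict(list)
    for psm in psms:
        groups[psm.group].append(psm)
    return {name: filter_at_fdr(members, alpha) for name, members in groups.items()}
```

In this toy model each group's threshold depends only on that group's own targets and decoys. In a real search, a larger database also changes which candidate wins each spectrum, so the canonical counts from the two searches need not match exactly.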
log_2025-01-19_15-06-25_large_database.txt
log_2025-02-09_18-23-01_Standard_database.txt