Regular expression typo false positive #642

Open
pdostal opened this issue Dec 28, 2022 · 5 comments
Labels
A-exclude Area: automatic and user-controlled exclusions C-enhancement Category: Raise on the bar on expectations S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing

Comments

@pdostal

pdostal commented Dec 28, 2022

Hello,

I have this case:

error: `ot` should be `to`, `of`, `or`
  --> ./lib/qam.pm:71:44
   |
71 |     if ($patch_status =~ /Status\s*:\s+[nN]ot\s[nN]eeded/) {
   |                                            ^^

Would it be possible to filter those out?

@epage epage added the C-enhancement Category: Raise on the bar on expectations label Dec 29, 2022
@epage
Collaborator

epage commented Dec 29, 2022

Similar to #643, our two main routes are:

  • Heuristics.
    • We would need to distinguish regexes from paths when doing so
    • Not all regex syntaxes use a leading and trailing / as delimiters
  • Mark sections to be ignored, as in "Exclude specific line" (#316)

@epage
Collaborator

epage commented Mar 22, 2023

FYI #695 provides a new workaround for false positives

@kdeldycke

To illustrate this issue with more test cases, here is another false positive of a regular expression encountered in a markdown document (as produced by typos-cli 1.20.8):

error: `ba` should be `by`, `be`
  --> ./content/2011/postgresql-commands.md:47:95
   |
47 |   $ psql --tuples-only --no-align -d database_id -c "SELECT id FROM res_users;" | sed ':a;N;$!ba;s/\n/ /g'
   |                                                                                               ^^
   |

In my case, the fix consisted of adding the following configuration to my pyproject.toml:

[tool.typos]
default.extend-ignore-identifiers-re = [
    "ba",
]

@epage
Collaborator

epage commented Apr 16, 2024

Personally, I would recommend using default.extend-ignore-re to look for the pattern of your style of regexes and avoid checking them completely, rather than playing whack-a-mole with specific identifiers within a regex.

When handling identifiers, I would instead recommend using

[default.extend-identifiers]
ba = "ba"
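As a rough sketch of the first recommendation above, a pattern along these lines could skip Perl-style `=~ /.../` matches such as the one in the original report. The regex itself is an assumption made up for illustration, not something taken from the typos documentation. It only covers regexes introduced by `=~` and delimited by `/` on a single line, so it would need adapting to the regex style of a given code base.

```toml
# typos.toml (in pyproject.toml, the equivalent table is [tool.typos.default])
[default]
extend-ignore-re = [
    # Hypothetical pattern: skip everything from `=~ /` up to the next
    # unescaped `/` on the same line, so the regex body is never spell-checked.
    # A TOML literal string (single quotes) keeps the backslashes as-is.
    '=~\s*/(?:\\.|[^/\\\n])*/',
]
```

The trade-off is that a real typo inside a matched regex is no longer reported, which is usually acceptable since regex bodies are rarely prose.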

@kdeldycke

> Personally, I would recommend using default.extend-ignore-re

Ah yes, thanks @epage for the clarification. It is indeed better. It just took me a while to understand the different stages of typos' tokenization process and how its parameters influence them.

In the end I fine-tuned typos with the following config:

[tool.typos]
default.extend-ignore-re = [
    "!ba;",
]
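If more sed one-liners show up, a broader variant might skip the whole single-quoted script instead of one specific token. This is only a sketch: the pattern is hypothetical and assumes that scripts are always passed to sed as a single-quoted argument on one line.

```toml
[tool.typos]
default.extend-ignore-re = [
    # Hypothetical pattern: ignore the single-quoted script that follows `sed`.
    # In a TOML basic string, `\\s` and `\\n` reach the regex engine as `\s` and `\n`.
    "sed\\s+'[^'\\n]*'",
]
```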

@epage epage added S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing A-exclude Area: automatic and user-controlled exclusions labels Feb 19, 2025