[YouTube] Fixes for n param deobfuscation function #1253

Merged (11 commits) on Jan 26, 2025

Conversation

Contributor

@gechoto gechoto commented Dec 29, 2024

Fixes #1252

Changes:

  • new regex for extracting the name of the n param deobfuscation function
  • added a fixup to the n param deobfuscation function's code to prevent an early return
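
For illustration only, an early-return fixup could be sketched as below. The PR description does not show the shape of the early return, so the guard form `if(typeof X==="undefined")return a;` and the names `NParamFunctionFixup` and `removeEarlyReturn` are assumptions, not the code merged in this PR.

```java
import java.util.regex.Pattern;

final class NParamFunctionFixup {
    // ASSUMPTION: the early return is a guard such as
    //   ;if(typeof XyZ==="undefined")return a;
    // The PR text only states that an early return is prevented.
    private static final Pattern EARLY_RETURN = Pattern.compile(
            ";\\s*if\\s*\\(\\s*typeof\\s+[a-zA-Z0-9_$]+\\s*===?\\s*\"undefined\"\\s*\\)"
                    + "\\s*return\\s+[a-zA-Z0-9_$]+\\s*;");

    private NParamFunctionFixup() {
    }

    // Removes the first guard of the assumed shape and leaves a single ';'
    // so the statements before and after it stay separated.
    static String removeEarlyReturn(final String functionCode) {
        return EARLY_RETURN.matcher(functionCode).replaceFirst(";");
    }
}
```

If the guard is absent, the input is returned unchanged, so such a fixup would be harmless on player versions that do not contain it.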

Member

@Stypox Stypox left a comment

Thank you!

Comment on lines 52 to 57
// Pattern.compile(SINGLE_CHAR_VARIABLE_REGEX + "=\"nn\"\\[\\+" + MULTIPLE_CHARS_REGEX
// + "\\." + MULTIPLE_CHARS_REGEX + "]," + MULTIPLE_CHARS_REGEX + "\\("
// + MULTIPLE_CHARS_REGEX + "\\)," + MULTIPLE_CHARS_REGEX + "="
// + MULTIPLE_CHARS_REGEX + "\\." + MULTIPLE_CHARS_REGEX + "\\["
// + MULTIPLE_CHARS_REGEX + "]\\|\\|null\\).+\\|\\|(" + MULTIPLE_CHARS_REGEX
// + ")\\(\"\"\\)"),
Member

Why did you remove the previous regex? It's better if you keep it, so in case YouTube reverts something it's already there and it works.

Contributor Author

@gechoto gechoto Dec 31, 2024

1.

This pattern is prone to finding the wrong function.
Imagine this:

  • NewPipeExtractor searches for the n deobfuscation function
  • the first regex doesn't work so it uses the second one (the one I commented out)
  • the (commented out) regex finds the wrong function
  • NewPipeExtractor runs the wrong function and gets the wrong n parameter
  • User gets 403
  • No exception is thrown in NewPipeExtractor so it is harder to find the correct place which needs fixing
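
The failure mode in the list above can be sketched with a first-match-wins lookup. This is a simplified illustration, not NewPipeExtractor's actual code; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OrderedPatternLookup {
    private OrderedPatternLookup() {
    }

    // Tries each pattern in order and returns capture group 1 of the first
    // one that matches. Nothing checks that the captured name really is the
    // n param function, so a loose fallback pattern can return a wrong name
    // without any exception being thrown.
    static Optional<String> firstMatch(final List<Pattern> patterns, final String playerJs) {
        for (final Pattern pattern : patterns) {
            final Matcher matcher = pattern.matcher(playerJs);
            if (matcher.find()) {
                return Optional.of(matcher.group(1));
            }
        }
        return Optional.empty();
    }
}
```

With a strict pattern that no longer matches followed by a loose one such as `(\w+)\(""\)`, the lookup returns whatever name the loose pattern captures first, and the caller has no way to tell that it is the wrong function.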

2.

The regex which comes after the one commented out already finds a very similar pattern, only with different groups and more specific.
In this example:

a.D&&(b="nn"[+a.D],WL(a),c=a.j[b]||null)&&(c=SDa[0](c),a.set(b,c),SDa.length||Wma("")

Looking at different versions of the player code, it seems more often correct to use the function stored in SDa[0]; using Wma in this case would find the wrong function in newer versions.

Going for SDa[0] (and searching for what is in the array afterwards) seems more robust across multiple versions. The next regex already catches that.
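
A rough sketch of the "go for SDa[0], then look up what is in the array" approach follows. The call-site pattern is modelled on the example line above; the declaration form `var SDa=[...]` and all Java names are assumptions made for illustration and are not taken from the PR.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArrayReferenceResolver {
    // Matches a call through an array element such as "(c=SDa[0](c)" and
    // captures the array name (group 1) and the index (group 2).
    private static final Pattern CALL_THROUGH_ARRAY =
            Pattern.compile("\\(\\w+=([a-zA-Z0-9_$]+)\\[(\\d+)\\]\\(\\w+\\)");

    private ArrayReferenceResolver() {
    }

    static Optional<String> resolve(final String playerJs) {
        final Matcher call = CALL_THROUGH_ARRAY.matcher(playerJs);
        if (!call.find()) {
            return Optional.empty();
        }
        final String arrayName = call.group(1);
        final int index = Integer.parseInt(call.group(2));

        // ASSUMPTION: the array is declared as `var <name>=[f1,f2,...]`.
        final Matcher declaration = Pattern
                .compile("var\\s+" + Pattern.quote(arrayName) + "\\s*=\\s*\\[(.+?)\\]")
                .matcher(playerJs);
        if (!declaration.find()) {
            return Optional.empty();
        }
        final String[] elements = declaration.group(1).split(",");
        return index < elements.length
                ? Optional.of(elements[index].trim())
                : Optional.empty();
    }
}
```

Returning an empty `Optional` when either step fails lets the caller fall through to another pattern instead of running a guessed function.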

From what I have seen I think if the regex which I commented out is added back in it should be swapped with the next one to hopefully prevent false positives more often.

(Also it might be a good idea to add some additional checks to validate it found the correct function but I consider this out of scope for this PR.)

But I have to say I'm very new to this YT/NewPipeExtractor stuff and have only looked at some, not all, player versions, so I might lack some experience with YT's past, which means I might be wrong. I can only talk about what I have seen so far, so if you know better, feel free to correct me.

My suggestion for now:
I will add it back in but swap it with the next one.
What do you think?

Member

> (Also it might be a good idea to add some additional checks to validate it found the correct function but I consider this out of scope for this PR.)

That's a pretty good idea. However, a failing check would prevent stream extraction even when other clients that do not require deobfuscation are in use, as is currently the case (we do not use HTML5 clients for now except for age-restricted videos, but this is a broken workaround that will be removed soon). This already happened in the past. A general error or logged warning would fit best.
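
The warn-instead-of-fail idea could look roughly like the sketch below. The plausibility heuristic, the class name, and the use of `java.util.logging` are assumptions chosen for illustration; the thread does not specify any of them.

```java
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Pattern;

final class NParamFunctionCheck {
    private static final Logger LOGGER =
            Logger.getLogger(NParamFunctionCheck.class.getName());

    // Hypothetical heuristic: the code starts like a one-argument function.
    private static final Pattern SINGLE_ARG_FUNCTION =
            Pattern.compile("^\\s*function\\s*\\(\\s*[a-zA-Z0-9_$]+\\s*\\)\\s*\\{");

    private NParamFunctionCheck() {
    }

    // Returns whether the extracted code passes the heuristic. On failure it
    // only logs a warning, so extraction paths that do not need the
    // deobfuscation function are not blocked by an exception.
    static boolean validateOrWarn(final String functionCode) {
        final boolean plausible = functionCode != null
                && SINGLE_ARG_FUNCTION.matcher(functionCode).find()
                && functionCode.contains("return");
        if (!plausible) {
            LOGGER.log(Level.WARNING,
                    "Extracted n param deobfuscation function looks wrong; continuing anyway");
        }
        return plausible;
    }
}
```

Callers that do need the function can still act on the returned boolean, while callers that do not can ignore it.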

> My suggestion for now:
> I will add it back in but swap it with the next one.
> What do you think?

Keep the current third version as it is, move the current first one (which potentially finds the wrong function) to third place, and put the regexes currently between that position and the last one after the new third regex. Let me know if I am not clear.

Contributor Author

What I have done now compared to the current code on the dev branch:

  • swap the first two regexes
  • insert the new regex in third place

Is this correct?

Contributor Author

gechoto commented Jan 5, 2025

@Stypox @AudricV I added the second regex back, swapped it with the third one and added a comment with the reasoning.
Anything else left? Would like to get this done and hopefully see it in a new release soon.

Member

AudricV commented Jan 6, 2025

Is it really necessary to add a new regex that partially matches what the current second one does? I think it could increase the chance of extracting the wrong function, as it contains nothing specific to the n param itself, such as the "nn" string.

Contributor Author

gechoto commented Jan 6, 2025

> Is it really necessary to add a new regex that partially matches what the current second one does? I think it could increase the chance of extracting the wrong function, as it contains nothing specific to the n param itself, such as the "nn" string.

The old regexes don't find any matches, and the current player JS no longer contains the "nn" string, which is why I had to add a new regex.

The chance of extracting the wrong function in the future will probably never be zero.
If you have an idea how to improve the regex let me know.

@gechoto gechoto requested a review from AudricV January 6, 2025 20:29
Contributor Author

gechoto commented Jan 19, 2025

Is this good to go now?


ildar commented Jan 25, 2025

I see the merging is blocked. Is it possible to advance it? AFAIK some downstreams depend on it.

Also, @gechoto , could you please rebase it on top of the latest release?

thanks guys, your work is hard to overestimate for many many ppl!

Contributor Author

gechoto commented Jan 26, 2025

> I see the merging is blocked.

I merged the latest changes from dev but guess I messed it up a bit. Now we have 20 commits in this PR.
But should be fine if NewPipe team squashes the commits on merge.

> AFAIK some downstreams depend on it.

Yes this is an important fix for some apps and I don't want to maintain a fork for this indefinitely. Hope this gets merged asap but we have to wait until someone from the team has time.

@litetex was active lately. Maybe you can review this one?


ildar commented Jan 26, 2025 via email

Member

Stypox commented Jan 26, 2025

Mmmh, I don't see the point of not putting the most recent regex in first place. YouTube is unlikely to roll back to the previous player version.
I pushed two commits, one to match more context around the function we are looking for in the new regex (see yt-dlp's regex), and another to remove "first regex", "second regex", ... in the comments (which are useless imho).
I also rebased on dev.

Contributor Author

gechoto commented Jan 26, 2025

> Mmmh, I don't see the point of not putting the most recent regex in first place.

The new regex is in third place because it is less specific as it doesn't contain the "nn" string anymore.
At least that is how I understood the comment from @AudricV above.

Member

Stypox commented Jan 26, 2025

Yes, the point is that every regex is made specifically for one player version from YouTube, and the current version matches only the third regex (and none of the others), so it doesn't make sense not to put that regex in first position. It's not like YouTube is going to roll back to the previous player (I mean, it happened once I think, and that's why we have the other fallback regexes in the list).

Member

@Stypox Stypox left a comment

Never mind, we can always change it later.

@Stypox Stypox merged commit 1ca8275 into TeamNewPipe:dev Jan 26, 2025
3 checks passed
Contributor Author

gechoto commented Jan 26, 2025

@Stypox Okay, yeah, but this conflicts with how I understood @AudricV above.
I'm new to YT stuff so I don't really have a strong opinion on it.
If you think it is fine feel free to move the new regex in first position.

@gechoto gechoto deleted the fix-n-func branch January 26, 2025 14:20
Labels
bug (Issue is related to a bug), youtube (service, https://www.youtube.com/)
Development

Successfully merging this pull request may close these issues.

Could not find throttling deobfuscation function
5 participants