[Deviantart] [Bug] Attempting to download results from a search query with the order set to "newest" only gets the first few deviations #2096

Closed
Scripter17 opened this issue Dec 12, 2021 · 6 comments

Comments

@Scripter17
Contributor

I checked whether it was a problem with mature deviations, but it seems not. The /browse/newest API endpoint is definitely returning more deviations than what's being downloaded, but the extractor doesn't seem to use it?

This could be related or identical to #1704 and/or #1911

@mikf
Owner

mikf commented Dec 17, 2021

I'd like to see the output with --verbose, if possible.

order=newest does not get recognized with the latest changes (a7ddb5f), but it should just fall back to all-time in that case.

Where did you get an order=newest search URL from in the first place? I can only find most-recent: https://www.deviantart.com/search/deviations?q=tree&order=most-recent

@Scripter17
Contributor Author

For some reason I'm now also getting order=most-recent

And yeah, it seems to be falling back to something else, but my search results for order=all-time are different from the download results. Probably just an API thing

Also, I only just remembered I'm using 1.19.1 because of some changes I made to my local install (exponential backoff and removing the max wait limit; having too many concurrent downloads causes an infinite loop of 429s). I don't think that matters, since the only change to the deviantart extractor was something with stashes in 1.19.2
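For readers unfamiliar with the local change mentioned above, here is a minimal sketch of exponential backoff on HTTP 429 with no upper limit on the wait. It is not gallery-dl's code or the actual patch; `backoff_delays`, `request_with_backoff`, and the `send`/`sleep` callables are hypothetical names used only for illustration.

```python
import random


def backoff_delays(base=1.0, factor=2.0, retries=5, cap=None, jitter=False):
    """Yield the wait time (in seconds) before each retry.

    cap=None means there is no upper limit on the wait, which mirrors
    the "no max wait limit" idea described in the comment above.
    """
    for attempt in range(retries):
        delay = base * factor ** attempt
        if cap is not None:
            delay = min(delay, cap)
        if jitter:
            # Randomize so concurrent downloads do not retry in lockstep.
            delay = random.uniform(0, delay)
        yield delay


def request_with_backoff(send, sleep, **kwargs):
    """Call send() until it returns a status other than 429.

    send  -- hypothetical callable that returns an HTTP status code
    sleep -- callable taking seconds (time.sleep in real use)
    """
    for delay in backoff_delays(**kwargs):
        status = send()
        if status != 429:
            return status
        sleep(delay)
    # Retries exhausted: make one final attempt and return whatever it gives.
    return send()
```

In real use `sleep` would be `time.sleep` and `send` would wrap the actual API request. Enabling `jitter` is one way to keep many concurrent downloads from hitting the rate limit again at the same moment.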

The (very redacted) output of gallery-dl --verbose "https://www.deviantart.com/search/deviations?q=fish&order=most-recent":

[gallery-dl][debug] Version 1.19.1
[gallery-dl][debug] Python 3.10.0 - Windows-10-10.0.19043-SP0
[gallery-dl][debug] requests 2.21.0 - urllib3 1.24.3
[gallery-dl][debug] Starting DownloadJob for 'https://www.deviantart.com/search/deviations?q=fish&order=most-recent'
[deviantart][debug] Using DeviantartPopularExtractor for 'https://www.deviantart.com/search/deviations?q=fish&order=most-recent'
[deviantart][debug] Using custom API credentials (client-id [REDACTED])
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.deviantart.com:443
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/browse/popular?q=fish&limit=50&timerange=now&offset=0&mature_content=true HTTP/1.1" 200 None
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/metadata?deviationids[0]=[REDACTED]&deviationids[1]=[REDACTED]&deviationids[2]=[REDACTED]&deviationids[3]=[REDACTED]&deviationids[4]=[REDACTED]&deviationids[5]=[REDACTED]&deviationids[6]=[REDACTED]&deviationids[7]=[REDACTED]&deviationids[8]=[REDACTED]&deviationids[9]=[REDACTED]&deviationids[10]=[REDACTED]&deviationids[11]=[REDACTED]&deviationids[12]=[REDACTED]&deviationids[13]=[REDACTED]&deviationids[14]=[REDACTED]&deviationids[15]=[REDACTED]&deviationids[16]=[REDACTED]&deviationids[17]=[REDACTED]&deviationids[18]=[REDACTED]&deviationids[19]=[REDACTED]&deviationids[20]=[REDACTED]&deviationids[21]=[REDACTED]&deviationids[22]=[REDACTED]&deviationids[23]=[REDACTED]&deviationids[24]=[REDACTED]&deviationids[25]=[REDACTED]&deviationids[26]=[REDACTED]&deviationids[27]=[REDACTED]&deviationids[28]=[REDACTED]&deviationids[29]=[REDACTED]&deviationids[30]=[REDACTED]&deviationids[31]=[REDACTED]&deviationids[32]=[REDACTED]&deviationids[33]=[REDACTED]&deviationids[34]=[REDACTED]&deviationids[35]=[REDACTED]&deviationids[36]=[REDACTED]&deviationids[37]=[REDACTED]&deviationids[38]=[REDACTED]&deviationids[39]=[REDACTED]&deviationids[40]=[REDACTED]&deviationids[41]=[REDACTED]&deviationids[42]=[REDACTED]&deviationids[43]=[REDACTED]&deviationids[44]=[REDACTED]&deviationids[45]=[REDACTED]&deviationids[46]=[REDACTED]&deviationids[47]=[REDACTED]&deviationids[48]=[REDACTED]&deviationids[49]=[REDACTED]&mature_content=true HTTP/1.1" 200 None
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/comments/deviation/[REDACTED]?maxdepth=5&offset=0&limit=50&mature_content=true HTTP/1.1" 200 1003
[deviantart][debug] Active postprocessor modules: [MtimePP]
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): images-wixmp-[REDACTED].wixmp.com:443
[urllib3.connectionpool][debug] https://images-wixmp-[REDACTED].wixmp.com:443 "GET /f/[REDACTED]/dew7df7-[REDACTED].jpg?token=[REDACTED] HTTP/1.1" 200 185273
* .\gallery-dl\deviantart\!Search\Popular\most-recent-fish\[REDACTED]
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/comments/deviation/[REDACTED]?maxdepth=5&offset=0&limit=50&mature_content=true HTTP/1.1" 200 6126
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/download/[REDACTED]?mature_content=true HTTP/1.1" 200 546
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): api-da.wixmp.com:443
[urllib3.connectionpool][debug] https://api-da.wixmp.com:443 "GET /_api/download/file?downloadToken=[REDACTED] HTTP/1.1" 200 None
* .\gallery-dl\deviantart\!Search\Popular\most-recent-fish\[REDACTED]
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/comments/deviation/[REDACTED]?maxdepth=5&offset=0&limit=50&mature_content=true HTTP/1.1" 200 1451
[urllib3.connectionpool][debug] https://images-wixmp-[REDACTED].wixmp.com:443 "GET /f/[REDACTED]/devfftn-[REDACTED].jpg?token=[REDACTED] HTTP/1.1" 200 673390
* .\gallery-dl\deviantart\!Search\Popular\most-recent-fish\[REDACTED]
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/comments/deviation/[REDACTED]?maxdepth=5&offset=0&limit=50&mature_content=true HTTP/1.1" 200 1189
[urllib3.connectionpool][debug] https://images-wixmp-[REDACTED].wixmp.com:443 "GET /f/[REDACTED]/dewfuxm-[REDACTED].png?token=[REDACTED] HTTP/1.1" 200 3148892
* .\gallery-dl\deviantart\!Search\Popular\most-recent-fish\[REDACTED]
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/comments/deviation/[REDACTED]?maxdepth=5&offset=0&limit=50&mature_content=true HTTP/1.1" 200 418
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/download/[REDACTED]?mature_content=true HTTP/1.1" 200 547
[urllib3.connectionpool][debug] https://api-da.wixmp.com:443 "GET /_api/download/file?downloadToken=[REDACTED] HTTP/1.1" 200 None
* .\gallery-dl\deviantart\!Search\Popular\most-recent-fish\[REDACTED]
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/download/[REDACTED]?mature_content=true HTTP/1.1" 200 552
[urllib3.connectionpool][debug] https://api-da.wixmp.com:443 "GET /_api/download/file?downloadToken=[REDACTED] HTTP/1.1" 200 None
* .\gallery-dl\deviantart\!Search\Popular\most-recent-fish\[REDACTED]
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/comments/deviation/[REDACTED]?maxdepth=5&offset=0&limit=50&mature_content=true HTTP/1.1" 200 3506
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/download/[REDACTED]?mature_content=true HTTP/1.1" 200 556
[urllib3.connectionpool][debug] https://api-da.wixmp.com:443 "GET /_api/download/file?downloadToken=[REDACTED] HTTP/1.1" 200 None

@Scripter17
Contributor Author

Scripter17 commented Dec 18, 2021

Wait hang on I'm being a total idiot

The 1.19.3 release notes don't mention the change to deviantart searching, but the commit you linked should be in it. Gonna update and see if that works

Edit: Nope, 1.19.3 is still having the issue

@mikf
Owner

mikf commented Dec 19, 2021

Well, I got 150+ files with that URL before stopping it.
It is using the correct timerange and everything, and having metadata, comments, or any other deviantart-specific options enabled doesn't make a difference.
Do you have it set to stop when it encounters a previously downloaded file? ("skip": "abort" etc) Maybe that's the reason it stops prematurely?

Maybe using the /browse/newest endpoint for order=most-recent instead of /browse/popular with timerange=now makes a difference, but I wouldn't be surprised if they both return the exact same results.

And yes, the last relevant change to dA was before 1.19.1, so that shouldn't matter.

@Scripter17
Contributor Author

Yeah the case it failed on earlier is working now, but it's still the wrong order which causes issues with "skip":"abort"

A quick test in the developer console shows that /browse/newest does in fact return stuff in the right order. When /browse/popular is set to "all time" it doesn't sort by newest, it just means the most popular of all time

Which, given the name and the existence of /browse/newest, is probably pretty obvious in hindsight

@mikf
Owner

mikf commented Dec 21, 2021

With commit 8f0cf0b it now uses /browse/newest for order=newest and order=most-recent. That should hopefully have a more consistent order.
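As a rough illustration of the behavior described here (not the code from the commit), the endpoint choice could be sketched like this. `select_endpoint` and `NEWEST_ORDERS` are hypothetical names, and the mapping from other `order` values to a `timerange` parameter is deliberately left out.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical set of "order" values that should be served by /browse/newest.
NEWEST_ORDERS = {"newest", "most-recent"}


def select_endpoint(search_url):
    """Return (endpoint, params) for a DeviantArt search URL.

    Any order value outside NEWEST_ORDERS, including a missing one,
    falls through to /browse/popular.
    """
    query = parse_qs(urlsplit(search_url).query)
    order = query.get("order", ["all-time"])[0]

    params = {}
    if "q" in query:
        params["q"] = query["q"][0]

    if order in NEWEST_ORDERS:
        return "/browse/newest", params
    # /browse/popular ranks by popularity, not by date; a real
    # implementation would also translate `order` into a timerange here.
    return "/browse/popular", params
```

With the URL from this issue, `select_endpoint("https://www.deviantart.com/search/deviations?q=fish&order=most-recent")` picks `/browse/newest` with `q=fish`.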

but it's still the wrong order which causes issues with "skip":"abort"

Immediately stopping when finding an already downloaded file is generally a bad idea and usually causes more harm than it's worth. I'd recommend at least "skip":"abort:3", but you do you.
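To show why a threshold is more forgiving than stopping at the first hit, here is a small simulation. It is a sketch, not gallery-dl's implementation, and it assumes that "abort:N" means stopping after N consecutive already-downloaded files. `download_until_abort` is a hypothetical helper.

```python
def download_until_abort(items, already_downloaded, abort_after=1):
    """Return the items that would be downloaded before aborting.

    Iteration stops once `abort_after` consecutive items are found in
    `already_downloaded`; any new item resets the counter.
    """
    downloaded = []
    consecutive_skips = 0
    for item in items:
        if item in already_downloaded:
            consecutive_skips += 1
            if consecutive_skips >= abort_after:
                break
            continue
        consecutive_skips = 0
        downloaded.append(item)
    return downloaded


# Results that are not strictly ordered newest-first: an old file
# appears between two new ones.
results = ["new1", "old1", "new2", "old2", "old3", "old4"]
archive = {"old1", "old2", "old3", "old4"}

print(download_until_abort(results, archive, abort_after=1))  # ['new1']
print(download_until_abort(results, archive, abort_after=3))  # ['new1', 'new2']
```

With a threshold of 1, the single out-of-order old file ends the run and `new2` is never fetched; with a threshold of 3, the run continues past it.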

@mikf mikf closed this as completed Dec 21, 2021