Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

5.8 on S3 storage: thumbnails anger the S3 Library #8219

Closed
donsizemore opened this issue Nov 5, 2021 · 10 comments · Fixed by #8420
Closed

5.8 on S3 storage: thumbnails anger the S3 Library #8219

donsizemore opened this issue Nov 5, 2021 · 10 comments · Fixed by #8420
Milestone

Comments

@donsizemore
Copy link
Contributor

donsizemore commented Nov 5, 2021

What steps does it take to reproduce the issue?

Upload an image file (or possibly a PDF from which a thumbnail will be generated)

  • When does this issue occur?

When a user loads a dataset or datafile page containing a thumbnail read from S3 storage.

  • Which page(s) does it occurs on?

dataset.xhtml and file.xhtml

  • What happens?

server.log WARNs: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

It's just a warning, but a fairly verbose one which will get written on every page load.

  • To whom does it occur (all users, curators, superusers)?

All users.

  • What did you expect to happen?

logfile crickets

Which version of Dataverse are you using?

5.8

Any related open or closed issues to this bug report?

nothing in the 5.8 milestone jumps out at me

Screenshots:

Here's a screenshot of the thumbnail of the PNG I uploaded:

Screen Shot 2021-11-05 at 16 19 26

@qqmyers
Copy link
Member

qqmyers commented Nov 6, 2021

Hmm - may just be noise from me, but testing at QDR I see the error with payara-5.2021.7 but not with payara-5.2021.5, same image file uploaded. It's possible I have some other difference that I don't know about, and I can't think why payara version would make an aws class warning appear/disappear.

@donsizemore
Copy link
Contributor Author

This WARNing doesn't seem to print when one downloads files which were uploaded using a previous version of Dataverse. It does print when one downloads an image-format datafile but the resulting download has the same bytesize as the original. Strange.

@landreev
Copy link
Contributor

landreev commented Nov 8, 2021

This is interesting. "uploaded using a previous version of dataverse", hmm?
[Edit: removing everything I said here earlier; this is likely in an entirely different area than where I was guessing/speculating]

I'm interested in investigating this; but it will have to wit til next week.

@landreev
Copy link
Contributor

landreev commented Nov 9, 2021

OK, the message says "you are closing this S3 connection without reading everything that was in the pipe". I believe we may be doing something along these lines, routinely. For example, when the S3 redirect option is enabled, we do open that remote S3 object - to confirm that it exists - but then we close it, without reading any bytes, and issue the redirect.
Or, similarly, with thumbnails and such: we first open the main object (the full image; to confirm it is a legit, existing S3 file); but then close it, and open the saved auxiliary file (the thumbnail version of that image in the same bucket), and serve that.
Is that what it's complaining about? There may be a cleaner way of handling the above; w/ flushing things better, etc.
But, I'm curious about the fact this has seemingly started happening just now. (?)
Have any client libraries have been updated recently?

All of the above is still guessing and speculating. But will be happy to investigate deeper.

@donsizemore
Copy link
Contributor Author

On the S3 library question: I just compared 5.8 and 5.3 - each warfile provides three AWS jars, versioned 1.11.762.

@donsizemore
Copy link
Contributor Author

I get this warning running the 5.8 release warfile on payara-5.2021.9 as well.

@landreev
Copy link
Contributor

My guess is that this is a warning message that has always been there, but used to be suppressed in the logs in the older Payara distributions.
But I still haven't gotten around to experimenting with this.

@donsizemore
Copy link
Contributor Author

Adding com.amazonaws.services.s3.level=SEVERE to logging.properties seems to quiet things down, as long as that's a safe thing to do.

@qqmyers
Copy link
Member

qqmyers commented Feb 11, 2022

I think I understand/have fixed this. There were a couple places where we call storageIO.getInputStream().close() to make sure we get a handle to the (assumed already opened stream) to close it, e.g. after getting an aux file rather than the main file. However, in the meantime, the S3store got smart enough to not open the stream for the main file in the storageIO.open() method (FileAccessIO still does)- by checking when you call getInputStream and opening it right then for you if it isn't already open. :-)

Fortunately there is already a storageIO.closeInputStream() with the quiet close logic.

PR to follow.

@qqmyers
Copy link
Member

qqmyers commented May 9, 2022

FWIW: It looks like there are still cases where the warning occurs - guessing it is things like mimetype determination where the library doesn't read the whole file. So - we may want to reopen this or add a new issue for the specific case(s) causing the remaining instances of the Not all bytes were read message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants