Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-ASCII characters in Content-Disposition header #389

Closed
matthew-white opened this issue Jul 26, 2021 · 3 comments
Closed

Non-ASCII characters in Content-Disposition header #389

matthew-white opened this issue Jul 26, 2021 · 3 comments

Comments

@matthew-white
Copy link
Member

A report on the forum notes that downloading a submission attachment whose filename contains a non-ASCII character results in an error: https://forum.getodk.org/t/error-with-non-ascii-chars-in-attachment-filenames/34462

This has come up before. However, sanitize() was added to our binary() function to address this: c1e41ba

The endpoint to download a submission attachment uses the binary() function. However, while binary() does call sanitize(), it looks like sanitize() doesn't remove non-ASCII characters.

If a change is needed in binary(), I think it'd also be needed elsewhere, as there are places outside binary() where we use sanitize() together with Content-Disposition.

Somewhat relatedly, I'm noticing that we don't call binary() or sanitize() for the endpoint to download a form attachment:

service.get(`${base}/attachments/:name`, endpoint(({ Blobs, FormAttachments, Forms }, { params, auth }, _, response) =>
getInstance(Forms, params)
.then((form) => auth.canOrReject('form.read', form))
.then((form) => FormAttachments.getByFormDefIdAndName(form.def.id, params.name)
.then(getOrNotFound)
.then((attachment) => ((attachment.blobId == null)
? reject(Problem.user.notFound())
: Blobs.getById(attachment.blobId)
.then(getOrNotFound)
.then((blob) => {
response.set('Content-Type', blob.contentType);
response.set('Content-Disposition', `attachment; filename="${attachment.name}"`);
return blob.content;
}))))));

Would it work to specify a filename* parameter to Content-Disposition in addition to filename? Does Express or an existing npm package provide an easy way to specify both parameters?

@matthew-white
Copy link
Member Author

Just saw this Sentry issue, which seems related. In this case, it looks like the .csv.zip can't be downloaded, because the form ID contains a Unicode character.

@matthew-white
Copy link
Member Author

This Sentry issue also seems related: https://sentry.io/organizations/getodk/issues/2572098274/?project=1298632

@issa-tseng
Copy link
Contributor

@matthew-white can you look over my work here? 086a59b
the rfcs are very very very very confusing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants