glob("*") does not support matching non-utf8 filenames #11916
Comments
An alternative approach is to wait until
For this reason, the approach outlined in the issue description is recommended.
Test case from @flaper87:

    #[test]
    #[cfg(not(windows))]
    fn test_non_utf8_glob() {
        let dir = tempfile::TempDir::new("").unwrap();
        let p = dir.path().join(&[0xFFu8]);
        fs::mkdir(&p, S_IRWXU as u32);
        let pat = p.with_filename("*");
        assert_eq!(glob(pat.as_str().expect("tmpdir is not utf-8")).collect::<~[Path]>(), ~[p])
    }

This also needs to be disabled on OS X, although perhaps we should do the opposite and simply enable it for Linux.
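For readers on current Rust, the following is a minimal sketch of the situation the test case sets up: a path whose final component is the single byte 0xFF. It is Unix-only, like the test, and touches no filesystem. The helper name `join_raw` is hypothetical and is not part of `glob` or the standard library; the point is only that such a path has no `&str` form, which is why a string-only matcher cannot see it.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt; // Unix-only: lets us build an OsStr from raw bytes
use std::path::{Path, PathBuf};

/// Hypothetical helper: appends a file name given as raw bytes to `dir`,
/// without requiring the bytes to be valid UTF-8.
fn join_raw(dir: &Path, name: &[u8]) -> PathBuf {
    dir.join(OsStr::from_bytes(name))
}

/// Returns true when the whole path can be viewed as a `&str`.
/// A lone 0xFF byte is never valid UTF-8, so this is false for the
/// path used in the test case above.
fn is_utf8_path(path: &Path) -> bool {
    path.to_str().is_some()
}
```

Calling `is_utf8_path(&join_raw(Path::new("/tmp"), &[0xFF]))` returns `false`, while `to_string_lossy()` on the same path substitutes U+FFFD for the invalid byte.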
@kballard Thanks for putting this together. As discussed on IRC, I'm setting the mentor tag on you 😄
@kballard I'll work on this
cc @nick29581, could this move to rust-lang/globs?
This issue has been moved to the RFCs repo: rust-lang/glob#23
Thanks!
glob::glob() does not have any support right now for matching non-utf8 filenames. Not only are its patterns restricted to strings, but it also explicitly skips any non-utf8 filenames it encounters (which should at least be able to match a * pattern).

Tasks that need to be done:

- glob() needs to accept both strings and byte-vectors. It can do this using std::path::BytesContainer.
- glob() needs to process its pattern as a byte vector instead of a string, which will allow it to process filenames as byte vectors. This includes matching non-utf8 filenames against * and ? tokens (for the latter, matching a single byte is appropriate; ideally, it would match however many bytes are supposed to be consumed to create a U+FFFD REPLACEMENT CHARACTER as per the unicode standard).

This is a sub-task of #9639.
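The byte-vector matching described in the tasks above could be sketched roughly as follows. This is an illustration in current Rust, not the `glob` implementation: `match_bytes` is a hypothetical name, `?` takes the simple option of consuming exactly one byte, and character classes and path separators are ignored.

```rust
/// Hypothetical helper, not part of the `glob` crate: matches `name`
/// against `pattern` byte by byte. `*` matches any run of bytes
/// (including none), `?` matches exactly one byte, and every other
/// byte must match literally. No UTF-8 validity is assumed.
fn match_bytes(pattern: &[u8], name: &[u8]) -> bool {
    let (mut p, mut n) = (0, 0);
    // Index of the most recent `*` in the pattern, paired with the
    // index in `name` where that `*` currently stops matching.
    let mut backtrack: Option<(usize, usize)> = None;

    while n < name.len() {
        if p < pattern.len() && pattern[p] == b'*' {
            // First try letting `*` match nothing; remember where to resume.
            backtrack = Some((p, n));
            p += 1;
        } else if p < pattern.len() && (pattern[p] == b'?' || pattern[p] == name[n]) {
            // `?` or an equal literal byte consumes one byte of the name.
            p += 1;
            n += 1;
        } else if let Some((star_p, star_n)) = backtrack {
            // Mismatch: let the last `*` absorb one more byte and retry.
            p = star_p + 1;
            n = star_n + 1;
            backtrack = Some((star_p, star_n + 1));
        } else {
            return false;
        }
    }

    // The name is exhausted; any pattern bytes left must all be `*`.
    pattern[p..].iter().all(|&b| b == b'*')
}
```

With this sketch, `match_bytes(b"*", &[0xFF])` and `match_bytes(b"?", &[0xFF])` both succeed, so a file named with the lone byte 0xFF is no longer skipped. Matching `?` against however many bytes decode to one U+FFFD, as the issue suggests would be ideal, would need an extra decoding step that is not shown here.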