fread signed integer overflow on gcc-UBSAN and clang-UBSAN checks #6729
Comments
Can be reproduced in current
foo <- data.table::fread("https://github.com/cran/PAMscapes/raw/refs/heads/master/inst/extdata/PSDSmall.csv")
# <...>
# fread.c:1974:30: runtime error: signed integer overflow: 237281 * 237281 cannot be represented in type 'int'
Looks like this doesn't actually break anything, and the destination of the assignment is already of type
Line 1974 in 1dd2976
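For readers unfamiliar with this class of warning, here is a minimal C sketch of the general pattern. It is not the actual fread.c code: the names `square_overflows_int`, `add_square`, `sum_sq` and `line_len` are hypothetical, and using `double` as the accumulator type is an assumption made for illustration. In C, `int * int` is evaluated in `int` even when the result is stored in a wider type, so one operand has to be widened before the multiplication.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if n * n would not fit in a signed int.
   The product is computed in long long, so the check itself cannot overflow. */
static int square_overflows_int(int n) {
    long long wide = (long long)n * (long long)n;
    return wide > (long long)INT_MAX;
}

/* Hypothetical accumulator update: casting one operand to double first
   makes the multiplication happen in double rather than in int. */
static double add_square(double sum_sq, int line_len) {
    return sum_sq + (double)line_len * line_len;
}
```

Calling `square_overflows_int(237281)` with the value from the UBSan message reports an overflow, while `add_square(0.0, 237281)` yields 56302272961 without any signed overflow, assuming the usual 32-bit `int`.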
Awesome, thanks for the quick response and fix. Just for my own understanding - this was happening because of the large number of columns in my example CSV file, so if I greatly reduce the number of columns then this wouldn't happen?
Yes, it's possible whenever the number of characters on any line in the file exceeds about 46,340 (specifically, √(2³¹-1)). However, it does not matter if UBSan is not enabled -- the answer will wind up correct.
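The √(2³¹-1) threshold can be checked with a short C sketch. This assumes a 32-bit `int`, and `max_safe_square_operand` is a hypothetical name used only for illustration. It searches for the largest value whose square still fits in a signed `int`, doing the comparison in `long long` so the search itself never overflows.

```c
#include <limits.h>

/* Largest n for which n * n still fits in a signed int.
   Each candidate square is computed in long long to avoid overflow. */
static int max_safe_square_operand(void) {
    int n = 0;
    while ((long long)(n + 1) * (n + 1) <= (long long)INT_MAX) {
        n++;
    }
    return n;
}
```

With a 32-bit `int` this returns 46340, since 46340² = 2147395600 fits and 46341² = 2147488281 does not. The 237281 in the UBSan message is well past that limit.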
My package on CRAN (PAMscapes) recently got flagged because there is an error triggered in both the gcc-UBSAN and clang-UBSAN additional CRAN checks. Here is a link to the check page, and here is some version info from the check log:
clang version 19.1.6
flang-new version 19.1.6
And here is a copy of the error message from the Ex-out:
fread.c:1919:30: runtime error: signed integer overflow: 237281 * 237281 cannot be represented in type 'int'
#0 0x7fee3c6f9969 in freadMain /tmp/Rtmp8p0OKk/R.INSTALLecbd63f70502d/data.table/src/fread.c:1919:30
#1 0x7fee3c70e547 in freadR /tmp/Rtmp8p0OKk/R.INSTALLecbd63f70502d/data.table/src/freadR.c:218:3
#2 0x55b4d6c2342b in R_doDotCall (/data/gannet/ripley/R/R-clang/bin/exec/R+0xf342b)
#3 0x55b4d6c23d1d in do_dotcall (/data/gannet/ripley/R/R-clang/bin/exec/R+0xf3d1d)
#4 0x55b4d6c5ffe2 in bcEval_loop eval.c
#5 0x55b4d6c5971b in bcEval eval.c
#6 0x55b4d6c58ea4 in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x128ea4)
#7 0x55b4d6c719c1 in R_execClosure eval.c
#8 0x55b4d6c70eab in applyClosure_core eval.c
#9 0x55b4d6c592f5 in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x1292f5)
#10 0x55b4d6c77006 in do_set (/data/gannet/ripley/R/R-clang/bin/exec/R+0x147006)
#11 0x55b4d6c590cf in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x1290cf)
#12 0x55b4d6ca6e20 in Rf_ReplIteration (/data/gannet/ripley/R/R-clang/bin/exec/R+0x176e20)
#13 0x55b4d6ca87be in run_Rmainloop (/data/gannet/ripley/R/R-clang/bin/exec/R+0x1787be)
#14 0x55b4d6ca882a in Rf_mainloop (/data/gannet/ripley/R/R-clang/bin/exec/R+0x17882a)
#15 0x55b4d6b97a27 in main (/data/gannet/ripley/R/R-clang/bin/exec/R+0x67a27)
#16 0x7fee4fa2950f in __libc_start_call_main (/lib64/libc.so.6+0x2950f) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c)
#17 0x7fee4fa295c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x295c8) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c)
#18 0x55b4d6b97944 in _start (/data/gannet/ripley/R/R-clang/bin/exec/R+0x67944)
The only thing I am doing in my package is reading this CSV file using fread with the option header=TRUE.
# Minimal reproducible example; please be sure to set verbose=TRUE where possible!
I've tried but been unable to reproduce this - I don't have access to a Linux machine, and I don't really understand the UBSAN section in the Writing R Extensions manual. I've tried running package checks using the rhub package's clang-asan container that is meant to mimic these CRAN checks, but the same example ran without error. I'm not exactly sure where to go from here, but figured y'all might have better ideas.
Thanks!