uniqueN() fails for zero-length vectors in by #4594

Closed
mcol opened this issue Jul 9, 2020 · 2 comments · Fixed by #4595

@mcol
Contributor

mcol commented Jul 9, 2020

If a zero-length vector is passed to the by argument of uniqueN(), the function throws an internal error. I would expect it to return 0.

library(data.table)
DT <- data.table(idx=1:4, value="val")
uniqueN(DT, character(0))
# Error in forderv(x, by = by, retGrp = TRUE, na.last = if (!na.rm) FALSE else NA) :
# Internal error: DT has 2 columns but 'by' is either not integer or is length 0

This was unexpected, because the following works:

uniqueN(DT[, .SD, .SDcols=character(0)])
# [1] 0

Output of sessionInfo()

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.12.8

loaded via a namespace (and not attached):
[1] compiler_3.6.3
@MichaelChirico
Member

Note that your example is not really the relevant case for consistency -- .SDcols=character() is not the same as by=character().

I'm not sure uniqueN is well-defined in this case. Should it be nrow(DT)? That would match uniqueN(DT, by = NULL) at least.

Certainly you shouldn't be getting an Internal Error, so there is something to fix.

@mcol
Contributor Author

mcol commented Jul 10, 2020

I think you are right in your assessment: using by=character() should match by=NULL. Thanks for providing such a quick fix!
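
For anyone comparing the two calls side by side, here is a minimal sketch that runs on either side of the fix. It reuses the DT from the original report; the names n_null and n_empty are only illustrative, and the tryCatch() wrapper is an assumption added so the script does not stop on versions that still raise the internal error. Whether by=character() actually agrees with by=NULL depends on the installed data.table version including the fix.

```r
library(data.table)

# Same toy table as in the original report.
DT <- data.table(idx = 1:4, value = "val")

# by = NULL counts distinct rows over all columns.
n_null <- uniqueN(DT, by = NULL)

# by = character(): reported to raise an internal error on 1.12.8.
# With the fix it is expected to agree with by = NULL. tryCatch() maps
# the error to NA so the comparison runs on either version.
n_empty <- tryCatch(
  uniqueN(DT, by = character()),
  error = function(e) NA_integer_
)

print(c(by_null = n_null, by_empty = n_empty))
```

For this particular table, by=NULL also equals nrow(DT), because idx differs in every row. On a table with duplicated rows the two would not coincide.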

@mattdowle mattdowle added this to the 1.14.1 milestone Jun 17, 2021
@mattdowle mattdowle added the bug label Jun 17, 2021
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023