Wow, good catch. Yeah, the examples and unit tests for the MSE loss were all with relatively large numbers so I didn't notice that. That's going to be fixed via #749. Many thanks.
Bug description
The predictions matrix all_pred, initialised by np.zeros(..., dtype=np.int) in line 73 of bias_variance_decomp(), is truncating predictions (casting to integer):
Example of numpy behaviour causing the issue:
This causes wildly inaccurate results if the target variable is small, as predictions are truncated as integers. Regardless, casting predictions to integers doesn't strike me as a desired feature of the bias_variance_decomp() function.
See this gist for a full reproducible example of this, but below are the differences in results in a regression case with a small target variable:
Unchanged function results:
Results after removing dtype=np.int from np.zeros() in all_pred initialisation:
Steps/Code to Reproduce
See this gist.
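For readers without access to the gist, here is a minimal sketch of the numpy behaviour described above. It is not the code from the gist or from mlxtend: the array values and the names y_pred, int_buffer and float_buffer are made up for illustration. It uses dtype=int, which is what np.int aliased in NumPy 1.19 (the alias has since been removed from NumPy).

```python
import numpy as np

# Hypothetical small-valued predictions, standing in for a regressor's output.
y_pred = np.array([0.12, 0.57, 0.93, 1.80])

# Integer-typed buffer, analogous to np.zeros(..., dtype=np.int).
# dtype=int is equivalent to the old np.int alias.
int_buffer = np.zeros((2, y_pred.shape[0]), dtype=int)
int_buffer[0] = y_pred  # floats are silently cast to integers on assignment

# Default float64 buffer, i.e. the same call without the dtype argument.
float_buffer = np.zeros((2, y_pred.shape[0]))
float_buffer[0] = y_pred  # values are stored unchanged

print(int_buffer[0])    # [0 0 0 1]
print(float_buffer[0])  # [0.12 0.57 0.93 1.8 ]
```

With targets this small, almost every stored prediction collapses to 0, so any loss computed from the integer buffer no longer reflects the model's actual output.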
Versions
MLxtend 0.17.3
macOS-10.15.6-x86_64-i386-64bit
Python 3.8.3 (v3.8.3:6f8c8320e9, May 13 2020, 16:29:34)
[Clang 6.0 (clang-600.0.57)]
Scikit-learn 0.23.2
NumPy 1.19.2
SciPy 1.5.2