Suggestions for Ch5 - Classification 1 #41
Comments
I'll just show both. Using `groupby` and `count` is a good repetition of general-purpose stuff that will help students reinforce earlier concepts.
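A rough sketch of what showing both could look like. The `cancer` data frame and its `ID` and `Class` columns are hypothetical stand-ins here, not the chapter's actual data:

```python
import pandas as pd

# Hypothetical stand-in for the chapter's data; the column names are assumptions.
cancer = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Class": ["B", "M", "B", "B", "M"],
})

# General-purpose route: group the rows by label, then count one column per group.
counts_groupby = cancer.groupby("Class")["ID"].count()

# Shortcut: value_counts returns the same counts, sorted by frequency.
counts_vc = cancer["Class"].value_counts()

# normalize=True returns proportions instead of raw counts.
proportions = cancer["Class"].value_counts(normalize=True)

print(counts_groupby)
print(counts_vc)
print(proportions)
```

The `groupby` version reuses an idiom students have already seen, while `value_counts` does the same job in one call.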
I just use
Agree. I just changed the values in the df. Will open an issue in the R book.
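For illustration, changing the values in the data frame with a dictionary might look like the sketch below. The `cancer` frame and the replacement labels `"Malignant"` and `"Benign"` are assumptions, not the book's actual code:

```python
import pandas as pd

# Hypothetical stand-in data; only the "Class" column matters for this sketch.
cancer = pd.DataFrame({"Class": ["M", "B", "B", "M"]})

# Map each original value to its new label explicitly. The result depends only
# on the value in each row, not on row order or on the order of the categories.
cancer["Class"] = cancer["Class"].replace({"M": "Malignant", "B": "Benign"})

print(cancer)
```

Because every old value is paired with its new label in the dictionary, reordering the data cannot silently swap the two classes.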
We added this to the book because students ask this question essentially every semester. But it does need to be fixed a bit for the Python version.
These are all suggestions. Some might seem harshly written, but that is just to be succinct and write quickly.
- Instead of `groupby.count` with lots of intermediate objects, we should use `.value_counts`, with and without `normalize=True`.
- Setting labels manually could mislabel `B` as `Malign`, and there would be no easy way to find out. I think it is much better to change the values of the individual data points using dictionary-like assignment, which is robust to data reordering and therefore more reproducible.
- `np` is not defined. This applies to the code chunks for both 2 and more predictor variables. `HTML` is also not defined. Also use `nlargest` instead of sorting first.
- `unscaled_cancer` can be assigned in one step.
- `make_column_transformer` is not defined (imports seem to generally not be working as expected here).
- `fit_transform` requires more explanation (as does `transform` itself).
- `pd.DataFrame` (some things in this cell seem unnecessary).
- `bind_rows` is never explained; labels are set manually again.
- `pd.concat` is never explained, and neither is `query` nor the `lambda` in `apply`.
- Use `.value_counts` instead of `groupby.count` (in text and code).
- `loc[:, ...]` again in `make_pipeline`.
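To make the `fit_transform` and `transform` point concrete, here is a minimal sketch of the kind of explanation that could accompany the scaling step. The small `unscaled` frame and its column names are hypothetical, and `StandardScaler` is assumed as the preprocessing step:

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the unscaled data; the column names are assumptions.
unscaled = pd.DataFrame({
    "Area": [500.0, 1000.0, 1500.0],
    "Smoothness": [0.08, 0.10, 0.12],
    "Class": ["B", "M", "B"],
})

# Scale only the two predictor columns; all other columns are dropped.
preprocessor = make_column_transformer(
    (StandardScaler(), ["Area", "Smoothness"]),
    remainder="drop",
)

# fit learns each column's mean and standard deviation from the data;
# transform then applies that learned centering and scaling.
preprocessor.fit(unscaled)
scaled_two_step = preprocessor.transform(unscaled)

# fit_transform does both steps in a single call on the same data.
scaled_one_step = preprocessor.fit_transform(unscaled)

print(scaled_one_step)
```

Keeping `fit` and `transform` separate matters once there is a train/test split: the scaler would be fitted on the training data only and then used to transform both sets.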