Dear MixedModels.jl team,

I've implemented a non-parametric bootstrapping strategy for genetic analysis using MixedModels.jl. One thing I've noticed is that the MixedModelsPermutations package by @palday resamples residuals directly within the model (in place), without having to regenerate the MixedModel object from scratch for every bootstrap iteration and then call refit!. This results in a huge speed-up.

Is there any recommendation on how to do this kind of resampling of observations (essentially as if we were to pass a new resampled dataset to the fit function, but within the existing MixedModel object)? I've been attempting to resample the model.reterms attribute, but I was wondering if there's a better way to take all the necessary attributes into account (e.g. X, Xymat, reterms). Is this feasible given the MixedModel type definition?
Many thanks,
Jose
You don't need to worry about X vs. Xymat -- X is a view into the latter.
I haven't tested this out, so I can't promise accuracy (and I know certain parts probably aren't quite right), but here's something between pseudocode and actual code:
using Random
using StatsBase

# pre-allocate a few things
niter = 1000  # number of bootstrap replicates; adjust as needed
selection = collect(1:nobs(model))
scratch = deepcopy(model)
rng = MersenneTwister(1)

for i in 1:niter
    # resample observation indices with replacement into the preallocated vector
    sample!(rng, 1:nobs(model), selection; replace=true)
    # careful use of `copy!` might be a bit faster here
    scratch.X .= @view model.X[selection, :]
    scratch.y .= @view model.y[selection]
    for (scratch_re, re) in zip(scratch.reterms, model.reterms)
        # the transposes are stored, so we need to index columns
        scratch_re.z .= @view(re.z[:, selection])
        scratch_re.adjA .= @view(re.adjA[:, selection])
    end
end
(The above example ignores the weighted case.)
This is assuming that you're sampling blindly, without taking the blocking structure into account. That's a very naive way to sample entire observations (as opposed to residuals) and may (probably?) not give the inference you want. Beyond that concern, it also destroys the sparsity pattern in the random effects, because the observations are no longer sorted by block. This has several knock-on effects:
- The sparse matrices will be stored far less efficiently, which will result in higher memory use and slower element access. The latter will slow down everything.
- The intermediate computations used in fitting a model will also have their sparsity patterns destroyed, which means that you'll either hit slower, more general methods OR encounter cases that can't occur when the sparsity pattern is maintained and thus get a MethodError because we haven't implemented the necessary code for those patterns.
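To make the block-ordering point concrete, here's a small self-contained sketch (the grouping variable and all names here are made up for illustration; none come from MixedModels.jl) showing that case resampling with replacement scrambles the block-sorted ordering that the sparse storage relies on:

```julia
using Random, StatsBase

# Toy grouping variable: 3 blocks of 4 observations, sorted by block,
# mirroring the ordering the sparse random-effects storage assumes.
group = repeat(["a", "b", "c"], inner=4)

# Case-resampling with replacement, as in the loop above.
rng = MersenneTwister(1)
idx = sample(rng, 1:length(group), length(group); replace=true)
resampled = group[idx]

# `group` is sorted by block; `resampled` generally is not, and it
# repeats some observations while dropping others, so the contiguous
# per-block column ranges in the stored transposed matrices are lost.
println(issorted(group), " vs ", issorted(resampled))
```

A block-aware (stratified) resampling scheme would avoid this, at the cost of a more involved indexing step.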