Resample MixedModel object and refit #806

Open · joseah opened this issue Feb 28, 2025 · 1 comment
joseah commented Feb 28, 2025

Dear MixedModels.jl team,

I've implemented a non-parametric bootstrapping strategy for genetic analysis using MixedModels.jl. One thing I've noticed is that the MixedModelsPermutations package by @palday resamples residuals directly within the model (in-place), without having to regenerate the MixedModel object from scratch before each refit! in every bootstrap iteration. This results in a huge speed-up.

Is there any recommendation on how to do this kind of resampling of observations (essentially as if we were to pass a new resampled dataset to fit, but within the MixedModel object)? I've been attempting to resample the model.reterms attribute, but I was wondering whether there's a better way to take all the necessary attributes into account (e.g. X, Xymat, reterms). Is this feasible given the definition of the MixedModel type?
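For reference, the from-scratch baseline I'm trying to avoid looks roughly like this (untested sketch; df, the formula f, and niter are placeholders):

using DataFrames, MixedModels, Random, StatsBase

rng = MersenneTwister(1)
niter = 1000                          # number of bootstrap replicates
estimates = Vector{Vector{Float64}}()
for i in 1:niter
    # resample observations with replacement, then rebuild and fit from scratch
    rows = sample(rng, 1:nrow(df), nrow(df); replace=true)
    boot = fit(MixedModel, f, df[rows, :]; progress=false)
    push!(estimates, coef(boot))
end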

Many thanks,
Jose

palday commented Mar 3, 2025

You don't need to worry about X vs. Xymat -- X is a view into the latter.

I haven't tested this out, so I can't promise accuracy (and I know certain parts probably aren't quite right), but here's something between pseudo- and actual code:

using MixedModels, Random, StatsBase

# pre-allocate a few things
niter = 1000                        # number of bootstrap replicates
selection = collect(1:nobs(model))
scratch = deepcopy(model)
rng = MersenneTwister(1)

for i in 1:niter
    sample!(rng, 1:nobs(model), selection; replace=true)
    # careful use of `copy!` might be a bit faster here
    scratch.X .= @view model.X[selection, :]
    scratch.y .= @view model.y[selection]
    for (scratch_re, re) in zip(scratch.reterms, model.reterms)
        # the transposes are stored, so we need to select columns
        scratch_re.z .= @view(re.z[:, selection])
        scratch_re.adjA .= @view(re.adjA[:, selection])
    end
    # ...then something like refit!(scratch) and record the estimates
end

(The above example ignores the weighted case.)

This is assuming that you're sampling blindly without taking the blocking structure into account. That's a very naive way to sample entire observations (as opposed to residuals) and may (probably?) not give the inference you want; a block-level alternative is sketched after this list. Beyond that concern, it also destroys the sparsity pattern in the random effects, because the observations are no longer sorted by block. This has several knock-on effects:

  • The sparse matrices will be stored far less efficiently, which results in higher memory use and slower element access. The latter slows down everything.
  • The intermediate computations used in fitting a model will also have their sparsity patterns destroyed, which means that you'll either hit slower, more general methods or encounter cases that can't occur when the sparsity pattern is maintained, and thus get a MethodError because we haven't implemented the necessary code for those patterns.
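If the blocks themselves are what you want to resample, a cluster-level bootstrap on the original table sidesteps both problems, because each replicate is rebuilt and fitted from scratch with intact blocks. Equally untested sketch; df, the grouping column :subj, the formula f, and niter are all placeholders:

using DataFrames, MixedModels, Random, StatsBase

rng = MersenneTwister(1)
subjects = unique(df.subj)
estimates = Vector{Vector{Float64}}()
for i in 1:niter
    # draw whole blocks (subjects) with replacement
    chosen = sample(rng, subjects, length(subjects); replace=true)
    parts = map(enumerate(chosen)) do (j, s)
        part = df[df.subj .== s, :]   # indexing copies, so safe to relabel
        # give each draw a fresh level so repeated subjects stay distinct blocks
        part.subj = fill(string(s, "_", j), nrow(part))
        part
    end
    boot = reduce(vcat, parts)
    push!(estimates, coef(fit(MixedModel, f, boot; progress=false)))
end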
