
Simplify and add more unit tests #42

davidssmith opened this issue Jan 10, 2018 · 5 comments

@davidssmith (Owner)

I'm not entirely happy with the number of dependencies and helper functions for what is essentially a glorified non-linear least squares fitting code. I would be happier if it were distilled to its essential functions and all of those functions were unit tested on small, known data sets, rather than the QIBA phantom.

To start the discussion, I propose:

  • removing the command-line arguments
  • removing parallel processing (the new algorithms are so fast that the benefit is minimal)
  • moving demo and validate to test/
  • replacing the jacobian from Calculus.jl with a simpler, inline expression
  • removing plotting and leaving it to the user, or using something simpler, like UnicodePlots

These are just ideas. I'll think about it for a while and take any comments the community has.
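To make the inline-jacobian idea concrete, here is a minimal sketch for a toy mono-exponential model (not one of the package's actual models): the partial derivatives are written out by hand and checked against a central finite difference.

```julia
# Toy model f(t) = a*exp(-k*t); the package's real models would need their
# own partial derivatives worked out the same way.
f(t, a, k) = a * exp(-k * t)

# Inline jacobian: columns are the partials ∂f/∂a and ∂f/∂k at each time point.
jac(t, a, k) = hcat(exp.(-k .* t), -a .* t .* exp.(-k .* t))

t = collect(0.0:0.5:3.0)
J = jac(t, 2.0, 0.7)

# Sanity check ∂f/∂k against a central finite difference.
h = 1e-6
fd_k = (f.(t, 2.0, 0.7 + h) .- f.(t, 2.0, 0.7 - h)) ./ (2h)
maximum(abs.(J[:, 2] .- fd_k))   # tiny (finite-difference truncation error)
```

A hand-written jacobian like this avoids the Calculus.jl dependency entirely for models whose partials are easy to derive.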

@davidssmith (Owner, Author)

Since the actual in vivo data set has no known ground truth, perhaps it should be removed and the QIBA phantom should be the demo. Then the validation would occur on a handful of fake voxels with known parameter values that are checked in runtests.jl.
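A fake-voxel check along those lines could look like this sketch, where the Tofts curve is generated by a discrete convolution with a toy AIF (not the package's actual implementation) and a brute-force grid search stands in for the real NLS fit:

```julia
# Standard Tofts model via a simple discrete convolution; toy AIF, not
# a physiological one. Hypothetical helper, not the package API.
function tofts(t, Cp, Ktrans, ve)
    kep = Ktrans / ve
    dt = t[2] - t[1]
    Ct = similar(t)
    for i in eachindex(t)
        s = 0.0
        for j in 1:i                       # Ktrans * (Cp ⊗ exp(-kep*t))
            s += Cp[j] * exp(-kep * (t[i] - t[j]))
        end
        Ct[i] = Ktrans * dt * s
    end
    return Ct
end

t  = collect(0.0:0.1:5.0)
Cp = exp.(-t)                              # toy AIF
Ct = tofts(t, Cp, 0.25, 0.1)               # known ground truth

# Brute-force grid search stands in for the real fit in this sketch.
Ks = 0.05:0.05:0.5
vs = 0.05:0.05:0.5
best = argmin([sum(abs2, Ct .- tofts(t, Cp, K, v)) for K in Ks, v in vs])
Khat, vhat = Ks[best[1]], vs[best[2]]      # recovers (0.25, 0.1)
```

A runtests.jl entry would then just assert that the fitted parameters match the known values to within tolerance.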

@davidssmith (Owner, Author)

In fact, the nightly build seems to be failing due to a bug in ArgParse, a dependency we could in theory remove.

@notZaki (Contributor) commented Jan 19, 2018

I've never used the command line version, so can't really comment on that.

The parallel processing (nlsfit()) has been useful to me in the past when working with simulation data. But yeah, it's not really something I'd use for clinical data unless batch processing a big collection. Is there a significant overhead to keeping this feature?

Yes to moving demo and validate.

I have no opinion on jacobian. Just curious if ForwardDiff.jl is any better than Calculus.jl.

The plotting code doesn't seem to be causing any problems. It'll even be out of sight if demo/validate are moved to test.


I think it'll be useful for each model to have its own fitting function - e.g. fitTofts(), fitExtTofts(), etc. The 'numbered model paradigm' worked well for a sequence of nested models, and I guess it is convenient for users that want a pre-configured fitting pipeline, but it is awkward to add new models to it.
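One way to get per-model fitting functions without the numbered-model switch is to dispatch on model types. This is just a shape sketch with stub fits, all names hypothetical, not the package's API:

```julia
# Hypothetical per-model interface: fit() dispatches on the model type, so
# adding a new model means adding a type and two methods, not editing a switch.
abstract type DCEModel end
struct Tofts    <: DCEModel end
struct ExtTofts <: DCEModel end

paramnames(::Tofts)    = (:Ktrans, :ve)
paramnames(::ExtTofts) = (:Ktrans, :ve, :vp)

# Stub: a real method would run the NLS fit for that model's equation.
fit(m::DCEModel, t, Cp, Ct) = zeros(length(paramnames(m)))

length(fit(ExtTofts(), 0:5, zeros(6), zeros(6)))   # 3 parameters back
```

Each model's fit can then be unit tested directly, without going through any top-level pipeline function.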

Another change - which might be unfeasible - is to reduce the size of the package by removing unnecessary files, such as the DICOM files of the QIBA phantoms. The problem there is that git tracks the files, so they're never truly gone. They could be purged from the git history, but then the peerj paper is no longer reproducible.

@davidssmith (Owner, Author)

The overhead of keeping the parallel functionality is a slower single-CPU fit, since it still runs through the same framework, plus code bloat from parallel code that is especially difficult for the casual user to understand.

It looks like ForwardDiff.jl requires a differentiable function, but I bet it is faster. Maybe you can do some math on the Tofts integral to turn it into a differentiable function suitable for passing to it.

I don't understand why each model should have its own fitting function. Why is the numbered model paradigm not working for you?

I'd like to remove the DICOMs and replace them with something like DataDeps.jl, but yes, unfortunately that won't reduce the repo size unless we nuke the old versions, including the one in the paper.

@notZaki (Contributor) commented Jan 24, 2018

I guess parallel processing could be removed if it's slowing down the single-CPU fit, as that will be how most people use the package. It could be replaced by adding Threads.@threads before the for-loops (the ones iterating over each voxel), which will take advantage of multi-threading if the user has it enabled.
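A minimal sketch of that Threads.@threads idea, with a placeholder standing in for the per-voxel fit:

```julia
using Base.Threads  # @threads; the loop runs serially when JULIA_NUM_THREADS is 1

fitvoxel(curve) = sum(curve)   # placeholder for the real per-voxel NLS fit

function fitall(data)          # data: ntime × nvoxels
    results = Vector{Float64}(undef, size(data, 2))
    @threads for v in 1:size(data, 2)   # each voxel is independent
        results[v] = fitvoxel(view(data, :, v))
    end
    return results
end

data = rand(16, 200)
out = fitall(data)
out ≈ vec(sum(data, dims = 1))   # true: same answer regardless of thread count
```

Since voxels are independent, there is no shared state to guard, and the single-threaded path pays essentially no overhead.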

The problem I see is that fitdce() will become big and complicated as more models are added. For example, if we were to add the 2CXM - which has parameters Fp, PS, ve, vp - do we add Fpmax and PSmax as optional parameters to fitdce()? Also, if the user asks to fit both the ExtTofts and 2CXM, how would fitdce() handle the fact that Ktrans, Fp, and PS are not in both models?
That's why I think the fitting part - i.e. the part after if x in models - should be its own function. Then, fitdce() would be a convenient wrapper for the Tofts model family without having to include all models.
This would also allow users to call the fitting part directly to get the 'raw' fits without any post-processing, and would allow each model's fit to be tested without having to go through fitdce().
There's also the smaller issue of the model numbers being arbitrary - e.g. the ExtTofts model is currently Models 3 and 5 - which isn't aesthetically pleasing.
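The wrapper-plus-raw-fits split could be sketched like this, with a name-to-function map replacing the numbered branches (all names and return values hypothetical stubs):

```julia
# Stub per-model fitters returning 'raw' parameter estimates.
fitTofts(t, Cp, Ct)    = (Ktrans = 0.2, ve = 0.1)
fitExtTofts(t, Cp, Ct) = (Ktrans = 0.2, ve = 0.1, vp = 0.05)

const FITTERS = Dict(:Tofts => fitTofts, :ExtTofts => fitExtTofts)

# fitdce stays a thin wrapper: select models, call the raw fits, post-process.
function fitdce(t, Cp, Ct; models = [:Tofts])
    Dict(m => FITTERS[m](t, Cp, Ct) for m in models)
end

res = fitdce(0:5, zeros(6), zeros(6); models = [:Tofts, :ExtTofts])
res[:ExtTofts].vp   # each raw fit is also reachable without the wrapper
```

Models with disjoint parameter sets then pose no problem, since each fitter returns only its own parameters.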

I think the new package manager (Pkg3) is supposed to use shallow clones, so that might speed up installation after removing auxiliary files. But anyone wanting to contribute to the package will still have to download the entire git history.
