This repository was archived by the owner on Nov 21, 2023. It is now read-only.

Inference Time Explanation #13

Closed
beetleskin opened this issue Jan 23, 2018 · 3 comments


@beetleskin

Inference times are often expressed as "X + Y", in which X is time taken in reasonably well-optimized GPU code and Y is time taken in unoptimized CPU code. (The CPU code time could be reduced substantially with additional engineering.)

Isn't it the other way around? X is always > Y in the tables.

@rbgirshick (Contributor) commented Jan 24, 2018

The explanation is correct; the "Y" time is indeed unoptimized CPU code. The fact that it's often so small is why it's left unoptimized :). The main point is that when considering how fast a model is, we can take the timing to be essentially just X because Y can be made much smaller with some engineering effort (e.g., the Y for Mask R-CNN is mostly time spent upsampling 100 predicted masks, one at a time, not in parallel; this could be replaced with a parallelized GPU implementation and take almost no time at all).
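As an illustration of that last point, here is a minimal NumPy sketch (not Detectron's actual code; the function names and the nearest-neighbor upsampling are assumptions for illustration) contrasting serial per-mask upsampling with one vectorized call over all 100 masks:

```python
import numpy as np

def upsample_masks_looped(masks, scale):
    """Upsample each mask one at a time -- the serial, unoptimized CPU path."""
    out = []
    for m in masks:  # e.g. 100 predicted masks, handled one by one
        out.append(np.repeat(np.repeat(m, scale, axis=0), scale, axis=1))
    return np.stack(out)

def upsample_masks_batched(masks, scale):
    """Upsample all masks in one vectorized call -- what a parallelized
    GPU kernel would do, removing the per-mask loop entirely."""
    return np.repeat(np.repeat(masks, scale, axis=1), scale, axis=2)

masks = np.random.rand(100, 28, 28)  # 100 predicted 28x28 masks, as in Mask R-CNN
looped = upsample_masks_looped(masks, 2)
batched = upsample_masks_batched(masks, 2)
# Both produce (100, 56, 56) arrays with identical contents; only the
# batched version avoids the serial per-mask overhead.
```

The same idea carries over to a GPU implementation: one batched kernel launch over all masks instead of 100 sequential operations.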

@beetleskin (Author) commented Jan 24, 2018

So, if I got this right, the total inference time is always X + Y, i.e. some parts of inference are run on the GPU and some on the CPU? From the explanation I thought X is the inference time on the GPU and Y is the inference time on the CPU, i.e. the same algorithm on different hardware. But I guess the "+" expresses exactly that :)

Does the inference time also relate to the hardware, i.e. the 8 NVIDIA Tesla P100 GPUs, run in parallel?

@rbgirshick (Contributor)

I see the confusion. Yes, the total time is additive as in X plus Y.

When the --multi-gpu-testing flag is used with {train,test}_net.py, inference happens on the dataset in a map-reduce fashion: the dataset is partitioned into NUM_GPUS subsets, which are processed in parallel. Inference on each individual image is always run on a single GPU.
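The partitioning step can be sketched in a few lines of Python (a hypothetical illustration; `partition` is not an actual Detectron function):

```python
def partition(dataset, num_gpus):
    """Split a dataset into num_gpus contiguous subsets (the 'map' step).
    Each subset is handed to one GPU worker; per-image inference still
    runs on a single GPU, and the workers' results are concatenated
    afterwards (the 'reduce' step)."""
    per_gpu = (len(dataset) + num_gpus - 1) // num_gpus  # ceiling division
    return [dataset[i * per_gpu:(i + 1) * per_gpu] for i in range(num_gpus)]

image_ids = list(range(10))
subsets = partition(image_ids, 4)
# subsets -> [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```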
