cannot Classify image #108
Comments
Yeah, I ran into this issue recently too. We really need to upload a newer version of DIGITS; that may be the cause. |
I don't think this is an issue in DIGITS 3.0 per se, but something to do with conflicting packages; see NVIDIA/DIGITS#801 (comment). I'm surprised training works fine. @Cimenx did you update your image after you initially trained the model? |
No, we have the 7.5 package, because I pin the cuDNN version in my Dockerfile: You can also verify it inside the container:
|
@Cimenx on DockerHub you mentioned the previous version was working? It's probably related to the newer caffe/cudnn. If so, I will revert and push a different tag for a newer version of DIGITS. |
On DIGITS 3.0.0, the problem only exists with |
After an inverted bisect, the problem was fixed in NVIDIA/DIGITS@9dba452 |
@gheinrich @lukeyeager I was able to reproduce the problems on two machines, without Docker, just using the packages from the repo. So it might be a DIGITS/packaging problem. |
@gheinrich No, I'm training in a fresh, fully updated container. |
@Cimenx: sure, the master branch of DIGITS doesn't have this problem, that's why. |
@Cimenx did you update the NVIDIA driver recently? |
Hi @Cimenx , DIGITS dev here. I tried a simple training + inference with the
|
We are still investigating the bug for DIGITS 3.0. |
We are now tracking this bug in NVIDIA/DIGITS#845 |
I recently updated the NVIDIA driver and then ran into this problem ("Cannot create Cublas handle...."). I've tried re-pulling the NVIDIA/DIGITS images, but that doesn't help. I'm new to Docker. Is there a solution? |
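One way to narrow down a driver/runtime mismatch like this is to check whether the CUDA runtime initializes at all inside the container, independently of DIGITS and Caffe. The following is a minimal sketch, assuming a `libcudart` shared library can be located by the dynamic loader; the helper name `cuda_device_count` is hypothetical and not part of DIGITS. It calls the CUDA runtime function `cudaGetDeviceCount` through `ctypes` and returns the raw error code, where 0 means success.

```python
import ctypes
import ctypes.util


def cuda_device_count(libname=None):
    """Return (error_code, device_count) from cudaGetDeviceCount.

    Returns None when the CUDA runtime library cannot be loaded or does
    not export the expected symbol. `libname` is an optional explicit
    library name or path (an assumption for illustration).
    """
    name = libname or ctypes.util.find_library("cudart")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
        get_count = lib.cudaGetDeviceCount
    except (OSError, AttributeError):
        return None
    count = ctypes.c_int(0)
    # cudaGetDeviceCount(int *count) returns a cudaError_t; 0 is cudaSuccess.
    err = get_count(ctypes.byref(count))
    return err, count.value


if __name__ == "__main__":
    result = cuda_device_count()
    if result is None:
        print("CUDA runtime library not found")
    else:
        print("error code: %d, devices: %d" % result)
```

A non-zero error code here would suggest the problem sits below DIGITS, in the driver or container setup, while a clean result would point back at the application or its packaging.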
I'm using the nvidia/digits image from the Docker registry: https://hub.docker.com/r/nvidia/digits
When I run it, it works normally, but it seems libdc1394 is missing:
2016-06-10 03:00:28 [9] [INFO] Starting gunicorn 17.5
libdc1394 error: Failed to initialize libdc1394
2016-06-10 03:00:28 [9] [DEBUG] Arbiter booted
2016-06-10 03:00:28 [9] [INFO] Listening at: http://0.0.0.0:34448 (9)
2016-06-10 03:00:28 [9] [INFO] Using worker: socketio.sgunicorn.GeventSocketIOWorker
After training (which works fine), I try Classify One, but my browser says "The connection was reset":
2016-06-10 03:01:26 [52] [INFO] Booting worker with pid: 52
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0610 03:02:23.285483 52 common.cpp:110] Cannot create Cublas handle. Cublas won't be available.
E0610 03:02:23.288187 52 common.cpp:117] Cannot create Curand generator. Curand won't be available.
E0610 03:02:23.290375 52 common.cpp:121] Cannot create cuDNN handle. cuDNN won't be available.
F0610 03:02:23.331511 52 syncedmem.hpp:19] Check failed: error == cudaSuccess (3 vs. 0) initialization error
*** Check failure stack trace: ***
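For longer logs, the glog-style lines above can be filtered down to the error and fatal entries with a short script. This is only a sketch for triage: the regular expression assumes the prefix layout visible in this log (severity letter, month and day, timestamp, thread id, `file:line]`), and the names `GLOG_LINE` and `failures` are made up for illustration.

```python
import re

# Assumed glog prefix, based on the lines in this report:
# <severity><MMDD> <HH:MM:SS.micros> <thread id> <file>:<line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<source>[^\s:]+:\d+)\] (?P<message>.*)$"
)

SAMPLE = """\
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0610 03:02:23.285483 52 common.cpp:110] Cannot create Cublas handle. Cublas won't be available.
E0610 03:02:23.288187 52 common.cpp:117] Cannot create Curand generator. Curand won't be available.
E0610 03:02:23.290375 52 common.cpp:121] Cannot create cuDNN handle. cuDNN won't be available.
F0610 03:02:23.331511 52 syncedmem.hpp:19] Check failed: error == cudaSuccess (3 vs. 0) initialization error
"""


def failures(log_text):
    """Return (severity, source, message) for each E or F glog line."""
    found = []
    for line in log_text.splitlines():
        match = GLOG_LINE.match(line.strip())
        if match and match.group("severity") in ("E", "F"):
            found.append(
                (match.group("severity"), match.group("source"), match.group("message"))
            )
    return found


if __name__ == "__main__":
    for severity, source, message in failures(SAMPLE):
        print(severity, source, message)
```

On the excerpt above this keeps the three `common.cpp` errors and the fatal check in `syncedmem.hpp:19`, and skips the gunicorn lines and the plain `WARNING:` banner, which do not follow the glog prefix.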