Friday, June 25, 2010

[from Pawel] new driver/SDK on cudak5

We now have the 195.36.31 driver (new, stable) on cudak5.
The previous one, 256.25, was a beta version that we needed in order to compile anything
with the gcc-3.4 compiler, but it gradually showed some serious problems,
such as not seeing chosen devices and leaving them permanently in high-gear mode (high T).

Josh installed the appropriate toolkit with SDK and it now works ~ok!!
I've tested every SDK demo on all cards and the result is that we no longer have
any problems with multi-GPU!

The problems you'll encounter running the nvidia-settings utility & the examples in
/home/student/NVIDIA_GPU_Computing_SDK/C/bin/linux/release

are restricted to the following programs
====================================================

nvidia-settings does not show:
Thermal Settings
PowerMizer
DFP-1(20 - GeForce GTX 480)

for devices 1 and 2.
To see temperatures you must run
% nvidia-smi -q -a
which prints correct temperatures. I've been able to reach T=90-94C while running
3 instances of the nbody test on 3 cards. That's normal.
=======================================================================================
printout from ./deviceQuery
(..)
Concurrent copy and execution: Yes
Run time limit on kernels: No
(...)

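and yet... (note that deviceQuery's "Concurrent copy and execution" line reports copy/kernel overlap, which is a separate capability from the concurrent kernel execution tested below)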

[student@cudak5 release]$ ./concurrentKernels
[concurrentKernels] - Starting...

CUDA Device GeForce GTX 480 has 15 Multi-Processors
CUDA Device GeForce GTX 480 is NOT capable of concurrent kernel execution
concurrentKernels.cu(111) : cudaSafeCall() Runtime API error : unspecified launch failure.
[student@cudak5 release]$
======================================================================================

[student@cudak5 release]$ ./fluidsGL
[fluidsGL] - [OpenGL/CUDA simulation]
CUDA device [GeForce GTX 480] has 15 Multi-Processors
fluidsGL_kernels.cu(44) : cutilCheckMsg() CUTIL CUDA error : cudaMemcpy failed : unspecified launch failure.
fluidsGL.cu(379) : cutilCheckMsg() CUTIL CUDA error : cudaGLUnregisterResource failed : unspecified launch failure.
[student@cudak5 release]$

[student@cudak5 release]$
[student@cudak5 release]$ ./fluidsGL --device=2
[fluidsGL] - [OpenGL/CUDA simulation]
Using device 2: GeForce GTX 480
CUDA device [GeForce GTX 480] has 15 Multi-Processors
fluidsGL.cu(183) : cudaSafeCall() Runtime API error : unspecified launch failure.
fluidsGL.cu(379) : cutilCheckMsg() CUTIL CUDA error : cudaGLUnregisterResource failed : unspecified launch failure.
[student@cudak5 release]$
[student@cudak5 release]$
[student@cudak5 release]$
[student@cudak5 release]$ ./fluidsGL --device=1
[fluidsGL] - [OpenGL/CUDA simulation]
Using device 1: GeForce GTX 480
CUDA device [GeForce GTX 480] has 15 Multi-Processors
fluidsGL.cu(183) : cudaSafeCall() Runtime API error : unspecified launch failure.
fluidsGL.cu(379) : cutilCheckMsg() CUTIL CUDA error : cudaGLUnregisterResource failed : unspecified launch failure.
[student@cudak5 release]$
[student@cudak5 release]$
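
To keep track of which demos fail and where as more runs pile up, the error lines in the printouts above can be tallied automatically. What follows is only a minimal sketch in Python: it assumes every SDK error line has the same shape as the ones pasted above (source file, line number in parentheses, the name of the check macro, then the message), and the names parse_sdk_errors and count_by_location are made up here for illustration, not part of the SDK.

```python
import re
from collections import Counter

# Matches SDK error lines of the shape seen in the printouts above, e.g.
#   fluidsGL.cu(183) : cudaSafeCall() Runtime API error : unspecified launch failure.
# Assumption: other SDK demos report errors in this same layout.
ERROR_LINE = re.compile(
    r"^(?P<source>\S+\.cu)\((?P<line>\d+)\) : (?P<check>\w+)\(\) (?P<message>.+)$"
)


def parse_sdk_errors(text):
    """Return (source file, line number, check macro, message) for each error line."""
    errors = []
    for raw in text.splitlines():
        match = ERROR_LINE.match(raw.strip())
        if match:
            errors.append((
                match["source"],
                int(match["line"]),
                match["check"],
                match["message"].rstrip("."),
            ))
    return errors


def count_by_location(errors):
    """Count how often each (source file, line number) pair reported an error."""
    return Counter((source, line) for source, line, _check, _message in errors)


if __name__ == "__main__":
    # Lines copied from the fluidsGL --device=2 run above.
    sample = """\
[student@cudak5 release]$ ./fluidsGL --device=2
fluidsGL.cu(183) : cudaSafeCall() Runtime API error : unspecified launch failure.
fluidsGL.cu(379) : cutilCheckMsg() CUTIL CUDA error : cudaGLUnregisterResource failed : unspecified launch failure.
"""
    tally = count_by_location(parse_sdk_errors(sample))
    for (source, line), count in sorted(tally.items()):
        print(f"{source}:{line} failed {count} time(s)")
```

Prompt lines and the ordinary "CUDA device ..." banner lines do not match the pattern and are skipped, so a whole terminal session can be pasted in as-is.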


====================================================================

Those who are close to producing working CUDA programs, please do so on cudak5 in the student account. Don't worry (by necessity) about InfiniBand.
While having no internode communication looks moderately bad, not having any CUDA will definitely look bad at our June 29 meeting.

We have very little time. Full-time engagement is expected from now until the end of the month.
