Friday 13 September 2013

Numerical correctness in OpenCL (CPU) vs Java?

I'm implementing a numerical algorithm in both Java and OpenCL (through
the www.jocl.org bindings), and I noticed a strange behaviour: there are
clear differences between the numerical outcome of the Java program and
the OpenCL program, even on CPU. On GPU this could be expected, but on
CPU? Aren't both Java and OpenCL supposed to use the same numerical
instructions embedded in the CPU?
Naturally, I first thought that it would be a concurrency issue, but that
turned out not to be the case, since the results are consistent and the
issue remains identical even when I set global_work_size = 1.
Then, I thought I had maybe stumbled upon a compiler error. Unfortunately,
the results were similar for each of the following device/driver
combinations in my possession:
Intel CPU with Intel driver
Intel CPU with AMD driver
Nvidia GPU with Nvidia driver
AMD GPU with AMD driver
Each of them gave a similar result in OpenCL, but the Java program running
on the CPU yielded a different result. And the Java result is the "correct"
one (it yields the same statistical probabilities as documented in the
literature).
So, I would like to know if anyone has any idea what could possibly be
going wrong? I'm out of inspiration.
Aren't OpenCL on CPU and Java supposed to be identical in numerical
precision, as they should be using the same CPU instructions?
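
One thing that may be worth ruling out is a difference in floating-point
type between the two versions. The sketch below is only an illustration,
under the assumption that the OpenCL kernel computes in float while the
Java code computes in double; the post does not state which types either
version uses, and the class and method names here are hypothetical. It
shows that the same accumulation on the same CPU can give visibly
different results depending on the precision used.

```java
public class PrecisionDemo {

    // Accumulate 0.1 n times in single precision (assumed kernel type).
    static float sumFloat(int n) {
        float acc = 0.0f;
        for (int i = 0; i < n; i++) {
            acc += 0.1f;
        }
        return acc;
    }

    // Accumulate 0.1 n times in double precision (assumed Java type).
    static double sumDouble(int n) {
        double acc = 0.0;
        for (int i = 0; i < n; i++) {
            acc += 0.1;
        }
        return acc;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Both loops run on the same CPU; only the operand width differs.
        System.out.println("float  sum: " + sumFloat(n));
        System.out.println("double sum: " + sumDouble(n));
    }
}
```

If the types already match, comparing intermediate values between the two
implementations at the same step would be the next thing to try.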
