by over9000 » Mon Jun 30, 2014 2:09 pm
Definitely looks like the OP is omitting the parts that do the MPI initialisation and sending. It also looks like the code is almost certainly running on the ARM, not the Epiphany. As for 'counter', with MPI this will be per-process (each rank has its own copy). You can do an MPI reduce operation to sum up all the values afterwards, which is probably another bit of code that's omitted here.
I know it's only a simple test, but if you want accuracy and speed from this, consider:
* There's no need for floating point. You can scale everything to your max int value. You could scale the circle's radius to sqrt(max_int), so that squares never overflow, but then you only get half the number of significant bits. Or you could scale to max_int and do a "long" multiply on the high and low halves of each value, to keep r squared and the other products from overflowing. Even if you do long multiplication, you'll probably get results around an order of magnitude faster on the Epiphany.
* The standard random number generator probably isn't good enough for statistical use, so you'd probably want a higher-quality one. Use the same idea as above and avoid unnecessary floating point operations (save for the final one that calculates the reduced ratio). If RAND_MAX is less than max_int, consider scaling to RAND_MAX instead.
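To make the two points above concrete, here's a rough sketch of the integer-only version in C. Everything in it is my own assumption rather than the OP's code: the names count_hits and xorshift32 are made up, the radius is scaled to UINT32_MAX, and xorshift32 is only a small stand-in generator, not one I'd vouch for statistically. The 64-bit products are what the compiler turns into a "long" multiply on a 32-bit target.

```c
#include <assert.h>
#include <stdint.h>

/* xorshift32 (Marsaglia). Stand-in generator only; swap in something
 * better for real statistical work. The state must never be zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Count how many of `samples` random points land inside the quarter
 * circle of radius R = UINT32_MAX, using integer arithmetic only. */
static uint64_t count_hits(uint64_t samples, uint32_t seed)
{
    assert(seed != 0);                 /* xorshift gets stuck at zero */

    const uint64_t r  = UINT32_MAX;
    const uint64_t r2 = r * r;         /* (2^32 - 1)^2 fits in 64 bits */
    uint32_t state = seed;
    uint64_t hits = 0;

    for (uint64_t i = 0; i < samples; i++) {
        uint64_t x = xorshift32(&state);
        uint64_t y = xorshift32(&state);
        /* x*x + y*y could overflow 64 bits, so test the equivalent
         * x*x <= r2 - y*y; the right side cannot go negative since y <= r. */
        if (x * x <= r2 - y * y)
            hits++;
    }
    return hits;
}
```

Each process would call count_hits with its own seed, the hit counts get summed across processes, and the only floating point operation left is the final 4.0 * hits / samples.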
I haven't had time to play with my Parallella much yet, and this is one of the very simple test projects I had in mind. It should be interesting to pit the Epiphany against other platforms like the Raspberry Pi's GPU, ARM NEON, x86 SSE and so on. I know doing this via MPI isn't immediately useful, but it becomes more so if you can do arbitrary-precision stuff. Also, without cluster cables, MPI is probably the best choice for having multiple Parallella boards communicate with each other...