https://www.youtube.com/watch?v=Ey-inJ9Dz6Q
Seems to be a language that can automatically map arrays to a partitioned global address space, and (if I've understood correctly?) iterations over those arrays are automatically distributed across threads (or processors) tied to subsets of the address range. It's sort of data-parallel, but with controlled locality.
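To make that model concrete, here's a minimal C++/std::thread sketch (not the language from the talk, purely illustrative) of the owner-computes idea: the index space of a "global" array is partitioned into contiguous slices, and each thread iterates only over the slice it owns.

```cpp
// Sketch of owner-computes data parallelism with locality:
// the index space is split into contiguous slices, and each
// thread touches only the slice it owns.
#include <cstddef>
#include <thread>
#include <vector>

int main() {
    const std::size_t n = 1 << 20;   // "global" array size
    const std::size_t nthreads = 4;  // stand-in for cores/locales
    std::vector<double> a(n, 1.0), b(n, 2.0), c(n, 0.0);

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < nthreads; ++t) {
        workers.emplace_back([&, t] {
            // Each thread owns one contiguous sub-range of indices.
            const std::size_t lo = t * n / nthreads;
            const std::size_t hi = (t + 1) * n / nthreads;
            for (std::size_t i = lo; i < hi; ++i)
                c[i] = a[i] + b[i];  // only the owned slice is touched
        });
    }
    for (auto& w : workers) w.join();
    return 0;
}
```

In an actual PGAS implementation each slice would live in a physically separate memory (a node's RAM, or a core's scratchpad), and reading an index outside your own slice would turn into communication rather than a plain load/store.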
I suppose as it stands this still isn't really a perfect match for the Epiphany chip, where the scratchpads are more analogous to L1 cache (and you'd really be using DMA to work with off-chip global memory); but I guess it might apply more to the vision of future versions with large amounts of 3D memory per core, on chip. I speculate you might be able to make an OpenCL implementation work like this (but it would be horrendously complex to do).
The talk is interesting, covering the use cases where PGAS offers potential advantages over MPI and shared memory.