Parallella Community

by **jar** » Mon May 15, 2017 5:08 pm

I have not run the code, but it looks like your "tailEnds" (cores 0 and 15, presumably) do not wait for the DMA engine to complete before using the DMA again (in e_dma_copy).

Another thing developers often don't realize is that the DMA engine completing (after e_dma_wait) doesn't mean your data is where you think it should be. Completion just means the last part of the command to move data across the network or e-link interface has been issued. You must perform a non-trivial check that the data has finished moving. If multiple cores are reading or writing to that location, then you must also introduce synchronization. In the OpenSHMEM API, the coherence checking is handled implicitly by shmem_quiet after the call to a non-blocking shmem_put*_nbi or shmem_get*_nbi operation. The e-lib library provides no mechanism for this, so it is left to the developer.
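One common way to do such a check is to plant a sentinel in the last word of the destination buffer before the transfer and poll until it is overwritten. The sketch below is plain C, not e-lib code: `SENTINEL`, `arm_sentinel` and `wait_for_last_word` are hypothetical names, and it assumes that writes from one source arrive at the destination in order and that the real payload never ends with the sentinel value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker value; the payload's last word must never equal it. */
#define SENTINEL 0xDEADBEEFu

/* Receiver side: mark the last word of the destination before the
 * transfer starts, so its arrival can be detected afterwards. */
void arm_sentinel(volatile uint32_t *dst, size_t words)
{
    assert(dst != NULL && words > 0);
    dst[words - 1] = SENTINEL;
}

/* Receiver side: poll the last word until it no longer holds the
 * sentinel. Returns 1 once the word has been overwritten, or 0 if
 * max_spins polls pass without a change (so a lost transfer cannot
 * hang the core forever). Assumes in-order delivery, so the last word
 * landing implies the earlier words have landed too. */
int wait_for_last_word(const volatile uint32_t *dst, size_t words,
                       unsigned long max_spins)
{
    assert(dst != NULL && words > 0);
    while (max_spins--) {
        if (dst[words - 1] != SENTINEL)
            return 1;
    }
    return 0;
}
```

The receiving core would call `arm_sentinel` before the sender is told to start, and `wait_for_last_word` before touching the buffer. If several cores write to the same buffer, this check alone is not enough and a separate flag or barrier is still needed.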

Also, I have experienced some DMA weirdness that I never was able to pin down. I know that isn't helpful. In general, I avoid DMA with OpenSHMEM calls since synchronous copies typically beat it on Epiphany-III. Asynchronous/non-blocking code is also more complicated. The painstakingly optimized shmemx_memcpy routine is the fastest way to write contiguous blocks of aligned memory on Epiphany-III (but it also handles misalignment).

by **nickoppen** » Tue May 16, 2017 10:52 am

Hi Jar,

The variable name "tailEnds" is perhaps not a good one. It is the amount of data for the core modulo the buffer size. I'll make sure that it waits before sending the results back.

Do the processor states in the debug session shed any light on what is happening? I looked through the architecture reference and there is a lot of discussion about processor states but no table that relates the mnemonic with the value.

I'm getting into DMA in the belief that it can be used to shift one lot of data around while the core is processing another. If the algorithm is non-trivial or needs to be run many times (e.g. neural network training) then there is a net gain, even if the data transfer is not as quick as it could be. Am I on the right track here?
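The overlap described here is usually structured as double buffering: wait for the current chunk, start fetching the next one into the other buffer, then process the current one. The following is only a host-side sketch of that loop shape. `start_fetch` and `wait_fetch` are hypothetical stand-ins (a plain memcpy and a no-op) for an asynchronous transfer and its wait, and `CHUNK` is an arbitrary size, so nothing actually overlaps here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* arbitrary buffer size in elements, for illustration */

/* Hypothetical stand-in for starting an asynchronous transfer.
 * On the host this is just a synchronous memcpy. */
void start_fetch(int *dst, const int *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* Hypothetical stand-in for waiting on that transfer; nothing to do here. */
void wait_fetch(void) {}

/* Sum `total` ints from `src` using two alternating buffers. */
long sum_double_buffered(const int *src, size_t total)
{
    int buf[2][CHUNK];
    long sum = 0;
    size_t done = 0;
    int cur = 0;

    assert(src != NULL || total == 0);
    if (total == 0)
        return 0;

    size_t n = total < CHUNK ? total : CHUNK;
    start_fetch(buf[cur], src, n);              /* prime the first buffer */

    while (done < total) {
        wait_fetch();                           /* current chunk must be complete */

        size_t next_off = done + n;
        size_t next_n = 0;
        if (next_off < total) {                 /* last chunk may be a short tail */
            size_t left = total - next_off;
            next_n = left < CHUNK ? left : CHUNK;
            start_fetch(buf[cur ^ 1], src + next_off, next_n);
        }

        for (size_t i = 0; i < n; i++)          /* work on the current chunk */
            sum += buf[cur][i];

        done = next_off;
        n = next_n;
        cur ^= 1;                               /* swap buffers */
    }
    return sum;
}
```

Note that the wait comes before the next fetch is started, so the transfer engine is never reused while a transfer is still in flight. The final iteration handles the short tail, which is the amount of data modulo the buffer size.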

nick

by **jar** » Tue May 16, 2017 1:31 pm

I ran your code last night and I wasn't sure what you were seeing that was crashing/stalling. How can I reproduce your issue?

Those error codes never made sense to me and it would be nice if there was some way to decode them. I'll ask dar sometime (or you could).

by **nickoppen** » Wed May 17, 2017 12:35 am

That code on github was set to use memcpy and that test file was one that worked anyway.

I've updated the repository with a DMA version and an input file (bridge5.csv) that always fails for me. The file bridge0.csv works as does gray.csv.

I've left in some host_printf calls that shed some light on what is going on. There is also a global symbol (_bebug) that points to the location of the most recently transferred data.

Thanks again for helping me with this.

nick

by **nickoppen** » Wed May 24, 2017 12:46 am

I've been digging through the Architecture Reference and I think I've found candidates for the columns displayed by the status command in the debugger.

My guess is that the "run_state" refers to the STATUS register and "debug_state" refers to the DEBUGSTATUS register. I've no idea what "info" could refer to.

However, this does not help me much. The value 0x4000000b in the STATUS register says:

- The core is active
- All interrupts are enabled
- The WAND bit is set (which has something to do with barriers but is marked as LABS which should be regarded as "experimental")
- Bit 31 is also set for core 0 but that is reserved so I've no idea what this means.
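For anyone repeating this exercise, a small helper that lists which bits are set in a register value makes it easier to compare against the register tables. This is a generic sketch, not part of any Epiphany tool: `set_bits` is a hypothetical name, bits are counted from 0 at the least significant end, and the mapping from bit position to field name still has to be looked up in the Architecture Reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the positions of all set bits of `value` in `out`, lowest
 * first (bit 0 is the least significant bit), and return how many
 * were found. `out` must have room for 32 entries. */
int set_bits(uint32_t value, int out[32])
{
    int count = 0;

    assert(out != NULL);
    for (int bit = 0; bit < 32; bit++) {
        if (value & (UINT32_C(1) << bit))
            out[count++] = bit;
    }
    return count;
}
```

Calling `set_bits(0x4000000b, bits)` returns 4 and fills in positions 0, 1, 3 and 30 when counting from zero.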

The value of the debug_state seems to indicate that everything is fine.

So interesting but not useful.


Cores stall (or crash) on e_dma_wait

