Practical crypto on Epiphany?

Hello,
Just wondering if anyone already has numbers for some basic cryptography on the Epiphany, to compare with.
After a couple of days with my Parallella, I have AES-256 in CTR mode and ChaCha20 running (both are counter-based and fully parallel), but the speed is not tremendous. AES runs at around 45 MB/s (including DMA'ing the data back into ARM memory, which is a bottleneck even after trying to overlap the DMAs with the computation) vs. 16-18 MB/s for OpenSSL on the A9. ChaCha20 barely matches the A9 (around 35 MB/s, including DMA). Has anyone done similar work, and what would be the state of the art on the Epiphany?
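To illustrate what "counter-based and fully parallel" means in practice, here is a minimal sketch in C of how the block counter range could be split across cores so that each core generates its own slice of the keystream independently. The names `ctr_slice` and `ctr_slice_for_core` are hypothetical, and the even split is an assumption for illustration, not necessarily the scheme used for the numbers above.

```c
#include <stdint.h>

/* Hypothetical helper type: the slice of the keystream owned by one core. */
typedef struct {
    uint64_t first_block;  /* counter value of the first block in the slice */
    uint64_t num_blocks;   /* number of consecutive blocks in the slice */
} ctr_slice;

/* Split total_blocks as evenly as possible over num_cores cores.
 * The first (total_blocks % num_cores) cores get one extra block,
 * so the slices are contiguous and cover the whole range exactly once. */
static ctr_slice ctr_slice_for_core(uint64_t total_blocks,
                                    unsigned core, unsigned num_cores)
{
    uint64_t base  = total_blocks / num_cores;
    uint64_t extra = total_blocks % num_cores;
    ctr_slice s;
    s.num_blocks  = base + (core < extra ? 1 : 0);
    s.first_block = base * core + (core < extra ? core : extra);
    return s;
}
```

Each core would then add `first_block` to the initial counter and encrypt `num_blocks` consecutive counter values; no core needs anything from its neighbours, which is why the DMA back to the ARM side ends up being the limiting factor rather than inter-core traffic.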
Incidentally, the fast SRAM seems to work wonders for AES: just use the full four 32-bit forward tables. OTOH, I think the compute-heavy ChaCha20 doesn't like the single-issue pipeline. Also, not having a rotate instruction hurts.
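To make the rotation cost concrete, here is a minimal sketch in C of the ChaCha quarter round as specified in RFC 7539, with the rotation written out as two shifts and an OR, which is what it comes down to when there is no rotate instruction. The helper names `rotl32` and `chacha_quarter_round` are just illustrative, and this is portable C rather than tuned Epiphany code.

```c
#include <stdint.h>

/* Rotate left by n bits (0 < n < 32), built from two shifts and an OR. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* ChaCha quarter round (RFC 7539, section 2.1): per call,
 * 4 additions, 4 XORs and 4 rotations on 32-bit words. */
static inline void chacha_quarter_round(uint32_t *a, uint32_t *b,
                                        uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

ChaCha20 runs 80 quarter rounds per 64-byte block, so that is 320 rotations per block. If each one costs three integer operations instead of one, a quarter round goes from 12 operations to about 20, all of them serialized on the integer pipeline.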