A Survey on Techniques for Managing GPU Register File

Accepted in IEEE TPDS 2016
Part of the abstract: To support their massively multithreaded architecture, GPUs use a very large register file (RF), which has a capacity higher than even the L1 and L2 caches. In stark contrast, traditional CPUs use a tiny RF and much larger caches to optimize latency. Due to these differences, along with the crucial impact of the RF in determining GPU performance, novel and intelligent techniques are required for managing the GPU RF.
This paper surveys techniques related to the performance, energy, and reliability aspects of the GPU RF. It also discusses techniques that propose the use of emerging memories (e.g., domain wall memory, STT-RAM, eDRAM) for designing the GPU RF. Further, it summarizes the trend in RF size in recent GPUs and shows the contribution of the RF to total GPU power consumption.