Abstract: Decoupled vector runahead (DVR) exploits massive amounts of memory-level parallelism to improve the performance of applications that feature indirect memory accesses by dynamically inferring ...