Advances in VLSI process technology have enabled multi-core processors that exploit application-level parallelism. Multiple applications run concurrently on these cores, and the per-core demand for shared resources such as caches, the interconnect, and main memory varies over time. Overall system performance depends on how well the shared resources of a multi-core processor, namely the last-level cache (LLC), the network-on-chip (NoC), and main memory (DRAM), are utilized. Managing shared resources in a highly parallel system is therefore one of the most fundamental challenges we face. The performance of a multi-core system depends on computation time, inter-core communication delay, and the efficiency of the on-chip memory system.
To supply data to the processor as quickly as possible, efficient cache designs along with various replacement policies have been proposed to utilize the shared last-level cache effectively. Unfortunately, most of these techniques assume a constant main-memory latency in their evaluations. In reality, the latency of retrieving data from main memory depends on factors such as congestion in the NoC and the main-memory scheduling mechanism. Scheduling decisions taken at the main memory can therefore have a significant impact on both latency and power consumption.
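The gap between a constant-latency assumption and a load-dependent one can be illustrated with a toy average-memory-access-time (AMAT) model. This is a minimal sketch, not a simulator: the hit latency, miss rate, unloaded DRAM latency, and per-request queueing delay below are all hypothetical numbers chosen for illustration, and the linear queue model is a deliberate simplification of real memory-controller behavior.

```python
# Toy model (illustrative only): AMAT under a constant-latency
# assumption versus a latency that grows with memory-controller
# queue occupancy. All parameter values are hypothetical.

def amat(hit_time, miss_rate, miss_penalty):
    """Classic AMAT formula: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

HIT_TIME = 10    # LLC hit latency in cycles (assumed)
MISS_RATE = 0.2  # LLC miss rate (assumed)
BASE_DRAM = 100  # unloaded DRAM access latency in cycles (assumed)

# Constant-latency assumption: every miss costs BASE_DRAM cycles.
constant = amat(HIT_TIME, MISS_RATE, BASE_DRAM)

def loaded_penalty(queue_depth, per_request_delay=20):
    """Congestion-aware sketch: each request queued ahead at the
    memory controller adds a fixed (hypothetical) service delay."""
    return BASE_DRAM + queue_depth * per_request_delay

for depth in (0, 4, 8):
    loaded = amat(HIT_TIME, MISS_RATE, loaded_penalty(depth))
    print(f"queue depth {depth}: constant AMAT = {constant:.1f}, "
          f"loaded AMAT = {loaded:.1f} cycles")
```

Even with these modest parameters, a queue of eight outstanding requests roughly doubles the effective AMAT relative to the unloaded case, which is exactly the effect a constant-latency evaluation hides.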