QSI Dynamical Fetch Policy for SMT

A simultaneous multithreading (SMT) processor can execute instructions from multiple threads in the same cycle. SMT was introduced as an extension of the superscalar architecture to increase processor throughput: it lets instructions from multiple independent applications or threads compete for limited shared resources each cycle. Because the fetch unit has been identified as one of the major bottlenecks of the SMT architecture, prior work has proposed several fetch schemes to improve fetch efficiency and overall performance. In this paper, we propose a novel fetch policy called queue situation identifier (QSI), which counts certain long-latency instructions of each thread every cycle and uses those counts to select which threads to fetch from in the next cycle. Simulation results show that, in the best case, our fetch policy achieves a 30% speedup and also reduces the level-1 data cache miss rate.
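The per-cycle counting and selection step described above can be illustrated with a minimal sketch. It is not the QSI implementation: the abstract does not state which instruction classes are counted or how the counts map to a fetch decision. The sketch assumes that loads and floating-point divides and multiplies are the counted classes, and that threads with the fewest such instructions in the queue are fetched first. The names `LONG_LATENCY_KINDS`, `QueuedInstr`, `count_long_latency`, and `select_fetch_threads` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Assumption: which instruction classes count as "long latency".
# The abstract does not enumerate them.
LONG_LATENCY_KINDS = {"load", "fdiv", "fmul"}


@dataclass
class QueuedInstr:
    """One entry in a shared instruction queue (hypothetical model)."""
    thread_id: int
    kind: str


def count_long_latency(queue: List[QueuedInstr], num_threads: int) -> List[int]:
    """Count the long-latency instructions each thread has in the queue."""
    counts = [0] * num_threads
    for instr in queue:
        if instr.kind in LONG_LATENCY_KINDS:
            counts[instr.thread_id] += 1
    return counts


def select_fetch_threads(counts: List[int], max_threads: int) -> List[int]:
    """Pick the threads to fetch from in the next cycle.

    Assumed rule: prefer threads with the fewest long-latency
    instructions; break ties by the lower thread id.
    """
    order = sorted(range(len(counts)), key=lambda t: (counts[t], t))
    return order[:max_threads]


# Example cycle: four threads, fetch from at most two of them.
queue = [
    QueuedInstr(0, "load"), QueuedInstr(0, "add"), QueuedInstr(1, "fdiv"),
    QueuedInstr(1, "load"), QueuedInstr(2, "add"), QueuedInstr(0, "load"),
    QueuedInstr(3, "load"),
]
counts = count_long_latency(queue, num_threads=4)
chosen = select_fetch_threads(counts, max_threads=2)
print(counts, chosen)
```

In this example threads 2 and 3 hold the fewest counted instructions, so the assumed rule picks them for the next fetch cycle. A real policy would read these counts from hardware counters rather than scanning the queue.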

