Application-Specific Instruction Sets Processor with Implicit Registers to Improve Register Bandwidth

Application-Specific Instruction set Processors (ASIPs) have become an important design choice for embedded systems because they offer runtime flexibility that custom ASIC solutions cannot provide. One major bottleneck in maximizing ASIP performance is the limited data bandwidth between the General Purpose Register File (GPRF) and the Application-Specific Instructions (ASIs). This paper presents Implicit Registers (IRs) to provide the required data bandwidth. An ASI input/output model is proposed to formulate the overhead of the additional data transfers between the GPRF and the IRs, and an IR allocation algorithm is then used to improve performance by minimizing the number of extra data transfer instructions. Experimental results show a speedup of up to 3.33x compared with the results obtained without IRs.
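To make the transfer overhead concrete, the following is a minimal sketch of a toy cost model, not the ASI Input/Output model of this paper. It assumes a GPRF with two read ports and one write port, and assumes that every operand beyond those ports costs one extra GPRF-to-IR or IR-to-GPRF move instruction. The function names and port counts are hypothetical and chosen only for illustration.

```python
def extra_transfers(n_inputs, n_outputs, read_ports=2, write_ports=1):
    """Count extra move instructions for one ASI under the toy assumptions.

    Inputs beyond the GPRF read ports are assumed to be preloaded into IRs,
    and outputs beyond the GPRF write ports are assumed to be copied back
    from IRs, one value per move instruction.
    """
    extra_in = max(0, n_inputs - read_ports)
    extra_out = max(0, n_outputs - write_ports)
    return extra_in + extra_out


def asi_cost(latency, n_inputs, n_outputs, read_ports=2, write_ports=1):
    """Total cycles for one ASI: its own latency plus one cycle per extra move."""
    return latency + extra_transfers(n_inputs, n_outputs, read_ports, write_ports)


if __name__ == "__main__":
    # A hypothetical ASI with 4 inputs, 2 outputs and a 1-cycle latency.
    print(extra_transfers(4, 2))  # 2 extra input moves + 1 extra output move
    print(asi_cost(1, 4, 2))
```

Under these assumptions, an allocation that keeps a value in an IR between two consecutive ASIs would remove the corresponding move from the count, which is the kind of saving an allocation algorithm would aim for.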



