An Integrated Temporal Partitioning and Mapping Framework for Improving Performance of a Reconfigurable Instruction Set Processor
Subject areas: Journal of Computer & Robotics
Farhad Mehdipour 1, Hamid Noori 2, Morteza Saheb Zamani 3, Hiroaki Honda 4, Koji Inoue 5, Kazuaki Murakami 6
1 - Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, Fukuoka, Japan
2 - School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
3 - Department of Computer Engineering and IT, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
4 - Institute of Systems, Information Technologies and Nanotechnologies, Fukuoka, Japan
5 - Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, Fukuoka, Japan
6 - Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, Fukuoka, Japan
Keywords: Reconfigurable instruction set processor, Custom instruction, Reconfigurable functional unit, Temporal partitioning
Abstract:
Reconfigurable instruction set processors allow customization for an application domain by extending the core instruction set architecture. Extracting appropriate custom instructions is an important phase in implementing an application on such a processor. A custom instruction (CI) is usually extracted from critical portions of an application and implemented on a reconfigurable functional unit (RFU). This paper introduces our proposed RFU architecture for a reconfigurable instruction set processor. As the main contribution of this work, it presents an integrated temporal partitioning and mapping framework that partitions CIs and maps them onto the RFU. Temporal partitioning iterates, modifying partitions incrementally, to generate CIs. The proposed framework improves timing performance, particularly for applications comprising a considerable number of CIs that could not be implemented on the RFU due to architectural limitations. Furthermore, similarity detection and merging, two complementary techniques used within the integrated framework, reduce the configuration memory size.
References
[1] M. Arnold and H. Corporaal, Designing domain-specific processors, In Proc. of the Design, Automation and Test in Europe Conf, 61-66, 2002.
[2] K. Atasu, L. Pozzi and P. Ienne, Automatic application-specific instruction-set extensions under microarchitectural constraints, In Proc. of the Design, Automation and Test in Europe (DATE), 256-261, 2003.
[3] F. Barat, R. Lauwereins and G. Deconinck, Reconfigurable instruction set processors from a hardware/software perspective, IEEE Trans. on Software Engineering, vol. 28, no. 9, 847-861, 2002.
[4] C. Bobda, Synthesis of Dataflow Graphs for Reconfigurable Systems Using Temporal Partitioning and Temporal Placement, Ph.D. thesis, University of Paderborn, 2003.
[5] N. Clark, M. Kudlur, H. Park, S. Mahlke and K. Flautner, Application-specific processing on a general-purpose core via transparent instruction set customization, In Proc. of IEEE/ACM Int. Symp. on Microarchitecture, 30-40, 2004.
[Figure: Speedup (axis range 1 to 2.4) for bitcounts, blowfish, blowfish (dec), cjpeg, djpeg, fft, fft (inv), gsm (dec), gsm (enc), lame, rijndael (enc), rijndael (dec) and sha; series: HTTP, VTTP, Without IntegFrame]
Journal of Computer and Robotics 1 (2010) 1-11
[6] K. M. Gajjala Purna and D. Bhatia, Temporal partitioning and scheduling data flow graphs for reconfigurable computers, IEEE Trans. on Computers, vol. 48, no. 6, 579-590, 1999.
[7] R. Kastner, A. Kaplan, S. Ogrenci Memik and E. Bozorgzadeh, Instruction generation for hybrid reconfigurable systems, ACM TODAES, vol. 7, no. 4, 605-627, 2002.
[8] J. Krinke, Identifying Similar Code with Program Dependence Graphs, In Proc. 8th Working Conf. on Reverse Engineering, 301-309, 2001.
[9] F. Mehdipour, H. Noori, M. Saheb Zamani, K. Murakami, M. Sedighi and K. Inoue, An integrated temporal partitioning and mapping framework for handling custom instructions on a reconfigurable functional unit, The 11th Asia-Pacific Computer Systems Architecture Conf. (ACSAC'06), Lecture Notes in Computer Science, vol. 4186/2006, 219-230, 2006.
[10] F. Mehdipour, M. Saheb Zamani and M. Sedighi, An integrated temporal partitioning and physical design framework for static compilation of reconfigurable computing systems, Microprocessors and Microsystems, vol. 30, no. 1, 52-62, 2006.
[11] Mibench. http://www.eecs.umich.edu/mibench.
[12] G. De Micheli, Synthesis and Optimization of Digital Circuits, McGraw-Hill, 1994.
[13] H. Noori, F. Mehdipour, K. Murakami, K. Inoue and M. Saheb Zamani, An architecture framework for an adaptive extensible processor, The Journal of Supercomputing, Springer Netherlands, vol. 45, no. 3, 313-340, 2008.
[14] I. Ouaiss, S. Govindarajan, V. Srinivasan, M. Kaul and R. Vemuri, An integrated partitioning and synthesis system for dynamically reconfigurable multi-FPGA architectures, In Proc. of the Reconfigurable Architecture Workshop, 31-36, 1998.
[15] R. Razdan and M.D. Smith, A high-performance microarchitecture with hardware-programmable functional units, In Proc. of the 27th Annual Int. Symp. on Microarchitecture, 172-180, 1994.
[16] N. Sherwani, Algorithms for VLSI Physical Design Automation, Kluwer-Academic Publishers, 1991.
[17] Simplescalar. http://www.simplescalar.com.
[18] J. Spillane and H. Owen, Temporal partitioning for partially reconfigurable field programmable gate arrays, IPPS/SPDP Workshops, 37-42, 1998.
[19] C. Tanougast, Y. Berviller, P. Brunet, S. Weber and H. Rabah, Temporal partitioning methodology optimizing FPGA resources for dynamically reconfigurable embedded real-time system, Microprocessors and Microsystems, vol. 27, 115-130, 2003.
[20] E. W. Weisstein, Graph isomorphism, http://mathworld.wolfram.com/GraphIsomorphism.html.
[21] Z. A. Ye et al., Chimaera: A high-performance architecture with tightly-coupled reconfigurable functional unit, In Proc. of the 27th Annual Int. Symp. on Computer Architecture, 225-235, 2000.