Performance Analysis of Digital Signal Processors Using SMV Benchmark

Unlike general-purpose processors, digital signal processors (DSP processors) are strongly application-dependent. To meet the needs of diverse applications, a wide variety of DSP processors, based on architectures ranging from the traditional to VLIW, have been introduced to the market over the years. The functionality, performance, and cost of these processors vary widely. When selecting a processor that meets the design criteria for an application, processor performance is usually the major concern for digital signal processing (DSP) application developers. Performance data are also essential for DSP processor designers seeking to improve their designs. Consequently, several DSP performance benchmarks have been proposed over the past decade or so. However, none of these benchmarks seems to include recent DSP applications. In this paper, we use a new benchmark that we recently developed to compare the performance of popular DSP processors from Texas Instruments and StarCore. The new benchmark is based on the Selectable Mode Vocoder (SMV), a speech-coding program from recent third-generation (3G) wireless voice applications. All benchmark kernels are compiled with the compilers of the respective DSP processors and run on their simulators. The weighted arithmetic mean of clock cycles and the arithmetic mean of code size are used to compare the performance of five DSP processors. In addition, we study how the performance of a processor is affected by code structure, features of the processor architecture, and compiler optimization. The extensive experimental data gathered, analyzed, and presented in this paper should help DSP processor and compiler designers meet their specific design goals.
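The two summary metrics named above can be illustrated with a minimal sketch. The kernel cycle counts, weights, and code sizes below are hypothetical placeholders, not data from this paper, and the function names are ours. The abstract does not state how the weights are chosen; the sketch assumes each weight reflects a kernel's relative importance in the overall workload.

```python
from typing import Sequence


def weighted_mean_cycles(cycles: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted arithmetic mean of clock cycles: sum(w_i * c_i) / sum(w_i)."""
    if len(cycles) != len(weights) or not cycles:
        raise ValueError("cycles and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * c for w, c in zip(weights, cycles)) / total_weight


def mean_code_size(sizes: Sequence[float]) -> float:
    """Plain (unweighted) arithmetic mean of per-kernel code sizes."""
    if not sizes:
        raise ValueError("sizes must be non-empty")
    return sum(sizes) / len(sizes)


# Hypothetical values for three kernels on one processor (not from the paper).
kernel_cycles = [1000, 4000, 2500]   # clock cycles per kernel
kernel_weights = [0.5, 0.25, 0.25]   # assumed relative importance of each kernel
kernel_sizes = [512, 1024, 1536]     # code size per kernel, in bytes

print("weighted mean cycles:", weighted_mean_cycles(kernel_cycles, kernel_weights))
print("mean code size:", mean_code_size(kernel_sizes))
```

Computing both values per processor gives one speed figure and one size figure each, so processors can be ranked on either axis; lower is better for both.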



