FPGA Hardware Implementation and Evaluation of a Micro-Network Architecture for Multi-Core Systems

This paper presents the design, implementation and evaluation of a micro-network, or Network-on-Chip (NoC), based on a generic pipeline router architecture. The router is designed to efficiently support traffic generated by multimedia applications on embedded multi-core systems. It employs a simplest routing mechanism and implements the round-robin scheduling strategy to resolve output port contentions and minimize latency. A virtual channel flow control is applied to avoid the head-of-line blocking problem and enhance performance in the NoC. The hardware design of the router architecture has been implemented at the register transfer level; its functionality is evaluated in the case of the two dimensional Mesh/Torus topology, and performance results are derived from ModelSim simulator and Xilinx ISE 9.2i synthesis tool. An example of a multi-core image processing system utilizing the NoC structure has been implemented and validated to demonstrate the capability of the proposed micro-network architecture. To reduce complexity of the image compression and decompression architecture, the system use image processing algorithm based on classical discrete cosine transform with an efficient zonal processing approach. The experimental results have confirmed that both the proposed image compression scheme and NoC architecture can achieve a reasonable image quality with lower processing time.

Flexible Wormhole-Switched Network-on-chip with Two-Level Priority Data Delivery Service

A synchronous network-on-chip using wormhole packet switching and supporting guaranteed-completion best-effort with low-priority (LP) and high-priority (HP) wormhole packet delivery service is presented in this paper. Both our proposed LP and HP message services deliver a good quality of service in term of lossless packet completion and in-order message data delivery. However, the LP message service does not guarantee minimal completion bound. The HP packets will absolutely use 100% bandwidth of their reserved links if the HP packets are injected from the source node with maximum injection. Hence, the service are suitable for small size messages (less than hundred bytes). Otherwise the other HP and LP messages, which require also the links, will experience relatively high latency depending on the size of the HP message. The LP packets are routed using a minimal adaptive routing, while the HP packets are routed using a non-minimal adaptive routing algorithm. Therefore, an additional 3-bit field, identifying the packet type, is introduced in their packet headers to classify and to determine the type of service committed to the packet. Our NoC prototypes have been also synthesized using a 180-nm CMOS standard-cell technology to evaluate the cost of implementing the combination of both services.

Pipelined Control-Path Effects on Area and Performance of a Wormhole-Switched Network-on-Chip

This paper presents design trade-off and performance impacts of the amount of pipeline phase of control path signals in a wormhole-switched network-on-chip (NoC). The numbers of the pipeline phase of the control path vary between two- and one-cycle pipeline phase. The control paths consist of the routing request paths for output selection and the arbitration paths for input selection. Data communications between on-chip routers are implemented synchronously and for quality of service, the inter-router data transports are controlled by using a link-level congestion control to avoid lose of data because of an overflow. The trade-off between the area (logic cell area) and the performance (bandwidth gain) of two proposed NoC router microarchitectures are presented in this paper. The performance evaluation is made by using a traffic scenario with different number of workloads under 2D mesh NoC topology using a static routing algorithm. By using a 130-nm CMOS standard-cell technology, our NoC routers can be clocked at 1 GHz, resulting in a high speed network link and high router bandwidth capacity of about 320 Gbit/s. Based on our experiments, the amount of control path pipeline stages gives more significant impact on the NoC performance than the impact on the logic area of the NoC router.