FFCNN: Fast FPGA based Acceleration for Convolution neural network inference
Abstract
We present a new efficient OpenCL-based Accelerator for large scale Convolutional Neural Networks called Fast Inference on FPGAs for Convolution Neural Network (FFCNN). FFCNN is based on a deeply pipelined OpenCL kernels architecture. As pointed out before, high-level synthesis tools such as the OpenCL framework can easily port codes originally designed for CPUs and GPUs to FPGAs, but it is still difficult to make OpenCL codes run efficiently on FPGAs. This work aims to propose an efficient FPGA implementation of OpenCL High-Performance Computing Applications. To do so, a Data reuse and task mapping techniques are also presented to improve design efficiency. In addition, the following motivations were taken into account when developing FFCNN: 1) FFCNN has been designed to be easily implemented on Intel OpenCL SDK based FPGA design flow. 2) In FFFCN, different techniques have been integrated to improve the memory band with and throughput. A performance analysis is conducted on two deep CNN for Large-Scale Images classification. The obtained results, and the comparison with other works designed to accelerate the same types of architectures, show the efficiency and the competitiveness of the proposed accelerator design by significantly improved performance and resource utilization.