JuliaCon 2020 (times are in UTC)

Parallel face recognition algorithms using Julia + CUDAnative.jl

Three important algorithms for face recognition are implemented and tested in this work: Principal Component Analysis (PCA), also known as eigenfaces; Histogram of Oriented Gradients (HOG); and Convolutional Neural Networks (CNN). All three are computationally demanding.
To obtain performance improvements, parallel algorithms were designed using Julia 1.1.0 and graphics processing units through CUDAnative.jl, which provides straightforward development and scalability. This work is supported by PAPIIT-IA104720.


Biometric pattern recognition is very important in industry, particularly for security and authentication purposes. Face recognition is commonly used for device unlocking and access control, and it appears in many Internet of Things (IoT) applications. Computer scientists work on increasing the effectiveness of the three most representative algorithms for face recognition: PCA, HOG, and CNN. Parallel implementations of these algorithms using a high-level programming language and graphics accelerators help expose the advantages and disadvantages of each algorithm with respect to accuracy. The CUDAnative.jl wrapper facilitates the design of CUDA kernels, optimizes matrix calculations through the creation of CuArray objects (from CuArrays.jl), and hides the low-level programming required when using pure CUDA.
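As a minimal sketch of how CUDAnative.jl hides CUDA's low-level details, the kernel below converts an RGB image to grayscale, a typical preprocessing step before PCA or HOG. The function name, data, and luminance weights are illustrative assumptions, not code from this work; it follows the CUDAnative.jl/CuArrays.jl API contemporary with Julia 1.1.

```julia
using CUDAnative, CuArrays

# Each GPU thread handles one pixel; the index arithmetic is the only
# CUDA-specific code the programmer writes.
function gray_kernel!(gray, r, g, b)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(gray)
        # Standard luminance weights (illustrative choice)
        @inbounds gray[i] = 0.299f0 * r[i] + 0.587f0 * g[i] + 0.114f0 * b[i]
    end
    return nothing
end

n = 1024 * 1024                              # pixels of a hypothetical image
r, g, b = (CuArrays.rand(Float32, n) for _ in 1:3)
gray = CuArrays.zeros(Float32, n)

threads = 256
blocks = cld(n, threads)
@cuda threads=threads blocks=blocks gray_kernel!(gray, r, g, b)
```

The `@cuda` macro compiles the Julia function to PTX and launches it, so no separate CUDA C source or explicit memory management is needed.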

The main disadvantage these algorithms share is slow execution: PCA carries out a brute-force strategy that compares a probe image against the complete set of training images, and the CNN algorithm is similarly slow during network training. Parallelizing these algorithms is therefore an option worth considering.

Several image databases, including the "Extended Yale Face Database", "The AR Face Database", the "Face94 Face Database", and the "Labeled Faces in the Wild" dataset, were used to confirm that none of the algorithms mentioned above overfits. Evaluations were also performed to identify the algorithm with the greatest classification accuracy.

The advantages offered by Julia combined with the CUDAnative.jl wrapper are significant: the wrapper facilitates the development of parallel algorithms that execute on graphics processing units, speeding up their execution without requiring detailed knowledge of CUDA's low-level programming paradigm.
We would like to thank the financial support of PAPIIT-IA104720.