Title Utilizing deep learning models for image analysis at scale: comparison of deployment solutions
Translation of Title Giliojo mokymosi modelių panaudojimas sparčiai vaizdų analizei: produkcinių aplinkų palyginimas.
Authors Klimiato, Renata
Pages 78
Keywords [eng] image classification ; deep learning ; convolutional neural networks ; model serving ; quantization
Abstract [eng] Computer vision algorithms have been under active development, which peaked during the last decade. With the growing need to transfer pre-trained models to production environments, it becomes important for architects of artificial intelligence systems to assess the inference environment in order to achieve the most effective model performance. In this paper, we review the image classification task, architectures, the model serving process, and deployment software. Furthermore, we present a benchmark specification and the results of experiments performed with EfficientNet and MobileNet family models, with the purpose of comparing three model serving solutions: TensorFlow Serving, TorchServe, and Triton Inference Server. Additionally, the impact of model quantization on inference time in the experiments was reviewed. As a result, Triton Inference Server performed 16 times faster than TorchServe. The hourly costs of cloud instances were also reviewed when comparing the performance of models served with TensorFlow Serving and Triton Inference Server. Lastly, recommendations for efficient image classification model inference in production are provided.
Dissertation Institution Kauno technologijos universitetas.
Type Master's thesis
Language English
Publication date 2022