Abstract [eng]
Computer vision algorithms have been actively developed, with progress peaking during the last decade. With the growing need to transfer pre-trained models into production environments, it has become important for architects of artificial intelligence systems to assess the inference environment in order to reach the most effective model performance. In this paper, we review the image classification task, model architectures, the model serving process, and deployment software. Furthermore, we present a benchmark specification and the results of experiments performed with EfficientNet and MobileNet family models to compare three model serving systems: TensorFlow Serving, TorchServe, and Triton Inference Server. We also examine the impact of model quantization on inference time. In our experiments, Triton Inference Server performed up to 16 times faster than TorchServe. In addition, the hourly costs of cloud instances were considered when comparing the performance of TensorFlow Serving and Triton Inference Server. Lastly, we provide recommendations for efficient image classification model inference in production.