YOLOv8-DeepSORT: A High-Performance Framework for Real-Time Multi-Object Tracking with Attention and Adaptive Optimization
DOI: https://doi.org/10.70882/josrar.2025.v2i2.50
Keywords: YOLOv8, DeepSORT, Object Tracking, MOTA, Real-time Performance, Computer Vision, Deep Learning, Multi-object Tracking
Abstract
The integration of YOLOv8 and DeepSORT has significantly advanced real-time multi-object tracking in computer vision, delivering a robust solution for dynamic video analysis. This study comprehensively evaluates the YOLOv8-DeepSORT pipeline, combining YOLOv8's high-accuracy detection capabilities with DeepSORT's efficient identity association to achieve precise and consistent tracking. Key contributions include domain-specific fine-tuning of YOLOv8, optimization through model pruning and quantization, and seamless integration with DeepSORT's deep appearance descriptors and Kalman filtering. The system was rigorously tested on the MOT20 benchmark, achieving a Multiple Object Tracking Accuracy (MOTA) of 78.2%, precision of 83.5%, recall of 81.0%, and a mean Intersection over Union (IoU) of 0.74, demonstrating strong detection and tracking performance. The framework preserved identities reliably across frames, with only 19 identity (ID) switches and a false positive rate (FPR) of 4.8%. Real-time deployment on an NVIDIA GTX 1660 Ti achieved 28.6 frames per second (FPS), confirming the system's suitability for latency-sensitive applications. The study highlights practical implementations in traffic monitoring, industrial automation, retail analytics, and surveillance, showcasing the pipeline's adaptability to diverse scenarios. Challenges such as computational efficiency for edge deployment, occlusion handling in crowded environments, and ethical considerations in surveillance applications are critically analyzed. Optimization techniques, including adaptive tracking and multimodal integration, are proposed to address current limitations. By synthesizing experimental results and real-world case studies, this work provides a detailed assessment of the YOLOv8-DeepSORT framework, emphasizing its balance of accuracy, speed, and scalability. The findings serve as a valuable reference for researchers and practitioners aiming to deploy efficient object tracking systems in resource-constrained environments.
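As a concrete illustration of the detect-then-track loop the abstract describes, the following is a minimal sketch of a YOLOv8-DeepSORT pipeline. It assumes the open-source `ultralytics` and `deep-sort-realtime` Python packages; the weights file, video path, and tracker settings are illustrative placeholders, not the authors' exact configuration.

```python
# Minimal YOLOv8 + DeepSORT sketch. Assumes `pip install ultralytics
# deep-sort-realtime opencv-python`; parameters are illustrative.
import cv2
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

detector = YOLO("yolov8n.pt")   # any fine-tuned YOLOv8 checkpoint works here
tracker = DeepSort(max_age=30)  # Kalman filter + appearance-based association

cap = cv2.VideoCapture("input.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Detection: one YOLOv8 forward pass per frame.
    result = detector(frame, verbose=False)[0]
    detections = []
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        # DeepSORT expects ([left, top, width, height], confidence, class).
        detections.append(([x1, y1, x2 - x1, y2 - y1],
                           float(box.conf[0]), int(box.cls[0])))
    # Association: DeepSORT links detections to persistent track IDs.
    for track in tracker.update_tracks(detections, frame=frame):
        if not track.is_confirmed():
            continue
        l, t, r, b = track.to_ltrb()
        cv2.rectangle(frame, (int(l), int(t)), (int(r), int(b)), (0, 255, 0), 2)
        cv2.putText(frame, f"ID {track.track_id}", (int(l), int(t) - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
cap.release()
```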
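The abstract does not spell out the exact pruning and quantization recipe, so the sketch below shows one common way such compression could be applied with stock PyTorch utilities: L1 unstructured pruning of convolutional weights followed by dynamic int8 quantization of linear layers. The 30% sparsity level and the checkpoint name are assumptions for illustration, not the paper's reported settings.

```python
# Illustrative compression sketch using stock PyTorch utilities; this is
# an assumed recipe, not the authors' documented pipeline.
import torch
import torch.nn.utils.prune as prune

# Hypothetical checkpoint holding a torch.nn.Module.
model = torch.load("yolov8_finetuned.pt", map_location="cpu")

# L1 unstructured pruning: zero out the 30% smallest-magnitude weights
# in every convolutional layer.
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the sparsity into the tensor

# Dynamic int8 quantization of the remaining dense (Linear) layers.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized, "yolov8_compressed.pt")
```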
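For reference, MOTA as reported here follows the standard CLEAR MOT definition, which penalizes misses (false negatives), false positives, and identity switches relative to the total number of ground-truth objects:

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t}
```

where the sums run over frames t. A MOTA of 78.2% therefore means the combined error terms amount to 21.8% of the ground-truth annotations.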
License
Copyright (c) 2025 Godfrey Perfectson Oise, Nkem Belinda Unuigbokhai, Chioma Julia Onwuzo, Onyemaechi Clement Nwabuokei, Prosper Otega Ejenarhome, Onoriode Michael Atake, Sofiat Kehinde Bakare

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial — You may not use the material for commercial purposes.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.