YOLOv8-DeepSORT: A High-Performance Framework for Real-Time Multi-Object Tracking with Attention and Adaptive Optimization
DOI: https://doi.org/10.70882/josrar.2025.v2i2.50
Keywords: YOLOv8, DeepSORT, Object Tracking, MOTA, Real-time Performance, Computer Vision, Deep Learning, Multi-object Tracking
Abstract
The integration of YOLOv8 and DeepSORT has significantly advanced real-time multi-object tracking in computer vision, delivering a robust solution for dynamic video analysis. This study comprehensively evaluates the YOLOv8-DeepSORT pipeline, combining YOLOv8's high-accuracy detection capabilities with DeepSORT's efficient identity association to achieve precise and consistent tracking. Key contributions include domain-specific fine-tuning of YOLOv8, optimization through model pruning and quantization, and seamless integration with DeepSORT's deep appearance descriptors and Kalman filtering. The system was rigorously tested on the MOT20 benchmark, achieving a Multiple Object Tracking Accuracy (MOTA) of 78.2%, precision of 83.5%, recall of 81.0%, and a mean Intersection over Union (IoU) of 0.74, demonstrating strong detection and tracking performance. The framework preserved identities reliably across frames, with only 19 identity (ID) switches and a false positive rate (FPR) of 4.8%. Real-time deployment on an NVIDIA GTX 1660 Ti achieved 28.6 frames per second (FPS), confirming the system's suitability for latency-sensitive applications. The study highlights practical implementations in traffic monitoring, industrial automation, retail analytics, and surveillance, showcasing the pipeline's adaptability to diverse scenarios. Challenges such as computational efficiency for edge deployment, occlusion handling in crowded environments, and ethical considerations in surveillance applications are critically analyzed. Optimization techniques, including adaptive tracking and multimodal integration, are proposed to address current limitations. By synthesizing experimental results and real-world case studies, this work provides a detailed assessment of the YOLOv8-DeepSORT framework, emphasizing its balance of accuracy, speed, and scalability. The findings serve as a valuable reference for researchers and practitioners aiming to deploy efficient object tracking systems in resource-constrained environments.
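As a concrete illustration of the detect-then-track loop the abstract describes, the following is a minimal sketch of a YOLOv8-DeepSORT pipeline. It assumes the open-source `ultralytics` and `deep-sort-realtime` Python packages; the weights file, video path, and tracker settings are illustrative placeholders, not the authors' exact configuration.

```python
# Minimal YOLOv8 + DeepSORT sketch. Assumes `pip install ultralytics
# deep-sort-realtime opencv-python`; parameters are illustrative.
import cv2
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

detector = YOLO("yolov8n.pt")   # any fine-tuned YOLOv8 checkpoint works here
tracker = DeepSort(max_age=30)  # Kalman filter + appearance-based association

cap = cv2.VideoCapture("input.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Detection: one YOLOv8 forward pass per frame.
    result = detector(frame, verbose=False)[0]
    detections = []
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        # DeepSORT expects ([left, top, width, height], confidence, class).
        detections.append(([x1, y1, x2 - x1, y2 - y1],
                           float(box.conf[0]), int(box.cls[0])))
    # Association: DeepSORT links detections to persistent track IDs.
    for track in tracker.update_tracks(detections, frame=frame):
        if not track.is_confirmed():
            continue
        l, t, r, b = track.to_ltrb()
        cv2.rectangle(frame, (int(l), int(t)), (int(r), int(b)), (0, 255, 0), 2)
        cv2.putText(frame, f"ID {track.track_id}", (int(l), int(t) - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
cap.release()
```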
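The abstract does not spell out the exact pruning and quantization recipe, so the sketch below shows one common way such compression could be applied with stock PyTorch utilities: L1 unstructured pruning of convolutional weights followed by dynamic int8 quantization of linear layers. The 30% sparsity level and the checkpoint name are assumptions for illustration, not the paper's reported settings.

```python
# Illustrative compression sketch using stock PyTorch utilities; this is
# an assumed recipe, not the authors' documented pipeline.
import torch
import torch.nn.utils.prune as prune

# Hypothetical checkpoint holding a torch.nn.Module.
model = torch.load("yolov8_finetuned.pt", map_location="cpu")

# L1 unstructured pruning: zero out the 30% smallest-magnitude weights
# in every convolutional layer.
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the sparsity into the tensor

# Dynamic int8 quantization of the remaining dense (Linear) layers.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized, "yolov8_compressed.pt")
```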
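For reference, MOTA as reported here follows the standard CLEAR MOT definition, which penalizes misses (false negatives), false positives, and identity switches relative to the total number of ground-truth objects:

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t}
```

where the sums run over frames t. A MOTA of 78.2% therefore means the combined error terms amount to 21.8% of the ground-truth annotations.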
License
Copyright (c) 2025 Godfrey Perfectson Oise, Nkem Belinda Unuigbokhai, Chioma Julia Onwuzo, Onyemaechi Clement Nwabuokei, Prosper Otega Ejenarhome, Onoriode Michael Atake, Sofiat Kehinde Bakare

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial — You may not use the material for commercial purposes.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.