Object Detection and Semantic Segmentation

Object Detection Model

Pallets and ground are currently detected with YOLOv11. I chose this model for its balance between accuracy and efficiency: with a comparatively small parameter count it still achieves a strong mAP (mean Average Precision) score, which makes it well suited to edge computing devices such as Jetson Orin boards. Its architecture is also comparatively easy to optimize and quantize, which is critical in resource-constrained environments.
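To make the optimization and quantization point concrete, here is a minimal sketch of how trained weights might be exported to a TensorRT engine on a Jetson board with the Ultralytics Python API. It is an illustration under assumptions, not the export path actually used in this project: `format="engine"`, `half`, `int8`, and `data` are export arguments as I understand the Ultralytics API, while `best.pt` and `pallets.yaml` are hypothetical file names.

```python
def export_kwargs(precision="fp16", data_yaml=None, imgsz=640):
    """Build keyword arguments for an Ultralytics TensorRT export.

    precision: "fp16" (half precision) or "int8" (quantized).
    data_yaml: dataset config used for INT8 calibration (hypothetical path).
    """
    kwargs = {"format": "engine", "imgsz": imgsz}
    if precision == "fp16":
        kwargs["half"] = True
    elif precision == "int8":
        if data_yaml is None:
            raise ValueError("INT8 export needs a dataset YAML for calibration")
        kwargs["int8"] = True
        kwargs["data"] = data_yaml
    else:
        raise ValueError(f"unsupported precision: {precision!r}")
    return kwargs


def export_for_jetson(weights_path, precision="fp16", data_yaml=None):
    """Export trained weights to a TensorRT engine (run this on the Jetson itself)."""
    # Imported lazily so the helper above can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    return model.export(**export_kwargs(precision, data_yaml))


if __name__ == "__main__":
    # Assumed file names, for illustration only.
    print(export_kwargs("fp16"))
    print(export_kwargs("int8", data_yaml="pallets.yaml"))
    # export_for_jetson("best.pt", precision="int8", data_yaml="pallets.yaml")
```

FP16 is usually the lower-risk first step, since it needs no calibration data. INT8 can reduce latency further, but accuracy should be re-validated after export.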

Model Selection

YOLOv11 is available in multiple sizes. Given the constraints of edge deployment, I selected the small and medium variants, as they offer a good trade-off between accuracy and computational cost.

Below are the training and validation results for both model sizes:

1. YOLOv11-Small:

Figure 1: YOLOv11-Small training results (100 epochs)
Figure 2: YOLOv11-Small normalized training confusion matrix
Figure 3: YOLOv11-Small normalized validation confusion matrix

2. YOLOv11-Medium:

Figure 4: YOLOv11-Medium training results (100 epochs)
Figure 5: YOLOv11-Medium normalized training confusion matrix
Figure 6: YOLOv11-Medium normalized validation confusion matrix
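For readers unfamiliar with normalized confusion matrices, the sketch below shows one common way to compute them from raw counts. It assumes rows are predicted classes, columns are true classes, and each column is scaled to sum to 1, which is my understanding of how Ultralytics plots them. The counts are made up for illustration and are not results from these models.

```python
def normalize_columns(matrix):
    """Divide each column by its sum so every true class sums to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_sums = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [matrix[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Class names follow the document; "background" covers missed and spurious detections.
CLASSES = ["pallet", "ground", "background"]

# Illustrative counts only (rows = predicted, columns = true).
counts = [
    [90, 2, 5],   # predicted pallet
    [3, 95, 5],   # predicted ground
    [7, 3, 0],    # predicted background (missed objects)
]

normalized = normalize_columns(counts)

for name, row in zip(CLASSES, normalized):
    print(f"{name:>10}: " + "  ".join(f"{value:.2f}" for value in row))
```

Under this convention, the diagonal entry of each column is the fraction of that true class that was detected correctly, and the last row shows the fraction that was missed.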

Note: The results shown above are for the YOLOv11 small and medium models trained for 100 epochs. For a complete analysis, including the 50-epoch runs, refer to the full results.
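The four runs (two model sizes, two epoch counts) could be reproduced with something like the following sketch, which uses the Ultralytics Python API. The dataset file `pallets.yaml` and the 640-pixel image size are assumptions, since the actual training configuration is not given here. `yolo11s.pt` and `yolo11m.pt` are the pretrained weight names published by Ultralytics.

```python
from itertools import product

# Pretrained checkpoints for the two selected variants.
WEIGHTS = {"s": "yolo11s.pt", "m": "yolo11m.pt"}
EPOCHS = (50, 100)


def build_runs(data_yaml, imgsz=640):
    """Return one training configuration per (model size, epoch count) pair."""
    runs = []
    for (size, weights), epochs in product(WEIGHTS.items(), EPOCHS):
        runs.append(
            {
                "model": weights,
                "data": data_yaml,
                "epochs": epochs,
                "imgsz": imgsz,  # assumed image size
                "name": f"YOLOv11{size}-{epochs}ep",
            }
        )
    return runs


def train_all(data_yaml):
    """Train every configuration in sequence (requires ultralytics and a GPU)."""
    # Imported lazily so build_runs works without ultralytics installed.
    from ultralytics import YOLO

    for run in build_runs(data_yaml):
        model = YOLO(run["model"])
        model.train(
            data=run["data"],
            epochs=run["epochs"],
            imgsz=run["imgsz"],
            name=run["name"],
        )


if __name__ == "__main__":
    # "pallets.yaml" is a hypothetical dataset config listing the pallet and ground classes.
    for run in build_runs("pallets.yaml"):
        print(run["name"], run["model"], run["epochs"])
    # train_all("pallets.yaml")
```

The run names are chosen to mirror the labels of the downloadable result archives.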

Download the complete results: (YOLOv11s-50ep) (YOLOv11s-100ep) (YOLOv11m-50ep) (YOLOv11m-100ep)