diff --git a/docs/doc/assets/yolo26_out.png b/docs/doc/assets/yolo26_out.png new file mode 100644 index 00000000..b0ab2a63 Binary files /dev/null and b/docs/doc/assets/yolo26_out.png differ diff --git a/docs/doc/en/vision/customize_model_yolov8.md b/docs/doc/en/vision/customize_model_yolov8.md index 2cbd0ce2..abd91185 100644 --- a/docs/doc/en/vision/customize_model_yolov8.md +++ b/docs/doc/en/vision/customize_model_yolov8.md @@ -1,10 +1,10 @@ --- -title: Offline Training for YOLO11/YOLOv8 Models on MaixCAM MaixPy to Customize Object and Keypoint Detection +title: Offline Training YOLO11/YOLOv8/YOLO26 Models for MaixCAM with MaixPy:Custom Object Detection & Keypoint Detection update: - date: 2024-06-21 version: v1.0 author: neucrack - content: Document creation + content: Initial documentation - date: 2024-10-10 version: v2.0 author: neucrack @@ -12,48 +12,45 @@ update: - date: 2025-07-01 version: v3.0 author: neucrack - content: Add MaixCAM2 support + content: Added MaixCAM2 support + - date: 2026-02-04 + version: v4.0 + author: Tao + content: Added YOLO26 support --- ## Introduction +The official default model supports detection of 80 object classes. If this does not meet your requirements, you can train your own custom object detector by setting up a training environment on your local computer or server. -The default official model provides detection for 80 different objects. If this doesn't meet your needs, you can train your own model to detect custom objects, which can be done on your own computer or server by setting up a training environment. +YOLOv8 / YOLO11 not only support object detection, but also `yolov8-pose` / `YOLO11-pose` for keypoint detection. In addition to the official human keypoints, you can create your own custom keypoint datasets to train detection for specific objects and keypoints. -YOLOv8 / YOLO11 not only supports object detection but also supports keypoint detection with YOLOv8-pose / YOLO11-pose. Apart from the official human keypoints, you can also create your own keypoint dataset to train models for detecting specific objects and keypoints. +Since YOLOv8 and YOLO11 mainly differ in their internal network architecture while sharing the same preprocessing and postprocessing steps, the training and conversion procedures are identical—only the output node names differ. -Since YOLOv8 and YOLO11 mainly modify the internal network while the preprocessing and post-processing remain the same, the training and conversion steps for YOLOv8 and YOLO11 are identical, except for the output node names. +**Note:** This document covers custom training, but assumes you already possess some foundational knowledge. If not, please study independently: +* This document does not cover training environment setup. Please search for and install a PyTorch environment on your own. +* This document does not explain basic machine learning concepts or fundamental Linux usage. -**Note:** This article explains how to train a custom model but assumes some basic knowledge. If you do not have this background, please learn it independently: -* This article will not cover how to set up the training environment; please search for how to install and test a PyTorch environment. -* This article will not cover basic machine learning concepts or Linux-related knowledge. +If you find areas for improvement in this document, feel free to click `Edit this document` in the top-right corner to contribute and submit a documentation PR. 
-If you think there are parts of this article that need improvement, please click on `Edit this article` at the top right and submit a PR to contribute to the documentation. - -## Process and Article Goal - -To ensure our model can be used on MaixPy (MaixCAM), it must go through the following steps: -* Set up the training environment (not covered in this article, please search for how to set up a PyTorch training environment). -* Clone the [YOLO11/YOLOv8](https://github.com/ultralytics/ultralytics) source code locally. -* Prepare the dataset and format it according to the YOLO11 / YOLOv8 project requirements. -* Train the model to obtain an `onnx` model file, which is the final output of this article. -* Convert the `onnx` model into a `MUD` file supported by MaixPy, as described in the [MaixCAM Model Conversion](../ai_model_converter/maixcam.md) article. -* Use MaixPy to load and run the model. +## Workflow & Document Objectives +To use your model with MaixPy (MaixCAM), follow this process: +* Set up a training environment (omitted in this document; search for PyTorch environment setup guides). +* Clone the [YOLO11/YOLOv8/YOLO26](https://github.com/ultralytics/ultralytics) source code locally. +* Prepare a dataset formatted for YOLO11 / YOLOv8 / YOLO26. +* Train the model and export an `onnx` model file (the final output of this document). +* Convert the `onnx` model to a MaixPy-compatible `MUD` file, as detailed in [MaixCAM Model Conversion](../ai_model_converter/maixcam.md). +* Load and run the model with MaixPy. ## Where to Find Datasets for Training +See [Where to Find Datasets](../pro/datasets.md). -Please refer to [Where to find datasets](../pro/datasets.md) - - -## Reference Articles - -Since this process is quite general, this article only provides an overview. For specific details, please refer to the **[YOLO11 / YOLOv8 official code and documentation](https://github.com/ultralytics/ultralytics)** (**recommended**) and search for training tutorials to eventually export an ONNX file. +## Reference Materials +Since these are general procedures, this document only provides a workflow overview. For specific details, refer to the **[YOLO26 / YOLO11 / YOLOv8 official code and documentation](https://github.com/ultralytics/ultralytics)** (**recommended**) and search for training tutorials. The final goal is to export an ONNX file. -If you come across good articles, feel free to edit this one and submit a PR. - -## Exporting YOLO11 / YOLOv8 ONNX Models +If you find helpful articles, please modify this document and submit a PR. +## Exporting ONNX Models for YOLO26 / YOLO11 / YOLOv8 Create an `export_onnx.py` file in the `ultralytics` directory: - ```python from ultralytics import YOLO import sys @@ -69,47 +66,289 @@ model = YOLO(net_name) # load an official model # Predict with the model results = model("https://ultralytics.com/images/bus.jpg") # predict on an image -path = model.export(format="onnx", imgsz=[input_height, input_width]) # export the model to ONNX format +path = model.export(format="onnx", imgsz=[input_height, input_width], dynamic=False, simplify=True, opset=17) # export the model to ONNX format print(path) ``` -Then run `python export_onnx.py yolov8n.pt 320 224` to export the `onnx` model. Here, the input resolution is redefined; the model was originally trained with `640x640`, but we specify a different resolution to improve runtime speed. The reason for using `320x224` is that it closely matches the MaixCAM screen aspect ratio, making display easier. 
For MaixCAM2, you can use `640x480` or `320x240` — feel free to set it according to your specific needs. +Run `python export_onnx.py yolov8n.pt 320 224` to export the ONNX model. Here we re-specify the input resolution—models are trained at `640x640`, and we redefine it to improve runtime speed. `320x224` is chosen for its aspect ratio similarity to MaixCAM's screen for easier display. For MaixCAM2, use `640x480` or `320x240`, or set as needed for your application. +## Converting to MaixCAM-Compatible Models & MUD Files +As of 2026.2.4, MaixPy/MaixCDK supports YOLO26 / YOLOv8 / YOLO11 detection, YOLOv8-pose / YOLO11-pose keypoint detection, YOLOv8-seg / YOLO11-seg segmentation, and YOLOv8-obb / YOLO11-obb oriented bounding box detection. +Convert models following [MaixCAM Model Conversion](../ai_model_converter/maixcam.md) and [MaixCAM2 Model Conversion](../ai_model_converter/maixcam2.md). -## Convert to MaixCAM Supported Models and mud Files +### Output Node Selection +Note the output node selection (**values may vary for your model; locate matching nodes in the diagrams below**): -MaixPy/MaixCDK currently supports YOLOv8 / YOLO11 detection, YOLOv8-pose / YOLO11-pose keypoint detection, and YOLOv8-seg / YOLO11-seg segmentation models (2024.10.10). +For YOLO11 / YOLOv8, MaixPy supports two node selection schemes, chosen based on the hardware platform: -Convert the models according to [MaixCAM Model Conversion](../ai_model_converter/maixcam.md) and [MaixCAM2 Model Conversion](../ai_model_converter/maixcam2.md). - -### Select output nodes +| Model & Features | Scheme 1 | Scheme 2 | +| -- | --- | --- | +| Target Devices | **MaixCAM2** (recommended)
MaixCAM (slightly slower than Scheme 2) | **MaixCAM** (recommended) |
+| Features | More of the postprocessing runs on the CPU; quantization is more stable, but it is slightly slower than Scheme 2 | More of the postprocessing runs on the NPU and is included in the quantized model |
+| Notes | None | In practice, quantization fails on MaixCAM2 |
+| YOLOv8 Detection |`/model.22/Concat_1_output_0`<br>
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`| `/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_output_0` | +| YOLO11 Detection |`/model.23/Concat_output_0`
`/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0` | `/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_output_0` | +| YOLOv8-pose Keypoint | `/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`
`/model.22/Concat_output_0`| `/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_output_0`
`/model.22/Concat_output_0` | +| YOLO11-pose Keypoint | `/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0`
`/model.23/Concat_3_output_0`
`/model.23/Concat_output_0` | `/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_output_0`
`/model.23/Concat_output_0`| +| YOLOv8-seg Segmentation |`/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`
`/model.22/Concat_output_0`
`output1`| `/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_output_0`
`/model.22/Concat_output_0`
`output1`| +| YOLO11-seg Segmentation |`/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0`
`/model.23/Concat_3_output_0`
`/model.23/Concat_output_0`
`output1`|`/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_output_0`
`/model.23/Concat_output_0`
`output1`| +| YOLOv8-obb Oriented BBox |`/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`
`/model.22/Concat_output_0`|`/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_1_output_0`
`/model.22/Sigmoid_output_0`| +| YOLO11-obb Oriented BBox |`/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0`
`/model.23/Concat_3_output_0`
`/model.23/Concat_output_0`|`/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_1_output_0`
`/model.23/Sigmoid_output_0`| +|YOLOv8/YOLO11 Detection Output Nodes| ![](../../assets/yolo11_detect_nodes.png) | ![](../../assets/yolov8_out.jpg)| +|YOLOv8/YOLO11 Pose Extra Output Node | ![](../../assets/yolo11_pose_node.png) | See pose branch above | +|YOLOv8/YOLO11 Seg Extra Output Node | ![](../../assets/yolo11_seg_node.png) | ![](../../assets/yolo11_seg_node.png)| +|YOLOv8/YOLO11 OBB Extra Output Node | ![](../../assets/yolo11_obb_node.png) | ![](../../assets/yolo11_out_obb.jpg)| + +| Model | Node Names | Node Diagram | +| -- | --- | --- | +| YOLO26 Detection | `/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv_output_0 /model.23/one2one_cv2.1/one2one_cv2.1.2/Conv_output_0 /model.23/one2one_cv2.2/one2one_cv2.2.2/Conv_output_0 /model.23/one2one_cv3.0/one2one_cv3.0.2/Conv_output_0 /model.23/one2one_cv3.1/one2one_cv3.1.2/Conv_output_0 /model.23/one2one_cv3.2/one2one_cv3.2.2/Conv_output_0` |![](../../assets/yolo26_out.png) | + +### Conversion Scripts +*One-click YOLO26 conversion script (run in container):* + +`MaixCam/Pro:` +```bash +#!/bin/bash + +set -e + +net_name=yolo26n +input_w=320 +input_h=224 + +# mean: 0, 0, 0 +# std: 255, 255, 255 + +# mean +# 1/std + +# mean: 0, 0, 0 +# scale: 0.00392156862745098, 0.00392156862745098, 0.00392156862745098 + +# convert to mlir +model_transform.py \ +--model_name ${net_name} \ +--model_def ./${net_name}.onnx \ +--input_shapes [[1,3,${input_h},${input_w}]] \ +--mean "0,0,0" \ +--scale "0.00392156862745098,0.00392156862745098,0.00392156862745098" \ +--keep_aspect_ratio \ +--pixel_format rgb \ +--channel_format nchw \ +--output_names "/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv_output_0,/model.23/one2one_cv2.1/one2one_cv2.1.2/Conv_output_0,/model.23/one2one_cv2.2/one2one_cv2.2.2/Conv_output_0,/model.23/one2one_cv3.0/one2one_cv3.0.2/Conv_output_0,/model.23/one2one_cv3.1/one2one_cv3.1.2/Conv_output_0,/model.23/one2one_cv3.2/one2one_cv3.2.2/Conv_output_0" \ +--test_input ./image.jpg \ +--test_result ${net_name}_top_outputs.npz \ +--tolerance 0.99,0.99 \ +--mlir ${net_name}.mlir + +echo "calibrate for int8 model" +# export int8 model +run_calibration.py ${net_name}.mlir \ +--dataset ./coco \ +--input_num 200 \ +-o ${net_name}_cali_table + +echo "convert to int8 model" +# export int8 model +# add --quant_input, use int8 for faster processing in maix.nn.NN.forward_image +model_deploy.py \ +--mlir ${net_name}.mlir \ +--quantize INT8 \ +--quant_input \ +--calibration_table ${net_name}_cali_table \ +--processor cv181x \ +--test_input ${net_name}_in_f32.npz \ +--test_reference ${net_name}_top_outputs.npz \ +--tolerance 0.9,0.6 \ +--model ${net_name}_int8.cvimodel +``` -Note the selection of model output nodes (**Note that your model values may not be exactly the same, just find the corresponding nodes according to the pictures below**): +`MaixCam2:` +```bash +#!/bin/bash + +set -e + +############# Modify #################### +model_name=$1 +model_path=./${model_name}.onnx +images_dir=./coco +images_num=100 +input_names=images + +config_path=yolo26_build_config.json + +output_nodes=( + "/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv_output_0" # bbox 80x80 + "/model.23/one2one_cv2.1/one2one_cv2.1.2/Conv_output_0" # bbox 40x40 + "/model.23/one2one_cv2.2/one2one_cv2.2.2/Conv_output_0" # bbox 20x20 + "/model.23/one2one_cv3.0/one2one_cv3.0.2/Conv_output_0" # cls 80x80 + "/model.23/one2one_cv3.1/one2one_cv3.1.2/Conv_output_0" # cls 40x40 + "/model.23/one2one_cv3.2/one2one_cv3.2.2/Conv_output_0" # cls 20x20 +) +############################################# +# Parse node 
configuration +onnx_output_names="" +json_outputs="" + +for node in "${output_nodes[@]}"; do + # Build output parameters for extract_onnx + if [ -n "$onnx_output_names" ]; then + onnx_output_names="${onnx_output_names}," + fi + onnx_output_names="${onnx_output_names}${node}" + + # Build JSON output_processors + json_outputs="${json_outputs} + { + \"tensor_name\": \"${node}\", + \"dst_perm\": [0, 2, 3, 1] + }," +done + +# Remove trailing comma +json_outputs="${json_outputs%,}" + +# Generate JSON configuration file +cat > $config_path << EOF +{ + "model_type": "ONNX", + "npu_mode": "NPU1", + "quant": { + "input_configs": [ + { + "tensor_name": "${input_names}", + "calibration_dataset": "tmp_images/images.tar", + "calibration_size": ${images_num}, + "calibration_mean": [0, 0, 0], + "calibration_std": [255, 255, 255] + } + ], + "calibration_method": "MinMax", + "precision_analysis": true + }, + "input_processors": [ + { + "tensor_name": "${input_names}", + "tensor_format": "RGB", + "tensor_layout": "NCHW", + "src_format": "RGB", + "src_dtype": "U8", + "src_layout": "NHWC", + "csc_mode": "NoCSC" + } + ], + "output_processors": [${json_outputs} + ], + "compiler": { + "check": 3, + "check_mode": "CheckOutput", + "check_cosine_simularity": 0.9 + } +} +EOF + +echo -e "\e[32mConfiguration file generated: ${config_path}\e[0m" + +# Create gen_cali_images_tar.py +cat > gen_cali_images_tar.py << 'PYTHON_SCRIPT' +import sys +import os +import random +import shutil + +images_dir = sys.argv[1] +images_num = int(sys.argv[2]) + +print("images dir:", images_dir) +print("images num:", images_num) +print("current dir:", os.getcwd()) +files = os.listdir(images_dir) +valid = [] +for name in files: + path = os.path.join(images_dir, name) + ext = os.path.splitext(name)[1] + if ext.lower() not in [".jpg", ".jpeg", ".png"]: + continue + valid.append(path) +print(f"images dir {images_dir} have {len(valid)} images") +if len(valid) < images_num: + print(f"no enough images in {images_dir}, have: {len(valid)}, need {images_num}") + sys.exit(1) + +idxes = random.sample(range(len(valid)), images_num) +shutil.rmtree("tmp_images", ignore_errors=True) +os.makedirs("tmp_images/images") +for i in idxes: + target = os.path.join("tmp_images", "images", os.path.basename(valid[i])) + shutil.copyfile(valid[i], target) +os.chdir("tmp_images/images") +os.system("tar -cf ../images.tar *") +# shutil.rmtree("tmp_images/images") +PYTHON_SCRIPT + +# Create extract_onnx.py +cat > extract_onnx.py << 'PYTHON_SCRIPT' +import onnx +import sys -For YOLO11 / YOLOv8, MaixPy support two types node select method, choose proper method according to your device: +input_path = sys.argv[1] +output_path = sys.argv[2] +input_names_str = sys.argv[3] +output_names_str = sys.argv[4] +input_names = [] +for s in input_names_str.split(","): + input_names.append(s.strip()) +output_names = [] +for s in output_names_str.split(","): + output_names.append(s.strip()) + +onnx.utils.extract_model(input_path, output_path, input_names, output_names) +PYTHON_SCRIPT + +# extract and onnxsim +mkdir -p tmp1 +onnx_extracted=tmp1/${model_name}_extracted.onnx +onnxsim_path=tmp1/${model_name}.onnx + +# Step 1: Extract specified output nodes +echo -e "\e[32mStep 1: Extract ONNX output nodes\e[0m" +python extract_onnx.py $model_path $onnx_extracted $input_names "$onnx_output_names" + +# Step 2: Simplify model +echo -e "\e[32mStep 2: ONNX simplification\e[0m" +onnxsim $onnx_extracted $onnxsim_path + +python gen_cali_images_tar.py $images_dir $images_num + +mkdir -p out 
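+# The two pulsar2 builds below differ only in npu_mode: the NPU1 pass produces a
+# vNPU-mode model (out/${model_name}_vnpu.axmodel) and the NPU2 pass produces a
+# full-NPU model (out/${model_name}_npu.axmodel); both reuse the same generated config.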
+tmp_config_path=tmp/$config_path + +# vnpu +echo -e "\e[32mBuilding ${model_name}_vnpu.axmodel\e[0m" +rm -rf tmp +mkdir tmp +cp $config_path $tmp_config_path +sed -i '/npu_mode/c\"npu_mode": "NPU1",' $tmp_config_path +pulsar2 build --target_hardware AX620E --input $onnxsim_path --output_dir tmp --config $tmp_config_path +cp tmp/compiled.axmodel out/${model_name}_vnpu.axmodel + +# npu all +echo -e "\e[32mBuilding ${model_name}_npu.axmodel\e[0m" +rm -rf tmp +mkdir tmp +cp $config_path $tmp_config_path +sed -i '/npu_mode/c\"npu_mode": "NPU2",' $tmp_config_path +pulsar2 build --target_hardware AX620E --input $onnxsim_path --output_dir tmp --config $tmp_config_path +cp tmp/compiled.axmodel out/${model_name}_npu.axmodel +rm -rf tmp + +echo -e "\e[32mGenerate models done, in out dir\e[0m" +``` -| Model & Feature | Method A | Method B | -| -- | --- | --- | -| Supported Devices | **MaixCAM2**(Recommend)
MaixCAM(running speed lowe than method B) | **MaixCAM**(Recommend) | -| Feature | More computation on CPU (safer quantization, slightly slower than Solution 2) | More computation on NPU (included in quantization) | -| Attention | None | Quantization failed in actual tests on MaixCAM2. | -| Detect YOLOv8 |`/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`| `/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_output_0` | -| Detect YOLO11 |`/model.23/Concat_output_0`
`/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0` | `/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_output_0` | -| Keypoint YOLOv8-pose | `/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`
`/model.22/Concat_output_0`| `/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_output_0`
`/model.22/Concat_output_0` | -| Keypoint YOLO11-pose | `/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0`
`/model.23/Concat_3_output_0`
`/model.23/Concat_output_0`| `/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_output_0`
`/model.23/Concat_output_0`| -| Segment YOLOv8-seg|`/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`
`/model.22/Concat_output_0`
`output1`| `/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_output_0`
`/model.22/Concat_output_0`
`output1`| -| Segment YOLO11-seg |`/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0`
`/model.23/Concat_3_output_0`
`/model.23/Concat_output_0`
`output1`|`/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_output_0`
`/model.23/Concat_output_0`
`output1`| -| OBB YOLOv8-obb |`/model.22/Concat_1_output_0`
`/model.22/Concat_2_output_0`
`/model.22/Concat_3_output_0`
`/model.22/Concat_output_0`|`/model.22/dfl/conv/Conv_output_0`
`/model.22/Sigmoid_1_output_0`
`/model.22/Sigmoid_output_0`| -| OBB YOLO11-obb |`/model.23/Concat_1_output_0`
`/model.23/Concat_2_output_0`
`/model.23/Concat_3_output_0`
`/model.23/Concat_output_0`|`/model.23/dfl/conv/Conv_output_0`
`/model.23/Sigmoid_1_output_0`
`/model.23/Sigmoid_output_0`| -|YOLOv8/YOLO11 Detect output nodes| ![](../../assets/yolo11_detect_nodes.png) | ![](../../assets/yolov8_out.jpg)| -|YOLOv8/YOLO11 pose extra nodes | ![](../../assets/yolo11_pose_node.png) | pose branch in the figure above | -|YOLOv8/YOLO11 seg extra nodes | ![](../../assets/yolo11_seg_node.png) | ![](../../assets/yolo11_seg_node.png)| -|YOLOv8/YOLO11 OBB extra nodes | ![](../../assets/yolo11_obb_node.png) | ![](../../assets/yolo11_out_obb.jpg)| - -### Edit mud file - -For object detection, the MUD file would be as follows (replace `yolo11` for YOLO11): +### Modifying the MUD File +For object detection, the MUD file is as follows (set `model_type` to `yolo11` for YOLO11, `yolo26` for YO26): MaixCAM/MaixCAM-Pro: ```ini @@ -147,14 +386,11 @@ mean = 0,0,0 scale = 0.00392156862745098, 0.00392156862745098, 0.00392156862745098 ``` -Replace the `labels` according to your trained objects. - -For keypoint detection (yolov8-pose), modify `type=pose`. -For keypoint detection (yolov8-seg), modify `type=seg`. -For keypoint detection (yolov8-obb), modify `type=obb`. - - +Replace `labels` with your custom trained object classes. -## Upload and Share on MaixHub +For keypoint detection (yolov8-pose), set `type=pose`. +For segmentation (yolov8-seg), set `type=seg`. +For oriented bounding box detection (yolov8-obb), set `type=obb`. -Visit the [MaixHub Model Library](https://maixhub.com/model/zoo?platform=maixcam) to upload and share your model. Consider providing multiple resolutions for others to choose from. +## Upload & Share to MaixHub +Upload and share your models at the [MaixHub Model Zoo](https://maixhub.com/model/zoo?platform=maixcam). Provide multiple resolutions for users to choose from. \ No newline at end of file diff --git a/docs/doc/en/vision/yolov5.md b/docs/doc/en/vision/yolov5.md index 33fe34a6..064bd42b 100644 --- a/docs/doc/en/vision/yolov5.md +++ b/docs/doc/en/vision/yolov5.md @@ -1,93 +1,83 @@ ---- -title: MaixPy MaixCAM Using YOLOv5 / YOLOv8 / YOLO11 for Object Detection ---- +# MaixPy: Object Detection with YOLOv5 / YOLOv8 / YOLO11 / YOLO26 Models +## Concept of Object Detection +Object detection refers to identifying the positions and categories of targets in images or videos—for example, detecting objects like apples and airplanes in an image and marking their locations. -## Object Detection Concept +Unlike image classification, it includes positional information, so the result of object detection is usually a bounding box that outlines the object's position. -Object detection refers to detecting the position and category of objects in images or videos, such as identifying apples or airplanes in a picture and marking their locations. - -Unlike classification, object detection includes positional information. Therefore, the result of object detection is generally a rectangular box that marks the location of the object. - -## Object Detection in MaixPy - -MaixPy provides `YOLOv5`, `YOLOv8`, and `YOLO11` models by default, which can be used directly: +## Using Object Detection in MaixPy +MaixPy natively supports the **YOLOv5**, **YOLOv8**, **YOLO11** and **YOLO26** models, which can be used directly: > YOLOv8 requires MaixPy >= 4.3.0. > YOLO11 requires MaixPy >= 4.7.0. - +> YOLO26 requires MaixPy >= 4.12.5. 
```python from maix import camera, display, image, nn, app detector = nn.YOLOv5(model="/root/models/yolov5s.mud", dual_buff=True) # detector = nn.YOLOv8(model="/root/models/yolov8n.mud", dual_buff=True) # detector = nn.YOLO11(model="/root/models/yolo11n.mud", dual_buff=True) +# detector = nn.YOLO26(model="/root/models/yolo26n.mud", dual_buff=True) cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format()) disp = display.Display() while not app.need_exit(): img = cam.read() - objs = detector.detect(img, conf_th=0.5, iou_th=0.45) + objs = detector.detect(img, conf_th = 0.5, iou_th = 0.45) for obj in objs: - img.draw_rect(obj.x, obj.y, obj.w, obj.h, color=image.COLOR_RED) + img.draw_rect(obj.x, obj.y, obj.w, obj.h, color = image.COLOR_RED) msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}' - img.draw_string(obj.x, obj.y, msg, color=image.COLOR_RED) + img.draw_string(obj.x, obj.y, msg, color = image.COLOR_RED) disp.show(img) ``` -Example video: - +Demo Video:
-Here, the camera captures an image, passes it to the `detector` for detection, and then displays the results (classification name and location) on the screen. - -You can switch between `YOLO11`, `YOLOv5`, and `YOLOv8` simply by replacing the corresponding line and modifying the model file path. +The code above captures images via the camera, passes them to the `detector` for inference, and then displays the detection results (category names and positions) on the screen after obtaining them. -For the list of 80 objects supported by the model, see the appendix of this document. +You can switch between **YOLO11/v5/v8/26** simply by replacing the corresponding model initialization code—note to modify the model file path as well. -For more API usage, refer to the documentation for the [maix.nn](/api/maix/nn.html) module. +See the appendix of this article for the list of 80 object categories supported by the pre-trained models. -## dual_buff for Double Buffering Acceleration +For more API details, refer to the documentation of the [maix.nn](/api/maix/nn.html) module. -You may notice that the model initialization uses `dual_buff` (default value is `True`). Enabling the `dual_buff` parameter can improve efficiency and increase the frame rate. For more details and usage considerations, see the [dual_buff Introduction](./dual_buff.md). +## Dual Buffer Acceleration (`dual_buff`) +You may notice the `dual_buff` parameter is used during model initialization (it is `True` by default). Enabling this parameter can improve runtime efficiency and frame rate. For the specific principle and usage notes, see [Introduction to dual_buff](./dual_buff.md). ## More Input Resolutions - -The default model input resolution is `320x224`, which closely matches the aspect ratio of the default screen. You can also download other model resolutions: +The default model input resolutions are **320x224** for MaixCam and **640x480** for MaixCam2, as these aspect ratios are close to the native screen resolutions of the devices. You can also manually download models with other resolutions for replacement: YOLOv5: [https://maixhub.com/model/zoo/365](https://maixhub.com/model/zoo/365) YOLOv8: [https://maixhub.com/model/zoo/400](https://maixhub.com/model/zoo/400) YOLO11: [https://maixhub.com/model/zoo/453](https://maixhub.com/model/zoo/453) -Higher resolutions provide more accuracy, but take longer to process. Choose the appropriate resolution based on your application. +Higher resolutions yield higher detection accuracy but take longer to run. Choose the appropriate resolution based on your application scenario. -## Which Model to Use: YOLOv5, YOLOv8, or YOLO11? +## Which to Choose: YOLOv5, YOLOv8, YOLO11 or YOLO26? +The pre-provided models include **YOLOv5s**, **YOLOv8n**, **YOLO11n** and **YOLO26n**. The YOLOv5s model has a larger size, while YOLOv8n, YOLO11n and YOLO26n run slightly faster. According to official data, the accuracy ranking is **YOLO26n > YOLO11n > YOLOv8n > YOLOv5s**. You can conduct actual tests and select the model that fits your needs. -We provide three models: `YOLOv5s`, `YOLOv8n`, and `YOLO11n`. The `YOLOv5s` model is larger, while `YOLOv8n` and `YOLO11n` are slightly faster. According to official data, the accuracy is `YOLO11n > YOLOv8n > YOLOv5s`. You can test them to decide which works best for your situation. 
+You can also try the **YOLOv8s** or **YOLO11s** models—their frame rates will be slightly lower (e.g., yolov8s_320x224 runs 10ms slower than yolov8n_320x224), but their accuracy is higher than the nano versions. These models can be downloaded from the model libraries mentioned above or exported by yourself from the official YOLO repositories. -Additionally, you may try `YOLOv8s` or `YOLO11s`, which will have a lower frame rate (e.g., `yolov8s_320x224` is 10ms slower than `yolov8n_320x224`), but offer higher accuracy. You can download these models from the model library mentioned above or export them yourself from the official `YOLO` repository. +## Is It Allowed to Use Different Resolutions for Camera and Model? +When using the `detector.detect(img)` function for inference, if the resolution of `img` differs from the model's input resolution, the function will automatically call `img.resize` to scale the image to match the model's input resolution. The default resizing method is `image.Fit.FIT_CONTAIN`, which scales the image while maintaining its aspect ratio and fills the surrounding areas with black pixels. The detected bounding box coordinates are also automatically mapped back to the coordinates of the original `img`. -## Different Resolutions for Camera and Model +## Train Custom Object Detection Models Online with MaixHub +If you need to detect specific objects instead of using the pre-trained 80-class model, visit [MaixHub](https://maixhub.com) to learn and train custom object detection models—simply select **Object Detection Model** when creating a project. For details, refer to [MaixHub Online Training Documentation](./maixhub_train.md). -If the resolution of `img` is different from the model's resolution when using the `detector.detect(img)` function, the function will automatically call `img.resize` to adjust the image to the model's input resolution. The default `resize` method is `image.Fit.FIT_CONTAIN`, which scales while maintaining the aspect ratio and fills the surrounding areas with black. The detected coordinates will also be automatically mapped back to the original `img`. +You can also find models shared by the community in the [MaixHub Model Zoo](https://maixhub.com/model/zoo?platform=maixcam). -## Training Your Own Object Detection Model on MaixHub +## Train Custom Object Detection Models Offline +It is highly recommended to start with MaixHub online training—offline training is more complex and not suggested for beginners. -If you need to detect specific objects beyond the 80 categories provided, visit [MaixHub](https://maixhub.com) to learn and train an object detection model. Select "Object Detection Model" when creating a project. Refer to the [MaixHub Online Training Documentation](./maixhub_train.md). +This method assumes you have basic relevant knowledge (which will not be covered in this article). Search online for solutions if you encounter problems. -Alternatively, you can find models shared by community members at the [MaixHub Model Library](https://maixhub.com/model/zoo?platform=maixcam). - -## Training Your Own Object Detection Model Offline - -We strongly recommend starting with MaixHub for online training, as the offline method is much more difficult and is not suitable for beginners. Some knowledge may not be explicitly covered here, so be prepared to do further research. - -Refer to [Training a Custom YOLOv5 Model](./customize_model_yolov5.md) or [Training a Custom YOLOv8/YOLO11 Model Offline](./customize_model_yolov8.md). 
- -## Appendix: 80 Classes - -The 80 objects in the COCO dataset are: +See [Offline Training of YOLOv5 Models](./customize_model_yolov5.md) or [Offline Training of YOLOv8/YOLO11/YOLO26 Models](./customize_model_yolov8.md) for details. +## Appendix: 80 Object Categories +The 80 object categories of the COCO dataset are as follows: ```txt person bicycle @@ -169,5 +159,4 @@ scissors teddy bear hair dryer toothbrush -``` - +``` \ No newline at end of file diff --git a/docs/doc/zh/vision/customize_model_yolov8.md b/docs/doc/zh/vision/customize_model_yolov8.md index e058cb65..e2601e60 100644 --- a/docs/doc/zh/vision/customize_model_yolov8.md +++ b/docs/doc/zh/vision/customize_model_yolov8.md @@ -1,5 +1,5 @@ --- -title: 为 MaixCAM MaixPy 离线训练 YOLO11/YOLOv8 模型,自定义检测物体、关键点检测 +title: 为 MaixCAM MaixPy 离线训练 YOLO11/YOLOv8/YOLO26 模型,自定义检测物体、关键点检测 update: - date: 2024-06-21 version: v1.0 @@ -13,6 +13,10 @@ update: version: v3.0 author: neucrack content: 增加 MaixCAM2 支持 + - date: 2026-02-04 + version: v4.0 + author: Tao + content: 增加 YOLO26 支持 --- @@ -36,8 +40,8 @@ YOLOv8 / YOLO11 不光支持检测物体,还有 yolov8-pose / YOLO11-pose 支 要想我们的模型能在 MaixPy (MaixCAM)上使用,需要经历以下过程: * 搭建训练环境,本文略过,请自行搜索 pytorch 训练环境搭建。 -* 拉取 [YOLO11/YOLOv8](https://github.com/ultralytics/ultralytics) 源码到本地。 -* 准备数据集,并做成 YOLO11 / YOLOv8 项目需要的格式。 +* 拉取 [YOLO11/YOLOv8/YOLO26](https://github.com/ultralytics/ultralytics) 源码到本地。 +* 准备数据集,并做成 YOLO11 / YOLOv8 /YOLO26 项目需要的格式。 * 训练模型,得到一个 `onnx` 模型文件,也是本文的最终输出文件。 * 将`onnx`模型转换成 MaixPy 支持的 `MUD` 文件,这个过程在[MaixCAM 模型转换](../ai_model_converter/maixcam.md) 一文种有详细介绍。 * 使用 MaixPy 加载模型运行。 @@ -50,11 +54,11 @@ YOLOv8 / YOLO11 不光支持检测物体,还有 yolov8-pose / YOLO11-pose 支 ## 参考文章 -因为是比较通用的操作过程,本文只给一个流程介绍,具体细节可以自行看 **[YOLO11 / YOLOv8 官方代码和文档](https://github.com/ultralytics/ultralytics)**(**推荐**),以及搜索其训练教程,最终导出 onnx 文件即可。 +因为是比较通用的操作过程,本文只给一个流程介绍,具体细节可以自行看 **[YOLO26 / YOLO11 / YOLOv8 官方代码和文档](https://github.com/ultralytics/ultralytics)**(**推荐**),以及搜索其训练教程,最终导出 onnx 文件即可。 如果你有觉得讲得不错的文章欢迎修改本文并提交 PR。 -## YOLO11 / YOLOv8 导出 onnx 模型 +## YOLO26 / YOLO11 / YOLOv8 导出 onnx 模型 在 `ultralytics` 目录下创建一个`export_onnx.py` 文件 ```python @@ -82,7 +86,7 @@ print(path) ## 转换为 MaixCAM 支持的模型以及 mud 文件 -MaixPy/MaixCDK 目前支持了 YOLOv8 / YOLO11 检测 以及 YOLOv8-pose / YOLO11-pose 关键点检测 以及 YOLOv8-seg / YOLO11-seg 三种模型(2024.10.10)。 +MaixPy/MaixCDK 目前支持了 YOLO26 / YOLOv8 / YOLO11 检测 以及 YOLOv8-pose / YOLO11-pose 关键点检测 以及 YOLOv8-seg / YOLO11-seg YOLOv8-obb / YOLOV11-obb 四种模型(2026.2.4)。 按照[MaixCAM 模型转换](../ai_model_converter/maixcam.md) 和 [MaixCAM2 模型转换](../ai_model_converter/maixcam2.md) 进行模型转换。 @@ -109,9 +113,264 @@ MaixPy/MaixCDK 目前支持了 YOLOv8 / YOLO11 检测 以及 YOLOv8-pose / YOLO1 |YOLOv8/YOLO11 seg 额外输出节点 | ![](../../assets/yolo11_seg_node.png) | ![](../../assets/yolo11_seg_node.png)| |YOLOv8/YOLO11 OBB 额外输出节点 | ![](../../assets/yolo11_obb_node.png) | ![](../../assets/yolo11_out_obb.jpg)| + +| 模型 | 节点名 | 节点图 | +| -- | --- | --- | +| 检测 YOLO26 | `/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv_output_0 /model.23/one2one_cv2.1/one2one_cv2.1.2/Conv_output_0 /model.23/one2one_cv2.2/one2one_cv2.2.2/Conv_output_0 /model.23/one2one_cv3.0/one2one_cv3.0.2/Conv_output_0 /model.23/one2one_cv3.1/one2one_cv3.1.2/Conv_output_0 /model.23/one2one_cv3.2/one2one_cv3.2.2/Conv_output_0` |![](../../assets/yolo26_out.png) | + + +### 转换脚本 +*这里提供一键转换YOLO26的脚本(需要在容器下运行):* +`MaixCam/Pro:` + +```bash +#!/bin/bash + +set -e + +net_name=yolo26n +input_w=320 +input_h=224 + +# mean: 0, 0, 0 +# std: 255, 255, 255 + +# mean +# 1/std + +# mean: 0, 0, 0 +# scale: 0.00392156862745098, 0.00392156862745098, 
0.00392156862745098 + + + +# convert to mlir +model_transform.py \ +--model_name ${net_name} \ +--model_def ./${net_name}.onnx \ +--input_shapes [[1,3,${input_h},${input_w}]] \ +--mean "0,0,0" \ +--scale "0.00392156862745098,0.00392156862745098,0.00392156862745098" \ +--keep_aspect_ratio \ +--pixel_format rgb \ +--channel_format nchw \ +--output_names "/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv_output_0,/model.23/one2one_cv2.1/one2one_cv2.1.2/Conv_output_0,/model.23/one2one_cv2.2/one2one_cv2.2.2/Conv_output_0,/model.23/one2one_cv3.0/one2one_cv3.0.2/Conv_output_0,/model.23/one2one_cv3.1/one2one_cv3.1.2/Conv_output_0,/model.23/one2one_cv3.2/one2one_cv3.2.2/Conv_output_0" \ +--test_input ./image.jpg \ +--test_result ${net_name}_top_outputs.npz \ +--tolerance 0.99,0.99 \ +--mlir ${net_name}.mlir + + +echo "calibrate for int8 model" +# export int8 model +run_calibration.py ${net_name}.mlir \ +--dataset ./coco \ +--input_num 200 \ +-o ${net_name}_cali_table + +echo "convert to int8 model" +# export int8 model +# add --quant_input, use int8 for faster processing in maix.nn.NN.forward_image +model_deploy.py \ +--mlir ${net_name}.mlir \ +--quantize INT8 \ +--quant_input \ +--calibration_table ${net_name}_cali_table \ +--processor cv181x \ +--test_input ${net_name}_in_f32.npz \ +--test_reference ${net_name}_top_outputs.npz \ +--tolerance 0.9,0.6 \ +--model ${net_name}_int8.cvimodel +``` + +`MaixCam2:` + +```bash +#!/bin/bash + +set -e + +############# 修改 #################### +model_name=$1 +model_path=./${model_name}.onnx +images_dir=./coco +images_num=100 +input_names=images + +config_path=yolo26_build_config.json + +output_nodes=( + "/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv_output_0" # bbox 80x80 + "/model.23/one2one_cv2.1/one2one_cv2.1.2/Conv_output_0" # bbox 40x40 + "/model.23/one2one_cv2.2/one2one_cv2.2.2/Conv_output_0" # bbox 20x20 + "/model.23/one2one_cv3.0/one2one_cv3.0.2/Conv_output_0" # cls 80x80 + "/model.23/one2one_cv3.1/one2one_cv3.1.2/Conv_output_0" # cls 40x40 + "/model.23/one2one_cv3.2/one2one_cv3.2.2/Conv_output_0" # cls 20x20 +) +############################################# +# 解析节点配置 +onnx_output_names="" +json_outputs="" + +for node in "${output_nodes[@]}"; do + # 构建 extract_onnx 的输出参数 + if [ -n "$onnx_output_names" ]; then + onnx_output_names="${onnx_output_names}," + fi + onnx_output_names="${onnx_output_names}${node}" + + # 构建 JSON output_processors + json_outputs="${json_outputs} + { + \"tensor_name\": \"${node}\", + \"dst_perm\": [0, 2, 3, 1] + }," +done + +# 去掉最后的逗号 +json_outputs="${json_outputs%,}" + +# 生成 JSON 配置文件 +cat > $config_path << EOF +{ + "model_type": "ONNX", + "npu_mode": "NPU1", + "quant": { + "input_configs": [ + { + "tensor_name": "${input_names}", + "calibration_dataset": "tmp_images/images.tar", + "calibration_size": ${images_num}, + "calibration_mean": [0, 0, 0], + "calibration_std": [255, 255, 255] + } + ], + "calibration_method": "MinMax", + "precision_analysis": true + }, + "input_processors": [ + { + "tensor_name": "${input_names}", + "tensor_format": "RGB", + "tensor_layout": "NCHW", + "src_format": "RGB", + "src_dtype": "U8", + "src_layout": "NHWC", + "csc_mode": "NoCSC" + } + ], + "output_processors": [${json_outputs} + ], + "compiler": { + "check": 3, + "check_mode": "CheckOutput", + "check_cosine_simularity": 0.9 + } +} +EOF + +echo -e "\e[32m已生成配置文件: ${config_path}\e[0m" + +# 创建 gen_cali_images_tar.py +cat > gen_cali_images_tar.py << 'PYTHON_SCRIPT' +import sys +import os +import random +import shutil + +images_dir = sys.argv[1] +images_num 
= int(sys.argv[2]) + +print("images dir:", images_dir) +print("images num:", images_num) +print("current dir:", os.getcwd()) +files = os.listdir(images_dir) +valid = [] +for name in files: + path = os.path.join(images_dir, name) + ext = os.path.splitext(name)[1] + if ext.lower() not in [".jpg", ".jpeg", ".png"]: + continue + valid.append(path) +print(f"images dir {images_dir} have {len(valid)} images") +if len(valid) < images_num: + print(f"no enough images in {images_dir}, have: {len(valid)}, need {images_num}") + sys.exit(1) + + +idxes = random.sample(range(len(valid)), images_num) +shutil.rmtree("tmp_images", ignore_errors=True) +os.makedirs("tmp_images/images") +for i in idxes: + target = os.path.join("tmp_images", "images", os.path.basename(valid[i])) + shutil.copyfile(valid[i], target) +os.chdir("tmp_images/images") +os.system("tar -cf ../images.tar *") +# shutil.rmtree("tmp_images/images") +PYTHON_SCRIPT + +# 创建 extract_onnx.py +cat > extract_onnx.py << 'PYTHON_SCRIPT' +import onnx +import sys + +input_path = sys.argv[1] +output_path = sys.argv[2] +input_names_str = sys.argv[3] +output_names_str = sys.argv[4] +input_names = [] +for s in input_names_str.split(","): + input_names.append(s.strip()) +output_names = [] +for s in output_names_str.split(","): + output_names.append(s.strip()) + +onnx.utils.extract_model(input_path, output_path, input_names, output_names) +PYTHON_SCRIPT + +# extract and onnxsim +mkdir -p tmp1 +onnx_extracted=tmp1/${model_name}_extracted.onnx +onnxsim_path=tmp1/${model_name}.onnx + +# Step 1: 提取指定输出节点 +echo -e "\e[32mStep 1: 提取 ONNX 输出节点\e[0m" +python extract_onnx.py $model_path $onnx_extracted $input_names "$onnx_output_names" + +# Step 2: 简化模型 +echo -e "\e[32mStep 2: ONNX 简化\e[0m" +onnxsim $onnx_extracted $onnxsim_path + +python gen_cali_images_tar.py $images_dir $images_num + +mkdir -p out +tmp_config_path=tmp/$config_path + +# vnpu +echo -e "\e[32mBuilding ${model_name}_vnpu.axmodel\e[0m" +rm -rf tmp +mkdir tmp +cp $config_path $tmp_config_path +sed -i '/npu_mode/c\"npu_mode": "NPU1",' $tmp_config_path +pulsar2 build --target_hardware AX620E --input $onnxsim_path --output_dir tmp --config $tmp_config_path +cp tmp/compiled.axmodel out/${model_name}_vnpu.axmodel + +# npu all +echo -e "\e[32mBuilding ${model_name}_npu.axmodel\e[0m" +rm -rf tmp +mkdir tmp +cp $config_path $tmp_config_path +sed -i '/npu_mode/c\"npu_mode": "NPU2",' $tmp_config_path +pulsar2 build --target_hardware AX620E --input $onnxsim_path --output_dir tmp --config $tmp_config_path +cp tmp/compiled.axmodel out/${model_name}_npu.axmodel +rm -rf tmp + +echo -e "\e[32mGenerate models done, in out dir\e[0m" +``` + + ### 修改 mud 文件 -对于物体检测,mud 文件为(YOLO11 model_type 改为 yolo11) +对于物体检测,mud 文件为(YOLO11 model_type 改为 yolo11 YOLO26 model_type 改为 yolo26) MaixCAM/MaixCAM-Pro: ```ini [basic] diff --git a/docs/doc/zh/vision/yolov5.md b/docs/doc/zh/vision/yolov5.md index eb55fe1c..a6997d33 100644 --- a/docs/doc/zh/vision/yolov5.md +++ b/docs/doc/zh/vision/yolov5.md @@ -1,5 +1,5 @@ --- -title: MaixPy MaixCAM 使用 YOLOv5 / YOLOv8 / YOLO11 模型进行目标检测 +title: MaixPy 使用 YOLOv5 / YOLOv8 / YOLO11 / YOLO26 模型进行目标检测 --- @@ -14,13 +14,14 @@ title: MaixPy MaixCAM 使用 YOLOv5 / YOLOv8 / YOLO11 模型进行目标检测 MaixPy 默认提供了 `YOLOv5` 和 `YOLOv8` 和 `YOLO11` 模型,可以直接使用: > YOLOv8 需要 MaixPy >= 4.3.0。 > YOLO11 需要 MaixPy >= 4.7.0。 - +> YOLO26 需要 MaixPy >= 4.12.5。 ```python from maix import camera, display, image, nn, app detector = nn.YOLOv5(model="/root/models/yolov5s.mud", dual_buff=True) # detector = 
nn.YOLOv8(model="/root/models/yolov8n.mud", dual_buff=True) # detector = nn.YOLO11(model="/root/models/yolo11n.mud", dual_buff=True) +# detector = nn.YOLO26(model="/root/models/yolo26n.mud", dual_buff=True) cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format()) disp = display.Display() @@ -56,7 +57,7 @@ while not app.need_exit(): ## 更多输入分辨率 -默认的模型输入是`320x224`分辨率,因为这个分辨率比例和默认提供的屏幕分辨率接近,你也可以手动下载其它分辨率的模型替换: +默认的模型输入是MaixCam:`320x224`,MaixCam2:`640x480`,因为这个分辨率比例和默认提供的屏幕分辨率接近,你也可以手动下载其它分辨率的模型替换: YOLOv5: [https://maixhub.com/model/zoo/365](https://maixhub.com/model/zoo/365) YOLOv8: [https://maixhub.com/model/zoo/400](https://maixhub.com/model/zoo/400) @@ -66,7 +67,7 @@ YOLO11: [https://maixhub.com/model/zoo/453](https://maixhub.com/model/zoo/453) ## YOLOv5 和 YOLOv8 和 YOLO11 用哪个? -这里提供的 `YOLOv5s` 和 `YOLOv8n` 和 `YOLO11n` 三种模型,`YOLOv5s`模型更大,`YOLOv8n YOLO11n`速度快一点点, 精度按照官方数据来说`YOLO11n > YOLOv8n > YOLOv5s`,可以实际测试根据自己的实际情况选择。 +这里提供的 `YOLOv5s` 和 `YOLOv8n` 和 `YOLO11n` 和 `YOLO26n` 三种模型,`YOLOv5s`模型更大,`YOLOv8n YOLO11n YOLO26n`速度快一点点, 精度按照官方数据来说`YOLO26n > YOLO11n > YOLOv8n > YOLOv5s`,可以实际测试根据自己的实际情况选择。 另外你也可以尝试`YOLOv8s`或者`YOLO11s`,帧率会低一些(比如 yolov8s_320x224 比 yolov8n_320x224 慢 10ms),准确率会比前两个都高,模型可以在上面提到的模型库下载到或者自己从`YOLO`官方仓库导出模型。 @@ -85,7 +86,7 @@ YOLO11: [https://maixhub.com/model/zoo/453](https://maixhub.com/model/zoo/453) 强烈建议先使用 MaixHub 在线训练模型,此种方式难度比较大,不建议新手一来就碰这个方式。 此种方式有些许默认你知道的知识文中不会提,遇到问题多上网搜索学习。 -请看 [离线训练YOLOv5模型](./customize_model_yolov5.md) 或者 [离线训练 YOLOv8/YOLO11 模型](./customize_model_yolov8.md) +请看 [离线训练YOLOv5模型](./customize_model_yolov5.md) 或者 [离线训练 YOLOv8/YOLO11/YOLO26 模型](./customize_model_yolov8.md) ## 附录:80分类 diff --git a/examples/vision/ai_vision/nn_yolo26_detect.py b/examples/vision/ai_vision/nn_yolo26_detect.py new file mode 100644 index 00000000..aaa1dda7 --- /dev/null +++ b/examples/vision/ai_vision/nn_yolo26_detect.py @@ -0,0 +1,20 @@ +from maix import camera, display, image, nn, app,time + +detector = nn.YOLO26(model="/root/yolo26n.mud", dual_buff=False) +cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format()) +disp = display.Display() + +while not app.need_exit(): + img = cam.read() + + time.fps_start() + objs = detector.detect(img, conf_th=0.5, iou_th=0.45) + fps = time.fps() + print(f"time: {1000/fps:.02f}ms, fps: {fps:.02f}") + + for obj in objs: + img.draw_rect(obj.x, obj.y, obj.w, obj.h, color=image.COLOR_RED) + msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}' + img.draw_string(obj.x, obj.y, msg, color=image.COLOR_RED) + + disp.show(img) \ No newline at end of file