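# A Complete Guide to Adapting the DIOR Remote Sensing Dataset to MMRotate from Scratch

Rotated object detection is an important branch of computer vision, and MMRotate, an open-source toolbox built on PyTorch, provides strong support for it. This article walks through adapting the DIOR remote sensing dataset to MMRotate end to end, from data format conversion to model training.

## 1. Understanding DIOR and preparing the environment

DIOR contains 23,463 remote sensing images and 190,288 instances across 20 common object categories. The dataset is organized into four parts:

- `Annotations`: XML annotation files with horizontal and oriented bounding boxes
- `ImageSets`: split files for the training, validation, and test sets
- `JPEGImages-test`: test set images
- `JPEGImages-trainval`: training and validation set images

Before starting, complete the following preparation:

- Download the full DIOR dataset (about 35GB)
- Install MMRotate and its dependencies
- Set up a Python environment (conda is recommended)

```shell
# Create the conda environment
conda create -n mmrotate python=3.8 -y
conda activate mmrotate

# Install PyTorch
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html

# Install MMRotate
# (MMRotate 0.x also depends on mmcv-full and mmdet built for the same PyTorch/CUDA version)
git clone https://github.com/open-mmlab/mmrotate.git
cd mmrotate
pip install -v -e .
```

## 2. Converting DIOR to DOTA format

MMRotate supports DOTA-format datasets natively, so the DIOR XML annotations need to be converted to DOTA format. The DOTA format has these characteristics:

- Annotations are stored in txt files
- Each image has one annotation file
- Each line has the form `x1 y1 x2 y2 x3 y3 x4 y4 class_name difficult`

### 2.1 XML to TXT conversion

The following Python script converts DIOR XML annotations into DOTA-format TXT files:

```python
import os
import xml.etree.ElementTree as ET


def convert_dior_to_dota(input_folder, output_folder):
    os.makedirs(output_folder, exist_ok=True)

    for filename in os.listdir(input_folder):
        if not filename.endswith('.xml'):
            continue

        xml_path = os.path.join(input_folder, filename)
        tree = ET.parse(xml_path)
        root = tree.getroot()

        text_data = []
        for obj in root.findall('.//object'):
            # Skip objects that have no oriented bounding box
            robndbox = obj.find('robndbox')
            if robndbox is None:
                continue

            # Four corners in DOTA order: x1 y1 x2 y2 x3 y3 x4 y4
            coords = [
                robndbox.find('x_left_top').text,
                robndbox.find('y_left_top').text,
                robndbox.find('x_right_top').text,
                robndbox.find('y_right_top').text,
                robndbox.find('x_right_bottom').text,
                robndbox.find('y_right_bottom').text,
                robndbox.find('x_left_bottom').text,
                robndbox.find('y_left_bottom').text
            ]

            name = obj.find('name').text
            difficult = obj.find('difficult').text

            text_line = ' '.join(coords) + f' {name} {difficult}\n'
            text_data.append(text_line)

        output_path = os.path.join(output_folder, filename.replace('.xml', '.txt'))
        with open(output_path, 'w') as f:
            f.writelines(text_data)


# Usage example
convert_dior_to_dota(
    input_folder='/path/to/DIOR/Annotations/Oriented_Bounding_Boxes',
    output_folder='/path/to/DIOR/Annotations/Oriented_Bounding_Boxes_processed'
)
```

### 2.2 Reorganizing the dataset directory

The DOTA format expects a specific directory layout:

```
DIOR_processed/
├── test/
│   ├── annfiles/
│   └── images/
└── trainval/
    ├── annfiles/
    └── images/
```

Use the following script to split the data:

```python
import os
import shutil


def organize_dior_dataset(processed_ann_folder, original_img_folder, output_root):
    # Create the directory structure
    for split in ['trainval', 'test']:
        for subdir in ['annfiles', 'images']:
            os.makedirs(os.path.join(output_root, split, subdir), exist_ok=True)

    # Process the trainval split
    with open('/path/to/DIOR/ImageSets/Main/trainval.txt') as f:
        trainval_files = [line.strip() for line in f]

    for filename in trainval_files:
        # Copy the annotation file
        src_ann = os.path.join(processed_ann_folder, f'{filename}.txt')
        dst_ann = os.path.join(output_root, 'trainval', 'annfiles', f'{filename}.txt')
        if os.path.exists(src_ann):
            shutil.copy(src_ann, dst_ann)

        # Copy the image file
        src_img = os.path.join(original_img_folder, f'{filename}.jpg')
        dst_img = os.path.join(output_root, 'trainval', 'images', f'{filename}.jpg')
        if os.path.exists(src_img):
            shutil.copy(src_img, dst_img)

    # Process the test split in the same way
    # ...
```

## 3. Adapting the MMRotate framework

### 3.1 Creating the DIOR dataset class

Create `dior.py` under `mmrotate/datasets/`:

```python
from .builder import ROTATED_DATASETS
from .dota import DOTADataset


@ROTATED_DATASETS.register_module()
class DIORDataset(DOTADataset):
    CLASSES = (
        'airplane', 'airport', 'baseballfield', 'basketballcourt', 'bridge',
        'chimney', 'dam', 'Expressway-Service-area', 'Expressway-toll-station',
        'golffield', 'groundtrackfield', 'harbor', 'overpass', 'ship',
        'stadium', 'storagetank', 'tenniscourt', 'trainstation', 'vehicle',
        'windmill'
    )

    PALETTE = [
        (165, 42, 42), (189, 183, 107), (0, 255, 0), (255, 0, 0),
        (138, 43, 226), (255, 128, 0), (255, 0, 255), (0, 255, 255),
        (255, 193, 193), (0, 51, 153), (255, 250, 205), (0, 139, 139),
        (255, 255, 0), (147, 116, 116), (0, 0, 255), (255, 69, 0),
        (128, 0, 128), (0, 128, 128), (218, 165, 32), (199, 21, 133)
    ]

    def __init__(self, **kwargs):
        super(DIORDataset, self).__init__(**kwargs)
```

Two details are easy to miss. First, the new class must be imported in `mmrotate/datasets/__init__.py` (and added to `__all__`); otherwise the registry will not find `DIORDataset`. Second, as far as we can tell, `DOTADataset.load_annotations` in MMRotate 0.x builds image file names with a `.png` suffix, while DIOR images are `.jpg`. Check this in your installed version and either convert the images or override `load_annotations` accordingly.

### 3.2 Adding the dataset config

Create `dior.py` under `configs/_base_/datasets/`:

```python
dataset_type = 'DIORDataset'
data_root = '/path/to/DIOR_processed/'

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(type='RRandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'trainval/annfiles/',
        img_prefix=data_root + 'trainval/images/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'test/annfiles/',
        img_prefix=data_root + 'test/images/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'test/annfiles/',
        img_prefix=data_root + 'test/images/',
        pipeline=test_pipeline)
)
```

### 3.3 Adjusting the model config

Pick a suitable model config (for example oriented_rcnn) and change the following key parameters:

```python
_base_ = [
    '../_base_/datasets/dior.py',
    '../_base_/schedules/schedule_1x.py',
    '../_base_/default_runtime.py'
]

model = dict(
    roi_head=dict(
        bbox_head=dict(
            num_classes=20)))  # DIOR has 20 classes
```

## 4. Training and validation

With the configuration in place, start training with:

```shell
python tools/train.py configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dior_le90.py --work-dir work_dirs/dior_exp
```

Common problems during training and how to resolve them:

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Out of memory | Batch size too large | Reduce `samples_per_gpu` |
| Annotations fail to load | Wrong path or mismatched format | Check the annotation file paths and format |
| Class count mismatch | `num_classes` set incorrectly | Make sure it is set to 20 |
| Training loss does not decrease | Unsuitable learning rate | Adjust the `lr` parameter |

Evaluate the trained model:

```shell
python tools/test.py \
    configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dior_le90.py \
    work_dirs/dior_exp/latest.pth \
    --eval mAP
```

## 5. Advanced optimization tips

### 5.1 Data augmentation

Given the characteristics of remote sensing imagery, the following augmentations can be added to the training pipeline. The transform names below are illustrative; confirm that they are registered in your installation before use. In MMRotate 0.x, rotation augmentation is registered as `PolyRandomRotate` and photometric jitter comes from MMDetection's `PhotoMetricDistortion`, so `RandomRotate`, `RandomBrightness`, and `RandomContrast` may need to be replaced or implemented as custom transforms.

```python
train_pipeline = [
    # ...existing pipeline...
    dict(type='RandomRotate', angles=[30, 60, 90, 120, 150]),
    dict(type='RandomBrightness', brightness_range=(0.8, 1.2)),
    dict(type='RandomContrast', contrast_range=(0.8, 1.2)),
    # ...remaining config...
]
```

### 5.2 Fine-tuning tips

- Learning rate: when starting from a pretrained model, an initial learning rate of 0.005 is a reasonable choice
- Multi-scale training: add a multi-scale training strategy to improve detection performance
- Hard example mining: optimize specifically for the harder DIOR categories, such as small vehicles

### 5.3 Performance optimization

For large-scale remote sensing detection, the following measures can help:

- Train with FP16 mixed precision
- Enable cudnn benchmark
- Tune the Dataloader's `num_workers`

```python
# Add to the config file. Use one of the two forms, not both:
# setting both is likely to pass loss_scale to the hook twice.
fp16 = dict(loss_scale=512.)
# optimizer_config = dict(type='Fp16OptimizerHook', loss_scale=512.)
```

In our own projects, we found the `vehicle` category in DIOR to be particularly hard to detect. Increasing the share of augmentation applied to this class and adjusting the loss weights improved mAP by roughly 3 to 5 percentage points.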
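To see how categories such as `vehicle` are distributed before tuning augmentation or loss weights, it can help to count instances per class in the converted annotation files. The following is a minimal sketch, assuming the DOTA-style line layout from section 2 (eight coordinates, class name, difficulty flag). The helper name `count_instances` and the example path are hypothetical and not part of MMRotate.

```python
import os
from collections import Counter


def count_instances(ann_dir):
    """Count object instances per class in a folder of DOTA-style .txt files."""
    counts = Counter()
    for name in sorted(os.listdir(ann_dir)):
        if not name.endswith('.txt'):
            continue
        with open(os.path.join(ann_dir, name)) as f:
            for line in f:
                parts = line.split()
                # Expect 8 coordinates + class name + difficulty flag;
                # shorter lines are skipped as malformed.
                if len(parts) < 10:
                    continue
                counts[parts[8]] += 1
    return counts


if __name__ == '__main__':
    # Hypothetical path; point it at your own converted annotations.
    ann_dir = '/path/to/DIOR_processed/trainval/annfiles'
    if os.path.isdir(ann_dir):
        for cls, n in count_instances(ann_dir).most_common():
            print(f'{cls:28s}{n}')
```

Running it on the `trainval/annfiles` folder prints one line per class, most frequent first. Lines with fewer than ten fields are silently skipped, so a total that is lower than expected can also point to a conversion problem.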