2025 Volume 6 Issue 3 Pages 1-14
Accurate crack detection using deep convolutional neural networks (DCNNs) is critical for infrastructure inspection. However, the requirement for fine-grained annotations remains a major bottleneck. To mitigate this, a previous study proposed a multi-stage Multiple-Instance Learning (MIL) framework that uses region-level labels and model-generated pseudo-labels to reduce annotation costs while maintaining performance. This paper introduces two key extensions to that framework. First, an adaptive thresholding mechanism derives bag-specific thresholds from the likelihood distribution of negative instances, explicitly filtering unreliable positive instances without parametric assumptions. Second, a multi-scale overlapping tiling strategy increases the ratio of crack-containing instances within each positive bag, improving MIL training efficiency and robustness under weak supervision. Experiments on the Concrete Crack Segmentation Dataset demonstrate that the proposed method outperforms both the baseline MIL model and fully supervised models under equivalent labeling budgets. The enhanced model improves the F1-score by 2.5 points over the baseline and reduces false positives by approximately one-third through two-stage inference. Importantly, these improvements are achieved without any pixel-level or subregion-level manual labeling. These results highlight the proposed framework’s scalability, robustness, and practical suitability for annotation-efficient crack detection in civil infrastructure.
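
The two extensions can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the function names, the use of an empirical quantile as the nonparametric threshold, the 0.95 quantile level, the tile sizes, and the 50% overlap are all assumptions chosen for illustration. The first part derives a bag-specific threshold from the likelihoods of negative instances and filters the instances of a positive bag against it; the second part enumerates overlapping tiles at several scales.

```python
import numpy as np


def bag_threshold(neg_likelihoods, quantile=0.95):
    """Nonparametric threshold for one bag: an empirical quantile of the
    crack likelihoods predicted for negative instances.
    The quantile level is an illustrative assumption."""
    neg = np.asarray(neg_likelihoods, dtype=float)
    if neg.size == 0:
        raise ValueError("need at least one negative-instance likelihood")
    return float(np.quantile(neg, quantile))


def filter_positive_instances(pos_likelihoods, threshold):
    """Return indices of instances in a positive bag whose likelihood
    exceeds the threshold; the others are treated as unreliable."""
    pos = np.asarray(pos_likelihoods, dtype=float)
    return np.flatnonzero(pos > threshold)


def overlapping_tiles(height, width, tile_sizes=(64, 128), overlap=0.5):
    """Enumerate (top, left, size) square tiles at several scales.
    Tile sizes and overlap ratio are hypothetical defaults."""
    tiles = []
    for size in tile_sizes:
        if size > height or size > width:
            continue  # this scale does not fit inside the region
        stride = max(1, int(round(size * (1.0 - overlap))))
        tops = list(range(0, height - size + 1, stride))
        lefts = list(range(0, width - size + 1, stride))
        # Add a final tile so the bottom and right borders are covered.
        if tops[-1] != height - size:
            tops.append(height - size)
        if lefts[-1] != width - size:
            lefts.append(width - size)
        tiles.extend((t, l, size) for t in tops for l in lefts)
    return tiles


if __name__ == "__main__":
    thr = bag_threshold([0.05, 0.10, 0.20, 0.30, 0.40])
    kept = filter_positive_instances([0.10, 0.90, 0.50, 0.70], thr)
    print("threshold:", thr, "kept instances:", kept.tolist())
    print("tiles:", len(overlapping_tiles(128, 128)))
```

In this sketch, a 128 x 128 region tiled with 64-pixel tiles yields 4 instances without overlap and 9 with 50% overlap, so a thin crack is more likely to fall well inside at least one instance of a positive bag.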