DSWFNet: dual-branch fusion of spatial and wavelet features for road extraction from remote sensing images

Bibliographic Details
Main Authors: Zhang, Kunlun; As'arry, Azizan; Shen, Xibing; Hairuddin, Abdul Aziz; Hassan, Mohd Khair; Zhu, Liucun; Qin, Weirong
Format: Article
Language: English
Published: Springer Science and Business Media 2025
Online Access: http://psasir.upm.edu.my/id/eprint/124729/1/124729.pdf
http://psasir.upm.edu.my/id/eprint/124729/
https://www.nature.com/articles/s41598-025-34091-3
Description
Summary: Road extraction from remote sensing imagery is essential for urban planning, traffic monitoring, and emergency response. However, existing methods often focus solely on spatial-domain features, which limits their ability to model complex topological structures such as narrow or fragmented roads. To address this limitation, we propose DSWFNet, a dual-branch framework that fuses spatial-domain and frequency-domain features for road extraction. The model introduces a frequency-domain branch, constructed via the Discrete Wavelet Transform (DWT), to complement the RGB-based spatial branch in modeling fine image details. To further enhance feature representations, we design two dedicated attention mechanisms: the Multi-Scale Coordinate Channel Attention (MSCCA) module for spatial features and the Enhanced Frequency-Domain Channel Attention (EFDCA) module for frequency features. These are followed by a Bidirectional Cross Attention Module (BCAM) that enables deep interaction and fusion of the two feature types, significantly improving the model's sensitivity to road targets and its ability to preserve structural continuity. Experiments on two representative datasets validate the effectiveness of our approach. On the Massachusetts dataset, DSWFNet achieves an IoU of 66.07% and an F1 of 79.57%, improving upon the best spatial-domain method, OARENet, by 1.25% and 0.92%, respectively. On the CHN6-CUG dataset, it achieves an IoU of 70.76% and an F1 of 82.88%, surpassing the leading baseline by 1.64% and 1.13%, respectively.
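
The summary names two ingredients that are easy to illustrate in isolation: a DWT that splits an image into frequency sub-bands, and the IoU and F1 scores used to report results. The sketch below is only an illustration and is not taken from the article. It assumes a single-level Haar wavelet (the record does not say which wavelet DSWFNet uses), operates on one 2D channel with even height and width, and uses the hypothetical helper names `haar_dwt2`, `iou_f1`, and `f1_from_iou`. How the sub-bands are fed into the frequency branch is not described in this record and is not modeled here.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT (assumed wavelet, for illustration).

    x: 2D array with even height and width.
    Returns four half-resolution sub-bands: one approximation band and
    three detail bands (differences across columns, rows, and diagonals).
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0      # low-pass in both directions
    col_detail = (a - b + c - d) / 2.0  # difference between adjacent columns
    row_detail = (a + b - c - d) / 2.0  # difference between adjacent rows
    diag_detail = (a - b - c + d) / 2.0 # diagonal difference
    return approx, col_detail, row_detail, diag_detail


def iou_f1(pred, target):
    """IoU and F1 of the positive (road) class for two binary masks.

    Assumes at least one positive pixel in pred or target, otherwise
    the denominators are zero.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.sum(pred & target)    # road predicted as road
    fp = np.sum(pred & ~target)   # background predicted as road
    fn = np.sum(~pred & target)   # road predicted as background
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(f1)


def f1_from_iou(iou):
    """F1 implied by IoU when both come from the same TP/FP/FN counts."""
    return 2.0 * iou / (1.0 + iou)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    bands = haar_dwt2(image)
    print([band.shape for band in bands])

    pred = np.array([[1, 1], [0, 0]])
    target = np.array([[1, 0], [1, 0]])
    print(iou_f1(pred, target))

    # IoU values quoted in the summary, converted with the identity above.
    print(round(100 * f1_from_iou(0.6607), 2), round(100 * f1_from_iou(0.7076), 2))
```

When IoU and F1 are computed from the same pixel counts, they are tied by F1 = 2·IoU / (1 + IoU). The figures quoted in the summary agree with this relation to two decimals: an IoU of 66.07% corresponds to an F1 of 79.57%, and an IoU of 70.76% to an F1 of 82.88%.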