摘要
Objective Pixel-wise segmentation for synthetic aperture radar (SAR) images has been challenging due to the constraints of labeled SAR data, as well as the coherent speckle contextual information. Current semantic segmentation is challenged like existing algorithms as mentioned below: First, the ability to capture contextual information is insufficient. Some algorithms ignore contextual information or just focus on local spatial contextual information derived of a few pixels,and lack global spatial contextual information. Second, in order to improve the network performance, researchers are committed to developing the spatial dimension and ignoring the relationship between channels. Third, a neural network based high-level features extracted from the late layers are rich in semantic information and have blurred spatial details. A network based low-level features extraction contains more noise pixel-level information from the early layers. They are isolated from each other, so it is difficult to make full use of them. The most common ways are not efficient based on concatenate them or per-pixel addition. Method To solve these problems, a segmentation algorithm is proposed based on fully convolutional neural network (CNN). The whole network is based on the structure of encoder-decoder network. Our research facilitates a contextual encoding module and a feature fusion module for feature extraction and feature fusion. The different rates and channel attention mechanism based contextual encoding module consists of a residual connection, a standard convolution, two dilated convolutions. Among them, the residual connection is designed to neglect network degradation issues. Standard convolution is obtained by local features with 3 × 3 convolution kernel. After convolution, batch normalization and nonlinear activation function ReLU are connected to resist over-fitting. Dilated convolutions with 2 × 2 and 3 × 3 dilated rates extend the perception field and capture multi-scale features and local contextual features further. The channel attention mechanism learns the importance of each feature channel, enhances useful features in terms of this importance, inhibits features, and completes the modeling of the dependency between channels to obtain the context information of channels. First, the feature fusion module based global context features extraction is promoted, the in the high-level features. Specifically, the global average pooling suppresses each feature to a real number, which has a global perception field to some extent. Then, these numbers are embedding into the low-level features. The enhanced low-level features are transmitted to the decoding network, which can improve the effectiveness of up sampling. This module can greatly enhance its semantic representation with no the spatial information of low-level features loss, and improve the effectiveness of their integration. Our research carries out four contextual encoding modules and two feature fusion modules are stacked in the whole network. Result We demonstrated seven experimental schemes. In the first scheme, contextual encoder module (CEM) is used as the encoder block only; In the second scheme, we combined the CEM and the feature fusion module (FFM); the rest of them are five related methods like SegNet, U-Net, pyramid scene parsing network (PSPNet), FCN-DK3 and context-aware encoder network (CAEN). Our two real SAR images experiments contain a wealth of information scene experiment are Radarsat-2 Flevoland (RS2-Flevoland) and Radarsat-2 San-Francisco-Bay (RS2-SF-Bay) . The option of overall accuracy (OA), average accuracy (AA) and Kappa coefficient is as the evaluation criteria. The OA of the CEM algorithm on the two real SAR images is 91. 082% and 90. 903% respectively in comparison to the five advanced algorithms mentioned above. The CEM-FFM algorithm increased 2. 149% and 2. 390% compare to CEM algorithm. Conclusion Our illustration designs a CNN based semantic segmentation algorithm. It is composed of two aspects of contextual encoding module and feature fusion module. The experiments have their priorities of the proposed method with other related algorithms. Our proposed segmentation network has stronger feature extraction ability, and integrates low-level features and high-level features greatly, which improves the feature representation ability of the stable network and more accurate results of image segmentation.
- 单位