Abstract

Objective: With the widespread use of depth cameras and 3D scanning equipment, 3D data with point clouds as the main structure have become more readily available. As a result, 3D point clouds are widely used in practical applications such as self-driving cars, location recognition, robot localization, and remote sensing. In recent years, the great success of convolutional neural networks (CNNs) has changed the landscape of 2D computer vision. However, CNNs cannot directly process unstructured data such as point clouds because of the disordered, irregular nature of 3D point clouds. Therefore, mining shape features from disordered point clouds has become a viable research direction in point cloud analysis. Method: An end-to-end multidimensional multilayer neural network (MM-Net), which can directly process point cloud data, is presented in this paper. The multi-dimensional feature correction and fusion (MDCF) module corrects local features in different dimensions. First, the local area division unit, using farthest point sampling and ball query, constructs local areas at different radii, from which the 10D geometric relations and local features required by the module are obtained. Inspired by related research, the module uses geometric relations to modify the point-wise features, enhance the interaction between points, and encode useful local features, which are supplemented by the point-wise features. Finally, the shape features of the different region ranges are fused and mapped to a higher-dimensional space. Meanwhile, the multi-layer feature articulation (MLFA) module focuses on integrating the contextual relationships between local regions to extract global features. In particular, these local regions are treated as distinct nodes, and global features are acquired through convolution and skip fusion. The MLFA module uses the long-range dependencies between multiple layers to reason about the global shape required by the network.
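The local area division step described above, farthest point sampling followed by ball query, can be sketched as follows. This is a minimal NumPy illustration of the two standard operations (in the PointNet++ style), not the paper's implementation; the function names and the padding strategy are our assumptions.

```python
import numpy as np

def farthest_point_sampling(points, n_samples):
    """Iteratively pick the point farthest from the already-chosen set.

    points: (N, 3) array; returns indices of n_samples centers.
    """
    n = points.shape[0]
    chosen = np.zeros(n_samples, dtype=int)  # chosen[0] = 0: arbitrary start
    dist = np.full(n, np.inf)                # distance to nearest chosen point
    for i in range(1, n_samples):
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        chosen[i] = int(np.argmax(dist))     # farthest from all chosen so far
    return chosen

def ball_query(points, centers, radius, k):
    """For each center, gather up to k neighbor indices within `radius`.

    Groups with fewer than k hits are padded with the first hit
    (a common convention; the paper may handle this differently).
    """
    groups = []
    for c in centers:
        d = np.linalg.norm(points - c, axis=1)
        idx = np.flatnonzero(d <= radius)[:k]
        if idx.size == 0:                    # fall back to nearest point
            idx = np.array([int(np.argmin(d))])
        pad = np.full(k, idx[0])
        pad[: idx.size] = idx
        groups.append(pad)
    return np.stack(groups)                  # (n_centers, k) index array
```

Running `ball_query` with several radii over the same centers yields the multi-radius local areas from which the module's geometric relations and local features are computed.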
Furthermore, two network architectures, a multidimensional multi-layer feature classification network (MM-Net-C) and a multidimensional multi-layer feature segmentation network (MM-Net-S), are designed in this paper for point cloud classification and segmentation tasks. In detail, MM-Net-C passes through three tandem MDCF modules that produce three layers of interlinked local shape features. The global features are then obtained by connecting and integrating the correlations between the local regions through the MLFA module. In MM-Net-S, after processing by the MLFA module, the object data are encoded into a 1 024-dimensional global feature vector. The features are then summed to obtain shapes that fuse local and global information, so that they are linked to the labels of the objects (e.g., motorbikes, cars). This process is followed by feature propagation, in which successive upsampling operations recover the details of the original object data and produce a robust point-wise vector. Finally, the outputs of the different feature propagation layers are integrated and fed into a convolution operation, and the features are transformed to obtain an accurate prediction for each point within the object. Result: The method in this paper is thoroughly tested on the publicly available ModelNet40 and ShapeNet datasets, and the experimental results are compared with those of various methods. On the ModelNet40 dataset, MM-Net-C is compared with several pnt-based methods (which input only point cloud coordinates): it improves accuracy by 1.9% over the dynamic graph convolutional neural network (DGCNN) (92.2%) and by 0.5% over the relation-shape convolutional neural network (RS-CNN) (93.6%). MM-Net-C is also compared with several pnt-nor methods (which input both the coordinates and the normal vectors of the point cloud): it improves accuracy by 2.4% over point attention transformers (PAT) (91.7%), by 1.6% over PointConv (92.5%), and by 0.9% over PointASNL (93.2%).
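The feature propagation stage in MM-Net-S upsamples features from the sparse sampled points back to the original points. The abstract does not specify the interpolation scheme, so the sketch below assumes the common inverse-distance-weighted k-nearest-neighbor interpolation (as in PointNet++-style feature propagation); the function name and parameters are illustrative.

```python
import numpy as np

def propagate_features(dense_xyz, sparse_xyz, sparse_feat, k=3, eps=1e-8):
    """Upsample per-point features by inverse-distance interpolation.

    dense_xyz:  (N, 3) coordinates of the full-resolution points
    sparse_xyz: (M, 3) coordinates of the downsampled points (M < N)
    sparse_feat:(M, C) features attached to the sparse points
    Returns an (N, C) array of interpolated features.
    """
    out = np.zeros((dense_xyz.shape[0], sparse_feat.shape[1]))
    for i, p in enumerate(dense_xyz):
        d = np.linalg.norm(sparse_xyz - p, axis=1)
        nn = np.argsort(d)[:k]               # k nearest sparse points
        w = 1.0 / (d[nn] + eps)              # closer points weigh more
        w = w / w.sum()                      # normalize weights
        out[i] = (w[:, None] * sparse_feat[nn]).sum(axis=0)
    return out
```

Stacking several such interpolation steps, each followed by shared convolutions over the concatenated skip features, recovers a point-wise vector at the original resolution for per-point label prediction.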
Even when several studies input more points for training, MM-Net-C still outperforms them. For example, it improves accuracy by 2.2% over PointNet++ (5 k points, 91.9%) and by 0.7% over the self-organizing network (SO-Net) (5 k points, 93.4%). In addition, MM-Net-C achieves higher accuracy than other studies at lower complexity. For example, compared with PointCNN (8.20 M, 91.7%), MM-Net-C has less than one-eighth the number of parameters while improving accuracy by 2.4%. Compared with RS-CNN (1.41 M, 93.6%), MM-Net-C has 0.33 M fewer parameters while improving accuracy by 0.5%. On the ShapeNet dataset, MM-Net-S improves accuracy by 1.4% over DGCNN (85.1%), by 0.8% over the shape-oriented convolutional neural network (SO-CNN) (85.7%), and by 0.4% over annularly convolutional neural networks (A-CNN) (86.1%). Ablation experiments are also conducted on the ModelNet40 dataset to confirm the effectiveness of the MM-Net architecture. The ablation results validate the need for the MDCF and MLFA module designs. They further confirm that the MDCF module, which modifies rich point-wise features and fuses them with potential local features, can effectively improve the network's mining of shape information within a local region. By contrast, the MLFA module captures contextual information at the global scale and reinforces the long-range dependencies between different layers, effectively enhancing the robustness of the model in dealing with complex shapes. Ablation experiments are also conducted on whether the MDCF module needs to be designed with different dimensions; the results demonstrate that MM-Net outperforms RS-CNN at the same dimensionality. Conclusion: In this paper, an MM-Net with the MDCF and MLFA modules as its core components is proposed.
Extensive experiments and thorough comparisons verify that MM-Net achieves higher accuracy with the advantage of fewer parameters.

Full text