A Feature Discretization Method Based on Fuzzy Rough Sets for High-resolution Remote Sensing Big Data Under Linear Spectral Model
Peer reviewed, Journal article
Original version: IEEE Transactions on Fuzzy Systems. 2021. 10.1109/TFUZZ.2021.3058020
As one of the most relevant data preprocessing techniques, discretization plays an important role in data mining, which is widely applied in industrial control. It transforms continuous features into discrete ones, improving the efficiency of data processing and accommodating learning algorithms that require discrete inputs. However, when applied to the preprocessing of high-resolution remote sensing big data, traditional discretization methods have shortcomings: highly complex programs, an excessive number of intervals, and significant loss of necessary information. Moreover, the large number of mixed pixels in an image is a primary source of uncertainty in remote sensing information systems. Current discretization methods assume that each pixel corresponds to the spectral information of a single object and ignore the uncertainty caused by a mixed spectrum, which causes classification accuracy to drop after discretization. We propose a discretization method for high-resolution remote sensing big data. We determine the membership degree of each pixel in the training samples through linear decomposition and establish the individual fitness function based on a fuzzy rough model. An adaptive genetic algorithm selects the discrete breakpoints, and a MapReduce framework calculates the individual fitness of the population in parallel, to obtain the optimal discretization scheme in minimal time. Our method is compared with state-of-the-art discretization algorithms on real remote sensing datasets. The experiments verify the effectiveness of the proposed method, which provides strong support for the subsequent processing of images.
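To make two of the ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a simplified linear spectral model with only two endmembers, for which the least-squares membership degrees have a closed form, and it shows how a set of breakpoints maps continuous feature values to interval indices. The function names, the endmember spectra, and the breakpoints are all hypothetical. The fuzzy rough fitness function, the adaptive genetic algorithm that would choose the breakpoints, and the MapReduce parallelization are omitted.

```python
from bisect import bisect_right


def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Return (membership_a, membership_b) for a pixel modeled as a
    linear mix of two endmember spectra (hypothetical simplification)."""
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmember spectra must differ")
    # Least-squares abundance of endmember_a under the sum-to-one constraint.
    alpha = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff)) / denom
    # Clip to [0, 1] so that both memberships stay non-negative.
    alpha = min(1.0, max(0.0, alpha))
    return alpha, 1.0 - alpha


def discretize(value, breakpoints):
    """Map a continuous value to the index of the interval it falls in.
    `breakpoints` must be sorted in ascending order."""
    return bisect_right(breakpoints, value)


# Hypothetical three-band spectra, chosen only for illustration.
water = [0.1, 0.2, 0.3]
soil = [0.5, 0.6, 0.7]
mixed = [0.2, 0.3, 0.4]

memberships = unmix_two_endmembers(mixed, water, soil)
codes = [discretize(v, [0.25, 0.5]) for v in mixed]
```

In this toy setup the mixed pixel lies three quarters of the way from the soil spectrum toward the water spectrum, so its membership in the water class is about 0.75. With more than two endmembers, the abundances would require a constrained least-squares solver instead of this closed form.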