Current students


VALI AVA
Cycle: XXXIII

Section: Computer Science and Engineering
Advisor: COMAI SARA
Tutor: AMIGONI FRANCESCO

Major Research topic:
Hyperspectral Image Analysis and Advanced Feature Engineering for Optimized Classification and Acquisition

Abstract:
Hyperspectral imaging (HSI) combines conventional imaging with spectroscopy to capture spatial and spectral information from an object simultaneously. HSI produces images in two spatial dimensions in which each pixel carries spectral information (within a spectral range from the visible to the mid-infrared), and this spectral information forms the third dimension of the image. Hence, such images are often called hypercube data or cubic images. In the last decade, concurrent advances in HSI technologies and computational capacity have introduced hyperspectral sensors to new domains and applications. Beyond remote sensing, the field in which HSI originated, there has recently been a surge of interest in HSI technology in diverse fields such as industry, agriculture, food quality and safety, pharmaceuticals, and healthcare.
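As a minimal sketch of the hypercube layout described above, the snippet below builds a synthetic cube with NumPy. The dimensions (4 x 5 pixels, 200 bands) and the random values are illustrative assumptions, not data from this research.

```python
import numpy as np

# Hypothetical cube size: 4 x 5 pixels, 200 spectral bands (assumed values).
rows, cols, bands = 4, 5, 200
rng = np.random.default_rng(0)
cube = rng.random((rows, cols, bands))  # synthetic stand-in for measured spectra

# The spectrum of one pixel is a vector along the third (spectral) dimension.
spectrum = cube[2, 3, :]

# A single band is an ordinary two-dimensional grayscale image.
band_image = cube[:, :, 50]

# Flattening the spatial dimensions gives one row per pixel and one column
# per band, the layout most classifiers expect.
pixels = cube.reshape(-1, bands)
```

In this layout, the spectral bands play the role of features and each pixel is one sample.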

The significant advantages of hyperspectral images stem from the fact that they contain the information needed to detect objects and to identify materials with different spectral absorption characteristics. However, this type of data comes with novel challenges of its own, such as the so-called curse of dimensionality (due to the high dimensionality of the data) and the redundancy and noise within the data. From a machine learning perspective, the high dimensionality of the feature space (i.e., the spectral dimension) brings an inevitable need for an enormous number of training samples, as the characteristics of each feature combination can only be properly identified if there are enough samples with that exact feature combination. Moreover, classification in such problems is expensive in terms of time and computation, and is usually perceived as inefficient from the application point of view. At the same time, the large number of spectral bands causes substantial redundancy: some spectral information does not contribute to the processing task yet places a heavy computational burden on the process. Noisy spectral features, caused by sensor errors or by external conditions, are another form of such redundancy. These challenges strongly motivate establishing dedicated research frameworks for this type of data.
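The band redundancy described above can be illustrated with a small synthetic experiment. The sketch below uses principal component analysis (PCA), chosen here purely for illustration and not necessarily the method used in this research, on simulated pixels that are mixtures of three hypothetical material spectra plus a small amount of noise. All sizes and the noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_bands, n_materials = 500, 100, 3  # assumed sizes

# Each synthetic pixel is a random mixture of three material spectra,
# so the 100 bands carry far fewer than 100 independent signals.
endmembers = rng.random((n_materials, n_bands))
abundances = rng.random((n_pixels, n_materials))
noise = 0.01 * rng.standard_normal((n_pixels, n_bands))
X = abundances @ endmembers + noise

# PCA via singular value decomposition of the mean-centered data.
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
cumulative = np.cumsum(explained)

# Smallest number of components whose cumulative variance reaches 99%.
k = int(np.searchsorted(cumulative, 0.99) + 1)
print(f"{k} of {n_bands} components explain 99% of the variance")
```

On data constructed this way, a handful of components capture nearly all of the variance. This is the sense in which many spectral bands are redundant for the processing task.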

In addition to the aforementioned data-related challenges, emerging hyperspectral applications are prone to a critical classification problem that is usually referred to as “ground-truth scarcity”, or the lack of a sufficient amount of labeled data. Most prediction tasks rely on prior knowledge captured from the domain’s data distribution. For new applications, collecting enough of this knowledge to guarantee the efficiency and effectiveness of machine learning methods is challenging and expensive in terms of time and manual labor. Moreover, each HSI acquisition may result in a different data distribution due to slight changes or biases caused by the sensor, the source of illumination, or any other acquisition condition. This intrinsic variation in the data distribution accentuates the challenge of ground-truth scarcity. These problems highlight the importance of transferring knowledge from one domain, defined by a particular data distribution, to another. Such cross-domain classification of hyperspectral images therefore requires investigation.

The research project carried out within the framework of my PhD thesis concerns the design of a classification model in which advanced computer vision and machine learning-based analysis of hyperspectral images is used to tackle three major issues common to such data: ground-truth scarcity, or the lack of sufficient labeled data for effective supervised classification; the redundancy and variety of noise within the data; and high dimensionality, which leads to classification inefficacy (the curse of dimensionality). To enable richer insights into these three challenges and their solutions, this study is carried out in two distinct application domains: the “remote sensing” domain, where hyperspectral technology is employed to capture objects on the Earth's surface from a distance (with airborne or spaceborne sensors), and an industrial domain, in which “smart manufacturing” can exploit hyperspectral images for real-time inspection of particular materials, such as contaminants. The goal of this PhD research is to establish a framework, or base model, for a wide range of hyperspectral image classification problems, with which these three major challenges can be effectively managed. This will eventually ease the process of generating solutions for emerging applications.