Transforming ocean surveying through the power of DL and statistical methods

Work packages

1. WP1 Bayesian deep learning

DL methods often struggle to quantify the uncertainty in their predictions. One of the main advantages of the Bayesian approach is that uncertainty can be handled by sampling from the posterior distribution. Moreover, relevant prior information about the variables subject to inference can easily be incorporated into the Bayesian model.
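As a rough illustration of how posterior sampling yields uncertainty estimates, the sketch below uses a Bayesian linear model as a simple stand-in for a deep network. The function name `posterior_predictive`, the synthetic data, and the Gaussian prior and noise assumptions are all illustrative choices, not part of the project's methodology. Prior knowledge enters through `prior_mean` and `prior_cov`.

```python
import numpy as np

def posterior_predictive(X, y, X_new, prior_mean, prior_cov,
                         noise_var=0.1, n_samples=2000, seed=0):
    """Sample weights from the Gaussian posterior of a linear model and
    return the predictive mean and standard deviation at X_new.
    The spread reflects parameter uncertainty only, not observation noise."""
    rng = np.random.default_rng(seed)
    prior_prec = np.linalg.inv(prior_cov)
    # Conjugate update for a Gaussian prior with Gaussian noise (assumed known).
    post_cov = np.linalg.inv(prior_prec + X.T @ X / noise_var)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    # Draw weight samples from the posterior and push them through the model.
    w_samples = rng.multivariate_normal(post_mean, post_cov, size=n_samples)
    preds = w_samples @ X_new.T  # shape (n_samples, n_new)
    return preds.mean(axis=0), preds.std(axis=0)

# Hypothetical synthetic data: y = 1 + 2x + noise, observed for x in [0, 1].
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=30)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=np.sqrt(0.1), size=30)

# Predict inside the observed range (x = 0.5) and far outside it (x = 5.0).
X_new = np.array([[1.0, 0.5], [1.0, 5.0]])
mean, std = posterior_predictive(X, y, X_new,
                                 prior_mean=np.zeros(2),
                                 prior_cov=10.0 * np.eye(2))
print("predictive mean:", mean)
print("predictive std: ", std)
```

In this toy setting the predictive standard deviation is larger for the point far from the training data, which is the kind of behaviour one hopes to obtain from a Bayesian treatment of a deep network as well.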

Important research directions in this work package include:

i) Knowledge-driven prior models

ii) Sparse networks for more efficient computation

iii) Connections between DL and LassoNet

iv) New knowledge about winning lottery tickets in DL

Once a posterior model for all parameters is available, it can be used for future predictions. Moreover, it can serve as a prior model for similar tasks in new areas. This work package is carried out in close collaboration with Visual Intelligence, where one of the underlying goals is to obtain precise uncertainty estimates for DL methods. Another important collaborator is FFI, through the RCN-financed project "Fast uncertainty estimation in deep learning applied to object recognition in sonar images", where the underlying goal is to speed up DL methodology by using sparse networks. Sparse networks may also lead to models that are easier to interpret. More efficient methods developed in this work package will also be very useful in the pipelines of the industry partners Argeo and Multiconsult. The more theoretical parts of this work package (items i), iii) and iv) above) will be carried out in close collaboration with Integreat, a center of excellence financed by RCN. Since computational cost is an important limiting factor, it may be necessary to speed up the training phase by parallelization, e.g., by training different network models on different processors. This will be done using the computing resources available at the University of Tromsø (UiT).
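The idea of training different network models on different processors can be sketched as follows. This is a minimal illustration under stated assumptions: `train_one` and `train_all` are hypothetical names, the tiny NumPy network and synthetic data are placeholders for real models and survey data, and an actual cluster setup would likely rely on a job scheduler rather than a single process pool.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np

def train_one(config):
    """Train one small one-hidden-layer network; return (config, final loss).
    config = (hidden width, initialization seed), both hypothetical settings."""
    hidden, seed = config
    data_rng = np.random.default_rng(0)      # same synthetic data for every model
    X = data_rng.normal(size=(200, 3))
    y = np.sin(X[:, :1]) + 0.1 * data_rng.normal(size=(200, 1))
    rng = np.random.default_rng(seed)        # model-specific initialization
    W1 = rng.normal(scale=0.5, size=(3, hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    lr = 0.05
    for _ in range(300):
        H = np.tanh(X @ W1)                  # hidden activations
        err = H @ W2 - y                     # residuals
        grad_W2 = H.T @ err / len(X)
        grad_W1 = X.T @ ((err @ W2.T) * (1 - H**2)) / len(X)
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    loss = float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))
    return config, loss

def train_all(configs, max_workers=None):
    """Train each configuration in its own worker process."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_one, configs))

if __name__ == "__main__":
    configs = [(h, s) for h in (4, 8) for s in (0, 1)]
    for config, loss in train_all(configs, max_workers=2):
        print(config, round(loss, 4))
```

Because the models are independent of each other, no communication between workers is needed, and the achievable speed-up is mainly limited by the number of available processors.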

2. WP2 Novel approaches to heterogeneous data

This work package aims to develop novel directions in classification for cases where the input to the classifier comes from many highly heterogeneous sources, which in the machine learning literature is frequently referred to as multimodal data. This is highly relevant in our test case, the classification of a set of objects, because multiple sources of information are available. Specifically, these sources include digital 2D images, acoustic images, and video sequences. In addition, we have auxiliary data such as contextual descriptions of sediment samples (geographical location and selected physical and chemical properties), water column data (current speed, temperature, salinity, etc.), and various seabed geophysical measurements such as multibeam echosounder (MBES) data, backscatter data, and side-scan sonar.

One important research direction will be to combine the different data sources using the DIVAS methodology developed by Professor Marron, who is one of the important project collaborators. Novel methods with applications to both geoscience and medicine will be developed during postdoc Myrvoll-Nilsen’s visit to UNC at Chapel Hill, where he collaborates closely with Marron and his coworkers. A crucial idea in DIVAS is that knowledge of the sources’ correlation and variation, and most importantly their joint variation, can be used to improve classification results.

Another crucial research direction in this work package concerns self-supervised learning (SSL), which in essence means learning patterns from massive data sets where little or no labelled data exist. SSL has gained momentum in recent years through impressive results in many applications. The underlying idea in SSL methods is to construct a pretext task that a network should solve based on the data samples only. After the deep network has been trained on the pretext task, it is used to generate internal representations of the data samples and to solve the downstream task. The impressive results obtained so far for microfossils in Martinsen et al. (2024) indicate that this is indeed a very valuable direction for future research. By linking the classifier to various auxiliary data, this project enables multiproxy analysis of relationships between marine fauna, flora and physical/chemical parameters. This further allows for improved interpretation of the seabed environment in relation to both natural variability and anthropogenic impacts (e.g., pollution, ocean acidification and climate change).
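The pretext-then-downstream workflow described above can be sketched in a few lines. The example is only a toy under stated assumptions: masked-input reconstruction is used as one possible pretext task (not necessarily the one used in Martinsen et al. (2024)), a linear encoder stands in for a deep network, the data are synthetic, and all names (`pretrain_encoder`, `nearest_centroid_predict`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: two classes separated along one direction in 20 dimensions.
dim = 20
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

def sample(labels):
    """Generate one sample per label: class signal plus unit Gaussian noise."""
    signs = 2 * np.asarray(labels) - 1
    return 3.0 * signs[:, None] * direction + rng.normal(size=(len(labels), dim))

X_unlab = sample(rng.integers(0, 2, size=500))   # labels are discarded
y_lab = np.array([0, 1] * 5)                     # only 10 labelled samples
X_lab = sample(y_lab)
y_test = rng.integers(0, 2, size=200)
X_test = sample(y_test)

def pretrain_encoder(X, code_dim=2, steps=500, lr=0.01, mask_rate=0.3):
    """Pretext task: reconstruct each sample from a randomly masked copy."""
    W_enc = 0.1 * rng.normal(size=(X.shape[1], code_dim))
    W_dec = 0.1 * rng.normal(size=(code_dim, X.shape[1]))
    for _ in range(steps):
        mask = rng.random(X.shape) > mask_rate   # keep roughly 70% of the entries
        X_in = X * mask
        Z = X_in @ W_enc                         # internal representation
        err = Z @ W_dec - X                      # reconstruction error
        grad_dec = Z.T @ err / len(X)
        grad_enc = X_in.T @ (err @ W_dec.T) / len(X)
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Downstream task: assign each test point to the closest class centroid."""
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

W_enc = pretrain_encoder(X_unlab)                # uses unlabelled data only
pred = nearest_centroid_predict(X_lab @ W_enc, y_lab, X_test @ W_enc)
accuracy = float((pred == y_test).mean())
print("downstream accuracy with 10 labels:", accuracy)
```

The point of the design is that the encoder is learned entirely from unlabelled samples, so the few available labels are spent only on the much simpler downstream classifier.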

3. WP3 Multimodal imaging

The aim of this work package is to produce data for training algorithms that automatically identify macro- and micro-scale elements of interest at and in the seabed from acoustic (sonar) data, video sequences and digital photos. On the macro-scale, the data is used to train algorithms to identify seabed types, selected marine habitats with their fauna and flora, and large-scale natural and man-made/derived objects. On the micro-scale, data is extracted from marine sediments. Here the focus is on training algorithms to identify groupings within the foraminiferal microfauna (presence/absence; overall abundance of the types planktic, agglutinated and calcareous benthic; and indicator species), as well as the presence/absence of microplastic particles and their overall abundance.

The data produced can be used to map the seafloor and to identify the state of the marine environment in a particular area before and after, e.g., the establishment of fish farms, the construction of oil/gas installations, the deposition of terrestrial waste, or other major disturbances. Additionally, it may aid the identification of unknown vulnerable areas/niches of high marine conservation value, which in turn can be subjected to further examination at both macro- and micro-scale. The automatic identification of macro- and micro-scale marine data enables more comprehensive and cost-effective data acquisition from the marine environment. In combination, such data allow for a fuller elucidation of the causes and effects of natural and anthropogenic influences, including climate change and ocean acidification, and facilitate a more holistic evaluation of the state of the marine environment and its development.