A hybrid transformer with domain adaptation using interpretability techniques for the application to the detection of risk situations
Language
EN
Journal article
This document was published in
Multimedia Tools and Applications. 2024-03-11
Abstract in English
Multi-modal data processing calls for multimedia approaches to detect and recognize specific events in the data. Hybrid architectures that take time series and image/video inputs within a twin-CNN framework have shown improved performance compared to mono-modal approaches. In transfer learning, pre-trained models are commonly reused and only the last few layers of the network are fine-tuned. This setting often suffers from distribution shift: in a real-world scenario, the shift between the source and target domains can yield poor classification results. Interpretability techniques for deep neural networks can highlight important features in trained models and can also be used to reinforce those features during training, so the target-domain model can be initialized with improved weights. The dimensions of the data may also differ between datasets. We propose a method for model transfer that adapts the data dimension and improves initialization using interpretability approaches.
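
The abstract does not say how the data dimension is adapted. As an illustration only, the sketch below shows one simple way a layer trained on inputs of one size could be reused for inputs of another size, by linearly interpolating each row of its weight matrix along the input axis. The interpolation scheme and the names `resize_weights`, `adapt_layer`, `source_weights` and `target_weights` are assumptions made for this example; they are not taken from the article and should not be read as the authors' method.

```python
def resize_weights(row, new_dim):
    """Linearly interpolate a 1-D list of weights to length new_dim.

    Assumption: interpolation along the input axis is an illustrative
    choice, not the adaptation scheme described in the article.
    """
    old_dim = len(row)
    if new_dim == 1 or old_dim == 1:
        # Degenerate case: repeat the first weight.
        return [row[0]] * new_dim
    out = []
    for j in range(new_dim):
        # Position of target index j on the source axis.
        pos = j * (old_dim - 1) / (new_dim - 1)
        lo = int(pos)
        hi = min(lo + 1, old_dim - 1)
        frac = pos - lo
        out.append((1 - frac) * row[lo] + frac * row[hi])
    return out


def adapt_layer(weights, new_dim):
    """Resize every row (one per output unit) to the new input dimension."""
    return [resize_weights(row, new_dim) for row in weights]


# Hypothetical source-domain layer: 2 output units, 3 input features.
source_weights = [[0.0, 1.0, 2.0], [3.0, 3.0, 3.0]]

# Target-domain data has 5 input features, so the layer is resized
# before being used to initialize the target-domain model.
target_weights = adapt_layer(source_weights, 5)
print(target_weights)
```

The resized weights would only serve as a starting point; the target-domain model would still be fine-tuned on target data.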