Show simple item record

dc.rights.license: open
dc.contributor.author: GONZALEZ-DIAZ, Ivan
dc.contributor.author: MOLINA-MORENO, Miguel
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: BENOIS-PINEAU, Jenny
hal.structure.identifier: Institut de Neurosciences cognitives et intégratives d'Aquitaine [INCIA]
dc.contributor.author: DE RUGY, Aymar
dc.date.accessioned: 2024-09-30T08:42:13Z
dc.date.available: 2024-09-30T08:42:13Z
dc.date.issued: 2024-07-18
dc.identifier.issn: 2168-2208
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/202009
dc.description.abstractEn: This work tackles the problem of automatically predicting the grasping intention of humans observing their environment, using eye-tracker glasses and video cameras that record the scene view. Our target application is assistance to people with motor disabilities and potential cognitive impairments through assistive robotics. Our proposal leverages the analysis of human attention, captured as gaze fixations recorded by an eye-tracker on first-person video, since the anticipation of prehension actions is a well-studied and well-known phenomenon. We propose a multi-task system that simultaneously addresses the prediction of human attention in the near future and the anticipation of grasping actions. In our model, visual attention is modeled as a competitive process between a discrete set of states, each associated with a well-known gaze movement pattern from visual psychology. We additionally consider an asymmetric multi-task problem, in which attention modeling is an auxiliary task that helps regularize the learning of the main action prediction task, and propose a constrained multi-task loss that naturally handles this asymmetry. Our model outperforms other losses for dynamic multi-task learning, the current dominant deep architectures for general action forecasting, and models specifically tailored to predicting grasping intention. In particular, it achieves state-of-the-art performance on three datasets for egocentric action anticipation, with an average precision of 0.569 and 0.524 on the GITW and Sharon datasets, respectively, and an accuracy of 89.2% and a success rate of 51.7% on the Invisible dataset.
dc.language.iso: EN
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject.en: Grasping Action Forecasting
dc.subject.en: Multi-Task Learning
dc.subject.en: Interpretable Attention Prediction
dc.subject.en: Constrained Loss
dc.title.en: Asymmetric multi-task learning for interpretable gaze-driven grasping action forecasting
dc.title.alternative: IEEE J Biomed Health Inform
dc.type: Journal article
dc.identifier.doi: 10.1109/JBHI.2024.3430810
dc.subject.hal: Life Sciences [q-bio]/Neurosciences [q-bio.NC]
dc.identifier.pubmed: 39024089
bordeaux.journal: IEEE Journal of Biomedical and Health Informatics
bordeaux.page: 1-17
bordeaux.hal.laboratories: Institut de neurosciences cognitives et intégratives d'Aquitaine (INCIA) - UMR 5287
bordeaux.institution: Université de Bordeaux
bordeaux.institution: CNRS
bordeaux.peerReviewed: yes
bordeaux.inpress: no
bordeaux.identifier.funderID: Spanish National Plan for Scientific and Technical Research and Innovation
bordeaux.import.source: hal
hal.identifier: hal-04684197
hal.version: 1
hal.popular: no
hal.audience: International
hal.export: false
workflow.import.source: hal
dc.rights.cc: CC BY-NC-ND


Files in this item


This item appears in the following Collection(s)
