Show simple item record

dc.rights.license: open [en_US]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: XU, Binbin
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: GIL-JARDINE, Cedric
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: THIESSARD, Frantz
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: TELLIER, Eric
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: AVALOS FERNANDEZ, Marta
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: LAGARDE, Emmanuel
dc.date.accessioned: 2021-02-23T09:02:28Z
dc.date.available: 2021-02-23T09:02:28Z
dc.date.issued: 2020
dc.date.conference: 2020-05-19
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/26321
dc.description.abstractEn: To build a French national electronic injury surveillance system based on emergency room visits, we aim to develop a coding system that classifies visit causes from free-text clinical notes. Supervised learning techniques have shown good results in this area but require large expert-annotated datasets, which are time-consuming and costly to obtain. We hypothesize that a Transformer-based natural language processing model incorporating a generative self-supervised pre-training step can significantly reduce the number of annotated samples required for supervised fine-tuning. In this preliminary study, we test our hypothesis on the simplified problem of predicting from free-text clinical notes whether a visit is the consequence of a traumatic event. Using fully retrained GPT-2 models (without OpenAI pre-trained weights), we assess the gain of applying a self-supervised pre-training phase on unlabeled notes prior to the supervised learning task. Results show that the amount of labeled data required to achieve a given level of performance (AUC > 0.95) was reduced by a factor of 10 when pre-training was applied. Namely, with 16 times more data, the fully supervised model achieved an AUC improvement of less than 1%. To conclude, it is possible to adapt a multipurpose neural language model such as GPT-2 to create a powerful tool for classifying free-text notes with only a small number of labeled samples.
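
The abstract's two-phase recipe can be illustrated with a minimal sketch. This is not the authors' code: it assumes PyTorch and the Hugging Face transformers library, and the model size, pad token id, and the random tensors standing in for tokenized notes are illustrative placeholders.

```python
# Hedged sketch of the approach described in the abstract: train a GPT-2
# from scratch as a language model on unlabeled notes, then reuse its
# transformer body for supervised classification of trauma vs. non-trauma.
import torch
from transformers import (GPT2Config, GPT2LMHeadModel,
                          GPT2ForSequenceClassification)

config = GPT2Config(vocab_size=5000, n_positions=256,
                    n_embd=256, n_layer=4, n_head=4)  # toy size, not the paper's

# Phase 1: generative self-supervised pre-training on unlabeled notes.
# The model is randomly initialized, i.e. no OpenAI pre-trained weights.
lm = GPT2LMHeadModel(config)
opt = torch.optim.AdamW(lm.parameters(), lr=5e-5)
unlabeled = torch.randint(1, config.vocab_size, (8, 128))  # toy token ids
loss = lm(input_ids=unlabeled, labels=unlabeled).loss      # next-token loss
loss.backward()
opt.step()
opt.zero_grad()

# Phase 2: supervised fine-tuning on the binary trauma / non-trauma task,
# initializing the classifier's transformer body with the pre-trained weights.
config.num_labels = 2
config.pad_token_id = 0  # assumption: token id 0 is the pad token
clf = GPT2ForSequenceClassification(config)
clf.transformer.load_state_dict(lm.transformer.state_dict())

opt = torch.optim.AdamW(clf.parameters(), lr=2e-5)
notes = torch.randint(1, config.vocab_size, (8, 128))  # labeled toy notes
labels = torch.randint(0, 2, (8,))                     # 1 = traumatic event
loss = clf(input_ids=notes, labels=labels).loss        # cross-entropy, 2 classes
loss.backward()
opt.step()
opt.zero_grad()
```

The point of the sketch is the weight transfer between phases: only the classification head starts from random initialization in phase 2, which is what lets the fine-tuning stage make do with far fewer labeled notes.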
dc.language.iso: EN [en_US]
dc.publisher: The AAAI Press [en_US]
dc.subject: SISTM
dc.subject: ERIAS
dc.subject: IETO
dc.title.en: Pre-Training a Neural Language Model Improves the Sample Efficiency of an Emergency Room Classification Model
dc.title.alternative: FLAIRS-33 - Thirty-Third International FLAIRS Conference [en_US]
dc.type: Conference paper with proceedings [en_US]
dc.subject.hal: Life Sciences [q-bio]/Public health and epidemiology [en_US]
bordeaux.page: 264-9 [en_US]
bordeaux.hal.laboratories: Bordeaux Population Health Research Center (BPH) - U1219 [en_US]
bordeaux.institution: Université de Bordeaux [en_US]
bordeaux.conference.title: FLAIRS-33 - Thirty-Third International FLAIRS Conference [en_US]
bordeaux.country: us [en_US]
bordeaux.title.proceeding: Proceedings of the Thirty-Third International Florida Artificial Intelligence Research Society Conference [en_US]
bordeaux.team: SISTM_BPH
bordeaux.team: ERIAS [en_US]
bordeaux.team: IETO [en_US]
bordeaux.conference.city: Palo Alto [en_US]
bordeaux.peerReviewed: yes [en_US]
hal.export: false
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2020&rft.spage=264-9&rft.epage=264-9&rft.au=XU,%20Binbin&GIL-JARDINE,%20Cedric&THIESSARD,%20Frantz&TELLIER,%20Eric&AVALOS%20FERNANDEZ,%20Marta&rft.genre=proceeding


Files in this item

No files associated with this item.

This item appears in the following collection(s)
