Metadata
(A)KDD for Structuring Destructured Documents
Language
EN
Conference paper
Published in
Proceedings of the 2018 International Conference on Artificial Intelligence ICAI'18, 2018 International Conference on Artificial Intelligence ICAI'18, 2018-07-30, Las Vegas, NV. 89.
English abstract
The worldwide volume of digital data doubles every 9 months. Over 75% of these data are unstructured. This paper concerns structuring the graphic information contained in vector files, including PDF (Portable Document Format) files, which represent a very significant share of all these vector files. We say these data are destructured because they are today produced by software. When they are stored or exchanged (for example, in PDF files), only the graphic information is kept, and the data structure used to create the document is lost. To structure these data, we use Knowledge Discovery in Databases (KDD). The following two issues arise:
• Can the KDD method be adapted to “Structuring Destructured Documents”?
• If so, what points need adapting or highlighting in the method to solve the issue of structuring essentially graphic documents?
The answer to both questions is “YES”, with the human in the loop. This is why we talk about the Anthropocentric Knowledge Discovery in Databases method, abbreviated to (A)KDD.
English keywords
CHI
PDF
KDD
Graphic reconstruction
Pattern recognition
Data Mining
Research units