Explore projects
Thèse Guillaume Bernard / Développement / from events to documents / wikivents-projects / wikivents
GNU General Public License v3.0 or later
A Python package to process and represent events from ontologies and semi-structured databases such as Wikidata and Wikipedia.
Archived
Python framework to identify and rank crisis-related tweets based on their informativeness.
Thèse Guillaume Bernard / Jeux de données / dataset_manipulation_tools / synthesise_ocr_and_segmentation_errors_in_texts
GNU General Public License v3.0 or later
This software degrades texts written in any natural language by applying OCR-style degradation (phantom characters, character degradation, etc.) and by over-segmenting them, that is, splitting them at regular intervals into equal parts.
This is useful for reproducing common errors found in historical documents when historical data is missing.
Archived
Thèse Guillaume Bernard / Développement / from events to documents / request_documents_based_on_events_they_report
GNU General Public License v3.0 or later
Requests to collect documents that report real-world events (themselves described using wikivents) and are stored in a global index (provided by database_infrastructure_text_mining).
Archived
pelaverse / pelaSIG
GNU General Public License v3.0 only
acadiie / datasets / PARES Dataset Tools
MIT License
Tools to manipulate the PARES dataset, images, and annotations, and to connect to ArkIndex.
Thèse Guillaume Bernard / Développement / from documents to events / news_tracking
GNU General Public License v3.0 or later
Command-line tools to manipulate the document_tracking architecture. They make it possible to train the Miranda algorithm, to run it, and to run the alternative K-Means implementation. A tool to evaluate the results is also provided.
Archived