Explore projects
Thèse Guillaume Bernard / Jeux de données / dataset_manipulation_tools / compute_dense_vectors
GNU General Public License v3.0 or later
This software computes dense vectorisations (sentence embeddings) of sequences of sentences of natural text. It can handle multilingual documents as long as the model used is itself multilingual. It relies on the S-BERT architecture, software and models (https://www.sbert.net/). It computes dense vector representations for the tokens, lemmas, entities, etc. of your datasets.
Archived
Visualisation du registre des traitements / Application web de visualisation du registre légal des traitements
CeCILL-B Free Software License Agreement
Open-source project for interactive visualisation of the register of data processing activities of the La Rochelle agglomeration.
Thèse Guillaume Bernard / Développement / from events to documents / request_documents_based_on_events_they_report
GNU General Public License v3.0 or later
Requests to collect documents relating to real-world events (themselves described using wikivents) stored in a global index (provided by database_infrastructure_text_mining).
Archived
Methods to account for digit preference (heaping) in wildlife count data
This competition aims to improve (denoise) OCR-ed texts, on a testbed of more than 20 million characters from English, French, German, Finnish, Spanish, Dutch, Czech, Bulgarian, Slovak and Polish.