WikidataCon 2023

Yann Almeras


Session

10-28
23:45
30min
Supercharging Wikidata with External Aliases and New Entity Types | 透過外部資料庫的別名和新實體類型來強化維基數據
Yoan Chabot, Yann Almeras

(EN)
Wikidata plays a crucial role in named entity linking and relation extraction for companies and researchers, but it has limitations. Unlike DBpedia, Wikidata lacks a comprehensive taxonomy of entities, and many entities have only a partial list of aliases that would benefit from enrichment. In this talk, we will introduce a database built within Orange that supplements Wikidata with enriched entity information drawn from various external databases using heuristics. We will then show how this database can be used to highlight inconsistencies and poor-quality data in Wikidata and across Wikipedia language editions. Finally, we will share our plans to develop bots that transfer the enhanced data back into the public Wikidata instance, making the knowledge base more robust and accurate.


(ZH)
維基數據在協助研究員或公司進行實體鏈結以及關係擷取方面極為重要。然而,維基數據並非完美,相較於 DBpedia,維基數據缺乏清晰易懂的實體分類,並且部分實體因為別名眾多而有待補充。在本次的演講中,我們將介紹一個在 Orange 中建立的資料庫,該資料庫使用啟發式方法,透過來自各種外部資料庫的豐富實體資訊來補充維基數據。接著我們將會展示如何使用該資料庫來對維基數據以及各語言版本的維基百科條目進行比對,找出不一致或是低品質的內容。我們接著將會簡單說明我們計畫如何應用此技術,開發機器人來將資料庫中的優質資料導入維基數據,強化知識庫的準確性。

Main program