|Title:||MYE: Missing year estimation in academic social networks|
|Author(s):||Chiu, Dah Ming; Fu, T. Z. J.|
In bibliometric studies, a common challenge is how to deal with incorrect or incomplete data. Given a large volume of data, however, there often exist relationships between data items that allow us to recover missing items and correct erroneous ones. In this paper, we study a particular problem of this sort: estimating the missing year information associated with publications (and hence authors' years of active publication). We first propose a simple algorithm that uses only "direct" information, such as paper citation/reference relationships or paper-author relationships. The result of this simple algorithm serves as a benchmark for comparison. Our goal is to develop algorithms that improve both coverage (the percentage of missing-year papers recovered) and accuracy (measured by the mean absolute error between the estimated year and the real year). We then propose advanced algorithms that extend inference through information propagation. For each algorithm, we propose three versions according to the type of academic social network given: a) Homogeneous (contains only paper citation links), b) Bipartite (contains only paper-author relations), and c) Heterogeneous (contains both paper citation and paper-author relations). We carry out experiments on three public data sets (MSR Libra, DBLP and APS) and evaluate the algorithms using K-fold cross-validation. We show that the advanced algorithms improve both coverage and accuracy.
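As a rough illustration of the "direct information" idea on a homogeneous (citation-only) network, the sketch below bounds a paper's missing year by the years of the papers it cites and the papers that cite it, then computes coverage and mean absolute error on held-out years. This is not the paper's algorithm: the midpoint rule, the fallback to a single bound, and the names `estimate_missing_years` and `coverage_and_mae` are all assumptions made for illustration.

```python
from statistics import mean


def estimate_missing_years(known_years, citations):
    """Estimate years for papers absent from `known_years`.

    known_years: dict mapping paper id -> publication year.
    citations:   iterable of (citing, cited) paper-id pairs.
    Assumption: a paper is no older than anything it cites and no
    newer than anything citing it; the estimate is the midpoint of
    those two bounds, or the single bound when only one is available.
    """
    lower = {}  # latest known year among the papers a paper cites
    upper = {}  # earliest known year among the papers citing a paper
    for citing, cited in citations:
        if citing not in known_years and cited in known_years:
            year = known_years[cited]
            lower[citing] = max(lower.get(citing, year), year)
        if cited not in known_years and citing in known_years:
            year = known_years[citing]
            upper[cited] = min(upper.get(cited, year), year)

    estimates = {}
    for paper in lower.keys() | upper.keys():
        lo, hi = lower.get(paper), upper.get(paper)
        if lo is not None and hi is not None:
            estimates[paper] = (lo + hi) / 2
        else:
            estimates[paper] = lo if lo is not None else hi
    return estimates


def coverage_and_mae(estimates, true_years):
    """Return (coverage, MAE) against held-out true years.

    coverage: fraction of hidden-year papers that received an estimate.
    MAE:      mean absolute error over the covered papers, or None if
              no paper was covered.
    """
    covered = [p for p in true_years if p in estimates]
    coverage = len(covered) / len(true_years)
    mae = mean(abs(estimates[p] - true_years[p]) for p in covered) if covered else None
    return coverage, mae


# Toy data (made up): three papers with known years, three with hidden years.
known = {"A": 2000, "B": 2004, "C": 2010}
hidden_truth = {"X": 2006, "Y": 1999, "Z": 2012}
links = [("X", "A"), ("X", "B"), ("C", "X"), ("A", "Y")]

est = estimate_missing_years(known, links)
cov, mae = coverage_and_mae(est, hidden_truth)
print(est, cov, mae)
```

In this toy example, paper Z has no links to a paper with a known year, so a direct-information rule cannot cover it. That gap is what propagation-based inference, as described in the abstract, is meant to reduce.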
|URI:||https://repository.cihe.edu.hk/jspui/handle/cihe/1912|
|CIHE Affiliated Publication:||No|
|Appears in Collections:||SS Publication|