Identifying Malicious Users in the Offshore Leaks Networks via Structural Node Representation Learning

Document Type

Conference Proceeding

Publication Date



Starting in 2013, the International Consortium of Investigative Journalists released a series of networks, known as the Offshore Leaks Networks, detailing the entities and transactions associated with offshore accounts. By cross-referencing these networks with known blacklists of entities, illicit individuals and transactions could be identified. In machine learning research, the Offshore Leaks Networks draw on large databases and are used to classify many nodes in a high-dimensional space. The chief problem with node classification is that the illicit entities are not always known, and techniques such as centrality-based and structure-based learning have been devised to tackle this problem. In this paper, SparseStruct, the algorithm developed by Serra et al., is shown to achieve the best results because it uses a structural node representation learning technique able to identify specific structural patterns in the graph. This technique achieved AUROC scores between 0.61 and 0.81, with three of the four scores being the highest among all classifiers compared.
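For readers unfamiliar with the AUROC metric reported in the abstract, the following is a minimal sketch of how such a score can be computed for a binary node classifier. It is not the evaluation code used in the paper: the function name `auroc`, the toy labels, and the toy scores are all hypothetical, and the pairwise (Mann-Whitney) formulation is chosen here only for clarity.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: iterable of 0/1 ground-truth values (1 = illicit node).
    scores: iterable of classifier scores, higher = more likely illicit.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    # Fraction of (positive, negative) pairs ranked correctly.
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the Offshore Leaks Networks.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
toy_auroc = auroc(toy_labels, toy_scores)
print(round(toy_auroc, 3))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of illicit from non-illicit nodes, which gives context for the 0.61 to 0.81 range reported above. The quadratic pairwise loop is fine for a toy example; a rank-based computation would be preferable on large graphs.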