Overview
Mistakes are inevitable. But when it comes to training machine learning (ML) models, which mistakes are most important to avoid? How can we quantify the risks associated with mislabeling training data and take action to mitigate them?
These are key questions to consider when training AI and ML tools to detect illicit transactions on cryptocurrency networks. For these tools to successfully distinguish illicit from non-illicit nodes, they must be trained on properly labeled datasets. However, labeling such datasets is difficult and costly, as it requires experienced analysts to investigate multiple networks and make their best judgments. Because those judgments rest on inference, labeling errors are unavoidable, and analysts and policymakers need to be prepared for them.
This NSDPI study evaluates the impact of such labeling errors on ML models’ ability to detect illicit financial activity. Our researchers drew error models from the active learning literature and examined them thoroughly in a less-explored context: financial datasets. They also introduced a new model of oracle (analyst) error based on the correlated biases they observed across the network. By quantifying how mislabeling degrades ML performance, the study clarifies how analysts should approach active learning strategies in the fight against financial crime.
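To make the effect concrete, here is a minimal sketch that injects class-conditional label noise into a synthetic stand-in for a labeled transaction dataset and measures the resulting precision and recall. The data generator, flip rates, and classifier choice are illustrative assumptions, not the study’s actual setup or error model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Class 1 = "illicit" (rare), class 0 = "non-illicit"; features are synthetic.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def flip_labels(y, p_illicit_to_licit, p_licit_to_illicit):
    """Class-conditional label noise: each class is flipped at its own rate."""
    y_noisy = y.copy()
    y_noisy[(y == 1) & (rng.random(len(y)) < p_illicit_to_licit)] = 0
    y_noisy[(y == 0) & (rng.random(len(y)) < p_licit_to_illicit)] = 1
    return y_noisy

# Clean labels first, then each direction of noise in isolation.
for p10, p01 in [(0.0, 0.0), (0.3, 0.0), (0.0, 0.05)]:
    clf = RandomForestClassifier(random_state=0)
    clf.fit(X_tr, flip_labels(y_tr, p10, p01))
    pred = clf.predict(X_te)
    print(f"illicit->licit={p10:.2f}  licit->illicit={p01:.2f}  "
          f"precision={precision_score(y_te, pred):.2f}  "
          f"recall={recall_score(y_te, pred):.2f}")
```

Flipping illicit nodes to licit removes positives from the training signal, so the model misses real positives at test time (lower recall); flipping licit nodes to illicit teaches the model to flag benign nodes (lower precision).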
Key Takeaways
- Degradation in ML performance depends on the nature of the labeling error and the metric being evaluated; for example, under some error models, mislabeling non-illicit nodes primarily reduces precision, while mislabeling illicit nodes primarily harms recall.
- During active learning, performance gains from adding more labels reach a clear ceiling, after which additional labels yield little improvement.
- With imperfect analysts, active learning performance improves only up to a point before declining, which highlights the need for accurate models of analyst error so that queries to analysts can be limited based on their reliability; a toy version of this dynamic is sketched below.
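As a rough illustration of the last two takeaways, the following sketch runs a simple uncertainty-sampling loop against a noisy oracle that flips each answer with a fixed probability. The synthetic data, the 20% error rate, and the query strategy are all assumptions made for demonstration; they are not the error models or protocol used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
ORACLE_ERROR = 0.2            # assumed per-query flip rate for the analyst

X, y = make_classification(n_samples=4000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_pool, X_te, y_pool, y_te = train_test_split(X, y, stratify=y, random_state=0)

def ask_oracle(i):
    """Return the pool label for index i, flipped with probability ORACLE_ERROR."""
    return y_pool[i] if rng.random() > ORACLE_ERROR else 1 - y_pool[i]

labeled = list(rng.choice(len(X_pool), size=20, replace=False))
given = {i: ask_oracle(i) for i in labeled}

for step in range(15):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_pool[labeled], [given[i] for i in labeled])
    print(f"labels={len(labeled):4d}  "
          f"test F1={f1_score(y_te, clf.predict(X_te)):.2f}")
    # Uncertainty sampling: query the pool points nearest the decision boundary.
    margins = np.abs(clf.decision_function(X_pool))
    margins[labeled] = np.inf          # never re-query an already-labeled point
    for i in np.argsort(margins)[:20]:
        labeled.append(i)
        given[i] = ask_oracle(i)
```

Early rounds add informative labels faster than the oracle corrupts them, so test F1 climbs; as the label set grows, each batch contributes less new information while the fixed error rate keeps injecting noise, so gains stall and can reverse. Capping the number of queries based on an estimate of the analyst’s reliability is one way to stop before that turning point.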