Unsupervised Machine Learning Based Anomaly Detection in High Frequency Data: Evidence from Cryptocurrency Market

Authors

  • Muhammad Nouman Latif
  • Muhittin Kaplan
  • Asad Ul Islam Khan

Keywords:

Unsupervised machine learning models, anomaly detection, Monte Carlo simulations, Bitcoin, Dashcoin, Ethereum, Stellar, Tron, Litecoin.

Abstract

The rapid integration of cryptocurrencies into the global financial ecosystem has introduced unprecedented challenges in market surveillance, risk management, and anomaly detection. While conventional statistical models such as ARIMA (Autoregressive Integrated Moving Average) and GARCH (Generalized Autoregressive Conditional Heteroscedasticity) have been widely used for anomaly detection, their reliance on assumptions of normality and stationarity often fails to capture the complexities of high-frequency, non-linear cryptocurrency trading. Furthermore, traditional risk metrics including down-to-up volatility, negative conditional skewness, and relative frequency may overlook short-term anomalies due to data aggregation limitations.

To address these issues, this paper proposes a machine-learning model for detecting anomalies in cryptocurrency markets using Jupyter Notebook. We compare four advanced unsupervised machine learning models, i.e., Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Isolation Forest (iForest), One-Class Support Vector Machine (OC-SVM), and Local Outlier Factor (LOF), for anomaly detection using Monte Carlo simulations. The findings indicate that DBSCAN has the highest precision (79.7%) with the fewest false positives, making it ideal for supervisory monitoring. However, the high false positive rates of OC-SVM and Isolation Forest limit their use. Using data on six well-known cryptocurrencies at three temporal resolutions (daily, hourly, and 15-minute), we also examine the performance of these four unsupervised learning techniques and confirm that the anomalies identified by DBSCAN are consistent with those of the other three methods. Additionally, for robustness, we use UpSet plots to show the anomalies shared across the three unsupervised learning methods. The number of anomalies also depends on the volatility and time interval of the cryptocurrency: more volatile assets and higher-frequency data yield more anomalies.

The study presents a sound methodological approach for facilitating financial monitoring and mitigating risks in the cryptocurrency market, and provides useful information for market players, analysts, and policymakers. These results emphasize the importance of choosing algorithms based on specific surveillance targets to promote greater stability in digital asset environments.
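The consistency check described in the abstract, in which anomalies flagged by one method are compared with those flagged by the others, can be illustrated with a minimal sketch of the counting that underlies an UpSet plot. This is not the authors' code: the function name `exclusive_intersections` and the flagged observation indices are hypothetical, chosen only to show how shared and method-specific anomalies might be tallied.

```python
def exclusive_intersections(flags):
    """Group observations by the exact set of methods that flagged them.

    `flags` maps a method name to the set of observation indices it flagged.
    The result maps each combination of methods (a frozenset of names) to the
    observations flagged by exactly that combination, which is the quantity
    an UpSet plot draws as one bar per combination.
    """
    groups = {}
    for point in set().union(*flags.values()):
        members = frozenset(name for name, hits in flags.items() if point in hits)
        groups.setdefault(members, set()).add(point)
    return groups


# Hypothetical flagged indices, invented for illustration only.
flags = {
    "DBSCAN": {3, 17, 42},
    "iForest": {3, 8, 17, 42, 56},
    "OC-SVM": {3, 5, 17, 42, 60},
    "LOF": {3, 17, 21, 42},
}

groups = exclusive_intersections(flags)

# Anomalies on which every method agrees.
shared_by_all = set.intersection(*flags.values())

# True when every DBSCAN anomaly is also flagged by the other three methods.
dbscan_confirmed = flags["DBSCAN"] <= (
    flags["iForest"] & flags["OC-SVM"] & flags["LOF"]
)

# Print combinations from most to fewest agreeing methods.
for members, points in sorted(
    groups.items(), key=lambda kv: (-len(kv[0]), sorted(kv[0]))
):
    print(sorted(members), sorted(points))
```

In this toy input, every index flagged by DBSCAN is also flagged by the other three methods, which is the kind of agreement the abstract reports. With real detector output, the same grouping would feed the bar heights of an UpSet plot.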

Published

2025-09-30

How to Cite

Unsupervised Machine Learning Based Anomaly Detection in High Frequency Data: Evidence from Cryptocurrency Market. (2025). Pakistan Journal of Commerce and Social Sciences (ISSN 1997-8553), 19(3), 407-440. https://jes.ac.pk/index.php/jes/article/view/511