Optimizing selfish mining strategies through deep reinforcement learning

dc.contributor.author: Wijewardhana, W. T. R. N. D. K.
dc.contributor.author: Vidanagamachchi, S. M.
dc.contributor.author: Arachchilage, N. A. G.
dc.date.accessioned: 2024-11-29T07:35:54Z
dc.date.available: 2024-11-29T07:35:54Z
dc.date.issued: 2024
dc.description.abstract: Selfish mining is a mining attack in which miners strategically release blocks to create forks in the main branch, with the intention of acquiring a disproportionately large portion of the mining reward. Traditional strategies use a Markov Decision Process (MDP) with a non-linear objective function that requires variable blockchain parameters, which are hard to determine; model-free approaches such as multidimensional Q-learning overcome this by learning optimal policies without prior blockchain information. Despite this, existing algorithms remain largely impractical for real blockchain networks: they fail to account for realistic blockchain features, learn inefficiently in large state spaces, and converge slowly. In this work, we propose a novel model-free Deep Reinforcement Learning (DRL) algorithm for optimal selfish mining that learns dynamically without requiring prior knowledge of network parameters. The study leverages deep neural networks together with advanced exploration and experience replay mechanisms to achieve faster convergence and improved learning efficiency in the large state spaces inherent in real-world blockchain instances. The non-linearity of the objective function is addressed by incorporating two Double DQNs (DDQNs), one for the adversary and one for the honest network, which work together to optimize it effectively. The proposed model is evaluated on a Bitcoin-like Proof-of-Work blockchain simulator that accounts for real-world blockchain parameters such as stale block rates, propagation delays, and eclipse attacks. Our simulations indicate that the proposed model achieves optimal gains while improving the robustness and convergence of the algorithm in large state spaces and dynamically adjusting the mining policy as the blockchain environment evolves.
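The abstract describes the approach only at a high level; as an illustrative, non-authoritative sketch (not the authors' implementation), the Python snippet below shows the standard Double DQN target computation the abstract refers to, together with a hypothetical relative_reward helper capturing the non-linear selfish mining objective (the adversary's share of main-chain blocks). All names (QNet, ddqn_target, relative_reward), the network architecture, and the hyperparameters are assumptions, and PyTorch is assumed as the DRL framework.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Hypothetical Q-network; the paper's architecture is not given in the abstract."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def ddqn_target(online: QNet, target: QNet,
                rewards: torch.Tensor, next_states: torch.Tensor,
                dones: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Standard Double DQN bootstrap target: select the next action with the
    online network, evaluate it with the target network to reduce overestimation."""
    with torch.no_grad():
        best_actions = online(next_states).argmax(dim=1, keepdim=True)
        next_q = target(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q

def relative_reward(adversary_blocks: int, honest_blocks: int) -> float:
    """Non-linear selfish mining objective: the adversary's fraction of blocks
    accepted on the main chain, rather than its absolute block reward."""
    total = adversary_blocks + honest_blocks
    return adversary_blocks / total if total > 0 else 0.0
```

One plausible reading of the two-DDQN design described in the abstract is that one network pair tracks the adversary's accumulated blocks and the other the honest network's, so that their ratio (the non-linear objective above) can be optimized without fixing blockchain parameters in advance; the paper itself should be consulted for the actual formulation.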
dc.identifier.citation: Wijewardhana, W. T. R. N. D. K.; Vidanagamachchi, S. M.; Arachchilage, N. A. G. (2024), Optimizing selfish mining strategies through deep reinforcement learning, Proceedings of the International Conference on Applied and Pure Sciences (ICAPS 2024-Kelaniya), Volume 4, Faculty of Science, University of Kelaniya, Sri Lanka, p. 133
dc.identifier.uri: http://repository.kln.ac.lk/handle/123456789/28878
dc.publisher: Faculty of Science, University of Kelaniya, Sri Lanka
dc.subject: Blockchain, Bitcoin, Selfish mining, Deep reinforcement learning
dc.title: Optimizing selfish mining strategies through deep reinforcement learning

Files

Original bundle
Name: ICAPS 2024-Proceedings Book_20241027-49-217-pages-133.pdf
Size: 591.12 KB
Format: Adobe Portable Document Format