A bytecode-based approach for smart contract classification

Chaochen Shi, Yong Xiang, Jiangshan Yu, Longxiang Gao, Keshav Sood, Robin Ram Mohan Doss

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research; peer-review

6 Citations (Scopus)

Abstract

With the development of blockchain technologies, the number of smart contracts deployed on blockchain platforms is growing exponentially, which makes it difficult for users to find desired services by manual screening. Automatic classification of smart contracts can provide blockchain users with keyword-based contract search and help manage smart contracts effectively. Current research on smart contract classification focuses on Natural Language Processing (NLP) solutions that are based on contract source code. However, more than 94% of smart contracts are not open-source, so the application scenarios of NLP methods are very limited. NLP models are also vulnerable to adversarial attacks. To address these problems, this paper proposes a classification model based on features from contract bytecode instead of source code. We also use feature selection and ensemble learning to optimize the model. Our experimental studies on over 11K real-world Ethereum smart contracts show that our model can classify smart contracts without source code and performs better than baseline models. Our model is also more resistant to adversarial attacks than NLP-based models. In addition, our analysis reveals that account features used in many smart contract classification models have little effect on classification and can be excluded.
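
As a rough illustration of what a bytecode-derived feature can look like, the sketch below counts EVM opcodes in a contract's deployed bytecode. This is an assumption made for illustration only: the abstract does not say which bytecode features the paper uses, and the function name `opcode_histogram` is hypothetical. The only EVM fact relied on is that PUSH1 through PUSH32 (byte values 0x60 to 0x7f) are followed by 1 to 32 bytes of immediate data, which must be skipped so that data bytes are not miscounted as instructions.

```python
from collections import Counter


def opcode_histogram(bytecode_hex: str) -> Counter:
    """Count EVM opcodes by byte value, skipping PUSH immediates.

    Hypothetical helper for illustration; not the paper's feature set.
    """
    # Accept an optional "0x" prefix, as commonly returned by Ethereum nodes.
    if bytecode_hex.startswith(("0x", "0X")):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)

    counts = Counter()
    i = 0
    while i < len(code):
        op = code[i]
        counts[op] += 1
        # PUSH1 (0x60) .. PUSH32 (0x7f) carry 1..32 bytes of immediate data.
        if 0x60 <= op <= 0x7F:
            i += op - 0x5F
        i += 1
    return counts


# Usage: PUSH1 0x80, PUSH1 0x40, MSTORE
hist = opcode_histogram("0x6080604052")
```

A histogram like this could be turned into a fixed-length vector and passed to any standard classifier; whether that matches the model evaluated in the paper is not stated in the abstract.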

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2022
Editors: Zadia Codabux, Clemente Izurieta
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1046-1054
Number of pages: 9
ISBN (Electronic): 9781665437868
ISBN (Print): 9781665437875
Publication status: Published - 2022
Event: IEEE International Conference on Software Analysis, Evolution, and Reengineering 2022 - Online, Honolulu, United States of America
Duration: 15 Mar 2022 to 18 Mar 2022
Conference number: 29th
https://ieeexplore.ieee.org/xpl/conhome/9825713/proceeding (Proceedings)
https://saner2022.uom.gr/ (Website)

Publication series

Name: Proceedings - 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2022
Publisher: IEEE, Institute of Electrical and Electronics Engineers
ISSN (Electronic): 1534-5351

Conference

Conference: IEEE International Conference on Software Analysis, Evolution, and Reengineering 2022
Abbreviated title: SANER 2022
Country/Territory: United States of America
City: Honolulu
Period: 15/03/22 to 18/03/22

Keywords

  • blockchain
  • bytecode
  • Ethereum
  • smart contract classification
