Criminal motivation on the dark web

A categorisation model for law enforcement

Janis Dalins, Campbell Wilson, Mark Carman

    Research output: Contribution to journalArticleResearchpeer-review

    Abstract

    Research into the nature and structure of ‘Dark Webs’ such as Tor has largely focused upon manually labelling a series of crawled sites against a series of categories, sometimes using these labels as a training corpus for subsequent automated crawls. Such an approach is adequate for establishing broad taxonomies, but is of limited value for specialised tasks within the field of law enforcement. Contrastingly, existing research into illicit behaviour online has tended to focus upon particular crime types such as terrorism. A gap exists between taxonomies capable of holistic representation and those capable of detailing criminal behaviour. The absence of such a taxonomy limits interoperability between agencies, curtailing development of standardised classification tools. We introduce the Tor-use Motivation Model (TMM), a two-dimensional classification methodology specifically designed for use within a law enforcement context. The TMM achieves greater levels of granularity by explicitly distinguishing site content from motivation, providing a richer labelling schema without introducing inefficient complexity or reliance upon overly broad categories of relevance. We demonstrate this flexibility and robustness through direct examples, showing the TMM's ability to distinguish a range of unethical and illegal behaviour without bloating the model with unnecessary detail. The authors of this paper received permission from the Australian government to conduct an unrestricted crawl of Tor for research purposes, including the gathering and analysis of illegal materials such as child pornography. The crawl gathered 232,792 pages from 7651 Tor virtual domains, resulting in the collation of a wide spectrum of materials, from illicit to downright banal. Existing conceptual models and their labelling schemas were tested against a small sample of gathered data, and were observed to be either overly prescriptive or vague for law enforcement purposes - particularly when used for prioritising sites of interest for further investigation. In this paper we deploy the TMM by manually labelling a corpus of over 4000 unique Tor pages. We found a network impacted (but not dominated) by illicit commerce and money laundering, but almost completely devoid of violence and extremism. In short, criminality on this ‘dark web’ is based more upon greed and desire, rather than any particular political motivations.

    Original languageEnglish
    Pages (from-to)62-71
    Number of pages10
    JournalDigital Investigation
    Volume24
    DOIs
    Publication statusPublished - 1 Mar 2018

    Keywords

    • Child pornography
    • Computer forensics
    • Conceptual models
    • Dark web
    • Focused crawls
    • Machine learning
    • Tor motivation model

    Cite this

    @article{a90ae1ed16e844e69f86dde6a42cf14b,
    title = "Criminal motivation on the dark web: A categorisation model for law enforcement",
    abstract = "Research into the nature and structure of ‘Dark Webs’ such as Tor has largely focused upon manually labelling a series of crawled sites against a series of categories, sometimes using these labels as a training corpus for subsequent automated crawls. Such an approach is adequate for establishing broad taxonomies, but is of limited value for specialised tasks within the field of law enforcement. Contrastingly, existing research into illicit behaviour online has tended to focus upon particular crime types such as terrorism. A gap exists between taxonomies capable of holistic representation and those capable of detailing criminal behaviour. The absence of such a taxonomy limits interoperability between agencies, curtailing development of standardised classification tools. We introduce the Tor-use Motivation Model (TMM), a two-dimensional classification methodology specifically designed for use within a law enforcement context. The TMM achieves greater levels of granularity by explicitly distinguishing site content from motivation, providing a richer labelling schema without introducing inefficient complexity or reliance upon overly broad categories of relevance. We demonstrate this flexibility and robustness through direct examples, showing the TMM's ability to distinguish a range of unethical and illegal behaviour without bloating the model with unnecessary detail. The authors of this paper received permission from the Australian government to conduct an unrestricted crawl of Tor for research purposes, including the gathering and analysis of illegal materials such as child pornography. The crawl gathered 232,792 pages from 7651 Tor virtual domains, resulting in the collation of a wide spectrum of materials, from illicit to downright banal. Existing conceptual models and their labelling schemas were tested against a small sample of gathered data, and were observed to be either overly prescriptive or vague for law enforcement purposes - particularly when used for prioritising sites of interest for further investigation. In this paper we deploy the TMM by manually labelling a corpus of over 4000 unique Tor pages. We found a network impacted (but not dominated) by illicit commerce and money laundering, but almost completely devoid of violence and extremism. In short, criminality on this ‘dark web’ is based more upon greed and desire, rather than any particular political motivations.",
    keywords = "Child pornography, Computer forensics, Conceptual models, Dark web, Focused crawls, Machine learning, Tor motivation model",
    author = "Janis Dalins and Campbell Wilson and Mark Carman",
    year = "2018",
    month = "3",
    day = "1",
    doi = "10.1016/j.diin.2017.12.003",
    language = "English",
    volume = "24",
    pages = "62--71",
    journal = "Digital Investigation",
    issn = "1742-2876",
    publisher = "Elsevier",

    }

    Criminal motivation on the dark web : A categorisation model for law enforcement. / Dalins, Janis; Wilson, Campbell; Carman, Mark.

    In: Digital Investigation, Vol. 24, 01.03.2018, p. 62-71.

    Research output: Contribution to journalArticleResearchpeer-review

    TY - JOUR

    T1 - Criminal motivation on the dark web

    T2 - A categorisation model for law enforcement

    AU - Dalins, Janis

    AU - Wilson, Campbell

    AU - Carman, Mark

    PY - 2018/3/1

    Y1 - 2018/3/1

    N2 - Research into the nature and structure of ‘Dark Webs’ such as Tor has largely focused upon manually labelling a series of crawled sites against a series of categories, sometimes using these labels as a training corpus for subsequent automated crawls. Such an approach is adequate for establishing broad taxonomies, but is of limited value for specialised tasks within the field of law enforcement. Contrastingly, existing research into illicit behaviour online has tended to focus upon particular crime types such as terrorism. A gap exists between taxonomies capable of holistic representation and those capable of detailing criminal behaviour. The absence of such a taxonomy limits interoperability between agencies, curtailing development of standardised classification tools. We introduce the Tor-use Motivation Model (TMM), a two-dimensional classification methodology specifically designed for use within a law enforcement context. The TMM achieves greater levels of granularity by explicitly distinguishing site content from motivation, providing a richer labelling schema without introducing inefficient complexity or reliance upon overly broad categories of relevance. We demonstrate this flexibility and robustness through direct examples, showing the TMM's ability to distinguish a range of unethical and illegal behaviour without bloating the model with unnecessary detail. The authors of this paper received permission from the Australian government to conduct an unrestricted crawl of Tor for research purposes, including the gathering and analysis of illegal materials such as child pornography. The crawl gathered 232,792 pages from 7651 Tor virtual domains, resulting in the collation of a wide spectrum of materials, from illicit to downright banal. Existing conceptual models and their labelling schemas were tested against a small sample of gathered data, and were observed to be either overly prescriptive or vague for law enforcement purposes - particularly when used for prioritising sites of interest for further investigation. In this paper we deploy the TMM by manually labelling a corpus of over 4000 unique Tor pages. We found a network impacted (but not dominated) by illicit commerce and money laundering, but almost completely devoid of violence and extremism. In short, criminality on this ‘dark web’ is based more upon greed and desire, rather than any particular political motivations.

    AB - Research into the nature and structure of ‘Dark Webs’ such as Tor has largely focused upon manually labelling a series of crawled sites against a series of categories, sometimes using these labels as a training corpus for subsequent automated crawls. Such an approach is adequate for establishing broad taxonomies, but is of limited value for specialised tasks within the field of law enforcement. Contrastingly, existing research into illicit behaviour online has tended to focus upon particular crime types such as terrorism. A gap exists between taxonomies capable of holistic representation and those capable of detailing criminal behaviour. The absence of such a taxonomy limits interoperability between agencies, curtailing development of standardised classification tools. We introduce the Tor-use Motivation Model (TMM), a two-dimensional classification methodology specifically designed for use within a law enforcement context. The TMM achieves greater levels of granularity by explicitly distinguishing site content from motivation, providing a richer labelling schema without introducing inefficient complexity or reliance upon overly broad categories of relevance. We demonstrate this flexibility and robustness through direct examples, showing the TMM's ability to distinguish a range of unethical and illegal behaviour without bloating the model with unnecessary detail. The authors of this paper received permission from the Australian government to conduct an unrestricted crawl of Tor for research purposes, including the gathering and analysis of illegal materials such as child pornography. The crawl gathered 232,792 pages from 7651 Tor virtual domains, resulting in the collation of a wide spectrum of materials, from illicit to downright banal. Existing conceptual models and their labelling schemas were tested against a small sample of gathered data, and were observed to be either overly prescriptive or vague for law enforcement purposes - particularly when used for prioritising sites of interest for further investigation. In this paper we deploy the TMM by manually labelling a corpus of over 4000 unique Tor pages. We found a network impacted (but not dominated) by illicit commerce and money laundering, but almost completely devoid of violence and extremism. In short, criminality on this ‘dark web’ is based more upon greed and desire, rather than any particular political motivations.

    KW - Child pornography

    KW - Computer forensics

    KW - Conceptual models

    KW - Dark web

    KW - Focused crawls

    KW - Machine learning

    KW - Tor motivation model

    UR - http://www.scopus.com/inward/record.url?scp=85041635143&partnerID=8YFLogxK

    U2 - 10.1016/j.diin.2017.12.003

    DO - 10.1016/j.diin.2017.12.003

    M3 - Article

    VL - 24

    SP - 62

    EP - 71

    JO - Digital Investigation

    JF - Digital Investigation

    SN - 1742-2876

    ER -