A multiple test correction for streams and cascades of statistical hypothesis tests

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    8 Citations (Scopus)

    Abstract

    Statistical hypothesis testing is a popular and powerful tool for inferring knowledge from data. For every such test performed, there is always a non-zero probability of making a false discovery, i.e. rejecting a null hypothesis in error. Familywise error rate (FWER) is the probability of making at least one false discovery during an inference process. The expected FWER grows exponentially with the number of hypothesis tests that are performed, almost guaranteeing that an error will be committed if the number of tests is big enough and the risk is not managed; a problem known as the multiple testing problem. State-of-the-art methods for controlling FWER in multiple comparison settings require that the set of hypotheses be predetermined. This greatly hinders statistical testing for many modern applications of statistical inference, such as model selection, because neither the set of hypotheses that will be tested, nor even the number of hypotheses, can be known in advance. This paper introduces Subfamilywise Multiple Testing, a multiple-testing correction that can be used in applications for which there are repeated pools of null hypotheses from each of which a single null hypothesis is to be rejected and neither the specific hypotheses nor their number are known until the final rejection decision is completed. To demonstrate the importance and relevance of this work to current machine learning problems, we further refine the theory to the problem of model selection and show how to use Subfamilywise Multiple Testing for learning graphical models. We assess its ability to discover graphical models on more than 7,000 datasets, studying the ability of Subfamilywise Multiple Testing to outperform the state of the art on data with varying size and dimensionality, as well as with varying density and power of the present correlations. Subfamilywise Multiple Testing provides a significant improvement in statistical efficiency, often requiring only half as much data to discover the same model, while strictly controlling FWER.
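
The multiple testing problem described in the abstract can be illustrated numerically. The sketch below is not the Subfamilywise Multiple Testing procedure from the paper; it only shows, under the simplifying assumption that all tests are independent and every null hypothesis is true, how quickly the uncorrected FWER approaches 1, and how the classic Bonferroni correction (which needs the number of tests in advance) keeps it at or below the target level. The function names are hypothetical and chosen for illustration.

```python
def fwer_uncorrected(alpha: float, m: int) -> float:
    """Probability of at least one false discovery across m independent
    tests of true null hypotheses, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-test significance level under the Bonferroni correction.
    Requires m, the number of tests, to be known beforehand."""
    return alpha / m


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 10, 100, 1000):
        raw = fwer_uncorrected(alpha, m)
        corrected = fwer_uncorrected(bonferroni_level(alpha, m), m)
        print(f"m={m:5d}  uncorrected FWER={raw:.4f}  Bonferroni FWER={corrected:.4f}")
```

The dependence of `bonferroni_level` on `m` is the limitation the abstract points to: when the number of hypotheses is not known until the process ends, this kind of correction cannot be applied directly.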
    Original language: English
    Title of host publication: KDD'16 / KDD 2016
    Subtitle of host publication: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 13-17, 2016, San Francisco, CA, USA
    Editors: Alex Smola, Charu Aggarwal
    Place of Publication: New York, NY, USA
    Publisher: Association for Computing Machinery (ACM)
    Pages: 1255-1264
    Number of pages: 10
    ISBN (Print): 9781450342322
    DOIs: https://doi.org/10.1145/2939672.2939775
    Publication status: Published - 13 Aug 2016
    Event: ACM International Conference on Knowledge Discovery and Data Mining 2016 - Hilton San Francisco Union Square, San Francisco, United States of America
    Duration: 13 Aug 2016 to 17 Aug 2016
    Conference number: 22nd
    http://www.kdd.org/kdd2016/

    Conference

    Conference: ACM International Conference on Knowledge Discovery and Data Mining 2016
    Abbreviated title: SIGKDD 2016
    Country: United States of America
    City: San Francisco
    Period: 13/08/16 to 17/08/16
    Other: KDD 2016, a premier interdisciplinary conference, brings together researchers and practitioners from data science, data mining, knowledge discovery, large-scale data analytics, and big data.
    Internet address: http://www.kdd.org/kdd2016/

    Keywords

    • Hypothesis testing
    • Multiple testing
    • Model selection

    Cite this

    Webb, G. I., & Petitjean, F. (2016). A multiple test correction for streams and cascades of statistical hypothesis tests. In A. Smola, & C. Aggarwal (Eds.), KDD'16 / KDD 2016: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 13-17, 2016, San Francisco, CA, USA (pp. 1255-1264). New York, NY, USA: Association for Computing Machinery (ACM). https://doi.org/10.1145/2939672.2939775
    Webb, Geoff I. ; Petitjean, François. / A multiple test correction for streams and cascades of statistical hypothesis tests. KDD'16 / KDD 2016: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 13-17, 2016, San Francisco, CA, USA. editor / Alex Smola ; Charu Aggarwal. New York, NY, USA : Association for Computing Machinery (ACM), 2016. pp. 1255-1264
    @inproceedings{9d18fe653c18453ca02153b51d7fe88e,
    title = "A multiple test correction for streams and cascades of statistical hypothesis tests",
    keywords = "Hypothesis testing, Multiple testing, Model selection",
    author = "Webb, {Geoff I.} and Fran{\c{c}}ois Petitjean",
    year = "2016",
    month = "8",
    day = "13",
    doi = "10.1145/2939672.2939775",
    language = "English",
    isbn = "9781450342322",
    pages = "1255--1264",
    editor = "Alex Smola and Charu Aggarwal",
    booktitle = "KDD'16 / KDD 2016",
    publisher = "Association for Computing Machinery (ACM)",
    address = "United States of America",

    }

    Webb, GI & Petitjean, F 2016, A multiple test correction for streams and cascades of statistical hypothesis tests. in A Smola & C Aggarwal (eds), KDD'16 / KDD 2016: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 13-17, 2016, San Francisco, CA, USA. Association for Computing Machinery (ACM), New York, NY, USA, pp. 1255-1264, ACM International Conference on Knowledge Discovery and Data Mining 2016, San Francisco, United States of America, 13/08/16. https://doi.org/10.1145/2939672.2939775

    TY  - GEN
    T1  - A multiple test correction for streams and cascades of statistical hypothesis tests
    AU  - Webb, Geoff I.
    AU  - Petitjean, François
    PY  - 2016/8/13
    Y1  - 2016/8/13
    KW  - Hypothesis testing
    KW  - Multiple testing
    KW  - Model selection
    UR  - http://www.scopus.com/inward/record.url?scp=84984992257&partnerID=8YFLogxK
    U2  - 10.1145/2939672.2939775
    DO  - 10.1145/2939672.2939775
    M3  - Conference Paper
    AN  - SCOPUS:84984992257
    SN  - 9781450342322
    SP  - 1255
    EP  - 1264
    BT  - KDD'16 / KDD 2016
    A2  - Smola, Alex
    A2  - Aggarwal, Charu
    PB  - Association for Computing Machinery (ACM)
    CY  - New York, NY, USA
    ER  -

    Webb GI, Petitjean F. A multiple test correction for streams and cascades of statistical hypothesis tests. In Smola A, Aggarwal C, editors, KDD'16 / KDD 2016: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 13-17, 2016, San Francisco, CA, USA. New York, NY, USA: Association for Computing Machinery (ACM). 2016. p. 1255-1264 https://doi.org/10.1145/2939672.2939775