Dealing with false-positive and false-negative errors about species occurrence at multiple levels

Gurutzeta Guillera-Arroita, José Joaquín Lahoz-Monfort, Anthony R. van Rooyen, Andrew R. Weeks, Reid Tingley

Research output: Contribution to journal › Article › Research › peer-review

24 Citations (Scopus)

Abstract

Accurate knowledge of species occurrence is fundamental to a wide variety of ecological, evolutionary and conservation applications. Assessing the presence or absence of species at sites is often complicated by imperfect detection, with different mechanisms potentially contributing to false-negative and/or false-positive errors at different sampling stages. Ambiguities in the data mean that estimation of relevant parameters might be confounded unless additional information is available to resolve those uncertainties. Here, we consider the analysis of species detection data with false-positive and false-negative errors at multiple levels. We develop and examine a two-stage occupancy-detection model for this purpose. We use profile likelihoods for identifiability analysis and estimation, and study the types of additional data required for reliable estimation. We test the model with simulated data, and then analyse data from environmental DNA (eDNA) surveys of four Australian frog species. In our case study, we consider that false positives may arise due to contamination at the water sample and quantitative PCR-sample levels, whereas false negatives may arise due to eDNA not being captured in a field sample, or due to the sensitivity of laboratory tests. We augment our eDNA survey data with data from aural surveys and laboratory calibration experiments. We demonstrate that the two-stage model with false-positive and false-negative errors is not identifiable if only survey data prone to false positives are available. At least two sources of extra information are required for reliable estimation (e.g. records from a survey method with unambiguous detections, and a calibration experiment). Alternatively, identifiability can be achieved by setting plausible bounds on false detection rates as prior information in a Bayesian setting. 
The results of our case study matched our simulations with respect to data requirements, and revealed false-positive rates greater than zero for all species. We provide statistical modelling tools to account for uncertainties in species occurrence survey data when false negatives and false positives could occur at multiple sampling stages. Such data are often needed to support management and policy decisions. Dealing with these uncertainties is relevant for traditional survey methods, but also for promising new techniques, such as eDNA sampling.
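The two-stage structure described above can be made concrete with a small likelihood sketch. This is only an illustration under stated assumptions, not the authors' implementation: the symbol names (psi, theta11, theta10, p11, p10), the binomial treatment of qPCR replicates, and the independence assumptions are not given in the abstract and are introduced here for illustration.

```python
import math
import random


def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success prob p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def site_likelihood(y, K, psi, theta11, theta10, p11, p10):
    """Likelihood of one site's data under an assumed two-stage model.

    y       : list with the number of positive qPCR replicates per water sample
    K       : number of qPCR replicates run on each water sample
    psi     : probability that the site is occupied
    theta11 : P(sample contains eDNA | site occupied)       (1 - stage-1 false negative)
    theta10 : P(sample contains eDNA | site unoccupied)     (stage-1 false positive)
    p11     : P(replicate positive | sample contains eDNA)  (1 - stage-2 false negative)
    p10     : P(replicate positive | sample has no eDNA)    (stage-2 false positive)
    """

    def samples_given_site(theta):
        # Each water sample is a two-component mixture: eDNA present or absent.
        prod = 1.0
        for k in y:
            prod *= theta * binom_pmf(k, K, p11) + (1 - theta) * binom_pmf(k, K, p10)
        return prod

    # Mix over the unknown occupancy state of the site.
    return psi * samples_given_site(theta11) + (1 - psi) * samples_given_site(theta10)


def simulate_site(M, K, psi, theta11, theta10, p11, p10, rng):
    """Simulate positive-replicate counts for M water samples at one site."""
    occupied = rng.random() < psi
    theta = theta11 if occupied else theta10
    counts = []
    for _ in range(M):
        has_dna = rng.random() < theta
        p = p11 if has_dna else p10
        counts.append(sum(rng.random() < p for _ in range(K)))
    return counts


if __name__ == "__main__":
    rng = random.Random(1)
    y = simulate_site(M=2, K=3, psi=0.6, theta11=0.8, theta10=0.1,
                      p11=0.9, p10=0.05, rng=rng)
    a = site_likelihood(y, 3, psi=0.6, theta11=0.8, theta10=0.1, p11=0.9, p10=0.05)
    # Relabelled parameters: occupied and unoccupied states swapped.
    b = site_likelihood(y, 3, psi=0.4, theta11=0.1, theta10=0.8, p11=0.9, p10=0.05)
    print(y, a, b, math.isclose(a, b))
```

In this sketch, swapping the roles of the occupied and unoccupied states (psi with 1 - psi, theta11 with theta10) leaves the likelihood unchanged, so ambiguous survey data alone cannot tell the two parameter sets apart. This is only the simplest kind of ambiguity; the identifiability analysis in the paper relies on profile likelihoods and covers the extra data sources needed to resolve it.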

Original language: English
Pages (from-to): 1081-1091
Number of pages: 11
Journal: Methods in Ecology and Evolution
Volume: 8
Issue number: 9
DOI: 10.1111/2041-210X.12743
Publication status: Published - 1 Sep 2017
Externally published: Yes

Keywords

  • detectability
  • environmental DNA
  • identifiability
  • imperfect detection
  • monitoring
  • multiple stages
  • sensitivity
  • species occupancy
  • specificity

Cite this

Guillera-Arroita, Gurutzeta ; Lahoz-Monfort, José Joaquín ; van Rooyen, Anthony R. ; Weeks, Andrew R. ; Tingley, Reid. / Dealing with false-positive and false-negative errors about species occurrence at multiple levels. In: Methods in Ecology and Evolution. 2017 ; Vol. 8, No. 9. pp. 1081-1091.
@article{406862ddfdf34b58890e20949338e35b,
title = "Dealing with false-positive and false-negative errors about species occurrence at multiple levels",
keywords = "detectability, environmental DNA, identifiability, imperfect detection, monitoring, multiple stages, sensitivity, species occupancy, specificity",
author = "Gurutzeta Guillera-Arroita and Lahoz-Monfort, {Jos{\'e} Joaqu{\'i}n} and {van Rooyen}, {Anthony R.} and Weeks, {Andrew R.} and Reid Tingley",
year = "2017",
month = "9",
day = "1",
doi = "10.1111/2041-210X.12743",
language = "English",
volume = "8",
pages = "1081--1091",
journal = "Methods in Ecology and Evolution",
issn = "2041-210X",
publisher = "Wiley-Blackwell",
number = "9",

}


TY - JOUR

T1 - Dealing with false-positive and false-negative errors about species occurrence at multiple levels

AU - Guillera-Arroita, Gurutzeta

AU - Lahoz-Monfort, José Joaquín

AU - van Rooyen, Anthony R.

AU - Weeks, Andrew R.

AU - Tingley, Reid

PY - 2017/9/1

Y1 - 2017/9/1

KW - detectability

KW - environmental DNA

KW - identifiability

KW - imperfect detection

KW - monitoring

KW - multiple stages

KW - sensitivity

KW - species occupancy

KW - specificity

UR - http://www.scopus.com/inward/record.url?scp=85016636861&partnerID=8YFLogxK

U2 - 10.1111/2041-210X.12743

DO - 10.1111/2041-210X.12743

M3 - Article

AN - SCOPUS:85016636861

VL - 8

SP - 1081

EP - 1091

JO - Methods in Ecology and Evolution

JF - Methods in Ecology and Evolution

SN - 2041-210X

IS - 9

ER -