Pass putative false detections through a spatial filter: process_false_detections_sf

The identification of false detections in acoustic telemetry data is an important step in processing and/or modelling these data. False detections can be identified using the short interval criterion, whereby any detection of an individual at a receiver that is not accompanied by other detections at the same receiver within a specified time window (which depends on the nominal acoustic transmission delay) is flagged. This approach can be implemented using the false_detections function in the glatos package. Flagged detections can then be examined, or passed through other filters, to assess their plausibility.
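As a rough illustration of the short interval criterion, the base R sketch below computes, for each detection, the minimum time gap to the previous or next detection of the same transmitter at the same receiver, and flags detections whose gap exceeds the threshold. The helper name flag_short_interval and the toy data are hypothetical; glatos::false_detections() remains the reference implementation and may differ in detail.

```r
# Hypothetical helper (not part of glatos): returns 1 if a detection has a
# companion detection of the same tag at the same receiver within `tf`
# seconds, and 0 otherwise (i.e., 0 marks a putative false detection).
flag_short_interval <- function(det, tf) {
  grp <- interaction(det$transmitter_id, det$receiver_sn, drop = TRUE)
  passed <- integer(nrow(det))
  for (ind in split(seq_len(nrow(det)), grp)) {
    tt <- as.numeric(det$detection_timestamp_utc[ind])
    ord <- order(tt)
    gaps <- diff(tt[ord])
    # Minimum lag to the previous or next detection (Inf if there is none)
    min_lag <- pmin(c(Inf, gaps), c(gaps, Inf))
    passed[ind[ord]] <- as.integer(min_lag <= tf)
  }
  passed
}

# Toy data: three detections close in time and one isolated detection
toy <- data.frame(
  detection_timestamp_utc = as.POSIXct("2017-06-01 00:00:00", tz = "UTC") +
    c(0, 120, 240, 50000),
  transmitter_id = "A",
  receiver_sn = c(1, 1, 1, 1)
)
toy$passed_filter <- flag_short_interval(toy, tf = 3600)
toy$passed_filter
#> [1] 1 1 1 0
```

Only the last detection, which is more than an hour from any other detection at the same receiver, is flagged.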

This function passes false detections flagged by false_detections through a spatial filter. The key idea is that isolated detections that are accompanied by detections at nearby receivers within a defined time window may, in fact, be plausible. To implement this approach, the user must supply a dataframe of detections, a temporal threshold, a spatial threshold and a dataframe of distances between receivers. The function examines whether each putative false detection is accompanied by additional detections at other receivers within a user-defined time window and Euclidean distance of the receiver at which it was recorded. If so, the detection could be explained by an individual that dips in and out of the detection ranges of receivers (e.g. in a sparse acoustic array) and may not, in fact, be false.
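The logic described above can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the function's actual implementation: the helper spatial_filter_sketch and the toy data are hypothetical, and the sketch assumes that det contains detections for a single individual.

```r
# Hypothetical sketch of the spatial filter logic (not the package's code).
# `det` needs detection_timestamp_utc, receiver_sn and passed_filter;
# `dist` needs r1, r2 and dist, with both orderings of each receiver pair.
spatial_filter_sketch <- function(det, tf, sf, dist) {
  out <- rep(NA_integer_, nrow(det))
  time <- as.numeric(det$detection_timestamp_utc)
  for (i in which(det$passed_filter == 0)) {
    # Other receivers within `sf` of the receiver that recorded detection i
    near <- dist$r2[dist$r1 == det$receiver_sn[i] & dist$dist <= sf]
    near <- setdiff(near, det$receiver_sn[i])
    # Detections at those receivers within `tf` seconds of detection i
    ok <- det$receiver_sn %in% near & abs(time - time[i]) <= tf
    out[i] <- as.integer(any(ok))
  }
  out
}

# Toy data: rows 3-5 were flagged as putative false detections
t0 <- as.POSIXct("2017-06-01 00:00:00", tz = "UTC")
det_toy <- data.frame(
  detection_timestamp_utc = t0 + c(0, 60, 216000, 216000, 432000),
  receiver_sn = c(47, 47, 47, 52, 47),
  passed_filter = c(1, 1, 0, 0, 0)
)
dist_toy <- data.frame(r1 = c(47, 52), r2 = c(52, 47), dist = c(0.47, 0.47))
res <- spatial_filter_sketch(det_toy, tf = 3600, sf = 0.5, dist = dist_toy)
res
#> [1] NA NA  1  1  0
```

Rows 3 and 4 support each other because they occur at the same time at receivers 0.47 distance units apart, whereas row 5 has no companion detection at a nearby receiver within the time window.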

process_false_detections_sf(det, tf, sf, dist_btw_receivers)

Arguments

det: A dataframe containing detection time series. Following false_detections, this should contain the following columns: `detection_timestamp_utc', `transmitter_codespace', `transmitter_id' and `receiver_sn', as well as `passed_filter' (see false_detections).
tf: A number that defines the time threshold (s) used to flag false detections (see false_detections).
sf: A number that defines the threshold Euclidean distance between receivers. If a false detection is accompanied only by detections at receivers separated by more than this distance, it is likely to be a true false detection, because the individual could not have moved between those receivers within the specified time interval. sf should be defined in the same units as the distances provided in dist_btw_receivers (see below).
dist_btw_receivers: A dataframe that defines the distances between receiver pairs. This should contain the columns: `r1', `r2' and `dist', whereby `r1' and `r2' contain the unique receiver serial number for all combinations of receivers and `dist' contains the distance between them. This dataframe should include duplicate combinations (e.g., both r1 = 1 and r2 = 2 and r1 = 2 and r2 = 1). This can be created with dist_btw_receivers.
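To make the expected structure of dist_btw_receivers concrete, the sketch below builds such a dataframe by hand from toy planar coordinates. The receiver positions and the object names rx and pairs are hypothetical, and the distances are plain planar Euclidean distances; in practice, the dataframe would normally be created with dist_btw_receivers.

```r
# Toy receiver positions on a planar grid (hypothetical values, in km)
rx <- data.frame(
  receiver_sn = c(1, 2, 3),
  x = c(0, 0.3, 2),
  y = c(0, 0.4, 0)
)

# All ordered combinations of receivers, so that both (r1 = 1, r2 = 2) and
# (r1 = 2, r2 = 1) are present; self-pairs are included here with distance 0
pairs <- expand.grid(r1 = rx$receiver_sn, r2 = rx$receiver_sn)
i1 <- match(pairs$r1, rx$receiver_sn)
i2 <- match(pairs$r2, rx$receiver_sn)
pairs$dist <- sqrt((rx$x[i1] - rx$x[i2])^2 + (rx$y[i1] - rx$y[i2])^2)
pairs
```

With these coordinates, receivers 1 and 2 are 0.5 km apart, so sf must also be given in km for the comparison to be meaningful.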

Value

The function returns a vector of the same length as det with three possible values: NA, which identifies detections that have not been flagged as false detections (i.e., passed_filter = 1, see false_detections) and are therefore not passed through the spatial filter; 1, which identifies detections that `passed' the spatial filter (i.e., false detections that are accompanied by detections at nearby receivers within the defined spatial and temporal thresholds); or 0, which identifies detections that `failed' the spatial filter (i.e., false detections that are not accompanied by detections at nearby receivers within the defined spatial and temporal thresholds). This vector has one attribute, `details', a dataframe with the same number of rows as det and the following columns: `passed_filter_sf', `n_wn_sf', `detection_timestamp_utc_sf', `receiver_sn_sf' and `dist_sf'. `passed_filter_sf' takes a value of NA, 0 or 1 depending on whether or not the detection was flagged as a false detection (if not, NA) and whether each false detection passed the spatial filter (no, 0; yes, 1). If the detection did pass the spatial filter, `n_wn_sf' provides the number of detections at nearby receivers within tf and sf; and `detection_timestamp_utc_sf', `receiver_sn_sf' and `dist_sf' define the timestamp of the detection at the nearest receiver, the receiver at which that detection was made and the distance between the two receivers, respectively.
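One way in which the returned vector might be used is to drop only those detections that failed the spatial filter, keeping both unflagged detections (NA) and flagged detections that passed (1). The sketch below uses a hypothetical toy vector in place of real function output; whether to retain detections that passed the spatial filter is a judgement for the analyst.

```r
# Toy return values (hypothetical): NA = not flagged, 1 = passed, 0 = failed
passed_filter_sf <- c(NA, NA, 1, 1, 0)

# Keep unflagged detections and flagged detections that passed the filter
keep <- is.na(passed_filter_sf) | passed_filter_sf == 1
keep
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE

# Summarise the three categories, including NA
table(passed_filter_sf, useNA = "ifany")
```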

Details

There are limitations with the application of this spatial filter to false detections. First, the spatial threshold beyond which flagged detections are likely to be genuinely false is currently based on Euclidean distances. These may be misleading (e.g. when receivers hug complex coastlines and the shortest in-water path between receivers is longer than the Euclidean distance). Second, for small arrays, fast-swimming organisms and/or a large nominal transmission delay (i.e., time threshold), the spatial filter is a poor filter because individuals can access the whole area within the time interval.
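The second limitation can be checked with simple arithmetic before applying the filter: if the distance an individual could cover within tf exceeds the extent of the array, a mobility-based sf would include every receiver and the filter could not discriminate. The values below (swimming speed, time threshold and array extent) are hypothetical and serve only to illustrate the calculation.

```r
# Hypothetical values for illustration only
speed <- 0.5            # sustained swimming speed (m/s)
tf <- 3600              # time threshold (s)
array_extent_km <- 1.5  # largest distance between any two receivers (km)

# Maximum straight-line distance the individual could cover within tf (km)
max_dist_km <- speed * tf / 1000
max_dist_km
#> [1] 1.8

# If TRUE, every receiver is reachable within tf, so the spatial filter
# is uninformative for this combination of speed, tf and array size
uninformative <- max_dist_km >= array_extent_km
uninformative
#> [1] TRUE
```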

Author

Edward Lavender

Examples

#### Define necessary columns to compute false detections using glatos::false_detections()
det <- dat_acoustics[dat_acoustics$individual_id == 25, ]
stopifnot(!is.unsorted(det$timestamp))
det$detection_timestamp_utc <- det$timestamp
det$transmitter_codespace   <- substr(det$transmitter_id, 1, 8)
det$transmitter_id          <- substr(det$transmitter_id, 10, 13)
det$receiver_sn             <- det$receiver_id
det <- det[, c(
  "detection_timestamp_utc",
  "transmitter_codespace",
  "transmitter_id",
  "receiver_sn"
)]

#### Clean false detections from 'raw' data
det <- glatos::false_detections(det, tf = 3600)
#> The filter identified 11 (0.08%) of 13008 detections as potentially false.
det <- det[det$passed_filter == 1, ]
det$passed_filter <- NULL
det$min_lag <- NULL

#### Add 'new' false detections for demonstration purposes
# Add three rows to `det` which, below, we'll make 'false detections'
det <- rbind(det, det[rep(nrow(det), 3), ])
pos_false <- (nrow(det) - 2):nrow(det)
# Add an isolated detection accompanied by a detection at a nearby receiver
det$detection_timestamp_utc[pos_false[1:2]] <-
  det$detection_timestamp_utc[pos_false[1:2]] + 60 * 60 * 60
det$receiver_sn[pos_false[2]] <- 52
# Add an isolated detection not accompanied by a detection at a nearby receiver
det$detection_timestamp_utc[pos_false[3]] <-
  det$detection_timestamp_utc[pos_false[3]] + 60 * 60 * 60 * 2
det <- glatos::false_detections(det, tf = 3600)
#> The filter identified 3 (0.02%) of 13000 detections as potentially false.
stopifnot(length(which(det$passed_filter == 0)) == 3L)
tail(det$passed_filter)
#> [1] 1 1 1 0 0 0

#### Implement spatial filter
# Calculate distances between receivers
# * For simplicity, here, we ignore differences in deployment timing
dist_btw_receivers_km <-
  dist_btw_receivers(dat_moorings[, c("receiver_id", "receiver_long", "receiver_lat")])
# Note the function returns a vector, unlike glatos::false_detections():
det$passed_filter_sf <-
  process_false_detections_sf(det,
                              tf = 3600,
                              sf = 0.5,
                              dist_btw_receivers = dist_btw_receivers_km)
#> The spatial filter retained 1 detections, out of 3 previously identified false detections  (33.33 %) as 'true' false detections.
# Only the last observation failed the spatial filter, as expected:
tail(det$passed_filter_sf)
#> [1] NA NA NA  1  1  0
stopifnot(which(det$passed_filter_sf == 0) == nrow(det))
# Additional information is available from the 'details' attribute (a dataframe):
tail(attr(det$passed_filter_sf, "details"))
#>       passed_filter_sf n_wn_sf receiver_sn_sf detection_timestamp_utc_sf
#> 12995               NA      NA             NA                       <NA>
#> 12996               NA      NA             NA                       <NA>
#> 12997               NA      NA             NA                       <NA>
#> 12998                1       1             52        2017-06-03 19:58:00
#> 12999                1       1             47        2017-06-03 19:58:00
#> 13000                0      NA             NA                       <NA>
#>         dist_sf
#> 12995        NA
#> 12996        NA
#> 12997        NA
#> 12998 0.4700686
#> 12999 0.4700686
#> 13000        NA