Removes rows where coordinates fall outside valid geographic ranges
([-90, 90] for latitude, [-180, 180] for longitude).
Accepts either separate latitude and longitude columns, or a single combined
coordinate column. Supports decimal degree, DMS (DDdeg MM'SS''), and
base-60 (DDdeg MM') coordinate formats.
Usage
latlong_filter(
data,
latitude = NULL,
longitude = NULL,
combined_col = NULL,
sep = ",",
drop_na = FALSE
)
Arguments
- data
A data frame containing coordinate columns.
- latitude
Optional. Column name of the latitude column, supplied either unquoted (lat) or quoted ("lat"). Required if combined_col is not provided.
- longitude
Optional. Column name of the longitude column, supplied either unquoted (lon) or quoted ("lon"). Required if combined_col is not provided.
- combined_col
Optional. Column name of a combined coordinate column containing latitude and longitude as a single delimited string (e.g. "51.5,-0.1"), supplied either unquoted (coords) or quoted ("coords"). Required if latitude and longitude are not provided.
- sep
Character. Separator used to split combined_col into latitude and longitude parts. Default is ",".
- drop_na
Logical. If TRUE, rows with NA in either coordinate are dropped in addition to out-of-range rows. Default is FALSE.
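As a rough illustration of what sep controls, the base R sketch below splits a combined column into latitude and longitude parts. It is not the package's implementation: the use of strsplit() with fixed = TRUE (a literal rather than regex separator) and the variable names are assumptions made for this example.

```r
# Illustrative only: split "lat<sep>lon" strings on a literal separator.
coords <- c("51.5;-0.1", "-33.9;151.2")
sep <- ";"

# fixed = TRUE treats sep as a literal string, not a regular expression
parts <- strsplit(coords, sep, fixed = TRUE)

# First piece is taken as latitude, second as longitude
lat <- as.numeric(vapply(parts, `[`, character(1), 1))
lon <- as.numeric(vapply(parts, `[`, character(1), 2))
```

Matching the separator literally avoids surprises with characters such as "|" or ".", which have special meaning in regular expressions.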
Value
A data frame containing only rows with valid coordinates, with the same
columns as data. Removed rows are attached as
attr(result, "invalid") for inspection. A console message reports
the total number of rows removed.
Details
All coordinate formats are parsed to decimal degrees internally before
range validation. The parser handles decimal, DMS, and base-60 formats,
inferring sign from cardinal direction suffixes (S, W) or
the sign of the degree value. Zero-width and BOM characters are stripped
before parsing.
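The conversion described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's parser: to_decimal is a hypothetical helper, it does not strip zero-width or BOM characters, and it only recognises an S or W suffix or a leading minus sign as indicating a negative value.

```r
# Hypothetical helper (not part of the package): convert coordinate strings
# in decimal, DMS, or base-60 form to decimal degrees.
to_decimal <- function(x) {
  x <- trimws(x)
  # Southern and western hemispheres, or a leading minus, mean a negative value
  negative <- grepl("[SW]\\s*$", x) | grepl("^-", x)
  # Extract the numeric parts in order: degrees, minutes, seconds
  parts <- regmatches(x, gregexpr("[0-9]+(\\.[0-9]+)?", x))
  vapply(seq_along(x), function(i) {
    p <- as.numeric(parts[[i]])
    if (length(p) == 0 || length(p) > 3) return(NA_real_)
    p <- c(p, 0, 0)[1:3]                      # pad missing minutes/seconds with 0
    value <- p[1] + p[2] / 60 + p[3] / 3600   # degrees + minutes/60 + seconds/3600
    if (negative[i]) -value else value
  }, numeric(1))
}

decimal <- to_decimal(c("51deg 30' 36'' N", "0deg 7' 30'' W", "-33.9", "48deg 51'"))
```

Taking the sign from the string rather than from the parsed degree value keeps inputs such as "0deg 7' 30'' W" negative, since a degree value of 0 carries no sign of its own.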
Either combined_col or both latitude and longitude
must be provided; supplying neither raises an error. When drop_na =
FALSE (the default), rows with NA coordinates are still removed
as they cannot pass range validation, and are captured in
attr(result, "invalid").
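The reason NA coordinates cannot pass range validation can be seen in plain R: comparing NA against the bounds yields NA rather than TRUE. The snippet below is only a sketch of that behaviour with made-up values; how the package implements the check internally is not shown here.

```r
lat <- c(51.5, 91.0, NA)

# A range check on NA is NA, so it is never TRUE
in_range <- lat >= -90 & lat <= 90

# Keeping only rows where the check is TRUE drops both the
# out-of-range value and the NA
keep <- !is.na(in_range) & in_range
lat[keep]
```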
Use latlong_format to check coordinate formats before
filtering, and latlong_column to identify coordinate columns
if their names are not known in advance.
See also
latlong_format for checking coordinate formats before
filtering,
latlong_column for detecting coordinate columns in a data
frame,
latlong_convert for converting DMS or base-60 columns to
decimal degrees before filtering,
latlong_range for filtering rows to a user-defined bounding
box,
latlong_region for filtering rows to named geographic regions.
Examples
df <- data.frame(
id = 1:5,
lat = c(51.5, 48.8, 91.0, -33.9, NA),
lon = c(-0.1, 2.3, 139.7, 151.2, 37.6)
)
# Filter using separate latitude and longitude columns
latlong_filter(df, latitude = lat, longitude = lon)
#> [latlong_filter] 2 row(s) removed with invalid or out-of-range coordinates
#> id lat lon
#> 1 1 51.5 -0.1
#> 2 2 48.8 2.3
#> 4 4 -33.9 151.2
# Inspect rows that were removed
result <- latlong_filter(df, latitude = lat, longitude = lon)
#> [latlong_filter] 2 row(s) removed with invalid or out-of-range coordinates
attr(result, "invalid")
#> id lat lon
#> 3 3 91 139.7
#> 5 5 NA 37.6
# Explicitly drop rows where either coordinate is NA; the result is the same
# here because the NA row already fails range validation
latlong_filter(df, latitude = lat, longitude = lon, drop_na = TRUE)
#> [latlong_filter] 2 row(s) removed with invalid or out-of-range coordinates
#> id lat lon
#> 1 1 51.5 -0.1
#> 2 2 48.8 2.3
#> 4 4 -33.9 151.2
# Filter using a combined coordinate column
df_combined <- data.frame(
id = 1:4,
coords = c("51.5,-0.1", "91.0,2.3", "-33.9,151.2", "48.8,181.0")
)
latlong_filter(df_combined, combined_col = coords)
#> [latlong_filter] 2 row(s) removed with invalid or out-of-range coordinates
#> id coords
#> 1 1 51.5,-0.1
#> 3 3 -33.9,151.2
# Combined column with a custom separator
df_sep <- data.frame(
coords = c("51.5;-0.1", "91.0;2.3", "-33.9;151.2")
)
latlong_filter(df_sep, combined_col = coords, sep = ";")
#> [latlong_filter] 1 row(s) removed with invalid or out-of-range coordinates
#> coords
#> 1 51.5;-0.1
#> 3 -33.9;151.2
