Detects and reports the coordinate format – decimal degrees, DMS
(DDdeg.MM'SS''}), or base-60 (\code{DDdeg.MM') – of values in one or more
columns. Combined columns (latitude and longitude stored as a single
delimited string) are split before format detection. Results are returned
as a named list, one element per column.
Arguments
- data
A data frame containing coordinate columns.
- columns
Column name or
c()of column names to check, supplied either unquoted (lat) or quoted ("lat").- sep
Character. Separator used to split combined coordinate columns before format detection. Default is
",".- drop_na
Logical. If
TRUE, values that do not match any recognized coordinate format are excluded before summarizing results. Default isTRUE.
Value
A named list with one element per column in columns. Each element
is itself a named list with two components:
formatCharacter vector of detected format names present in the column. One or more of
"decimal","dms","base60". Returns"unknown"if no values match any recognised format (or if all values are excluded bydrop_na).countsInteger vector of the same length as
format, giving the number of values matching each detected format.
Details
The three recognised coordinate formats are:
Decimal degrees:
"-12.345"or"51.5"DMS:
"12deg.34'56''N"Base-60:
"12deg.34'N"
A column may return multiple formats if values are inconsistently
formatted – for example, a mix of decimal and DMS entries. This is
reported rather than resolved, allowing the user to decide how to
handle mixed formats before passing columns to
latlong_combine or latlong_split.
Combined columns (those containing sep) are detected automatically
and split before format checking, so the same sep used in
latlong_combine or latlong_split should be
passed here for consistent results.
See also
latlong_column for detecting which columns in a data frame
contain coordinates,
latlong_combine for merging separate coordinate columns into
one,
latlong_split for splitting a combined coordinate column into
separate latitude and longitude columns,
latlong_filter for removing invalid coordinates after checking
formats,
latlong_convert for converting coordinate formats after
checking.
Examples
df <- data.frame(
id = 1:4,
lat = c("51.5", "48.8", "40.7", "35.6"),
lon = c("-0.1", "2.3", "-74.0", "139.7")
)
# Check format of a single column
latlong_format(df, lat)
#> $lat
#> $lat$format
#> [1] "decimal"
#>
#> $lat$counts
#> [1] 4
#>
#>
# Check multiple columns at once
latlong_format(df, c(lat, lon))
#> $lat
#> $lat$format
#> [1] "decimal"
#>
#> $lat$counts
#> [1] 4
#>
#>
#> $lon
#> $lon$format
#> [1] "decimal"
#>
#> $lon$counts
#> [1] 4
#>
#>
# Mixed formats in one column
df_mixed <- data.frame(
coords = c("51.5", "48deg.52'N", "40.7", "35deg.36'00''N")
)
latlong_format(df_mixed, coords)
#> $coords
#> $coords$format
#> [1] "decimal" "dms" "base60"
#>
#> $coords$counts
#> [1] 2 1 1
#>
#>
# Combined latitude-longitude column with custom separator
df_combined <- data.frame(
latlon = c("51.5;-0.1", "48.8;2.3", "40.7;-74.0")
)
latlong_format(df_combined, latlon, sep = ";")
#> $latlon
#> $latlon$format
#> [1] "decimal"
#>
#> $latlon$counts
#> [1] 6
#>
#>
# Include unknown-format values in counts
df_dirty <- data.frame(
lat = c("51.5", "not_a_coord", "40.7", NA)
)
latlong_format(df_dirty, lat, drop_na = FALSE)
#> $lat
#> $lat$format
#> [1] "decimal"
#>
#> $lat$counts
#> [1] 2
#>
#>
