The base function as.factor() is not a generic, but forcats::as_factor() is. haven provides as_factor() methods for labelled() and labelled_spss() vectors, and for data frames. By default, the data frame method converts only the labelled columns and leaves all other columns unchanged.

Usage

# S3 method for data.frame
as_factor(x, ..., only_labelled = TRUE)

# S3 method for haven_labelled
as_factor(
  x,
  levels = c("default", "labels", "values", "both"),
  ordered = FALSE,
  ...
)

# S3 method for labelled
as_factor(
  x,
  levels = c("default", "labels", "values", "both"),
  ordered = FALSE,
  ...
)

Arguments

x

Object to coerce to a factor.

...

Other arguments passed down to method.

only_labelled

Only apply to labelled columns?

levels

How to create the levels of the generated factor:

  • "default": uses labels where available, otherwise the values. Labels are sorted by value.

  • "both": like "default", but pastes together the value and the label

  • "labels": use only the labels; unlabelled values become NA

  • "values": use only the values

ordered

If TRUE create an ordered (ordinal) factor, if FALSE (the default) create a regular (nominal) factor.
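As a minimal sketch of the ordered argument, the snippet below builds a small labelled vector (the values and labels are made up for illustration) and converts it to an ordered factor. It assumes haven is installed and relies on the levels being sorted by value, as described for "default" above.

```r
library(haven)

# Illustrative data: a 1-5 rating where only the endpoints are labelled
rating <- labelled(c(1, 5, 3), c(Bad = 1, Good = 5))

# ordered = TRUE requests an ordered (ordinal) factor
f <- as_factor(rating, ordered = TRUE)

is.ordered(f)
levels(f)
```

An ordered factor supports comparisons such as f < "Good", which can be convenient for rating scales.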

Details

Methods are provided for both the haven_labelled and labelled classes; the labelled method is kept for backward compatibility.

Examples

x <- labelled(sample(5, 10, replace = TRUE), c(Bad = 1, Good = 5))

# Default method uses labels where available, otherwise the values
as_factor(x)
#>  [1] Good 2    Bad  2    2    Bad  3    Good Bad  3   
#> Levels: Bad 2 3 Good
# You can also extract just the labels
as_factor(x, levels = "labels")
#>  [1] Good <NA> Bad  <NA> <NA> Bad  <NA> Good Bad  <NA>
#> Levels: Bad Good
# Or just the values
as_factor(x, levels = "values")
#>  [1] 5 2 1 2 2 1 3 5 1 3
#> Levels: 1 2 3 5
# Or combine value and label
as_factor(x, levels = "both")
#>  [1] [5] Good 2        [1] Bad  2        2        [1] Bad  3       
#>  [8] [5] Good [1] Bad  3       
#> Levels: [1] Bad 2 3 [5] Good

# as_factor() will preserve SPSS missing values from values and ranges
y <- labelled_spss(1:10, na_values = c(2, 4), na_range = c(8, 10))
as_factor(y)
#>  [1] 1  2  3  4  5  6  7  8  9  10
#> Levels: 1 2 3 4 5 6 7 8 9 10
# Use zap_missing() first to convert the SPSS missing values to NAs
zap_missing(y)
#>  [1]  1 NA  3 NA  5  6  7 NA NA NA
#> attr(,"class")
#> [1] "haven_labelled"
as_factor(zap_missing(y))
#>  [1] 1    <NA> 3    <NA> 5    6    7    <NA> <NA> <NA>
#> Levels: 1 3 5 6 7
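
The data frame method is not covered by the examples above. The following is a small sketch, assuming haven is installed; the data frame and its column names (id, rating) are invented for illustration. With the default only_labelled = TRUE, only the labelled column should be converted.

```r
library(haven)

# Hypothetical data frame with one plain column and one labelled column
df <- data.frame(id = 1:3)
df$rating <- labelled(c(1, 5, 1), c(Bad = 1, Good = 5))

# Default: only labelled columns are converted to factors
out <- as_factor(df)

is.factor(out$rating)
is.factor(out$id)
```

Arguments such as levels are passed through ... to the vector method, so as_factor(df, levels = "values") would apply that choice to each labelled column.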