read_sav() reads both .sav and .zsav files; write_sav() creates .zsav files when compress = TRUE. read_por() reads .por files. read_spss() uses either read_por() or read_sav() based on the file extension.

read_sav(
  file,
  encoding = NULL,
  user_na = FALSE,
  col_select = NULL,
  skip = 0,
  n_max = Inf,
  .name_repair = "unique"
)

read_por(
  file,
  user_na = FALSE,
  col_select = NULL,
  skip = 0,
  n_max = Inf,
  .name_repair = "unique"
)

write_sav(data, path, compress = FALSE)

read_spss(
  file,
  user_na = FALSE,
  col_select = NULL,
  skip = 0,
  n_max = Inf,
  .name_repair = "unique"
)

Arguments

file

Either a path to a file, a connection, or literal data (either a single string or a raw vector).

Files ending in .gz, .bz2, .xz, or .zip will be automatically uncompressed. Files starting with http://, https://, ftp://, or ftps:// will be automatically downloaded. Remote gz files can also be automatically downloaded and decompressed.

Literal data is most useful for examples and tests. It must contain at least one new line to be recognised as data (instead of a path) or be a vector of greater than length 1.

Using a value of clipboard() will read from the system clipboard.

encoding

The character encoding used for the file. The default, NULL, use the encoding specified in the file, but sometimes this value is incorrect and it is useful to be able to override it.

user_na

If TRUE variables with user defined missing will be read into labelled_spss() objects. If FALSE, the default, user-defined missings will be converted to NA.

col_select

One or more selection expressions, like in dplyr::select(). Use c() or list() to use more than one expression. See ?dplyr::select for details on available selection options. Only the specified columns will be read from data_file.

skip

Number of lines to skip before reading data.

n_max

Maximum number of lines to read.

.name_repair

Treatment of problematic column names:

  • "minimal": No name repair or checks, beyond basic existence,

  • "unique": Make sure names are unique and not empty,

  • "check_unique": (default value), no name repair, but check they are unique,

  • "universal": Make the names unique and syntactic

  • a function: apply custom name repair (e.g., .name_repair = make.names for names in the style of base R).

  • A purrr-style anonymous function, see rlang::as_function()

This argument is passed on as repair to vctrs::vec_as_names(). See there for more details on these terms and the strategies used to enforce them.

data

Data frame to write.

path

Path to a file where the data will be written.

compress

If TRUE, will compress the file, resulting in a .zsav file. Otherwise the .sav file will be bytecode compressed.

Value

A tibble, data frame variant with nice defaults.

Variable labels are stored in the "label" attribute of each variable. It is not printed on the console, but the RStudio viewer will show it.

write_sav() returns the input data invisibly.

Details

Currently haven can read and write logical, integer, numeric, character and factors. See labelled_spss() for how labelled variables in SPSS are handled in R.

Examples

path <- system.file("examples", "iris.sav", package = "haven") read_sav(path)
#> # A tibble: 150 x 5 #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> #> 1 5.1 3.5 1.4 0.2 1 [setosa] #> 2 4.9 3 1.4 0.2 1 [setosa] #> 3 4.7 3.2 1.3 0.2 1 [setosa] #> 4 4.6 3.1 1.5 0.2 1 [setosa] #> 5 5 3.6 1.4 0.2 1 [setosa] #> 6 5.4 3.9 1.7 0.4 1 [setosa] #> 7 4.6 3.4 1.4 0.3 1 [setosa] #> 8 5 3.4 1.5 0.2 1 [setosa] #> 9 4.4 2.9 1.4 0.2 1 [setosa] #> 10 4.9 3.1 1.5 0.1 1 [setosa] #> # … with 140 more rows
tmp <- tempfile(fileext = ".sav") write_sav(mtcars, tmp) read_sav(tmp)
#> # A tibble: 32 x 11 #> mpg cyl disp hp drat wt qsec vs am gear carb #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 #> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4 #> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1 #> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1 #> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2 #> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1 #> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4 #> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2 #> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2 #> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4 #> # … with 22 more rows