Currently haven can read and write logical, integer, numeric, character and factors. See labelled for how labelled variables in Stata are handled in R.

read_dta(file, encoding = NULL)

read_stata(file, encoding = NULL)

write_dta(data, path, version = 14)

Arguments

file

Either a path to a file, a connection, or literal data (either a single string or a raw vector).

Files ending in .gz, .bz2, .xz, or .zip will be automatically uncompressed. Files starting with http://, https://, ftp://, or ftps:// will be automatically downloaded. Remote gz files can also be automatically downloaded and decompressed.

Literal data is most useful for examples and tests. It must contain at least one new line to be recognised as data (instead of a path).

encoding

The character encoding used for the file. This defaults to the encoding specified in the file, or UTF-8. But older versions of Stata (13 and earlier) did not store the encoding used, and you'll need to specify manually. A commonly used value is "Win 1252".

data

Data frame to write.

path

Path to a file where the data will be written.

version

File version to use. Supports versions 8-14.

Value

A tibble, data frame variant with nice defaults.

Variable labels are stored in the "label" attribute of each variable. It is not printed on the console, but the RStudio viewer will show it.

Examples

path <- system.file("examples", "iris.dta", package = "haven") read_dta(path)
#> # A tibble: 150 x 5 #> sepallength sepalwidth petallength petalwidth species #> <dbl> <dbl> <dbl> <dbl> <chr> #> 1 5.1 3.5 1.4 0.2 setosa #> 2 4.9 3.0 1.4 0.2 setosa #> 3 4.7 3.2 1.3 0.2 setosa #> 4 4.6 3.1 1.5 0.2 setosa #> 5 5.0 3.6 1.4 0.2 setosa #> 6 5.4 3.9 1.7 0.4 setosa #> 7 4.6 3.4 1.4 0.3 setosa #> 8 5.0 3.4 1.5 0.2 setosa #> 9 4.4 2.9 1.4 0.2 setosa #> 10 4.9 3.1 1.5 0.1 setosa #> # ... with 140 more rows
tmp <- tempfile(fileext = ".dta") write_dta(mtcars, tmp) read_dta(tmp)
#> # A tibble: 32 x 11 #> mpg cyl disp hp drat wt qsec vs am gear carb #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 #> 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 #> 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 #> 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 #> 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 #> 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 #> 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 #> 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 #> 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 #> 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 #> # ... with 22 more rows
read_stata(tmp)
#> # A tibble: 32 x 11 #> mpg cyl disp hp drat wt qsec vs am gear carb #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 #> 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 #> 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 #> 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 #> 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 #> 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 #> 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 #> 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 #> 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 #> 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 #> # ... with 22 more rows