| <div id="main" class="col-md-9" role="main"> |
| |
| # CSV Convert Options |
| |
| <div class="ref-description section level2"> |
| |
| CSV Convert Options |
| |
| </div> |
| |
| <div class="section level2"> |
| |
| ## Usage |
| |
| <div class="sourceCode"> |
| |
| ``` r |
| csv_convert_options( |
| check_utf8 = TRUE, |
| null_values = c("", "NA"), |
| true_values = c("T", "true", "TRUE"), |
| false_values = c("F", "false", "FALSE"), |
| strings_can_be_null = FALSE, |
| col_types = NULL, |
| auto_dict_encode = FALSE, |
| auto_dict_max_cardinality = 50L, |
| include_columns = character(), |
| include_missing_columns = FALSE, |
| timestamp_parsers = NULL, |
| decimal_point = "." |
| ) |
| ``` |
| |
| </div> |
| |
| </div> |
| |
| <div class="section level2"> |
| |
| ## Arguments |
| |
| - check_utf8: |
| |
| Logical: check UTF8 validity of string columns? |
| |
| - null_values: |
| |
| Character vector of recognized spellings for null values. Analogous |
| to the `na.strings` argument to `read.csv()` or `na` in |
| `readr::read_csv()`. |
| |
| - true_values: |
| |
| Character vector of recognized spellings for `TRUE` values |
| |
| - false_values: |
| |
| Character vector of recognized spellings for `FALSE` values |
| |
| - strings_can_be_null: |
| |
| Logical: can string / binary columns have null values? Similar to |
| the `quoted_na` argument to `readr::read_csv()` |
| |
| - col_types: |
| |
| A `Schema` or `NULL` to infer types |
| |
| - auto_dict_encode: |
| |
| Logical: Whether to try to automatically dictionary-encode string / |
| binary data (think `stringsAsFactors`). This setting is ignored for |
| non-inferred columns (those in `col_types`). |
| |
| - auto_dict_max_cardinality: |
| |
| If `auto_dict_encode`, string/binary columns are dictionary-encoded |
| up to this number of unique values (default 50), after which it |
| switches to regular encoding. |
| |
| - include_columns: |
| |
| If non-empty, indicates the names of columns from the CSV file that |
| should be actually read and converted (in the vector's order). |
| |
| - include_missing_columns: |
| |
| Logical: if `include_columns` is provided, should columns named in |
| it but not found in the data be included as a column of type |
| `null()`? The default (`FALSE`) means that the reader will instead |
| raise an error. |
| |
| - timestamp_parsers: |
| |
| User-defined timestamp parsers. If more than one parser is |
| specified, the CSV conversion logic will try parsing values starting |
| from the beginning of this vector. Possible values are (a) `NULL`, |
| the default, which uses the ISO-8601 parser; (b) a character vector |
| of [strptime](https://rdrr.io/r/base/strptime.html) parse strings; |
| or (c) a list of |
| [TimestampParser](https://arrow.apache.org/docs/r/reference/CsvReadOptions.md) |
| objects. |
| |
| - decimal_point: |
| |
| Character to use for decimal point in floating point numbers. |
| |
| </div> |
| |
| <div class="section level2"> |
| |
| ## Examples |
| |
| <div class="sourceCode"> |
| |
| ``` r |
| tf <- tempfile() |
| on.exit(unlink(tf)) |
| writeLines("x\n1\nNULL\n2\nNA", tf) |
| read_csv_arrow(tf, convert_options = csv_convert_options(null_values = c("", "NA", "NULL"))) |
| #> # A tibble: 4 x 1 |
| #> x |
| #> <int> |
| #> 1 1 |
| #> 2 NA |
| #> 3 2 |
| #> 4 NA |
| open_csv_dataset(tf, convert_options = csv_convert_options(null_values = c("", "NA", "NULL"))) |
| #> FileSystemDataset with 1 csv file |
| #> 1 columns |
| #> x: int64 |
| ``` |
| |
| </div> |
| |
| </div> |
| |
| </div> |