Discretize

Convenience functions to discretize a numeric vector using different methods

discretize()

Discretize a numeric vector

get_breaks()

Calculate breaks to discretize a numeric vector, using different methods

break_methods()

List break methods

label_breaks_cut()

Label breaks as base::cut would do

label_breaks_interval()

Label breaks as an interval

label_breaks_value()

Label breaks using only point values (discarding lowest)
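
As a point of reference for the labelling styles above, here is a minimal base R sketch of discretizing a vector at explicit break points with base::cut. It does not call the package's functions (their arguments are not shown in this index); the vector and the breaks are made-up illustration values.

```r
# Illustrative data and break points (assumptions, not package defaults)
x <- c(2, 7, 15, 31)
breaks <- c(0, 10, 20, 40)

# base::cut labels each bin as a right-closed interval, e.g. "(0,10]"
binned <- cut(x, breaks = breaks)
as.character(binned)
# "(0,10]"  "(0,10]"  "(10,20]" "(20,40]"

# Breaks can also be derived from the data, e.g. from quantiles;
# include.lowest = TRUE keeps the minimum inside the first bin
q_breaks <- quantile(x, probs = c(0, 0.5, 1))
binned_q <- cut(x, breaks = q_breaks, include.lowest = TRUE)
```

Labelling a bin by a single point value, as label_breaks_value() describes, would amount to replacing each interval label with one of its end points.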

data.frame summarization

A way to summarize data. Although there are several tools out there, none of them fits my needs or workflow. So here’s a strongly opinionated approach to data.frame summarization.

abridge_df()

Summarize a data.frame using an opinionated (and perhaps controversial) approach

summarize_df()

Summarize a data.frame by applying summarize_obj to each column

summarize_obj()

Summarize an object, typically a vector (a column from a data.frame)

Join data

Functions that come to the rescue when your dirty data do not play nice with a simple left_join

phased_left_join()

Left join lhs and rhs, using a phased approach

Plots

Quick and helpful plots, mostly to fit my EDA workflows. Quick in the sense that most of them should be a one-function-call plot. Helpful in the sense that they include info I will likely find useful.

denstogram()

Plot a denstogram

ecdfgram()

Plot the Empirical Cumulative Distribution Function for xvar

knitr and R Markdown helpers

Just little helpers to use in .Rmd files

time_chunk()

Time knitr chunks, by default, all of them

duckdb_explain_hook()

Add chunk hook functionality to print EXPLAIN results from DuckDB

blogdown helpers

Just an RStudio addin to avoid typing blogdown::stop_server()

blogdown_stop_server()

Stop blogdown (hugo/jekyll) server

Text file helpers

Just little helpers to work with plain text files

file_head()

Glimpse first n lines of a file

file_tail()

Glimpse last n lines of a file

file_glimpse()

Glimpse first and last n/2 lines of a file
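
The "first and last n/2 lines" idea can be sketched in base R as follows. This is only an illustration: glimpse_lines is a hypothetical helper, not the package's file_glimpse(), and it reads the whole file into memory, which the real function may well avoid.

```r
# Hypothetical helper: return the first and last n/2 lines of a text file
glimpse_lines <- function(path, n = 6) {
  lines <- readLines(path)
  # Short files are returned whole
  if (length(lines) <= n) return(lines)
  half <- n %/% 2
  c(head(lines, half), tail(lines, half))
}

# Usage on a temporary 10-line file
tmp <- tempfile(fileext = ".txt")
writeLines(sprintf("row %d", 1:10), tmp)
glimpse_lines(tmp, n = 4)
# "row 1"  "row 2"  "row 9"  "row 10"
```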

guess_types()

Guess data types in a delimited text file (thin wrapper on data.table::fread)

Postgres

These functions wrap recurrent operations on data in PostgreSQL, typically with the server running on the local Windows machine. They automate some easy things. And, we have to admit it, we keep forgetting the syntax for some queries (looking at you, pg_get_running_queries).

pg_create_table()

Compose a SQL query to create a table in a Postgres database, matching the data in a text file

pg_copy_file()

Copy data directly from a delimited text file (csv) to a table in PostgreSQL

pg_create_foreign_table()

Create a foreign table in Postgres, to directly access data in a text file

pg_exists_table()

Check if table_name exists in the Postgres database of given con

pg_total_relation_size()

Query table's size in the Postgres database of given con

pg_copy_data()

Copy data to a Postgres instance

pg_kill_query()

Kill a running query in Postgres

pg_running_queries()

List running queries in Postgres

Data wrangling helpers

Common data wrangling tasks

is_unique()

Check if a variable is unique, i.e., there are as many unique values in the object as the total number of values
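
The definition above translates almost directly into base R. The sketch below uses a hypothetical name, is_unique_sketch, and is not the package's implementation.

```r
# Hypothetical helper: TRUE when no value in x is repeated
is_unique_sketch <- function(x) {
  length(unique(x)) == length(x)
}

is_unique_sketch(c("a", "b", "c"))  # TRUE
is_unique_sketch(c("a", "b", "a"))  # FALSE
```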

filter_duplicates()

Filter duplicated rows by the columns indicated in by

filter_duplicates_dplyr()

Filter duplicated rows by the columns ...

modemfv_vctrs()

Find the mode in a vector x

modemfv()

Find the mode in a vector x
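
For readers unfamiliar with the term, the mode is the most frequent value. A common base R idiom for it is sketched below; mode_sketch is a hypothetical name, and returning the first value on ties is an assumption of this sketch, not documented behaviour of modemfv().

```r
# Hypothetical helper: most frequent value of x (first one wins on ties)
mode_sketch <- function(x) {
  ux <- unique(x)
  # Count how often each unique value occurs, then pick the largest count
  counts <- tabulate(match(x, ux))
  ux[which.max(counts)]
}

mode_sketch(c(1, 2, 2, 3))        # 2
mode_sketch(c("b", "a", "b"))     # "b"
```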

are_paired()

Check if the ... columns of the df have a one-to-one relation
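
A one-to-one relation means each value in one column always appears with the same single value in the other, and vice versa. The base R sketch below checks this for two columns only; are_paired_sketch and its arguments are hypothetical and do not reflect the signature of are_paired().

```r
# Hypothetical helper: TRUE when columns a and b of df map one-to-one
are_paired_sketch <- function(df, a, b) {
  # Keep each distinct (a, b) combination once
  pairs <- unique(df[, c(a, b)])
  # One-to-one: no value of a or b appears in more than one combination
  !anyDuplicated(pairs[[a]]) && !anyDuplicated(pairs[[b]])
}

df <- data.frame(
  id   = c(1, 2, 2, 3),
  code = c("x", "y", "y", "z"),
  grp  = c("g1", "g1", "g1", "g2")
)
are_paired_sketch(df, "id", "code")  # TRUE
are_paired_sketch(df, "id", "grp")   # FALSE: "g1" goes with ids 1 and 2
```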

Text wrangling helpers

A few functions to help in processing text data

normalize_text()

Normalize text, removing spaces, common punctuation and non-ascii characters

parse_annotations()

Parse a hierarchical annotation into a data.frame

parse_single_annotation()

Parse a hierarchical annotation into a data.frame

ggplot2 helpers

Functions to reduce code duplication in writing ggplot2

bar_just()

Calculate hjust/vjust to align labels to the end of bars in ggplot2

bar_just_facet()

Calculate hjust/vjust to align labels to the end of bars in ggplot2

Misc

Miscellaneous functions (within an already miscellaneous package)

attr()

base::attr with a default value instead of NULL, and always exact
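
The behaviour described can be approximated with a thin wrapper over base::attr. The name attr_or_default and the default argument are assumptions made for this sketch; the package's own argument names are not shown in this index.

```r
# Hypothetical wrapper: exact attribute lookup with a fallback value
attr_or_default <- function(x, which, default = NULL) {
  # exact = TRUE disables base::attr's partial name matching
  value <- base::attr(x, which, exact = TRUE)
  if (is.null(value)) default else value
}

v <- structure(1:3, units = "cm")
attr_or_default(v, "units", default = "unknown")  # "cm"
attr_or_default(v, "unit", default = "unknown")   # "unknown" (no partial match)
```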

first_non_null()

Return the first non-null argument
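
A minimal sketch of the idea, under the hypothetical name first_non_null_sketch; returning NULL when every argument is NULL is an assumption of this sketch.

```r
# Hypothetical helper: first argument that is not NULL, else NULL
first_non_null_sketch <- function(...) {
  for (arg in list(...)) {
    if (!is.null(arg)) return(arg)
  }
  NULL
}

first_non_null_sketch(NULL, NULL, "fallback", "ignored")  # "fallback"
```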

tab()

One-way tabulate table

assert()

Assert a condition within a data frame

is_invalid_number()

Test if x is NA, NaN, Inf, -Inf, NULL
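
In base R, is.finite() already returns FALSE for NA, NaN, Inf and -Inf, so the test can be sketched as below. is_invalid_number_sketch is a hypothetical name, and treating NULL as a single TRUE is an assumption about how the NULL case is handled.

```r
# Hypothetical helper: TRUE for NA, NaN, Inf, -Inf (element-wise) and for NULL
is_invalid_number_sketch <- function(x) {
  if (is.null(x)) return(TRUE)
  !is.finite(x)
}

is_invalid_number_sketch(c(1, NA, NaN, Inf, -Inf))
# FALSE  TRUE  TRUE  TRUE  TRUE
is_invalid_number_sketch(NULL)
# TRUE
```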

clipboard_readfrom()

Read from the clipboard to create a data.frame

clipboard_writeto()

Copy an object to the clipboard

memory_use()

Print memory use (total and each object in the global environment)

timestamp_it()

Append a timestamp to x, typically a path

ellipt_colnames()

"Delete by ellipsis" column names, putting the original column names as a tooltip

open_file()

Open a file or URL in the system's default application

`%#%` `%# %`

Function that defines the operator %#% to concatenate strings

`%#_%`

Function that defines the operator %#_% to concatenate strings
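
For illustration, string-concatenation operators of this kind can be defined in a few lines of base R. The definitions below are a sketch, not the package's code; in particular, the separators assumed here (none for %#%, an underscore for %#_%) are guesses from the operator names.

```r
# Sketch: concatenate with no separator (assumed behaviour)
`%#%` <- function(lhs, rhs) paste0(lhs, rhs)

# Sketch: concatenate with an underscore (assumed behaviour)
`%#_%` <- function(lhs, rhs) paste(lhs, rhs, sep = "_")

"report" %#% ".csv"    # "report.csv"
"report" %#_% "2024"   # "report_2024"
```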