R

Change R (Rscript) Windows Environment Path from OneDrive

Pain in the butt: once OneDrive is installed on a computer, using R and Rscript from the command line gets painful.

Since the Documents folder has been hijacked by OneDrive, whenever you open the directory C:\Users\YourName\Documents it automatically redirects to C:\Users\YourName\OneDrive - Spectrum Health\Documents. You don’t want to back up a 2-GB R library folder to OneDrive.

To reach the real C:\Users\YourName\Documents instead of C:\Users\YourName\OneDrive - Spectrum Health\Documents, you have to go to the C: drive, then the Users folder, then the YourName folder, then Documents.
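A minimal sketch of how to check which folders R uses and how to point the current session at a local library folder. The path in `local_lib` and the helper `use_local_lib()` are hypothetical and only for illustration; `.libPaths()` and the `R_LIBS_USER` environment variable are standard R.

```r
# Where does R currently look for packages, and which folder is "home"?
print(.libPaths())
print(Sys.getenv("R_LIBS_USER"))
print(path.expand("~"))

# Hypothetical local library folder outside OneDrive; replace with your own.
local_lib <- "C:/Users/YourName/Documents/R/win-library"

# Hypothetical helper: put a folder first on the library search path
# for this session, but only if the folder actually exists.
use_local_lib <- function(path) {
  if (dir.exists(path)) {
    .libPaths(c(path, .libPaths()))
  }
  invisible(.libPaths())
}

use_local_lib(local_lib)
```

To make the change permanent, one option is to set `R_LIBS_USER` to the local folder in a `.Renviron` file, which R reads at startup.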

Sparse Matrix and Dummy Variables

Why sparse matrix?

XGBoost only works with matrices that contain all numeric variables; consequently, we need to one-hot encode our data (UC Business Analytics R Programming Guide). caret::preProcess uses bagged regression trees to recover missing values (Yevhen Vasylenko), which also requires all numeric variables.

There are different ways to do this in R.

library(tidyverse)
dd <- data.frame(a = gl(3,4), b = gl(4,1,12), c = 1:12, d = sample(c("X", "Y", "Z"), 12, replace = TRUE))
str(dd)

'data.
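One of those ways can be sketched with `model.matrix()` from base R and `sparse.model.matrix()` from the Matrix package (a recommended package shipped with R). This is only an illustration on the same toy data frame, not necessarily the approach the post goes on to use; `set.seed(1)` and `stringsAsFactors = TRUE` are additions made here for reproducibility.

```r
library(Matrix)

set.seed(1)
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12), c = 1:12,
                 d = sample(c("X", "Y", "Z"), 12, replace = TRUE),
                 stringsAsFactors = TRUE)

# Dense dummy-variable matrix. Dropping the intercept with "- 1" keeps
# every level of the first factor; the remaining factors still drop
# their reference level (treatment contrasts).
dense <- model.matrix(~ . - 1, data = dd)

# Same encoding stored as a sparse matrix: only non-zero cells are kept.
sparse <- sparse.model.matrix(~ . - 1, data = dd)

print(colnames(dense))
print(class(sparse))
print(object.size(dense))
print(object.size(sparse))
```

On a data set this small the sparse version saves nothing; the benefit shows up when factors have many levels and most cells are zero.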

Database with R: PostgreSQL

Load library

library(tidyverse)
library(RPostgres)
library(DBI)

Connect to PostgreSQL

Connect Method 1 with RPostgres

con1 <- DBI::dbConnect(RPostgres::Postgres(), dbname = "testdb1", password = rstudioapi::askForPassword("Database password"))
dbListTables(con1)
#dbWriteTable(con1, "mtcars", mtcars)
#dbWriteTable(con1, "flights", nycflights13::flights)

rstudioapi::askForPassword requires interactive password input:

In the command line (or block-run), a pop-up dialog asks for the password.

In knit mode, render the RMarkdown file as follows: rmarkdown::render("MyDocument.Rmd", params = "ask"). See Parameter User Interfaces.

In blogdown, serve_site will halt at the above block.
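Because the interactive prompt blocks non-interactive runs, one possible workaround is to read the password from an environment variable first and only fall back to the dialog when a session is interactive. The helper `get_db_password()` and the variable name `TESTDB1_PASSWORD` are hypothetical, made up for this sketch.

```r
# Hypothetical helper: prefer an environment variable, fall back to the
# RStudio dialog only when running interactively inside RStudio.
get_db_password <- function(var = "TESTDB1_PASSWORD") {
  pw <- Sys.getenv(var, unset = "")
  if (nzchar(pw)) {
    return(pw)
  }
  if (interactive() &&
      requireNamespace("rstudioapi", quietly = TRUE) &&
      rstudioapi::isAvailable()) {
    return(rstudioapi::askForPassword("Database password"))
  }
  stop("No password found: set ", var, " or run interactively in RStudio.")
}

# Usage (not run here, since it needs a running PostgreSQL server):
# con1 <- DBI::dbConnect(RPostgres::Postgres(), dbname = "testdb1",
#                        password = get_db_password())
# DBI::dbListTables(con1)
# DBI::dbDisconnect(con1)
```

The variable could be set in `.Renviron` so that knitting or serving a site does not stop at the connection block.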

Time Series Analysis II: zoo

zoo: S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations)

An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo’s key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.
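As a small illustration of an irregular series, the sketch below builds a zoo object indexed by unevenly spaced dates and fills a gap by linear interpolation along the index. The dates and values are made up for the example.

```r
library(zoo)

# Irregularly spaced dates (made-up data)
dates <- as.Date(c("2018-01-02", "2018-01-05", "2018-01-09", "2018-01-10"))
z <- zoo(c(1.5, 2.0, NA, 3.5), order.by = dates)

print(index(z))     # the ordered index (Date class here)
print(coredata(z))  # the observations without the index

# Linear interpolation of the missing value, weighted by the date gaps
z_filled <- na.approx(z)
print(z_filled)
```

The index could just as well be numeric or POSIXct, which is the index-class independence mentioned above.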

Base Time-Series Objects

stats::ts

Credit: Time Series Analysis in R Part 1: The Time Series Object by DataSciencePlus
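For comparison, a regular monthly series with base `stats::ts` might look like the sketch below. The values are made up; only the start period and frequency matter for the illustration.

```r
# Twelve monthly observations starting January 2018 (made-up values)
x <- ts(c(10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 21, 20),
        start = c(2018, 1), frequency = 12)

print(start(x))
print(end(x))
print(frequency(x))

# Extract June through August 2018
summer <- window(x, start = c(2018, 6), end = c(2018, 8))
print(summer)
```

A ts object stores only the start, end, and frequency, so it cannot represent unevenly spaced observations.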

US State and County Choropleth Map (Heatmap)

In this #TidyTuesday post, I show an example that puts together six common geographic visualization methods for heat maps, or choropleth maps.

Code is available on GitHub

Counties

Map data from maps

Corrected to match with American Community Survey (ACS)

CONUS (48 states and DC)

Map data from albersusa

50 States and DC

Using R6 class choroplethr

50 States and DC

Time Series Analysis I: date and time

Date class from base::Date and lubridate

d1 <- Sys.Date()
d1; class(d1)

[1] "2018-09-12"

[1] "Date"

dates <- c("02/27/92", "02/27/92", "01/14/92")
dd <- as.Date(dates, "%m/%d/%y")
dd; class(dd)

[1] "1992-02-27" "1992-02-27" "1992-01-14"

[1] "Date"
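One thing worth knowing about the two-digit `%y` format used above: on input, R maps years 00 to 68 to 20xx and 69 to 99 to 19xx. A short sketch with made-up dates shows the cutoff, and how a four-digit `%Y` avoids the ambiguity.

```r
# Two-digit years: 69-99 become 19xx, 00-68 become 20xx
two_digit <- as.Date(c("02/27/92", "02/27/69", "02/27/68"), format = "%m/%d/%y")
print(format(two_digit, "%Y-%m-%d"))
# [1] "1992-02-27" "1969-02-27" "2068-02-27"

# Four-digit years are unambiguous
four_digit <- as.Date("02/27/1968", format = "%m/%d/%Y")
print(format(four_digit, "%Y-%m-%d"))
# [1] "1968-02-27"
```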

d2 <- lubridate::as_date("2018-07-05")
d2; class(d2)

[1] "2018-07-05"

[1] "Date"

POSIX* class from base::as.POSIX*

POSIXt types, POSIXct and POSIXlt:

ct, calendar time, stores the number of seconds since the origin
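The difference between the two POSIXt types can be sketched in base R as follows. The timestamp is made up; POSIXct is a single number of seconds since 1970-01-01 00:00:00 UTC, while POSIXlt is a list of broken-down components.

```r
# POSIXct: one number of seconds since the origin
ct <- as.POSIXct("2018-09-12 08:30:00", tz = "UTC")
print(class(ct))
print(unclass(ct))

# POSIXlt: a list-like structure with named components
lt <- as.POSIXlt(ct)
print(lt$hour)          # hour of the day
print(lt$mon + 1)       # months are stored as 0-11
print(lt$year + 1900)   # years are stored as years since 1900

# Converting back gives the same instant
print(as.POSIXct(lt) == ct)
```

POSIXct is usually the more convenient type for data frame columns, while POSIXlt is handy when individual components are needed.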

R Spatial Data Analysis 4: stars

R Spatial Data Analysis 3: Simple Features

R Spatial Data Analysis 2: Spatial Class

R Spatial Data Analysis 1: from Data to Spatial Data

R is popular for data analysis, and it is becoming more and more favored for geospatial analysis and visualization. To introduce spatial data, let’s first start with common data. Basic R data types: vector, factor, matrix, data.frame, and list.

Data and Plots

There are many ways to manipulate and visualize data in R, typically including base R and the tidyverse framework. Let’s warm up by plotting a data.frame.

basic plot

attach(mtcars)
par(mfrow = c(1,2))
plot(mpg, wt, main = "wt vs.
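As a quick refresher on the basic data types named above (vector, factor, matrix, data.frame, and list), here is a small sketch with made-up values. The variable names are chosen only for illustration.

```r
v  <- c(21.0, 22.8, 18.7)                   # numeric vector
f  <- factor(c("manual", "auto", "auto"))   # factor with two levels
m  <- matrix(1:6, nrow = 2)                 # 2 x 3 integer matrix
df <- data.frame(mpg = v, gear = f)         # columns of equal length
l  <- list(numbers = v, labels = f, table = df)  # arbitrary components

# First class of each object
print(sapply(list(v = v, f = f, m = m, df = df, l = l),
             function(x) class(x)[1]))

str(df)
```

Spatial classes build on these same structures, typically a data.frame with an additional geometry component.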