Data Science

Database with R: PostgreSQL

Load library

library(tidyverse)

library(RPostgres)

library(DBI)

Connect to PostgreSQL

Connect Method 1 with RPostgres

con1 <- DBI::dbConnect(RPostgres::Postgres(), dbname = "testdb1", password = rstudioapi::askForPassword("Database password"))

dbListTables(con1)

#dbWriteTable(con1, "mtcars", mtcars)

#dbWriteTable(con1, "flights", nycflights13::flights)

rstudioapi::askForPassword requires the password to be entered interactively:

In the command line (or when running a chunk), a pop-up dialog asks for the password.

In knit mode, render the RMarkdown file as follows:

rmarkdown::render("MyDocument.Rmd", params = "ask")

See Parameter User Interfaces.

In blogdown, serve_site will halt at the above block.
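Because the prompt blocks non-interactive rendering, one possible workaround is to read the password from an environment variable and fall back to the dialog only in an interactive session. The sketch below is an assumption, not part of the original post: get_db_password is a hypothetical helper, and PGPASSWORD is used simply as a conventional variable name.

```r
# Hypothetical helper: prefer an environment variable, fall back to a prompt.
get_db_password <- function(var = "PGPASSWORD") {
  pw <- Sys.getenv(var, unset = "")
  if (nzchar(pw)) {
    return(pw)
  }
  # Only open the dialog when someone is there to answer it
  if (interactive() && requireNamespace("rstudioapi", quietly = TRUE)) {
    return(rstudioapi::askForPassword("Database password"))
  }
  stop("No password available: set ", var, " before rendering.")
}

# Usage (commented out because it needs a running PostgreSQL server):
# con1 <- DBI::dbConnect(RPostgres::Postgres(),
#                        dbname = "testdb1",
#                        password = get_db_password())
```

With the variable set before the site is built, the connection block no longer waits for input.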

Time Series Analysis II: zoo

zoo: S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations)

An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo’s key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.
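To make "ordered indexed observations" concrete, here is a minimal sketch with made-up values. It assumes the zoo package is installed; the dates and numbers are illustrative only.

```r
library(zoo)

# Irregularly spaced dates, deliberately given out of order
idx <- as.Date(c("2018-02-01", "2018-01-03", "2018-01-10"))
val <- c(2.1, 1.2, 3.4)

# zoo() pairs each observation with its index and sorts by the index
z <- zoo(val, order.by = idx)

index(z)     # the ordered Date index
coredata(z)  # the values, reordered to match the index

# The index class is not fixed: plain numbers work the same way
z_num <- zoo(c(10, 20, 30), order.by = c(3, 1, 2))
coredata(z_num)
```

The second object illustrates the design goal of index independence: nothing in the call changes except the class of order.by.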

Base Time-Series Objects

stats::ts

Credit: Time Series Analysis in R Part 1: The Time Series Object by DataSciencePlus
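For comparison with zoo, a regular series in base R is described by a start point and a frequency rather than an explicit index. A short sketch with made-up monthly values:

```r
# Two years of made-up monthly values starting January 2018
x <- ts(1:24, start = c(2018, 1), frequency = 12)

start(x)      # first (year, period)
end(x)        # last (year, period)
frequency(x)  # observations per year

# window() extracts a sub-period using the same (year, period) notation
x_2019 <- window(x, start = c(2019, 1))
length(x_2019)
```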

US State and County Choropleth Map (Heatmap)

In this #TidyTuesday post, I put together examples of six common ways to draw a geographic heat map (choropleth map).

Code is available on GitHub

Counties

Map data from maps

Corrected to match with American Community Survey (ACS)

CONUS (48 states and DC)

Map data from albersusa

50 States and DC

Using R6 class choroplethr

50 States and DC

Time Series Analysis I: date and time

Date class from base::Date and lubridate

d1 <- Sys.Date()

d1; class(d1)

[1] "2018-09-12"

[1] "Date"

dates <- c("02/27/92", "02/27/92", "01/14/92")

dd <- as.Date(dates, "%m/%d/%y")

dd; class(dd)

[1] "1992-02-27" "1992-02-27" "1992-01-14"

[1] "Date"

d2 <- lubridate::as_date("2018-07-05")

d2; class(d2)

[1] "2018-07-05"

[1] "Date"

POSIX* class from base::as.POSIX*

POSIXt types, POSIXct and POSIXlt:

ct, calendar time, stores the number of seconds since the origin
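A quick way to see the "seconds since the origin" storage is to convert a time close to the origin. The sketch below uses an arbitrary timestamp in UTC; the POSIXlt part goes slightly beyond the excerpt above and relies on base R's documented list-of-components representation.

```r
# One minute after the Unix origin, in UTC
ct <- as.POSIXct("1970-01-01 00:01:00", tz = "UTC")
as.numeric(ct)   # seconds since 1970-01-01 00:00:00 UTC

# POSIXlt holds the same instant as a list of calendar components
lt <- as.POSIXlt(ct)
lt$min           # minutes component
lt$year + 1900   # years are stored as an offset from 1900

class(ct)
class(lt)
```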

R Spatial Data Analysis 4: stars

R Spatial Data Analysis 3: Simple Features

R Spatial Data Analysis 2: Spatial Class

R Spatial Data Analysis 1: from Data to Spatial Data

As popular as R is in data analysis, it is becoming more and more favored in geospatial analysis and visualization. To introduce spatial data, let’s first start with common data.

Basic R data types: vector, factor, matrix, data.frame, and list.

Data and Plots

There are many ways to manipulate and visualize data in R, most typically base R and the tidyverse framework. Let’s warm up by plotting a data.frame.

basic plot

attach(mtcars)

par(mfrow = c(1,2))

plot(mpg, wt, main = "wt vs.

Comparison: sort, order and arrange

vector (or factor)

(x <- swiss$Education[1:20])

[1] 12 9 5 7 15 7 7 8 7 13 6 12 7 12 5 2 8 28 20 9

sort the vector

sort(x)

[1] 2 5 5 6 7 7 7 7 7 8 8 9 9 12 12 12 13 15 20 28

partial sorting

sort(x, partial = c(10, 15))

[1] 2 5 5 7 7 7 7 6 7 8 8 9 9 12 12 12 13 28 20 15

Partial sorting in R differs from the definition given on Wikipedia.
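In R, sort(x, partial = k) only guarantees that the element at each position k is the one a full sort would put there, with no larger values before it and no smaller values after it; everything else is left in unspecified order. A small check of that property on the same vector, as a sketch:

```r
x <- swiss$Education[1:20]
full <- sort(x)
part <- sort(x, partial = c(10, 15))

# The requested positions hold the same values as in the full sort
part[10] == full[10]
part[15] == full[15]

# Values before position 10 are no larger; values after 15 are no smaller
all(part[1:9] <= part[10])
all(part[16:20] >= part[15])

# order() returns the sorting permutation instead of the sorted values
identical(x[order(x)], full)
```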

Comparison: transform, within and mutate

Here is another comparison, this time between two base functions, transform and within, and a tidyverse function, dplyr::mutate. All three can be used for data manipulation, such as adding a new column to a data.frame.

head(mtcars)

mpg cyl disp hp drat wt qsec vs am gear carb

Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4

Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.