Why sparse matrix?
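XGBoost only works with matrices in which every variable is numeric, so categorical variables have to be one-hot encoded first (UC Business Analytics R Programming Guide). The same constraint applies to imputation: caret::preProcess can recover missing values with bagged regression trees (method = "bagImpute") (Yevhen Vasylenko), and that also requires all-numeric input. One-hot encoding turns each factor level into its own column that is mostly zeros, which is why the result is best stored as a sparse matrix.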
There are different ways to do this in R.
library(tidyverse)
dd <- data.frame(a = gl(3,4),
                 b = gl(4,1,12),
                 c = 1:12,
                 d = sample(c("X", "Y", "Z"), 12, replace = TRUE))
str(dd)
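As one possible way to encode this data frame, here is a minimal sketch using Matrix::sparse.model.matrix. It is only one of several options, not necessarily the approach taken further on. The seed, the explicit factor() around d, and the name X are assumptions added for illustration. Note that with the formula ~ . - 1 only the first factor (a) keeps a column for every level; the remaining factors are dummy-coded with their first level dropped as the reference.

```r
library(Matrix)

set.seed(42)  # assumption: seed added only so the sketch is reproducible

# Same toy data as above; d is made a factor explicitly (assumption)
dd <- data.frame(
  a = gl(3, 4),
  b = gl(4, 1, 12),
  c = 1:12,
  d = factor(sample(c("X", "Y", "Z"), 12, replace = TRUE))
)

# Expand factors into indicator columns and store the result sparsely.
# "- 1" removes the intercept column.
X <- sparse.model.matrix(~ . - 1, data = dd)

class(X)     # a sparse matrix class from the Matrix package
dim(X)       # one row per observation
colnames(X)  # indicator columns for factor levels, plus the numeric column c
```

The resulting matrix is entirely numeric, which is the property the section above asks for, and only its nonzero entries are stored.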