MXNet is a multi-language deep learning framework that lets you mix different styles of deep learning programming to maximize both efficiency and productivity. It provides interfaces for R, Python, Julia, and C++. Embedded in the host language, it combines declarative symbolic expressions with imperative tensor computation, and it provides automatic differentiation to derive gradients. MXNet is efficient in both computation and memory, and it runs on systems ranging from mobile devices to distributed GPU clusters. In recent benchmarks, it performed comparably to or faster than counterparts such as TensorFlow, Torch, and Caffe.
Before starting, install the R packages “mxnet” and “caret” if you haven’t done so already. If you run into problems installing “mxnet” under R 3.5.1, the code below may help.
Run the R code below first. If nothing goes wrong, you are good to go. Otherwise, you may need to run the shell commands that follow to fix the problem.
In R, do:
cran = getOption("repos")
cran["dmlc"] = ""
options(repos = cran)
On the command line (macOS with Homebrew), do:
# if you've already installed Homebrew, openblas and opencv, you can just skip the following three lines
ruby -e "$(curl -fsSL"
brew install openblas
brew install opencv
# skip following two lines if your openblas and opencv are up-to-date
brew upgrade openblas
brew upgrade opencv
ln -sf /usr/local/opt/openblas/lib/libopenblasp-r0.3.3.dylib /usr/local/opt/openblas/lib/libopenblasp-r0.3.1.dylib
For more detailed help, you may refer to
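require(mxnet) # this package lets us train neural network models
library(caret) # createDataPartition lets us split the data into training and test sets
# read the data; stringsAsFactors = TRUE keeps text columns as factors (the default before R 4.0.0)
churn = read.csv('./WA_Fn-UseC_-Telco-Customer-Churn.csv', stringsAsFactors = TRUE)
# drop rows with missing values
churn = churn[complete.cases(churn), ]
# output: the Churn factor, recoded from level codes 1/2 to labels 0/1
op = churn[,'Churn']
op = as.numeric(op) - 1
# input: the first 20 columns, with factors replaced by their integer level codes
ip = churn[,1:20]
ip = sapply(ip, as.numeric)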
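The conversion above only works when the text columns are factors. That was the `read.csv` default through R 3.x; from R 4.0.0 onward, `stringsAsFactors` defaults to `FALSE`, and `as.numeric()` on a character column returns `NA`. The sketch below illustrates the difference on a small made-up data frame (the values and the `to_numeric` helper are hypothetical, not taken from the Telco file).

```r
# Toy data frame standing in for the churn data (made-up values)
toy = data.frame(Contract = c("Month-to-month", "Two year", "One year"),
                 tenure   = c(1, 34, 2),
                 Churn    = c("Yes", "No", "Yes"),
                 stringsAsFactors = FALSE)   # mimics the R >= 4.0 default

# as.numeric() on a character column yields NA (with a warning)
na_result = suppressWarnings(as.numeric(toy$Churn))

# Converting to a factor first gives integer level codes 1, 2, ...
toy_op = as.numeric(factor(toy$Churn, levels = c("No", "Yes"))) - 1  # No -> 0, Yes -> 1

# Hypothetical helper: leave numeric columns alone, encode everything else
to_numeric = function(col) if (is.numeric(col)) col else as.numeric(as.factor(col))
toy_ip = sapply(toy[, c("Contract", "tenure")], to_numeric)
```

Passing `stringsAsFactors = TRUE` to `read.csv` is the simpler route when the goal is just to reproduce the pre-4.0 behaviour.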
Create the indices (the trainIndex object) and use them to split the data into training and test sets:
set.seed(123) # fix the random seed so the partition from createDataPartition is reproducible
trainIndex = createDataPartition(1:dim(churn)[1], p = 0.75, list = FALSE)
train_op = op[trainIndex]
test_op = op[-trainIndex]
train_ip = ip[trainIndex, ]
test_ip = ip[-trainIndex, ]
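# standardize the training inputs, then scale the test inputs with the training means and standard deviations
train_ip = data.matrix( scale(train_ip) )
test_ip = data.matrix( scale( test_ip, attr(train_ip, "scaled:center"),
                              attr(train_ip, "scaled:scale") ) )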
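The scaling step above reuses the training set's column means and standard deviations for the test set, so no information from the test data leaks into preprocessing. Here is a minimal base R sketch of the same idea on made-up matrices (`toy_train` and `toy_test` are illustrative stand-ins, not the churn data):

```r
set.seed(1)
# Toy matrices standing in for train_ip / test_ip
toy_train = matrix(rnorm(20, mean = 5, sd = 2), ncol = 2)
toy_test  = matrix(rnorm(10, mean = 5, sd = 2), ncol = 2)

# scale() stores the statistics it used as attributes on its result
scaled_train = scale(toy_train)
centers = attr(scaled_train, "scaled:center")   # column means of the training data
scales  = attr(scaled_train, "scaled:scale")    # column standard deviations

# Apply the training statistics to the test data
scaled_test = scale(toy_test, center = centers, scale = scales)

# The same computation written out by hand
manual_test = sweep(sweep(toy_test, 2, centers, "-"), 2, scales, "/")
```

After this step the training columns have mean 0 and standard deviation 1, while the test columns will generally be close to, but not exactly at, those values.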
Here we configure a neural network with two fully connected layers: a hidden layer of 20 neurons with ReLU activation, followed by an output layer of 2 units (one per class) that feeds a softmax:
# configure a two-layer neural network
data1 = mx.symbol.Variable("data")
fc1 = mx.symbol.FullyConnected(data1, num_hidden = 20)
act2 = mx.symbol.Activation(fc1, act_type = "relu")
fc2 = mx.symbol.FullyConnected(act2, num_hidden = 2)
softmax = mx.symbol.SoftmaxOutput(fc2)
devices = mx.cpu()
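To make the symbolic configuration above more concrete, the following base R sketch computes the same kind of forward pass by hand: a fully connected layer, ReLU, a second fully connected layer, and a row-wise softmax. It does not use MXNet at all; the weights, the toy input, and the helper names `relu` and `softmax_rows` are assumptions made for illustration only.

```r
set.seed(42)
n_features = 20
batch = 4
x = matrix(rnorm(batch * n_features), nrow = batch)  # toy input, one row per sample

# Randomly initialized parameters (uniform in [-0.07, 0.07], biases at zero)
W1 = matrix(runif(n_features * 20, -0.07, 0.07), nrow = n_features)
b1 = rep(0, 20)
W2 = matrix(runif(20 * 2, -0.07, 0.07), nrow = 20)
b2 = rep(0, 2)

relu = function(z) pmax(z, 0)
softmax_rows = function(z) {
  e = exp(z - apply(z, 1, max))  # subtract the row maximum for numerical stability
  e / rowSums(e)
}

h      = relu(sweep(x %*% W1, 2, b1, "+"))   # hidden layer: 20 units
logits = sweep(h %*% W2, 2, b2, "+")         # output layer: 2 units
probs  = softmax_rows(logits)                # each row sums to 1
```

Training then amounts to adjusting `W1`, `b1`, `W2`, and `b2` so that the probability assigned to the true class increases.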
# create an MXNet feedforward neural net model and train it with the specified settings
model =
mx.model.FeedForward.create(softmax, # the symbolic configuration of the neural network
X = train_ip, # the training data
y = train_op, # optional label of the data
ctx = devices, # the devices used to perform training (GPU or CPU)
num.round = 30, # the number of iterations over training data
array.batch.size = 40, # the batch size used for R array training
learning.rate = 0.1,
momentum = 0.9,
eval.metric = mx.metric.accuracy, # the evaluation function on the results
initializer = mx.init.uniform(0.07), # the initialization scheme for parameters
epoch.end.callback = mx.callback.log.train.metric(100) ) # the callback invoked when each epoch ends
## Start training with 1 devices
## [1] Train-accuracy=0.769696969426039
## [2] Train-accuracy=0.782196971051621
## [3] Train-accuracy=0.782196971503171
## [4] Train-accuracy=0.787689395926215
## [5] Train-accuracy=0.789015151786082
## [6] Train-accuracy=0.792424243959514
## [7] Train-accuracy=0.79375000027093
## [8] Train-accuracy=0.795643941019521
## [9] Train-accuracy=0.796969696879387
## [10] Train-accuracy=0.797159092444362
## [11] Train-accuracy=0.797159090638161
## [12] Train-accuracy=0.800757577925017
## [13] Train-accuracy=0.79848485101353
## [14] Train-accuracy=0.799242426951726
## [15] Train-accuracy=0.799621210856871
## [16] Train-accuracy=0.803030301224102
## [17] Train-accuracy=0.803030302578753
## [18] Train-accuracy=0.801515151605462
## [19] Train-accuracy=0.801136366345666
## [20] Train-accuracy=0.800189394390944
## [21] Train-accuracy=0.804356062954122
## [22] Train-accuracy=0.802272728446758
## [23] Train-accuracy=0.805492424603665
## [24] Train-accuracy=0.805113636634567
## [25] Train-accuracy=0.806628788962509
## [26] Train-accuracy=0.808143940838901
## [27] Train-accuracy=0.809469698053418
## [28] Train-accuracy=0.810795453461734
## [29] Train-accuracy=0.810227273088513
## [30] Train-accuracy=0.809848486474066
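The `learning.rate` and `momentum` arguments used in the training call control how the weights move at each mini-batch step. The sketch below shows one common formulation of gradient descent with momentum on a toy one-dimensional loss, f(w) = 0.5 * w^2. It is an illustration of the general technique under that assumption, not MXNet's internal code, which may differ in details such as gradient rescaling and weight decay.

```r
learning_rate = 0.1
momentum      = 0.9

grad = function(w) w   # gradient of the toy loss f(w) = 0.5 * w^2

w = 5          # starting weight
velocity = 0   # running update direction
for (step in 1:200) {
  # keep 90% of the previous step, then add the new gradient step
  velocity = momentum * velocity - learning_rate * grad(w)
  w = w + velocity
}
w   # close to the minimum at 0
```

With momentum, successive updates that point the same way reinforce each other, which tends to speed up progress along consistent directions.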
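# make predictions on the test set with the trained model
preds = predict(model, test_ip)
# preds has one row per class and one column per sample; transpose it to get the probability matrix
predict_test_df = data.frame(t(preds))
pred_test = predict_test_df
# pick the class with the highest probability (labels are 0-based)
pred_label = max.col(pred_test) - 1
# cross-tabulate the predicted labels against the true labels
df_pred = data.frame( table(pred_label, test_op) )
knitr::kable(df_pred, col.names = c('Prediction Label','Test Output','Frequency'), align = 'c')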
Prediction Label | Test Output | Frequency |
0 | 0 | 1200 |
1 | 0 | 103 |
0 | 1 | 251 |
1 | 1 | 202 |
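The step from class probabilities to the table above can be sketched on a tiny made-up example. The matrix `toy_preds` below is hypothetical and laid out as one row per class and one column per sample, matching the transpose applied to `preds`. Note that `max.col` breaks ties at random by default, so `ties.method = "first"` is used here to keep the result deterministic.

```r
# Made-up probabilities: 2 classes (rows) x 3 samples (columns)
toy_preds = matrix(c(0.9, 0.1,
                     0.3, 0.7,
                     0.6, 0.4), nrow = 2)

toy_prob  = t(toy_preds)                                  # 3 samples x 2 classes
toy_label = max.col(toy_prob, ties.method = "first") - 1  # column index -> 0-based label

toy_truth = c(0, 1, 1)                                    # made-up true labels
toy_table = table(pred_label = toy_label, truth = toy_truth)
```

Each cell of `toy_table` counts how many samples received a given predicted label for a given true label, which is the same layout as the frequency table for the churn data.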
df_pred$pred_label = as.numeric( as.character(df_pred$pred_label) )
df_pred$op = as.numeric( as.character(df_pred$test_op) )
# get the index where the prediction is correct
ind = which( df_pred$pred_label == df_pred$op )
# calculate the accuracy rate
pred_accuracy = sum(df_pred[,3][ind]) / sum(df_pred[,3])
Prediction accuracy:
## [1] 0.7984055
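Accuracy alone can hide how the model behaves on the less frequent class. As a rough sketch, the counts from the frequency table above can be placed in a matrix to derive a few additional figures. The variable names here are illustrative, and class 1 is assumed to mean "churned", following the recoding of `op`.

```r
# Counts copied from the frequency table: rows = predicted label, columns = true label
conf = matrix(c(1200, 103, 251, 202), nrow = 2,
              dimnames = list(pred = c("0", "1"), actual = c("0", "1")))

accuracy  = sum(diag(conf)) / sum(conf)          # share of correct predictions
recall    = conf["1", "1"] / sum(conf[, "1"])    # share of actual class 1 that was caught
precision = conf["1", "1"] / sum(conf["1", ])    # share of predicted class 1 that was right
baseline  = max(colSums(conf)) / sum(conf)       # accuracy of always predicting the majority class

round(c(accuracy = accuracy, recall = recall,
        precision = precision, baseline = baseline), 4)
```

With the label vectors still in memory, the accuracy can also be computed directly as `mean(pred_label == test_op)`.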
