Group 12: Chang Li, Xiaotian Zhan, Sunghun Lee


What is an Artificial Neural Network (ANN)?

ANNs are nonlinear statistical data modeling tools that model complex relationships between inputs and outputs or find patterns in data. They are also biologically inspired simulations, run on a computer, that perform specific tasks such as clustering, classification, and pattern recognition.





Outline for Analysis


Data (Boston Housing Price)

The Boston dataset (14 columns, 506 observations) contains data about housing values in the suburbs of Boston. We divided MEDV (the median value of owner-occupied homes, measured in $1000s) into 3 groups (5~20, 20~35 and 35~50). The goal is to predict the MEDV group from all the other variables using an artificial neural network classifier. The data is sampled and split into training and test sets at a ratio of 80:20. We used the same training and test sets in R, Matlab and Python so that the results can be compared at the end.
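As a rough illustration of the grouping and the 80:20 split described above, here is a minimal Python sketch. The function names `medv_group` and `train_test_split_indices` are hypothetical, the MEDV values are toy numbers rather than rows from the dataset, and the handling of the shared boundaries (20 and 35) is an assumption, since the text does not say which group they belong to.

```python
import random


def medv_group(medv):
    """Map a MEDV value (in $1000s) to one of three class labels.

    Assumption: the groups are treated as [5, 20), [20, 35) and [35, 50];
    the report does not state which side the shared edges fall on.
    """
    if medv < 20:
        return "5-20"
    if medv < 35:
        return "20-35"
    return "35-50"


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle row indices 0..n-1 and split them into train and test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


# Toy MEDV values for illustration only, not taken from the real dataset.
toy_medv = [24.0, 21.6, 34.7, 50.0, 13.1, 19.9, 35.0, 8.3, 27.5, 42.3]
labels = [medv_group(v) for v in toy_medv]
train_idx, test_idx = train_test_split_indices(len(toy_medv))
```

Fixing the seed and saving the resulting index lists is one way to reuse an identical split across R, Matlab and Python.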


Steps of our analysis



Performance evaluation among languages

Confusion matrix


F1 score

The F1 score is the harmonic mean of precision and recall and is one performance measure for classification models. Because it takes both false positives and false negatives into account, F1 is usually more useful than accuracy, especially when the classes of the response variable are imbalanced. When we divide the data into three groups by MEDV, the number of samples in each group is imbalanced (training 163:202:40 / testing 40:46:5). The F1 score is therefore a reasonable way to compare prediction performance across the different languages.

\[ F_1~\text{score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad \text{Precision} = \frac{TP}{TP+FP}, \qquad \text{Recall} = \frac{TP}{TP+FN} \]
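To make the formulas concrete for a three-class problem, the sketch below computes a per-class F1 score from a confusion matrix and then averages the scores. The helper names `per_class_f1` and `macro_f1` are hypothetical, the matrix holds made-up counts rather than our results, and the unweighted (macro) average is an assumption about how per-class scores might be combined.

```python
def per_class_f1(confusion):
    """Return one F1 score per class from a square confusion matrix.

    Assumption: confusion[i][j] counts samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


def macro_f1(confusion):
    """Unweighted mean of the per-class F1 scores (macro average)."""
    scores = per_class_f1(confusion)
    return sum(scores) / len(scores)


# Made-up counts for illustration only (rows: true class, columns: predicted class).
toy_confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 4],
]
f1_by_class = per_class_f1(toy_confusion)
f1_macro = macro_f1(toy_confusion)
```

Running the same function on the confusion matrix produced by each language would give directly comparable scores.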