Visualizing Your Data with Terrene
When you started creating a new model and making predictions, you probably had a few questions in mind. Namely, how accurate will the predictions be, and which features are the most important? I will answer both of these questions below.
Accuracy and Loss
A prediction can only be as accurate as its model. On Terrene, two graphs are automatically displayed on the workspace dashboard as soon as the model is trained: accuracy and loss. Both graphs show the change over time, so if you continue to train or add new data you can track the changes. Accuracy is a measure of how often predictions are expected to be correct, and loss is a measure of how far off a prediction is expected to be. Higher accuracy and lower loss both indicate a better model.
In the case of the Titanic dataset, I did not include all of the important variables, so my accuracy is ~62% and my loss is ~0.2.
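To make the two measures concrete, here is a minimal sketch of how accuracy and one common kind of loss can be computed by hand. It is not Terrene's implementation: the source does not say which loss function Terrene uses, so mean squared error is an assumption chosen for illustration, and the four labels and predicted probabilities are made up.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of rows where the thresholded prediction matches the true label."""
    correct = sum(
        1 for truth, prob in zip(y_true, y_prob)
        if (prob >= threshold) == bool(truth)
    )
    return correct / len(y_true)


def mean_squared_error(y_true, y_prob):
    """Average squared gap between the true label and the predicted probability.

    Assumption: MSE stands in for whatever loss Terrene actually reports.
    """
    return sum((truth - prob) ** 2 for truth, prob in zip(y_true, y_prob)) / len(y_true)


# Made-up example: 1 = survived, 0 = did not survive.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]  # hypothetical predicted survival probabilities

acc = accuracy(y_true, y_prob)
loss = mean_squared_error(y_true, y_prob)
print(f"accuracy = {acc:.2f}, loss = {loss:.4f}")
```

Two of the four predictions land on the right side of the 0.5 threshold, so accuracy is 0.5. The loss also reflects how confident each prediction was, which is why the two graphs can move differently.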
There are several ways to improve the accuracy of a model. Adding more data or more variables can often improve a model's accuracy. In this scenario, we can't add more passengers to get more data, but we can add more variables. If you create a new model which includes the passenger's class (1st, 2nd, or 3rd class), you will see the accuracy improve. Another option is to train the model more. Adding training time will make the model more accurate. There is a limit to this, as eventually the model will become as accurate as the data will allow, but it is an easy way to improve initial results. At the bottom of the page, I have included links to tutorials which will teach you how to create new models in a workspace and retrain a model.
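The effect of extra training time, and its limit, can be illustrated with a small sketch. This is not how Terrene trains models (the source does not describe that). It is a one-variable logistic regression trained by plain gradient descent on six made-up rows, and all function names and values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, epochs, lr=0.1):
    """Fit weight w and bias b with batch gradient descent for `epochs` passes."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y
            grad_w += error * x / n
            grad_b += error / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def log_loss(xs, ys, w, b):
    """Average cross-entropy loss of the fitted model on the data."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


# Made-up data: one binary variable that only partly explains the outcome.
xs = [0, 0, 0, 1, 1, 1]
ys = [0, 0, 1, 1, 1, 0]

losses = {}
for epochs in (10, 100, 1000):
    w, b = train_logistic(xs, ys, epochs)
    losses[epochs] = log_loss(xs, ys, w, b)
    print(f"{epochs:>5} epochs -> loss {losses[epochs]:.4f}")
```

The loss keeps falling as the number of epochs grows, but the improvement shrinks each time. Because the single variable cannot fully separate the outcomes, the loss levels off at a floor set by the data rather than reaching zero.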
As I mentioned earlier, I did not have all of the important variables in my model. However, it is just as easy to select variables which are not important. To determine which variables are important and which are not, we need to look at the Feature Importance graph. This graph summarizes how much influence each variable has on the final outcome.
As you can see, age and gender were the two most important factors in survival on the Titanic. These observations match what we know from history: women and children were put on lifeboats first and therefore had a much higher chance of survival.
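For intuition about what a feature importance score can mean, here is a sketch of one widely used technique, permutation importance: shuffle one variable at a time and see how much accuracy drops. The source does not say how Terrene computes its graph, so this is an assumption used for illustration only. The rows, the `is_female` and `ticket_digit` columns, and the toy model are all made up and are not the Titanic data.

```python
import random


def accuracy_of(model, rows, labels):
    """Fraction of rows the model labels correctly."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_features, seed=0):
    """Accuracy drop caused by shuffling each feature column in turn."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    baseline = accuracy_of(model, rows, labels)
    drops = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled_rows = [
            row[:j] + (value,) + row[j + 1:] for row, value in zip(rows, column)
        ]
        drops.append(baseline - accuracy_of(model, shuffled_rows, labels))
    return drops


# Hypothetical rows: (is_female, ticket_digit); label 1 = survived.
rows = [(1, 7), (1, 2), (0, 5), (0, 4), (1, 9), (0, 1)]
labels = [1, 1, 0, 0, 1, 0]


def toy_model(row):
    """Toy model that predicts survival from the first column only."""
    return row[0]


importances = permutation_importance(toy_model, rows, labels, n_features=2)
for name, drop in zip(["is_female", "ticket_digit"], importances):
    print(f"{name}: accuracy drop {drop:.2f}")
```

Shuffling a variable the model ignores leaves accuracy unchanged, so its importance is zero. Shuffling a variable the model relies on can only hurt accuracy, and the larger the drop, the more influence that variable has.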
You are all done! You have successfully created your first workspace, trained a machine learning model, made predictions, and made observations about your data. You are now ready to use predictive analytics to drive your business decisions with Terrene.
- Connecting a training database
- Connecting a prediction database
- Creating a new model in an existing workspace
- Retraining an existing model
Still have questions? You can search for a specific question with the search bar or email [email protected]