Retraining a Model in a Workspace Dashboard
If you have already created a model and want to improve its accuracy, Terrene offers a few options. You can create a new model with different variables from your current model; if you wish to do that, check out this tutorial on creating a new model in an existing workspace. Or you can change the variables your current model uses. If you already have all of the variables you are interested in, you can retrain your existing model or train it on additional datasets. Finally, depending on your needs, you may want to change the type of model; Terrene currently supports regression and classification models. If you want to change the model from a general predictive model to an anomaly detection model, you can learn more here.
Before changing the optimizer or model type, read this.
Once you have created a workspace, you can add training epochs to improve the model. On the workspace dashboard there is a window called "Schedule Training Job", where you can add training time. Each epoch represents one full pass through the dataset, so for large datasets it can sometimes be necessary to use thousands of epochs. Once you have selected the number of epochs, choose the dataset you originally trained the model on from the drop-down menu and click "Start Training".
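To make the idea of an epoch concrete, here is a minimal sketch in plain Python. It is not Terrene's training code: the toy dataset, the one-weight model and the function names are all assumptions made for illustration. It only shows that one epoch is one pass over every row, and that more passes usually bring the loss down.

```python
# Hypothetical toy data following y = 2 * x. Not a Terrene dataset.
DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def mean_squared_error(weight, data):
    """Average squared prediction error of the model y = weight * x."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)


def train(data, epochs, learning_rate=0.001, weight=0.0):
    """Run `epochs` full passes over `data`, updating the weight after every row."""
    for _ in range(epochs):      # one epoch = one pass over the whole dataset
        for x, y in data:        # every row is read exactly once per epoch
            error = weight * x - y
            weight -= learning_rate * 2 * error * x  # gradient of the squared error
    return weight


for epochs in (1, 10, 100, 1000):
    w = train(DATA, epochs)
    print(f"{epochs:>5} epochs: weight={w:.4f}, loss={mean_squared_error(w, DATA):.6f}")
```

With the small default learning rate, a single epoch barely moves the weight, which is why many epochs can be needed.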
You can also change the learning rate, which controls how much a single data point changes the model. A lower learning rate makes outliers in your data less damaging but requires more training epochs. A higher learning rate means faster training, but outliers in your data become more influential. We recommend keeping it close to the default value of 0.001.
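The effect of the learning rate on a single outlier can be sketched as follows. This is an illustrative gradient step on a one-weight model, not Terrene's implementation; the outlier row and the `sgd_step` helper are hypothetical.

```python
def sgd_step(weight, x, y, learning_rate):
    """One gradient step on the squared error of y = weight * x for a single row."""
    error = weight * x - y
    return weight - learning_rate * 2 * error * x


weight = 2.0            # a model that already fits y = 2 * x
outlier = (3.0, 60.0)   # hypothetical bad row: y should be about 6

shifts = {}
for learning_rate in (0.0001, 0.001, 0.01):
    new_weight = sgd_step(weight, *outlier, learning_rate)
    shifts[learning_rate] = abs(new_weight - weight)
    print(f"learning_rate={learning_rate}: weight moves by {shifts[learning_rate]:.4f}")
```

The distance the weight moves scales directly with the learning rate, so the same bad row does ten times more damage at 0.01 than at 0.001.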
If, after additional training, your predictions grow less accurate even though the accuracy shown on the dashboard keeps increasing, your model is overfitting. Adding more training time will generally improve the model, but if your dataset is too limited it can also lead to overfitting: the model creates rules that are too specific to the training data. For example, with the Titanic dataset, which we often use in examples, a model trained for too many epochs begins to recognize each individual passenger. It can then classify the training data with nearly 100% accuracy, so the accuracy graph on the dashboard shows nearly 100%. But when it is exposed to test data, it will not recognize any of the passengers and its accuracy will suffer.
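The gap between training accuracy and test accuracy can be illustrated with a deliberately overfitted "model" that simply memorizes each passenger. The rows below are made up for the sketch (they are not taken from the real Titanic dataset), and the helper names are hypothetical.

```python
# Made-up rows in the spirit of the Titanic example: (passenger_id, sex, survived).
TRAIN = [("P1", "female", 1), ("P2", "male", 0), ("P3", "female", 1), ("P4", "male", 0)]
TEST = [("P5", "female", 1), ("P6", "male", 0), ("P7", "female", 1), ("P8", "male", 0)]


def fit_memorizer(rows):
    """Overfit on purpose: remember the outcome of each individual passenger."""
    return {passenger_id: survived for passenger_id, _sex, survived in rows}


def predict_memorizer(model, passenger_id, _sex):
    # Unseen passengers are not in the lookup table, so fall back to guessing 0.
    return model.get(passenger_id, 0)


def predict_general(_model, _passenger_id, sex):
    # A broader rule that does not depend on who the passenger is.
    return 1 if sex == "female" else 0


def accuracy(predict, model, rows):
    """Fraction of rows whose prediction matches the recorded outcome."""
    hits = sum(predict(model, pid, sex) == survived for pid, sex, survived in rows)
    return hits / len(rows)


memorizer = fit_memorizer(TRAIN)
train_acc = accuracy(predict_memorizer, memorizer, TRAIN)
test_acc = accuracy(predict_memorizer, memorizer, TEST)
general_test_acc = accuracy(predict_general, None, TEST)
print(f"memorizer:    train={train_acc:.0%}, test={test_acc:.0%}")
print(f"general rule: test={general_test_acc:.0%}")
```

The memorizer is perfect on the rows it has seen and no better than a fixed guess on new ones, which is the pattern to watch for when comparing dashboard accuracy with real predictions.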
Before you change the variables in your model, note that doing so will delete your current model. If you would like to keep your current model, we recommend this tutorial on adding a second model to a workspace. If you would still like to change the variables, open the model's application page by selecting the model under "Predictive Models" on the left-hand side of your workspace dashboard.
Once you have clicked on the model, you will see its predictive model application page.
Scroll down until you see the Training Settings box.
In this box, you can change the variables by typing in the new variables you would like to consider, separated by commas. Once you have entered your new variables, save your changes. You are then ready to train your new model.
Retraining the Model
The next box, below Training Settings, is Schedule Training Job.
In this box, you can set training options including the number of epochs and the learning rate. Select the dataset you want to train the model on from the drop-down menu and click "Start Training" to begin training your model.
If you would like to train a model on multiple datasets that are not connected, you can do this from the Schedule Training Job window. If your additional dataset is not yet connected, follow these instructions to upload a new CSV or a new database.
From the drop-down menu, select the new dataset you want to train on, then select the number of epochs and the learning rate. When you're finished, press "Start Training" to begin the new training job.
The previous training methods improve the existing model. However, if you want to make a drastic change, such as changing the model type or the optimizer, Terrene will create a new model and delete the old one.
Regression vs. Classification
Changing between a regression model and a classification model can sometimes change the accuracy slightly, but additional training and additional data are generally better ways to improve accuracy. The major difference between the two is that regression gives a numeric value (for example, a percentage), whereas classification gives an absolute answer (1 or 0). If your application requires an absolute answer, then changing your model to a classification model may be necessary; otherwise, regression is probably right for you.
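The difference in output can be sketched in a few lines. This only illustrates what the two kinds of answer look like; it does not describe how Terrene builds a classification model, and the scores, the `to_class` helper and the 0.5 threshold are assumptions.

```python
def to_class(score, threshold=0.5):
    """Turn a regression-style score in [0, 1] into an absolute 0/1 answer."""
    return 1 if score >= threshold else 0


regression_outputs = [0.08, 0.49, 0.51, 0.93]  # hypothetical survival probabilities
classification_outputs = [to_class(score) for score in regression_outputs]
print(regression_outputs)
print(classification_outputs)
```

Note that 0.49 and 0.51 end up with different absolute answers even though the scores are nearly identical; if that nuance matters to your application, the numeric regression output is the more informative one.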
Once you have selected the classification tab, you can fill out the rest of the values as you would for regression model training.
Optimizers affect how the model trains, and certain optimizers are better suited to training a model in particular situations. The default optimizer, "adam", is a very good general-purpose optimizer. We do not recommend changing optimizers unless you have a thorough understanding of how the different optimizers work.
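For readers curious about what an optimizer actually does, here is a minimal sketch of the standard Adam update rule applied to a one-variable toy problem. It is a textbook illustration, not Terrene's code. The hyperparameter defaults are the commonly published ones and are assumed here rather than confirmed for Terrene, and the example uses a larger learning rate than the 0.001 default so that the toy problem converges in few steps.

```python
import math


def adam_minimize(grad, weight, steps, learning_rate=0.001,
                  beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Minimize a one-variable function with the Adam update rule."""
    m = 0.0  # running average of gradients
    v = 0.0  # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad(weight)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        weight -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return weight


# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
result = adam_minimize(lambda w: 2 * (w - 3.0), weight=0.0, steps=2000,
                       learning_rate=0.05)
print(f"weight after training: {result:.4f}")
```

Adam scales each step by a running estimate of the gradient's size, which is part of why it behaves well across many problems without tuning.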