Machine Learning Component, Part 1
The design choices to be made while building the machine learning server were:
1. Train the model in real time
2. Train the model in batches beforehand and predict in real time
3. Keep samples of data for training, then train and predict in real time
The factors that help decide which approach to use are:
1. Volume of training data: how long does it take to train the model?
If training takes too long, there is no point in building the model in real time. We would need to build the model a priori, load it at run time, and use the real time streams as test flows.
The machine learning algorithm chosen in this project trains quickly (~2-5 seconds), so we can afford to train and predict in real time.
2. Criticality of accuracy: how much error is reasonable?
If the accuracy obtained is greater than 85%, it is reasonable to go ahead and predict using the model. If not, we will need to consider methods to improve the accuracy.
3. How do we improve the accuracy of the model? Will feedback be considered?
If the training is done offline, we can retrain the model on the flows it predicted wrongly, then rebuild the model and load it.
If the training is done in real time and the accuracy is good enough, we do not have to retrain the model. The real time flows can be used as test flows for prediction.
If the training is done in real time and the accuracy is bad, consider a different algorithm that fits the data better, keeping in mind that the algorithm is time critical. If the algorithm cannot be changed, use a different set of parameters to change the nature of the model.
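The three factors above can be sketched as a small decision function. This is only an illustration: the 85% accuracy bar comes from the text, while the 5-second training budget, the function names, and the strategy labels are assumptions made for the example.

```python
import time

# The 85% bar is taken from the text; the 5-second budget is an assumption
# based on the ~2-5 second training time mentioned above.
MIN_ACCURACY = 0.85
MAX_TRAIN_SECONDS = 5.0


def timed(train_fn, *args):
    """Run train_fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = train_fn(*args)
    return result, time.perf_counter() - start


def choose_strategy(train_seconds, accuracy):
    """Pick a training strategy from measured training time and accuracy."""
    if train_seconds > MAX_TRAIN_SECONDS:
        # Too slow to build the model in real time: build it beforehand.
        return "offline"
    if accuracy > MIN_ACCURACY:
        # Fast and accurate enough: train and predict in real time.
        return "realtime"
    # Fast but inaccurate: change the algorithm or its parameters.
    return "realtime_retune"


print(choose_strategy(train_seconds=3.0, accuracy=0.91))
```

Here `timed` would wrap whatever training routine is actually used, and its elapsed time would be passed to `choose_strategy` together with the accuracy measured on held-out flows.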
Some suggestions for the model:
1. Rule based or instance based approach - Decision trees (rule based), K Nearest Neighbours (instance based)
2. Neural network with real time feedback - One Shot Learning
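As a rough illustration of the first suggestion, here is a minimal K Nearest Neighbours classifier in plain Python. The flow features (packets per second, mean packet size) and the labels are made up for the example and are not taken from the project's data. In practice the features would probably need scaling so that one of them does not dominate the distance.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: math.dist(train_X[i], query),
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test flows whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return correct / len(test_X)


# Hypothetical flows: (packets per second, mean packet size in bytes).
train_X = [(10, 100), (12, 110), (11, 90), (200, 1400), (220, 1350), (210, 1500)]
train_y = ["normal", "normal", "normal", "bulk", "bulk", "bulk"]

print(knn_predict(train_X, train_y, (15, 120)))
print(accuracy(train_X, train_y, [(9, 95), (205, 1450)], ["normal", "bulk"]))
```

K Nearest Neighbours only stores the samples and does the work at prediction time, so there is no separate training step to wait for. The cost moves to prediction, which grows with the number of stored samples.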