Setting up the Kafka bolt

Once the Storm topology was set up, there were two ways to incorporate machine learning into the project:
1. ML inside the Storm topology
2. ML outside the Storm topology

To choose between the two designs, the following factors were considered:

1. Scalability of components
   If the Storm component is separated from the machine learning component, each one can be scaled independently. This also gives us more flexibility in how each component is programmed.

2. Ease of debugging
   Errors that crop up in an individual component are easier to raise and identify when the components are separated. We can then fix or work around bugs in one component without touching the other.

3. Options and research scope offered by each choice
   Machine learning inside Storm has restrictions of its own: the built-in algorithms available are limited. If we want to explore different kinds of machine learning algorithms, we need to set up the ML component outside the topology.

Hence, we chose the second design: machine learning outside the Storm topology.

Now we need a way to connect the Storm topology to the machine learning server.
We can do it using a data messaging pipeline that we have already used: Kafka!

The idea is:
1. Set up the Storm topology.
2. Publish the output tuples from the last bolt into a Kafka topic, using a Kafka bolt.
  • The Kafka bolt sits inside the topology.
  • Each tuple has to be serialized before it is written to the Kafka topic.

3. Subscribe to the Kafka topic from the machine learning server.
  • Since the tuple is serialized, consuming it directly will only give you garbage values.
  • We need deserializer code that runs before the data is used.
  • The machine learning server code can then consume the data, train the model and make some useful predictions.
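
To make the serialize/deserialize step concrete, here is a minimal sketch in Python of what happens to one tuple on its way through the Kafka topic. It does not talk to Kafka or Storm at all; it only shows the byte-level round trip. The JSON format and the field names (src_ip, dst_ip, packet_count) are assumptions made for illustration, since the actual tuple schema and serializer are a separate choice.

```python
import json

# Hypothetical field names for a tuple emitted by the last Storm bolt.
# The real topology defines its own fields; these are placeholders.
FIELDS = ("src_ip", "dst_ip", "packet_count")


def serialize_tuple(values):
    """Turn one tuple's values into UTF-8 JSON bytes.

    This stands in for what the Kafka bolt side would publish.
    JSON is an assumed format, not necessarily the one used in the project.
    """
    if len(values) != len(FIELDS):
        raise ValueError("expected %d values, got %d" % (len(FIELDS), len(values)))
    return json.dumps(dict(zip(FIELDS, values))).encode("utf-8")


def deserialize_tuple(payload):
    """Turn the raw bytes read from the Kafka topic back into named fields.

    This stands in for the deserializer on the machine learning server.
    """
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    payload = serialize_tuple(("10.0.0.1", "10.0.0.2", 42))
    # The topic only ever holds raw bytes, not a tuple.
    print(type(payload).__name__)
    # After deserializing, the ML code gets the named fields back.
    print(deserialize_tuple(payload))
```

The point of the sketch is that both sides must agree on the format: whatever the Kafka bolt uses to turn a tuple into bytes, the machine learning server has to apply the matching reverse step before training or predicting.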



   
