Setting up the Kafka bolt

May 03, 2019

Once the Storm topology was set up, there were two ways to incorporate machine learning into the project:

1. ML inside the Storm topology

2. ML outside the Storm topology

To choose between the two designs, we considered the following factors:

1. Scalability of components

If the Storm component is separated from the machine learning component, we can scale each one independently. It also gives us more flexibility in how each component is programmed.

2. Ease of debugging

Errors that crop up in an individual component are easier to raise and identify when the components are separated. We can also work around bugs and fix them more efficiently in the separated setup.

3. Options and research scope that can be explored in each choice

Machine learning inside Storm has restrictions of its own: the built-in algorithms available are limited. If we want to explore other kinds of machine learning algorithms, we need to set them up outside the topology.

Hence, we chose to run machine learning outside the Storm topology.

Now we need a way to connect the Storm topology to the machine learning server.

We could do it using a data messaging pipeline that we have already used: Kafka!

The idea is:

1. Set up the Storm topology.

2. Publish the output tuples from the bolt to a Kafka topic, using a Kafka bolt.

The Kafka bolt sits inside the topology.
Each tuple has to be serialized before it is written to the Kafka topic.

3. Subscribe to the Kafka topic from the machine learning server.

Since the tuple is serialized, consuming it directly yields unreadable bytes.
We need deserializer code to decode the data before using it.
The machine learning server can then consume the data, train the model and make useful predictions.
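
The serialize/deserialize round trip in steps 2 and 3 can be sketched roughly as below. This is only an illustration, not the code used in the project: it assumes the tuple is encoded as JSON, the field names are hypothetical, and a plain Python list stands in for the Kafka topic so the example runs without a broker. In a real setup the bytes would be written by the Kafka bolt and read by a Kafka consumer on the machine learning server.

```python
import json

# Hypothetical field names for a tuple emitted by a Storm bolt.
FIELDS = ("src_ip", "dst_ip", "packet_count")


def serialize_tuple(values):
    """Encode one tuple's values as UTF-8 JSON bytes, keyed by field name."""
    if len(values) != len(FIELDS):
        raise ValueError("expected %d values, got %d" % (len(FIELDS), len(values)))
    return json.dumps(dict(zip(FIELDS, values))).encode("utf-8")


def deserialize_tuple(payload):
    """Decode bytes produced by serialize_tuple back into a dict."""
    return json.loads(payload.decode("utf-8"))


# A plain list stands in for the Kafka topic in this sketch.
topic = []

# Step 2: the bolt side serializes the tuple and publishes the bytes.
topic.append(serialize_tuple(("10.0.0.1", "10.0.0.2", 42)))

# Step 3: the machine learning side reads the bytes and deserializes them
# before handing the records to the model.
records = [deserialize_tuple(message) for message in topic]
```

Whatever format is chosen, both sides have to agree on it: the deserializer on the machine learning server must be the exact inverse of the serializer used when publishing.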
