Getting Started with Kafka

Step 1: If you are installing Kafka on AWS, you need to change the instance type to m3.2xlarge.

1.A: Log in to your AWS account and stop your instance first. Then use the following steps to change the instance type.

Right-click on the instance, then select Instance Settings -> Change Instance Type.

From the drop-down list, select the m3.2xlarge instance type.

Then click the "Apply" button.

Start your instance.

Note: You can skip this step if you are not using AWS.
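
If you prefer the command line, the same instance type change can be made with the AWS CLI instead of the console. This is only a sketch: the instance ID below is a placeholder, and it assumes the AWS CLI is installed and configured on your machine.

# Stop the instance, change its type, then start it again (replace the placeholder instance ID with yours).
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type "{\"Value\": \"m3.2xlarge\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0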

Step 2: Download Kafka.

2.A: Use the following command to download Kafka.

Command: wget http://archive.apache.org/dist/kafka/old_releases/kafka-0.7.2-incubating/kafka-0.7.2-incubating-src.tgz

2.B: Extract the downloaded package.

Command: tar xzf kafka-0.7.2-incubating-src.tgz

2.C: Change the directory to "kafka-0.7.2-incubating-src".

Command: cd kafka-0.7.2-incubating-src

Step 3: Download the "jline-0.9.94.jar" file.

Command: wget http://central.maven.org/maven2/jline/jline/0.9.94/jline-0.9.94.jar

Step 4: Set up the directory path into which the "jline-0.9.94.jar" file will be moved.

Note: In the paths below, replace "user" with your own username (for example, "ec2-user" on Amazon Linux).

4.A: Change the directory to home.

Command: cd

4.B: Create the directory /home/user

Command: mkdir /home/user

4.C: Create the directory /home/user/.m2/

Command: mkdir /home/user/.m2/

4.D: Create the directory /home/user/.m2/repository

Command: mkdir /home/user/.m2/repository

4.E: Create the directory /home/user/.m2/repository/jline/

Command: mkdir /home/user/.m2/repository/jline/

4.F: Create the directory /home/user/.m2/repository/jline/jline/

Command: mkdir /home/user/.m2/repository/jline/jline/

4.G: Create the directory /home/user/.m2/repository/jline/jline/0.9.94/

Command: mkdir /home/user/.m2/repository/jline/jline/0.9.94/
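
All of the nested directories above can also be created in a single command with "mkdir -p", which builds every missing parent directory along the path (same /home/user placeholder as above):

Command: mkdir -p /home/user/.m2/repository/jline/jline/0.9.94/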

Step 5: Move the file "jline-0.9.94.jar" from the "kafka-0.7.2-incubating-src" directory to the "/home/ec2-user/.m2/repository/jline/jline/0.9.94/" directory.

5.A: Change the directory to "kafka-0.7.2-incubating-src".

Command: cd kafka-0.7.2-incubating-src

5.B: Check the list of files in "kafka-0.7.2-incubating-src".

Command: ls -l

5.C: Move the file "jline-0.9.94.jar" to "/home/ec2-user/.m2/repository/jline/jline/0.9.94/".

Command: sudo mv jline-0.9.94.jar /home/ec2-user/.m2/repository/jline/jline/0.9.94/

Note: Instead of "ec2-user" in the above path, specify your own username.
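
If you would rather not type the username by hand, the same move can be written with the shell's $USER variable, which expands to the user you are logged in as (just a shortcut for the command above):

Command: sudo mv jline-0.9.94.jar /home/$USER/.m2/repository/jline/jline/0.9.94/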

Step 6: Run "sbt update" under "kafka-0.7.2-incubating-src".

Command: ./sbt update

Step 7: Run "sbt package" under "kafka-0.7.2-incubating-src".

Command: ./sbt package

Step 8: Install "lein".

8.A: Change the directory to "bin/" under "kafka-0.7.2-incubating-src".

Command: cd bin/

8.B: Download the lein script using the command below.

Command: sudo wget https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein

8.C: Change the permissions of the "lein" script.

Command: sudo chmod 755 lein

8.D: Install "lein" by running the script (the first run downloads and installs Leiningen).

Command: ./lein
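
To confirm that Leiningen installed correctly, you can ask it for its version (a quick sanity check):

Command: ./lein version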

Step 9: Install "zkclient".

9.A: Download the package using the command below, under "kafka-0.7.2-incubating-src/bin/".

Command: wget https://testpypi.python.org/packages/source/z/zkclient/zkclient-v1.3.tar.gz#md5=300866e1c849338806f337b83672792d

9.B: Extract the package.

Command: tar xzf zkclient-v1.3.tar.gz

9.C: Check the list of files to confirm the archive was extracted.

Command: ls -l

9.D: Change the directory to the extracted zkclient directory (use the name shown by the ls output above if it differs from the one below).

Command: cd zkclient-v1.3.tar.gz

9.E: Check the files in the zkclient directory.

Command: ls -l

9.F: Install the package by running "setup.py".

Command: sudo python setup.py install

Step 10: Install the "log4j" package.

Command: sudo yum install log4j*
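
To confirm that the package was installed, you can list it with yum (a quick check; exact package names vary by distribution):

Command: yum list installed "log4j*"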

Step 11: Configure the Twitter Dev Stream.

If you don't have a Twitter account, create one and follow the steps provided in the link below to generate the Twitter access keys needed to pull Twitter data.

http://www.74by2.com/2014/06/easily-get-twitter-api-key-api-secret-access-token-access-secret-pictures/

Step 12: Download the code for "clj-kafka-storm" from GitHub.

12.A: Change the directory to home.

Command: cd

12.B: Download the code from GitHub.

Command: git clone https://github.com/echeran/clj-kafka-storm.git

12.C: Change the directory to "clj-kafka-storm".

Command: cd clj-kafka-storm

12.D: Open the file "./twitter-kafka-producer/src/twitter_kafka_producer/core.clj" to set the Twitter access keys.

Command: vi ./twitter-kafka-producer/src/twitter_kafka_producer/core.clj

In this file, enter the Twitter API key and secret from your account as the OAuth consumer key and secret, and the Twitter access token and token secret as the OAuth access token and token secret.

Step 13: Configure the Kafka brokers.

13.A: Change the directory to "kafka-0.7.2-incubating-src".

Command: cd kafka-0.7.2-incubating-src

13.B: Change the directory to "config" under "kafka-0.7.2-incubating-src".

Command: cd config

13.C: Configure broker 0.

13.C.1: Copy the "server.properties" file to "server-0.properties".

Command: cp server.properties server-0.properties

13.C.2: Open the file "server-0.properties".

Command: vi server-0.properties

Modify the following fields in the file as shown below.

brokerid=0
port=9092
log.dir=/tmp/kfk0.7-logs-0

13.D: Configure broker 1.

13.D.1: Copy the "server.properties" file to "server-1.properties".

Command: cp server.properties server-1.properties

13.D.2: Open the file "server-1.properties".

Command: vi server-1.properties

Modify the following fields in the file as shown below.

brokerid=1
port=9093
log.dir=/tmp/kfk0.7-logs-1

13.E: Configure broker 2.

13.E.1: Copy the "server.properties" file to "server-2.properties".

Command: cp server.properties server-2.properties

13.E.2: Open the file "server-2.properties".

Command: vi server-2.properties

Modify the following fields in the file as shown below.

brokerid=2
port=9094
log.dir=/tmp/kfk0.7-logs-2
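
As an alternative to copying and editing each file by hand, the three broker configurations can be generated with a short shell loop. This is only a sketch: it assumes the brokerid, port, and log.dir keys shown above are present in the stock server.properties, and it produces the same values listed in steps 13.C-13.E.

# Run from the config directory; writes server-0/1/2.properties.
for i in 0 1 2; do
  sed -e "s|^brokerid=.*|brokerid=$i|" \
      -e "s|^port=.*|port=$((9092 + i))|" \
      -e "s|^log.dir=.*|log.dir=/tmp/kfk0.7-logs-$i|" \
      server.properties > server-$i.properties
done
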
Step 14: Configure the built-in Kafka Zookeeper instance.

Open the file "zookeeper.properties" under the "kafka-0.7.2-incubating-src/config" directory.

Change the directory to home.

Command: cd

Change the directory to "kafka-0.7.2-incubating-src/config".

Command: cd kafka-0.7.2-incubating-src/config

Command: vi zookeeper.properties

Ensure that clientPort=2181 is set in the zookeeper.properties file.
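
A quick way to check the setting without opening an editor (run from the same config directory); it should print clientPort=2181:

Command: grep clientPort zookeeper.properties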

Step 15: Set up the log4j file.

15.A: Change the directory to home.

Command: cd

15.B: Change the directory to "clj-kafka-storm/twitter-kafka-producer/src".

Command: cd clj-kafka-storm/twitter-kafka-producer/src

15.C: Open the file "log4j.properties".

Command: vi log4j.properties

Paste the following configuration into "log4j.properties" and save the file.

# Logging level
solr.log=logs/
log4j.rootLogger=INFO, file, CONSOLE

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%-4r [%t] %-5p %c %x \u2013 %m%n

#- size rotation with log cleanup.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9

#- File to log to and log format
log4j.appender.file.File=${solr.log}/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p – %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n

log4j.logger.org.apache.zookeeper=WARN
log4j.logger.org.apache.hadoop=WARN

# set to INFO to enable infostream log messages
log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF

Step 16: Run the Kafka Zookeeper instance.

16.A: Change the directory to home.

Command: cd

16.B: Change the directory to "kafka-0.7.2-incubating-src".

Command: cd kafka-0.7.2-incubating-src

16.C: Run the Kafka Zookeeper instance. Leave this terminal open; ZooKeeper keeps running in the foreground, and the next steps use new terminals.

Command: bin/zookeeper-server-start.sh config/zookeeper.properties
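
To verify that ZooKeeper is up, you can send it the "ruok" four-letter command from another terminal (assuming netcat is installed); it should answer "imok":

Command: echo ruok | nc localhost 2181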

Step 17: Start the Kafka brokers under the "kafka-0.7.2-incubating-src" directory, each with its own JMX port number.

17.A: Open a new terminal, and use the following command to start the first broker.

Command: JMX_PORT=2002 bin/kafka-server-start.sh config/server-0.properties

17.B: Open a new terminal, and use the following command to start the second broker.

Command: JMX_PORT=2003 bin/kafka-server-start.sh config/server-1.properties

17.C: Open a new terminal, and use the following command to start the third broker.

Command: JMX_PORT=2004 bin/kafka-server-start.sh config/server-2.properties
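
Once all three brokers are running, you can confirm that they are listening on the ports configured in Step 13 (a quick sanity check; assumes netstat is available):

Command: netstat -tln | grep -E '909[2-4]'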

Step 18: Run the Kafka producer.

18.A: Open a new terminal and change the directory to "clj-kafka-storm/twitter-kafka-producer".

Command: cd clj-kafka-storm/twitter-kafka-producer

18.B: Run "lein do clean, run" (this is "lein clean" and "lein run" combined into a single call to lein; note the spacing around "do" and the comma).

Command: lein do clean, run

Step 19: Run the Storm topology (that is, a Storm instance with this topology)

19.A : open a new terminal and change the directory to “./twitter-example”

Command: cd ./twitter-example

1

19.B : Run lein do clean.

Command: lein do clean, run

1

1
