kafka-on-EC2

Intro.

A hands-on Kafka data flow exercise using public data. Three EC2 instances were created to simulate the message broker.

Producers:
- Fetch public real estate transaction data from S3 storage.
- Create the Kafka topic, `apartinfo`.
- Send the data to the queue via Logstash.
Kafka Cluster:
- Central message broker that manages topics and queues.
- Contains topics (`apartinfo` in this case).
Consumers:
- Subscribe to topics in the Kafka cluster.
- Receive and process data from the cluster.
- Run a Python script that polls every second and displays the received data.

Producer --> Kafka Cluster(Queue) --> Consumer
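
The flow above can be illustrated without any servers. The following is only an in-memory analogy, assuming a standard-library `queue.Queue` as a stand-in for the `apartinfo` topic; the function names and the sample records are made up for illustration and are not part of this repository.

```python
import json
import queue


def produce(records, topic_queue):
    """Serialize each record to JSON bytes and append it to the queue."""
    for record in records:
        topic_queue.put(json.dumps(record).encode("utf-8"))


def consume(topic_queue):
    """Drain the queue and return the decoded records in arrival order."""
    received = []
    while True:
        try:
            raw = topic_queue.get_nowait()
        except queue.Empty:
            break
        received.append(json.loads(raw))
    return received


# The queue plays the role of the "apartinfo" topic (sample data is hypothetical).
apartinfo = queue.Queue()
produce([{"id": 1}, {"id": 2}], apartinfo)
print(consume(apartinfo))
```

The producer and consumer never call each other; they only share the queue, which is the decoupling the Kafka cluster provides between the real instances.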

Create an EC2 instance and connect to it

$ ssh -i "dataEng-seoul.pem" ec2-user@<your-ec2-public-ip>.ap-northeast-1.compute.amazonaws.com

Kafka Setup on EC2

Create 3 EC2 Instances for Kafka

Use medium-type instances.

Install Kafka

Download and Extract Kafka

$ wget https://downloads.apache.org/kafka/3.6.1/kafka_2.13-3.6.1.tgz
$ tar xvf kafka_2.13-3.6.1.tgz
$ ln -s kafka_2.13-3.6.1 kafka

Start Kafka Services

$ cd kafka
$ ./bin/zookeeper-server-start.sh config/zookeeper.properties &
$ ./bin/kafka-server-start.sh config/server.properties &

Verify Services
```
$ sudo netstat -anp | egrep "9092|2181"
```
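
The same check can be scripted, for example when `netstat` is not installed. This is a minimal sketch using only the Python standard library; it assumes ZooKeeper listens on 2181 and Kafka on 9092 (the ports used in the command above), and the helper names are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", ports=(2181, 9092)):
    """Map each port to whether something accepts connections on it."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `print(check_services())` on the Kafka instance should report `True` for both ports once ZooKeeper and the broker have finished starting.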

Create a Kafka Topic

$ bin/kafka-topics.sh --create --topic apartinfo --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092

List Topics

$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Consume Messages

$ ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic apartinfo --from-beginning
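
To give the console consumer something to read before Logstash is configured, a test message can be published from Python. This is a rough sketch, not part of the original setup: it assumes the `confluent_kafka` package (installed later in this README with `pip install confluent_kafka`) and a broker on `localhost:9092`; `encode_record` and `send_test_message` are hypothetical helper names.

```python
import json


def encode_record(record):
    """Serialize a dict to UTF-8 JSON bytes for use as a Kafka message value."""
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


def send_test_message(record, bootstrap_servers="localhost:9092", topic="apartinfo"):
    # Imported lazily so encode_record works without the client library installed.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": bootstrap_servers})

    def on_delivery(err, msg):
        # Invoked from flush() with the delivery result for the message.
        if err is not None:
            print(f"Delivery failed: {err}")
        else:
            print(f"Delivered to {msg.topic()} [{msg.partition()}] at offset {msg.offset()}")

    producer.produce(topic, value=encode_record(record), callback=on_delivery)
    producer.flush(10)  # wait up to 10 seconds for outstanding deliveries
```

For example, `send_test_message({"test": "hello"})` should make one JSON line appear in the console consumer started above.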

Kafka Producer and Consumer Setup

Install Logstash on both the `Producer` and `Consumer`

Add Logstash Repository

$ sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
$ sudo vi /etc/yum.repos.d/logstash.repo

Add the following:

[logstash-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install and Configure Logstash

$ sudo yum install logstash -y

Open the initialization file:

$ vi ~/.bash_profile

Add the following lines:

export LS_HOME=/usr/share/logstash
PATH=$PATH:$LS_HOME/bin

Apply the changes:

$ source ~/.bash_profile

$ logstash --version

Create Logstash Config File

$ vi apartinfo.conf

Add the S3 input settings for fetching the data and the Kafka output settings:

input {
  s3 {
    access_key_id => "accesskey"
    secret_access_key => "security_key"
    region => "ap-northeast-2"
    bucket => "fc-storydata"
    prefix => "ods/danji_master.json/" # key prefix (directory) inside the bucket
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}

output {
  stdout { }
  kafka {
    codec => json
    topic_id => "apartinfo"
    bootstrap_servers => "172.31.6.238:9092" # IP of the Kafka server
  }
}

Run Logstash
```
$ logstash -f apartinfo.conf
```

Kafka Client Installation and Setup for Consumer Instance

Create a configuration file for the Kafka consumer:
```
$ vi consumer_ls.conf
```

Add the following configuration:

input {
    kafka {
        bootstrap_servers => "172.31.6.238:9092"
        group_id => "apart_info"
        topics => ["apartinfo"] # Topic name
        consumer_threads => 1 # Number of consumer threads
    }
}

output {
    stdout { codec => rubydebug }
}

Run Logstash with the configuration file:

$ logstash -f /home/ec2-user/consumer_ls.conf

Kafka Consumer Server Setup

Install the required dependencies:

$ sudo yum install pip -y
$ pip install confluent_kafka

Create a Python script for the consumer (`consumer_ph.py`) and run it:
```
$ python3 consumer_ph.py
```
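
The script itself is not reproduced in this README. The following is a minimal sketch of what such a consumer could look like with `confluent_kafka`, not the actual contents of `consumer_ph.py`. It assumes the broker address and topic used in the Logstash configs above; the group id `apart_info_py` and the helper names are hypothetical, with a separate group id chosen so that this consumer and the Logstash consumer each receive every message.

```python
import json


def decode_record(raw_value):
    """Decode a Kafka message value (bytes) as JSON; fall back to the raw text."""
    if raw_value is None:
        return {}
    text = raw_value.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {"raw": text}


def run(bootstrap_servers="172.31.6.238:9092", topic="apartinfo", group_id="apart_info_py"):
    # Imported lazily so decode_record can be used without the client installed.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,              # assumed group id, not from the repo
        "auto.offset.reset": "earliest",   # start from the oldest message on first run
    })
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to one second for a message
            if msg is None:
                continue
            if msg.error():
                print(f"Consumer error: {msg.error()}")
                continue
            print(decode_record(msg.value()))
    except KeyboardInterrupt:
        pass
    finally:
        consumer.close()  # leave the group cleanly
```

To use the sketch as a script, add a final `run()` call at the bottom of the file; stop it with Ctrl+C.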
