Real-time pipeline that processes 100,000 records using Kafka + Python (fake data)
First install kafka-python and Faker from inside the Kafka folder, so the producer can generate and send 100,000 records:
-> pip install kafka-python faker
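The producer in this post connects to a broker at 127.0.0.1:9092, so Kafka has to be running before the script starts. As a rough pre-check, the sketch below tries to open a plain TCP connection to that address using only the standard library. The helper name broker_reachable is hypothetical and not part of kafka-python, and a successful connection only shows that something is listening on the port, not that it is a healthy Kafka broker.

```python
import socket


def broker_reachable(host="127.0.0.1", port=9092, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it checks only that the port accepts
    connections, not that the listener actually speaks Kafka.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if broker_reachable():
    print("Something is listening on 127.0.0.1:9092")
else:
    print("Nothing is listening on 127.0.0.1:9092; start Kafka first")
```

If the check fails, start the Kafka broker (or adjust the host and port to match your setup) before running the producer.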
Add a file named producer.py:
from kafka import KafkaProducer
from faker import Faker
import json
import random
import time

# Connect to the local broker; every value is serialized as UTF-8 JSON.
producer = KafkaProducer(
    bootstrap_servers='127.0.0.1:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)
fake = Faker()
TOTAL = 100000
start = time.time()

for i in range(TOTAL):
    # Build one fake transaction record.
    data = {
        "transaction_id": i,
        "customer": fake.name(),
        "amount": round(random.uniform(10, 5000), 2),
        "city": fake.city(),
        "timestamp": time.time()
    }
    # send() is asynchronous: the record is queued and sent in the background.
    producer.send("transactions", value=data)
    # Report progress every 10,000 iterations.
    if i % 10000 == 0:
        print(f"{i} events sent")

# Block until all buffered records have been sent.
producer.flush()
print(f"Finished {TOTAL} events")
print(f"Duration: {time.time() - start:.2f} seconds")
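To check the message format without a running broker, the record shape and the JSON serializer from producer.py can be exercised on their own. This is a minimal sketch using only the standard library. The names build_event, serialize and deserialize are hypothetical, and customer and city are fixed placeholder strings here instead of Faker output.

```python
import json
import random
import time


def build_event(i, customer="Jane Doe", city="Jakarta"):
    """Build one record shaped like the producer's payload.

    customer and city are placeholder values (an assumption for this
    sketch); producer.py fills them with Faker instead.
    """
    return {
        "transaction_id": i,
        "customer": customer,
        "amount": round(random.uniform(10, 5000), 2),
        "city": city,
        "timestamp": time.time(),
    }


def serialize(value):
    # Same logic as the value_serializer passed to KafkaProducer.
    return json.dumps(value).encode("utf-8")


def deserialize(raw):
    # Inverse of serialize(): bytes back to a Python dict.
    return json.loads(raw.decode("utf-8"))


event = build_event(42)
payload = serialize(event)

# The broker only ever sees bytes; decoding must give back the same record.
assert isinstance(payload, bytes)
assert deserialize(payload) == event
print(f"{len(payload)} bytes per message")
```

On the reading side, a function like deserialize could be passed to kafka-python's KafkaConsumer as value_deserializer, mirroring the producer's value_serializer.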
Run python producer.py
Output:
