Troubleshooting
Connection issues
“Connection refused” when connecting
Symptom: Client throws connection refused on localhost:9092.
Causes & fixes:
- klite isn’t running: Start it with ./klite and check for startup errors in the log output.
- Wrong port: If you started klite with --listen :9093, connect to port 9093:
      bootstrap.servers = localhost:9093
- Firewall blocking the port: Check that port 9092 is open:
      nc -zv localhost 9092
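All three causes above come down to one question: does anything accept TCP connections on the address the client uses? Here is a minimal sketch of a reusable check for machines where nc is not installed. It assumes bash and the coreutils timeout command are available. The function name port_open is hypothetical and not part of klite.

```shell
# port_open HOST PORT: exit 0 if a TCP connection succeeds within 3 seconds.
# Relies on bash's built-in /dev/tcp redirection, so nc is not required.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, `port_open localhost 9092 && echo reachable || echo "not reachable"` gives a quick yes/no answer from the same host the client runs on.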
Clients connect but can’t produce or consume
Symptom: Connection succeeds but operations fail with “broker not available” or “leader not available”.
Cause: The advertised address doesn’t match what clients can reach.
Fix: Set --advertised-addr to the address clients use:
    # Docker
    ./klite --advertised-addr host.docker.internal:9092

    # Remote server
    ./klite --advertised-addr my-server.example.com:9092

    # Kubernetes
    ./klite --advertised-addr klite.default.svc.cluster.local:9092

Docker: clients outside the container can’t connect
Symptom: kcat from the host machine can’t connect to klite running in Docker.
Fixes:
- Ensure you published the port:
      docker run -p 9092:9092 ghcr.io/klaudworks/klite
- Set the advertised address:
      docker run -p 9092:9092 ghcr.io/klaudworks/klite --advertised-addr localhost:9092
- On Linux with Docker bridge networking, you may need --network host:
      docker run --network host ghcr.io/klaudworks/klite
Docker Compose: services can’t reach klite
Symptom: Other containers in the same Compose stack get connection errors.
Fix: Use the service name as the advertised address:
    services:
      klite:
        image: ghcr.io/klaudworks/klite
        command: ["--advertised-addr", "klite:9092"]
        ports:
          - "9092:9092"

Other containers connect with bootstrap.servers = klite:9092.
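When an advertised-address problem is suspected, it helps to see which address the broker actually hands out to clients. The sketch below filters the metadata listing printed by kcat -L. It assumes kcat's usual "broker N at host:port" output lines. The helper name advertised_brokers is hypothetical.

```shell
# advertised_brokers: read `kcat -L` output on stdin and print each
# advertised host:port, one per line.
# Assumes kcat's usual "broker N at host:port" metadata lines.
advertised_brokers() {
  sed -n 's/^ *broker [0-9][0-9]* at \([^ ]*\).*$/\1/p'
}
```

Run it as `kcat -L -b localhost:9092 | advertised_brokers` from the machine where the client runs. If the printed address cannot be reached from that machine, --advertised-addr needs to change.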
Topic issues
“Topic not found” errors
Symptom: Consumer gets UNKNOWN_TOPIC_OR_PARTITION error.
Causes & fixes:
- Auto-create is disabled: If you started klite with --auto-create-topics=false, create topics explicitly before use:
      # TODO: Add klite admin CLI example for topic creation
      # For now, producing with kcat auto-creates:
      echo "" | kcat -P -b localhost:9092 -t my-topic
- Typo in topic name: Topic names are case-sensitive. Check the exact name.
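Because topic names are case-sensitive, a name that differs only in letter case looks like a missing topic. The sketch below checks a name against the topic list printed by kcat -L. It assumes kcat's usual `topic "name" with N partitions:` output lines. The helper name topic_lookup is hypothetical.

```shell
# topic_lookup NAME: read `kcat -L` output on stdin and report whether NAME
# exists exactly, exists only with different letter case, or is missing.
# Assumes kcat's usual `topic "name" with N partitions:` metadata lines.
topic_lookup() {
  awk -v want="$1" '
    /^ *topic "/ {
      split($0, parts, "\"")            # parts[2] is the topic name
      if (parts[2] == want) exact = 1
      else if (tolower(parts[2]) == tolower(want)) near = parts[2]
    }
    END {
      if (exact) print "found"
      else if (near != "") print "case mismatch: " near
      else print "missing"
    }'
}
```

Run it as `kcat -L -b localhost:9092 | topic_lookup my-topic`.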
Need more partitions
klite defaults to 1 partition per topic. For higher throughput or more consumer parallelism:
    # Set default for new topics
    ./klite --default-partitions 6

Consumer group issues
Consumers not receiving messages
Symptom: Consumer is connected and in a group but not receiving any messages.
Causes & fixes:
- More consumers than partitions: Extra consumers are idle. Add more partitions.
- Wrong auto.offset.reset: If the consumer group has no committed offsets and auto.offset.reset is latest (default in some clients), it only sees new messages. Set it to earliest to read existing messages:
      auto.offset.reset = earliest
- Consumer group is stuck in rebalancing: Check klite logs for rebalance events:
      ./klite --log-level debug 2>&1 | grep -i "group\|rebalance"
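The first cause is simple arithmetic. Within a consumer group each partition is read by at most one member, so any members beyond the partition count receive nothing. The sketch below works that out. The function name idle_consumers is hypothetical.

```shell
# idle_consumers PARTITIONS CONSUMERS: print how many group members
# are left without a partition.
idle_consumers() {
  idle=$(( $2 - $1 ))
  [ "$idle" -gt 0 ] || idle=0
  echo "$idle"
}
```

With the default of 1 partition and 3 consumers in the group, `idle_consumers 1 3` prints 2. With 6 partitions and 4 consumers, it prints 0.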
“Group coordinator not available”
Symptom: Client reports the group coordinator is not available.
Fix: This is usually a transient error during startup. The client will retry automatically. If it persists, check that klite is running and reachable.
Storage issues
“disk full” or write errors
Symptom: klite logs WAL write failed with a disk full error.
Fixes:
- Free disk space or expand the volume
- Enable S3 storage to offload old data:
      ./klite --s3-bucket my-bucket --s3-region us-east-1
- Reduce retention:
      ./klite --retention-ms 86400000 # 24 hours
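Two small helpers can make these fixes less error-prone: one converts a retention period in hours to the millisecond value passed to --retention-ms, and one reports how much space is left on the volume. This is a sketch only. The function names hours_to_ms and free_kb are hypothetical, and free_kb assumes a POSIX df.

```shell
# hours_to_ms HOURS: convert a retention period in hours to milliseconds.
hours_to_ms() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

# free_kb DIR: print the free space, in KiB, on the filesystem holding DIR.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example, `./klite --retention-ms "$(hours_to_ms 24)"` is equivalent to passing 86400000, and `free_kb ./data` shows how close the data volume is to full.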
Data directory permissions
Symptom: klite fails to start with “permission denied” on the data directory.
Fix: Ensure the klite process can read and write the data directory:
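    mkdir -p ./data
    chmod 755 ./data

    # In Docker, ensure the container user has access.
    # An absolute host path is used because older Docker versions reject relative bind mounts.
    docker run -p 9092:9092 -v "$(pwd)/data:/data" --user "$(id -u):$(id -g)" ghcr.io/klaudworks/klite --data-dir /data

Startup issues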
klite fails to start
Check the error message in the output:
| Error | Cause | Fix |
|---|---|---|
| bind: address already in use | Another process on port 9092 | Stop it or use --listen :9093 |
| permission denied | Can’t bind port or open data dir | Run as appropriate user, check permissions |
| WAL replay failed | Corrupted WAL | Check disk health; restore from S3 if available |
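The first two rows can be checked before starting klite. The sketch below is a minimal preflight, assuming bash and coreutils timeout are installed. The function names dir_usable and port_in_use are hypothetical. port_in_use only probes the loopback address, so a listener bound to another interface would not be detected.

```shell
# dir_usable DIR: exit 0 if DIR exists and the current user can read and write it.
dir_usable() {
  [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# port_in_use PORT: exit 0 if something already accepts connections on 127.0.0.1:PORT.
port_in_use() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/127.0.0.1/$0"' "$1" 2>/dev/null
}
```

Run both as the user that will start klite, for example `dir_usable ./data || echo "fix data dir permissions"` and `port_in_use 9092 && echo "port 9092 is taken"`.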
How to reset all state
To start fresh, stop klite and delete the data directory:
    rm -rf ./data
    ./klite

This deletes all topics, messages, consumer group offsets, and metadata.
Performance issues
High produce latency
Possible causes:
- Slow disk (klite fsyncs on produce with acks=all). Use an NVMe SSD for best performance.
- Large batch sizes. klite processes batches atomically; very large batches take longer.
High memory usage
klite keeps recent WAL data in the OS page cache. This is normal and managed by the kernel. RES memory in top should stay low; VIRT may be high due to mmap.
Getting help
If your issue isn’t covered here:
- Check the klite GitHub issues
- Run klite with --log-level debug and include the logs in your report
- Include your klite version, OS, and client library version