Thanks Hans and Alexander for taking the time to respond.
I agree that this is one cluster, but having one additional ZK node per
site does not help (as far as I understand ZK) if the server room that
has 3 nodes goes down.
3 out of 6 is also not a majority, so I think you mean 3/5 with a cloned
3rd node. That would mean manually switching the clone over to get a
majority back, which can cause issues again:
1. You effectively build a master/slave ZK setup with manual switch-over.
2. While switching the clone from room to room you would have downtime.
3. If you start both ZK node clones at the same time (by mistake), you
are screwed.
4. If you "switch" clones instead of moving one with all its data on
disk, you create a split brain from which you have to recover first.
So if you lose the connection between the rooms / the rooms get separated:
* you (might) need manual intervention,
* you lose automatic fail-over between the rooms,
* you might face a complete outage if your "master" room with the active
3rd node is hit.
Actually this is the same scenario as with 2/3 nodes spread over two
locations. What you need for real fault tolerance is a third,
cross-connected location, with your 3 or 5 ZK nodes distributed over
those. Or live with a possible outage in such a scenario.
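For example, 5 ZK nodes laid out 2 + 2 + 1 across three cross-connected
rooms survive the loss of any single room, because the remaining 3 (or 4)
nodes are still a majority of 5.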
* You can run any number of Kafka brokers on a ZK cluster. In your case
this could be 4 Kafka brokers on 3 ZK nodes.
* You should set topic replication to 2 (this can be done at any time)
plus some producer/broker settings to ensure your messages will not get
lost in switch-over cases (see the sketch below).
* The ZK service does not react nicely to a full disk.
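For example, a minimal sketch of the settings I mean (hostnames are
placeholders; note the trade-off: with replication 2,
min.insync.replicas=2 prevents message loss but blocks writes while one
replica is down, while min.insync.replicas=1 keeps writes flowing but can
lose the newest messages during a fail-over):

    # broker side (server.properties)
    zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
    default.replication.factor=2
    min.insync.replicas=2
    unclean.leader.election.enable=false

    # producer side (client configuration)
    acks=all
    retries=2147483647
    max.in.flight.requests.per.connection=1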
Post by Hans Jespersen
In that case it's really one cluster. Make sure to set different rack ids
for each server room so Kafka will ensure that the replicas always span
both floors and you don't lose availability of data if a server room goes
down.
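For example (a sketch; the rack names are placeholders, anything that
distinguishes the two rooms will do):

    # server.properties on each broker in server room A
    broker.rack=room-a

    # server.properties on each broker in server room B
    broker.rack=room-b

With that set, rack-aware replica assignment spreads the replicas of each
partition across both rooms (for topics created with replication >= 2).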
You will have to configure one additional zookeeper node in each site,
which you will only ever start up if a site goes down, because otherwise
2 of 4 zookeeper nodes is not a quorum. Again, you would be better off
with 3 nodes, because then you would only have to do this in the site
that has the single active node.
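A sketch of what that standby scheme implies with static ZK configs
(hostnames are placeholders; the standby is started with the failed
node's server id and a copy of its data, which is exactly the manual
switch-over discussed above):

    # zoo.cfg during normal operation: 2 nodes in site A, 1 in site B
    server.1=zk1-siteA:2888:3888
    server.2=zk2-siteA:2888:3888
    server.3=zk3-siteB:2888:3888

    # zoo.cfg after site A is lost: the cold standby in site B is
    # brought up as server.1, so servers 1 and 3 form a 2-of-3 majority
    server.1=zk-standby-siteB:2888:3888
    server.2=zk2-siteA:2888:3888
    server.3=zk3-siteB:2888:3888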
-hans
Post by Jens Rantil
Hi Hans,
Thank you for your reply.
It's basically two different server rooms on different floors, and they
are connected with fiber connectivity, so it's almost like a local
connection between them: no network latency / lag.
If I do a Mirror Maker / Replicator setup then I will not be able to use
both of them at the same time for writes / producers, because the
consumers / producers will request from all of them.
BR,
Lee
Post by Hans Jespersen
What do you mean when you say you have "2 sites not datacenters"? You
should be very careful configuring a stretch cluster across multiple
sites. What is the RTT between the two sites? Why do you think that
Mirror Maker (or Confluent Replicator) would not work between the sites
and yet you think a stretch cluster will work? That seems wrong.
-hans
/**
* Hans Jespersen, Principal Systems Engineer, Confluent Inc.
*/
Post by Le Cyberian
Hi Guys,
Thank you very much for your reply.
The scenario which I have to implement is that I have 2 sites, not
datacenters, so mirror maker would not work here.
There will be 4 nodes in total, like 2 in Site A and 2 in Site B. The
idea is to have an active-active setup along with fault tolerance, so
that if one of the sites goes down the operations stay normal.
In this case, if I go ahead with a 4-node cluster of both zookeeper and
kafka, it will give failover tolerance for 1 node only.
What do you suggest to do in this case? Because to divide between 2
sites it needs to be an even number, if that makes sense? Also, if
possible, some help regarding partitions for topics and the replication
factor.
I already have Kafka running with quite a few topics having replication
factor 1 along with 1 default partition. Is there a way to repartition /
increase the partitions of existing topics when I migrate to the above
setup? I think we can increase the replication factor with the Kafka
rebalance tool, something like the sketch below perhaps.
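If I understand the stock CLI tools correctly, it would be roughly this
(hostnames, topic name and broker ids are placeholders):

    # add partitions to an existing topic (existing data stays where it
    # is, so per-key ordering is only guaranteed for new messages)
    bin/kafka-topics.sh --zookeeper zk1:2181 --alter \
      --topic my-topic --partitions 4

    # raise the replication factor to 2 by assigning a second replica
    # per partition; increase-rf.json is hand-written, e.g.
    # {"version":1,"partitions":[
    #   {"topic":"my-topic","partition":0,"replicas":[1,2]}]}
    bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
      --reassignment-json-file increase-rf.json --execute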
Thanks a lot for your help and time looking into this.
BR,
Le
Jens,
Post by Hans Jespersen
I think you are correct that a 4 node zookeeper ensemble can be made to
work, but it will be slightly less resilient than a 3 node ensemble,
because it can only tolerate 1 failure (same as a 3 node ensemble) and
the likelihood of node failures is higher because there is 1 more node
that could fail.
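To spell out the arithmetic: a ZooKeeper quorum is a strict majority,
floor(n/2) + 1 nodes, so an ensemble tolerates n - (floor(n/2) + 1)
failures:

    n = 3 -> quorum 2 -> tolerates 1 failure
    n = 4 -> quorum 3 -> tolerates 1 failure
    n = 5 -> quorum 3 -> tolerates 2 failures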
So it SHOULD be an odd number of zookeeper nodes (not MUST).
-hans
Hi Hans,
Post by Hans Jespersen
A 4 node zookeeper ensemble will not even work. It MUST be an odd
number of zookeeper nodes to start.
Are you sure about that? If Zookeeper doesn't run with four nodes, that
means a running ensemble of three can't be live-migrated to other nodes
(because that's done by increasing the ensemble and then reducing it in
the case of 3-node ensembles). IIRC, you can run four Zookeeper nodes,
but that means quorum will be three nodes, so there's no added benefit
in terms of availability since you can only lose one node, just like
with a three-node cluster.
Cheers,
Jens
--
Jens Rantil
Backend engineer
Tink AB
Phone: +46 708 84 18 32
Web: www.tink.se
Facebook <https://www.facebook.com/#!/tink.se>
Linkedin <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
Twitter <https://twitter.com/tink>