Discussion:
Deciding number of partitions
Sachit Murarka
2018-11-11 14:18:02 UTC
Hello All,

I am new to Kafka. Could anyone please tell me how to decide the number of
partitions for our topic?


Kind Regards,
Sachit Murarka
花儿向阳开
2018-11-11 14:38:14 UTC
Hi,
Maybe you can use a command like:
kafka-topics.sh --create --topic <topic_name> --partitions <partition_number> --replication-factor <replica_number> --zookeeper <zookeeperIP:Port>
I hope it helps!


Regards!
Cathy
Sachit Murarka
2018-11-11 15:28:32 UTC
Hi,

Thanks for the reply.
You have shared the command. I am looking for a formula or a way to
calculate the number of partitions for my topic.

Regards,
Sachit

M. Manna
2018-11-11 15:52:22 UTC
If I understood your question correctly - you are looking for a recommended
guideline for the number of partitions per consumer.
Kafka itself doesn’t prescribe this - and I’m happy to be
corrected.

There are several partition assignment strategies, including range (the
default) and sticky. But you
should always aim for x/N partitions per consumer (in a given consumer
group).

x = number of consumers
N = number of partitions in a given topic.

This concept is independent of Kafka. In any distributed system you
should maintain the above ratio such that

1) there’s no race between threads, i.e. no two threads processing the same task

2) tasks are evenly distributed amongst workers - unless a worker dies or
exits the pool.

So if you have not planned for an even distribution of partitions per
consumer thread, you may need to revisit how you plan to spawn the
consumers in your consumer group.

I hope that makes sense.

Thanks,



M. Manna
2018-11-11 15:56:21 UTC
My mistake in explaining the formula.

It should be N / x partitions per consumer - the ideal ratio is 1.
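
To make the N / x ratio concrete, here is a minimal sketch of an even split of N partitions across x consumers. The function name is made up for illustration, and this is not one of Kafka's assignors, only the arithmetic behind "evenly distributed".

```python
def partitions_per_consumer(num_partitions: int, num_consumers: int) -> list[int]:
    """Return how many partitions each consumer owns under an even split."""
    if num_partitions < 0 or num_consumers <= 0:
        raise ValueError("need num_partitions >= 0 and num_consumers > 0")
    base, extra = divmod(num_partitions, num_consumers)
    # The first `extra` consumers each take one additional partition.
    return [base + 1 if i < extra else base for i in range(num_consumers)]


print(partitions_per_consumer(4, 4))   # [1, 1, 1, 1]  ratio N / x = 1
print(partitions_per_consumer(10, 4))  # [3, 3, 2, 2]  uneven but as close as possible
print(partitions_per_consumer(3, 4))   # [1, 1, 1, 0]  one consumer sits idle
```

When N is not a multiple of x, some consumers carry more partitions than others, and when x exceeds N the surplus consumers get nothing.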


Ryanne Dolan
2018-11-11 15:44:16 UTC
Sachit, there are many factors. Some to consider:

- # partitions is the max number of consumers the topic can have per group,
since each consumer will need at least one partition assigned to it.
- # partitions is usually also the max concurrency factor across a consumer
group, since consumers often process records sequentially within each
partition.
- if your key space is lopsided, you may wish to have enough partitions to
enable finer-grained load balancing.
- you may wish to keep the # of partitions around the # of brokers in your
Kafka cluster
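
One rough way to explore the lopsided-key point is to hash a sample of keys into different partition counts and look at the busiest partition. The sketch below assumes made-up traffic numbers and uses CRC32 as a stand-in hash; Kafka's default partitioner uses murmur2 for keyed records, so real placements will differ.

```python
import zlib


def partition_loads(key_counts: dict[str, int], num_partitions: int) -> list[int]:
    """Sum the record counts that land on each partition for a given key mix."""
    loads = [0] * num_partitions
    for key, count in key_counts.items():
        # Stand-in hash (assumption): Kafka itself uses murmur2, not CRC32.
        partition = zlib.crc32(key.encode("utf-8")) % num_partitions
        loads[partition] += count
    return loads


# Hypothetical, lopsided traffic: one hot key plus many small ones.
traffic = {"hot-customer": 5000}
traffic.update({f"customer-{i}": 10 for i in range(200)})

for n in (4, 16, 64):
    loads = partition_loads(traffic, n)
    print(f"{n} partitions: busiest carries {max(loads)} of {sum(loads)} records")
```

A single hot key always lands on one partition no matter how many partitions exist, but with more partitions the rest of the traffic is spread in smaller pieces, which gives the group more room to balance around the hot one.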

Ryanne

Jonathan Santilli
2018-11-11 15:56:52 UTC
Hello Sachit, I think that's the first question all of us ask ourselves when
facing that moment: "Ok, but how many partitions do we need?"

I would say, from my experience, that it depends directly on your
business and the kind of problem you are trying to solve, how many
consumers and producers the topic will have, and so on.

I would recommend reading the blog post "How to choose the number of
topics/partitions in a Kafka cluster
<https://webcache.googleusercontent.com/search?q=cache:Kum1ZTnjfKUJ:https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster+&cd=1&hl=en&ct=clnk&gl=uk>".
Also, as general advice, plan your topic configuration and size ahead of
time; otherwise you may regret it later.

Last but not least, the presentation "How much Kafka?" by Gwen Shapira
<https://www.slideshare.net/ConfluentInc/how-much-kafka/ConfluentInc/how-much-kafka>
is a good resource.
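
For a starting point on an actual formula: the blog post mentioned above is commonly summarized as "at least max(t/p, t/c) partitions", where t is the target throughput and p and c are the throughput you measure on a single partition for producing and consuming. That summary comes from the post, not from this thread, and the numbers below are invented, so treat this as a sketch only.

```python
import math


def partitions_for_throughput(target: float,
                              producer_per_partition: float,
                              consumer_per_partition: float) -> int:
    """Lower bound on partitions: max(t/p, t/c), rounded up to a whole partition.

    All three arguments must use the same unit (for example MB/s).
    """
    if min(target, producer_per_partition, consumer_per_partition) <= 0:
        raise ValueError("all throughputs must be positive")
    needed = max(target / producer_per_partition,
                 target / consumer_per_partition)
    return math.ceil(needed)


# Hypothetical measurements: 100 MB/s target, 10 MB/s per partition when
# producing, 20 MB/s per partition when consuming.
print(partitions_for_throughput(100, 10, 20))  # 10
print(partitions_for_throughput(250, 20, 50))  # 13
```

The per-partition figures have to be measured on your own cluster and workload; the result is a lower bound to weigh against the other factors in this thread.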

Hope that helps.
--
Jonathan



s***@163.com
2018-11-12 01:48:30 UTC
Well, deciding the number of partitions is complicated; it depends on your application. I think 8-64 partitions is good practice for most applications, for example 16 or 32 partitions with 2 or 3 replicas.
Generally:
the fewer partitions you have, the lower the TPS you will get;
the more partitions you have, the more CPU, I/O, memory, file descriptors, etc. they will occupy. Too many partitions also means too many files, which turns sequential writes and reads into random ones.
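
To get a feel for the resource side, it can help to count partition replicas, since every replica is stored on some broker as its own directory of segment files. A small sketch, with a hypothetical helper name and a 3-broker cluster assumed purely for the example:

```python
def replica_footprint(num_partitions: int,
                      replication_factor: int,
                      num_brokers: int) -> tuple[int, float]:
    """Return (total partition replicas, average replicas per broker) for one topic."""
    if min(num_partitions, replication_factor, num_brokers) <= 0:
        raise ValueError("all arguments must be positive")
    if replication_factor > num_brokers:
        # Kafka cannot place two replicas of a partition on the same broker.
        raise ValueError("replication factor cannot exceed the number of brokers")
    total_replicas = num_partitions * replication_factor
    return total_replicas, total_replicas / num_brokers


print(replica_footprint(16, 2, 3))
print(replica_footprint(32, 3, 3))  # (96, 32.0)
```

This counts a single topic; the totals add up across every topic on the cluster.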

forgive my poor English~



沈光辉
China UnionPay, Technology Division

