Discussion:
latency performance test
Yuheng Du
2015-07-15 08:38:37 UTC
Hi,

I have run the end-to-end latency test and the ProducerPerformance test on
my Kafka cluster according to
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

In the end-to-end latency test, the latency was around 2ms. For the
ProducerPerformance test I ran:
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
speedx1 50000000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
buffer.memory=67108864 batch.size=8196


The results show a max latency of 3617ms and an avg latency of 626.7ms. Why
is the latency in the ProducerPerformance test so much larger than in the
end-to-end test? Is it because of batching? Are the definitions of
these two latencies different? I looked at the source code and I believe
the latency is measured as the time for the producer.send() call to
complete. So does this latency include transmission delay, transferring
delay, and what other components?


Thanks.


best,

Yuheng
Ewen Cheslack-Postava
2015-07-16 04:21:21 UTC
The tests are meant to evaluate different things and the way they send
messages is the source of the difference.

EndToEndLatency works with a single message at a time. It produces the
message then waits for the consumer to receive it. This approach guarantees
there is no delay due to queuing. The goal with this test is to evaluate
the *minimum* latency.

ProducerPerformance focuses on achieving maximum throughput. This means it
will enqueue lots of records so it will always have more data to send (and
can use batching to increase the throughput). Unlike EndToEndLatency, this
means records may just sit in a queue on the producer for a while because
the maximum number of in-flight requests has been reached and it needs to
wait for responses to those requests. Since EndToEndLatency only ever has
one record outstanding, it will never encounter this case.

Batching itself doesn't increase the latency because it only occurs when
either a) the producer is already unable to send messages anyway or b)
linger.ms is greater than 0, and the tests use the default setting that
doesn't linger at all.

In your example for ProducerPerformance, you have 100-byte records and will
buffer up to 64MB. Given the batch size of 8K and the default producer
setting of 5 in-flight requests, you can roughly think of one round trip
handling 5 * 8K = 40K bytes of data. If your round trip is 1ms, then if your
buffer is full at 64MB it will take 64 MB / (40 KB/ms) = 1638ms = 1.6s to
drain it. That means that the record that was added at the end of the buffer
had to just sit in the buffer for 1.6s before it was sent off to the broker.
And if your buffer is consistently full (which it should be for
ProducerPerformance since it's sending as fast as it can), that means
*every* record waits that long.

Of course, these numbers are estimates and depend on my having assumed a 1ms
round trip, but hopefully they make it clear why you can see relatively
large latencies.
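
For anyone who wants to plug in their own numbers, the back-of-the-envelope
estimate above can be written as a small Python sketch. The function name
and parameters are made up for illustration and are not part of any Kafka
tool; the 1ms round trip is the same assumption as above, not a measurement,
and the batch size is rounded to 8K as in the estimate.

```python
def estimated_queue_delay_ms(buffer_bytes, batch_bytes, max_in_flight, round_trip_ms):
    """Rough time for a record at the back of a full buffer to be sent.

    Model: each round trip drains max_in_flight batches of batch_bytes each.
    This ignores per-partition batching, compression, and broker-side delays.
    """
    bytes_per_round_trip = batch_bytes * max_in_flight
    return buffer_bytes / bytes_per_round_trip * round_trip_ms


delay = estimated_queue_delay_ms(
    buffer_bytes=64 * 1024 * 1024,  # buffer.memory=67108864
    batch_bytes=8 * 1024,           # batch size rounded to 8K
    max_in_flight=5,                # default number of in-flight requests
    round_trip_ms=1.0,              # assumed round trip time
)
print(f"{delay:.0f} ms")  # prints "1638 ms"
```

A longer round trip scales the result linearly, so a measured 2ms round trip
would roughly double the estimate.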

-Ewen
Yuheng Du
2015-07-16 12:20:54 UTC
Hi Ewen,

Thank you for your patient explanation. It is very helpful.

Can we assume that the long latency of ProducerPerformance comes from
queuing delay in the buffer and that it is related to the buffer size?

Thank you!

best,
Yuheng
Ewen Cheslack-Postava
2015-07-16 23:24:52 UTC
That's a good bet. Running with a few different buffer sizes would verify
this. Adjusting other settings (e.g. max in flight requests, as mentioned
earlier) could also affect latency and throughput.
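
Before rerunning the test, it may help to see what the simple queuing model
from earlier in the thread predicts for a few settings. The Python sketch
below is only that model (buffer size divided by bytes drained per round
trip); the helper name, the 8K batch size, and the 1ms round trip are
illustrative assumptions, and the real numbers have to come from running
ProducerPerformance with different buffer.memory values.

```python
MIB = 1024 * 1024


def estimated_queue_delay_ms(buffer_bytes, batch_bytes=8 * 1024,
                             max_in_flight=5, round_trip_ms=1.0):
    """Predicted wait for the last record in a full buffer (simple model)."""
    return buffer_bytes / (batch_bytes * max_in_flight) * round_trip_ms


# Sweep a few buffer sizes and in-flight limits; values are predictions only.
rows = []
for buffer_mib in (8, 16, 32, 64):
    for in_flight in (1, 5):
        delay = estimated_queue_delay_ms(buffer_mib * MIB, max_in_flight=in_flight)
        rows.append((buffer_mib, in_flight, delay))

for buffer_mib, in_flight, delay in rows:
    print(f"buffer={buffer_mib:>2} MiB  in_flight={in_flight}  "
          f"estimated delay={delay:8.1f} ms")
```

If measured latency shrinks roughly in proportion to the buffer size, that
supports the queuing explanation.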

-Ewen