kafka-commits mailing list archives

From gwens...@apache.org
Subject [kafka] branch trunk updated: MINOR: Lower producer throughput in flaky upgrade system test
Date Fri, 07 Jun 2019 23:54:14 GMT
This is an automated email from the ASF dual-hosted git repository.

gwenshap pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/kafka.git


The following commit(s) were added to refs/heads/trunk by this push:
     new c7c310b  MINOR: Lower producer throughput in flaky upgrade system test
c7c310b is described below

commit c7c310beff23423c484b3a86aa284141a190d581
Author: Jason Gustafson <jason@confluent.io>
AuthorDate: Fri Jun 7 16:53:50 2019 -0700

    MINOR: Lower producer throughput in flaky upgrade system test
    
    We see the upgrade test failing from time to time. I looked into it and found that the
    root cause is basically that the test throughput can be too high for the 0.9 producer to make
    progress. Eventually it reaches a point where it has a huge backlog of timed out requests
    in the accumulator which all have to be expired. We see a long run of messages like this in
    the output:
    
    ```
    {"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386132,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335160","key":null}
    {"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386132,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335163","key":null}
    {"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386133,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335166","key":null}
    {"exception":"class org.apache.kafka.common.errors.TimeoutException","time_ms":1559907386133,"name":"producer_send_error","topic":"test_topic","message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer","value":"335169","key":null}
    ```
    This can continue for a long time (I have observed up to 1 min) and prevents the producer
    from successfully writing any new data. While it is busy expiring the batches, no data is
    getting delivered to the consumer, which causes it to eventually raise a timeout.
    ```
    kafka.consumer.ConsumerTimeoutException
    at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:50)
    at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:109)
    at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
    at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:47)
    at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
    ```
    The fix here is to reduce the throughput, which seems reasonable since the purpose of
    the test is to verify the upgrade, which does not demand heavy load. Note that I investigated
    several failing instances of this test going back to 1.0 and saw a similar pattern, so there
    does not appear to be a regression.
    
    Author: Jason Gustafson <jason@confluent.io>
    
    Reviewers: Gwen Shapira
    
    Closes #6907 from hachikuji/lower-throughput-for-upgrade-test
---
 tests/kafkatest/tests/core/upgrade_test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/kafkatest/tests/core/upgrade_test.py b/tests/kafkatest/tests/core/upgrade_test.py
index 80f18ff..1cda4e7 100644
--- a/tests/kafkatest/tests/core/upgrade_test.py
+++ b/tests/kafkatest/tests/core/upgrade_test.py
@@ -36,7 +36,7 @@ class TestUpgrade(ProduceConsumeValidateTest):
         self.zk.start()
 
         # Producer and consumer
-        self.producer_throughput = 10000
+        self.producer_throughput = 1000
         self.num_producers = 1
         self.num_consumers = 1
 
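
When triaging a failed run, it can help to measure how long a run of "Batch Expired" errors like the one quoted in the commit message actually lasted. The following is a minimal sketch, not part of the commit or of the Kafka test suite: the helper name summarize_batch_expiry is hypothetical, and it assumes the producer output is available as one JSON event per line with the "name", "message", and "time_ms" fields seen in the log excerpt above.

```python
import json


def summarize_batch_expiry(lines):
    """Return (count, first_ms, last_ms) for 'Batch Expired' send errors.

    Hypothetical helper: assumes one JSON event per line, shaped like the
    VerifiableProducer output quoted in the commit message.
    """
    timestamps = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON event
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        if (event.get("name") == "producer_send_error"
                and event.get("message") == "Batch Expired"
                and "time_ms" in event):
            timestamps.append(event["time_ms"])
    if not timestamps:
        return 0, None, None
    return len(timestamps), min(timestamps), max(timestamps)


# Two events copied from the log excerpt in the commit message.
SAMPLE = [
    '{"exception":"class org.apache.kafka.common.errors.TimeoutException",'
    '"time_ms":1559907386132,"name":"producer_send_error","topic":"test_topic",'
    '"message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer",'
    '"value":"335160","key":null}',
    '{"exception":"class org.apache.kafka.common.errors.TimeoutException",'
    '"time_ms":1559907386133,"name":"producer_send_error","topic":"test_topic",'
    '"message":"Batch Expired","class":"class org.apache.kafka.tools.VerifiableProducer",'
    '"value":"335166","key":null}',
]

count, first_ms, last_ms = summarize_batch_expiry(SAMPLE)
print(count, last_ms - first_ms)  # prints "2 1"
```

An expiry window approaching a minute would match the behavior described in the commit message; whether that span exceeds the consumer timeout configured for a given run would need to be checked separately.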

