phoenix-user mailing list archives

From manohar mc <>
Subject Fw: Read Performance in latest code
Date Sun, 21 Jul 2019 17:18:12 GMT

----- Forwarded message -----
From: manohar mc <>
To: <>
Sent: Friday, 19 July, 2019, 11:14:41 am IST
Subject: Read Performance in latest code

Hi List,

I am using the latest Phoenix Spark connector. Initially we observed write
performance issues, and after some changes we could bring the time down from
30 minutes to under 1 minute in our test environment. However, we are seeing a
lot of CPU time consumed while reading data into a DataFrame: as the picture
below shows, more than 50% of CPU time is spent in ShuffleMapTask.

In the picture there are many recursive calls before DataSourceRDD.compute is
called. I would like to understand what is happening in this case, and whether
there is any way to reduce the CPU time spent in ShuffleMapTask.