phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: missing rows after using performance.py
Date Tue, 08 Sep 2015 19:16:52 GMT
Hi James,
Looks like currently an error log message is generated when a row cannot be
imported (usually because the data isn't compatible with the schema). For
psql.py, this goes to the client-side log, and the messages look like this:
    LOG.error("Error upserting record {}: {}", csvRecord, errorMessage);
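
To make that behavior concrete, here's a minimal, self-contained JDBC sketch
of the same log-and-continue pattern. This is not Phoenix's actual CSV loader
code; the table name MY_TABLE and the column layout are made up for
illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    public class CsvLoadSketch {
        // Upserts each record; a bad record is logged and skipped, and the
        // load keeps going -- so the only trace of a dropped row is the
        // client-side error log line.
        public static void loadRecords(Connection conn, List<String[]> records)
                throws SQLException {
            try (PreparedStatement stmt = conn.prepareStatement(
                    "UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)")) {
                for (String[] record : records) {
                    try {
                        stmt.setInt(1, Integer.parseInt(record[0]));
                        stmt.setString(2, record[1]);
                        stmt.execute();
                    } catch (NumberFormatException | SQLException e) {
                        System.err.println("Error upserting record "
                                + String.join(",", record) + ": "
                                + e.getMessage());
                    }
                }
            }
            conn.commit();
        }

        public static void main(String[] args) throws SQLException {
            try (Connection conn =
                    DriverManager.getConnection("jdbc:phoenix:localhost")) {
                conn.setAutoCommit(false);
                loadRecords(conn, List.of(
                        new String[] {"1", "ok"},
                        new String[] {"x", "bad type"}, // logged, then skipped
                        new String[] {"2", "ok"}));
            }
        }
    }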

FWIW, we have a "strict" option for CSV loading (the -s or --strict option)
which is meant to cause the load to abort if bad data is found, but it
doesn't look like this option is currently checked when bad data is
encountered. I've filed PHOENIX-2239 for this.
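
For illustration only, here's roughly what honoring the flag could look like
in the per-record upsert from the sketch above. This is just one possible
shape, not the actual patch for PHOENIX-2239:

    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class StrictLoadSketch {
        // Same per-record upsert, but honoring a strict flag: in strict
        // mode the first bad record aborts the whole load instead of being
        // logged and skipped.
        static void upsertRecord(PreparedStatement stmt, String[] record,
                boolean strict) throws SQLException {
            try {
                stmt.setInt(1, Integer.parseInt(record[0]));
                stmt.setString(2, record[1]);
                stmt.execute();
            } catch (NumberFormatException | SQLException e) {
                if (strict) {
                    // -s / --strict: fail fast rather than drop the row
                    throw new SQLException("Aborting load on bad record: "
                            + String.join(",", record), e);
                }
                System.err.println("Error upserting record "
                        + String.join(",", record) + ": " + e.getMessage());
            }
        }
    }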

Thanks,
James

On Tue, Sep 8, 2015 at 11:26 AM, James Heather <james.heather@mendeley.com>
wrote:

> I've had another go running the performance.py script to upsert
> 100,000,000 rows into a Phoenix table, and again I've ended up with around
> 500 rows missing.
>
> Can anyone explain this, or reproduce it?
>
> It is rather concerning: I'm reluctant to use Phoenix if I'm not sure
> whether rows will be silently dropped.
>
> James
>
