phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vikas Agarwal <vi...@infoobjects.com>
Subject Re: Recursive queries?
Date Tue, 07 Oct 2014 07:09:12 GMT
Hi James,

Just out of curiosity, can you predict the expected time in running such
queries? I mean what should be the acceptable response time in this case.
My concern is that if someone has to use these type of queries for real
time analytics or showing the results on web page in standard web
application, would it able to return results in acceptable time (in
seconds)?

There can be different load situations e.g. there can be million of trees
with limited depth (eg.  < 100) OR there can be hundreds of trees with
depth in millions. Although, I am not able to think of any practical use
case for the later, still I would like to understand the performance issues
in such cases. :)

On Fri, Sep 26, 2014 at 3:40 AM, James Taylor <jamestaylor@apache.org>
wrote:

> Hey JM,
>
> My idea for processing this kind of query assumes that the tree does
> not have cycles and would do one query per depth of the hierarchy.
> Something like the following, with targetTable having a DEPTH and ID
> column:
>
> int collectChildren(long rootId) {
>     connection.setAutoCommit(true)
>     int depth = 0;
>     // seed our temp table:
>     upsert into targetTable(depth, id) values (0, <rootId>);
>     do {
>         depth++;
>         rowCount =
>             upsert into tmp.target(depth, id)
>             select <depth>,id
>             from targetTable t join sourceTable s
>             on (t.id = s.id and t.depth=<depth - 1>)
>     } while (rowCount > 0);
>
>     return depth;
> }
>
> I suppose if there were cycles, you could add a WHERE NOT IN clause.
>
> Thanks,
> James
>
> On Thu, Sep 25, 2014 at 5:38 AM, Jean-Marc Spaggiari
> <jean-marc@spaggiari.org> wrote:
> > Hi James,
> >
> > Thanks for the feedback. My knowledge of Phoenix and SQL is not good
> enough
> > for now to jump on such a big patch.
> >
> > Regarding your suggesting with UPSERT SELECT and temp table, the
> question is
> > when do we know we should stop.
> >
> > I upsert select from the first level, then do that again with the IN from
> > this temp table, and then again and again, but at some point I'm at the
> > bottom of the tree so I can stop, but how do I know? Does UPSERT SELECT
> > return the number of rows inserted into the destination table? If so it
> > might be an option.
> >
> > That might give something like:
> >
> > UPSERT INTO tmp.targetTable(id) SELECT id FROM test.sourceTable WHERE
> parent
> > IN (SELECT id FROM tmp.targetTable) AND id NOT IN (SELECT * FROM
> > tmp.targetTable)
> >
> > Will something like that work? Bascially, I insert all rows which are not
> > already there and which have the parent in the temp table. I just need to
> > populate the table with the top node and iterate until this request
> doesn't
> > give me any result back.
> >
> > Thanks,
> >
> > JM
> >
> > 2014-09-24 23:43 GMT-04:00 James Taylor <jamestaylor@apache.org>:
> >
> >> Hey JM,
> >> We'd like to support all of SQL-99 eventually, so based on that, it's
> >> on our roadmap. Like most open source projects, we'd look for a
> >> volunteer to take this on - it certainly meets the criteria of being
> >> interesting.
> >>
> >> I think priority-wise, it's lower than most of the join work
> >> identified on our Roadmap (http://phoenix.apache.org/roadmap.html).
> >> Would it be feasible to drive this through multiple client queries (1
> >> per level) using the IN clause support we have (i.e. by generating a
> >> query)? You could use UPSERT SELECT to dump the IDs you get back at
> >> each level into a temp table if need be and join against it for the
> >> next query.
> >>
> >> Thanks,
> >> James
> >>
> >> On Wed, Sep 24, 2014 at 1:08 PM, Jean-Marc Spaggiari
> >> <jean-marc@spaggiari.org> wrote:
> >> > Hi,
> >> >
> >> > We have something like this that we want to "translate" into Phoenix
> >> > (snippet):
> >> >
> >> >
> >> >
> >> > RETURN QUERY WITH RECURSIVE first_level AS (
> >> > -- non-recursive term
> >> >  (
> >> >     SELECT a.id AS id FROM asset a
> >> >        WHERE a.parent_id = p_id AND TYPE = 2
> >> > )
> >> > UNION
> >> >   -- Recursive Term
> >> >   SELECT a.id AS id FROM first_level fflf, asset a
> >> >      WHERE a.parent_id = flf.id AND type = 2
> >> > )
> >> >
> >> >
> >> > Basically, let's consider we have millions of trees stored into HBase.
> >> > For
> >> > any node, we want to get all the children recursively.
> >> >
> >> > Is that something we can translate to Phoenix? If not, is it in the
> >> > roadmap?
> >> >
> >> > Thanks,
> >> >
> >> > JM
> >
> >
>



-- 
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax

Mime
View raw message