Hello Gurus,

I am trying to migrate some PL/SQL code to Hadoop + Apache Phoenix.

See the sample below. It declares a cursor and then iterates over the cursor to update a table.


Cursor declaration:

  cursor cur is
      select id_num,
             txt_num
        from numbers_en join translations using(id_num)
                   left join lang         using(id_lang);


Iterate over the cursor to update table translations:

  for rec in cur loop
    if mod(rec.id_num,2) = 0 then
      update translations set txt_trans = upper(txt_trans)
       where current of cur;
    end if;
  end loop;


The easiest way to achieve this is:

  1. Make a JDBC call to collect a ResultSet.
  2. Iterate through the ResultSet.
  3. For each record in the ResultSet, make a JDBC call for the update.

Disadvantages of this approach:
  - The client becomes a bottleneck, because the complete ResultSet has to come to the client first.
  - The cursor logic does not leverage parallel processing.
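
For reference, the client-side approach described above could look roughly like the sketch below. It is only an illustration under assumptions: the class name ClientSideCursor is made up, the primary key of translations is assumed to be (id_num, id_lang), and the left join to lang is dropped on the assumption that it does not change which translations rows get updated. Phoenix has no UPDATE statement, so the write is expressed as an UPSERT.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

// Hypothetical helper; names and key columns are assumptions, not from the original code.
final class ClientSideCursor {
    // Assumption: translations is keyed by (id_num, id_lang).
    static final String SELECT_SQL =
        "SELECT t.id_num, t.id_lang, t.txt_trans FROM translations t "
      + "JOIN numbers_en n ON n.id_num = t.id_num";

    static final String UPSERT_SQL =
        "UPSERT INTO translations (id_num, id_lang, txt_trans) VALUES (?, ?, ?)";

    // Same condition as mod(rec.id_num, 2) = 0 in the PL/SQL loop.
    static boolean needsUpdate(long idNum) {
        return idNum % 2 == 0;
    }

    // Pulls every joined row to the client and writes the changed rows back.
    static int run(Connection conn, int batchSize) throws SQLException {
        conn.setAutoCommit(false);
        int updated = 0;
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(SELECT_SQL);
             PreparedStatement up = conn.prepareStatement(UPSERT_SQL)) {
            while (rs.next()) {
                Object idNum = rs.getObject(1);
                String txt = rs.getString(3);
                if (txt == null || !needsUpdate(((Number) idNum).longValue())) {
                    continue; // nothing to change for this row
                }
                up.setObject(1, idNum);
                up.setObject(2, rs.getObject(2));
                up.setString(3, txt.toUpperCase(Locale.ROOT));
                up.executeUpdate();
                if (++updated % batchSize == 0) {
                    conn.commit(); // flush buffered upserts in batches
                }
            }
            conn.commit(); // flush the remainder
        }
        return updated;
    }
}
```

Every row travels to the client and back here, which is exactly the bottleneck listed above.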


Please help me implement the same logic without this bottleneck, using Apache Phoenix + HBase + an HBase coprocessor.
(For example, a coprocessor on the query/view/table scan, with the update operation performed inside the coprocessor logic.)
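
One direction that may be worth checking before writing a custom coprocessor is a set-based UPSERT ... SELECT. According to the Phoenix grammar reference, when auto commit is on, the target table is the same as the source table, and the select does no aggregation, the population is done on the server side. The sketch below is only an illustration under assumptions: the class name ServerSideUpsert is made up, the primary key of translations is assumed to be (id_num, id_lang), and the join to numbers_en is left out on the assumption that every translations.id_num exists in numbers_en. If that join really filters rows, it would have to be added back as a join or subquery, and I am not sure the statement would then still run server-side.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; names and key columns are assumptions, not from the original code.
final class ServerSideUpsert {
    // Rewrites the cursor loop as one statement: upper-case txt_trans
    // for every row whose id_num is even.
    static final String UPSERT_SELECT =
        "UPSERT INTO translations (id_num, id_lang, txt_trans) "
      + "SELECT id_num, id_lang, UPPER(txt_trans) FROM translations "
      + "WHERE id_num % 2 = 0";

    // Auto commit is switched on because the documented server-side path requires it.
    static int run(Connection conn) throws SQLException {
        conn.setAutoCommit(true);
        try (Statement st = conn.createStatement()) {
            return st.executeUpdate(UPSERT_SELECT);
        }
    }
}
```

With this form no rows need to pass through the client, and the work is spread over the region servers that hold the table.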

Any guidance is highly appreciated.
   



Regards
Sanjiv Singh
Mob :  +1 571-599-5236