phoenix-user mailing list archives

From Ben Liang <>
Subject Re: How to Manage Data Architecture & Modeling for HBase
Date Mon, 06 Apr 2015 13:34:26 GMT
Thank you for your prompt reply.

In my daily work, I have mainly used Oracle DB to build data warehouses with star-schema data
modeling, for financial analysis and marketing analysis.
Now I am trying to use HBase to do the same.

I have two questions:
1) Many tables from the ERP need to be loaded incrementally every day, including some inserts and
some updates. Is HBase appropriate for building a data warehouse in this scenario?
2) Are there any case studies of enterprise BI solutions built on HBase?
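
To make the daily-load scenario in question 1 concrete, here is a minimal sketch. It assumes HBase-style write semantics, where a write to an existing row key and column replaces the visible value (newest timestamp wins) and a write to a new row key simply creates the row, so inserts and updates go through the same operation. The table is simulated with a plain Python dict rather than a real HBase or Phoenix client, and the names `apply_daily_delta`, `order-1`, `d:status`, and the sample values are made up for illustration.

```python
from typing import Dict, Iterable, Tuple

# Simulated table: row key -> {column -> (timestamp, value)}.
# This stands in for an HBase table; it is not a client API.
Table = Dict[str, Dict[str, Tuple[int, str]]]


def apply_daily_delta(
    table: Table,
    delta: Iterable[Tuple[str, Dict[str, str]]],
    ts: int,
) -> None:
    """Apply one day's extract; new and changed rows are written the same way."""
    for row_key, columns in delta:
        row = table.setdefault(row_key, {})
        for column, value in columns.items():
            current = row.get(column)
            # Newest timestamp wins; an older write never replaces a newer one.
            if current is None or ts >= current[0]:
                row[column] = (ts, value)


table: Table = {}

# Day 1: one new order (hypothetical data).
apply_daily_delta(
    table,
    [("order-1", {"d:status": "OPEN", "d:amount": "100"})],
    ts=20150405,
)

# Day 2: order-1 is updated and order-2 is inserted in the same batch.
apply_daily_delta(
    table,
    [
        ("order-1", {"d:status": "CLOSED"}),
        ("order-2", {"d:status": "OPEN", "d:amount": "40"}),
    ],
    ts=20150406,
)
```

After the second load, `order-1` keeps its day-1 amount but carries the day-2 status, and `order-2` exists as a new row. The loader never has to know in advance which rows are inserts and which are updates.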


Ben Liang

> On Apr 6, 2015, at 20:27, Michael Segel <> wrote:
> Yeah. Jean-Marc is right. 
> You have to think more in terms of a hierarchical model where you’re modeling records,
> not relationships.
> Your model would look like a single ER box per record type. 
> The HBase schema is very simple. Tables, column families, and that’s it for static
> structures. Even then, column families tend to get misused.
> If you’re looking at a relational model… Phoenix or Splice Machine would allow you
> to do something… although Phoenix is still VERY primitive.
> (Do they take advantage of cell versioning like Splice Machine yet?)
> There are a couple of interesting things where you could create your own modeling tool
> / syntax (relationships)…
> 1) HBase is more 3D than RDBMS 2D and similar to ORDBMSs. 
> 2) You can join entities on either a FK principle or on a weaker relationship type. 
> HBase stores CLOBs/BLOBs in each cell. It’s all just byte arrays with a finite bounded
> length not to exceed the size of a region. So you could store an entire record as a CLOB within
> a cell. It’s in this sense, that a cell can represent multiple attributes of your object/record,
> that you gain an additional dimension, and it is why you only need to use a single data type.
> HBase and Hadoop in general allow one to join orthogonal data sets that have a weak relationship.
> So while you can still join sets against a FK, which implies a relationship, you don’t have
> to do it.
> Imagine you wanted to find out the average cost of a front-end car collision for
> college-aged drivers, broken down by major.
> You would be joining insurance records against registrations for all of the universities
> in the US for those students between the ages of 17 and 25.
> How would you model this when in fact neither defining attribute is a FK? 
> (This is why you need a good Secondary Indexing implementation and not something brain
> dead that wasn’t alcohol induced. ;-) )
> Does that make sense? 
> Note: I don’t know if anyone like CCCis, Allstate, State Farm, or Progressive Insurance
> is doing anything like this. But they could.
>> On Apr 5, 2015, at 7:54 PM, Jean-Marc Spaggiari <> wrote:
>> Not sure you want to ever do that... Designing an HBase application is far
>> different from designing an RDBMS one. Not sure those tools fit well here.
>> What's your goal? Designing your HBase schema somewhere and then letting the
>> tool generate your HBase tables?
>> 2015-04-05 18:26 GMT-04:00 Ben Liang <>:
>>> Hi all,
>>>       Do you have any tools to manage Data Architecture & Modeling for
>>> HBase (or Phoenix)? Can we use PowerDesigner or ERwin to do it?
>>>       Please give me some advice.
>>> Regards,
>>> Ben Liang
> The opinions expressed here are mine, while they may reflect a cognitive thought, that
> is purely accidental.
> Use at your own risk. 
> Michael Segel
> michael_segel (AT)
