Hello,
A user of mine brought up a question today about dynamic columns in Phoenix. Their column
count should level off at a few tens of thousands of columns as their data fills in.
The user wants to query all columns in a table, and today they are considering views to
do this -- but the management is ugly. They would end up with an unbounded number of views,
which would pollute the global catalog and fail relatively quickly.
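To make the view-per-schema pain concrete, here is a rough sketch (table, column, and view names are all made up for illustration): Phoenix requires dynamic columns to be declared at query time, so every distinct column set the user wants to SELECT over ends up needing its own view plus ALTER VIEW calls, each of which lands in the catalog.

```sql
-- Hypothetical base table; dynamic columns are not declared here.
CREATE TABLE events (id VARCHAR PRIMARY KEY);

-- Dynamic columns must be spelled out in each statement that uses them:
UPSERT INTO events (id, temp VARCHAR) VALUES ('a', '72');
SELECT id, temp FROM events (temp VARCHAR);

-- The workaround in question: one view per column set, each a new
-- catalog entry, growing without bound as new columns appear.
CREATE VIEW events_weather AS SELECT * FROM events;
ALTER VIEW events_weather ADD temp VARCHAR;
```

With tens of thousands of columns, the ALTER VIEW churn alone is why this "fails relatively quickly."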
Has anyone thought about the potentially wasteful[1] approach of scanning all rows in a query
to determine the columns, then re-running the query once we know which columns the SQL result
will contain? Maybe something cleaner, like persisting the set of columns in the statistics
table, so that a SELECT * may return columns containing nothing but nulls. Or, better still:
is there an overall better way to model such a wide schema in Phoenix?
-Clay
[1]: Perhaps some heuristics could allow for not needing to do 2n reads in all cases?