phoenix-user mailing list archives

From Paul Jones <pajo...@adobe.com>
Subject Phoenix spark and dynamic columns
Date Mon, 25 Jul 2016 21:49:24 GMT
Is it possible to save a dataframe into a table where the columns are dynamic?

For instance, I have loaded a CSV file with a header (key, cat1, cat2) into a dataframe. All
values are strings. I created a table like this: create table mytable ("KEY" varchar not null
primary key); The code is as follows:

    val df = sqlContext.read
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .option("delimiter", "\t")
        .load("saint.tsv")
    
    df.write
        .format("org.apache.phoenix.spark")
        .mode("overwrite")
        .option("table", "mytable")
        .option("zkUrl", "servier:2181/hbase")
        .save()

The CSV files I process always have a key column, but I don’t know what the other columns
will be until I start processing. The code above fails for my example unless I create static columns
named cat1 and cat2. Can I change the save somehow to run an upsert that specifies the column
names and types, thus saving into dynamic columns?
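For context, the kind of statement I have in mind uses Phoenix's dynamic-columns syntax, where
undeclared columns are listed inline with their types in the UPSERT. A sketch of building such a
statement from the header (assuming, as in my example, that every non-key column is a VARCHAR;
the helper name dynamicUpsertSql is mine):

    def dynamicUpsertSql(table: String, keyCol: String, otherCols: Seq[String]): String = {
      // Phoenix dynamic columns are declared inline with their types, e.g.
      //   UPSERT INTO mytable ("KEY", "cat1" VARCHAR, "cat2" VARCHAR) VALUES (?, ?, ?)
      val dynCols = otherCols.map(c => s"\"$c\" VARCHAR")
      val allCols = (s"\"$keyCol\"" +: dynCols).mkString(", ")
      val params  = Seq.fill(otherCols.size + 1)("?").mkString(", ")
      s"UPSERT INTO $table ($allCols) VALUES ($params)"
    }

I imagine executing something like this per partition over a Phoenix JDBC connection with a
PreparedStatement, rather than going through the org.apache.phoenix.spark save path, but I'd
prefer to keep using the dataframe save if it can be made to do this.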

Thanks in advance,
Paul
