scala - Write to disk with Spark Streaming


I am trying to write data from Spark Streaming to disk in Parquet format.

I am getting slow write performance using:

    import org.apache.spark.sql.{SQLContext, SaveMode}

    val stream = ssc.receiverStream(...)

    stream.foreachRDD { rdd =>
      if (rdd.count() > 0) {
        // Singleton instance of SQLContext
        val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
        import sqlContext.implicits._

        // Save models in Parquet format.
        rdd.toDF()
          .write.mode(SaveMode.Append)
          .parquet("../myfile")
      }
    }

I compared this against reading the same data set into memory once and writing it to disk once with Spark, as opposed to the streaming approach used above.

Can anyone tell me why the streaming write is slower and suggest a solution?

