scala - Write to disk with Spark Streaming
I am trying to write data from Spark Streaming to disk in Parquet format.
I am getting slow write results using:
    import org.apache.spark.sql.{SQLContext, SaveMode}

    val stream = ssc.receiverStream(...)
    stream.foreachRDD { rdd =>
      if (rdd.count() > 0) {
        // Singleton instance of SQLContext
        val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
        import sqlContext.implicits._

        // Save models in Parquet format.
        rdd.toDF()
          .write.mode(SaveMode.Append)
          .parquet("../myfile")
      }
    }
I compared this against reading the same data set with Spark in a single pass, once from memory and once from disk, as opposed to the streaming approach used above.
Can anyone tell me why this is slow, and suggest a solution?
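
Two things in the snippet above might contribute to the slowness, though nothing in the post confirms either. First, rdd.count() runs a full job over the batch before the write runs a second one. Second, each micro-batch append writes one Parquet file per partition, which can produce many small files. A minimal sketch of one way to reduce both costs follows. It assumes Spark 1.5 or later. The Event case class, the ParquetSink object and the filesPerBatch parameter are hypothetical names introduced only for illustration.

```scala
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.streaming.dstream.DStream

// Hypothetical record type; the post does not show what the stream carries.
case class Event(id: Long, payload: String)

object ParquetSink {
  // filesPerBatch is an assumed tuning knob, not something from the post.
  def save(stream: DStream[Event], path: String, filesPerBatch: Int = 1): Unit =
    stream.foreachRDD { rdd =>
      // isEmpty() only looks for a first element instead of counting every record.
      if (!rdd.isEmpty()) {
        val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
        import sqlContext.implicits._

        rdd.toDF()
          .coalesce(filesPerBatch) // fewer partitions, so fewer files per batch
          .write.mode(SaveMode.Append)
          .parquet(path)
      }
    }
}
```

It would be called as ParquetSink.save(stream, "../myfile") once the stream has been mapped to Event. Coalescing to very few partitions also reduces write parallelism, so the right value depends on batch size. A longer batch interval is another option to try.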