scala - Spark repartition() function increases number of tasks per executor, how to increase number of executors -


I'm working on an IBM server with 30 GB of RAM (12 cores). I have allocated cores to Spark, but it still uses only 1 core. I tried specifying the partition count while loading the file, and the command succeeded:

val name_db_rdd = sc.textFile("input_file.csv", 12)

With this I am able to use all 12 cores for the processing at the start of the job, but I also want the work to be split between executors for the intermediate operations, so that all 12 cores are used throughout.


val new_rdd = rdd.repartition(12) 


As you can see in the screenshot, only 1 executor is running, and the repartition function just split the data into many tasks on that single executor.
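For reference, repartition() only changes how many partitions (and therefore tasks) the job has, not how many executors run them, which is why all the tasks still land on one executor. A minimal sketch to confirm the partition counts (reusing the file name from above):

val rdd = sc.textFile("input_file.csv", 12)
println(rdd.partitions.size)       // 12: partitions requested at load time

val new_rdd = rdd.repartition(12)
println(new_rdd.partitions.size)   // still 12; each partition becomes one task,
                                   // but the executor count is fixed at launch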

It depends on how you're launching the job, but you will want to add --num-executors to the command line when you launch your Spark job.

Something like:

spark-submit \
    --num-executors 10 \
    --driver-memory 2g \
    --executor-memory 2g \
    --executor-cores 1 \

might work for you.

Have a look at Running Spark on YARN for more details, though some of the switches mentioned there are YARN-specific.
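If you would rather not repeat the flags on every launch, the same settings map to standard Spark configuration properties that can be set programmatically or in spark-defaults.conf. A sketch (the app name is made up; spark.executor.instances is the config-property equivalent of --num-executors on YARN):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("RepartitionExample")        // hypothetical app name
  .set("spark.executor.instances", "10")   // equivalent of --num-executors 10
  .set("spark.executor.memory", "2g")      // equivalent of --executor-memory 2g
  .set("spark.executor.cores", "1")        // equivalent of --executor-cores 1

val sc = new SparkContext(conf)

// Driver memory is the exception: the driver JVM is already running by the
// time this code executes, so set it via --driver-memory or in
// spark-defaults.conf instead.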

