scala - Spark repartition() function increases number of tasks per executor, how to increase number of executors -


I'm working on an IBM server with 30 GB of RAM (12 cores). I have allocated cores to Spark, but it still uses only 1 core. I tried specifying the partition count while loading the file, and the command succeeded:

val name_db_rdd = sc.textFile("input_file.csv", 12)

With this I am able to use all 12 cores for the processing at the start of the job, but I also want the work to be split between executors for the intermediate operations, so that all 12 cores are used throughout.


val new_rdd = rdd.repartition(12) 


As you can see in the screenshot, only 1 executor is running, and the repartition function just split the data into many tasks on that single executor.
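For reference, repartition() only changes how many partitions (and therefore tasks) the job has, not how many executors run them, which is why all the tasks still land on one executor. A minimal sketch to confirm the partition counts (reusing the file name from above):

val rdd = sc.textFile("input_file.csv", 12)
println(rdd.partitions.size)       // 12: partitions requested at load time

val new_rdd = rdd.repartition(12)
println(new_rdd.partitions.size)   // still 12; each partition becomes one task,
                                   // but the executor count is fixed at launch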

It depends on how you're launching the job, but you will want to add --num-executors to the command line when you launch your Spark job.

Something like:

spark-submit \
    --num-executors 10 \
    --driver-memory 2g \
    --executor-memory 2g \
    --executor-cores 1 \

might work for you.

Have a look at Running Spark on YARN for more details, though some of the switches mentioned there are YARN-specific.
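If you would rather not repeat the flags on every launch, the same settings map to standard Spark configuration properties that can be set programmatically or in spark-defaults.conf. A sketch (the app name is made up; spark.executor.instances is the config-property equivalent of --num-executors on YARN):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("RepartitionExample")        // hypothetical app name
  .set("spark.executor.instances", "10")   // equivalent of --num-executors 10
  .set("spark.executor.memory", "2g")      // equivalent of --executor-memory 2g
  .set("spark.executor.cores", "1")        // equivalent of --executor-cores 1

val sc = new SparkContext(conf)

// Driver memory is the exception: the driver JVM is already running by the
// time this code executes, so set it via --driver-memory or in
// spark-defaults.conf instead.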

