pyspark.BarrierTaskContext¶
class pyspark.BarrierTaskContext[source]¶

A TaskContext with extra contextual info and tooling for tasks in a barrier stage. Use BarrierTaskContext.get() to obtain the barrier context for a running barrier task.

New in version 2.4.0.
Notes
This API is experimental.
Examples
Set a barrier and execute it with an RDD.

>>> from pyspark import BarrierTaskContext
>>> def block_and_do_something(itr):
...     taskcontext = BarrierTaskContext.get()
...     # Do something.
...
...     # Wait until all tasks finished.
...     taskcontext.barrier()
...
...     return itr
...
>>> rdd = spark.sparkContext.parallelize([1])
>>> rdd.barrier().mapPartitions(block_and_do_something).collect()
[1]
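The barrier() call above has classic rendezvous semantics: every task in the stage blocks until all tasks arrive, and only then do they all proceed. As a rough, Spark-free sketch of that behavior (an analogy built on the standard library's threading.Barrier, not the PySpark API itself):

```python
import threading

NUM_TASKS = 4  # stands in for the number of tasks in the barrier stage
barrier = threading.Barrier(NUM_TASKS)
results = []
lock = threading.Lock()

def task(task_id):
    # Phase 1: per-task work ("Do something" in the example above).
    partial = task_id * task_id
    # Rendezvous: block until all tasks reach this point,
    # analogous to taskcontext.barrier().
    barrier.wait()
    # Phase 2: runs only after every task has finished phase 1.
    with lock:
        results.append(partial)

threads = [threading.Thread(target=task, args=(i,)) for i in range(NUM_TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [0, 1, 4, 9]
```

Unlike ordinary stages, no task can complete phase 2 while another is still in phase 1, which is the coordination guarantee a barrier stage provides.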
Methods
allGather([message])
    This function blocks until all tasks in the same stage have reached this routine.

attemptNumber()
    How many times this task has been attempted.

barrier()
    Sets a global barrier and waits until all tasks in this stage hit this barrier.

cpus()
    CPUs allocated to the task.

get()
    Return the currently active BarrierTaskContext.

getLocalProperty(key)
    Get a local property set upstream in the driver, or None if it is missing.

getTaskInfos()
    Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.

partitionId()
    The ID of the RDD partition that is computed by this task.

resources()
    Resources allocated to the task.

stageId()
    The ID of the stage that this task belongs to.

taskAttemptId()
    An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).
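allGather extends the barrier rendezvous with a message exchange: each task contributes a message, and after all tasks arrive, each one receives every task's message. The following self-contained sketch mimics those semantics with threading.Barrier (an illustrative analogy; the helper all_gather and its ordering by task id are assumptions of the sketch, not the PySpark implementation):

```python
import threading

NUM_TASKS = 3  # stands in for the number of tasks in the barrier stage
barrier = threading.Barrier(NUM_TASKS)
messages = [None] * NUM_TASKS   # one slot per task, ordered by task id
gathered = [None] * NUM_TASKS   # what each task receives back

def all_gather(task_id, message):
    # Deposit this task's message, then wait for every other task.
    messages[task_id] = message
    barrier.wait()
    # After the rendezvous, all slots are filled; every task gets the
    # full list, analogous to BarrierTaskContext.allGather(message).
    return list(messages)

def task(task_id):
    gathered[task_id] = all_gather(task_id, f"msg-from-{task_id}")

threads = [threading.Thread(target=task, args=(i,)) for i in range(NUM_TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(gathered[0])  # ['msg-from-0', 'msg-from-1', 'msg-from-2']
```

Every task receives the same ordered list, so a barrier stage can use this pattern to share per-task state (for example, worker addresses) before moving on.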