:: Experimental :: A Row representing a mutable aggregation buffer.
This is not meant to be extended outside of Spark.
:: Experimental :: The base class for implementing user-defined aggregate functions (UDAF).
A user-defined function. To create one, use the udf functions in org.apache.spark.sql.functions.
Note that the user-defined functions must be deterministic. Due to optimization,
duplicate invocations may be eliminated or the function may even be invoked more times than
it is present in the query.
As an example:
    import org.apache.spark.sql.functions.udf

    // Define a UDF that returns true or false based on a numeric score.
    val predict = udf((score: Double) => score > 0.5)

    // Project a prediction column computed from the score column.
    df.select(predict(df("score")))
Since 1.3.0
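The determinism requirement can be illustrated without Spark. The sketch below is plain Scala, and every name in it (DeterminismSketch, flaky, evalOnce, evalTwice) is hypothetical. It models two strategies an optimizer might choose between, evaluating an expression once and reusing the result or evaluating it twice, and shows that the two only agree when the function is deterministic.

```scala
// Plain-Scala illustration; no Spark classes are used and all names are hypothetical.
object DeterminismSketch {
  // Deterministic: the result depends only on the argument.
  val predict: Double => Boolean = score => score > 0.5

  // Non-deterministic: hidden mutable state changes the result on every call.
  private var calls = 0
  val flaky: Double => Int = _ => { calls += 1; calls }

  // Strategy 1: evaluate once and reuse the value (duplicate invocation eliminated).
  def evalOnce[A, B](f: A => B, x: A): (B, B) = { val y = f(x); (y, y) }

  // Strategy 2: evaluate at every occurrence (function invoked more than once).
  def evalTwice[A, B](f: A => B, x: A): (B, B) = (f(x), f(x))

  def main(args: Array[String]): Unit = {
    // A deterministic function gives the same pair under both strategies.
    assert(evalOnce(predict, 0.7) == evalTwice(predict, 0.7))

    // A non-deterministic function does not: one pair has equal components,
    // the other has two different values.
    val once = evalOnce(flaky, 0.7)
    val twice = evalTwice(flaky, 0.7)
    println(s"once=$once twice=$twice")
  }
}
```

Because the query plan, not the query text, decides which strategy runs, a function like flaky could produce different results from one execution to the next.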
:: Experimental :: Utility functions for defining windows in DataFrames.
    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date").rowsBetween(Long.MinValue, 0)

    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
Since 1.4.0
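To make the frame boundaries concrete, here is a plain-Scala sketch of what a ROWS BETWEEN frame selects. It does not use Spark. Sale and frameSum are hypothetical names introduced only for this illustration, and an empty frame sums to 0 here, whereas SQL would return NULL. Offsets are relative to the current row within its partition, so (Long.MinValue, 0) yields a running total and (-3, 3) a centered seven-row total.

```scala
// Plain-Scala model of ROWS BETWEEN frames; Spark is not required.
object FrameSketch {
  // Hypothetical row type used only for this illustration.
  final case class Sale(country: String, date: String, amount: Long)

  // For each row, sums `amount` over the rows at offsets [start, end] relative to it,
  // after PARTITION BY country ORDER BY date. Partitions are emitted in country order.
  def frameSum(rows: Seq[Sale], start: Long, end: Long): Seq[(Sale, Long)] =
    rows.groupBy(_.country).toSeq.sortBy(_._1).flatMap { case (_, partition) =>
      val sorted = partition.sortBy(_.date)
      val n = sorted.length.toLong
      // Bound the offsets to [-n, n] first so Long.MinValue/MaxValue cannot overflow.
      val s = math.max(-n, math.min(n, start))
      val e = math.max(-n, math.min(n, end))
      sorted.zipWithIndex.map { case (row, i) =>
        val lo = math.max(0L, i + s).toInt      // first row index inside the frame
        val hi = math.min(n - 1, i + e).toInt   // last row index inside the frame
        row -> sorted.slice(lo, hi + 1).map(_.amount).sum
      }
    }

  def main(args: Array[String]): Unit = {
    val sales = Seq(
      Sale("US", "2015-01-03", 3), Sale("US", "2015-01-01", 1),
      Sale("US", "2015-01-02", 2), Sale("DE", "2015-01-01", 10),
      Sale("DE", "2015-01-02", 20))

    // UNBOUNDED PRECEDING to CURRENT ROW: running total per country.
    frameSum(sales, Long.MinValue, 0).foreach(println)
    // 3 PRECEDING to 3 FOLLOWING: the frame is clipped at partition edges.
    frameSum(sales, -3, 3).foreach(println)
  }
}
```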
:: Experimental :: A window specification that defines the partitioning, ordering, and frame boundaries.
Use the static methods in Window to create a WindowSpec.
Since 1.4.0
:: Experimental :: A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
For example, an aggregator can extract an int from each instance of a specific class and add the values up.
Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird
The input type for the aggregation.
The type of the intermediate value of the reduction.
The type of the final output result.
Since 1.6.0
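The three types described above (input, intermediate value, final result) can be seen working together in the following plain-Scala sketch. It does not depend on Spark: SimpleAggregator is a hypothetical stand-in trait that mirrors the zero/reduce/merge/finish shape of an aggregator, and Data, SumInts, and run are likewise illustrative names. The run helper simulates distributed evaluation by reducing each partition separately and then merging the partial results.

```scala
// Plain-Scala sketch of the aggregation contract; all names are hypothetical.
object AggregatorSketch {
  // Stand-in for the contract only; this is not Spark's Aggregator class.
  trait SimpleAggregator[IN, BUF, OUT] {
    def zero: BUF                        // initial intermediate value
    def reduce(b: BUF, a: IN): BUF       // fold one input element into the buffer
    def merge(b1: BUF, b2: BUF): BUF     // combine two intermediate values
    def finish(reduction: BUF): OUT      // turn the final buffer into the result
  }

  final case class Data(i: Int)

  // Extracts the int from each Data and adds the values up.
  object SumInts extends SimpleAggregator[Data, Int, Int] {
    def zero: Int = 0
    def reduce(b: Int, a: Data): Int = b + a.i
    def merge(b1: Int, b2: Int): Int = b1 + b2
    def finish(reduction: Int): Int = reduction
  }

  // Simulates distributed evaluation: reduce each partition, then merge the partials.
  def run[IN, BUF, OUT](agg: SimpleAggregator[IN, BUF, OUT], partitions: Seq[Seq[IN]]): OUT = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(Data(1), Data(2)), Seq(Data(3)), Seq.empty[Data])
    println(run(SumInts, partitions)) // prints 6
  }
}
```

Keeping the intermediate type separate from the output type is what allows aggregations such as an average, where the buffer would hold a sum and a count while the result is a single number.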