
Commit f939c71

[SPARK-12242][SQL] Add DataFrame.transform method
Author: Reynold Xin <rxin@databricks.com>

Closes apache#10226 from rxin/df-transform.

(cherry picked from commit 76540b6)
Signed-off-by: Reynold Xin <rxin@databricks.com>
1 parent b5e5812 commit f939c71

2 files changed: +14 -1 lines changed


sql/core/src/main/scala/org/apache/spark/sql/Column.scala

Lines changed: 1 addition & 1 deletion
@@ -84,7 +84,7 @@ class TypedColumn[-T, U](
  *   col("`a.column.with.dots`") // Escape `.` in column names.
  *   $"columnName" // Scala short hand for a named column.
  *   expr("a + 1") // A column that is constructed from a parsed SQL Expression.
- *   lit("1") // A column that produces a literal (constant) value.
+ *   lit("abc") // A column that produces a literal (constant) value.
  * }}}
  *
  * [[Column]] objects can be composed to form complex expressions:

sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala

Lines changed: 13 additions & 0 deletions
@@ -1413,6 +1413,19 @@ class DataFrame private[sql](
    */
   def first(): Row = head()

+  /**
+   * Concise syntax for chaining custom transformations.
+   * {{{
+   *   def featurize(ds: DataFrame) = ...
+   *
+   *   df
+   *     .transform(featurize)
+   *     .transform(...)
+   * }}}
+   * @since 1.6.0
+   */
+  def transform[U](t: DataFrame => DataFrame): DataFrame = t(this)
+
   /**
    * Returns a new RDD by applying a function to all rows of this DataFrame.
    * @group rdd
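
The added method only applies its argument to the receiver (`t(this)`), so a `transform` chain reads top to bottom where the equivalent nested calls read inside out. The sketch below illustrates this without a Spark dependency. `Frame`, `withArea`, `largeOnly`, and the column names are hypothetical stand-ins invented for illustration and are not part of this commit or of Spark; only the shape of `transform` mirrors the diff above.

```scala
// Hypothetical stand-in for DataFrame so the sketch runs without Spark.
final case class Frame(rows: Vector[Map[String, Double]]) {
  // Same shape as the method added in this commit: apply t to this value.
  def transform(t: Frame => Frame): Frame = t(this)

  // Minimal helpers (assumptions, not Spark's API) used by the steps below.
  def withColumn(name: String, f: Map[String, Double] => Double): Frame =
    Frame(rows.map(r => r + (name -> f(r))))

  def filter(p: Map[String, Double] => Boolean): Frame =
    Frame(rows.filter(p))
}

object TransformSketch {
  // Two custom transformations, each a Frame => Frame function.
  def withArea(ds: Frame): Frame =
    ds.withColumn("area", r => r("w") * r("h"))

  // A curried parameter list lets a configured step be passed to transform.
  def largeOnly(min: Double)(ds: Frame): Frame =
    ds.filter(r => r("area") >= min)

  val input: Frame = Frame(Vector(
    Map("w" -> 2.0, "h" -> 3.0),
    Map("w" -> 1.0, "h" -> 1.0)
  ))

  // Chained form: steps appear in the order they run.
  val chained: Frame = input
    .transform(withArea)
    .transform(largeOnly(4.0))

  // Equivalent nested form: steps appear in reverse order.
  val nested: Frame = largeOnly(4.0)(withArea(input))
}
```

Both forms produce the same value; the chained form keeps each step on its own line, which is the convenience the doc comment describes.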
