pyspark.sql.DataFrame.to
- DataFrame.to(schema)
- Returns a new DataFrame where each row is reconciled to match the specified schema.
- New in version 3.4.0.
- Parameters
- schema : StructType
- Specified schema. 
 
- Returns
- DataFrame
- Reconciled DataFrame. 
 
- Notes
- Reorder columns and/or inner fields by name to match the specified schema.
- Project away columns and/or inner fields that are not needed by the specified schema.
- Missing columns and/or inner fields (present in the specified schema but not in the input DataFrame) lead to failures.
 
- Cast the columns and/or inner fields to match the data types in the specified schema, if the types are compatible, e.g., numeric to numeric (error on overflow), but not string to int.
 
- Carry over the metadata from the specified schema, while the columns and/or inner fields still keep their own metadata if not overwritten by the specified schema.
 
- Fail if the nullability is not compatible. For example, the column and/or inner field is nullable but the specified schema requires it to be non-nullable.
 
- Supports Spark Connect.
- Examples
>>> from pyspark.sql.types import StructField, StringType, StructType
>>> df = spark.createDataFrame([("a", 1)], ["i", "j"])
>>> df.schema
StructType([StructField('i', StringType(), True), StructField('j', LongType(), True)])

>>> schema = StructType([StructField("j", StringType()), StructField("i", StringType())])
>>> df2 = df.to(schema)
>>> df2.schema
StructType([StructField('j', StringType(), True), StructField('i', StringType(), True)])
>>> df2.show()
+---+---+
|  j|  i|
+---+---+
|  1|  a|
+---+---+