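I registered a temp table from a DataFrame whose column headers contain spaces. How can I select such a column when querying with SQL through sqlContext? I tried using backticks, but it does not work: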
df1 = sqlContext.sql("""select Company, Sector, Industry, `Altman Z-score as Z_Score` from tmp1 """)
You only need to put the column name in backticks, not its alias:
Without a table alias:
df1 = sqlContext.sql("""select Company, Sector, Industry, `Altman Z-score` as Z_Score from tmp1""")
With a table alias:
df1 = sqlContext.sql("""select t1.Company, t1.Sector, t1.Industry, t1.`Altman Z-score` as Z_Score from tmp1 t1""")
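The rule above (backticks around the column name only, with the alias outside) can also be applied when the query string is built programmatically. The following is a minimal sketch in plain Python; `quote_identifier` and `select_with_alias` are hypothetical helper names, not part of PySpark, and doubling an embedded backtick is assumed to be the escaping convention for Spark SQL identifiers.

```python
def quote_identifier(name):
    """Wrap a column name in backticks.

    Assumption: an embedded backtick is escaped by doubling it.
    """
    return "`" + name.replace("`", "``") + "`"


def select_with_alias(table, columns):
    """Build a SELECT statement (hypothetical helper).

    `columns` maps each source column name to its alias, or to None
    when no alias is wanted. Only the column name is quoted; the
    alias stays outside the backticks.
    """
    parts = []
    for col, alias in columns.items():
        expr = quote_identifier(col)
        if alias:
            expr += " as " + alias
        parts.append(expr)
    return "select " + ", ".join(parts) + " from " + table


query = select_with_alias(
    "tmp1",
    {"Company": None, "Sector": None, "Industry": None, "Altman Z-score": "Z_Score"},
)
print(query)
```

The resulting string could then be passed to sqlContext.sql(query) in the same way as the hand-written queries above.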
The problem is in the query itself: `as Z_Score` was wrapped inside the backticks together with the column name. The corrected query is:
df1 = sqlContext.sql("""select Company, Sector, Industry, `Altman Z-score` as Z_Score from tmp1 """)
There is also an alternative that uses the DataFrame API instead of quoting in SQL:
import pyspark.sql.functions as F
df1 = sqlContext.sql("""select * from tmp1""")
df1.select(F.col("Altman Z-score").alias("Z_Score")).show()