
CREATE EXTERNAL TABLE does not save schema [Spark] #165

Closed
osopardo1 opened this issue Feb 16, 2023 · 0 comments · Fixed by #168
Labels
type: bug Something isn't working

Comments

@osopardo1 (Member)

What went wrong?

When creating a table with CREATE EXTERNAL TABLE over an already existing qbeast-formatted location, the schema is not saved properly in the Glue Catalog.


How to reproduce?

1. Code that triggered the bug, or steps to reproduce:

spark.sql("""CREATE EXTERNAL TABLE tpc_ds_1gb_qbeast_store_sales
USING qbeast
LOCATION "/tmp/store_sales"
OPTIONS ('columnsToIndex'='ss_sold_date_sk,ss_item_sk')""")

And then execute:

spark.sql("""SELECT * FROM tpc_ds_1gb_qbeast_store_sales""").show()

Throws the following error:

org.apache.spark.sql.AnalysisException: Unable to resolve ss_sold_date_sk given []
  at org.apache.spark.sql.errors.QueryCompilationErrors$.cannotResolveAttributeError(QueryCompilationErrors.scala:1020)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.$anonfun$resolve$3(LogicalPlan.scala:91)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.$anonfun$resolve$1(LogicalPlan.scala:90)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
  at scala.collection.Iterator.foreach(Iterator.scala:943)

And when describing the table:

spark.sql("DESCRIBE EXTENDED tpc_ds_1gb_qbeast_store_sales").show()

Only the properties appear:

+--------------------+--------------------+-------+
|            col_name|           data_type|comment|
+--------------------+--------------------+-------+
|                    |                    |       |
|      # Partitioning|                    |       |
|     Not partitioned|                    |       |
|                    |                    |       |
|# Detailed Table ...|                    |       |
|                Name|tpc_ds_1gb_qbeast...|       |
|            Location|s3://qbeast-priva...|       |
|            Provider|              qbeast|       |
|               Owner|              hadoop|       |
|    Table Properties|[columnsToIndex=s...|       |
+--------------------+--------------------+-------+
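Until the schema persistence is fixed, one possible workaround (untested here; the column list and types below are assumed from the standard TPC-DS store_sales definition, not taken from this issue) is to declare the schema explicitly in the DDL, so the catalog entry carries the columns regardless of what the connector persists:

```sql
-- Hypothetical workaround: declare the columns explicitly in the DDL.
-- Column types are assumed from the TPC-DS store_sales schema.
CREATE EXTERNAL TABLE tpc_ds_1gb_qbeast_store_sales (
  ss_sold_date_sk INT,
  ss_item_sk      INT
  -- ... remaining store_sales columns ...
)
USING qbeast
LOCATION "/tmp/store_sales"
OPTIONS ('columnsToIndex'='ss_sold_date_sk,ss_item_sk')
```

This sidesteps the bug only if the connector honors a user-supplied schema for external tables; whether it does is an assumption here.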

2. Branch and commit id:

Main at 49163e9

3. Spark version:

In the Spark shell, run spark.version.

3.2.2

4. Hadoop version:

In the Spark shell, run org.apache.hadoop.util.VersionInfo.getVersion().

3.3.1

5. How are you running Spark?

Are you running Spark inside a container? Are you launching the app on a remote K8s cluster? Or are you just running the tests on a local machine?

EMR cluster

6. Stack trace:

See the AnalysisException trace in the reproduction steps above.

@osopardo1 osopardo1 added the type: bug Something isn't working label Feb 16, 2023
@osopardo1 osopardo1 changed the title CREATE EXTERNAL TABLE does not save schema CREATE EXTERNAL TABLE does not save schema [Spark] Feb 23, 2023