Hi team,
I'm encountering an issue when trying to load a Delta Sharing table into a pandas DataFrame using load_as_pandas. Here's a minimal reproducible example:
!pip3 install --upgrade delta-sharing
import delta_sharing
# Point to the profile file. It can be a file on the local file system or a file on a remote storage.
profile_file = "/content/config.share"
# Create a SharingClient.
client = delta_sharing.SharingClient(profile_file)
# List all shared tables – this returns [Table(name='test', share='test2', schema='test')]
print(client.list_all_tables())
# Attempt to load a specific table
table_url = profile_file + "#test2.test.test"
delta_sharing.load_as_pandas(table_url)
While client.list_all_tables() works as expected, calling load_as_pandas on some shares results in the following error:
ArrowInvalid: External error: Arrow error: Parquet error: Could not parse metadata: bad data
Any insights into what might be causing this or how to resolve it would be greatly appreciated!
Thanks in advance!
EDIT 1: After some more testing, this only seems to happen on managed tables. External tables work fine.
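To narrow down which tables are affected, something like the sketch below could be run against the same profile. It tries to load every table returned by `list_all_tables()` and records the error message for each one that fails. The helper names (`table_url`, `probe_tables`, `probe_profile`) are made up for this illustration and are not part of `delta_sharing`. It only assumes that the listed `Table` objects expose `share`, `schema` and `name`, as in the output above. Each table is loaded in full, so this may be slow on large shares.

```python
def table_url(profile_file, table):
    """Build the '<profile>#<share>.<schema>.<table>' URL used by load_as_pandas."""
    return f"{profile_file}#{table.share}.{table.schema}.{table.name}"


def probe_tables(profile_file, tables, loader):
    """Try to load each table; map its URL to None on success or to the error text."""
    results = {}
    for table in tables:
        url = table_url(profile_file, table)
        try:
            loader(url)
        except Exception as exc:  # broad on purpose: the goal is only to report failures
            results[url] = f"{type(exc).__name__}: {exc}"
        else:
            results[url] = None
    return results


def probe_profile(profile_file):
    """Hypothetical convenience wrapper that wires the probe to delta_sharing."""
    import delta_sharing  # imported lazily so the helpers above work without it

    client = delta_sharing.SharingClient(profile_file)
    return probe_tables(
        profile_file, client.list_all_tables(), delta_sharing.load_as_pandas
    )
```

Calling `probe_profile("/content/config.share")` and printing the entries whose value is not `None` should show whether the failures line up with the managed tables.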