Describe your new request in detail
This is already kind of known; filing it as a separate issue so it's easy to track down.
It's currently not possible to enforce a schema on an OracleArrowArray. Example:
import pyarrow as pa
# conn is assumed to be an already-open python-oracledb connection
oracle_df = conn.fetch_df_all(statement="select 1 n1 from dual", arraysize=2)
df = pa.Table.from_arrays(arrays=oracle_df.column_arrays(), schema=pa.schema([("n1", pa.int8())]))
File "pyarrow/table.pxi", line 4893, in pyarrow.lib.Table.from_arrays
File "pyarrow/table.pxi", line 1622, in pyarrow.lib._sanitize_arrays
File "pyarrow/array.pxi", line 405, in pyarrow.lib.asarray
File "pyarrow/array.pxi", line 273, in pyarrow.lib.array
File "src/oracledb/interchange/nanoarrow_bridge.pyx", line 500, in oracledb.interchange.nanoarrow_bridge.OracleArrowArray.__arrow_c_array__
NotImplementedError: requested_schema
This would be especially useful when casting to ints or date32. There are workarounds, but they are a bit annoying (and slow).
Give supporting information about tools and operating systems. Give relevant product version numbers
Not applicable
If the caller requests a schema that is not compatible with the data, say requesting a schema with a different number of fields, the callee should raise an exception. The requested schema mechanism is only meant to negotiate between different representations of the same data and not to allow arbitrary schema transformations.
I guess, for compatibility, we need to map each Arrow data type to its alternative representations (if it supports multiple representations of the same data).
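A minimal sketch of what such a mapping could look like, using plain type-name strings rather than any real driver or pyarrow API. The names ALTERNATIVES and negotiate are hypothetical, the table is deliberately incomplete, and keeping the native type when no alternative is known is an assumption of this sketch.

```python
# Hypothetical table: native type name -> other representations of the same data.
ALTERNATIVES = {
    "string": {"large_string", "string_view"},
    "binary": {"large_binary", "binary_view"},
    "list": {"large_list"},
}


def negotiate(actual, requested):
    """Return the (name, type) pairs a producer would export.

    `actual` and `requested` are lists of (field_name, type_name) pairs.
    """
    # A different number of fields is not a representation choice: raise.
    if len(actual) != len(requested):
        raise ValueError("requested schema has a different number of fields")

    result = []
    for (name, native), (_, wanted) in zip(actual, requested):
        if wanted == native or wanted in ALTERNATIVES.get(native, set()):
            # Same data, different representation: honor the request.
            result.append((name, wanted))
        else:
            # No known alternative: keep the native type (assumed fallback),
            # leaving any real cast to the caller.
            result.append((name, native))
    return result


exported = negotiate(
    [("name", "string"), ("n1", "int64")],
    [("name", "large_string"), ("n1", "int8")],
)
```

Here the string column is exported as large_string because that is another representation of the same data, while the int64 column stays int64 because narrowing to int8 is a cast, not a representation change.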
I think the spec makes sense, I wouldn't expect some sort of magic projection (i.e. provide a subset of columns) to work here.
This is more to make it easy to disambiguate in those cases where the Oracle format is rather generous or lacking.
For the generous side, think of NUMBER, which can map to any number (no pun intended) of things, and you currently need to enforce a scale of 0 to get an int64. Even that is quite wasteful when you could use something smaller if you know the data.
For the lacking side, imagine an Oracle DATE that you want to map to date32 because you know there is no time-of-day component. I'm not sure there is any way to tell the database that your date stops at the day (you can add a check constraint on date = trunc(date), but that's not a data type), so you end up with a timestamp in Arrow land that is, I believe, 2x the size, plus all the fun associated with it :-)