Fetching DF shows up as memory corruption #489

Closed
mauropagano opened this issue May 1, 2025 · 9 comments
Labels: bug (Something isn't working), patch available

Comments

@mauropagano

  1. What versions are you using?
    python-oracledb 3.1.0 on Python 3.9.6 and 3.12.7

  2. Is it an error or a hang or a crash?
    It looks like memory corruption: when running the code over and over, at some point the same Arrow object starts printing junk (I don't think it goes back to the database to re-run the query).
    This fails CONSISTENTLY if the same connection is used twice (I'm not sure whether that's expected), but it also fails with a new connection if you try enough times.

  3. What error(s) or behavior you are seeing?
    Incorrect results; the script may need to be run a few times before they appear.

  4. Does your application call init_oracle_client()?
    It reproduces in both thin and thick modes; thick mode seems easier to trigger.

  5. Include a runnable Python script that shows the problem.

import oracledb
import pyarrow


def f() -> pyarrow.Table:
    c = oracledb.connect("...")
    oracle_df = c.fetch_df_all(
        statement="select dbms_random.value from dual", arraysize=5000
    ) 
    pyarrow_table = pyarrow.Table.from_arrays(
        arrays=oracle_df.column_arrays(), names=oracle_df.column_names()
    ) 
    return pyarrow_table


df = f()
print(f"df: {df}")


df2 = f()  # <-- df2 is never used
print(f"df: {df}")
print(f"df: {df}")
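The script above relies on eyeballing printed output. A minimal sketch of the same check as a function is shown here; `table_changes_after_refetch` and its `fetch` parameter are hypothetical names, and `fetch` is assumed to be a zero-argument callable such as `f` above that returns an object with a `to_pydict()` method (as `pyarrow.Table` has).

```python
def table_changes_after_refetch(fetch) -> bool:
    """Return True if the first fetched table reads differently after a second fetch.

    `fetch` is a hypothetical zero-argument callable (for example, `f` from the
    script above) returning an object with a `to_pydict()` method.
    """
    first = fetch()
    # Snapshot the values as plain Python objects, independent of Arrow buffers.
    before = first.to_pydict()
    # Fetch again and discard the result, mirroring `df2 = f()` in the report.
    fetch()
    # Re-read the first table; any difference means its data changed underneath it.
    return first.to_pydict() != before


# Usage against a real database (not executed here):
#     print(table_changes_after_refetch(f))
```

On an affected version this would be expected to return True at least some of the time, so it may need to be called in a loop.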

mauropagano added the bug label May 1, 2025
@cjbj
Member

cjbj commented May 1, 2025

@mauropagano thanks for your continued focus on dataframes.
What OS are you reproducing this on? What is your DB version?

@mauropagano
Author

The client is on Linux and the database is 19.x.
It also reproduces with select 1 from dual, by the way; I just picked something else for the test case.

@mauropagano
Author

I just realized I never provided an output example. Below is one from thick mode; thin mode reproduces too, but you need to try a few times (probably 5-10).

Working case (commenting out df2 = f())

df: pyarrow.Table
N1: double
----
N1: [[0]]
df: pyarrow.Table
N1: double
----
N1: [[0]]
df: pyarrow.Table
N1: double
----
N1: [[0]]

Example of incorrect output (df2 = f() in code)

df: pyarrow.Table
N1: double
----
N1: [[0]]
df: pyarrow.Table
N1: double
----
N1: [[2.3682269498151e-310]]
df: pyarrow.Table
N1: double
----
N1: [[2.3682269498151e-310]]
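As a side note on the junk value itself, here is a small sketch that reinterprets the printed double as raw bits. It only inspects the number shown in the output above; it does not prove what the bytes originally were. A value this small is subnormal, meaning its exponent bits are all zero, which is at least consistent with the buffer holding arbitrary non-float bytes rather than a query result.

```python
import struct
import sys

junk = 2.3682269498151e-310  # value printed in the incorrect output above

# Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
(bits,) = struct.unpack("<Q", struct.pack("<d", junk))

# Subnormal doubles lie strictly between 0 and the smallest normal double.
is_subnormal = 0.0 < junk < sys.float_info.min

print(f"subnormal: {is_subnormal}")
print(f"raw bits:  {bits:#018x}")
```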

@cjbj
Member

cjbj commented May 2, 2025

@mauropagano thanks for that. I had to change the connection handling before I could see it. Also using select 1 from dual makes it clearer.

def f(c) -> pyarrow.Table:
    # The connection is now passed in, so every call reuses the same one
    oracle_df = c.fetch_df_all(
        # statement="select dbms_random.value from dual", arraysize=5000
        statement="select 1 from dual", arraysize=5000
    )
    pyarrow_table = pyarrow.Table.from_arrays(
        arrays=oracle_df.column_arrays(), names=oracle_df.column_names()
    )
    return pyarrow_table


# un, pw and cs hold the user name, password and connect string
c = oracledb.connect(user=un, password=pw, dsn=cs)

I'll let @anthony-tuininga and @aosingh dig into it.

@aosingh
Member

aosingh commented May 6, 2025

@mauropagano

I was able to reproduce this and it happens because of premature deallocation of the underlying OracleArrowArray. Let me check how to fix this and get back.
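To illustrate what "premature deallocation" means in general, here is a pure-Python analogy. It is not the library's implementation: `Buffer` and `View` are hypothetical stand-ins for a data owner and a consumer that reads from it. The consumer keeps seeing valid data only if something holds a strong reference to the owner; in native code the equivalent mistake leaves the consumer reading freed memory instead of getting `None`.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for an object that owns the underlying data."""

    def __init__(self, values):
        self.values = list(values)


class View:
    """Hypothetical consumer that reads data owned by a Buffer."""

    def __init__(self, owner, keep_alive):
        self._weak = weakref.ref(owner)
        # Holding a strong reference ties the owner's lifetime to the view.
        self._strong = owner if keep_alive else None

    def read(self):
        owner = self._weak()
        return None if owner is None else owner.values


def make_view(keep_alive):
    buf = Buffer([1.0])
    # `buf` goes out of scope when this function returns.
    return View(buf, keep_alive)


safe = make_view(keep_alive=True)
dangling = make_view(keep_alive=False)
gc.collect()  # make collection explicit on non-refcounting interpreters

print(safe.read())
print(dangling.read())
```

The first view still reads its data, while the second finds that its owner is gone.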

anthony-tuininga added a commit that referenced this issue May 12, 2025
for converting an OracleDataFrame object to a foreign data frame object
multiple times (#470).
@anthony-tuininga
Member

I have pushed a patch that corrects this issue and have initiated a build from which you can download pre-built development wheels once it completes. You can also build from source if you prefer. If you can test your scenario and confirm the patch works as expected, that would be appreciated!

@mauropagano
Author

mauropagano commented May 13, 2025

Fix works!

Do you have a sense of when you'll publish 3.2?

@anthony-tuininga
Member

We are currently planning on releasing this fix as part of python-oracledb 3.1.1. The exact timing is not determined yet but it will be "soon". :-)

anthony-tuininga added a commit that referenced this issue May 15, 2025
for converting an OracleDataFrame object to a foreign data frame object
multiple times (#470).
@anthony-tuininga
Member

This was included in python-oracledb 3.1.1 which was just released.
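For anyone who wants to guard against running an affected version, a minimal sketch of a version check follows. `release_tuple` and `has_fix` are hypothetical helper names, and the only fact taken from this thread is that the fix shipped in 3.1.1. The parsing is deliberately simple and ignores pre-release suffixes.

```python
import re

FIXED_IN = (3, 1, 1)  # release named in this thread as containing the fix


def release_tuple(version: str) -> tuple:
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing third component is treated as 0; suffixes such as "b1" are ignored.
    return tuple(int(part or 0) for part in match.groups())


def has_fix(version: str) -> bool:
    """Return True if `version` is at or above the release containing the fix."""
    return release_tuple(version) >= FIXED_IN


# Usage with the installed driver (not executed here):
#     import oracledb
#     print(has_fix(oracledb.__version__))
```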
