Difference in results after changing to 3.0.1 from 2.2.1 #230

Closed
venkatesh71097 opened this issue Jun 8, 2020 · 6 comments · Fixed by #238

@venkatesh71097

Hello!

I recently pip-installed the latest wfdb version, and my results changed drastically. I need to count the number of patients whose records contain one particular column of data. With version 2.2.1 I get 341 for p00, but with the latest wfdb package I get only 272. I'm not sure what is causing this. After seeing the lower count in 3.0.1, I reverted to 2.2.1 to check, and it worked fine again.

@Lucas-Mc
Contributor

Lucas-Mc commented Jun 8, 2020

Hey @venkatesh71097, do you have any more specifics about the database and files than "p00"? One thing to note is that versions >=3.0.0 shifted from PhysioBank to PhysioNet, which means the package now uses the most up-to-date versions of the databases and, in some cases, some content may have changed. The underlying functionality in the code should still be the same, but let's see what we can figure out here! Thanks for posting!

@venkatesh71097
Author

Hey, so I'm trying to count the patients that have a 'PLETH' channel in p00 of the MIMIC matched subset database that was released a month ago, as I only want the pleth channel from the signals. It is from physionet.org. I have the p00 files in my local directory. Let me know if you need any further details.

@Lucas-Mc
Contributor

Lucas-Mc commented Jun 9, 2020

Hey @venkatesh71097, do you have any code that can reproduce this issue? I want to check the specific differences between the two versions when reproducing the bug you are seeing. Thanks!

@venkatesh71097
Author

venkatesh71097 commented Jun 10, 2020

import os

import wfdb

# Local copy of the p00 folder of the MIMIC-III matched waveform subset
base = 'physionet.org/files/mimic3wdb-matched/1.0/p00'

count_pleth = 0
for subfile in sorted(os.listdir(base)):
    # Reset per patient so that only the first matching header is read
    count_hea = 0
    if subfile.startswith('p'):
        for subsubfile in sorted(os.listdir(os.path.join(base, subfile))):
            # Skip names ending in 'n' and stop after the first header per patient
            if subsubfile.startswith('p') and count_hea == 0 and not subsubfile[:-4].endswith('n'):
                try:
                    print(subsubfile)
                    count_hea = count_hea + 1
                    strip = subsubfile[:-4]  # drop the 4-character extension
                    record = wfdb.rdrecord(base + '/' + subfile + '/' + strip, channel_names=['PLETH'], return_res=16)
                    r = record.__dict__
                    print(r['sig_name'])
                    if r['sig_name'] is not None:
                        if r['sig_name'][0] == 'PLETH':
                            count_pleth = count_pleth + 1
                            print(count_pleth)
                except Exception:
                    # Any read error is silently skipped, so failures lower the count
                    pass
print(count_pleth)

Hey Lucas, you can try this out. Sorry for the delay, I missed the notification!

@Lucas-Mc
Contributor

Hey @venkatesh71097, I just downloaded the files locally and ran your code and reproduced both numbers.

One file it failed on was p00/p000298/p000298-2148-01-02-18-16. With return_res=16 it raised ValueError: Samples were not loaded correctly; with return_res=32 it raised ValueError: cannot reshape array of size 375 into shape (2); and with return_res=64 it loads correctly. The return_res should only change the precision of the returned values, so I have to look further into what's going on.
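Since the posted loop swallows every exception with a bare pass, read failures like the one above would plausibly show up only as a lower count. A minimal sketch of counting while recording the failures, so they can be inspected afterwards, might look like the following. The helper name count_matches and its arguments are hypothetical and not part of wfdb; the predicate stands in for whatever per-record check is being run.

```python
def count_matches(items, predicate):
    """Count items for which predicate(item) is true.

    Exceptions raised by the predicate are recorded instead of being
    silently discarded, so a drop in the count can be traced to them.
    """
    count = 0
    failures = []
    for item in items:
        try:
            if predicate(item):
                count += 1
        except Exception as err:
            # Keep the item, the exception type and its message for later review
            failures.append((item, type(err).__name__, str(err)))
    return count, failures


# Hypothetical usage with wfdb (assumes the records exist locally):
#
#   def has_pleth(path):
#       rec = wfdb.rdrecord(path, channel_names=['PLETH'], return_res=16)
#       return rec.sig_name is not None and rec.sig_name[0] == 'PLETH'
#
#   count, failures = count_matches(record_paths, has_pleth)
#   for path, kind, message in failures:
#       print(path, kind, message)
```

Comparing the failures list between the two wfdb versions would show directly which records account for the difference.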

For now, you can catch the ValueError in your try block and retry with return_res=64. That should work, though I haven't tested it myself. Will give it a go now.
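The workaround suggested here could be sketched roughly as below. This is an untested illustration, not the fix that was later merged: read_with_fallback is a hypothetical helper, and the reader is passed in as an argument (wfdb.rdrecord in the commented usage) so the retry logic stands on its own.

```python
def read_with_fallback(read, record_path, resolutions=(16, 64), **kwargs):
    """Try read(record_path, return_res=res) for each resolution in order.

    Returns the first successful result. If every attempt raises
    ValueError, the last ValueError is re-raised.
    """
    if not resolutions:
        raise ValueError("at least one resolution is required")
    last_error = None
    for res in resolutions:
        try:
            return read(record_path, return_res=res, **kwargs)
        except ValueError as err:
            # Remember the failure and move on to the next resolution
            last_error = err
    raise last_error


# Hypothetical usage inside the counting loop (assumes wfdb is installed):
#
#   record = read_with_fallback(
#       wfdb.rdrecord,
#       'physionet.org/files/mimic3wdb-matched/1.0/p00/' + subfile + '/' + strip,
#       channel_names=['PLETH'],
#   )
```

Only ValueError is caught, so unrelated problems such as a missing file still surface instead of being retried.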

Lucas-Mc added a commit that referenced this issue Jun 12, 2020
Fixes error when using the return_res parameter which can sometimes modify the format in which the data from file is read. This oftentimes creates illogical arrays which can't be recovered into the proper values or even the proper shape. Fortunately, the removal of this does not significantly affect the runtime of the function even for large arrays. Fixes #230.
@Lucas-Mc Lucas-Mc added the bug label Jun 12, 2020
@Lucas-Mc Lucas-Mc self-assigned this Jun 12, 2020
Lucas-Mc added a commit that referenced this issue Jun 12, 2020
@Lucas-Mc
Contributor

I made the fix and got the same number for both the current GitHub version and 2.2.1. Hope this helps!

2 participants