How to limit cross correlation window width in Numpy? #5954
Comments
Out of curiosity, how do MATLAB (or maybe R?) handle rescaling effects due to different values when the max lag is not much smaller than the amount of data?
Not sure what you mean by rescaling effects. Your workaround throws out huge amounts of data: you still want to know how b[-maxlag-1] correlates with entries in a that are within maxlag entries of it. matplotlib.pyplot.xcorr has a maxlag argument, but it is just a wrapper around numpy.correlate, so it gets no performance benefit — it still calculates the entire cross-correlation and then throws part of it away before returning to the user.
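The compute-everything-then-slice behavior described above can be sketched like this (the function name is illustrative, not a NumPy API; it mimics what a thin maxlag wrapper around numpy.correlate ends up doing):

```python
import numpy as np

# Illustrative sketch (not NumPy API): compute the full cross-correlation,
# then keep only the lags in [-maxlag, maxlag]. All the work outside that
# window is wasted, which is the inefficiency discussed above.
def xcorr_full_then_slice(a, b, maxlag):
    n = len(a)                                   # assumes len(a) == len(b)
    full = np.correlate(a, b, mode="full")       # lags -(n-1) .. n-1
    mid = n - 1                                  # index of zero lag
    return full[mid - maxlag : mid + maxlag + 1]
```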
Maybe that was a bit silly, since I was more concerned about getting something like
On the one hand, for maxlag you could say: well, if "valid" mode gives something smaller, that is fine. But then we want maxlag to be
Yes, I plan to do that. I'm not 100% sure how to deal with vectors of different sizes. Should I center the second vector with respect to the first, or right-align them?
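One data point for the alignment question: with unequal-length inputs and mode="valid", NumPy slides the shorter array fully inside the longer one, so the output has len(long) - len(short) + 1 entries:

```python
import numpy as np

# With mode="valid" the shorter array is slid fully inside the longer one;
# here out[k] = long_arr[k]*short_arr[0] + long_arr[k+1]*short_arr[1].
long_arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
short_arr = np.array([1.0, 1.0])
out = np.correlate(long_arr, short_arr, mode="valid")
print(out)  # [3. 5. 7. 9.]
```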
I have now implemented this functionality in numpy.correlate() and numpy.convolve(): master...bringingheavendown:maxlag. The files that were edited are:
Hey, it's a bit difficult to discuss things like this (can't get a diff easily, etc.). It will be a bit of hoop jumping, but maybe you can see it as an exercise... Could you try to:
Note that while it is often said that changing history is bad in git, the opposite is true for pull requests: you can always rewrite the branch history completely when people ask you to change things (we usually do this at the end to clean up the branch history a bit).
Ah, you actually got the branch already, I guess. Then you can just do a PR as well, I think; that gives an easy way to view the diff in any case:
Here are the specific issues that could use attention in this implementation: how to deal with the optional lagvec output array?
Hi bringingheavendown, the correlate function modification you made is exactly what I was looking for. I think the way you handle minlag and maxlag is perfect. Any hope of seeing it in official NumPy?
The way to move forward on this would be:
On Sat, Oct 24, 2015 at 1:58 PM, Eridum notifications@github.com wrote:
Nathaniel J. Smith -- http://vorpus.org
There already is a pull request: #5978. It passed all tests on all machines except one, but I couldn't get enough help debugging it. What can I do about that? There was also already discussion about the user-facing API on the numpy-discussion list back in June. At this point, I feel that the code is complete and just needs to be incorporated. If more discussion is desired, we can do that. In any case, I only dropped this because it didn't seem like it was going anywhere. Let me know what the process is at this point, and I would be happy to follow it.
Sorry, I missed the pull request; I was just looking at this bug.
The pull request is just waiting for approval now. I can squash commits and write appropriate commit messages once I know everything is ready to go forward.
Content in this issue overlaps with the actual PR (gh-5978), which has more up-to-date comments. So I'll close this issue to limit the number of places where the same discussion takes place.
- Replaced int with npy_intp for the lag parameters in _pyarray_correlate and related functions
- Renamed the lagvector input to correlate() and convolve() to return_lags
- _pyarray_correlate ignores minlag, maxlag, and lagstep for modes other than 3
I am learning numpy/scipy, coming from a MATLAB background. The xcorr function in MATLAB has an optional argument maxlag that limits the lag range to -maxlag through maxlag. This is very useful if you are looking at the cross-correlation between two very long time series but are only interested in the correlation within a certain time range. The performance gains are enormous, considering that cross-correlation is incredibly expensive to compute.
What is troubling me is that numpy.correlate does not have a maxlag feature. This means that even if I only want to see correlations between two time series with lags between -100 and +100 ms, for example, it will still calculate the correlation for every lag between -20000 and +20000 ms (which is the length of the time series). This (theoretically) gives a 200x performance hit! Is it possible that I could contribute this feature?
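As a rough sketch of the kind of feature being requested (illustrative code, not the proposed NumPy implementation): computing only the lags in [-maxlag, maxlag] directly costs O(n·maxlag) instead of the O(n²) of the full cross-correlation.

```python
import numpy as np

# Hypothetical helper, for illustration only: compute the cross-correlation
# restricted to lags -maxlag..maxlag, matching np.correlate's convention
# that lag k compares a[n+k] with b[n]. Assumes len(a) == len(b).
# Cost is O(n * maxlag) rather than the O(n**2) of mode="full".
def correlate_maxlag(a, b, maxlag):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lags = np.arange(-maxlag, maxlag + 1)
    out = np.empty(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            out[i] = np.dot(a[lag:], b[:len(b) - lag])
        else:
            out[i] = np.dot(a[:len(a) + lag], b[-lag:])
    return lags, out
```

For equal-length inputs with maxlag = len(a) - 1 this reproduces np.correlate(a, b, mode="full"); with a small maxlag it skips the unwanted lags entirely rather than computing and discarding them.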
I have introduced this question as a scipy issue and on the scipy-dev list. It seems the best place to start is with numpy.correlate, so that is what I am requesting. I have done a simple implementation which gives 50x speedup under my conditions.
This is my first experience with contributing to open-source software, so any pointers are appreciated.
Other issues related to correlate functions include ENH: Fold fftconvolve into convolve/correlate functions as a parameter #2651, Use FFT in np.correlate/convolve? (Trac #1260) #1858, and normalized cross-correlation (Trac #1714) #2310.