Skip to content

DOC: document the change in .base #2737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 27, 2012
Merged

Conversation

certik
Copy link
Contributor

@certik certik commented Nov 14, 2012

Fixes gh-438.

/cc @njsmith, @iguananaut.

@njsmith
Copy link
Member

njsmith commented Nov 14, 2012

You should check that this description is actually correct - I think Travis
merged a patch to change the rules slightly to check subtypes to work
around some regressions people saw.
On 14 Nov 2012 03:36, "Ondřej Čertík" notifications@github.com wrote:

Fixes gh-438 #438.

/cc @njsmith https://github.com/njsmith, @iguananauthttps://github.com/iguananaut

.

You can merge this Pull Request by running:

git pull https://github.com/certik/numpy fix438

Or view, comment on, or merge it at:

#2737
Commit Summary

  • DOC: document the change in .base

File Changes

  • M doc/release/1.7.0-notes.rst (4)

Patch Links

@certik
Copy link
Contributor Author

certik commented Nov 14, 2012

You probably mean #2708. Yes, I used that one.

@njsmith
Copy link
Member

njsmith commented Nov 14, 2012

Yes that's the one I'm thinking of... but I don't think the rule it
implements is the same as the rule described in the text this PR adds to
the release notes?

-n
On 14 Nov 2012 07:12, "Ondřej Čertík" notifications@github.com wrote:

You probably mean #2708 #2708.
Yes, I used that one.


Reply to this email directly or view it on GitHubhttps://github.com//pull/2737#issuecomment-10356366.

@certik
Copy link
Contributor Author

certik commented Nov 14, 2012

Here is the exact text from the PR #2708:

+     * Don't allow infinite chains of views, always set the base
+     * to the first owner of the data.  
+     * That is, either the first object which isn't an array, 
+     * or the first object which owns its own data.

and here is the exact text from this PR:

+The ``.base`` attribute now points to the first owner of the data; that is,
+either the first object which isn't an array, or the first object which owns
+its own data.

It seems to me that the two texts are exactly equivalent. Can you point me to a difference? Maybe the comment in the PR #2708 is not correct.

@njsmith
Copy link
Member

njsmith commented Nov 14, 2012

The texts are equivalent. Now that I look at it, that PR is kind of not very well written though...

The key check appears to be:

    if (!(PyArray_CheckExact(arr)) & (Py_TYPE(tmp) != Py_TYPE(arr))) {
        /* stop collapsing */
    }

So obviously that should be && not &, but I think in this case & does produce the same behaviour as && since the two inputs are the output from C boolean operators, which always return either 0 or 1 exactly. So, we can ignore that. What I think the code says is that if your view is a subclass of ndarray, then you only collapse out the base pointers that also point to the exact same subclass. As soon as you hit a .base which points to a regular ndarray, or a subclass-of-your-subclass, or anything else, you stop. For regular base class ndarrays, you ignore this rule, and just collapse right through subclasses etc. until you run out of .base pointers (like the comment says).

Sort of a strange rule, and the comment is misleading.

@certik
Copy link
Contributor Author

certik commented Nov 14, 2012

@teoliphant, do you agree with the better explanation by Nathan? If so, I can send a PR to both the master and the release branch with this better explanation.

@embray
Copy link
Contributor

embray commented Dec 4, 2012

Maybe this isn't the best place to report this, but I was just playing around with Numpy dev with some of my code and hit a use case where these new .base semantics don't hold up as documented:

print np.__version__
arr_base = np.arange(0, 10000, dtype=np.ubyte)
dtype = np.rec.format_parser('2i4', ['foo'], None).dtype
arr_overlay = arr_base[:2000].view(dtype=dtype, type=np.rec.recarray)
print arr_base.shape
print arr_base.flags

print arr_overlay.base.shape
print arr_overlay.base.flags
print arr_overlay.base is arr_base
1.8.0.dev-7b75899
(10000,)
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
(2000,)
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
False

arr_overlay.base should be the original arr_base. Instead the slice of arr_base[:2000] is the base. In my actual code the arr_base is actually a mmap and then things get even worse.

@certik
Copy link
Contributor Author

certik commented Dec 26, 2012

@njsmith, I've updated the docs based on your comment. Can you please review it?

@njsmith
Copy link
Member

njsmith commented Dec 26, 2012

Can we write

The `.base` attribute on ndarrays, which is used on views to ensure that the underlying array owning the memory is not deallocated prematurely, now collapses out references when you have a view-of-a-view. For example::

    a = np.arange(10)
    b = a[1:]
    c = b[1:]

In numpy 1.6, `c.base` is `b`, and `c.base.base` is `a`. In numpy 1.7, `c.base` is `a`.

To increase backwards compatibility for software which relies on the old behaviour of `.base`, we only 'skip over' objects which have exactly the same type as the newly created view. This makes a difference if you use `ndarray` subclasses. For example, if we have a mix of `ndarray` and `matrix` objects which are all views on the same original `ndarray`::

    a = np.arange(10)
    b = np.asmatrix(a)
    c = b[0, 1:]
    d = c[0, 1:]

then `d.base` will be `b`. This is because `d` is a `matrix` object, and so the collapsing process only continues so long as it encounters other `matrix` objects. It considers `c`, `b`, and `a` in that order, and `b` is the last entry in that list which is a `matrix` object.

? I think that's accurate and hopefully clearer for non-experts...?

@embray
Copy link
Contributor

embray commented Dec 26, 2012

@njsmith I think that makes sense and explains my earlier confusion on this.

Fixes numpygh-438. Based on Nathan's comments in numpygh-2737.
@certik
Copy link
Contributor Author

certik commented Dec 26, 2012

Sounds good. I have rebased the commits and just used the latest Nathan's description. I think it is finally clear (even to me:). I just fixed the rst formatting. Now it renders correctly. @njsmith, if you agree, then just merge this PR please.

@njsmith
Copy link
Member

njsmith commented Dec 27, 2012

okie-dokie

njsmith added a commit that referenced this pull request Dec 27, 2012
DOC: document the change in .base
@njsmith njsmith merged commit c2b2422 into numpy:maintenance/1.7.x Dec 27, 2012
certik added a commit that referenced this pull request Dec 27, 2012
@certik certik deleted the fix438 branch December 27, 2012 17:46
@certik
Copy link
Contributor Author

certik commented Dec 27, 2012

Thanks. Forwardported in a8c641f.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants