DOC: update structured array docs to reflect #6053 #9056

ahaldane · 2017-05-05T16:40:00Z

These are the updated structured array docs, reflecting the changes in #6053.

Don't merge this before #6053.

I'm putting them here now for comments to accompany #6053.

While this PR is open I will maintain an HTML compiled version of these docs at https://ahaldane.github.io/user/basics.rec.html

eric-wieser · 2017-05-05T16:45:46Z

numpy/doc/structured_arrays.py

-changes the structured array, the field view also changes: ::
+If ``fieldname`` is the empty string (``''``) the field will be given a default
+name of the form ``f#``, where ``#`` is the integer index of the field,
+counting from 0 from the left::


Not a complaint about this PR, but I don't think I like this behaviour:

>>> np.dtype([('', 'f4'),('f0', 'i4'),('z', 'i8')])
ValueError: field 'f0' occurs more than once

In #9054, I change this in some cases to be "index within the unnamed values". Is that a good thing?
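For reference, a minimal sketch of the default-naming rule described in the quoted text. It only covers the case where every field name is empty, since the behaviour for mixed explicit and default names is exactly what is under discussion here.

```python
import numpy as np

# Empty field names are replaced by defaults of the form 'f<index>',
# counting from 0 from the left.
dt = np.dtype([('', 'f4'), ('', 'i4')])
default_names = dt.names
print(default_names)
```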

eric-wieser · 2017-05-05T16:47:01Z

numpy/doc/structured_arrays.py

+to a datatype, and shape is a tuple of integers specifying subarray shape.
+
+ >>> x = np.zeros(3, dtype=[('x', 'f4'), ('y', np.float32), ('z', 'f4', (2,2))])
+ >>> x


I think it might be clearer to show these as >>> dt = np.dtype(...), >>> np.zeros(1, dtype=dt)
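A sketch of what that two-step form could look like, using the dtype from the quoted hunk. The variable names `dt` and `arr` are only illustrative.

```python
import numpy as np

# Build the structured dtype first, then use it to create the array.
dt = np.dtype([('x', 'f4'), ('y', np.float32), ('z', 'f4', (2, 2))])
arr = np.zeros(1, dtype=dt)

print(dt)   # the dtype can be inspected on its own
print(arr)  # one element, with 'z' holding a 2x2 subarray
```

Separating the two steps lets the reader see the dtype repr without the array contents getting in the way.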

eric-wieser · 2017-05-05T16:49:39Z

numpy/doc/structured_arrays.py

- array([(0, 0.0), (0, 0.0), (0, 0.0)],
-       dtype=[('col1', '>i4'), ('col2', '>f4')])
+In this shorthand notation any of the :ref:`string dtype specifications
+<arrays.dtypes.constructing>` may be used in a string, separated by commas. The


It seems that arrays.dtypes.rst duplicates a lot of the contents here, and perhaps these should be condensed into a single help page

eric-wieser · 2017-05-05T16:52:21Z

numpy/doc/structured_arrays.py


-Filling structured arrays
-=========================
+Note that unlike other numpy scalars void structured scalars act like views


Missing comma

eric-wieser · 2017-05-05T16:55:00Z

numpy/doc/structured_arrays.py

+the arrays will result in a boolean array with the dimension of the original
+arrays, with elements set to True where all fields of the correspnding
+structures are equal. Structured dtypes are equivalent if the field names,
+dtypes and titles are the same, ignoring endianness.


This is misleading - we should clarify whether "equivalent dtypes" are such that dt1 == dt2, or if simply np.can_cast(dt1, dt2)

eric-wieser · 2017-05-05T18:10:22Z

numpy/doc/structured_arrays.py

+ >>> np.zeros(3, dtype={'names':    ['col1', 'col2'],
+ ...                    'formats':  ['i4','f4'],
+ ...                    'offsets':  [0, 4],
+ ...                    'itemsize': 12})


Might be tempting to write this in the following form:

>>> np.zeros(3, dtype=dict(names= ['col1', 'col2'],
...                        formats= ['i4','f4'],
...                        offsets= [0, 4],
...                        itemsize= 12))

Which, of course, raises the question of whether np.dtype(**dict) should be added as a shorthand for np.dtype(dict).

Aligning the [...] isn't PEP8. It may take a bit of getting used to but eventually the unaligned versions become easier to read.
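For anyone following along, here is a small runnable sketch of the dictionary form written with `dict(...)` keywords and unaligned values. The name `spec` is just a placeholder for illustration.

```python
import numpy as np

# Dictionary specification with explicit byte offsets and total itemsize.
spec = dict(names=['col1', 'col2'],
            formats=['i4', 'f4'],
            offsets=[0, 4],
            itemsize=12)
dt = np.dtype(spec)
arr = np.zeros(3, dtype=dt)

# The two 4-byte fields occupy bytes 0-7; the remaining 4 bytes are padding.
print(dt.itemsize, dt.fields['col2'][1])
```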

charris · 2017-09-21T16:54:10Z

@ahaldane Now that #6053 is in we should get this finished up.

charris · 2017-09-24T18:09:14Z

@ahaldane ping.

ahaldane · 2017-09-24T23:07:25Z

Got it, I'll go over it soon.

ahaldane · 2017-09-25T22:33:51Z

Updated, and ready to read through.

You can view an html version of the current state at https://ahaldane.github.io/user/basics.rec.html

charris · 2017-10-02T16:57:16Z

numpy/doc/structured_arrays.py

-this array is a structure that contains three items, a 32-bit integer, a 32-bit
-float, and a string of length 10 or less. If we index this array at the second
-position we get the second structure: ::
+Here ``x`` is a one-dimensional array length 2, whose datatype is a structure


"array of length two" and omit the comma. The clause after the comma is essential.

charris · 2017-10-02T17:09:43Z

numpy/doc/structured_arrays.py

+       dtype=[('name', 'S10'), ('age', '<i4'), ('weight', '<f4')])
+
+Structured arrays are designed for low-level manipulation of structured data,
+for example for interpreting binary blobs. Structured datatypes are designed to


"for example, interpreting ..."

charris · 2017-10-02T17:11:30Z

numpy/doc/structured_arrays.py

+Structured arrays are designed for low-level manipulation of structured data,
+for example for interpreting binary blobs. Structured datatypes are designed to
+mimic 'structs' in the C language, making them useful for interfacing with C
+code. For these purposes numpy supports specialized features such as subarrays


comma after "purposes"

charris · 2017-10-02T17:19:55Z

numpy/doc/structured_arrays.py

+and nested datatypes, and allows manual control over the memory layout of the
+structure.
+
+If you only wish to manipulate tabular data with labelled columns, you are


Maybe "For simple manipulation of tabular data, other pydata projects, such as pandas, xarray, or DataArray, provide higher-level interfaces that may be more suitable."

charris · 2017-10-02T17:26:09Z

numpy/doc/structured_arrays.py

+structured datatypes, and it may also be a :term:`sub-array` which behaves like
+an ndarray of a specified shape. The offsets of the fields are arbitrary, and
+fields may even overlap. These offsets are usually determined automatically by
+numpy but can also be manually specified.


"by numpy, but can be manually specified."

charris · 2017-10-02T17:29:15Z

numpy/doc/structured_arrays.py

+Structured Datatype Creation
+----------------------------
+
+Structured datatypes may be created using the function :func:`numpy.dtype` with


Could use a rewrite. I'd probably start a new sentence instead of using "with".

charris · 2017-10-02T17:34:06Z

numpy/doc/structured_arrays.py

+:ref:`Data Type Objects <arrays.dtypes.constructing>` reference page, and in
+summary they are:
+
+1. A list of tuples, one tuple per field


Do we want to number subtitles? Maybe a simple enumerated list would do.

charris · 2017-10-02T17:55:23Z

numpy/doc/structured_arrays.py

+ >>> np.dtype([('x', 'f4'), ('y', np.float32), ('z', 'f4', (2,2))])
+ dtype=[('x', '<f4'), ('y', '<f4'), ('z', '<f4', (2, 2))])
+
+If ``fieldname`` is the empty string (``''``) the field will be given a default


"is an empty string, '', the ..."

charris · 2017-10-02T17:56:28Z

numpy/doc/structured_arrays.py

+```````````````````````````````````````````````````
+
+In this shorthand notation any of the :ref:`string dtype specifications
+<arrays.dtypes.constructing>` may be used in a string, separated by commas. The


"... string and separated ..."

charris · 2017-10-02T17:58:56Z

numpy/doc/structured_arrays.py

+The dictionary has two required keys, 'names' and 'formats', and four optional
+keys, 'offsets', 'itemsize', 'aligned' and 'titles'. 'names' and 'formats'
+should respectively correspond to a list of field names and a list of dtype
+specifications of the same length. The optional 'offsets' key must correspond


"specifications, all of the same length."

charris · 2017-10-02T18:01:35Z

numpy/doc/structured_arrays.py

+keys, 'offsets', 'itemsize', 'aligned' and 'titles'. 'names' and 'formats'
+should respectively correspond to a list of field names and a list of dtype
+specifications of the same length. The optional 'offsets' key must correspond
+to a list of integer byte-offsets of each field within the structure, of the


"The optional 'offsets' key is a list of integer byte offsets, one for each field within the structure."

charris · 2017-10-02T18:02:36Z

numpy/doc/structured_arrays.py

+same length. If 'offsets' is not given the offsets are determined
+automatically. The optional 'itemsize' key should correspond to an integer
+describing the total size in bytes of the dtype, which must be large enough
+that all the fields are contained. ::


"... to contain all the fields."

charris · 2017-11-10T00:15:02Z

numpy/doc/structured_arrays.py

+Because of this, and because the ``names`` attribute preserves the field order
+while the ``fields`` attribute may not, it is recommended to iterate through
+the fields of a dtype using the ``names`` attribute of the dtype (which will
+not list titles), as in::


Commas rather than parentheses. I don't know when the current parenthetical scourge originated, but it seems to be everywhere these days :-(
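A minimal sketch of the recommendation in the quoted paragraph, assuming a dtype where one field carries a title. The field and title names are made up for illustration.

```python
import numpy as np

# 'x' has the title 'Position X'; 'y' has no title.
dt = np.dtype([(('Position X', 'x'), 'f4'), ('y', 'i4')])

# Iterating over names gives each field once, in order, without titles.
layout = []
for name in dt.names:
    field_dtype, offset = dt.fields[name][:2]
    layout.append((name, field_dtype.name, offset))

print(layout)
# The fields mapping also contains the title as a key.
print('Position X' in dt.fields)
```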

charris · 2017-11-10T00:19:13Z

numpy/doc/structured_arrays.py

-For the last example: ::
+A scalar assigned to a structured element will be assigned to all fields. This
+happens when a scalar is assigned to a structured array, or when a scalar array
+is assigned to a structured array::


Is there an example of using a scalar array for the rhs? Does scalar array mean 1-D array here?

I guess by "scalar array" I mean "unstructured array", will fix.
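To make the two cases concrete, a short sketch of assigning a Python scalar and then an unstructured array to a structured array. The dtype string is the one implied by the `f0` to `f3` repr quoted in this review.

```python
import numpy as np

x = np.zeros(2, dtype='i8, f4, ?, S1')

# A scalar is assigned to every field of every element.
x[:] = 3
after_scalar = x.copy()

# An unstructured array is broadcast element by element across all fields.
x[:] = np.arange(2)
print(after_scalar)
print(x)
```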

charris · 2017-11-10T00:21:16Z

numpy/doc/structured_arrays.py

+       dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '?'), ('f3', 'S1')])
+
+Structured arrays can also be assigned to scalar arrays, but only if the
+structured datatype has just a single field::


Is there an example of that? Which side has the single field, or is that both sides?

I'll try to modify the example, perhaps using different variable names.

charris · 2017-11-10T00:24:11Z

numpy/doc/structured_arrays.py

+       dtype=[('a', '<i8'), ('b', '<i4'), ('c', '<f8')])
+
+The resulting array is a view into the original array, such that assignment to
+the view modifies the original array. This view's fields will be in the order


Maybe "The view's" instead of "This view's".

charris · 2017-11-10T00:25:14Z

numpy/doc/structured_arrays.py

+The resulting array is a view into the original array, such that assignment to
+the view modifies the original array. This view's fields will be in the order
+they were indexed. Note that unlike for single-field indexing, the view's dtype
+has the same itemsize as the original array and has fields at the same offsets


and <- comma

charris · 2017-11-10T00:25:35Z

numpy/doc/structured_arrays.py

+has the same itemsize as the original array and has fields at the same offsets
+as in the original array, and unindexed fields are merely missing.
+
+Since this view is a structured array itself, it obeys the assignment rules


this <- the
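A small sketch of the multi-field indexing behaviour this passage describes. Note the assumption: it relies on a NumPy release in which indexing with a list of field names returns a view rather than a copy, which has differed between versions.

```python
import numpy as np

a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])

# Index with a list of field names; fields come back in the order indexed.
v = a[['a', 'c']]

# Assumes v is a view: writing through it changes the original array.
v['a'] = 7

print(v.dtype.names, v.dtype.itemsize, a.dtype.itemsize)
print(a)
```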

charris · 2017-11-10T00:26:42Z

numpy/doc/structured_arrays.py

+ >>> type(scalar)
+ numpy.void
+
+Importantly, unlike other numpy scalars, structured scalars are mutable and act


Could omit "Importantly".
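For context, a minimal sketch of the mutability being described, with made-up field names `foo` and `bar`.

```python
import numpy as np

x = np.array([(1, 2.0), (3, 4.0)], dtype=[('foo', 'i8'), ('bar', 'f4')])

# Integer indexing returns a structured scalar of type numpy.void.
s = x[0]

# Modifying the scalar modifies the array it came from.
s['bar'] = 100
print(type(s))
print(x)
```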

charris · 2017-11-10T00:27:03Z

numpy/doc/structured_arrays.py

+ numpy.void
+
+Importantly, unlike other numpy scalars, structured scalars are mutable and act
+like views into the original array, such that modifying the scalar will modify


"such that" <- "so that".

charris · 2017-11-10T00:27:34Z

numpy/doc/structured_arrays.py


-Notice that `x` is created with a list of tuples. ::
+Thus, tuples might be though of as the native Python equivalent to numpy's


though <- thought

charris · 2017-11-10T00:28:06Z

numpy/doc/structured_arrays.py

- >>> x[['y','x']]
- array([(2.5, 1.5), (4.0, 3.0), (3.0, 1.0)],
-      dtype=[('y', '<f4'), ('x', '<f4')])
+In order to prevent clobbering of object pointers in fields of


charris · 2017-11-10T00:33:51Z

numpy/doc/structured_arrays.py


-Structured arrays can be filled by field or row by row. ::
+If the dtypes of two structured arrays are equivalent, testing the equality of


What does "equivalent" mean?

equal, will fix
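A quick sketch of the equal-dtype case, which is the only one this example covers. The mismatched-dtype case is left out because its behaviour is version dependent.

```python
import numpy as np

dt = np.dtype([('a', 'i4'), ('b', 'i4')])
a = np.array([(1, 1), (2, 2)], dtype=dt)
b = np.array([(1, 1), (2, 3)], dtype=dt)

# Elementwise comparison: True only where all fields match.
result = a == b
print(result)
```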

charris · 2017-11-10T00:34:15Z

numpy/doc/structured_arrays.py


-Structured arrays can be filled by field or row by row. ::
+If the dtypes of two structured arrays are equivalent, testing the equality of
+the arrays will result in a boolean array with the dimension of the original


dimension <- dimensions.

charris · 2017-11-10T00:35:01Z

numpy/doc/structured_arrays.py

-Structured arrays can be filled by field or row by row. ::
+If the dtypes of two structured arrays are equivalent, testing the equality of
+the arrays will result in a boolean array with the dimension of the original
+arrays, with elements set to True where all fields of the corresponding


charris · 2017-11-10T00:35:44Z

numpy/doc/structured_arrays.py

-If you fill it in row by row, it takes a take a tuple
-(but not a list or array!)::
+Currently, if the dtypes of two arrays are not equivalent all comparisons will
+return ``False``. This behavior is deprecated as of numpy 1.10 and may change


What might be the alternative?

Currently in this case we get:

[1]: a = np.zeros(3, dtype='f,f,f')
[2]: b = np.zeros(3, dtype='f,f')
[3]: a == b
FutureWarning: elementwise == comparison failed and returning scalar instead; this will raise an error or perform elementwise comparison in the future

I'll reword the text to more accurately describe what happens.

charris · 2017-11-10T00:37:26Z

numpy/doc/structured_arrays.py

- >>> arr
- array([(10.0, 20.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)],
-      dtype=[('var1', '<f8'), ('var2', '<f8')])
+Currently, the ``<`` and ``>`` operators will always return ``False`` when


Maybe "The .... operators always return ..."?

charris · 2017-11-10T00:37:58Z

numpy/doc/structured_arrays.py

- array([(10.0, 20.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)],
-      dtype=[('var1', '<f8'), ('var2', '<f8')])
+Currently, the ``<`` and ``>`` operators will always return ``False`` when
+comparing structured arrays. Many other pairwise operators are not supported.


Many or all? Maybe "no other"?

charris · 2017-11-10T00:40:10Z

numpy/doc/structured_arrays.py

-which allows field access by attribute on the individual elements of the array. 
+As an optional convenience numpy provides an ndarray subclass,
+:class:`numpy.recarray`, and associated helper functions in the
+:mod:`numpy.rec` submodule, which allows access to fields of structured arrays


which <- that.

charris · 2017-11-10T00:42:17Z

numpy/doc/structured_arrays.py

+As an optional convenience numpy provides an ndarray subclass,
+:class:`numpy.recarray`, and associated helper functions in the
+:mod:`numpy.rec` submodule, which allows access to fields of structured arrays
+by attribute, instead of only by index. Record arrays also use a special


Comma not needed.

charris · 2017-11-10T00:42:36Z

numpy/doc/structured_arrays.py

+:class:`numpy.recarray`, and associated helper functions in the
+:mod:`numpy.rec` submodule, which allows access to fields of structured arrays
+by attribute, instead of only by index. Record arrays also use a special
+datatype, :class:`numpy.record`, which allows field access by attribute on the


which <- that
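A short sketch of the attribute access under discussion, built with `np.rec.array`. The field names and values are illustrative only.

```python
import numpy as np

rec = np.rec.array([(1, 2.0, 'Hello'), (2, 3.0, 'World')],
                   dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])

# Fields are reachable by attribute on the array as well as by index.
bar = rec.bar

# Individual elements are numpy.record scalars, also with attribute access.
first = rec[0]
print(bar, first.baz)
```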

charris · 2017-11-10T00:48:26Z

I'm not sure that we should be using 'title' instead of italicized *title* or code-styled ``title``; the same goes for the other names for parts of the dtype.

NumPy should hire a copy editor, there should be plenty out there in this time of self publishing.

[ci skip]

ahaldane · 2017-11-10T02:47:40Z

Updated, thanks a lot. You make a great copy editor!

For the styling of the parts of the dtype, my idea was to use them as normal nouns through most of the document (e.g., "the field name is.."), except in the dtype-specification section, where I need to make the format of the tuple clear. There I write (fieldname, datatype, shape) and refer to the three variables in the tuple in code styling, like fieldname.

[ci skip]

charris · 2017-11-11T23:20:51Z

OK, let's get this in. Thanks Allan.

ahaldane force-pushed the structure_docs branch from 0fb0697 to 7c07eeb Compare May 5, 2017 16:40

ahaldane added 04 - Documentation 51 - In progress component: documentation labels May 5, 2017

eric-wieser reviewed May 5, 2017

charris added this to the 1.14.0 release milestone May 9, 2017

ahaldane mentioned this pull request Jun 27, 2017

BUG: Subarrays casts truncate and zero-pad without error or warning #9313

Open

ahaldane mentioned this pull request Sep 3, 2017

Structured array drops field-titles when being 'sliced' by field-names #9625

Closed

ahaldane force-pushed the structure_docs branch 2 times, most recently from f07ea87 to 3a7f388 Compare September 25, 2017 21:33

ahaldane force-pushed the structure_docs branch 2 times, most recently from 602e22d to 5be7def Compare September 27, 2017 23:33

charris reviewed Oct 2, 2017

charris reviewed Nov 10, 2017

DOC: update structured array docs to reflect numpy#6053

c43e0e5

[ci skip]

ahaldane force-pushed the structure_docs branch from 47f1041 to b282aff Compare November 10, 2017 02:42

ahaldane force-pushed the structure_docs branch 2 times, most recently from 6ca3fda to 4280fb3 Compare November 10, 2017 02:51

DOC: update structured array docs to reflect numpy#6053, fixups

a08da3f

[ci skip]

ahaldane force-pushed the structure_docs branch from 4280fb3 to a08da3f Compare November 10, 2017 02:54

charris merged commit de26584 into numpy:master Nov 11, 2017

ahaldane mentioned this pull request Nov 20, 2017

DOC: describe the expansion of take and apply_along_axis in detail #9946

Merged

