Commit b16ef72

implemented LArray.apply
1 parent 33e215c commit b16ef72

2 files changed: +73 -0 lines changed

doc/source/changes/version_0_30.rst.inc

Lines changed: 69 additions & 0 deletions
@@ -168,6 +168,75 @@ New features
     a0 0 0 2
     a1 0 1 1

+* implemented :py:obj:`LArray.apply()` method to apply a Python function or mapping to all
+  elements of an array or to all sub-arrays along some axes of an array and return the result. This is an extremely
+  versatile method as it can be used with both aggregating and element-wise functions.
+
+  First let us define a test array:
+
+    >>> arr = LArray([[0, 2, 1],
+    ...               [3, 1, 5]], 'a=a0,a1;b=b0..b2')
+    >>> arr
+    a\b  b0  b1  b2
+     a0   0   2   1
+     a1   3   1   5
+
+  Here is a simple function we would like to apply to each element of the array.
+  Note that this particular example should rather be written as: arr ** 2
+  as it is both more concise and much faster.
+
+    >>> def square(x):
+    ...     return x ** 2
+    >>> arr.apply(square)
+    a\b  b0  b1  b2
+     a0   0   4   1
+     a1   9   1  25
+
+  Now, assuming for a moment that the values of our test array above were in fact some numeric representation of
+  names and we had the correspondence to the actual names stored in a dictionary:
+
+    >>> code_to_names = {0: 'foo', 1: 'bar', 2: 'baz',
+    ...                  3: 'boo', 4: 'far', 5: 'faz'}
+
+  We could get back an array with the actual names by using:
+
+    >>> arr.apply(code_to_names)
+    a\b   b0   b1   b2
+     a0  foo  baz  bar
+     a1  boo  bar  faz
+
+  Functions can also be applied along some axes:
+
+    >>> # this is equivalent to (but much slower than): arr.sum_by('a')
+    ... arr.apply(sum, 'a')
+    a  a0  a1
+        3   9
+
+  Applying the function along some axes will return an array with the
+  union of those axes and the axes of the returned values. For example,
+  let us define a function which returns the k highest values of an array.
+
+    >>> def topk(a, k=2):
+    ...     return a.sort_values(ascending=False).ignore_labels().i[:k]
+    >>> arr.apply(topk, 'a')
+    a\b*  0  1
+      a0  2  1
+      a1  5  3
+
+  Other arguments can be passed to the function as a tuple in the "args" argument:
+
+    >>> arr.apply(topk, axes='a', args=(3,))
+    a\b*  0  1  2
+      a0  2  1  0
+      a1  5  3  1
+
+  or by using keyword arguments:
+
+    >>> arr.apply(topk, axes='a', k=3)
+    a\b*  0  1  2
+      a0  2  1  0
+      a1  5  3  1
+
 * implemented :py:obj:`LArray.keys()`, :py:obj:`LArray.values()` and :py:obj:`LArray.items()`
   methods to iterate (loop) on an array's labels (keys), values or (key, value) pairs.
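
The behaviour described in this changelog entry can be approximated on plain nested lists, which may help when reading the diff without larray installed. This is only a rough sketch: `apply_elementwise` and `apply_along_rows` are hypothetical helpers, not part of larray, and they ignore axis labels and dtype handling entirely.

```python
from collections.abc import Mapping


def apply_elementwise(rows, transform):
    """Apply a callable or a mapping to every element of a list of rows."""
    if isinstance(transform, Mapping):
        # A mapping acts as a lookup table, as in arr.apply(code_to_names).
        func = transform.__getitem__
    else:
        func = transform
    return [[func(value) for value in row] for row in rows]


def apply_along_rows(rows, func, *args, **kwargs):
    """Call func once per row, loosely mimicking arr.apply(func, 'a') on a 2D array."""
    return [func(row, *args, **kwargs) for row in rows]


def topk(values, k=2):
    """Return the k highest values, largest first."""
    return sorted(values, reverse=True)[:k]


# Same data as the test array in the changelog, without labels.
data = [[0, 2, 1],
        [3, 1, 5]]
code_to_names = {0: 'foo', 1: 'bar', 2: 'baz',
                 3: 'boo', 4: 'far', 5: 'faz'}

squared = apply_elementwise(data, lambda x: x ** 2)
names = apply_elementwise(data, code_to_names)
row_sums = apply_along_rows(data, sum)
top3 = apply_along_rows(data, topk, k=3)
```

Each row of `top3` has three entries, which is the list analogue of the extra anonymous `b*` axis shown in the doctest output.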

larray/core/array.py

Lines changed: 4 additions & 0 deletions
@@ -7725,6 +7725,10 @@ def apply(self, transform, axes=None, dtype=None, ascending=True, args=(), **kwa
           a0  2  1  0
           a1  5  3  1
         """
+        # XXX: we could go one step further than vectorize and support an array of callables which would be broadcast
+        #      with the other arguments. I don't know whether that would actually help because I think it is always
+        #      possible to emulate that with a single callable with an extra argument (e.g. type) which dispatches to
+        #      potentially different callables. It might be more practical & efficient though.
         if axes is None:
             if isinstance(transform, abc.Mapping):
                 mapping = transform
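
The trade-off raised in the XXX comment can be illustrated without larray. The sketch below uses plain lists to stand in for arrays; `DISPATCH`, `dispatching_transform` and the `kind` argument are hypothetical names chosen for illustration and do not appear in the commit. It compares pairing each element with its own callable against a single callable that dispatches on an extra argument.

```python
def square(x):
    return x ** 2


def negate(x):
    return -x


# Hypothetical dispatch table: a "kind" label selects the real callable.
DISPATCH = {'square': square, 'negate': negate}


def dispatching_transform(value, kind):
    """Single callable that picks the actual function from the extra argument."""
    return DISPATCH[kind](value)


values = [1, 2, 3, 4]

# Option 1: a sequence of callables paired element by element with the values.
callables = [square, negate, square, negate]
per_element = [func(value) for func, value in zip(callables, values)]

# Option 2: one callable everywhere, with the kind passed as an extra argument.
kinds = ['square', 'negate', 'square', 'negate']
emulated = [dispatching_transform(value, kind)
            for value, kind in zip(values, kinds)]
```

Both options give the same values here, which matches the comment's point that the single-callable form can emulate an array of callables, at the cost of one extra lookup per element.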
