Skip to content

Conversation

nzjrs
Copy link

@nzjrs nzjrs commented Aug 5, 2014

Based on the original pythonic branch by Hildon Bristow this branch adds more flexibility in how variables returned from functions are handled (576ee23). This is best demonstrated with an example

x,y = mlab.meshgrid(range(1,4),range(10,15),nout=2)

returns the two matrices to the python side

placeholder = mlab.meshgrid(range(15,20),range(10,15),nout=2,saveout=('X','Y'))

saves the two output variables as X and Y in MATLAB. Subsequently one can do

mlab.run_code('Z=X+Y')
z = mlab.get_variable('Z')

and so on. This saves the serialization penality when one wishes to generatively or programatically call MATLAB code from python without necessarily wanting to generate whole matlab m files.

This required a few other changes, such as unifying run_code, run_script and run_function in such a way that they are all executed in the base workspace so variables can be shared (a07604a, 3993679, e315ad0).

I also added logging (cdaded9) and the ability to show MATLAB stdout (3a880cf).

Finally, due to the ambiguity of passing and returning adjacent matrix arguments via textual JSON, all arguments are passed via dictionaries/structures and converted on appropriately (7d38469)

The obvious extension to this work is to extend the placeholder listed above to be a true proxy type. When this proxy type is passed across the python/MATLAB boundary it would be done only as a reference and the value filled out appropriately on the MATLAB side. Similarly, on the python side any attempt to perform arithemetic with the proxy type would return its true value (e.g. https://pypi.python.org/pypi/ProxyTypes)

@nzjrs nzjrs mentioned this pull request Aug 5, 2014
@arokem
Copy link
Owner

arokem commented Aug 5, 2014

Cool. Thanks for working on this!

My main concern with this branch is that it breaks backwards compatibility rather dramatically. That is, none of the tests we have in place passes (I can't even run the test suite cleanly).

Moreover, I can't really get this to work:

In [4]: x,y = matlab.meshgrid(range(1,4), range(10,15),nout=2)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-5437a7e15002> in <module>()
----> 1 x,y = matlab.meshgrid(range(1,4), range(10,15),nout=2)

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in __getattr__(self, name)
    407 
    408     def __getattr__(self, name):
--> 409         return self.bind_method(name)
    410     def bind_method(self, name, unconditionally=False):
    411         """Generate a Matlab function and bind it to the instance

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in bind_method(self, name, unconditionally)
    439         """
    440         # TODO: This does not work if the function is a mex function inside a folder of the same name
--> 441         if not unconditionally and not self.exist(name):
    442             raise AttributeError("'Matlab' object has no attribute '%s'" % name)
    443 

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in __call__(self, _, *args, **kwargs)
    543         self.parent.log("REQ:     %r:"%req)
    544 
--> 545         resp  = self.parent.execute_in_matlab(req)
    546 
    547         # return the result

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in execute_in_matlab(self, req)
    399         resp = AttributeDict(json.loads(resp, cls=MatlabDecoder))
    400         if not resp.success:
--> 401             raise RuntimeError(resp.result +': '+ resp.message)
    402 
    403         if hasattr(resp, 'result') and isinstance(resp.result, dict) and 'nout' in resp.result:

RuntimeError: MATLAB:UndefinedFunction: Undefined function 'strsplit' for input arguments of type 'char'.

Error in matlabserver>call (line 76)
    [saveout nsaveout] = strsplit(req.saveout,';');

Error in matlabserver (line 32)
                    resp = call(req);

@arokem
Copy link
Owner

arokem commented Aug 5, 2014

Also - please rebase on top of master to make it easier to review and (ultimately) merge.

@nzjrs
Copy link
Author

nzjrs commented Aug 5, 2014

Hmm, it appears strsplit was (unbelievably) only added in r2013. I'll just
ship l ship my own.

I will address the rest of your mail tomorrow.
On 5/08/2014 7:07 PM, "Ariel Rokem" notifications@github.com wrote:

Cool. Thanks for working on this!

My main concern with this branch is that it breaks backwards compatibility
rather dramatically. That is, none of the tests we have in place passes (I
can't even run the test suite cleanly).

Moreover, I can't really get this to work:

In [4]: x,y = matlab.meshgrid(range(1,4), range(10,15),nout=2)

RuntimeError Traceback (most recent call last)
in ()
----> 1 x,y = matlab.meshgrid(range(1,4), range(10,15),nout=2)

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in getattr(self, name)
407
408 def getattr(self, name):
--> 409 return self.bind_method(name)
410 def bind_method(self, name, unconditionally=False):
411 """Generate a Matlab function and bind it to the instance

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in bind_method(self, name, unconditionally)
439 """
440 # TODO: This does not work if the function is a mex function inside a folder of the same name
--> 441 if not unconditionally and not self.exist(name):
442 raise AttributeError("'Matlab' object has no attribute '%s'" % name)
443

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in call(self, _, _args, *_kwargs)
543 self.parent.log("REQ: %r:"%req)
544
--> 545 resp = self.parent.execute_in_matlab(req)
546
547 # return the result

/Users/arokem/tmp/junk/lib/python2.7/site-packages/pymatbridge/pymatbridge.pyc in execute_in_matlab(self, req)
399 resp = AttributeDict(json.loads(resp, cls=MatlabDecoder))
400 if not resp.success:
--> 401 raise RuntimeError(resp.result +': '+ resp.message)
402
403 if hasattr(resp, 'result') and isinstance(resp.result, dict) and 'nout' in resp.result:

RuntimeError: MATLAB:UndefinedFunction: Undefined function 'strsplit' for input arguments of type 'char'.

Error in matlabserver>call (line 76)
[saveout nsaveout] = strsplit(req.saveout,';');

Error in matlabserver (line 32)
resp = call(req);


Reply to this email directly or view it on GitHub
#77 (comment)
.

@nzjrs
Copy link
Author

nzjrs commented Aug 7, 2014

You should be able to run this now - I replaced strsplit with regexp, which apparently has existed for longer.

What is your general thoughts about the API? It breaks it to the same degree as Hildon's branch, and I'm not sure what your thoughts were on that branch either.

@arokem
Copy link
Owner

arokem commented Aug 11, 2014

Hi - sorry for the slow response. In general, I really like this API, and
think that it is a good addition, but we need to find a way for this to be
at least somewhat backwards compatible. My main priority is to still be
able to use the matlab magic in ipython notebooks. I am very time-limited
in working on this, but I can try to take a closer look at what we would
need to do to harmonize things so that a merge is possible, rather than
silo-ing off this excellent new set of features in it's own branch.

On Thu, Aug 7, 2014 at 10:23 AM, John Stowers notifications@github.com
wrote:

You should be able to run this now - I replaced strsplit with regexp,
which apparently has existed for longer.

What is your general thoughts about the API? It breaks it to the same
degree as Hildon's branch, and I'm not sure what your thoughts were on that
branch either.


Reply to this email directly or view it on GitHub
#77 (comment)
.

@arokem
Copy link
Owner

arokem commented Aug 19, 2014

OK - I just spent a couple of hours yesterday/today trying to understand this code, and to see whether it could be unified with the code in master. Have you had a chance to take a look and see whether you can rebase on top of master in any reasonable way, and make the tests pass? It seems that the original PR (#63) was already so far from the original functionality, that it's hard to make things work. I think that it would be OK to change the API, but we need some backwards compatibility.

@nzjrs
Copy link
Author

nzjrs commented Aug 21, 2014

Yeah, I had a quick look. The main difficulty with rebasing cleanly is the
first commit 12e0411, it really just
replaces all the old specific pymat_call function with a simpler execution
model.

All commits following that are more reasoned.

I agree with the part about backwards compatibility - I will think about
that.

Maybe I add it as a shim in a pymatlab.compat module

On Tue, Aug 19, 2014 at 7:40 PM, Ariel Rokem notifications@github.com
wrote:

OK - I just spent a couple of hours yesterday/today trying to understand
this code, and to see whether it could be unified with the code in master.
Have you had a chance to take a look and see whether you can rebase on top
of master in any reasonable way, and make the tests pass? It seems that the
original PR (#63 #63)
was already so far from the original functionality, that it's hard to make
things work. I think that it would be OK to change the API, but we need
some backwards compatibility.


Reply to this email directly or view it on GitHub
#77 (comment)
.

@arokem
Copy link
Owner

arokem commented Aug 22, 2014

As I understand it, that commit introduced 'pymat_call.m' and made some
associated changes in pymatbridge.py and in the structure of the
messenger. I think that could actually be reconciled with what we currently
have in pymatbridge.py. Or am I missing something? Maybe I'll try to make
a separate PR with just that change in functionality, and we can start
going slowly from there?

I don't think that we want to have two incompatible sets of functionality
in two different modules? Is that what you were suggesting with the
pymatlab.compat module shim?

On Thu, Aug 21, 2014 at 4:34 AM, John Stowers notifications@github.com
wrote:

Yeah, I had a quick look. The main difficulty with rebasing cleanly is the
first commit 12e0411, it really just
replaces all the old specific pymat_call function with a simpler execution
model.

All commits following that are more reasoned.

I agree with the part about backwards compatibility - I will think about
that.

Maybe I add it as a shim in a pymatlab.compat module

On Tue, Aug 19, 2014 at 7:40 PM, Ariel Rokem notifications@github.com
wrote:

OK - I just spent a couple of hours yesterday/today trying to understand
this code, and to see whether it could be unified with the code in
master.
Have you had a chance to take a look and see whether you can rebase on
top
of master in any reasonable way, and make the tests pass? It seems that
the
original PR (#63 #63)

was already so far from the original functionality, that it's hard to
make
things work. I think that it would be OK to change the API, but we need
some backwards compatibility.


Reply to this email directly or view it on GitHub
<
https://github.com/arokem/python-matlab-bridge/pull/77#issuecomment-52670561>

.


Reply to this email directly or view it on GitHub
#77 (comment)
.

@nzjrs
Copy link
Author

nzjrs commented Aug 25, 2014

So I'm slowly cleaning this up now, or at least removing no longer called code. I mispoke earlier about pymatlab_call. That was cruft that Hildon later removed.

The first commit could definitely be split somehow, it is two parts

  1. the pythonic API
  2. greatly simplifying the matlab side code by putting the call function/script/assign/get variable code into the python layer.

I'm not sure about a backwards compatibility story yet.

I think 2 is something that could be adopted upstream separately, although I have not tried to split/rebase that yet.

@arokem
Copy link
Owner

arokem commented Aug 27, 2014

If these things could be separated into two PRs, that would be very
helpful.

I would think that we could do number 2 first, followed by a careful
implementation of number 1. Presumably number 2 would be easier to keep
backwards compatible, which is why I propose to start with that.

How does that sound to you?

On Mon, Aug 25, 2014 at 7:38 AM, John Stowers notifications@github.com
wrote:

So I'm slowly cleaning this up now, or at least removing no longer called
code. I mispoke earlier about pymatlab_call. That was cruft that Hildon
later removed.

The first commit could definitely be split somehow, it is two parts

  1. the pythonic API
  2. greatly simplifying the matlab side code by putting the call
    function/script/assign/get variable code into the python layer.

I'm not sure about a backwards compatibility story yet.

I think 2 is something that could be adopted upstream separately, although
I have not tried to split/rebase that yet.


Reply to this email directly or view it on GitHub
#77 (comment)
.

Hilton Bristow and others added 15 commits September 24, 2014 14:38
* draw the plots with a call to matlab.drawnow().
Shows a full MATLAB traceback in case of error
This allows python callers to optionally save variables
returned from functions on the MATLAB side. This improves
performance as large objects do not need to be
serialized so frequently.

In combination with changes to run_code and run_script
it allows more performant execution of line-by-line
MATLAB from python
Add capture_stdout to constructor. Passing False
allows one to see display() etc messages
Add log to constructor arguments which shows which
functions are called, and what their arguments are
Because run_code, run_script, and run_function all
manipulate the MATLAB path add a local python
path cache to reduce the number of addpath calls
* It didnt actually work with functions with
  multiple arguments
* Log function proxy creation
* So we can get to the variables created within
Due to the inherent ambiguity in matlab's JSON
decoding ability this is the only way to make
passing >1 same-sized array arguments work. Otherwise
the matlab json library coerces them to a matrix
Runs a m script file and stores any variables
in the base workspace
Write a temporary m file and execute it via run_script to
allows variables to persist in the base workspace.
@nzjrs
Copy link
Author

nzjrs commented Sep 24, 2014

A few more cleanups to this PR to update it to what I am running day-to-day. This work concludes my attempts to improve performance. ec9757b passes large matrices via the filesytem (for a > 20x speed improvement in many cases. c3d90a7 adds 'proxy variable' support to function calls such that arguments and return values left on the python side can be transparently returned to the python side using call. For example

#perform a command but leave the result in matlab
_x,_y = mlab.meshgrid(range(1,4),range(10,15),nout=2,saveout=('X','Y'))
assert repr(_x) == '<ProxyVariable __VAR=X|double([5 3])>' #this is a proxy variable

#now return the real variable, not the proxy
x = _x()

#alternatively the same thing
x = mlab.get_variable('X')
#also, calling str(proxy) will return the name
x = mlab.get_variable(str(_x))

#now passing the proxy to a call automatically decodes it
z = mlab.cat(3,_x,_y)

#putting this all together allows performant mixing of python and MATLAB
#side code
#mixing variables on the matlab side
mlab.run_code('zb=cat(3,%s,%s);' % (_x, _y))
zb = mlab.get_variable('zb')

@arokem
Copy link
Owner

arokem commented Sep 25, 2014

This all sounds great. Any more thoughts on how we might merge this into master? It's getting more and more difficult, as things diverge further and further.

@nzjrs nzjrs force-pushed the store-variables branch 3 times, most recently from 893c0c7 to 446ee24 Compare October 1, 2014 09:25
nzjrs added 5 commits October 1, 2014 11:30
Proxy variables will are now returned from calls to
functions with saveout=(...). These variables can
be passed to subsequent function calls and will
be transparently reconstituted on the MATLAB
side
Allows one to do

with mlab.fig('foo.jpg') as f:
    mlab.plot(...)
@nzjrs
Copy link
Author

nzjrs commented Feb 18, 2015

Good to see the other stuff merged. I should take a look and rebase this branch again.

I have been using my fork for a while now, and things seem to be working very well.

@blink1073
Copy link
Collaborator

@nzjrs, I believe the only thing we haven't covered is the proxy idea. You are welcome to create a new PR that implements that piece.

@blink1073 blink1073 closed this Feb 18, 2015
@arokem
Copy link
Owner

arokem commented Feb 18, 2015

Yes - it would be great if you could rebase on the current master, and make a PR from the difference. Note that many other things have changed, and merging into your fork might be tricky. @blink1073 has been on fire for the last few days.

@blink1073
Copy link
Collaborator

It looks like we also need:
"I also added logging (cdaded9) and the ability to show MATLAB stdout (3a880cf)."

@nzjrs
Copy link
Author

nzjrs commented Feb 18, 2015

@blink1073 Is the performance still not very good here because of the json serialisation?

I go via /tmp files for large arrays in my branch.

@blink1073
Copy link
Collaborator

we are using base64 to encode numpy arrays, which is ~10X faster than the old way for floating points, and does not rely on the file system.

@arokem
Copy link
Owner

arokem commented Feb 18, 2015

See here: #103

On Wed, Feb 18, 2015 at 11:13 AM, Steven Silvester <notifications@github.com

wrote:

we are using base64 to encode numpy arrays, which is ~10X faster than the
old way for floating points, and does not rely on the file system.


Reply to this email directly or view it on GitHub
#77 (comment)
.

@nzjrs
Copy link
Author

nzjrs commented Feb 18, 2015

@arokem, @blink1073 I tried that for large arrays, and the filesystem was faster (although maybe a little messy, although I clean up after myself IIRC). Thanks for the note though.

There are a few other things, like a fig decorator, and some other bits and pieces in other branches. I will have another look.

@blink1073
Copy link
Collaborator

If you can, please try and limit to one feature per PR to make it easier to review/merge.

@blink1073
Copy link
Collaborator

I used the filesystem route exclusively in Oct2Py, and I've found it to be more trouble than it was worth. Typically passing arrays should not be the bottleneck, assuming the computation itself is what is expensive, just my 2 cents.

@nzjrs
Copy link
Author

nzjrs commented Feb 18, 2015

@blink1073 will do. I'm normally a polite pull request person (one feature per PR, etc), but this was more of a 'so it doesn't get lost kind of PR'.

I thought you might of been interested in my changes, as a real world user of this on large datasets. At the moment I am a in paper-preparation mode, so my free time is limited, but I might file some of my other changes later.

@blink1073
Copy link
Collaborator

Understood on all accounts. If I get the itch, I'll fold in a few of your ideas, keep them coming. 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants