-
Notifications
You must be signed in to change notification settings - Fork 96
Unify variable visibility and optionally leave in MATLAB #77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Cool. Thanks for working on this! My main concern with this branch is that it breaks backwards compatibility rather dramatically. That is, none of the tests we have in place passes (I can't even run the test suite cleanly). Moreover, I can't really get this to work:
|
Also - please rebase on top of master to make it easier to review and (ultimately) merge. |
Hmm, it appears strsplit was (unbelievably) only added in r2013. I'll just I will address the rest of your mail tomorrow.
|
You should be able to run this now - I replaced strsplit with regexp, which apparently has existed for longer. What is your general thoughts about the API? It breaks it to the same degree as Hildon's branch, and I'm not sure what your thoughts were on that branch either. |
Hi - sorry for the slow response. In general, I really like this API, and On Thu, Aug 7, 2014 at 10:23 AM, John Stowers notifications@github.com
|
OK - I just spent a couple of hours yesterday/today trying to understand this code, and to see whether it could be unified with the code in master. Have you had a chance to take a look and see whether you can rebase on top of master in any reasonable way, and make the tests pass? It seems that the original PR (#63) was already so far from the original functionality, that it's hard to make things work. I think that it would be OK to change the API, but we need some backwards compatibility. |
Yeah, I had a quick look. The main difficulty with rebasing cleanly is the All commits following that are more reasoned. I agree with the part about backwards compatibility - I will think about Maybe I add it as a shim in a pymatlab.compat module On Tue, Aug 19, 2014 at 7:40 PM, Ariel Rokem notifications@github.com
|
As I understand it, that commit introduced 'pymat_call.m' and made some I don't think that we want to have two incompatible sets of functionality On Thu, Aug 21, 2014 at 4:34 AM, John Stowers notifications@github.com
|
So I'm slowly cleaning this up now, or at least removing no longer called code. I mispoke earlier about pymatlab_call. That was cruft that Hildon later removed. The first commit could definitely be split somehow, it is two parts
I'm not sure about a backwards compatibility story yet. I think 2 is something that could be adopted upstream separately, although I have not tried to split/rebase that yet. |
If these things could be separated into two PRs, that would be very I would think that we could do number 2 first, followed by a careful How does that sound to you? On Mon, Aug 25, 2014 at 7:38 AM, John Stowers notifications@github.com
|
* draw the plots with a call to matlab.drawnow().
Shows a full MATLAB traceback in case of error
This allows python callers to optionally save variables returned from functions on the MATLAB side. This improves performance as large objects do not need to be serialized so frequently. In combination with changes to run_code and run_script it allows more performant execution of line-by-line MATLAB from python
Add capture_stdout to constructor. Passing False allows one to see display() etc messages
Add log to constructor arguments which shows which functions are called, and what their arguments are
Because run_code, run_script, and run_function all manipulate the MATLAB path add a local python path cache to reduce the number of addpath calls
* It didnt actually work with functions with multiple arguments * Log function proxy creation
* So we can get to the variables created within
Due to the inherent ambiguity in matlab's JSON decoding ability this is the only way to make passing >1 same-sized array arguments work. Otherwise the matlab json library coerces them to a matrix
Runs a m script file and stores any variables in the base workspace
Write a temporary m file and execute it via run_script to allows variables to persist in the base workspace.
24a20dd
to
023127e
Compare
A few more cleanups to this PR to update it to what I am running day-to-day. This work concludes my attempts to improve performance. ec9757b passes large matrices via the filesytem (for a > 20x speed improvement in many cases. c3d90a7 adds 'proxy variable' support to function calls such that arguments and return values left on the python side can be transparently returned to the python side using call. For example #perform a command but leave the result in matlab
_x,_y = mlab.meshgrid(range(1,4),range(10,15),nout=2,saveout=('X','Y'))
assert repr(_x) == '<ProxyVariable __VAR=X|double([5 3])>' #this is a proxy variable
#now return the real variable, not the proxy
x = _x()
#alternatively the same thing
x = mlab.get_variable('X')
#also, calling str(proxy) will return the name
x = mlab.get_variable(str(_x))
#now passing the proxy to a call automatically decodes it
z = mlab.cat(3,_x,_y)
#putting this all together allows performant mixing of python and MATLAB
#side code
#mixing variables on the matlab side
mlab.run_code('zb=cat(3,%s,%s);' % (_x, _y))
zb = mlab.get_variable('zb') |
This all sounds great. Any more thoughts on how we might merge this into |
893c0c7
to
446ee24
Compare
Proxy variables will are now returned from calls to functions with saveout=(...). These variables can be passed to subsequent function calls and will be transparently reconstituted on the MATLAB side
Allows one to do with mlab.fig('foo.jpg') as f: mlab.plot(...)
Good to see the other stuff merged. I should take a look and rebase this branch again. I have been using my fork for a while now, and things seem to be working very well. |
@nzjrs, I believe the only thing we haven't covered is the proxy idea. You are welcome to create a new PR that implements that piece. |
Yes - it would be great if you could rebase on the current master, and make a PR from the difference. Note that many other things have changed, and merging into your fork might be tricky. @blink1073 has been on fire for the last few days. |
@blink1073 Is the performance still not very good here because of the json serialisation? I go via /tmp files for large arrays in my branch. |
we are using base64 to encode numpy arrays, which is ~10X faster than the old way for floating points, and does not rely on the file system. |
See here: #103 On Wed, Feb 18, 2015 at 11:13 AM, Steven Silvester <notifications@github.com
|
@arokem, @blink1073 I tried that for large arrays, and the filesystem was faster (although maybe a little messy, although I clean up after myself IIRC). Thanks for the note though. There are a few other things, like a fig decorator, and some other bits and pieces in other branches. I will have another look. |
If you can, please try and limit to one feature per PR to make it easier to review/merge. |
I used the filesystem route exclusively in Oct2Py, and I've found it to be more trouble than it was worth. Typically passing arrays should not be the bottleneck, assuming the computation itself is what is expensive, just my 2 cents. |
@blink1073 will do. I'm normally a polite pull request person (one feature per PR, etc), but this was more of a 'so it doesn't get lost kind of PR'. I thought you might of been interested in my changes, as a real world user of this on large datasets. At the moment I am a in paper-preparation mode, so my free time is limited, but I might file some of my other changes later. |
Understood on all accounts. If I get the itch, I'll fold in a few of your ideas, keep them coming. 😄 |
Based on the original pythonic branch by Hildon Bristow this branch adds more flexibility in how variables returned from functions are handled (576ee23). This is best demonstrated with an example
returns the two matrices to the python side
saves the two output variables as X and Y in MATLAB. Subsequently one can do
and so on. This saves the serialization penality when one wishes to generatively or programatically call MATLAB code from python without necessarily wanting to generate whole matlab m files.
This required a few other changes, such as unifying run_code, run_script and run_function in such a way that they are all executed in the base workspace so variables can be shared (a07604a, 3993679, e315ad0).
I also added logging (cdaded9) and the ability to show MATLAB stdout (3a880cf).
Finally, due to the ambiguity of passing and returning adjacent matrix arguments via textual JSON, all arguments are passed via dictionaries/structures and converted on appropriately (7d38469)
The obvious extension to this work is to extend the
placeholder
listed above to be a true proxy type. When this proxy type is passed across the python/MATLAB boundary it would be done only as a reference and the value filled out appropriately on the MATLAB side. Similarly, on the python side any attempt to perform arithemetic with the proxy type would return its true value (e.g. https://pypi.python.org/pypi/ProxyTypes)