This repository was archived by the owner on Dec 31, 2021. It is now read-only.

parallel-sshfs

An unmaintained fsspec implementation for the ssh/sftp protocol, built on parallel-ssh.

Disclaimer

I didn't spend a lot of time on this project; it was just a test to see how well parallel-ssh performs in our use case, though I wasn't able to run the benchmarks due to various errors. Note also that this implementation depends on several internal methods of parallel-ssh (like _eagain, _make_sftp) and undocumented public methods (like eagain_write). I couldn't find a public API that covers all the operations I needed, so be aware of that.

You might encounter different sorts of errors from different places; some arise from race conditions and others seem to happen randomly. Here are a few that I stumbled upon:

  • When writing to a stream, it might poll indefinitely and never continue.
  • When writing to a stream, it sometimes hits an assertion in the underlying libssh2 library (the C one, not the Python binding) and dumps core.
  • There is a double free somewhere that I didn't try to track down. I suspect a race condition, though even then it shouldn't happen.

Other things to note:

  • Since all sftp errors surface as one generic error, we have to spend a couple of extra exists() calls on each operation to make sure a missing file is reported as FileNotFoundError. This is costly and also doesn't guarantee much, since you might still hit other sorts of errors that are nearly indistinguishable.
  • No Kerberos/GSS auth for the ssh2 client, though parallel-ssh claims it is available on the other client.
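
To illustrate the exists() workaround from the first note, here is a minimal sketch of how a generic error could be translated into FileNotFoundError. It is not the code from this repository: `GenericSFTPError`, `FakeSFTP` and `read_file` are hypothetical names made up for the example, and the backend is an in-memory stand-in rather than parallel-ssh. This variant only calls exists() after a failure, which may differ from what the project actually does.

```python
import errno


class GenericSFTPError(Exception):
    """Hypothetical stand-in for the single generic error an sftp client raises."""


class FakeSFTP:
    """Hypothetical in-memory backend, only here to make the sketch runnable."""

    def __init__(self, files):
        self.files = dict(files)
        self.exists_calls = 0  # counts the extra round trips we pay for

    def exists(self, path):
        self.exists_calls += 1
        return path in self.files

    def read(self, path):
        if path not in self.files:
            # The real cause is hidden behind a generic error.
            raise GenericSFTPError("sftp operation failed")
        return self.files[path]


def read_file(sftp, path):
    """Read a file, mapping a generic failure to FileNotFoundError if possible."""
    try:
        return sftp.read(path)
    except GenericSFTPError as exc:
        # Extra exists() call to guess at the cause of the failure.
        # This is racy: the file may appear or vanish between the two calls.
        if not sftp.exists(path):
            raise FileNotFoundError(
                errno.ENOENT, "No such file or directory", path
            ) from exc
        # The path exists, so the cause is something else; re-raise as-is.
        raise
```

Every failing operation costs at least one additional round trip here, and a failure on an existing path still comes back as the generic error, which is the limitation described above.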