Opened at 2010-08-28T01:13:53Z
Last modified at 2014-12-02T19:44:04Z
#1189 new defect
investigate best FUSE+sshfs options to use for performance and correctness of SFTP via sshfs
| Reported by: | davidsarah | Owned by: | bj0 |
|---|---|---|---|
| Priority: | major | Milestone: | undecided |
| Component: | code-frontend-ftp-sftp | Version: | 1.8β |
| Keywords: | sftp sshfs performance docs | Cc: | |
| Launchpad Bug: | | | |
Description
It looks as though at least the direct_io and big_writes options may be beneficial, so that writes are not limited to 4 KiB blocks.
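A minimal sketch of the mount under discussion (the port, server, and mount point are placeholders, not taken from this ticket): big_writes allows FUSE to pass writes larger than 4 KiB through to sshfs, and direct_io bypasses the kernel page cache.

```shell
# Hypothetical mount of a Tahoe-LAFS SFTP frontend via sshfs,
# with the two options this ticket asks to investigate:
#   big_writes - allow FUSE writes larger than 4 KiB
#   direct_io  - bypass the kernel page cache for reads/writes
sshfs -p 8022 tahoe@localhost:/ ~/tahoe-mnt -o big_writes,direct_io

# unmount when done
fusermount -u ~/tahoe-mnt
```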
Change History (6)
comment:1 Changed at 2010-08-28T08:26:44Z by bj0
comment:2 Changed at 2010-08-28T14:40:15Z by zooko
Dear bj0: thanks for the report!
What version(s) of Tahoe-LAFS were you using? If you have just been tracking the official trunk repo at http://tahoe-lafs.org/source/tahoe-lafs/trunk and haven't applied any other patches, then you can find out by running make make-version.
comment:3 Changed at 2010-08-28T22:49:18Z by davidsarah
Thanks bj0.
big_writes should only affect writes, and I can't immediately see why it would have anything but a beneficial effect. direct_io might affect both reads and writes, and could cause some loss of performance for applications that rely on kernel caching for performance. Can you try the same tests with -o big_writes only?
[I initially suggested both because http://xtreemfs.blogspot.com/2008/08/fuse-performance.html said that direct_io was needed (at least for some version of sshfs and Linux tested in 2008) to support writing in blocks greater than 4 KiB. However, point 2 in http://article.gmane.org/gmane.comp.file-systems.fuse.devel/5292 suggests that this restriction might have been lifted.]
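The comparison suggested above could be run roughly as follows (a sketch, not from the ticket: the port, server, mount point, and test file are placeholder names):

```shell
# Hypothetical A/B test: time the same large copy with big_writes only,
# then with both options, to isolate the effect of direct_io.
for OPTS in big_writes big_writes,direct_io; do
    sshfs -p 8022 tahoe@localhost:/ ~/tahoe-mnt -o "$OPTS"
    echo "options: $OPTS"
    time cp ~/large.iso ~/tahoe-mnt/iso     # write test
    time md5sum ~/tahoe-mnt/iso             # read test
    fusermount -u ~/tahoe-mnt
done
```

Running each copy more than once per option set would help average out the fluctuation noted later in the ticket.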
What Linux kernel version and sshfs version did you use?
comment:4 Changed at 2010-08-29T07:18:10Z by bj0
make make-version returned: setup.py darcsver: wrote '1.8.0c2-r4702' into src/allmydata/_version.py
sshfs Version: 2.2-1build1 (from ubuntu repo)
client uname -a: Linux nazgul 2.6.32-020632-generic #020632 SMP Thu Dec 3 10:09:58 UTC 2009 x86_64 GNU/Linux
server uname -a: Linux testbuntu 2.6.32-24-generic #41-Ubuntu SMP Thu Aug 19 01:12:52 UTC 2010 i686 GNU/Linux
I was going to try the tests without direct_io, but I seem to be having trouble with my vm...
comment:5 Changed at 2010-09-11T23:57:32Z by davidsarah
- Owner set to bj0
comment:6 Changed at 2014-12-02T19:44:04Z by warner
- Component changed from code-frontend to code-frontend-ftp-sftp
I started doing a couple of random tests I could think of. I didn't repeat many of the tests, since they took a while and it's all manual, but they give a rough idea:
Most of these tests were copying a large file (a 728 MB .iso) to and from a Tahoe introducer/storage client running inside a VirtualBox VM on the same host (both guest and host running Ubuntu). Most of the copying was done with "time rsync -rhPa ", and when copying a large file to Tahoe, the command hangs for another minute or so after the transfer. I checked the flog during this time and there was activity (reads, I think), so it may be that rsync checksums the file after the transfer; I'm not sure.
To verify that the file was being transferred correctly, I also did "time md5sum mnt/iso". I would expect this to take about as long as simply reading the file, but for some reason it performed differently...
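To separate the raw transfer time from any post-transfer verification, the write and the read-back check can be timed independently (a sketch; mnt/ and the .iso path follow the naming used in these tests):

```shell
# Time the raw write (cp makes no post-transfer verification pass,
# unlike rsync, which may re-read to checksum)
time cp ~/big.iso mnt/iso

# Time a plain read-back over the mount, then the md5sum read;
# comparing the two shows whether checksumming itself, rather than
# reading through sshfs, is the bottleneck
time cat mnt/iso > /dev/null
time md5sum mnt/iso
```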
with it mounted as: "sshfs -p PORT server:/ mnt/"
with it mounted as: "sshfs -p PORT -o direct_io,big_writes server:/ mnt/"
Obviously I expect the values to fluctuate a bit, but it seems like direct_io,big_writes bumps up the write speed somewhat without really affecting the read speed. I'm not sure why it degraded the md5sum time so badly...
I also tried to rsync a large directory of source files (4 MB, 981 files) to Tahoe, but it behaved oddly and stalled a lot, resulting in a very long transfer time (120 to 140 minutes). This happened both with and without the options.