ioorclone May 8, 2023, 8:49pm 1

What is the problem you are having with rclone?
I am trying to use the rclone sftp backend on a local network to transfer files. Rclone's sftp implementation is substantially slower than a simple sshfs mount or the OS's sftp client.

Run the command 'rclone version' and share the full output of the command.
rclone v1.59.1

Which cloud storage system are you using? (eg Google Drive)
sftp

The command you were trying to run (eg rclone copy /tmp remote:tmp)
rclone mount remote: /local/mount

The rclone config contents with secrets removed.
[remote]
user = username
type = sftp
host = localhost
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum

A log from the command with the -vv flag
2023/05/08 16:11:56 DEBUG : testfile.bin: Attr:
2023/05/08 16:11:56 DEBUG : testfile.bin: >Attr: a=valid=1s ino=0 size=965738496 mode=-rw-rw-r--, err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: Open: flags=OpenReadOnly
2023/05/08 16:11:56 DEBUG : testfile.bin: Open: flags=O_RDONLY
2023/05/08 16:11:56 DEBUG : testfile.bin: >Open: fd=testfile.bin (r), err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: >Open: fh=&{testfile.bin (r)}, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=0
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=131072
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.openRange at 0 length 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 0 length 4096 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 4096 length 8192 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 12288 length 16384 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 28672 length 32768 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 61440 length 65536 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 126976 length 131072 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 258048 length 262144 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 520192 length 524288 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=262144
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=393216
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 1044480 length 1048576 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=655360
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=524288
2023/05/08 16:11:56 DEBUG : testfile.bin: waiting for in-sequence read to 655360 for 20ms
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=917504
2023/05/08 16:11:56 DEBUG : testfile.bin: waiting for in-sequence read to 917504 for 20ms
...
2023/05/08 16:12:41 DEBUG : testfile.bin: waiting for in-sequence read to 965607424 for 20ms
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: Flush:
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Flush: err=<nil>
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: Release:
2023/05/08 16:12:41 DEBUG : testfile.bin: ReadFileHandle.Release closing
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Release: err=<nil>
2023/05/08 16:12:41 DEBUG : /: Lookup:
2023/05/08 16:12:41 DEBUG : /: >Lookup: node=testfile.bin, err=<nil>
2023/05/08 16:12:41 DEBUG : testfile.bin: Attr:
2023/05/08 16:12:41 DEBUG : testfile.bin: >Attr: a=valid=1s ino=0 size=965738496 mode=-rw-rw-r--, err=<nil>
2023/05/08 16:13:38 DEBUG : sftp://user@x.x.x.x:port/: Closing 2 unused connections

Test setup
I mounted the same remote directory (on the local network) with both rclone sftp and sshfs:
- Mount the rclone sftp directory with the above settings.
- Mount an sshfs directory with the following settings:
sshfs -o rw,follow_symlinks,reconnect,ServerAliveInterval=5,compression=no user@remote: /mount/sshfs
I then use Dolphin to copy a 1 GB file from the rclone sftp mounted drive and record the time, then repeat the same file copy using the simple sshfs mount and record the time.

Test results
rclone sftp copies the file in 44 seconds (around 185 Mb/s); sshfs copies the file in 30 seconds (around 270 Mb/s). sshfs is roughly 1.5x faster than rclone and I can't figure out why. I have tried adjusting many rclone settings, adding flags with varying values, and this has only made a marginal difference.
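As an aside, the ChunkedReader entries at the start of the debug log show rclone's read-ahead doubling: each read is twice the length of the previous one, and each offset is the running sum of the lengths so far. A small sketch reproducing the pattern (the sizes are taken from the log above, not from rclone's source):

```shell
#!/bin/sh
# Reproduce the doubling read pattern from the debug log:
# reads of 4096, 8192, 16384, ... at offsets 0, 4096, 12288, ...
offset=0
length=4096
for i in 1 2 3 4 5; do
  printf 'ChunkedReader.Read at %d length %d\n' "$offset" "$length"
  offset=$((offset + length))
  length=$((length * 2))
done
```

This prints exactly the offsets visible in the log (0, 4096, 12288, 28672, 61440), and the next read in the log (at 126976, length 131072) continues the same sequence.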
Different trial settings
Here is a list of the settings I have tried to adjust:
- Larger transfers in the rclone mount config (--transfers=50)
- Large transfers (--transfers=200)
- Very large transfers (--transfers=2000)
- Disabling the checksum (--no-checksum)
- Removing the md5 and sha1 commands from the remote config (md5sum_command, sha1sum_command)
- The chunk size flag with various values from large to small (--sftp-chunk-size)
- The per-file bandwidth limit flag with various values from large to small (--bwlimit-file)
- The different values for the vfs cache mode (--vfs-cache-mode off/minimal/writes/full)

I based these trials on the following thread discussions:

The results of adjusting all of these settings are disappointing. I am able to see only a very minor change, from 43 to 49 seconds (170-190 Mb/s average), to transfer the 1 GB file, but nothing approaching the simple sshfs transfer of 30 seconds (around 270 Mb/s). This doesn't really make sense from a conceptual point of view: isn't sshfs the more complex, heavier protocol? Nothing I have tried gets the rclone implementation of sftp to approach the speed of the simple sshfs mount.

Final test
As a final test I set up a remote using Dolphin's sftp remote template. This also uses sftp, and it eliminates the FUSE mount component of the test. The result was a transfer of 32 seconds, similar to sshfs. This indicates that the issue is not sftp versus sshfs, but either rclone's implementation of sftp or a configuration problem with my remote or mount setup.

Versions of other software
Here is the sshfs version I am using:
Local:
Local/Remote:

I tried asking ChatGPT as well but it did not produce anything of value. Any suggestions on what I can try or what I have overlooked? I would have expected the two solutions to be in the same ballpark. I really want to use rclone's sftp implementation but I can't justify it being so much slower.
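For reference, a hypothetical mount invocation combining several of the flags tried above might look like the following. The remote name, mount point, and flag values here are assumptions for illustration, not the exact commands used in the tests, so the sketch prints the command rather than running it:

```shell
#!/bin/sh
# Sketch only: print the candidate command instead of executing it,
# since no live remote is assumed here.
cmd="rclone mount remote: /local/mount \
  --transfers=50 \
  --no-checksum \
  --sftp-chunk-size=255k \
  --vfs-cache-mode=off \
  -vv"
echo "$cmd"
```

Note that --sftp-chunk-size has an upper limit imposed by the server (the SFTP draft specifies 32 KiB as the safe default, which is why values up to around 255k are the usual experiment range).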
asdffdsa (jojothehumanmonkey) May 8, 2023, 8:55pm 2
hi, can you copy a single file, testfile.bin, and post the full debug output?
ioorclone May 8, 2023, 9:21pm 3
have you tested using rclone copy, not rclone mount?

Hi, I just did a 4th test as you wrote your comment. You read my mind! Here are the results of doing an rclone copy:

<7>DEBUG : Using config file from "/path/rclone.conf"
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: New connection x.x.x.x:port->x.x.x.x:port to "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3"
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: Shell type "unix" from config
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: Relative path resolved to "/path/file.bin"
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: Using root directory "/path"
<7>DEBUG : fs cache: adding new entry for parent of "remote:file.bin", "remote:"
<7>DEBUG : Creating backend with remote "/path"
2023-05-08 17:09:41 DEBUG : file.bin: Need to transfer - File not found at Destination
2023-05-08 17:09:41 DEBUG : file.bin: Starting multi-thread copy with 3 parts of size 307Mi
2023-05-08 17:09:41 DEBUG : file.bin: multi-thread copy: stream 2/3 (321912832-643825664) size 307Mi starting
2023-05-08 17:09:41 DEBUG : file.bin: multi-thread copy: stream 3/3 (643825664-965738496) size 307Mi starting
2023-05-08 17:09:41 DEBUG : file.bin: multi-thread copy: stream 1/3 (0-321912832) size 307Mi starting
2023-05-08 17:09:41 DEBUG : sftp://user@x.x.x.x:port/file.bin: New connection x.x.x.x:port->x.x.x.x:port to "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3"
2023-05-08 17:09:41 DEBUG : sftp://user@x.x.x.x:port/file.bin: New connection x.x.x.x:port->x.x.x.x:port to "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3"
2023-05-08 17:10:02 DEBUG : file.bin: multi-thread copy: stream 3/3 (643825664-965738496) size 307Mi finished
2023-05-08 17:10:06 DEBUG : file.bin: multi-thread copy: stream 1/3 (0-321912832) size 307Mi finished
2023-05-08 17:10:09 DEBUG : file.bin: multi-thread copy: stream 2/3 (321912832-643825664) size 307Mi finished
2023-05-08 17:10:09 DEBUG : file.bin: Finished multi-thread copy with 3 parts of size 307Mi
2023-05-08 17:10:09 DEBUG : sftp://user@x.x.x.x:port/file.bin: Shell path "/path/file.bin"
2023-05-08 17:10:09 DEBUG : sftp://user@x.x.x.x:port/file.bin: Running remote command: md5sum /path/file.bin
2023-05-08 17:10:14 DEBUG : sftp://user@x.x.x.x:port/file.bin: Remote command result: md5hashmd5hashmd5hashmd5hash /path/file.bin
2023-05-08 17:10:14 DEBUG : file.bin: Parsed hash: md5hashmd5hashmd5hashmd5hash
2023-05-08 17:10:14 DEBUG : file.bin: md5 = md5hashmd5hashmd5hashmd5hash OK
2023-05-08 17:10:14 INFO : file.bin: Multi-thread Copied (new)
Transferred: 921 MiB / 921 MiB, 100%, 23.571 MiB/s, ETA 0s
Transferred: 1 / 1, 100%
Elapsed time: 44.3s
<6>INFO : Transferred: 921 MiB / 921 MiB, 100%, 23.571 MiB/s, ETA 0s
Transferred: 1 / 1, 100%
Elapsed time: 44.3s

So almost exactly the same result as the above 8 tests that used the mount: 44 seconds. This eliminates the mount and various configuration options as the cause.
asdffdsa (jojothehumanmonkey) May 8, 2023, 9:36pm 4
You read my mind! lol, even crapgpt cannot do that.
This eliminates the mount and various configuration options as the cause.

yes, it does, the simpler the test the better.

--- up above the testing was done with testfile.bin and now it is file.bin, are the two files the same size?
ioorclone May 8, 2023, 9:41pm 5
Here is the result of fetching the same file using the OS sftp implementation:

$ sftp -v -p remote:/path/file.bin /path
Fetching /path/file.bin to /path/file.bin
file.bin 100% 921MB 31.5MB/s 00:29

So in this case the OS sftp took 29s, versus rclone's sftp at 45s, meaning rclone takes about 155% of the OS sftp time, i.e. about 55% longer. This result corresponds with all results from the above tests. Since both of these tests are using sftp, this suggests that it isn't the sftp protocol itself that is slower than sshfs. What can account for this substantial decrease in performance?
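Putting a number on the gap between the two timings above (integer shell arithmetic truncates, which is close enough here):

```shell
#!/bin/sh
# Compare the two wall-clock times from the tests above:
# rclone copy at 45 s versus the OS sftp client at 29 s.
rclone_s=45
ossftp_s=29
echo "rclone takes $((rclone_s * 100 / ossftp_s))% of the OS sftp time"
echo "i.e. about $((rclone_s * 100 / ossftp_s - 100))% longer"
```

So "155%" describes the ratio of the two times; the slowdown itself is about 55%.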
ioorclone May 8, 2023, 9:54pm 6
--- up above the testing was done with testfile.bin and now it is file.bin, are the two files the same size?

It's the same file, yes. I just redacted the name differently with a find and replace.

--- maybe i am confused but 17:10:14 - 17:09:41 = 0:00:33?

Interesting! I did a straight copy of the debug output from the terminal. I only redacted the IP addresses, credentials etc. using a universal find and replace.

--- did you intend to force the md5 hash, which takes additional time?

I see the hash starts at 17:10:09 and ends at 17:10:14, taking 5 seconds. The hash ran because I have the md5sum command as part of the remote config.

--- the rclone copy, can you just post the full output, from the terminal including the exact command and without that other log stuff or post the full output of a rclone debug log file.

Here is the command:

--- just curious, did you try rclone serve sftp as the sftp server?

No, I haven't tried this. I am using the remote settings from the top of the post:

type = sftp
shell_type = unix

I will run another test and look at the times reported by the log versus the times reported by the progress output at the end. In the previous tests I was timing manually, pressing start when I copied the file and stop as soon as Dolphin told me the file copy had completed.
asdffdsa (jojothehumanmonkey) May 8, 2023, 10:02pm 7
Here is the command:

just copy/paste the full rclone debug output, not snippets. the full debug output and the exact command.
I see the hash starts at 17:10:09 and ends at 17:10:14 taking 5 seconds. The hash ran because I have the md5sum command as part of the remote.

might be best to disable checksums unless whatever you are comparing against also calculates the hash.

17:10:14 - 17:09:41 = 0:00:33
ioorclone May 8, 2023, 10:07pm 8
might be best to disable checksums unless whatever you are comparing against also calculates the hash.

Yes, I am already on it, because the OS's sftp does not do an after-transfer checksum.
just copy/paste the full rclone debug output, not snippets.

Yes, this is what I did. I started at the top of the terminal with the mouse and copied the entire operation to Kate, then I used a find and replace function to redact three pieces of information: IP addresses, file paths and login credentials. I will run another test without the checksum and post the full results, including the command, to ensure we are comparing apples to apples.
ioorclone May 8, 2023, 10:09pm 9
17:10:14 - 17:09:41 = 0:00:33

I see it, but I cannot account for the discrepancy between the file transfer times in the log and the time at which rclone reports the transfer is complete.
ncw (Nick Craig-Wood) May 9, 2023, 7:57am 10
I am trying to use rclone sftp backend on a local network to transfer files. Rclone's sftp implementation is substantially slower than mounting a simple sshfs or using the filesystem's sftp.

I think this is fixed in v1.62.2 - I suggest you give that a go. See github.com/rclone/rclone
sftp: fix uploads being 65% slower than they should be with crypt
committed 03:47PM - 14 Feb 23 UTC
+4 -4
The block size for crypt is 64k + a few bytes. The default block size for sftp is 32k. This means that the blocks for crypt get split over 3 sftp packets: two of 32k and one of a few bytes. However due to a bug in pkg/sftp it was sending 32k instead of just a few bytes, leading to the 65% slowdown. This was fixed in the upstream library. This bug probably affected transfers from over-the-network sources also. Fixes #6763 See: https://github.com/pkg/sftp/pull/537
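The packet split described in the commit message can be sketched numerically. Assuming a crypt block of 64 KiB plus a small per-block overhead (the 48-byte figure below is an assumption for illustration, not crypt's exact overhead) and a 32 KiB sftp block size:

```shell
#!/bin/sh
# Show how one crypt block straddles three sftp packets.
crypt_block=$((64 * 1024 + 48))  # 64 KiB + assumed 48-byte overhead
sftp_block=$((32 * 1024))
full=$((crypt_block / sftp_block))
rest=$((crypt_block % sftp_block))
echo "$full packets of $sftp_block bytes + 1 packet of $rest bytes"
```

That matches the "two of 32k and one of a few bytes" in the commit message; the pre-fix bug sent a full 32k packet where only the few trailing bytes were needed.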
ioorclone May 9, 2023, 6:06pm 11
I think this is fixed in v1.62.2 - I suggest you give that a go.

Hi @ncw, thanks for the reply. The issue you linked mentions uploads and crypted remotes; would it affect remotes that are not wrapped in a crypt? In this case I am transferring (downloading) files from an sftp remote only.

This brings up a question, however: I assume that an rclone sftp transfer would already be encrypted over the ssh tunnel which is configured in the rclone remote config file. In this case would using a crypt wrapper be redundant?

I have downloaded 1.62.2 as mentioned and it did not improve the performance; same results as the test conducted with the previous version. Since my last post I have set up and run many tests and have a few conceptual questions. Here is what I found so far.
rclone's sftp performance (doing an rclone copy from an sftp remote, no crypt) is close to the OS sftp performance. Although a few seconds slower, it is roughly 30 seconds to transfer a 1 GB file in this environment. So the major issue is not the sftp implementation. Thanks to asdffdsa for helping to peel away the layers and clarify this!
The checksum at the end of the operation does add 3-5 seconds to the operation. It also seems to affect the accuracy of the --progress indicator. I assume this hash is an expensive operation, so this can be accounted for.

Question 1: According to a user in this thread about sftp and checksums:

"With the SFTP, running over an encrypted SSH session, there's negligible chance the file contents could get corrupted while transferring. The SSH itself does data integrity verification."

but then he goes on to say:

"While there's an extension to the SFTP protocol to calculate a remote file checksum, it's not widely supported. Particularly it's not supported by the most widespread SFTP server implementation, the OpenSSH."

It appears I am using OpenSSH, so I am likely not using the mentioned sftp file integrity checks.

Question 2

Question 3: So the question: is it expected that using an sftp remote with rclone mount will add an additional 13-15 seconds of overhead (making the copy take roughly 1.5x as long) compared to a simple copy operation, in this case a single 1 GB file across a LAN? If so, what could account for this? Is it the additional overhead of transferring the file via sftp and then reading it again with the chunked reader? I noticed there were 1729 instances of 'file.bin: waiting for in-sequence read to xxxxxxxx for 20ms' in the debug log. 1729 x 20 ms would be about 34.6 seconds if every wait hit the full timeout, so even if most waits return sooner this could account for a meaningful portion of the difference. I have tried many different settings and flags but nothing makes a substantial difference. Mounting the same sshfs drive does not incur the same penalty, even though both are using FUSE.

Summary
Since a mounted sshfs is a similar solution to an rclone mount, unless there is a way to increase the performance of the rclone mount it makes sense to use the sshfs solution. I would strongly prefer to use the rclone solution if there is any chance it can be further tuned to be as performant as an sshfs mount.
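The worst-case cost of those in-sequence waits can be bounded directly. Each wait may return well before the 20 ms timeout, so the real figure lies somewhere below this upper bound:

```shell
#!/bin/sh
# Upper bound on time lost to the 1729 in-sequence waits in the log.
waits=1729
timeout_ms=20
total_ms=$((waits * timeout_ms))
echo "worst case: $total_ms ms (~$((total_ms / 1000)) s)"
```

At roughly 34.6 seconds in the worst case, the waits are large enough to matter, which is why the wait behavior (and the --async-read flag discussed below in the thread) is worth testing.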
ncw (Nick Craig-Wood) May 10, 2023, 11:09am 12
The issue you linked mentions uploads and crypted remotes, would it affect remotes that are not wrapped in a crypt? In this case I am transferring (downloading) files from a sftp remote only. It affects remotes which send chunks that are shorter than the SFTP block size. So it could do but probably not.
This brings up a question however, I assume that an rclone sftp transfer would already be encrypted over the ssh tunnel which is configured in the rclone remote config file. In this case would using a crypt wrapper be redundant? The data is encrypted in transit by ssh. crypt will ensure your data is encrypted at rest on the remote server which may/may not be what you want.
I have downloaded 1.62.2 as mentioned and it did not improve the performance, same results as the test conducted with the previous version.
rclone's sftp performance (doing an rclone copy from a sftp remote, no crypt) is close to the OS sftp performance. Although a few second slower it is roughly 30 seconds to transfer a 1gb file in this environment. So the major issue is not sftp implementation. Great!
The checksum at the end of the operation does add 3-5 seconds to the operation. It also seems to affect the accuracy of the --progress indicator. I assume this hash is an expensive operation so this can be accounted for. Yes the checksum does take some time.
Is it safe to turn off the checksum and realize the improved performance when using an sftp remote?

It is an end to end verification. The file is unlikely to get corrupted when transferring, but it could get corrupted in a multitude of other places (eg bit flip in RAM, disk drive corrupts it, bug in the transfer code, etc, etc). It depends on your tolerance for corrupted files. Rclone always defaults to maximum protection for your data!
While there's an extension to the SFTP protocol to calculate a remote file checksum, it's not widely supported. Particularly it's not supported by the most widespread SFTP server implementation, the OpenSSH. It appears I am using OpenSSH so likely not using the mentioned sftp file integrity checks.

I've not seen that extension before. Looks very useful. I think you'd find it was no faster than shelling out to md5sum, which is what rclone does, though.
I see that rclone sftp is splitting the file into 3 parts and using a "multi-thread copy". Is this configurable? I tried adding --transfers=10 but it did not change it. Does the multi-thread copy increase the chances of corruption, thus necessitating the end checksum that would otherwise not be needed with a straight sftp transfer?

--multi-thread-cutoff SizeSuffix   Use multi-thread downloads for files above this size (default 250Mi)
--multi-thread-streams int         Max number of streams to use for multi-thread downloads (default 4)
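For what it's worth, the 3 streams seen in the copy log are consistent with a simple heuristic of file size divided by cutoff, capped at --multi-thread-streams. This formula is my assumption about the behavior for illustration, not taken from rclone's source:

```shell
#!/bin/sh
# Estimate the stream count for a 921 MiB file with the default
# flags quoted above (250Mi cutoff, max 4 streams).
size_mib=921
cutoff_mib=250
max_streams=4
streams=$((size_mib / cutoff_mib))
if [ "$streams" -gt "$max_streams" ]; then streams=$max_streams; fi
echo "streams: $streams"
```

That yields 3 streams, matching the "Starting multi-thread copy with 3 parts of size 307Mi" line in the log.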
It appears from the results of all the testing that the mount is the root cause of the increased transfer time. If I mount the same sftp remote and do a cp operation from the mounted remote to the local disk (rather than an rclone copy), the same file will transfer in roughly 45 seconds (versus 31 seconds). When I look at the debug log, the chunked reader is involved now. So the question: is it expected that using an sftp remote with rclone mount will add an additional 13-15 seconds of overhead to a simple copy operation, in this case a single 1 GB file across a LAN? If so, what could account for this? Is it the additional overhead of transferring the file via sftp and then reading it again with the chunked reader?

The overhead here is the fact that your data has to loop in and out of the kernel. I think it is unlikely to be the chunked reader.
I noticed there were 1729 instances of 'file.bin: waiting for in-sequence read to xxxxxxxx for 20ms' in the debug log.

That's a consequence of the kernel sending multiple read requests to rclone and them arriving out of order. With --vfs-cache-mode off (the default) the requests can only be serviced in order, so rclone waits for up to 20ms for the in-sequence read to arrive. The fact that the messages arrive out of order is very annoying and I tried to convince the author of the upstream library that it was a bug, with no luck! You can try disabling this with --async-read=false which might help. Though it will slow other stuff down! Note that if the read arrives sooner it won't wait the full 20ms, and if it does wait 20ms then it will seek the file. This really does slow down the performance a lot. Are you seeing logs which say "failed to wait for in-sequence" or "aborting in-sequence"?
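One way to test this suggestion would be to mount with --async-read=false plus a debug log file, then count the in-sequence wait messages before and after. The mount point and log path below are assumptions, so the sketch prints the command rather than executing it:

```shell
#!/bin/sh
# Print the candidate test invocation; run it by hand against a real remote.
cmd="rclone mount remote: /local/mount --async-read=false -vv --log-file=/tmp/rclone.log"
echo "$cmd"
# After repeating the test copy, count the waits:
#   grep -c 'waiting for in-sequence read' /tmp/rclone.log
```

Comparing the grep counts (and the wall-clock copy time) across the two mounts would show whether the out-of-order reads are the dominant cost.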
system (system) Closed June 9, 2023, 11:09am 13
This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.

