#2235 closed defect (fixed)

Error from 'tahoe cp' on Windows, due to a long path (IOError: Errno2 - no such file or dir.)

Reported by: CyberAxe Owned by:
Priority: normal Milestone: 1.10.1
Component: code-frontend-cli Version: 1.10.0
Keywords: tahoe-cp error windows win32 LeastAuthority.com review-needed Cc:
Launchpad Bug:

Description

6449/61880 files, 169/2003 directories
6450/61880 files, 169/2003 directories
Traceback (most recent call last):
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\runner.py", line 156, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\runner.py", line 141, in runner
    rc = cli.dispatch[command](so)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\cli.py", line 551, in cp
    rc = tahoe_cp.copy(options)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 770, in copy
    return Copier().do_copy(options)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 451, in do_copy

    status = self.try_copy()
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 512, in try_cop
y
    return self.copy_to_directory(sources, target)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 672, in copy_to
_directory
    self.copy_files_to_target(self.targetmap[target], target)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 703, in copy_fi
les_to_target
    self.copy_file_into(source, name, target)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 748, in copy_fi
le_into
    target.put_file(name, f)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\scripts\tahoe_cp.py", line 158, in put_fil
e
    fileutil.put_file(pathname, inf)
  File "c:\allmydata-tahoe-1.10.0\src\allmydata\util\fileutil.py", line 274, in put_file
    outf = open(os.path.expanduser(pathname), "wb")
IOError: [Errno 2] No such file or directory: u'c:\\IveBeenRestored\\Marketing\\Photos\\As
sets\\042056-abstract-red-and-gold-paint-splatter-icon-sports-hobbies-camping\\042056-abst
ract-red-and-gold-paint-splatter-icon-sports-hobbies-camping\\042056-abstract-red-and-gold
-paint-splatter-icon-sports-hobbies-camping.png'

C:\allmydata-tahoe-1.10.0\bin>python tahoe cp -r -v tahoe:BackedUp/Latest c:\IveBeenRestor
ed\ >> c:\tahoe-restore.log

Change History (38)

comment:1 Changed at 2014-04-29T16:53:16Z by daira

This may be a result of the length of the path name; Windows APIs do not normally accept paths longer than 259 Unicode characters, unless the "\\?\" prefix is used. The path in question has 262 characters.

Most Windows software doesn't use that prefix and just breaks on long paths. I have never understood the motivation for the design that supports longer paths but requires you to jump through an obscure hoop to use them.

Last edited at 2014-04-29T16:58:14Z by daira (previous) (diff)

comment:2 Changed at 2014-04-29T16:56:12Z by daira

  • Component changed from unknown to code-frontend-cli
  • Keywords tahoe-cp error windows win32 added; Restore removed
  • Summary changed from Error while restoring data - IOError: Errno2 - no such file or dir. to Error from 'tahoe cp' on Windows, possibly due to a long path (IOError: Errno2 - no such file or dir.)

comment:3 follow-up: Changed at 2014-04-30T00:15:23Z by daira

http://bugs.python.org/issue18199. This will not be fixed in Python 2.x, so the only way for us to work around it is either to:

  1. chdir to each subdirectory in turn using relative paths (this changes the limitation to 255 characters per directory component);
  2. on Windows, call the Win32 API functions with the "\\?\" prefix directly via ctypes or similar.
  1. seems like too much work.
Last edited at 2014-04-30T00:26:16Z by daira (previous) (diff)

comment:4 in reply to: ↑ 3 ; follow-up: Changed at 2014-04-30T00:37:08Z by zooko

Replying to daira:

http://bugs.python.org/issue18199. This will not be fixed in Python 2.x, so the only way for us to work around it is either to:

  1. chdir to each subdirectory in turn using relative paths (this changes the limitation to 255 characters per directory component);
  2. on Windows, call the Win32 API functions with the "\\?\" prefix directly via ctypes or similar.

Why can't we just prepend "\\?\" to the string before we pass it to open() on line 274 of fileutil.py?

comment:5 in reply to: ↑ 4 ; follow-up: Changed at 2014-04-30T00:45:21Z by daira

Replying to zooko:

Why can't we just prepend "\\?\" to the string before we pass it to open() on line 274 of fileutil.py?

It's not quite as simple as that:

  • Paths prefixed with "\\?\" must be absolute, but CLI commands accept relative paths. Also, they don't canonicalize in the same way, e.g. ".." and "." are not supported.
  • "\\?\" is only supported by the Unicode Win32 file APIs. I think Python does use those APIs (indirectly, via msvcrt) when given a unicode string, though.
  • For consistency you'd need to change many other places in Tahoe-LAFS that access files, even to just support long paths in CLI commands.
Last edited at 2014-04-30T00:51:44Z by daira (previous) (diff)

comment:7 in reply to: ↑ 5 ; follow-up: Changed at 2014-04-30T01:10:23Z by daira

Replying to daira:

  • Paths prefixed with "\\?\" must be absolute, but CLI commands accept relative paths. Also, they don't canonicalize in the same way, e.g. ".." and "." are not supported.

Given this constraint, the best place to prepend "\\?\" would probably be in fileutil.abspath_expanduser_unicode.

comment:8 in reply to: ↑ 7 Changed at 2014-05-01T15:54:27Z by zooko

Replying to daira:

Replying to daira:

  • Paths prefixed with "\\?\" must be absolute, but CLI commands accept relative paths. Also, they don't canonicalize in the same way, e.g. ".." and "." are not supported.

Given this constraint, the best place to prepend "\\?\" would probably be in fileutil.abspath_expanduser_unicode.

Okay, I don't see any reason not to prepend "\\?\" to all paths when reading files.

We should probably think about it a bit more when creating files i.e. during a tahoe get, because if we do this then the resulting files wouldn't be accessible via tools that don't support long paths, such as the Explorer or cmd.exe, for example, if I understand correctly.

comment:9 Changed at 2014-05-03T23:53:02Z by daira

Prepending "\\?\" in fileutil.abspath_expanduser_unicode would also have the effect of making all file accesses underneath a node base directory use "long" paths on Windows. I don't know whether or not that is what we want.

Last edited at 2014-05-12T22:39:24Z by daira (previous) (diff)

comment:10 Changed at 2014-05-03T23:54:17Z by daira

I created a branch https://github.com/daira/tahoe-lafs/commit/2235-long-paths-on-windows. It results in the following failures of existing tests, some of which are obviously shallow. The others I haven't had time to investigate.

[FAIL]
Traceback (most recent call last):
  File "c:\tahoe\git\trunk\src\allmydata\test\test_cli.py", line 389, in test_catalog_shares_error
    "didn't see 'error processing' in '%s'" % err)
twisted.trial.unittest.FailTest: didn't see 'error processing' in ''

allmydata.test.test_cli.CLI.test_catalog_shares_error
===============================================================================
[FAIL]
Traceback (most recent call last):
  File "c:\tahoe\git\trunk\src\allmydata\test\test_cli.py", line 3826, in test_basedir
    ".tahoe"))
  File "C:\Python27\lib\site-packages\twisted-12.2.0-py2.7-win32.egg\twisted\trial\unittest.py", lin
e 271, in assertEqual
    % (msg, pformat(first), pformat(second)))
twisted.trial.unittest.FailTest: not equal:
a = u'\\\\?\\C:\\Documents and Settings\\David-Sarah\\.tahoe'
b = 'C:\\Documents and Settings\\David-Sarah\\.tahoe'


allmydata.test.test_cli.Options.test_basedir
===============================================================================
[FAIL]
Traceback (most recent call last):
  File "C:\tahoe\git\trunk\support\Lib\site-packages\mock-1.0.1-py2.7.egg\mock.py", line 1201, in pa
tched
    return func(*args, **keywargs)
  File "c:\tahoe\git\trunk\src\allmydata\test\test_client.py", line 48, in test_error_on_old_config_
files
    self.failUnlessIn(os.path.abspath(os.path.join(basedir, "introducer.furl")), e.args[0])
twisted.trial.unittest.FailTest: 'C:\\tahoe\\git\\trunk\\_trial_temp\\test_client.Basic.test_error_o
n_old_config_files\\introducer.furl' not in set([u'\\\\?\\C:\\tahoe\\git\\trunk\\_trial_temp\\test_c
lient.Basic.test_error_on_old_config_files\\readonly_storage', u'\\\\?\\C:\\tahoe\\git\\trunk\\_tria
l_temp\\test_client.Basic.test_error_on_old_config_files\\introducer.furl', u'\\\\?\\C:\\tahoe\\git\
\trunk\\_trial_temp\\test_client.Basic.test_error_on_old_config_files\\no_storage', u'\\\\?\\C:\\tah
oe\\git\\trunk\\_trial_temp\\test_client.Basic.test_error_on_old_config_files\\debug_discard_storage
'])

allmydata.test.test_client.Basic.test_error_on_old_config_files
===============================================================================
[FAIL]
Traceback (most recent call last):
  File "c:\tahoe\git\trunk\src\allmydata\test\test_system.py", line 1374, in _test_runner
    self.failUnlessEqual(len(sharefiles), 10)
  File "C:\Python27\lib\site-packages\twisted-12.2.0-py2.7-win32.egg\twisted\trial\unittest.py", lin
e 271, in assertEqual
    % (msg, pformat(first), pformat(second)))
twisted.trial.unittest.FailTest: not equal:
a = 0
b = 10


allmydata.test.test_system.SystemTest.test_filesystem
===============================================================================
[FAIL]
Traceback (most recent call last):
  File "c:\tahoe\git\trunk\src\allmydata\test\test_util.py", line 491, in test_abspath_expanduser_un
icode
    self.failUnlessEqual(abspath_cwd, saved_cwd)
  File "C:\Python27\lib\site-packages\twisted-12.2.0-py2.7-win32.egg\twisted\trial\unittest.py", lin
e 271, in assertEqual
    % (msg, pformat(first), pformat(second)))
twisted.trial.unittest.FailTest: not equal:
a = u'\\\\?\\C:\\tahoe\\git\\trunk\\_trial_temp'
b = u'C:\\tahoe\\git\\trunk\\_trial_temp'


allmydata.test.test_util.FileUtil.test_abspath_expanduser_unicode
-------------------------------------------------------------------------------
Ran 1137 tests in 878.831s

FAILED (skips=15, expectedFailures=3, failures=5, successes=1114)

comment:11 Changed at 2014-05-04T23:03:55Z by daira

I fixed the test failures apart from the ones in allmydata.test.test_cli.CLI.test_catalog_shares_error and allmydata.test.test_system.SystemTest.test_filesystem.

comment:12 Changed at 2014-05-12T22:35:28Z by daira

  • Summary changed from Error from 'tahoe cp' on Windows, possibly due to a long path (IOError: Errno2 - no such file or dir.) to Error from 'tahoe cp' on Windows, due to a long path (IOError: Errno2 - no such file or dir.)

comment:13 Changed at 2014-05-12T22:35:36Z by daira

  • Status changed from new to assigned

comment:14 Changed at 2014-05-19T15:33:32Z by Zancas

  • Owner changed from daira to Zancas
  • Status changed from assigned to new

comment:15 follow-up: Changed at 2014-05-20T15:18:51Z by zooko

The next step on this should be to write a test, perhaps in trunk/src/allmydata/test/test_util.py, that attempts to create a file under a long filename in the (real, not faked) local filesystem, and if the file is successfully created then the test passes. That test should be red on current trunk when run on Windows, and go green when daira's patch or another fixit-patch is applied.

I don't understand this not path.startswith(u"\\\\") clause in Daira's patch, and I'd like to see a second test added to the one described above, which verifies that whatever behavior is intended by that clause is what actually happens.

Other than those two additional tests, just getting all of the current tests to pass with the patch applied will be sufficient! ☺

comment:16 Changed at 2014-05-20T15:19:52Z by zooko

P.S. If you don't know why that not path.startswith(u"\\\\") clause is in there, and Daira isn't around to explain it, then try removing that clause. If all the tests still pass, then great — you've removed unnecessary code! If tests start failing then examining the failing tests will probably show you why that clause is there.

comment:17 Changed at 2014-05-20T16:41:18Z by daira

Because prepending \\?\ is not correct for a UNC path ( http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx#maxpath ):

The "\\?\" prefix can also be used with paths constructed according to the universal naming convention (UNC). To specify such a path using UNC, use the "\\?\UNC\" prefix. For example, "\\?\UNC\server\share", where "server" is the name of the computer and "share" is the name of the shared folder. These prefixes are not used as part of the path itself. They indicate that the path should be passed to the system with minimal modification, which means that you cannot use forward slashes to represent path separators, or a period to represent the current directory, or double dots to represent the parent directory. Because you cannot use the "\\?\" prefix with a relative path, relative paths are always limited to a total of MAX_PATH characters.

It is also not correct for a path that already starts with \\?\ (or, more obscurely, \\.\).

I could have written code like this to cover all cases:

if sys.platform == "win32":
    path = to_windows_long_path(path)
    [...]

def to_windows_long_path(path):
    if path.startswith(u"\\\\?\\") or path.startswith(u"\\\\.\\"):
        return path
    elif path.startswith(u"\\\\"):
        return u"\\\\?\\UNC\\" + path[2 :]
    else:
        return u"\\\\?\\" + path

but I don't actually care about long UNC paths; I just wanted to avoid doing something obviously incorrect for them.

Last edited at 2014-05-20T16:46:15Z by daira (previous) (diff)

comment:20 Changed at 2014-05-20T17:39:54Z by daira

  • Keywords test-needed added

comment:21 follow-up: Changed at 2014-05-22T05:24:52Z by CyberAxe

I'm not sure this hasn't been covered but this error didn't show up during backup, but only has shown up during recovery with IIRC -CP -R shown above.

I'll start testing with a single file name with 300 char long.

comment:22 in reply to: ↑ 21 Changed at 2014-05-26T20:59:32Z by zooko

Replying to CyberAxe:

I'm not sure this hasn't been covered but this error didn't show up during backup, but only has shown up during recovery with IIRC -CP -R shown above.

I'll start testing with a single file name with 300 char long.

CyberAxe: have you tested this yet? What did you learn?

Last edited at 2014-05-26T21:13:49Z by zooko (previous) (diff)

comment:23 in reply to: ↑ 15 Changed at 2014-05-28T21:40:45Z by daira

Replying to zooko:

The next step on this should be to write a test, perhaps in trunk/src/allmydata/test/test_util.py, that attempts to create a file under a long filename in the (real, not faked) local filesystem, and if the file is successfully created then the test passes. That test should be red on current trunk when run on Windows, and go green when daira's patch or another fixit-patch is applied.

I agree that we need such a test.

I don't understand this not path.startswith(u"\\\\") clause in Daira's patch, and I'd like to see a second test added to the one described above, which verifies that whatever behavior is intended by that clause is what actually happens.

This has been done.

comment:24 Changed at 2014-08-05T01:59:49Z by daira

The test failures on Windows have been fixed. The problem was that when using the \\?\ prefix, Windows is pickier about path syntax and does not accept / as a path component separator.

comment:25 Changed at 2014-08-14T20:42:11Z by daira

  • Milestone changed from undecided to 1.11.0

comment:26 Changed at 2014-08-20T16:17:28Z by nejucomo

  • Keywords LeastAuthority.com added

comment:27 follow-up: Changed at 2014-08-21T02:06:39Z by daira

  • Keywords review-needed added; test-needed removed

Pull request https://github.com/tahoe-lafs/tahoe-lafs/pull/100 has a test and is now ready for review.

comment:28 in reply to: ↑ 27 Changed at 2014-08-21T08:50:56Z by daira

Replying to daira:

Pull request https://github.com/tahoe-lafs/tahoe-lafs/pull/100 has a test and is now ready for review.

Please hold off merging this; there are instances of os.path.abspath and os.path.expanduser that don't go through fileutil.abspath_expanduser_unicode and should be changed to do so. That includes the call in fileutil.put_file that motivated this ticket, so the pull request doesn't actually fix the bug. See the results of egrep -Rn --include='*.py' 'expanduser|abspath' src |egrep -v 'abspath_expanduser_unicode|argv_to_abspath' for other cases.

Last edited at 2014-08-22T07:00:39Z by daira (previous) (diff)

comment:29 Changed at 2014-08-21T08:56:28Z by daira

Omitting some spurious hits:

src/allmydata/introducer/server.py:67:        staticdir = os.path.expanduser(staticdir)
src/allmydata/stats.py:261:                       quote_output(os.path.abspath(self.picklefile)))
src/allmydata/util/fileutil.py:274:    outf = open(os.path.expanduser(pathname), "wb")
src/allmydata/util/fileutil.py:299:    path = os.path.expanduser(path)
src/allmydata/manhole.py:66:        self.authorized_keys_file = os.path.expanduser(authorized_keys_file)
src/allmydata/manhole.py:253:        # TODO: expanduser this, and make it relative to the buildmaster's
src/allmydata/frontends/auth.py:27:        for line in open(os.path.expanduser(accountfile), "r"):
src/allmydata/frontends/drop_upload.py:22:            local_dir_u = os.path.expanduser(local_dir_utf8.decode('utf-8'))
src/allmydata/scripts/tahoe_get.py:29:            outf = open(os.path.expanduser(to_file), "wb")
src/allmydata/scripts/tahoe_put.py:76:        infileobj = open(os.path.expanduser(from_file), "rb")
src/allmydata/scripts/tahoe_cp.py:71:        return open(os.path.expanduser(self.pathname), "rb")
src/allmydata/client.py:454:        staticdir = os.path.expanduser(staticdir)
src/allmydata/test/test_mutable.py:3113:            fso.nodedirs = [unicode(os.path.dirname(os.path.abspath(storedir)))
src/allmydata/test/test_encodingutil.py:277:        expanded = os.path.expanduser("~/" + lumiere_nfc)
src/allmydata/test/check_grid.py:67:        self.nodedir = os.path.expanduser(nodedir)
src/allmydata/test/check_grid.py:68:        self.tahoe = os.path.abspath(os.path.expanduser(tahoe))
Last edited at 2014-08-21T09:36:22Z by daira (previous) (diff)

comment:30 Changed at 2014-08-21T09:30:55Z by daira

Making all user-expansions go through fileutil.abspath_expanduser_unicode will make it easier to fix #1674.

comment:31 Changed at 2014-09-02T18:08:43Z by daira

  • Owner changed from Zancas to daira
  • Status changed from new to assigned

comment:33 Changed at 2015-01-20T17:38:08Z by daira

I need to rebase this and check that it is complete and passes tests.

comment:34 Changed at 2015-01-22T17:26:02Z by zooko

  • Keywords review-needed removed
  • Status changed from assigned to new

removing review-needed keyword

The next step is for Daira to do the changes she described in comment:28.

comment:35 Changed at 2015-01-29T19:32:27Z by daira

  • Keywords review-needed added

Done. Review needed for https://github.com/tahoe-lafs/tahoe-lafs/pull/138 (which also fixes #1674 and #2027).

comment:36 Changed at 2015-01-29T19:45:47Z by daira

  • Owner daira deleted

comment:37 Changed at 2015-02-03T18:59:14Z by zooko

Reviewed all the patches on https://github.com/tahoe-lafs/tahoe-lafs/pull/138 and posted comments.

comment:38 Changed at 2015-02-05T02:09:07Z by daira

  • Resolution set to fixed
  • Status changed from new to closed

Fixed on the pr138 branch which was merged in [3afe827ad0cbdb41b2928d17c8bfbbaf5102acc8/trunk].

(Woohoo! This set of bugs has been irking me for some time.)

Note: See TracTickets for help on using tickets.