Opened at 2016-03-10T06:09:58Z
Closed at 2019-07-25T13:12:18Z
#2739 closed defect (wontfix)
windows vs len("pycryptopp-0.7.1. 869544967005693312591928092448767568728501330214")
Reported by: | warner | Owned by: | daira |
---|---|---|---|
Priority: | normal | Milestone: | undecided |
Component: | packaging | Version: | 1.10.2 |
Keywords: | pycryptopp packaging windows | Cc: | |
Launchpad Bug: |
Description
(moved from #1582, specifically comment:28:ticket:1582 and comment:38:ticket:1582)
Windows machines have problems building pycryptopp, because the length of the version string (65 chars when turned into a directory, including the package name, connecting hyphen, and version) causes pathnames to exceed the maximum allowable on windows. Zooko points out that this should be considered a bug in setuptools, since building *any* package would fail if the parent directory name is sufficiently long.
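The 65-character figure is easy to check. A quick sketch, using the package name and version string from the ticket title (without the space that was later added to the title):

```python
# Package name and version string as they appear in the ticket title.
PKGNAME = "pycryptopp"
VERSION = "0.7.1.869544967005693312591928092448767568728501330214"

# Directory name used when the tarball is unpacked:
# package name, connecting hyphen, version.
dirname = PKGNAME + "-" + VERSION

print(len(PKGNAME), len(VERSION), len(dirname))
```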
I spent the day getting a windows VM set up, and found that the situation is indeed dire when building from a directory with a long name. I used my spake2 package as a test case, because it is pure-python and has a very short package+version name.
The base directory is C:\Users\$USERNAME\$AAAA, where "$USERNAME" is 18 characters long.
I started with "$AAAA" at 200 characters long, so the total pathname up to this point is 2+1+18+1+200= 221. Inside that is the "spake2-0.3" directory, bringing pwd to 232 chars. From a directory of that length, python cannot even import the versioneer.py that's sitting right next to setup.py (it can import setuptools and other packages that are in the much-shorter C:\Python27\Lib directory, but nothing that involves long sys.path entries). This is unrelated to setuptools, and if it is to be considered a bug, it would be assigned to python itself.
It is not possible to unpack the pycryptopp tarball into this directory. Moving one that was unpacked elsewhere works, but then even basic windows commands like dir fail, with an error like "The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters". (note that I'm using PowerShell on a Windows 2012 Server system).
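The two limits quoted in that error message can be turned into a small check. This is only a sketch: the helper names and the example layout (an 18-character user name, a parent directory of repeated "A" characters) are made up for illustration, and passing the check does not guarantee that a build succeeds, since tools may append further relative paths.

```python
import ntpath

# Limits quoted in the error message above; both are strict upper bounds.
MAX_FILE_PATH = 260
MAX_DIR_PATH = 248

def fits_windows_limits(path):
    # True if a fully qualified Windows path is within both limits.
    directory = ntpath.dirname(path)
    return len(path) < MAX_FILE_PATH and len(directory) < MAX_DIR_PATH

def example_path(parent_len):
    # Hypothetical layout mirroring the experiment described above.
    return ntpath.join("C:\\Users", "u" * 18, "A" * parent_len,
                       "spake2-0.3", "setup.py")

for n in (25, 145, 200, 250):
    print(n, len(example_path(n)), fits_windows_limits(example_path(n)))
```

ntpath is used instead of os.path so the sketch behaves the same on any platform.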
When we move to $AAAA=150 chars (for a total length of 182 chars), versioneer can be imported, but a setup.py bdist_wheel fails as it attempts to copy the egg-info data into place, with an error like the one seen in comment:27:ticket:1582 . The chain of responsibility is:
- setuptools.command.install_egg_info.copytree()
- setuptools.archive_util.unpack_archive()
- setuptools.archive_util.unpack_directory()
- shutil.copyfile()
shutil.copyfile() is handed a dst= argument of build\bdist.win32\wheel\.\spake2-0.3-py2.7.egg-info\dependency_links.txt, which is a relative pathname of len=72. It then attempts a regular open(dst, 'wb'), which fails with errno 2 (No such file or directory).
I'm not sure what setuptools could do differently here. Is there something other than open() that python code is supposed to use on windows? I tried making open use os.path.abspath(), thinking that a prefix of C:\ might trigger a differently-capable syscall, but that didn't seem to help.
In an AAAA=145 directory (total length 167 chars), spake2-0.3 builds a wheel with no problem.
Note that the package version string appears twice: once in the copyfile() argument, and a second time in the implicit CWD (where the source tarball was unpacked). So every extra character in the version string means two characters in the absolute filename.
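A sketch of that doubling effect, rebuilding the relative dst= path quoted above and joining it onto a CWD named after the package and version. The parent directory and the helper names are hypothetical.

```python
import ntpath

def egg_info_dst(pkg, version):
    # Relative destination handed to shutil.copyfile(), as quoted above.
    return ntpath.join("build", "bdist.win32", "wheel", ".",
                       "%s-%s-py2.7.egg-info" % (pkg, version),
                       "dependency_links.txt")

def absolute_len(parent, pkg, version):
    # The CWD is the unpacked source tree, named "<pkg>-<version>".
    cwd = ntpath.join(parent, "%s-%s" % (pkg, version))
    return len(ntpath.join(cwd, egg_info_dst(pkg, version)))

parent = "C:\\Users\\someuser\\src"  # hypothetical parent directory

print(len(egg_info_dst("spake2", "0.3")))
print(absolute_len(parent, "spake2", "0.3"))
print(absolute_len(parent, "spake2", "0.3x"))  # one extra version character
```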
As a workaround, I found that it *is* possible to build a pycryptopp wheel (using python setup.py bdist_wheel) if the parent directory is 2+1+18+1+25= 47 characters or less (the cutoff is somewhere between 47 and 52 characters).
Appveyor.com (the CI service I've been trying to get working) unpacks the source tree into C:\projects\tahoe-lafs. If we used them to build pycryptopp wheels from a git checkout, they'd get C:\projects\pycryptopp, which would be short enough to work. Technically, we really only need one machine to run setup.py bdist_wheel, then everybody else can install the pre-compiled wheels.
Until we have pre-compiled wheels available, windows devs will be building wheels as a side-effect of pip install, which will use some temp directory for the purpose. I don't yet know how the temp directory is created (how long it is, and whether it lives under the user's home directory, or somewhere unaffected by the length of their username like /tmp).
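One way to see where such a temp directory would land, assuming (not verified here) that pip derives its build directory from Python's tempfile module. The "pip-build-XXXXXX" component is a placeholder for whatever name pip actually generates.

```python
import os
import tempfile

# Directory Python's tempfile module would use for new temporary files.
# It is chosen from the TMPDIR, TEMP and TMP environment variables first,
# then from platform-specific defaults.
base = tempfile.gettempdir()

# Hypothetical build directory underneath it.
candidate = os.path.join(base, "pip-build-XXXXXX", "pycryptopp")

print(base)
print(len(candidate))
```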
So maybe publishing pre-compiled pycryptopp wheels (built on appveyor from git, not from a tarball) will be enough to allow windows devs of all username-lengths to build tahoe.
Change History (8)
comment:1 Changed at 2016-03-10T07:14:57Z by warner
comment:2 Changed at 2016-03-10T15:08:44Z by daira
I think we should simply release pycryptopp 0.7.2. The long version string is just not worth the hassle. Yes, Windows is broken, but it's not going to get unbroken any time soon (and neither is Python 2.x going to get a workaround to convert to and from the "\\?\" long path prefix in all Win32 API calls that use paths, which would be the only practical alternative).
comment:3 Changed at 2016-03-10T20:29:17Z by warner
As an experiment, I modified install_egg_info.py to replace:
unpack_archive(self.source, self.target, skimmer)
with:
unpack_archive(self.source, "\\\\?\\" + os.path.abspath(self.target), skimmer)
and I was able to pip install . spake2 (with the artificial dependency on pycryptopp) in $AAAA directories up to 180 chars long. At len($AAAA)=185, a different error occurred (pip.download.unpack_file_url calling shutil.copytree, copying the egg-info directory into a tempdir).
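The prefixing step from that experiment could be wrapped in a helper. This is a sketch only: to_extended_path is a made-up name, it uses ntpath so it runs on any platform, and it expects an absolute path instead of calling abspath() itself. The prefix turns off path normalization in the Win32 API, so "." and ".." components have to be resolved beforehand.

```python
import ntpath

LONG_PREFIX = "\\\\?\\"        # long-path prefix for drive-letter paths
UNC_PREFIX = "\\\\?\\UNC\\"    # long-path prefix for server/share paths

def to_extended_path(path):
    # Already prefixed: leave untouched.
    if path.startswith(LONG_PREFIX):
        return path
    if not ntpath.isabs(path):
        raise ValueError("expected an absolute Windows path: %r" % (path,))
    # Resolve "." and ".." and convert forward slashes to backslashes.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC path: drop the two leading backslashes, add the UNC form.
        return UNC_PREFIX + path[2:]
    return LONG_PREFIX + path

print(to_extended_path("C:\\work\\build\\.\\pkg.egg-info"))
```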
comment:4 Changed at 2016-03-11T18:19:06Z by warner
marlowe asked some windows admins about how long usernames could be. Their answer:
According to Microsoft's docs, logon names can be up to 104 characters. However, it isn't practical to use logon names that are longer than 64 characters.
comment:5 Changed at 2016-03-12T19:13:26Z by daira
comment:6 Changed at 2016-03-15T18:49:14Z by warner
- Summary changed from windows vs len("pycryptopp-0.7.1.869544967005693312591928092448767568728501330214") to windows vs len("pycryptopp-0.7.1. 869544967005693312591928092448767568728501330214")
I'm changing the title of this ticket (adding a space after "0.7.1."), because when I went to find this ticket during today's devchat (I loaded the Timeline page as usual, and searched for "pycryptopp"), the string "pycryptopp" was nowhere to be found.
I realized that, on the Timeline page, Trac truncates the ticket descriptions in that view, and the truncation routine must first split on whitespace or something (to avoid splitting a word in half?), because the Timeline page displayed this ticket as windows vs .... By adding a space, I think I can convince Trac to summarize this as windows vs len("pycryptopp-0.7.1. ... instead.
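A guess at what such a truncation routine might look like. This is a hypothetical sketch, not Trac's actual code, and the 75-character cutoff is an assumption. It cuts at the last space before the limit, which reproduces both summaries mentioned above.

```python
def shorten_summary(text, maxlen=75):
    # Hypothetical routine: cut at the last space before maxlen so that
    # no word is split, then append an ellipsis.
    if len(text) < maxlen:
        return text
    cut = text.rfind(" ", 0, maxlen)
    if cut < 0:
        cut = maxlen
    return text[:cut] + " ..."

VERSION_TAIL = "869544967005693312591928092448767568728501330214"
old_title = 'windows vs len("pycryptopp-0.7.1.' + VERSION_TAIL + '")'
new_title = 'windows vs len("pycryptopp-0.7.1. ' + VERSION_TAIL + '")'

print(shorten_summary(old_title))
print(shorten_summary(new_title))
```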
The recursive nature of this problem made me laugh. Thank you.
comment:7 Changed at 2016-10-12T13:24:49Z by daira
- Keywords pycryptopp packaging windows added
comment:8 Changed at 2019-07-25T13:12:18Z by exarkun
- Resolution set to wontfix
- Status changed from new to closed
ticket:3031 switched Tahoe-LAFS from pycryptopp to cryptography.
I did another experiment: modifying spake2 to depend on pycryptopp, then doing a pip install . from the spake2 directory underneath variously-lengthed $AAAA parent directories. This causes pip to build (and install) a pycryptopp wheel in the same way that pip install tahoe-lafs ought to do. I found that it worked at least as far as $AAAA=150, which means that the pycryptopp build is *not* happening inside the current working directory.
I eventually found the tempdir that is hosting the build: C:\Users\USERNAME\AppData\Local\Temp\2\pip-build-j4xbzr\pycryptopp. This directory does include the username, but does *not* include the CWD copy of the version string (it unpacks into pycryptopp rather than pycryptopp-0.7.1.869544967005693312591928092448767568728501330214, which must be pip or wheel controlling the target directory of TarFile.extractall()). So I think that means each byte of the version string only counts against the ultimate path-length limit once, not twice.
Since the original error (on Daira's box) was using a username of just "Daira", I think something else is going on here. I do notice that Daira's log in comment:27:ticket:1582 shows the pycryptopp version string getting included *twice*:
For some reason it's reaching the egg-info directory as a sibling (up-and-over) relative to the .data directory, both of which have pycryptopp's long version string. It didn't do that in my tests.
Daira: what version of setuptools and 'wheel' do you have on your box? I've got setuptools-20.2.2 and wheel-0.29.0 .
I see one potentially-relevant note in the wheel changelog: version 0.27.0 fixed issue 91 "Don’t attempt to (recursively) create a build directory ending with .. (invalid on all platforms, but code was only executed on Windows)". It looks like the patch involves adding an os.path.normpath() call, which might flatten out that extra directory, and might reduce the severity of this issue.
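To illustrate what a normpath() call does to an "up-and-over" path, here is a sketch with a made-up path shape (the real path from Daira's log is not reproduced here). Whether wheel's fix applies to exactly this path is still a guess.

```python
import ntpath

# Hypothetical path shape: the egg-info directory reached "up and over"
# from a sibling .data directory, so the name-version string appears twice.
name = "pycryptopp-0.7.1.869544967005693312591928092448767568728501330214"
raw = ntpath.join("wheel", name + ".data", "..", name + "-py2.7.egg-info")

# normpath() collapses "<dir>\.." pairs, leaving a single copy of the name.
flat = ntpath.normpath(raw)

print(flat)
print(len(raw) - len(flat))  # characters saved by flattening
```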
So maybe Daira is using wheel 0.26.0 or older, and it'd be sufficient to just upgrade that to 0.27.0 or newer. That'd give us an extra 65 characters of headroom (maybe 70, since the .data suffix is removed too).
If that helps, what would the new maximum version-string length be? Suppose we have $USERNAME, $PKGNAME, and $VERSION. The longest filename written into egg-info is dependency_links.txt. So I think we're talking: C:\Users\$USERNAME\AppData\Local\Temp\2\pip-build-XXXXXX\$PKGNAME\build\bdist.win-amd64\wheel\$PKGNAME-$VERSION-py2.7.egg-info\dependency_links.txt. That's a fully-qualified file name of 114 + len($USERNAME) + 2*len($PKGNAME) + len($VERSION) bytes (which must be less than 260 characters). The directory name is 93+len($USERNAME)+2*len($PKGNAME)+len($VERSION), and must be less than 248 characters, so those are nearly equal constraints.
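The constants 114 and 93 can be re-derived by building the path from the template above and subtracting the variable parts. A sketch, with placeholder values for the user name, package name, and version; the helper names are made up.

```python
import ntpath

def egg_info_file(username, pkgname, version):
    # Path template taken from the paragraph above; "XXXXXX" stands for
    # the random part of pip's build directory name.
    return ntpath.join(
        "C:\\Users", username, "AppData", "Local", "Temp", "2",
        "pip-build-XXXXXX", pkgname, "build", "bdist.win-amd64", "wheel",
        "%s-%s-py2.7.egg-info" % (pkgname, version),
        "dependency_links.txt")

def fixed_overhead(path_len, username, pkgname, version):
    # Whatever is left after removing the variable-length parts.
    return path_len - len(username) - 2 * len(pkgname) - len(version)

user, pkg, ver = "u" * 10, "p" * 10, "1" * 20   # placeholder values
path = egg_info_file(user, pkg, ver)
directory = ntpath.dirname(path)

print(fixed_overhead(len(path), user, pkg, ver))       # file name overhead
print(fixed_overhead(len(directory), user, pkg, ver))  # directory overhead
print(len(path) < 260 and len(directory) < 248)
```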
So for a 10-char $USERNAME and 10-char $PKGNAME, we can tolerate a $VERSION of up to 116 characters. Zooko's SHA1-as-decimal string in pycryptopp is 48 characters (49 max). So we should be able to build the current pycryptopp on systems with up to 78-character usernames. If git moves to SHA256 and Zooko keeps the same versioning scheme (SHA256-as-decimal would be 78 characters), then we can tolerate usernames of up to 48 characters.
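The digit counts and the username budget can be sketched as follows. Like the paragraph above, this counts only the decimal hash digits as the version length. It also reads "less than 260" strictly, so the results come out slightly below the rounded figures quoted above. The helper names are made up.

```python
def decimal_digits(bits):
    # Decimal digits in the largest value a hash of this many bits can take.
    return len(str(2 ** bits - 1))

def max_username_len(pkgname_len, version_len, overhead=114, limit=260):
    # Largest username length that keeps the file name strictly under
    # the limit, using the 114-character overhead from the text.
    return limit - 1 - overhead - 2 * pkgname_len - version_len

sha1_digits = decimal_digits(160)
sha256_digits = decimal_digits(256)

print(sha1_digits, sha256_digits)
print(max_username_len(10, sha1_digits), max_username_len(10, sha256_digits))
```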