Changes between Version 1 and Version 2 of Ticket #2787, comment 1

Timestamp: 2018-05-23T14:14:58Z
Author: exarkun
Comment:

Ticket #2787, comment 1

v1	v2
1	1	It's not possible to fix this inside `allocate_tcp_port` itself. So I'm planning to close this ticket. Instead, we'll have a ticket for each test which can fail this way and they'll have to be fixed one by one.
2	2
3		The reason we cannot fix this inside `allocate_tcp_port` is that the approach it is a component of suffers from an unavoidable race condition. `allocate_tcp_port` tries to figure out a specific TCP port number which _will not be in use at a later point in time_. Since there is no part of the system which allows the port number to be reserved or otherwise kept out of use except by the one piece of code we intend, it cannot actually know whether any port number it selects will satisfy this requirement.
	3	The reason we cannot fix this inside `allocate_tcp_port` is that the approach it is a component of suffers from an unavoidable race condition. `allocate_tcp_port` tries to figure out a specific TCP port number which ''will not be in use at a later point in time''. Since there is no part of the system which allows the port number to be reserved or otherwise kept out of use '''except by the one piece of code we intend''', it cannot actually know whether any port number it selects will satisfy this requirement.
4	4
5	5	In practice, it does succeed with high probability. However, because it is used so often (many times per test suite run, and the test suite itself is run many times), even this high probability of success is not good enough. I will make an incredibly naive estimate that there are 2^15^ ports available for "random" assignment and that the chance of an unrelated intermediate assignment being made is about 1 in 2 (I suspect some tests themselves trigger an unrelated intermediate port assignment). The chance of a collision is therefore 1 in 2^16^ (around 0.0015%). If there are 100 users of `allocate_tcp_port` in the test suite then the chance of a collision anywhere in the test suite is about 100 in 2^16^. There are about 15 different CI runners of the test suite, so the chance of a failure on any of them for one build set is about 15 * 100 in 2^16^. The test suite is run for every pull request and every master revision. If there is one PR merged a day, that is at least 14 build sets a week, so the chance of a failure in a week is roughly 14 * 15 * 100 in 2^16^, which works out to around 32%. That is easily high enough to be disruptive to development.
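
The estimate above can be reproduced with a few lines of Python. This is only a sketch of the comment's own arithmetic: every constant is one of the naive guesses stated in the comment, not a measurement, and the variable names are made up for illustration. It also computes the figure obtained by treating each call as an independent trial instead of adding the chances together, which comes out a little lower (about 27%) but leads to the same conclusion.

```python
from fractions import Fraction

# All figures are the rough guesses from the comment, not measured values.
PORTS = 2 ** 15                    # ports available for "random" assignment
P_INTERMEDIATE = Fraction(1, 2)    # chance of an unrelated intermediate assignment
USERS = 100                        # callers of allocate_tcp_port in the test suite
RUNNERS = 15                       # CI runners per build set
BUILD_SETS_PER_WEEK = 14           # one PR build plus one master build per day

# Chance that a single call collides: 1 in 2^16.
p_single = P_INTERMEDIATE / PORTS

# Total number of calls made across all runners in one week.
attempts = USERS * RUNNERS * BUILD_SETS_PER_WEEK

# The comment's figure: simply add up the per-call chances.
linear_estimate = attempts * p_single

# Alternative: treat every call as an independent trial and ask for
# the chance that at least one of them collides.
independent_estimate = 1 - (1 - float(p_single)) ** attempts

print(f"per-call collision chance: {float(p_single):.4%}")
print(f"weekly chance (added up):  {float(linear_estimate):.1%}")
print(f"weekly chance (independent trials): {independent_estimate:.1%}")
```

Changing any of the constants (for example, the number of CI runners) shows how quickly the weekly failure chance grows with the number of calls, even though each individual call almost always succeeds.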