Use select() in Python Socket Programming at Your Own Risk
The Python Socket Programming HOWTO introduces non-blocking sockets together with select.
This is a tale of how select can cause issues if you're not careful.
Start off by reading How to increase filedescriptor's range in python select(). Pay special attention to this part:
Strictly speaking, select() is limited in the highest file descriptor it can support, as opposed to the number of them in a given call - see the start of the Notes section of the select() manpage.
Let's get the relevant information from man pages on macOS and Linux.
$ man -S 2 select | tail -n 22 | head -n 10
BUGS
     Although the provision of getdtablesize(2) was intended to allow user
     programs to be written independent of the kernel limit on the number of
     open files, the dimension of a sufficiently large bit field for select
     remains a problem.  The default size FD_SETSIZE (currently 1024) is
     somewhat smaller than the current kernel limit to the number of open
     files.  However, in order to accommodate programs which might
     potentially use a larger number of open files with select, it is
     possible to increase this size within a program by providing a larger
     definition of FD_SETSIZE before the inclusion of <sys/types.h>.

$ curl -s http://man7.org/linux/man-pages/man2/select.2.html#BUGS | grep -A 15 'href="#BUGS"></a>BUGS'
<h2><a id="BUGS" href="#BUGS"></a>BUGS <a href="#top_of_page"><span class="top-link">top</span></a></h2><pre>
       POSIX allows an implementation to define an upper limit, advertised
       via the constant <b>FD_SETSIZE</b>, on the range of file descriptors that
       can be specified in a file descriptor set.  The Linux kernel imposes
       no fixed limit, but the glibc implementation makes <i>fd_set</i> a fixed-
       size type, with <b>FD_SETSIZE </b>defined as 1024, and the <b>FD_*</b>() macros
       operating according to that limit.  To monitor file descriptors
       greater than 1023, use <a href="../man2/poll.2.html">poll(2)</a> instead.

       According to POSIX, <b>select</b>() should check all specified file
       descriptors in the three file descriptor sets, up to the limit
       <i>nfds-1</i>.  However, the current implementation ignores any file
       descriptor in these sets that is greater than the maximum file
       descriptor number that the process currently has open.  According to
       POSIX, any such file descriptor that is specified in one of the sets
       should result in the error <b>EBADF</b>.
In simpler words, select can only handle file descriptors whose number is
strictly less than the value of FD_SETSIZE at the time Python was built. By
default that value is 1024, so the usable descriptors are 0 through 1023. One
could pass a single socket to select and it would still raise an exception --
ValueError: filedescriptor out of range in select() -- if the fileno of the
socket is greater than or equal to the value of FD_SETSIZE.
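The boundary can be made concrete with a small guard. This is only a sketch: Python's select module does not appear to expose FD_SETSIZE, so ASSUMED_FD_SETSIZE below is a hard-coded assumption matching the common default of 1024, and is_selectable is a hypothetical helper name, not part of any library.

```python
# Assumption: the interpreter was built with the common default FD_SETSIZE
# of 1024. Adjust this value if your build differs.
ASSUMED_FD_SETSIZE = 1024


def is_selectable(sock, fd_setsize=ASSUMED_FD_SETSIZE):
    """Return True if sock's descriptor should be acceptable to select().

    Valid descriptors run from 0 to fd_setsize - 1, so the upper
    comparison is strict. A closed socket reports -1 and is rejected.
    """
    fd = sock.fileno()
    return 0 <= fd < fd_setsize
```

A caller could check is_selectable(s) before passing s to select.select and fall back to another mechanism when it returns False.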
This is a long-standing issue and there are multiple workarounds:
Rebuild Python after increasing the value of FD_SETSIZE in the sys/types.h C header
Do not pass in a non-blocking socket whose fileno is greater than 1023
Do not use select with non-blocking sockets
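One way to follow the last workaround, in line with the Linux man page's advice to use poll(2), is the standard selectors module. The sketch below is an illustration rather than a drop-in fix: wait_writable is a hypothetical helper name, and on a platform that offers nothing better than select, DefaultSelector would fall back to it and keep the same descriptor limit.

```python
import selectors
import socket


def wait_writable(sock, timeout=None):
    """Block until sock is writable or the timeout expires.

    Returns True if the socket became writable, False on timeout.
    """
    # DefaultSelector picks the most efficient mechanism the platform
    # offers (for example epoll or kqueue), which is not bound by FD_SETSIZE.
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_WRITE)
        events = sel.select(timeout)
        return bool(events)
```

For a non-blocking socket with a connect in progress, wait_writable(s, 5) plays the role that select.select([], [s], [], 5) plays in the sessions in this post.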
Repro
Let's look at some examples that reproduce this issue.
Here's an example that works.
$ python3.6
Python 3.6.1 (default, May 1 2017, 22:40:40)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.24.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> import select
>>> s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
>>> s.fileno()
3
>>> s.connect(('127.0.0.1', 9090))
>>> r, w, err = select.select([], [s], [])
>>> r
[]
>>> w
[<socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 52739), raddr=('127.0.0.1', 9090)>]
>>> err
[]
>>> s.shutdown(socket.SHUT_RDWR)
>>> s.close()
>>>
The following is functionally the same as the above, with the blocking mode set explicitly.
$ python3.6
Python 3.6.1 (default, May 1 2017, 22:40:40)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.24.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> import select
>>> s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
>>> s.fileno()
3
>>> s.timeout
>>> s.setblocking(1)
>>> s.settimeout(None)
>>> s.connect(('127.0.0.1', 9090))
>>> r, w, err = select.select([], [s], [])
>>> r
[]
>>> w
[<socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 52748), raddr=('127.0.0.1', 9090)>]
>>> err
[]
>>> s.shutdown(socket.SHUT_RDWR)
>>> s.close()
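Now let's make the socket non-blocking, give it a timeout, and demonstrate that select still works. Note that calling settimeout(5) after setblocking(0) puts the socket in timeout mode rather than leaving it purely non-blocking.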
$ python3.6
Python 3.6.1 (default, May 1 2017, 22:40:40)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.24.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> import select
>>> s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
>>> s.fileno()
3
>>> s.timeout
>>> s.setblocking(0)
>>> s.settimeout(5)
>>> s.timeout
5.0
>>> s.connect(('127.0.0.1', 9090))
>>> r, w, err = select.select([], [s], [])
>>> r
[]
>>> w
[<socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 52774), raddr=('127.0.0.1', 9090)>]
>>> err
[]
>>> s.shutdown(socket.SHUT_RDWR)
>>> s.close()
Let's reproduce the exception ValueError: filedescriptor out of range in select().
$ python3.6
Python 3.6.1 (default, May 1 2017, 22:40:40)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.24.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> import select
>>> sockets = []
>>> for i in range(1024):
...     s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
...     s.setblocking(False)
...     s.settimeout(5)
...     s.connect(('127.0.0.1', 9090))
...     sockets.append(s)
...     r, w, err = select.select([], sockets, [])
...
Traceback (most recent call last):
  File "<stdin>", line 6, in <module>
BlockingIOError: [Errno 36] Operation now in progress
>>> s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
>>> s.fileno()
1025
>>> s.timeout
>>> s.setblocking(False)
>>> s.settimeout(5)
>>> s.timeout
5.0
>>> s.connect(('127.0.0.1', 9090))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BlockingIOError: [Errno 36] Operation now in progress
>>> r, w, err = select.select([], [s], [], 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: filedescriptor out of range in select()
>>> s.shutdown(socket.SHUT_RDWR)
>>> s.close()
>>> for s in sockets:
...     try:
...         s.shutdown(socket.SHUT_RDWR)
...     except:
...         pass
...     try:
...         s.close()
...     except:
...         pass
...
>>>
Workaround
With a lot of open sockets in a process, calling connect on a new socket
raises the exception BlockingIOError: [Errno 36] Operation now in progress. A
solution that worked for me was to sleep for a few seconds and call connect
again. The second attempt raises a different exception. If that exception is
OSError: [Errno 56] Socket is already connected, then the socket is ready and
you can use it as intended. (36 and 56 are the macOS values of EINPROGRESS and
EISCONN; other platforms number them differently.)
$ python3.6
Python 3.6.1 (default, May 1 2017, 22:40:40)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.24.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> import select
>>> sockets = []
>>> for i in range(1024):
...     s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
...     s.setblocking(False)
...     s.settimeout(5)
...     s.connect(('127.0.0.1', 9090))
...     sockets.append(s)
...     r, w, err = select.select([], sockets, [])
...
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
BlockingIOError: [Errno 36] Operation now in progress
>>> import time
>>> time.sleep(5)
>>> s = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
>>> s.fileno()
1025
>>> s.connect(('127.0.0.1', 9090))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 56] Socket is already connected
>>> try:
...     s.connect(('127.0.0.1', 9090))
... except OSError as e:
...     if "[Errno 56] Socket is already connected" in str(e):
...         print("All is well")
...     else:
...         print("Something went horribly wrong")
...         raise e
...
All is well
>>> s.shutdown(socket.SHUT_RDWR)
>>> s.close()
>>> for s in sockets:
...     try:
...         s.shutdown(socket.SHUT_RDWR)
...     except:
...         pass
...     try:
...         s.close()
...     except:
...         pass
...
>>>
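Matching on the error string ties the check to one platform's errno numbering. A more portable variant of the same sleep-and-retry idea compares against the constants in the errno module. The sketch below is an assumption-laden illustration, not the exact code used in the session above: connect_with_retry and its attempts and delay parameters are hypothetical names. It also treats a return value of 0 as success, because some platforms may report plain success rather than EISCONN on the retry.

```python
import errno
import os
import socket
import time

# errno values that mean "the connection attempt is still in progress".
_IN_PROGRESS = {errno.EINPROGRESS, errno.EALREADY, errno.EWOULDBLOCK}


def connect_with_retry(sock, address, attempts=5, delay=1.0):
    """Call connect repeatedly until the socket reports it is connected."""
    for _ in range(attempts):
        # connect_ex returns the errno value instead of raising.
        err = sock.connect_ex(address)
        # 0 means this call succeeded; EISCONN means an earlier attempt
        # has already completed.
        if err in (0, errno.EISCONN):
            return
        if err not in _IN_PROGRESS:
            raise OSError(err, os.strerror(err))
        time.sleep(delay)
    raise TimeoutError("socket did not connect after %d attempts" % attempts)
```

With the sockets from the session above, this would be called as connect_with_retry(s, ('127.0.0.1', 9090)).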
Conclusion
Calling socket.socket.setblocking(False) or socket.socket.settimeout(0) makes
a socket non-blocking. Passing such a socket to select when its fileno is
greater than or equal to FD_SETSIZE raises the exception
ValueError: filedescriptor out of range in select().
Avoid this case by not using select on non-blocking sockets when a single
process could create a lot of sockets.