.. _socket-howto:

****************************
  Socket Programming HOWTO
****************************

:Author: Gordon McMillan


.. topic:: Abstract

   Sockets are used nearly everywhere, but are one of the most severely
   misunderstood technologies around. This is a 10,000 foot overview of sockets.
   It's not really a tutorial - you'll still have work to do in getting things
   operational. It doesn't cover the fine points (and there are a lot of them), but
   I hope it will give you enough background to begin using them decently.


Sockets
=======

I'm only going to talk about INET (i.e. IPv4) sockets, but they account for at least 99% of
the sockets in use. And I'll only talk about STREAM (i.e. TCP) sockets - unless you really
know what you're doing (in which case this HOWTO isn't for you!), you'll get
better behavior and performance from a STREAM socket than anything else. I will
try to clear up the mystery of what a socket is, and offer some hints on how to
work with blocking and non-blocking sockets. But I'll start by talking about
blocking sockets. You'll need to know how they work before dealing with
non-blocking sockets.

Part of the trouble with understanding these things is that "socket" can mean a
number of subtly different things, depending on context. So first, let's make a
distinction between a "client" socket - an endpoint of a conversation, and a
"server" socket, which is more like a switchboard operator. The client
application (your browser, for example) uses "client" sockets exclusively; the
web server it's talking to uses both "server" sockets and "client" sockets.


History
-------

Of the various forms of :abbr:`IPC (Inter Process Communication)`,
sockets are by far the most popular.  On any given platform, there are
likely to be other forms of IPC that are faster, but for
cross-platform communication, sockets are about the only game in town.

They were invented in Berkeley as part of the BSD flavor of Unix. They spread
like wildfire with the internet. With good reason --- the combination of sockets
with INET makes talking to arbitrary machines around the world unbelievably easy
(at least compared to other schemes).


Creating a Socket
=================

Roughly speaking, when you clicked on the link that brought you to this page,
your browser did something like the following::

   import socket

   # create an INET, STREAMing socket
   s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   # now connect to the web server on port 80 - the normal http port
   s.connect(("www.python.org", 80))

When the ``connect`` completes, the socket ``s`` can be used to send
in a request for the text of the page. The same socket will read the
reply, and then be destroyed. That's right, destroyed. Client sockets
are normally only used for one exchange (or a small set of sequential
exchanges).

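As a rough sketch of that whole exchange (connect, send a request, read the
reply until the server closes its end, then discard the socket), something like
the following would do. The helper name ``one_shot_request`` is made up for
illustration, and the read loop assumes the server closes the connection once
it has sent its reply:

```python
import socket

def one_shot_request(host, port, request):
    """Connect, send one request, read the whole reply, then close."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(request)            # sendall keeps sending until all is sent
        chunks = []
        while True:
            chunk = s.recv(4096)
            if chunk == b'':          # the server closed the connection
                break
            chunks.append(chunk)
    # leaving the with-block closed (destroyed) the socket
    return b''.join(chunks)
```

Against a real web server, the ``request`` argument would be something like
``b"GET / HTTP/1.0\r\nHost: www.python.org\r\n\r\n"``.
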
What happens in the web server is a bit more complex. First, the web server
creates a "server socket"::

   # create an INET, STREAMing socket
   serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   # bind the socket to a public host, and a well-known port
   serversocket.bind((socket.gethostname(), 80))
   # become a server socket
   serversocket.listen(5)

A couple of things to notice: we used ``socket.gethostname()`` so that the socket
would be visible to the outside world.  If we had used
``serversocket.bind(('localhost', 80))`` or
``serversocket.bind(('127.0.0.1', 80))`` we would still have a "server" socket,
but one that was only visible within the same machine.
``serversocket.bind(('', 80))`` specifies that the socket is reachable by any
address the machine happens to have.

A second thing to note: low number ports are usually reserved for "well known"
services (HTTP, SNMP, etc.). If you're playing around, use a nice high number (4
digits).

Finally, the argument to ``listen`` tells the socket library that we want it to
queue up as many as 5 connect requests (the normal max) before refusing outside
connections. If the rest of the code is written properly, that should be plenty.

Now that we have a "server" socket, listening on port 80, we can enter the
mainloop of the web server::

   while True:
       # accept connections from outside
       (clientsocket, address) = serversocket.accept()
       # now do something with the clientsocket
       # in this case, we'll pretend this is a threaded server;
       # client_thread stands in for a threading.Thread subclass
       ct = client_thread(clientsocket)
       # start() runs the handler in a new thread (run() would run it inline)
       ct.start()

There are actually three general ways in which this loop could work:
dispatching a thread to handle ``clientsocket``, creating a new process to
handle ``clientsocket``, or restructuring this app to use non-blocking sockets
and multiplexing between our "server" socket and any active ``clientsocket``\ s
using ``select``. More about that later. The important thing to understand now is
this: this is *all* a "server" socket does. It doesn't send any data. It doesn't
receive any data. It just produces "client" sockets. Each ``clientsocket`` is
created in response to some *other* "client" socket doing a ``connect()`` to the
host and port we're bound to. As soon as we've created that ``clientsocket``, we
go back to listening for more connections. The two "clients" are free to chat it
up - they are using some dynamically allocated port which will be recycled when
the conversation ends.

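To make the threaded variant concrete, here is a minimal sketch. The names
``handle_client`` and ``serve_forever`` are my own inventions, and the handler
simply echoes back whatever it receives, which is an assumption made only so
the example has something to do:

```python
import socket
import threading

def handle_client(clientsocket):
    """Hypothetical per-connection handler: echo until the peer closes."""
    with clientsocket:
        while True:
            data = clientsocket.recv(4096)
            if data == b'':            # the other side closed the connection
                break
            clientsocket.sendall(data)

def serve_forever(serversocket):
    """Accept connections and hand each one to its own thread."""
    while True:
        try:
            clientsocket, address = serversocket.accept()
        except OSError:
            break                      # the listening socket is no longer usable
        # the "server" socket only produces client sockets; the thread
        # does all the talking
        threading.Thread(target=handle_client, args=(clientsocket,),
                         daemon=True).start()
```

The listening socket is passed in already bound and listening, so the same
loop works whether it was bound to a public address or to ``'localhost'``.
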

IPC
---

If you need fast IPC between two processes on one machine, you should look into
pipes or shared memory.  If you do decide to use AF_INET sockets, bind the
"server" socket to ``'localhost'``. On most platforms, this will take a
shortcut around a couple of layers of network code and be quite a bit faster.

.. seealso::
   The :mod:`multiprocessing` module integrates cross-platform IPC into a
   higher-level API.


Using a Socket
==============

The first thing to note is that the web browser's "client" socket and the web
server's "client" socket are identical beasts. That is, this is a "peer to peer"
conversation. Or to put it another way, *as the designer, you will have to
decide what the rules of etiquette are for a conversation*. Normally, the
``connect``\ ing socket starts the conversation, by sending in a request, or
perhaps a signon. But that's a design decision - it's not a rule of sockets.

Now there are two sets of verbs to use for communication. You can use ``send``
and ``recv``, or you can transform your client socket into a file-like beast and
use ``read`` and ``write``. The latter is the way Java presents its sockets.
I'm not going to talk about it here, except to warn you that you need to use
``flush`` on sockets. These are buffered "files", and a common mistake is to
``write`` something, and then ``read`` for a reply. Without a ``flush`` in
there, you may wait forever for the reply, because the request may still be in
your output buffer.

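A small sketch of the file-like route, in case you do go that way. It uses
``socket.socketpair()`` and ``makefile()``, neither of which is discussed
above; the pair is there only so the snippet can talk to itself:

```python
import socket

a, b = socket.socketpair()            # two already-connected sockets
writer = a.makefile('wb')             # buffered, file-like view of a
reader = b.makefile('rb')             # buffered, file-like view of b

writer.write(b"hello\n")              # so far this only fills the output buffer
writer.flush()                        # now the bytes actually reach the socket
line = reader.readline()              # the other end can read the line

writer.close()
reader.close()
a.close()
b.close()
```

Drop the ``flush`` and the ``readline`` may well sit there forever, which is
exactly the mistake described above.
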
Now we come to the major stumbling block of sockets - ``send`` and ``recv`` operate
on the network buffers. They do not necessarily handle all the bytes you hand
them (or expect from them), because their major focus is handling the network
buffers. In general, they return when the associated network buffers have been
filled (``send``) or emptied (``recv``). They then tell you how many bytes they
handled. It is *your* responsibility to call them again until your message has
been completely dealt with.

When a ``recv`` returns 0 bytes, it means the other side has closed (or is in
the process of closing) the connection.  You will not receive any more data on
this connection. Ever.  You may be able to send data successfully; I'll talk
more about this later.

A protocol like HTTP (in its simplest form, without persistent connections)
uses a socket for only one transfer. The client sends a request, then reads a
reply.  That's it. The socket is discarded. This means that a client can detect
the end of the reply by receiving 0 bytes.

But if you plan to reuse your socket for further transfers, you need to realize
that *there is no* :abbr:`EOT (End of Transfer)` *on a socket.* I repeat: if a socket
``send`` or ``recv`` returns after handling 0 bytes, the connection has been
broken.  If the connection has *not* been broken, you may wait on a ``recv``
forever, because the socket will *not* tell you that there's nothing more to
read (for now).  Now if you think about that a bit, you'll come to realize a
fundamental truth of sockets: *messages must either be fixed length* (yuck), *or
be delimited* (shrug), *or indicate how long they are* (much better), *or end by
shutting down the connection*. The choice is entirely yours, (but some ways are
righter than others).

Assuming you don't want to end the connection, the simplest solution is a fixed
length message::

   import socket

   # fixed message length, agreed on by both ends (the value is just an example)
   MSGLEN = 4096

   class MySocket:
       """demonstration class only
         - coded for clarity, not efficiency
       """

       def __init__(self, sock=None):
           if sock is None:
               self.sock = socket.socket(
                               socket.AF_INET, socket.SOCK_STREAM)
           else:
               self.sock = sock

       def connect(self, host, port):
           self.sock.connect((host, port))

       def mysend(self, msg):
           totalsent = 0
           while totalsent < MSGLEN:
               sent = self.sock.send(msg[totalsent:])
               if sent == 0:
                   raise RuntimeError("socket connection broken")
               totalsent = totalsent + sent

       def myreceive(self):
           chunks = []
           bytes_recd = 0
           while bytes_recd < MSGLEN:
               chunk = self.sock.recv(min(MSGLEN - bytes_recd, 2048))
               if chunk == b'':
                   raise RuntimeError("socket connection broken")
               chunks.append(chunk)
               bytes_recd = bytes_recd + len(chunk)
           return b''.join(chunks)

The sending code here is usable for almost any messaging scheme - in Python you
send bytes, and you can use ``len()`` to determine their length (even if they
have embedded ``\0`` bytes). It's mostly the receiving code that gets more
complex. (And in C, it's not much worse, except you can't use ``strlen`` if the
message has embedded ``\0``\ s.)

The easiest enhancement is to make the first character of the message an
indicator of message type, and have the type determine the length. Now you have
two ``recv``\ s - the first to get (at least) that first character so you can
look up the length, and the second in a loop to get the rest. If you decide to
go the delimited route, you'll be receiving in some arbitrary chunk size, (4096
or 8192 is frequently a good match for network buffer sizes), and scanning what
you've received for a delimiter.

One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass
``recv`` an arbitrary chunk size, you may end up reading the start of a
following message. You'll need to put that aside and hold onto it, until it's
needed.

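A rough sketch of the delimited route, including the "put it aside" part. The
class name ``LineReader`` and the choice of ``b'\n'`` as the delimiter are
assumptions of mine; treat this as a starting point, not a finished
implementation:

```python
import socket

class LineReader:
    """Sketch: split a byte stream into delimiter-terminated messages."""

    def __init__(self, sock, delimiter=b'\n'):
        self.sock = sock
        self.delimiter = delimiter
        self.buffer = b''              # bytes received but not yet consumed

    def next_message(self):
        # keep receiving until the buffer holds at least one whole message
        while self.delimiter not in self.buffer:
            chunk = self.sock.recv(4096)
            if chunk == b'':
                raise RuntimeError("socket connection broken")
            self.buffer += chunk
        # hand back the first message; whatever follows the delimiter
        # (possibly the start of the next message) stays in the buffer
        message, _, self.buffer = self.buffer.partition(self.delimiter)
        return message
```

Because the leftover bytes live on the object, you need one ``LineReader`` per
connection, kept around for as long as the conversation lasts.
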
Prefixing the message with its length (say, as 5 numeric characters) gets more
complex, because (believe it or not), you may not get all 5 characters in one
``recv``. In playing around, you'll get away with it; but in high network loads,
your code will very quickly break unless you use two ``recv`` loops - the first
to determine the length, the second to get the data part of the message. Nasty.
This is also when you'll discover that ``send`` does not always manage to get
rid of everything in one pass. And despite having read this, you will eventually
get bitten by it!

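For comparison with your own attempt, the two-loop idea might look roughly
like this. The function names are hypothetical, the prefix is the 5 ASCII
digits mentioned above, and I've used ``sendall`` rather than a hand-written
send loop to keep the sketch short:

```python
import socket

HEADER_LEN = 5                         # five ASCII digits, as in the text

def recv_exactly(sock, n):
    """Loop on recv until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if chunk == b'':
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def send_message(sock, payload):
    """Send payload preceded by its length as 5 zero-padded digits."""
    if len(payload) > 99999:
        raise ValueError("payload too long for a 5-digit length prefix")
    header = b'%05d' % len(payload)
    sock.sendall(header + payload)

def recv_message(sock):
    """First loop reads the length prefix, second loop reads the data."""
    length = int(recv_exactly(sock, HEADER_LEN))
    return recv_exactly(sock, length)
```

Since ``recv_exactly`` never asks for more than the current message needs, it
never swallows the start of the next one.
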
In the interests of space, building your character, (and preserving my
competitive position), these enhancements are left as an exercise for the
reader. Let's move on to cleaning up.


Binary Data
-----------

It is perfectly possible to send binary data over a socket. The major problem is
that not all machines use the same formats for binary data. For example,
`network byte order <https://en.wikipedia.org/wiki/Endianness#Networking>`_
is big-endian, with the most significant byte first,
so a 16 bit integer with the value ``1`` would be the two hex bytes ``00 01``.
However, most common processors (x86/AMD64, ARM, RISC-V) are little-endian,
with the least significant byte first - that same ``1`` would be ``01 00``.

Socket libraries have calls for converting 16 and 32 bit integers - ``ntohl``,
``htonl``, ``ntohs`` and ``htons``, where "n" means *network* and "h" means
*host*, "s" means *short* and "l" means *long*. Where network order is host
order, these do nothing, but where the machine is byte-reversed, these swap the
bytes around appropriately.

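In Python the same conversions are exposed by the ``socket`` module. The
sketch below also uses the standard ``struct`` module, which isn't mentioned
above but is a common way to get actual bytes in network order:

```python
import socket
import struct

# '!' selects network (big-endian) byte order; 'H' is an unsigned 16-bit int.
packed = struct.pack('!H', 1)          # b'\x00\x01' on every platform
(value,) = struct.unpack('!H', packed)

# The classic conversion calls are available too; each pair round-trips.
short_round_trip = socket.ntohs(socket.htons(1))
long_round_trip = socket.ntohl(socket.htonl(1))
```

On a little-endian machine ``socket.htons(1)`` is ``256`` (the bytes were
swapped); on a big-endian one it is still ``1``.
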
In these days of 64-bit machines, the ASCII representation of binary data is
frequently smaller than the binary representation. That's because a surprising
amount of the time, most integers have the value 0, or maybe 1.
The string ``"0"`` would be two bytes, while a full 64-bit integer would be 8.
Of course, this doesn't fit well with fixed-length messages.
Decisions, decisions.


Disconnecting
=============

Strictly speaking, you're supposed to use ``shutdown`` on a socket before you
``close`` it.  The ``shutdown`` is an advisory to the socket at the other end.
Depending on the argument you pass it, it can mean "I'm not going to send
any more, but I'll still listen", or "I'm not listening, good riddance!".  Most
socket libraries, however, are so used to programmers neglecting to use this
piece of etiquette that normally a ``close`` is the same as
``shutdown(); close()``.  So in most situations, an explicit ``shutdown`` is
not needed.

One way to use ``shutdown`` effectively is in an HTTP-like exchange. The client
sends a request and then does a ``shutdown(1)`` (that is,
``shutdown(socket.SHUT_WR)``). This tells the server "This
client is done sending, but can still receive."  The server can detect "EOF" by
a receive of 0 bytes. It can assume it has the complete request.  The server
sends a reply. If the ``send`` completes successfully then, indeed, the client
was still receiving.

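That exchange could be sketched as follows. Both function names are made up,
and ``handler`` stands for whatever turns a request into a reply:

```python
import socket

def serve_one(conn, handler):
    """Read a whole request (until EOF), then send one reply and close."""
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if chunk == b'':           # the client is done sending
                break
            chunks.append(chunk)
        conn.sendall(handler(b''.join(chunks)))

def request(sock, payload):
    """Send a request, half-close, then read the reply until EOF."""
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)      # done sending, still able to receive
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if chunk == b'':               # the server closed its end
            break
        chunks.append(chunk)
    return b''.join(chunks)
```

Neither side needs a length prefix or a delimiter here: the half-close marks
the end of the request, and the server's ``close`` marks the end of the reply.
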
Python takes the automatic shutdown a step further, and says that when a socket
is garbage collected, it will automatically do a ``close`` if it's needed. But
relying on this is a very bad habit. If your socket just disappears without
doing a ``close``, the socket at the other end may hang indefinitely, thinking
you're just being slow. *Please* ``close`` your sockets when you're done.

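One low-effort way to make sure the ``close`` happens is to use the socket as
a context manager, which socket objects support. A minimal sketch (binding to
port 0 just asks the OS for any free port):

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(('127.0.0.1', 0))           # port 0: let the OS pick a free port
    s.listen(1)
    port = s.getsockname()[1]

# Leaving the block closed the socket, even if an exception had been raised;
# a closed socket reports a file descriptor of -1.
closed = s.fileno() == -1
```
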

When Sockets Die
----------------

Probably the worst thing about using blocking sockets is what happens when the
other side comes down hard (without doing a ``close``). Your socket is likely to
hang. TCP is a reliable protocol, and it will wait a long, long time
before giving up on a connection. If you're using threads, the entire thread is
essentially dead. There's not much you can do about it. As long as you aren't
doing something dumb, like holding a lock while doing a blocking read, the
thread isn't really consuming much in the way of resources. Do *not* try to kill
the thread - part of the reason that threads are more efficient than processes
is that they avoid the overhead associated with the automatic recycling of
resources. In other words, if you do manage to kill the thread, your whole
process is likely to be screwed up.


Non-blocking Sockets
====================

If you've understood the preceding, you already know most of what you need to
know about the mechanics of using sockets. You'll still use the same calls, in
much the same ways. It's just that, if you do it right, your app will be almost
inside-out.

In Python, you use ``socket.setblocking(False)`` to make a socket non-blocking.
In C, it's more complex, (for one thing, you'll need to choose between the POSIX
flavor ``O_NONBLOCK`` and the older, almost indistinguishable ``O_NDELAY``, which
is completely different from ``TCP_NODELAY``), but it's the exact same idea. You
do this after creating the socket, but before using it. (Actually, if you're
nuts, you can switch back and forth.)

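To see what non-blocking means in practice, here is a minimal sketch. It uses
a connected pair from ``socket.socketpair()``, chosen only so the snippet runs
on its own. A ``recv`` with nothing to read raises ``BlockingIOError`` at once
instead of waiting:

```python
import socket

a, b = socket.socketpair()             # two already-connected sockets
a.setblocking(False)                   # switch a to non-blocking mode

try:
    data = a.recv(4096)                # nothing has been sent to a yet
except BlockingIOError:
    data = None                        # "would block": nothing was done

a.close()
b.close()
```
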
The major mechanical difference is that ``send``, ``recv``, ``connect`` and
``accept`` can return without having done anything. You have (of course) a
number of choices. You can check return codes and error codes and generally drive
yourself crazy. If you don't believe me, try it sometime. Your app will grow
large, buggy and suck CPU. So let's skip the brain-dead solutions and do it
right.

Use ``select``.

In C, coding ``select`` is fairly complex. In Python, it's a piece of cake, but
it's close enough to the C version that if you understand ``select`` in Python,
you'll have little trouble with it in C::

   ready_to_read, ready_to_write, in_error = select.select(
       potential_readers,
       potential_writers,
       potential_errs,
       timeout)

You pass ``select`` three lists: the first contains all sockets that you might
want to try reading; the second all the sockets you might want to try writing
to, and the last (normally left empty) those that you want to check for errors.
You should note that a socket can go into more than one list. The ``select``
call is blocking, but you can give it a timeout. This is generally a sensible
thing to do - give it a nice long timeout (say a minute) unless you have good
reason to do otherwise.

In return, you will get three lists. They contain the sockets that are actually
readable, writable and in error. Each of these lists is a subset (possibly
empty) of the corresponding list you passed in.

If a socket is in the output readable list, you can be
as-close-to-certain-as-we-ever-get-in-this-business that a ``recv`` on that
socket will return *something*. Same idea for the writable list. You'll be able
to send *something*. Maybe not all you want to, but *something* is better than
nothing.  (Actually, any reasonably healthy socket will return as writable - it
just means outbound network buffer space is available.)

If you have a "server" socket, put it in the potential_readers list. If it comes
out in the readable list, your ``accept`` will (almost certainly) work. If you
have created a new socket to ``connect`` to someone else, put it in the
potential_writers list. If it shows up in the writable list, you have a decent
chance that it has connected.

Actually, ``select`` can be handy even with blocking sockets. It's one way of
determining whether you will block - the socket returns as readable when there's
something in the buffers.  However, this still doesn't help with the problem of
determining whether the other end is done, or just busy with something else.

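Putting those pieces together, a ``select`` loop for a small echo server might
look roughly like this. The function name and the ``max_clients`` stopping
rule are made up so the example can end. To keep it short, the sockets stay in
blocking mode and replies go out with ``sendall``; a fully non-blocking version
would also queue outgoing data and watch the writable list:

```python
import select
import socket

def select_echo(serversocket, max_clients):
    """Echo data back to clients until max_clients have come and gone."""
    readers = [serversocket]           # the "server" socket goes in the read list
    finished = 0
    while finished < max_clients:
        ready_to_read, _, _ = select.select(readers, [], [], 60)
        for sock in ready_to_read:
            if sock is serversocket:
                # readable "server" socket: accept should succeed
                clientsocket, address = serversocket.accept()
                readers.append(clientsocket)
            else:
                data = sock.recv(4096)
                if data == b'':        # the peer closed the connection
                    readers.remove(sock)
                    sock.close()
                    finished += 1
                else:
                    sock.sendall(data)
```

One thread serves every connection here; the price is that a single slow
``sendall`` holds up everybody else, which is what the writable list is for.
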
**Portability alert**: On Unix, ``select`` works with both sockets and
files. Don't try this on Windows. On Windows, ``select`` works with sockets
only. Also note that in C, many of the more advanced socket options are done
differently on Windows. In fact, on Windows I usually use threads (which work
very, very well) with my sockets.