select / poll / epoll: the practical difference

When designing high-performance network applications with non-blocking sockets, you have to decide which method of monitoring socket events to use. There are several, and each has its own strengths and weaknesses. Choosing the right one can be critical to your application's architecture.

In this article we will look at select(), poll(), epoll(), and libevent.


Polling with select()


The old, time-tested workhorse select() was created back in the days when sockets were still called "Berkeley sockets". It was not part of the very first Berkeley sockets specification, because at that time there was no concept of non-blocking I/O. But somewhere in the 80s non-blocking I/O appeared, and select() along with it. Its interface has not changed since.

To use select(), the developer needs to initialize and fill several fd_set structures with the descriptors and events to monitor, and then call select(). Typical code looks like this:

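    fd_set fd_in, fd_out;
    struct timeval tv;

    // Reset the sets
    FD_ZERO( &fd_in );
    FD_ZERO( &fd_out );

    // Monitor sock1 for input events
    FD_SET( sock1, &fd_in );

    // Monitor sock2 for output events
    FD_SET( sock2, &fd_out );

    // Find the socket with the largest numeric value (select requires it)
    int largest_sock = sock1 > sock2 ? sock1 : sock2;

    // Wait up to 10 seconds
    tv.tv_sec = 10;
    tv.tv_usec = 0;

    // Call select
    int ret = select( largest_sock + 1, &fd_in, &fd_out, NULL, &tv );

    // Check whether select succeeded
    if ( ret == -1 )
    {
        // error: report it
    }
    else if ( ret == 0 )
    {
        // timeout: no events detected
    }
    else
    {
        if ( FD_ISSET( sock1, &fd_in ) )
        {
            // input event on sock1
        }

        if ( FD_ISSET( sock2, &fd_out ) )
        {
            // output event on sock2
        }
    }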
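
To try this pattern without real sockets, it can be run against a pipe. The following is a minimal sketch rather than the author's code: the helper name wait_readable_select is hypothetical, and a pipe descriptor stands in for a socket. It also makes two properties of select() visible: the descriptor value must be below FD_SETSIZE, and the set has to be rebuilt before every call.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: waits until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable_select(int fd, int timeout_ms)
{
    /* select() cannot handle descriptors numbered FD_SETSIZE or above. */
    assert(fd >= 0 && fd < FD_SETSIZE);
    assert(timeout_ms >= 0);

    /* The set and the timeout are rebuilt on every call, because select()
     * overwrites the set (and, on Linux, the timeout) before returning. */
    fd_set fd_in;
    FD_ZERO(&fd_in);
    FD_SET(fd, &fd_in);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ret = select(fd + 1, &fd_in, NULL, NULL, &tv);
    if (ret <= 0)
        return ret;             /* -1: error, 0: timeout */

    return FD_ISSET(fd, &fd_in) ? 1 : 0;
}
```

For example, wait_readable_select(sock1, 10000) would wait up to ten seconds for input on sock1.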

When select() was designed, probably no one expected that we would one day need to write multi-threaded applications serving thousands of connections. select() has several significant flaws that make it poorly suited to such systems. The main ones are as follows:


Of course, none of this is news. Operating system developers recognized these problems long ago, and many of them were taken into account when poll was designed. At this point you may ask why we are studying ancient history at all, and whether there is any reason to use the ancient select today. Yes, there are two such reasons. They may never come in handy, but it does not hurt to know them.

The first reason is portability. select() has been with us for a million years. Whatever jungle of software and hardware platforms you end up in, if there is a network there, there will be select. Other methods may be missing, but select is almost guaranteed. And do not think I am going senile and reminiscing about punched cards and ENIAC. The more modern poll, for example, does not exist on Windows XP, but select does.

The second reason is more exotic: select can (theoretically) work with timeouts much finer than a millisecond, if the hardware allows it, while both poll and epoll accept timeouts only in whole milliseconds. Strictly speaking, select's struct timeval is expressed in microseconds; nanosecond resolution comes with its sibling pselect, which takes a struct timespec. This hardly matters on ordinary desktops (or even servers), which do not have a hardware timer that precise anyway. But there are real-time systems in the world that do have such timers. So I beg you: when you write the firmware for a nuclear reactor or a rocket, do not be too lazy to measure time down to the nanosecond. You know, I want to live.
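
For reference, here is what a sub-millisecond timeout could look like with pselect(), whose timeout is a struct timespec. This is a minimal sketch assuming a POSIX system; the helper name wait_readable_ns is hypothetical, and the resolution actually achieved still depends on the kernel and the hardware timer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: waits for fd to become readable, with the timeout
 * given as seconds plus nanoseconds.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable_ns(int fd, time_t sec, long nsec)
{
    /* tv_nsec must stay within one second, otherwise pselect() fails. */
    assert(nsec >= 0 && nsec < 1000000000L);

    fd_set fd_in;
    FD_ZERO(&fd_in);
    FD_SET(fd, &fd_in);

    /* struct timespec carries nanoseconds; struct timeval only microseconds. */
    struct timespec ts;
    ts.tv_sec = sec;
    ts.tv_nsec = nsec;

    /* The last argument is an optional signal mask; NULL leaves it unchanged. */
    int ret = pselect(fd + 1, &fd_in, NULL, NULL, &ts, NULL);
    if (ret <= 0)
        return ret;

    return FD_ISSET(fd, &fd_in) ? 1 : 0;
}
```

A call such as wait_readable_ns(fd, 0, 500000L) asks for a timeout of half a millisecond, which the millisecond-based timeout argument of poll cannot express.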

The case described above is probably the only one in which you really have no choice (only select will do). However, if you are writing an ordinary application for ordinary hardware and will work with a reasonable number of sockets (tens or hundreds, no more), the performance difference between poll and select will not be noticeable, so the choice will come down to other factors.

Polling with poll()


poll is a newer method for polling sockets, created after people started trying to write large, heavily loaded network services. It is much better designed and does not suffer from most of select's shortcomings. In most cases, when writing modern applications, you will be choosing between poll and epoll/libevent.

To use poll, the developer needs to initialize the members of the pollfd structures with the descriptors and events to monitor, and then call poll(). Typical code looks like this:

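    // The structures for two events
    struct pollfd fds[2];

    // Monitor sock1 for input
    fds[0].fd = sock1;
    fds[0].events = POLLIN;

    // Monitor sock2 for output
    fds[1].fd = sock2;
    fds[1].events = POLLOUT;

    // Wait up to 10 seconds
    int ret = poll( fds, 2, 10000 );

    // Check whether poll succeeded
    if ( ret == -1 )
    {
        // error: report it
    }
    else if ( ret == 0 )
    {
        // timeout: no events detected
    }
    else
    {
        // Events detected; reset revents so the structures can be reused
        if ( fds[0].revents & POLLIN )
        {
            fds[0].revents = 0;
            // input event on sock1
        }

        if ( fds[1].revents & POLLOUT )
        {
            fds[1].revents = 0;
            // output event on sock2
        }
    }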
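
The same pattern can be tried on pipes instead of sockets. The following is a minimal sketch, not the author's code: the helper name count_readable and the MAX_WATCHED limit are hypothetical and chosen only for illustration. It shows that the requested events and the reported events live in separate fields of pollfd.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_WATCHED 16   /* arbitrary array size for this sketch */

/* Hypothetical helper: polls n descriptors for input and returns how many
 * are readable, 0 on timeout, or -1 on error. */
static int count_readable(const int *fds, int n, int timeout_ms)
{
    assert(n > 0 && n <= MAX_WATCHED);

    struct pollfd pfds[MAX_WATCHED];
    for (int i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;   /* requested events: poll() leaves these untouched */
        pfds[i].revents = 0;       /* reported events: filled in by poll() */
    }

    int ret = poll(pfds, (nfds_t)n, timeout_ms);
    if (ret <= 0)
        return ret;                /* -1: error, 0: timeout */

    int readable = 0;
    for (int i = 0; i < n; i++)
        if (pfds[i].revents & POLLIN)
            readable++;
    return readable;
}
```

Unlike the select version, nothing here depends on the numeric value of the descriptors; only the number of entries in the array matters.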

poll was created to solve the problems of select. Let's see how well it does:


We have already mentioned one shortcoming of poll above: it is missing on some platforms, such as Windows XP. Starting with Vista it exists, but is called WSAPoll. The prototype is the same, so for platform-independent code you can write a wrapper like this:

    #if defined (_WIN32)
    // Map poll() to its Windows counterpart
    static inline int poll( struct pollfd *pfd, int nfds, int timeout )
    {
        return WSAPoll( pfd, nfds, timeout );
    }
    #endif

There is also the 1 ms timeout resolution, which is very rarely a limitation. However, poll has other disadvantages:


However, all of the above is relatively unimportant for most client applications. The only exceptions are probably P2P protocols, where each client may be connected to thousands of others. Even most server applications can ignore these problems. Thus poll should be your default choice over select, unless one of the two reasons above restricts you.

Looking ahead, I would say that poll is preferable even compared to the more modern epoll (discussed below) in the following cases:


Polling with epoll()


epoll is the newest and best method of waiting for events on Linux (and only on Linux). Well, not exactly "newest": it has been in the kernel since 2002. It differs from poll and select in that it provides an API for adding, removing, and modifying entries in the list of monitored descriptors and events.

Using epoll requires somewhat more preparation. The developer must:


Typical code looks like this:

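    // Create the epoll descriptor. Only one is needed per application, and it is
    // used to monitor all sockets. The size argument is ignored on modern kernels
    // (it used to matter, and must still be positive), so any number will do.
    int pollingfd = epoll_create( 0xCAFE );

    if ( pollingfd < 0 )
    {
        // error: report it
    }

    // Initialize the epoll_event structure
    struct epoll_event ev = { 0 };

    // Attach the connection object to the event. Any data can be stored here;
    // epoll does not interpret it and hands it back unchanged with the event.
    ev.data.ptr = pConnection1;

    // Monitor for input, and disable the descriptor after the first event
    ev.events = EPOLLIN | EPOLLONESHOT;

    // Add the descriptor to the monitoring list. This can be done even while
    // another thread is waiting in epoll_wait; the descriptor is added correctly.
    if ( epoll_ctl( pollingfd, EPOLL_CTL_ADD, pConnection1->getSocket(), &ev ) != 0 )
    {
        // error: report it
    }

    // Receive up to 20 events per call
    struct epoll_event pevents[ 20 ];

    // Wait up to 10 seconds
    int ret = epoll_wait( pollingfd, pevents, 20, 10000 );

    // Check whether epoll_wait succeeded
    if ( ret == -1 )
    {
        // error: report it
    }
    else if ( ret == 0 )
    {
        // timeout: no events detected
    }
    else
    {
        // Walk through the reported events
        for ( int i = 0; i < ret; i++ )
        {
            if ( pevents[i].events & EPOLLIN )
            {
                // Retrieve the connection object stored when the descriptor was added
                Connection * c = (Connection*) pevents[i].data.ptr;
                c->handleReadEvent();
            }
        }
    }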
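
The add, wait, and re-arm cycle can also be tried on a pipe in plain C. The following is a minimal sketch for Linux, not the author's code: the helper names watch_once, rearm, and wait_one are hypothetical, and an arbitrary non-NULL pointer stands in for the connection object.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers fd for one-shot input events and stores tag
 * as user data. Returns 0 on success, -1 on error. */
static int watch_once(int epfd, int fd, void *tag)
{
    /* NULL is reserved below to mean "no event", so tags must be non-NULL. */
    assert(tag != NULL);

    struct epoll_event ev = { 0 };
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.ptr = tag;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: re-enables a descriptor that a one-shot event disabled. */
static int rearm(int epfd, int fd, void *tag)
{
    struct epoll_event ev = { 0 };
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.ptr = tag;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Hypothetical helper: waits for at most one event and returns the tag stored
 * with its descriptor, or NULL on timeout or error. */
static void *wait_one(int epfd, int timeout_ms)
{
    struct epoll_event out;
    int ret = epoll_wait(epfd, &out, 1, timeout_ms);
    if (ret <= 0)
        return NULL;
    return out.data.ptr;
}
```

With EPOLLONESHOT, a descriptor reports one event and then stays silent until rearm is called, even if unread data remains. Each re-arm is a separate epoll_ctl system call, which is part of the extra cost of this method.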

Let's start with the shortcomings of epoll; they are obvious from the code. It is harder to use, requires more code, and makes more system calls.

The advantages are also evident:


But keep in mind that epoll is not "a better poll in every respect". It also has disadvantages compared to poll:


Thus, epoll should be used only when all of the following conditions are met:


If one or more of these conditions are not met, consider using poll or libevent.

libevent


libevent is a library that wraps the polling methods described in this article (as well as some others) in a unified API. The advantage is that code written once can be built and run on different operating systems. Nevertheless, it is important to understand that libevent is just a wrapper, inside which the very same methods listed above do the work, with all their advantages and disadvantages. libevent cannot make select monitor more than 1024 sockets, nor make epoll modify the event list without an additional system call. So knowing the underlying technology is still important.

The need to support different polling methods makes the libevent API more complex. Still, using it is simpler than hand-writing two different event engines for, say, Linux and FreeBSD (using epoll and kqueue).

Consider using libevent when both of the following are true:

Source: https://habr.com/ru/post/415259/

