BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED
The crappy epoll API struck again with reloads and transferred FDs. Indeed, when listening sockets are retrieved by a new worker from a previous one, and the old one finally stops listening on them, it closes the FDs. But in this case, since the sockets themselves were not closed, epoll does not unregister them and continues to report new activity for them in the old process, which can only observe it and count an fd_poll_drop event; it cannot unregister them since they're not reachable anymore.

The unfortunate effect is that long-lasting old processes are woken up at the same rate as the new process when accepting new connections, and can waste a lot of CPU. Accept rates divided by 8 were observed in a small test involving a slow transfer on 10 connections facing a reload every second, so that 10 processes were busy dealing with them while another process was hammering the service with new connections.

Fortunately, years ago we implemented the FD_CLONED flag exactly for similar purposes. Let's simply mark transferred FDs with FD_CLONED so that the process knows that these require special treatment and have to be manually unregistered before being closed. This does the job fine: old processes now correctly unregister the FD before closing it and no longer receive accept events meant for the new process.

This needs to be backported to all stable versions. It only affects epoll, as usual, and this time in combination with transferred FDs (typically reloads in master-worker mode). Thanks to Damien Claisse for providing all the detailed measurements and statistics that made it possible to understand and reproduce the problem.
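The epoll behaviour described above can be reproduced in isolation. The following is a minimal sketch, not HAProxy code: it assumes Linux, uses a pipe instead of a listening socket, and uses dup() as a stand-in for "another process still holds the same socket". The function name events_after_close is hypothetical.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Registers a pipe's read end in epoll, closes it while a duplicate keeps
 * the underlying open file description alive, then makes the pipe readable
 * and returns what epoll_wait() reports (number of events, 0 on timeout).
 * When unregister_first is non-zero, the FD is removed with EPOLL_CTL_DEL
 * before close(). Returns -1 on setup error (FDs are not cleaned up on
 * error paths; this is only a demonstration).
 */
int events_after_close(int unregister_first)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out;
	int pfd[2], dupfd, epfd, ret;

	if (pipe(pfd) < 0)
		return -1;

	epfd  = epoll_create1(0);
	dupfd = dup(pfd[0]);   /* stands in for the copy held by the new worker */
	if (epfd < 0 || dupfd < 0)
		return -1;

	ev.data.fd = pfd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) < 0)
		return -1;

	/* what a process aware of the sharing has to do by hand */
	if (unregister_first)
		epoll_ctl(epfd, EPOLL_CTL_DEL, pfd[0], NULL);

	/* close() alone does not drop the epoll registration here, because
	 * dupfd still references the same open file description.
	 */
	close(pfd[0]);

	/* new activity on the shared object */
	if (write(pfd[1], "x", 1) != 1)
		return -1;

	ret = epoll_wait(epfd, &out, 1, 100);

	close(dupfd);
	close(pfd[1]);
	close(epfd);
	return ret;
}
```

Calling events_after_close(0) shows the stale wakeup on an FD number the process has already closed, while events_after_close(1) times out without any event.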
parent e2744d23be
commit 561319bd1c
@@ -2361,6 +2361,9 @@ static int _getsocks(char **args, char *payload, struct appctx *appctx, void *pr
 		if (!(fdtab[cur_fd].state & FD_EXPORTED))
 			continue;
 
+		/* this FD is now shared between processes */
+		HA_ATOMIC_OR(&fdtab[cur_fd].state, FD_CLONED);
+
 		ns_name = if_name = "";
 		ns_nlen = if_nlen = 0;
 
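To illustrate how such a flag can drive the close path, here is a rough sketch of the pattern under stated assumptions. It is not HAProxy's implementation: the demo_state table, the DEMO_FD_* flags and both functions are hypothetical names, C11 atomics stand in for HA_ATOMIC_OR, and the epoll FD is passed explicitly.

```c
#include <stdatomic.h>
#include <sys/epoll.h>
#include <unistd.h>

#define DEMO_FD_EXPORTED 0x1u  /* hypothetical: FD may be handed to another process */
#define DEMO_FD_CLONED   0x2u  /* hypothetical: FD is now shared with another process */
#define DEMO_MAX_FD      1024

/* hypothetical per-FD state table, zero-initialized */
static atomic_uint demo_state[DEMO_MAX_FD];

/* Called while sending an FD to another process: only exported FDs are
 * transferred, and each transferred FD is atomically marked as cloned.
 * Returns 1 if the FD was marked, 0 if it was skipped.
 */
int demo_mark_transferred(int fd)
{
	if (fd < 0 || fd >= DEMO_MAX_FD)
		return 0;
	if (!(atomic_load(&demo_state[fd]) & DEMO_FD_EXPORTED))
		return 0;
	atomic_fetch_or(&demo_state[fd], DEMO_FD_CLONED);
	return 1;
}

/* Close path: a cloned FD is explicitly removed from epoll first, since
 * close() will not drop the registration while another process keeps the
 * underlying object open. Returns 1 if an explicit removal was attempted.
 */
int demo_fd_close(int epfd, int fd)
{
	int unregistered = 0;

	if (fd < 0 || fd >= DEMO_MAX_FD)
		return 0;

	if (atomic_load(&demo_state[fd]) & DEMO_FD_CLONED) {
		epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
		unregistered = 1;
	}
	atomic_store(&demo_state[fd], 0);
	close(fd);
	return unregistered;
}
```

The extra epoll_ctl() call is only paid for FDs known to be shared, so the common close path for ordinary connections stays unchanged.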