this is a new post because the old post about the presumed speed-up from disabling delayed ACK got closed.
if disabling delayed ACKs speeds up an application, it suggests that the application is actually broken and not presenting logically associated data to the transport in a single call.
for example, an application that does:
write app header
write app data
read remote response
is "broken" because the app header and app data should be written to the transport at the same time. written that way, the application has no issue with Nagle, and so no issue with the interaction of Nagle and delayed ACK.
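a minimal sketch of the two patterns over loopback (the function names and the 4-byte length header are mine, purely illustrative): both versions deliver the same bytes; only the number of writes, and hence the exposure to Nagle plus delayed ACK, differs.

```python
import socket
import threading

def echo_server(srv):
    # accept one connection, read a 4-byte length header plus payload,
    # then send a single one-shot response
    conn, _ = srv.accept()
    with conn:
        hdr = conn.recv(4)
        need = int.from_bytes(hdr, "big")
        data = b""
        while len(data) < need:
            data += conn.recv(need - len(data))
        conn.sendall(b"OK:" + data)

def request(payload, coalesced):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    t = threading.Thread(target=echo_server, args=(srv,))
    t.start()
    hdr = len(payload).to_bytes(4, "big")
    with socket.create_connection(srv.getsockname()) as s:
        if coalesced:
            s.sendall(hdr + payload)   # fixed: header and data in one call
        else:
            s.sendall(hdr)             # "broken": header alone...
            s.sendall(payload)         # ...then data in a second call,
                                       # which can stall behind Nagle
                                       # waiting on a delayed ACK
        resp = s.recv(1024)
    t.join()
    srv.close()
    return resp
```

over loopback both complete quickly; across a real network the two-write version is the one that shows the delayed-ACK stall.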
disabling delayed ACK may speed up such a broken application, but it means more ACKs, and in broad handwaving terms it takes just as much CPU to process an ACK as it does a data segment.
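for completeness, on Linux the per-socket knob for this is TCP_QUICKACK; a hedged sketch (the helper name is mine), with the usual caveat that the option is not sticky:

```python
import socket

def enable_quickack(sock):
    # TCP_QUICKACK (Linux-specific) disables delayed ACKs, but only
    # transiently: the kernel may revert to delayed ACKs on its own,
    # so applications typically reapply it after every recv().
    # on platforms without the option this is a no-op.
    if hasattr(socket, "TCP_QUICKACK"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
```

which is exactly the kind of band-aid the rest of this post argues against reaching for.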
so, what _should_ be a two-packet exchange on the network - app request followed by app response - becomes a five- or six-packet exchange:
-> app header
<- immediate ACK
-> app data
<- immediate ACK
<- app response
-> immediate ACK
(assumes delayed ACK is disabled at both ends - if the application is broken in one direction, it is probably broken in the other)
while this six-packet exchange (perhaps even eight packets, if the response is likewise a header write followed by a data write) may take less clock time than the four-packet delayed-ACK version:
-> app header
<- delayed ACK
-> app data
<- remote response
it will take much more CPU, particularly when compared with the properly written version of the application.
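the properly written version either builds one combined buffer or hands the transport a gather write. a sketch using Python's socket.sendmsg (the writev() analogue; the function name is mine):

```python
import socket

def send_request(sock, header, payload):
    # present logically associated buffers to the transport in a
    # single call; socket.sendmsg() is a writev()-style gather send,
    # so no copy into one combined buffer is required
    sent = sock.sendmsg([header, payload])
    # a gather send can still be short; push any remainder the
    # simple way
    combined = header + payload
    if sent < len(combined):
        sock.sendall(combined[sent:])
```

with the header and data in one send, Nagle sees the whole request at once, the delayed-ACK interaction never arises, and neither end needs to turn anything off.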
there is no rest for the wicked yet the virtuous have no pillows