Measurements

A sequence of messages of the same size is sent on a socket, and the time is measured for the whole sequence. Then the message size is doubled and the number of messages in the sequence is halved, to keep the total amount of data fairly constant (approx. 4 MByte). And so on.
All bytes within a message have the same value, and the contents are not checked at reception, since that would add a lot of overhead. The messages are received in passive mode.
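As a sketch of this size/count schedule (in Python rather than the Erlang used in the actual benchmark; the names TOTAL, schedule and flow_test are illustrative, not from the original code): each doubling of the message size halves the message count, so size times count stays at roughly 4 MByte per sequence.

```python
import socket
import time

TOTAL = 4 * 1024 * 1024  # approx. 4 MByte of data per sequence

def schedule(total=TOTAL, start_size=64, max_size=64 * 1024):
    """Yield (message_size, message_count) pairs: the size doubles and the
    count halves, so size * count stays constant at `total`."""
    size = start_size
    while size <= max_size:
        yield size, total // size
        size *= 2

def flow_test(host, port):
    """Time each sequence of equal-sized messages on its own connection."""
    for size, count in schedule():
        payload = b"\x2a" * size  # all bytes of equal value, never checked
        with socket.create_connection((host, port)) as s:
            t0 = time.perf_counter()
            for _ in range(count):
                s.sendall(payload)
            elapsed = time.perf_counter() - t0
        print(f"{size:>6} B x {count:>6}: {count * size / elapsed:.0f} B/s")
```

The start and maximum sizes above are arbitrary; the point is only that the schedule keeps the total data volume constant while the per-message overhead varies.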
The measurements have been made on two Sun Ultra 10 machines running Solaris 7, iluvatar (local) and gorbag (remote), connected by a 10 Mbit/s Ethernet. The Erlang/OTP releases were R6B (patched) and R7A. There is no reason to assume that R7B performs differently from R7A.
The plots have the message size on the X-axis, and either the time or the bandwidth (bytes per second) on the Y-axis.
The test combinations vary in:

* Server node type: local (iluvatar) or remote (gorbag)
* Number of sockets: single or 16 parallel
Results

I will not show all plots in this document, to prevent the reader from falling asleep, but plots for all test combinations do exist in this very directory. It should not be hard for the curious to find and identify them.
Echo tests

The improvements are not astonishing in the echo tests. Most of the time is probably spent on process scheduling, on the server side as well. For remote tests, the network transfer time seems to dominate.
Note that the plots have latency on the Y-axis, so the lower the better.
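Assuming the usual meaning of an echo test (each message is sent and its echo awaited before the next message goes out), the per-message latency could be measured along these lines; host, port and the recv_exact helper are illustrative names, not part of the original benchmark:

```python
import socket
import time

def recv_exact(sock, n):
    """Read exactly n bytes from a blocking socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)

def echo_latency(host, port, size, count):
    """Send `count` messages of `size` bytes, waiting for each echo;
    return the mean round-trip time per message in seconds."""
    payload = b"\x2a" * size  # all bytes of equal value
    with socket.create_connection((host, port)) as s:
        t0 = time.perf_counter()
        for _ in range(count):
            s.sendall(payload)
            recv_exact(s, size)  # block until the full echo is back
        return (time.perf_counter() - t0) / count
```

Because each round trip completes before the next send, this measures latency rather than bandwidth, which is why the echo plots get better as the curve goes lower.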
The new driver is about 25% faster. This is probably because one intermediate process has been removed. The same also applies to binaries.
When the server is on a remote node, this improvement is not noticeable. It probably disappears in the TCP/IP protocol overhead on the host machine.
Here the new driver is about 50% faster. In this case the new system has 16 fewer intermediate processes, so the gain probably comes from reduced process scheduling. For binaries this effect is not as pronounced, but [16 parallel] is faster than [single].
In this case, the scheduling improvements become measurable (about 40%), at least for small binaries. The corresponding test for lists shows the same result.
Flow tests

In the flow tests, the improvements are more noticeable. There are some peculiarities in the remote tests, though, where the new system seems faster only for small messages. Anyone who figures out why will receive honour and gratitude.
Note that the plots have bandwidth on the Y-axis, so the higher the better.
The new driver seems to be 2 to 3 times faster. The corresponding figure for binaries is almost 2 times.
Here the new driver is some 3 times faster for small binaries, but about 25% slower (i.e. 0.75 times as fast) for big ones. We find this strange. The same also applies to lists, also with a break-even at about 200 bytes.
For [flow test, 16 parallel] the results are as in the previous tests. The new driver is almost 3 times faster, but for remote tests of messages larger than 200 bytes it is some 20% slower. Weird. This plot shows [remote, list].
UAB/F/P Raimo Niskanen