[flow-tools] flow-capture reports PDUs out-of-sequence w/Juniper

Mark Fullmer maf@eng.oar.net
Thu, 13 Sep 2001 15:51:08 -0400


On Thu, Sep 13, 2001 at 12:31:49PM -0500, Dave Plonka wrote:
> >   o Turn off compression on flow-capture.
> 
> This may be it.  I use "-z0" on the collector where I haven't seen the
> problem, and WiscNet, which did see the problem, was using the default
> compression.

Compression does have a big impact on the number of flows per second
that can be processed:

% for z in 0 1 2 3 4 5 6 7 8 9; do
    echo "--- Compression level $z"
    flow-gen -n500000 | flow-cat -d1 -z$z >/dev/null
done

--- Compression level 0
flow-cat: processed 500000 flows
  sys:   seconds=0.123 flows/second=4061012.654115
  wall:  seconds=0.201 flows/second=2477811.200698
--- Compression level 1
flow-cat: processed 500000 flows
  sys:   seconds=3.661 flows/second=136560.792905
  wall:  seconds=3.775 flows/second=132427.564109
--- Compression level 2
flow-cat: processed 500000 flows
  sys:   seconds=4.040 flows/second=123756.004641
  wall:  seconds=4.154 flows/second=120340.071415
--- Compression level 3
flow-cat: processed 500000 flows
  sys:   seconds=5.285 flows/second=94598.196996
  wall:  seconds=5.401 flows/second=92569.415954
--- Compression level 4
flow-cat: processed 500000 flows
  sys:   seconds=5.749 flows/second=86963.478817
  wall:  seconds=5.877 flows/second=85063.395197
--- Compression level 5
flow-cat: processed 500000 flows
  sys:   seconds=7.802 flows/second=64079.725431
  wall:  seconds=7.927 flows/second=63068.761979
--- Compression level 6
flow-cat: processed 500000 flows
  sys:   seconds=14.026 flows/second=35647.797479
  wall:  seconds=14.179 flows/second=35262.763657
--- Compression level 7
flow-cat: processed 500000 flows
  sys:   seconds=22.384 flows/second=22336.486754
  wall:  seconds=22.519 flows/second=22202.597100
--- Compression level 8
flow-cat: processed 500000 flows
  sys:   seconds=57.210 flows/second=8739.584011
  wall:  seconds=57.364 flows/second=8716.185840
--- Compression level 9
flow-cat: processed 500000 flows
  sys:   seconds=136.186 flows/second=3671.428826
  wall:  seconds=136.416 flows/second=3665.255580
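
To put numbers on that, here is a small sketch (mine, not part of
flow-tools) that divides the level 0 rate by the rate at each other
level.  The function name slowdown is made up for illustration; the
figures fed to it are the wall flows/second values from the run above.

```shell
# Hypothetical helper: print how many times slower each level is than
# the first line of input (taken as the uncompressed baseline).
# Input: lines of "<level> <flows_per_second>".
slowdown() {
    awk 'NR == 1 { base = $2 }
         NR > 1  { printf "%s %.1f\n", $1, base / $2 }'
}

# Wall-clock rates copied from the benchmark output above.
slowdown <<'EOF'
0 2477811.200698
1 132427.564109
6 35262.763657
9 3665.255580
EOF
```

On these numbers, level 1 is already about 19 times slower than no
compression, and level 9 is several hundred times slower.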

To actually benchmark a collector, you could run
flow-gen -n500000 | flow-send 0/collector/9991
on one machine and
flow-receive -z{compress level} 0/0/9991 >/dev/null
on another.

When the collector starts complaining about lost PDUs,
it's maxed out.  One thing to watch out for is overrunning
the transmit buffers on the sender, since packets dropped there
would also look like lost PDUs to the collector.

> I just used strace(1)/truss(1) to see what value it got.  When using
> flow-tools 0.55 it did get 229376 (as you coded it 224*1024).

224*1024 was larger than anything I had seen work.  If Linux
accepts this without tweaking the kernel, maybe the default
should be higher.
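
A rough way to check this on a given Linux box, assuming the value in
question is the socket receive buffer size requested with SO_RCVBUF:
compare the request against the kernel's ceiling in
/proc/sys/net/core/rmem_max.  The helper name fits_rcvbuf is made up
for this sketch, and the /proc path is Linux-specific.

```shell
# Requested receive buffer: 224 KiB, the figure quoted above.
want=$((224 * 1024))    # 229376 bytes

# Hypothetical helper: succeed if a request of $1 bytes fits under a
# ceiling of $2 bytes.
fits_rcvbuf() {
    [ "$1" -le "$2" ]
}

# Linux only (assumption): the kernel's maximum for SO_RCVBUF requests.
max_file=/proc/sys/net/core/rmem_max
if [ -r "$max_file" ]; then
    max=$(cat "$max_file")
    if fits_rcvbuf "$want" "$max"; then
        echo "$want bytes fits under rmem_max ($max)"
    else
        echo "$want bytes exceeds rmem_max ($max); the kernel would cap it"
    fi
fi
```

On other systems the limit lives elsewhere, so the check is simply
skipped when the file is not readable.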

> > Currently an out of order packet will emit two log messages, both
> > erroneously indicating a lost PDU.  Adding a buffer in front of
> > the sequence number detector could eliminate the messages and
> > provide more accurate stats on lost or out of order PDU's.
> 
> Still not following you exactly, but no matter...  Maybe you mean for
> it to remember recently received sequence numbers so that it can detect
> that they were out of order rather than lost, and log an appropriate
> message?  I didn't want to get into a mess of waiting for the right
> sequence number before writing it to a file, esp. since I switch files
> fairly often every five minutes.

Yes.  No intention to actually fix the ordering before writing
to disk, just better log messages.  If Juniper actually is sending
out lots of out-of-order packets, I'll make it work better, but I'd
rather not.
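
Roughly what I have in mind, as a sketch only and not flow-capture
code: remember recently missed sequence numbers, count a late arrival
as reordered, and only call a gap lost once it has stayed open for a
while.  The function name classify_pdus and the window of 8 PDUs are
made up, and to keep it short the sketch assumes the sequence number
goes up by one per PDU (real NetFlow v5 counts flows, not PDUs).

```shell
# Hypothetical sketch: read one sequence number per line and report how
# many PDUs arrived late (reordered) versus never arrived (lost).
# $1 = how many later PDUs to wait before declaring a gap lost.
classify_pdus() {
    awk -v window="${1:-8}" '
    NR == 1 { expect = $1 + 0 }
    {
        seq = $1 + 0
        if (seq in pending) {
            # A late arrival fills a gap we were still waiting on.
            delete pending[seq]
            reordered++
        } else if (seq >= expect) {
            # Everything skipped over becomes a pending gap, stamped
            # with the arrival count at which the gap was noticed.
            # (A huge jump would make this loop long; fine for a sketch.)
            for (s = expect; s < seq; s++) pending[s] = NR
            expect = seq + 1
        }
        # Anything else is a duplicate or too late to matter; ignore it.

        # Gaps that have waited "window" arrivals are declared lost.
        for (s in pending)
            if (NR - pending[s] >= window) { delete pending[s]; lost++ }
    }
    END {
        # Gaps still open at end of input are counted as lost.
        for (s in pending) lost++
        printf "reordered=%d lost=%d\n", reordered, lost
    }'
}
```

For example, feeding it the sequence 1 2 4 3 5 (one per line) counts
one reordered PDU and none lost, where the current code would log two
lost-PDU messages.  Nothing is held back or re-sorted; the records
still go to disk in arrival order.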

mark