[flow-tools] flow-capture reports PDUs out-of-sequence w/Juniper
Mark Fullmer
maf@eng.oar.net
Wed, 12 Sep 2001 18:45:30 -0400
On Wed, Sep 12, 2001 at 04:09:45PM -0500, Dave Plonka wrote:
> ~800 flow PDUs in one second. Each PDU except the last is usually 1416
> bytes, which works out to about 1MB of data on the wire in 1 second.
> Perhaps the socket receive buffer is being over-run?
Probably. With FreeBSD this is easy to measure: netstat -s will
report UDP datagrams dropped due to full socket buffers.
> I've been examining the flow-capture code again and comparing it with
> cflowd's method of dynamically setting the SO_RCVBUF, since the two
> receivers sometimes show different behavior.
bigsockbuf() was added after 0.53; it attempts to get the
largest socket buffer available. Before that the size was a hardcoded
minimum value that would work on various Solaris, Linux, and FreeBSD
configurations. Unfortunately there's no portable way to query
the limits from the kernel, so bigsockbuf() tries a large value and
keeps retrying, decrementing the request by 512 bytes each time, until
setsockopt() no longer fails.
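
For anyone who wants to experiment with the same idea outside of
flow-tools, here is a rough Python sketch of the probe-and-decrement
approach. It is not the bigsockbuf() source: the function name
big_sock_buf, the 4MB starting point, and the 512 byte floor are
assumptions made for illustration. Only the 512 byte step comes from
the description above.

```python
import socket


def big_sock_buf(sock, start=4 * 1024 * 1024, step=512, floor=512):
    """Ask for a large SO_RCVBUF and back off until the kernel accepts.

    start and floor are illustrative values, not taken from flow-tools.
    Returns the size that was accepted, or None if nothing was.
    """
    size = start
    while size >= floor:
        try:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
            return size
        except OSError:
            # Request was over the kernel limit; try a smaller one.
            size -= step
    return None


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    print("accepted request:", big_sock_buf(s))
    s.close()
```

Note that some kernels (Linux, for one) silently clamp an oversized
request instead of failing it, so the first setsockopt() succeeds and
the accepted request is not necessarily the size actually granted.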
> This is quite weird, esp. since I'm seeing exactly the opposite when
> collecting flows from just one Juniper. When I use flow-capture I get
> more data than when I use cflowd.
A few ideas:
o Turn off compression on flow-capture.
o Try using multiple copies of flow-capture; there may be
issues with many bursty sources trying to use one socket
buffer.
o Run flow-capture at a high priority. Under FreeBSD they
have something called rtprio (pseudo realtime process
scheduling). This helped a lot on older Pentium 166
collectors when we were running reports on the same
box as flow-capture.
o Instrument bigsockbuf() to log the value it acquires. It
may be that the other processes running on the machine
that use UDP are using up the buffer space.
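
The last idea amounts to reading the buffer size back after setting
it. A small Python sketch of that kind of instrumentation follows;
report_rcvbuf is a hypothetical helper, not something in flow-tools,
and the log format is made up.

```python
import logging
import socket

log = logging.getLogger("rcvbuf")


def report_rcvbuf(sock, requested):
    """Log and return the receive buffer size the kernel reports."""
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    log.info("SO_RCVBUF requested=%d granted=%d", requested, granted)
    return granted


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    want = 65536
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, want)
    report_rcvbuf(s, want)
    s.close()
```

The granted value need not equal the request. Linux, for example,
reports twice the requested size to account for its own bookkeeping,
so compare values from the same OS only.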
> > Dealing with
> > out of order exports isn't that much work to fix...
>
> Not quite sure what you mean. Isn't flow-capture already doing the
> Right Thing(tm) with respect to out of order PDUs? From my perusal of
> the code it seems to collect and record the flows regardless of the
> sequence number, right? From my reading, it appears to just decide
> whether or not to _count_ them as out-of-sequence based on the
> magnitude of the delta value betw. the expected and received sequence
> number. Did I miss something?
Currently an out of order packet will emit two log messages, both
erroneously indicating a lost PDU. Adding a buffer in front of
the sequence number detector could eliminate the messages and
provide more accurate stats on lost or out of order PDUs.
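
To make the buffering idea concrete, here is a minimal Python sketch
of a small reorder window sitting in front of the sequence check. It
is not code from flow-capture. The class name SeqTracker and the depth
parameter are invented for the example, and it assumes the sequence
number advances by the number of flows in each PDU (as in NetFlow v5).
32-bit wraparound is ignored to keep it short.

```python
class SeqTracker:
    """Hold early PDUs briefly so reordering is not counted as loss."""

    def __init__(self, depth=4):
        self.depth = depth      # max PDUs held while waiting for a gap
        self.expected = None    # next sequence number we expect
        self.pending = {}       # seq -> flow count of early PDUs
        self.lost = 0           # flows declared lost
        self.reordered = 0      # PDUs that arrived late but in window

    def _drain(self):
        # Release buffered PDUs that are now in sequence.
        while self.expected in self.pending:
            self.expected += self.pending.pop(self.expected)

    def receive(self, seq, count):
        if self.expected is None:
            self.expected = seq
        if seq < self.expected:
            # Duplicate, or later than the window allows; not counted.
            return
        if seq == self.expected:
            if self.pending:
                # Later PDUs got here first, so this one was reordered.
                self.reordered += 1
            self.expected += count
        else:
            self.pending[seq] = count
            if len(self.pending) > self.depth:
                # Window is full: give up on the gap and call it lost.
                lowest = min(self.pending)
                self.lost += lowest - self.expected
                self.expected = lowest
        self._drain()
```

With this in place, a PDU that shows up one or two packets late bumps
the reordered counter once instead of producing two lost-PDU messages.
A gap is only reported as lost after more than depth later PDUs have
arrived without it being filled.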
mark