Discussion:
[cricket-users] Getting frequent "NaN" gaps in graphs, yet data is being retrieved ok
Andy Dills
2011-03-31 16:43:09 UTC
I've been using cricket for around a decade.

In the last few years, we've grown to the gigabit ethernet level. Since
then, we've been getting strange gaps in our graphs, with NaN listed as
the current value if you look at the graph when the gap is occurring. It
doesn't seem to be based on a traffic level threshold...I'll see valid
data displayed at times that are higher usage than the traffic rates when
the data is NaN.

Whenever I run a manual collector, the data is being retrieved fine. I
can't figure out why the data isn't being displayed properly, and I'm not
able to think of a way to troubleshoot this further. I've tried specifying
the speeds of the interfaces, and I've even tried doing a fresh install on
a much more modern server (as the server cricket has been on since 2004 is
still chugging along, and producing graphs a bit more slowly than it
should).

Any suggestions?

Thanks,
Andy

---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---
Clarke Morledge
2011-03-31 18:13:16 UTC
Andy,

Most probably, you are using 32-bit counters in your SNMP sources, and the
counters are wrapping around. When that happens, you'll get negative
results in your deltas, which is what gives you the NaN values in your
graphs.

I started running into this a few years ago once we started to put some
load on our gigabit links. Moving to using 64-bit counters (where
possible) will fix this. Just remember that you'll probably have to
graduate from SNMPv1 to SNMPv2c or 3 to get the 64-bit support in your
MIBs.
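
For a rough sense of scale, here is a minimal Python sketch of the arithmetic behind the wrap problem. It assumes a sustained 1 Gbit/s line rate and a 5-minute polling interval (both are assumptions, not figures from this thread), and `counter_delta` is a hypothetical helper showing what a poller that corrects for a single wrap would compute.

```python
POLL_INTERVAL_S = 300      # assumed 5-minute polling interval
LINK_BPS = 1_000_000_000   # assumed sustained gigabit line rate


def seconds_to_wrap(counter_bits: int, link_bps: float) -> float:
    """Seconds until an octet counter of the given width wraps at line rate."""
    return (2 ** counter_bits) * 8 / link_bps


def counter_delta(prev: int, curr: int, counter_bits: int = 32) -> int:
    """Octets between two samples, assuming the counter wrapped at most once."""
    return (curr - prev) % (2 ** counter_bits)


wrap32 = seconds_to_wrap(32, LINK_BPS)
wraps_per_poll = POLL_INTERVAL_S / wrap32
print(f"32-bit counter wraps every {wrap32:.1f} s at line rate "
      f"({wraps_per_poll:.1f} times per poll)")

prev = 1_000_000_000

# One wrap: the raw difference is negative, but modular arithmetic recovers it.
one_wrap = (prev + 4_000_000_000) % 2 ** 32
print(one_wrap - prev, counter_delta(prev, one_wrap))

# Two wraps: the corrected delta looks plausible but is far too small.
two_wraps = (prev + 9_000_000_000) % 2 ** 32
print(counter_delta(prev, two_wraps))
```

At the same line rate a 64-bit counter would take thousands of years to wrap, so the ambiguity in the last case cannot arise in practice.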

Clarke Morledge
College of William and Mary
Information Technology - Network Engineering
Jones Hall (Room 18)
Williamsburg VA 23187
Andy Dills
2011-03-31 19:24:14 UTC
Thanks, and thank you to the people who replied off list as well.

The solution was as simple as adding "snmp-version = 3" to the interfaces
file for the routers and switches that have gigabit interfaces.

Thanks,
Andy
---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---
Andy Dills
2011-04-01 15:24:04 UTC
Post by Andy Dills
Thanks, and thank you to the people who replied off list as well.
The solution was as simple as adding "snmp-version = 3" to the interfaces
file for the routers and switches that have gigabit interfaces.
...or I just didn't see the problem for a few hours and was completely
mistaken :)

Turns out you can change the version to 3, and it will silently just keep
using 1. I spent a bit of time trying to get version 3 to work (easy on
the router end, difficult on the cricket end...router debug says I got the
username sending with the alternate snmpUtil.pm, but was getting a strange
"Unknown Report" when running the collector), and so I decided to just use
snmp 2c. But that still didn't fix it...because you need to query
different counters. They don't just magically turn into 64 bit counters
(duh).

So, here's the actual solution.

In the interfaces files for routers/switches with gigabit interfaces, add:

snmp-version = 2c

dataSource ifInOctets
    ds-source = snmp://%snmp%/ifHCInOctets.%inst%

dataSource ifOutOctets
    ds-source = snmp://%snmp%/ifHCOutOctets.%inst%
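
To confirm that a device actually answers for the high-capacity counter over SNMPv2c, something like the following can help. This is a minimal Python sketch that only builds the command line, assuming the net-snmp `snmpget` tool is installed; the host name, community string, and interface index are placeholders, and `hc_in_octets_cmd` is a hypothetical helper.

```python
IF_HC_IN_OCTETS = "1.3.6.1.2.1.31.1.1.1.6"  # ifHCInOctets (IF-MIB)


def hc_in_octets_cmd(host: str, community: str, if_index: int) -> list[str]:
    """Build an snmpget command that asks for ifHCInOctets over SNMPv2c."""
    return ["snmpget", "-v", "2c", "-c", community, host,
            f"{IF_HC_IN_OCTETS}.{if_index}"]


# Placeholder values; substitute a real router, community, and ifIndex.
cmd = hc_in_octets_cmd("router.example.net", "public", 1)
print(" ".join(cmd))
```

The printed command can be pasted into a shell or passed to `subprocess.run(cmd)`. A device that supports the counter should reply with a Counter64 value; Counter64 does not exist in SNMPv1, so the same query with `-v 1` is expected to fail.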

Hope that helps somebody in the future.

Thanks,
Andy

---
Andy Dills
Xecunet, Inc.
www.xecu.net
301-682-9972
---
Francois Mikus
2011-06-15 01:29:05 UTC
Hello,

You can also try out genDevConfig just to see what kind of target files
it will create, as a starting point to create new defaults. Or just
peruse the code to see what logic is used to determine what counters to use.

Glad to hear you got things fixed up.

Cheers

Francois Mikus
------------------------------------------------------------------------------
Create and publish websites with WebMatrix
Use the most popular FREE web apps or write code yourself;
WebMatrix provides all the features you need to develop and
publish your website. http://p.sf.net/sfu/ms-webmatrix-sf
_______________________________________________
cricket-users mailing list
https://lists.sourceforge.net/lists/listinfo/cricket-users
Stephen Carville
2011-03-31 17:59:11 UTC
Try using 64-bit counters. I added the following to the appropriate parts of the
config tree to get statistics from 64-bit counters:

OID ifHCInOctets 1.3.6.1.2.1.31.1.1.1.6
OID ifHCInUcastPkts 1.3.6.1.2.1.31.1.1.1.7
OID ifHCOutOctets 1.3.6.1.2.1.31.1.1.1.10
OID ifHCOutUcastPkts 1.3.6.1.2.1.31.1.1.1.11

dataSource ifHCInOctets
    ds-source = snmp://%snmp%/ifHCInOctets.%inst%

dataSource ifHCOutOctets
    ds-source = snmp://%snmp%/ifHCOutOctets.%inst%

graph ifHCInOctets
    color = dark-green
    draw-as = AREA
    legend = "Average bits in"
    y-axis = "bits per second"
    units = "bits/sec"
    scale = 8,*
    bytes = true

graph ifHCOutOctets
    color = blue
    legend = "Average bits out"
    y-axis = "bits per second"
    units = "bits/sec"
    scale = 8,*
    bytes = true

targetType switch-port64
    ds = "ifHCInOctets, ifHCOutOctets"
    view = "HC_Octets"

view HC_Octets
    elements = "ifHCInOctets,ifHCOutOctets"
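
The `scale = 8,*` lines are what turn the octet rate into bits per second: the expression is RPN, so the sampled value is followed by 8 and then multiplied. A minimal Python sketch of that evaluation, assuming only numeric tokens and the four basic operators (`apply_scale` is a hypothetical helper, not part of Cricket):

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def apply_scale(value: float, scale: str) -> float:
    """Apply an RPN scale expression such as "8,*" to a sampled value."""
    stack = [value]
    for token in scale.split(","):
        token = token.strip()
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


octets_per_sec = 125_000_000.0  # a saturated gigabit link, in octets/sec
print(apply_scale(octets_per_sec, "8,*"))  # the same rate in bits/sec
```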
Rodney McDuff
2011-06-15 03:37:32 UTC
http://www.its.uq.edu.au/public/cricket/rrdoutlier.pl


RRDOUTLIER(1)      User Contributed Perl Documentation      RRDOUTLIER(1)

NAME
rrdoutliers.pl: A program to remove outliers from a RRD file

USAGE
rrdoutliers.pl -in <infile> [-out <outfile>]
[-alpha <alpha>] [-passes <n>] [-printdata]
[-rra <rra:ds[[:attr=val]...]>]
[-rra <rra:ds[[:attr=val]...]>]

DESCRIPTION
This program removes outliers from an RRD file, using Grubbs' test to
detect them and then removing or modifying them. Grubbs' test calculates G,
the maximum absolute deviation between a data point and the mean of the
data set, normalized by the standard deviation of the data set. If this
value is greater than a critical value of G, the point is considered an
outlier and removed or modified. This process is repeated until no
outliers remain or some arbitrary limit is reached. The critical value
of G is calculated using the Student's t distribution and a specified
significance level.

The options are:

-in <infile>
Input file to read. If the file extension is .rrd, rrdtool dump is
used to convert it on the fly to an XML file. Otherwise the input
file is interpreted as an XML file using XML::DOM.

-out <outfile>
Output file to write. If the file extension is .rrd, rrdtool
restore is used to convert the modified XML file to an RRD file.
Otherwise the output file is written as an XML file.

-alpha <alpha>
Default significance level to use for all RRAs and DSs in the input
file. This is the probability that you encounter a data point so
far from the other data points by pure chance alone. This parameter
can be overridden for various combinations of RRAs and DSs using the
-rra switch. Default is 0.05, which equates to a 5% chance.

-passes <n>
Default number of passes to use for all RRAs and DSs in the input
file. Ideally one would reiterate the Grubbs test, removing
outliers, until there were none left. This parameter sets a ceiling
on the maximum number of iterations. This parameter can be
overridden for various combinations of RRAs and DSs using the -rra
switch. Default is 5.

-printdata
Prints data for the selected combination of RRA and DS. Useful to see
the data before removing an outlier.

-rra <rra:ds[[:attr=val]...]>
Selects which RRAs and DSs to operate on. rra is a comma-separated
list containing either an integer, a range of integers (e.g. 1-5) or
the keyword all, which specifies all RRAs within the input file. ds
is a comma-separated list containing either an integer, a range of
integers (e.g. 1-5) or the keyword all, which specifies all the DSs
within a specified RRA within the input file. If there are no
selected RRAs and DSs, then all RRAs and DSs are targeted. (This is
equivalent to -rra all:all.) The numbering of rra and ds starts from
zero.

attr=val can be used to override a default parameter, like the
significance level alpha or the maximum number of iterations n, for
a particular RRA and DS combination. If value=val is specified,
detected outliers will be replaced with val rather than NaN. If
gcrit=val is specified, then the critical G value is replaced by this
value rather than being calculated from the significance level
alpha and the Student's t distribution.

EXAMPLES
rrdoutlier.pl -rra 1:1,3-5 -in infile.rrd

Inspect the outliers found by the Grubbs test for the 2nd, 4th, 5th and
6th DS in the second RRA within the file infile.rrd.

rrdoutlier.pl -rra 0:11 -passes 5 -in infile.rrd -out outfile.rrd
OUTLIER: Starting PASS 1 of rra0
OUTLIER: rra0 ds11: time= val=1.4021004120e+07 n=600 mean=23846.486280 sd=572027.450034 gmax=24.469381 gcrit=3.910910
OUTLIER: Starting PASS 2 of rra0
OUTLIER: rra0 ds11: time= val=2.8614430940e+05 n=599 mean=478.944320 sd=11681.725528 gmax=24.454038 gcrit=3.910442
OUTLIER: Starting PASS 3 of rra0
OUTLIER: rra0 ds11: Found 2 outliers. No more to be found

Remove a maximum of 5 outliers in the 12th DS of the first RRA in
infile.rrd and write the modification back to outfile.rrd. In this data
set, however, there are only 2 outliers.

rrdoutlier.pl -rra 0:11 -rra 0:16:passes=10:values=0.0 -in infile.rrd -out outfile.rrd

Remove a maximum of 5 outliers in the 12th DS of the first RRA and 10
from the 17th. Replace these latter outliers with the value zero.

AUTHOR
Rodney McDuff (***@its.uq.edu.au). License is GPL

SEE ALSO
perl(1), XML::DOM, Statistics::Distributions, Date::Parse,
Date::Format, File::stat, File::Copy

Grubbs, Frank (February 1969), Procedures for Detecting Outlying
Observations in Samples, Technometrics, Vol. 11, No. 1, pp. 1-21.



perl v5.10.0                 2003-06-18                 RRDOUTLIER(1)
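
The iteration described in the man page above can be sketched in a few lines of Python. This is a simplified illustration and not the script's actual code: it takes the critical value as a fixed argument (similar in spirit to the gcrit=val override) instead of deriving it from alpha and the t distribution on each pass, and the sample data and threshold are made up for the example.

```python
import math
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the point farthest from the mean, in sample std devs."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return 0.0, None
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / sd, idx


def remove_outliers(values, gcrit, passes=5):
    """Replace at most `passes` outliers with NaN, one per pass."""
    cleaned = list(values)
    removed = []
    for _ in range(passes):
        # Only points not already marked NaN take part in the test.
        live = [(i, v) for i, v in enumerate(cleaned) if not math.isnan(v)]
        if len(live) < 3:
            break
        g, pos = grubbs_statistic([v for _, v in live])
        if pos is None or g <= gcrit:
            break  # no remaining point exceeds the critical value
        original_index = live[pos][0]
        removed.append(original_index)
        cleaned[original_index] = float("nan")
    return cleaned, removed


# Made-up samples with one obvious spike; gcrit is an illustrative threshold.
samples = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5, 10.0, 9.5, 11.0, 10.5, 5000.0]
cleaned, removed = remove_outliers(samples, gcrit=2.355)
print(removed)
```

Only the spike is flagged; on the second pass the largest remaining deviation falls below the threshold and the loop stops.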
--
Dr. Rodney G. McDuff |Ex ignorantia ad sapientiam
Manager, Strategic Technologies Group| Ex luce ad tenebras
Information Technology Services |
The University of Queensland |
EMAIL: ***@its.uq.edu.au |
TELEPHONE: +61 7 334 66898 |