[rrd-users] rrdtool memory usage and the Raspberry Pi

Jared Henley

2015-10-16 04:39:37 UTC

Hi,

I've been working on a logging application that uses rrdtool. It works
brilliantly on my PC, but not so well on the Raspberry Pi model B+. But
before I go there, there's something interesting I noticed while working
on the PC.

I'm creating a database with the following command (in rrdpython):

rrdtool.create(filepath,
'--step', '1s',
'DS:progreset:GAUGE:1000s:0:1',
'RRA:MAX:0.1:5s:5000m',

<snip - there are 29 definitions all up, all the same>

This creates a 386MB file.

I didn't think too hard about it, until I did a graph to a CSV file. Of
course the CSV would be less space-efficient than the binary rrd file,
right? But the generated CSV turns out to be only 15MB?

rrdtool seems to use 64-bit values for everything, so I figure the rrd
file above should use:
8 bytes * (29 RRA entries + 1 timestamp) * 60,000 locations in the
round-robin (5000 minutes / 5 seconds) = 14.4MB. I'm confused. Why is
the rrd file 30 times bigger than I'd expect? I did experiment with a
step size of 5 seconds, but the created file was the same size. But I
can live with largish files.
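
The estimate above can be written out as a short Python sketch. It only reproduces the poster's back-of-the-envelope arithmetic; the per-row timestamp is the poster's assumption, not a statement about rrdtool's real file layout, and the constant names are made up for illustration.

```python
# Back-of-the-envelope estimate from the post (not rrdtool's actual layout).
BYTES_PER_VALUE = 8        # one 64-bit value
N_RRAS = 29                # DS/RRA pairs in the create command
RETENTION_S = 5000 * 60    # 5000m of history, in seconds
ROW_INTERVAL_S = 5         # one consolidated row every 5s

rows = RETENTION_S // ROW_INTERVAL_S

# "+ 1" is the assumed per-row timestamp from the estimate.
naive_bytes = BYTES_PER_VALUE * (N_RRAS + 1) * rows

print(rows)                # 60000
print(naive_bytes / 1e6)   # 14.4
```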

However, creating the databases on the Raspberry Pi was a dismal
failure. Memory usage climbed until the process crashed. I assume the
out-of-memory process killer did it. Since the Raspberry Pi only has
about 384MB of memory free after booting up, it is fairly memory
constrained. Is rrdtool creating the entire file in memory before
writing it out to disk?

So I copied some rrd files from the PC to the Raspberry Pi. I wasn't
surprised that they didn't work - presumably an endianness issue.
However, the rrd graph generation code was able to load up and complain
about badness in the file, so there's at least enough RAM to load up
rrdtool.

I wonder if I am going about this the wrong way? In the last hour I've
seen references to people having hundreds of rrd files. Is it
recommended to split your data up into lots of small chunks? Are there
other recommendations about how to use rrdtool in the most
memory-efficient manner?

Many thanks,
Jared

Tobias Oetiker

2015-10-16 06:19:40 UTC

Hi Jared,

When I create this database on my Intel box it looks like this:

oetiker>./rrdtool create /tmp/demo.rrd --step 1s DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m
oetiker>ls -l /tmp/demo.rrd
-rw-r--r-- 1 oetiker oep 480584 Oct 16 08:17 /tmp/demo.rrd

So something looks rather odd here ... are you using 1.5.4 on your
raspy?

cheers
tobi

--
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch ***@oetiker.ch +41 62 775 9902

Jared Henley

2015-10-18 22:51:30 UTC

Hi Tobi,

I can't actually create the database on the rpi - the process gets
killed. Results mentioned are on my AMD 64-bit box, as are all tests below.

I just ran your test:

[08:32:39 ***@hope] $ rrdtool create /tmp/demo.rrd --step 1s
DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m
0 ~
[08:32:49 ***@hope] $ ls -l /tmp/demo.rrd
-rw-rw-r-- 1 pe pe 480584 Oct 19 08:32 /tmp/demo.rrd

Results are identical and obviously what I would expect.

I'm having another issue with rrdpython (but my mail to Hye-Shik Chang
is bouncing), so I just tried my database creation directly in rrdtool:

--step 1s \
DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:tpm:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m \
DS:voltcontrol:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m \
DS:freqcontrol:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m \
DS:fault:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:uff:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:off:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:uvf:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:ovf:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:svopen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:svclose:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:bvopen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:bvclose:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:syncen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:busconnen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:wmfull:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:wmmed:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:wmlow:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:vred:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:exciter:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:ired:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:iwhite:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:iblue:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:f:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:isp:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:dlred:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:dlwhite:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:dlblue:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:svpos:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m

0 ~
[08:45:55 ***@hope] $ ls -l test.rrd
-rw-rw-r-- 1 pe pe 403757864 Oct 19 08:45 test.rrd

Obviously something funny is going on between the one datasource/RRA
test and this one. With two pairs:

--step 1s \
DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:tpm:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m

0 ~
[08:47:52 ***@hope] $ ls -l test.rrd
-rw-rw-r-- 1 pe pe 1921184 Oct 19 08:47 test.rrd

This file also seems too large. 1921184/480584 = 3.997..., so it naively
looks like the file size follows a parabolic (quadratic) function of the
number of DS/RRA pairs rather than a linear one. I then created the
following two files with almost no data, so I could look at them in a hex
editor easily.

--step 1s \
DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:30s

0 ~
[08:56:01 ***@hope] $ ls -l test.rrd
-rw-rw-r-- 1 pe pe 632 Oct 19 08:56 test.rrd
0 ~

--step 1s \
DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:30s \
DS:tpm:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:30s

0 ~
[08:58:23 ***@hope] $ ls -l test2.rrd
-rw-rw-r-- 1 pe pe 1376 Oct 19 08:58 test2.rrd

I really don't understand the structure of an rrd file, but there's a
section at the end populated with "00 00 00 00 00 00 F8 FF" multiple
times over. I assume this is where the RRA data is stored, and that the
first section is the definitions and most-recently-updated values.
Working on that understanding, the RRA data section of test.rrd is:

0000:0240 | 00 00 00 00 00 00 F8 FF | ......øÿ
0000:0250 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0260 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0270 | 00 00 00 00 00 00 F8 FF | ......øÿ

(as expected, 6 64-bit values)

And the RRA data section of test2.rrd is:

0000:04A0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:04B0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:04C0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:04D0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:04E0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:04F0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0500 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0510 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0520 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0530 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0540 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:0550 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ

(I would expect 12 64-bit values, there are 24 here.)

I did some updates to these files. test.rrd made some sense to me.
Updates were 0.1, 0.2, 0.3, 0.4, 0.5 in 5 second intervals.

0000:0240 | 9A 99 99 99 99 99 D9 3F | ......Ù?
0000:0250 | 00 00 00 00 00 00 E0 3F 33 33 33 33 33 33 E3 3F | ......à?333333ã?
0000:0260 | 9A 99 99 99 99 99 B9 3F 9A 99 99 99 99 99 C9 3F | ......¹?......É?
0000:0270 | 33 33 33 33 33 33 D3 3F | 333333Ó?

and again, with updates 0.7, 0.7, 0.8, 0.9, 1:

0000:0240 | CD CC CC CC CC CC EC 3F | ÍÌÌÌÌÌì?
0000:0250 | 00 00 00 00 00 00 F0 3F 33 33 33 33 33 33 E3 3F | ......ð?333333ã?
0000:0260 | 66 66 66 66 66 66 E6 3F 66 66 66 66 66 66 E6 3F | ffffffæ?ffffffæ?
0000:0270 | 9A 99 99 99 99 99 E9 3F | ......é?

Updates to test2.rrd were 0.1:0.2, 0.3:0.4, 0.5:0.6, 0.7:0.9, 0.9:1 in 5
second intervals.

0000:04A0 | 66 66 66 66 66 66 E6 3F CD CC CC CC CC CC EC 3F | ffffffæ?ÍÌÌÌÌÌì?
0000:04B0 | CD CC CC CC CC CC EC 3F 00 00 00 00 00 00 F0 3F | ÍÌÌÌÌÌì?......ð?
0000:04C0 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ
0000:04D0 | 9A 99 99 99 99 99 B9 3F 9A 99 99 99 99 99 C9 3F | ......¹?......É?
0000:04E0 | 33 33 33 33 33 33 D3 3F 9A 99 99 99 99 99 D9 3F | 333333Ó?......Ù?
0000:04F0 | 00 00 00 00 00 00 E0 3F 33 33 33 33 33 33 E3 3F | ......à?333333ã?
0000:0500 | 9A 99 99 99 99 99 B9 3F 9A 99 99 99 99 99 C9 3F | ......¹?......É?
0000:0510 | 33 33 33 33 33 33 D3 3F 9A 99 99 99 99 99 D9 3F | 333333Ó?......Ù?
0000:0520 | 00 00 00 00 00 00 E0 3F 33 33 33 33 33 33 E3 3F | ......à?333333ã?
0000:0530 | 66 66 66 66 66 66 E6 3F CD CC CC CC CC CC EC 3F | ffffffæ?ÍÌÌÌÌÌì?
0000:0540 | CD CC CC CC CC CC EC 3F 00 00 00 00 00 00 F0 3F | ÍÌÌÌÌÌì?......ð?
0000:0550 | 00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF | ......øÿ......øÿ

Using the info from test.rrd's hex dump, it looks like the values in
test2.rrd are:

0.7 0.9
0.9 1
UNKNOWN UNKNOWN
0.1 0.2
0.3 0.4
0.5 0.6
0.1 0.2
0.3 0.4
0.5 0.6
0.7 0.9
0.9 1
UNKNOWN UNKNOWN
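
One way to check a decoding like this mechanically is to read each 8-byte group as a little-endian IEEE 754 double. That interpretation is an assumption based on the byte patterns in the dumps, not something taken from rrdtool's documentation; under it, `00 00 00 00 00 00 F8 FF` is a NaN, which would be consistent with UNKNOWN. The helper name `decode_row` is hypothetical. A minimal sketch using the first three rows of the test2.rrd dump:

```python
import math
import struct

def decode_row(hex_bytes):
    """Decode space-separated hex bytes as little-endian IEEE 754 doubles."""
    raw = bytes.fromhex(hex_bytes)
    return [v[0] for v in struct.iter_unpack("<d", raw)]

# First three rows of the test2.rrd dump after the updates.
rows = [
    "66 66 66 66 66 66 E6 3F CD CC CC CC CC CC EC 3F",  # 0000:04A0
    "CD CC CC CC CC CC EC 3F 00 00 00 00 00 00 F0 3F",  # 0000:04B0
    "00 00 00 00 00 00 F8 FF 00 00 00 00 00 00 F8 FF",  # 0000:04C0
]

for row in rows:
    # NaN is shown as UNKNOWN, matching the notation used in the post.
    print(["UNKNOWN" if math.isnan(v) else v for v in decode_row(row)])
```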

It seems like the file test2.rrd contains two sets of RRA data like this:

first-half data
second-half data
second-half data
first-half data

Jared.

Rafal Gwizdala

2015-10-19 06:28:40 UTC

Maybe the file size growth is expected to be parabolic - I think every RRA
is 'global' - it covers all the data sources. So when you add a DS and
RRA pair you're actually adding N RRAs, where N is the number of data sources.
At least this is how I understood the concept of RRAs.
Best regards
Rafal

_______________________________________________
rrd-users mailing list
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users

Tobias Oetiker

2015-10-19 06:33:22 UTC

Hi Jared,

Are you using 1.5.4?

cheers
tobi

Simon Hobson

2015-10-19 07:14:56 UTC

Post by Rafal Gwizdala
Maybe the file size growth is expected to be parabolic - I think every RRA is 'global' - it covers all the data sources. So when you add a DS and RRA pair you're actually adding N RRAs, where N is the number of data sources. At least this is how I understood the concept of RRAs.

Yes, that is correct - each RRA applies to every DS in the file. File size will scale (roughly) as D*R, where D is the number of data sources and R is the number of RRAs.
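
The D*R scaling can be checked against the file sizes reported earlier in the thread. The sketch below assumes 8 bytes per stored value (as seen in the hex dumps) and treats whatever is left over as header; the function name `rra_data_bytes` is made up for illustration, and the header sizes are simply what the subtraction yields, not figures from the rrdtool documentation.

```python
BYTES_PER_VALUE = 8  # each stored value is 8 bytes in the hex dumps

def rra_data_bytes(n_ds, n_rra, rows):
    """Bytes of RRA data if every RRA holds `rows` rows for every DS."""
    return n_ds * n_rra * rows * BYTES_PER_VALUE

# (data sources, RRAs, rows per RRA, file size reported by ls -l)
observed = [
    (1, 1, 6, 632),               # RRA:...:5s:30s   -> 6 rows
    (2, 2, 6, 1376),
    (1, 1, 60000, 480584),        # RRA:...:5s:5000m -> 60000 rows
    (2, 2, 60000, 1921184),
    (29, 29, 60000, 403757864),
]

for n_ds, n_rra, rows, size in observed:
    data = rra_data_bytes(n_ds, n_rra, rows)
    # Last column: bytes not explained by RRA data (presumably the header).
    print(n_ds, n_rra, rows, data, size - data)
```

For a given number of DSs and RRAs the leftover is the same whether the RRAs hold 6 rows or 60000, which fits a fixed-size header followed by D*R*rows values.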

Post by Jared Henley
--step 1s \
DS:progreset:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:tpm:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m \
DS:voltcontrol:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m \

Duplicate RRA

Post by Jared Henley
DS:freqcontrol:GAUGE:1000s:0:1 RRA:LAST:0.1:5s:5000m \

Another duplicate

Post by Jared Henley
DS:fault:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:uff:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:off:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:uvf:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \
DS:ovf:GAUGE:1000s:0:1 RRA:MAX:0.1:5s:5000m \

5 more duplicates

Post by Jared Henley
DS:svopen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:svclose:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:bvopen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:bvclose:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:syncen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:busconnen:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:wmfull:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:wmmed:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:wmlow:GAUGE:1000s:0:1 RRA:AVERAGE:0.1:5s:5000m \
DS:vred:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:exciter:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:ired:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:iwhite:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:iblue:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:f:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:isp:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:dlred:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:dlwhite:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:dlblue:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m \
DS:svpos:GAUGE:1000s:0:32000 RRA:AVERAGE:0.1:5s:5000m

Another 19 duplicates

As a matter of style, I always put my RRAs at the end - so I have some "header" stuff, then a list of DSs, then a list of RRAs. It makes the command much easier to read.
As an aside, does it matter what order they are declared in? I.e., if an RRA is declared before all the DSs, does it still apply to all DSs? I would hope so.
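
A minimal sketch of that layout for the rrdpython case: build the argument list as options first, then DSs, then RRAs, with repeated RRA definitions dropped. The helper `build_create_args` is hypothetical, only three of the 29 data sources are shown, and the sketch only builds the list; it does not call rrdtool.

```python
def build_create_args(step, ds_defs, rra_defs):
    """Return create arguments: options, then DSs, then unique RRAs."""
    # dict.fromkeys drops repeats while keeping first-seen order.
    unique_rras = list(dict.fromkeys(rra_defs))
    return ["--step", step, *ds_defs, *unique_rras]

ds_defs = [
    "DS:progreset:GAUGE:1000s:0:1",
    "DS:tpm:GAUGE:1000s:0:1",
    "DS:voltcontrol:GAUGE:1000s:0:1",
]
rra_defs = [
    "RRA:MAX:0.1:5s:5000m",
    "RRA:LAST:0.1:5s:5000m",
    "RRA:LAST:0.1:5s:5000m",  # duplicate, dropped by build_create_args
]

args = build_create_args("1s", ds_defs, rra_defs)
print(args)

# With the Python binding this list could then be passed on as
# rrdtool.create(filepath, *args); that call is not executed here.
```

Since each RRA applies to every DS, the resulting file would hold MAX and LAST data for all three data sources, not one consolidation function per data source.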