Discrepancy between actual RAM usage by Octave and "whos" output

PhilipNienhuis
Hi,

(Just curious)
During a recent debug session I noticed that Octave's RAM usage is much bigger
than suggested by the "whos" command.
I used a 10,000,000x10 heterogeneous cell array [*] to experiment. That is big
but not extreme; some GIS files I sometimes use also lead to several GB of RAM
usage by Octave.

On 64-bit Windows 7, with Octave-5.0.0 built with 64-bit (Fortran) indexing,
after reading the array from file, the whos command says:

>> whos
Variables in the current scope:
   Attr Name          Size                     Bytes  Class
   ==== ====          ====                     =====  =====
        aa     10000000x10                 649999985  cell
        ans           1x1                          8  double
Total is 100000001 elements using 649999993 bytes

While reading the data, Octave's actual RAM usage went from a mere 185 MB to
over 5.8 GB, i.e., almost *nine* times the number of bytes suggested by
whos. Such a multiplier is unexpectedly high to me; I wouldn't have been
surprised by a factor of 2 or 3 or so.
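
To put the figures above side by side, here is a small arithmetic sketch. It assumes decimal units (1 MB = 1e6 bytes, 1 GB = 1e9 bytes) and treats the rounded RAM readings as exact, so the results are rough estimates only; the variable names are made up for illustration.

```octave
% Figures quoted above; the RAM readings are approximate
whos_bytes = 649999985;        % "Bytes" reported by whos for aa
n_cells    = 10000000 * 10;    % number of cells in aa
ram_before = 185e6;            % RAM use before reading the file (assumed decimal MB)
ram_after  = 5.8e9;            % RAM use after reading the file (assumed decimal GB)

ratio            = ram_after / whos_bytes;               % RAM use relative to whos
payload_per_cell = whos_bytes / n_cells;                 % bytes per cell counted by whos
actual_per_cell  = (ram_after - ram_before) / n_cells;   % bytes per cell actually consumed

printf ("RAM / whos ratio:        %.1f\n", ratio);
printf ("whos bytes per cell:     %.1f\n", payload_per_cell);
printf ("observed bytes per cell: %.1f\n", actual_per_cell);
```

On these numbers whos counts about 6.5 bytes per cell, while the process grew by roughly 56 bytes per cell.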

With 64-bit Octave-4.4.1 (no 64-bit Fortran indexing), whos reports the same
numbers, but Octave's RAM usage is even a little larger, around 6 GB. On Linux
(Mageia 6), memory consumption by Octave goes up to 6.2 GB.

Q:
What is the reason that actual RAM usage is so much larger than suggested by
"whos"? Where does the overhead come from?

A corollary is that esp. for unwary users, "whos" actually gives deceiving
results.

Thanks,

Philip

[*] code to create it is in bug #53899, "make_big_aa.m",
https://savannah.gnu.org/bugs/download.php?file_id=45317




--
Sent from: http://octave.1599824.n4.nabble.com/Octave-Maintainers-f1638794.html

Re: Discrepancy between actual RAM usage by Octave and "whos" output

Mike Miller-4
On Sun, Oct 28, 2018 at 11:05:44 -0500, PhilipNienhuis wrote:
> What is the reason that actual RAM usage is so much larger than suggested by
> "whos"? Where does the overhead come from?

Cell arrays have a lot of memory overhead. Compare the memory used by
Octave instantiating

    A = rand (1e7, 10);

with the memory used for

    A = num2cell (rand (1e7, 10));

On my system, I see about 850 MB used for the first, 3.9 GB for the
second.
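
A scaled-down sketch of the same comparison, using whos called as a function to read the reported byte counts. The size n = 1e4 is an arbitrary choice to keep it quick, and the variable names are illustrative. This only shows what whos reports; the real memory footprint has to be observed from outside Octave (task manager, top, etc.).

```octave
n = 1e4;              % much smaller than 1e7 so it runs quickly
A = rand (n, 10);     % one contiguous block of doubles
C = num2cell (A);     % n*10 cells, each holding a 1x1 double

sA = whos ("A");      % called as a function, whos returns a struct
sC = whos ("C");

printf ("numeric array: %d bytes\n", sA.bytes);
printf ("cell array:    %d bytes\n", sC.bytes);
```

Both variables should report the same byte count (8 bytes per element), even though the cell array occupies far more memory in practice.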

> A corrollary is that esp. for unwary users, "whos" actually gives deceiving
> results.

Yes, especially if you are using cell arrays with a large number of
cells.

--
mike

Re: Discrepancy between actual RAM usage by Octave and "whos" output

apjanke-floss


On 10/30/18 3:38 PM, Mike Miller wrote:

> On Sun, Oct 28, 2018 at 11:05:44 -0500, PhilipNienhuis wrote:
>> What is the reason that actual RAM usage is so much larger than suggested by
>> "whos"? Where does the overhead come from?
>
> Cell arrays have a lot of memory overhead. Compare the memory used by
> Octave instantiating
>
>      A = rand (1e7, 10);
>
> with the memory used for
>
>      A = num2cell (rand (1e7, 10));
>
> On my system, I see about 850 MB used for the first, 3.9 GB for the
> second.
>
>> A corrollary is that esp. for unwary users, "whos" actually gives deceiving
>> results.
>
> Yes, especially if you are using cell arrays with a large number of
> cells.
>

Digging a little deeper: If Octave works like Matlab (and I think it
does), the "Bytes" reported by "whos" reflect only the memory used by
the raw primitive values (i.e. doubles, ints, chars) inside the arrays,
and not any of the overhead in Octave's internal array-management data
structures. Cells have high overhead because each individual cell
element contains an entire Octave array. (Same for objects that are not
coded as planar-organized.)

You can kind of see this by doing a "whos" on empty compound data types,
which all report as 0 bytes, when they clearly are using some memory.

octave:1> a_struct = struct;
octave:2> a_cell = {};
octave:3> an_object = containers.Map;
octave:4> whos
Variables in the current scope:

    Attr Name           Size                     Bytes  Class
    ==== ====           ====                     =====  =====
         a_cell         0x0                          0  cell
         a_struct       1x1                          0  struct
         an_object      1x1                          0  containers.Map

Conversely, because it doesn't check for memory shared between arrays
via CoW, whos() can also over-report memory usage for cells and other
compound types. (The memory arrangement is a directed acyclic graph, and
the memory-counting algorithm isn't checking for already-visited nodes.)
But this is less likely in practice.


octave:1> x = rand(1e7,1);
octave:2> cx = { x x x x x x x };
octave:3> ccx = repmat( { x }, [1e5 1]);
octave:4> cccx = repmat( { ccx }, [1e3 1]);
octave:6> ccccx = repmat( { cccx }, [1e3 1]);
octave:7> whos
Variables in the current scope:

    Attr Name          Size                            Bytes  Class
    ==== ====          ====                            =====  =====
         ccccx      1000x1               8000000000000000000  cell
         cccx       1000x1                  8000000000000000  cell
         ccx      100000x1                     8000000000000  cell
         cx            1x7                         560000000  cell
         x      10000000x1                          80000000  double

Actual memory usage here is about 100 MB.
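
A smaller version of the same effect that can be checked numerically, again reading the byte counts from whos called as a function. The sizes and variable names here are arbitrary illustrative choices.

```octave
x  = rand (1e3, 1);             % 8000 bytes of doubles
cx = repmat ({x}, [100 1]);     % 100 cells that all hold the same x

sx  = whos ("x");
scx = whos ("cx");

% whos counts x once per cell, whether or not the cells share storage
printf ("x: %d bytes, cx: %d bytes (%d times x)\n", ...
        sx.bytes, scx.bytes, scx.bytes / sx.bytes);
```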

You can exploit this behavior to do a form of low-rent compression on
low-cardinality cellstr or struct-organized object arrays:

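function out = canonicalize(x)
     % Rebuild x so that equal elements refer to one shared copy.
     [ux,~,Jndx] = unique(x);           % ux: distinct values; Jndx: position of each element of x in ux
     out = reshape(ux(Jndx), size(x));  % index back into ux and restore the shape of x
end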

After a "foo = canonicalize(foo)", foo will contain the same values, but
have only one copy of each distinct value in memory.
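
A minimal usage sketch, assuming a small cellstr with many repeated values (the variable names and data are made up). The function is repeated so the snippet runs on its own as a script. Since whos does not account for sharing, any saving would have to be checked via the process's actual RAM usage rather than via whos.

```octave
1;  % mark this file as a script so the function below can be defined inline

function out = canonicalize (x)
  [ux, ~, jndx] = unique (x);
  out = reshape (ux(jndx), size (x));
end

% Hypothetical low-cardinality cellstr: three distinct values, six elements
names = {"red", "green", "red"; "blue", "green", "red"};

compact = canonicalize (names);

% The canonicalized array has the same shape and the same values
disp (isequal (compact, names));
disp (size (compact));
```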

Cheers,
Andrew