Several data buffers are needed in this problem. In particular,
transposition buffers are needed to go to/from the internal gridding
engine which works on velocity-first arrays. The buffers are:
- the table data buffer (size
),
- the table transposition/sorting buffer (size
) if (and only if) the transposition of the table (.bat)
and/or sorting (positions not sorted by Y offsets) are needed. The
same buffer is used for both tasks.
- the cube data buffer (size
),
- the cube transposition/sorting buffer (size
) if and only if transposition of the cube is needed
(.lmv).
We call $M_{\mathrm{tot}}$ the total amount of memory
allocated to the buffers. Note that they all scale as the number of
channels $N_{\mathrm{chan}}$. Some
other allocatable buffers are involved (e.g. X, Y, W columns read from
the table) but they are negligible compared to the data size, and
they do not scale as $N_{\mathrm{chan}}$.
The command XY_MAP needs memory space to allocate these
buffers. However, it does not allocate memory freely: it ensures that
they fit in a limited memory space (VOLATILE_MEMORY; how this limit is
defined is not considered here).
If the total buffer memory $M_{\mathrm{tot}}$ does not exceed
VOLATILE_MEMORY, the 4 buffers above fit in the
dedicated memory space. This is the best case, where the whole problem
fits in memory. In particular, the output cube fits in memory and will
be written at once, whatever its order.
If the total buffer memory $M_{\mathrm{tot}}$ exceeds
VOLATILE_MEMORY, the 4 buffers do not fit in the
dedicated memory space. The problem is then iterated by blocks of
channels along its velocity dimension. There will be
$N_{\mathrm{div}}$ iterations,
the buffers all being reduced by the same appropriate factor. In this
case, there are 2 possibilities:
- if the output cube is velocity-last (.lmv), it will be
gdf-extended block by block directly on the disk. In other words, it
does not need to fit entirely in memory: only the current subset is
loaded at a time. The minimum number of subdivisions
$N_{\mathrm{div}}$ needed for the total buffer memory
$M_{\mathrm{tot}}$ to remain below VOLATILE_MEMORY is (floating point
division):
$$N_{\mathrm{div}} = \frac{M_{\mathrm{tot}}}{\mathrm{VOLATILE\_MEMORY}} \qquad (1)$$
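- if the output cube is velocity-first (.vlm), it must fit
entirely in memory. The libgio does not support writing
non-contiguous pieces of a cube (and we probably do not want
this). This means we have to find a balance between the cube data
buffer (which scales as the total number of channels) and the 3 other
ones (which scale as the number of channels per division). In
this case, XY_MAP accepts to allocate up to 50% of VOLATILE_MEMORY for
the whole data cube, whose buffer size is noted $M_{\mathrm{cube}}$:
$$M_{\mathrm{cube}} \le 0.5 \times \mathrm{VOLATILE\_MEMORY} \qquad (2)$$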
If the output cube needs more than 50%, XY_MAP stops with an
error. Allocating more would leave too little space for the other
buffers, resulting in a much larger number of iterations, more
traversals of the table, etc., and thus reduced efficiency. The
recommendation is then to use an lmv cube and to transpose it
afterwards for better efficiency. The remaining part of the memory is
then divided as usual between the other buffers, all reduced by the
same factor. With $M_{\mathrm{cube}}$ the cube data buffer size and
$M_{\mathrm{tot}}$ the total buffer memory:
$$N_{\mathrm{div}} = \frac{M_{\mathrm{tot}} - M_{\mathrm{cube}}}{\mathrm{VOLATILE\_MEMORY} - M_{\mathrm{cube}}} \qquad (3)$$
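The case analysis above can be sketched in a few lines of Python. This is only an illustrative sketch, not the XY_MAP implementation: the function name `min_divisions` and its arguments are hypothetical, all sizes are assumed to be expressed in the same unit, and the velocity-first formula is our reading of the rule that the memory left after reserving the cube is shared by the other buffers.

```python
def min_divisions(m_tot, m_cube, volatile_memory, velocity_first):
    """Minimum (floating point) number of divisions.

    m_tot           -- total size of the 4 buffers (hypothetical name)
    m_cube          -- size of the cube data buffer alone
    volatile_memory -- memory limit, same unit as the sizes
    velocity_first  -- True for a .vlm output cube, False for .lmv
    """
    if m_tot <= volatile_memory:
        # Best case: the whole problem fits in memory, single pass.
        return 1.0
    if not velocity_first:
        # Velocity-last output: every buffer shrinks by the same factor.
        return m_tot / volatile_memory
    # Velocity-first output: the whole cube must stay in memory,
    # and it may use at most half of the limit.
    if m_cube > 0.5 * volatile_memory:
        raise MemoryError("output cube exceeds 50% of VOLATILE_MEMORY")
    # Assumed reading: the other buffers share what the cube leaves free.
    return (m_tot - m_cube) / (volatile_memory - m_cube)
```

For instance, with a limit of 200 units, a total of 300 units and a cube of 100 units, the sketch gives 1.5 divisions for a velocity-last cube and 2.0 for a velocity-first one, which illustrates why the velocity-last order is cheaper.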
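At this stage, $N_{\mathrm{div}}$
is the minimum (floating point) number of divisions
needed. With $N_{\mathrm{chan}}$ the total number of channels, the
number of channels processed per division is then:
$$N_{\mathrm{chan}}^{\mathrm{div}} = \left\lfloor \frac{N_{\mathrm{chan}}}{N_{\mathrm{div}}} \right\rfloor \qquad (4)$$
Rounding to an integer number of channels is obviously needed, as the
internal engine cannot process a fraction of a channel. Rounding down
is used because rounding up could mean using more than VOLATILE_MEMORY
per division. Because of the integer rounding, the
actual number of divisions which will be performed is finally:
$$N_{\mathrm{div}}^{\mathrm{actual}} = \left\lceil \frac{N_{\mathrm{chan}}}{N_{\mathrm{chan}}^{\mathrm{div}}} \right\rceil \qquad (5)$$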
Note that the last division may process fewer channels than the other
ones, because the total number of channels may not be a multiple
of the number of channels per division.
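The rounding steps can be illustrated with a short Python sketch. The function name `plan_blocks` and the 0-based, end-exclusive channel ranges are assumptions made for the example; they are not taken from XY_MAP itself.

```python
import math


def plan_blocks(n_chan, n_div_min):
    """Split n_chan channels into blocks, given the minimum
    (floating point) number of divisions n_div_min."""
    # Round down so that a block never exceeds the memory limit.
    chan_per_div = math.floor(n_chan / n_div_min)
    if chan_per_div < 1:
        raise ValueError("not even one channel fits per division")
    # Because of the rounding, more divisions may be needed in the end.
    n_div = math.ceil(n_chan / chan_per_div)
    # Channel ranges as (first, last + 1); the last block may be shorter.
    blocks = [(first, min(first + chan_per_div, n_chan))
              for first in range(0, n_chan, chan_per_div)]
    return chan_per_div, n_div, blocks
```

With 100 channels and a minimum of 3.3 divisions, this yields 30 channels per division and 4 actual divisions, the last one covering only 10 channels.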