Next: Gridding with XY_MAP
Up: IRAM Memo 2016-2 Gridding
Previous: Efficient transpositions
Contents
CLASS tables in detail
CLASS can save a set of spectra with the command TABLE. It creates
a 2D table using the Gildas Data Format (a header and rows of data). By
default, the per-column format used is:
- the X offset position,
- the Y offset position,
- the spectrum weight W (e.g. for different integration times),
- the first channel intensity,
and so on for all channels.
Such a table is ordered Velocity-Position (hereafter: VP), i.e. X, Y,
W and then the channels are contiguous in the file for each spectrum,
and the spectra are concatenated one after the other. This ordering has
advantages and disadvantages, which are described in the table below
and compared to the opposite Position-Velocity ordering (hereafter: PV).
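The two orderings can be illustrated with a small offset calculation. This is only a sketch under stated assumptions: the table is modelled as a flat sequence of values with no header, the sizes are toy values, and the helper names (vp_offset, pv_offset, is_contiguous) are hypothetical, not part of CLASS or Gildas.

```python
# Toy sizes, chosen only for illustration (not taken from the memo).
N_SPECTRA = 4
N_CHANNELS = 5
N_COLUMNS = 3 + N_CHANNELS  # X, Y, W, then one column per channel


def vp_offset(spectrum, column):
    """Element offset in a VP-ordered table: columns vary fastest."""
    return spectrum * N_COLUMNS + column


def pv_offset(spectrum, column):
    """Element offset in a PV-ordered table: spectra vary fastest."""
    return column * N_SPECTRA + spectrum


def is_contiguous(offsets):
    """True if the offsets form one unbroken run in the file."""
    return all(b - a == 1 for a, b in zip(offsets, offsets[1:]))


# Offsets of one whole spectrum (X, Y, W and all its channels).
one_spectrum_vp = [vp_offset(0, c) for c in range(N_COLUMNS)]
one_spectrum_pv = [pv_offset(0, c) for c in range(N_COLUMNS)]

# Offsets of the X column (column 0) for every spectrum.
x_column_vp = [vp_offset(s, 0) for s in range(N_SPECTRA)]
x_column_pv = [pv_offset(s, 0) for s in range(N_SPECTRA)]

print("one spectrum, VP contiguous:", is_contiguous(one_spectrum_vp))
print("one spectrum, PV contiguous:", is_contiguous(one_spectrum_pv))
print("X column, VP offsets:", x_column_vp)
print("X column, PV offsets:", x_column_pv)
```

In the VP layout a whole spectrum is one contiguous run, which suits writing and appending spectra. In the PV layout it is each column (X, Y, W or a given channel) that is contiguous, which suits reading by columns or by velocity blocks.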
Table:
Advantages (+ or ++) and disadvantages (-) of VP- and
PV-ordered Class tables
# | Velocity-Position | Position-Velocity
1 | (++) natural ordering for Class Data Format to table conversion, i.e. the Class file (input) and table (output) are traversed once in parallel; the table need not fit in memory | (-) would need non-contiguous access either to the Class file or to the output table in memory; the table must fit in memory
2 | (++) appending new spectra (TABLE APPEND mode) is straightforward and low-cost | (-) TABLE APPEND is not easy and costs reading and duplicating old data
3 | (-) one needs to traverse the whole table to read the X Y W columns | (+) X Y W arrays are contiguous, reading them has a marginal cost
4 | (+) the VP order is already the correct one for gridding (see section ) | (-) the PV order needs to be transposed in an intermediate buffer before use in gridding
5 | (-) reading by blocks of velocities is inefficient, i.e. one needs to traverse the file as many times as the number of blocks | (++) reading by velocity blocks is efficient as each block is a contiguous piece of the file
From items 1 and 2 in the table above, it is clear that the
result of the TABLE command must remain a Velocity-Position table.
However, item 5 suggests that Position-Velocity tables may be
worthwhile for efficient gridding, especially for tables which do not
fit in memory and must be read by blocks.
A Velocity-Position table can be transposed with the SIC command
TRANSPOSE; the advantages and disadvantages are detailed in the table
below. Item 6 can be explained as follows: both VP items 3 and 5
(first table above) and stand-alone transposition (table below) need
to traverse the input table several times. However, because the former
are part of a larger problem (reading the table, but also convolving,
writing the cube, and memory transpositions if any), they can use only
a smaller amount of memory, so they need more and smaller blocks and
traverse the table more times than the stand-alone transposition. In
other words, assuming the table does not fit in memory and that I/O
dominates the problem, VP gridding is slower than transposition
followed by PV gridding. This is even more true if gridding is repeated
several times on the same table. These conclusions are demonstrated in
the last table of this section.
Table:
Advantages (+) and disadvantages (-) of VP to PV stand-alone
transposition
# | VP to PV
6 | (+) file-to-file transposition is a stand-alone task which can use as much of the machine resources as possible, so that it is faster than the VP overheads (items 3 and 5) incurred within the XY_MAP command
7 | (-) the transposed table is a duplicate of the same data, so it doubles the disk usage
8 | (-) the transposition must be repeated each time the original table changes or new data is appended
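The link between the memory budget and the number of traversals of the input can be sketched with a block-wise VP to PV transposition. This is an illustration only, not the algorithm used by TRANSPOSE or XY_MAP: the tables are modelled as flat Python lists, and the function name transpose_vp_to_pv and its max_buffer parameter are hypothetical.

```python
def transpose_vp_to_pv(vp, n_spectra, n_columns, max_buffer):
    """Transpose a flat VP table into a flat PV table, block by block.

    max_buffer is the largest number of values held in memory at once
    (an assumed stand-in for the memory budget).
    Returns the PV table and the number of passes over the input.
    """
    # Number of full columns that fit in the buffer (at least one).
    cols_per_block = max(1, max_buffer // n_spectra)
    pv = []
    passes = 0
    for first in range(0, n_columns, cols_per_block):
        last = min(first + cols_per_block, n_columns)
        # One pass over the input: pick the wanted columns of each spectrum.
        block = [[0.0] * n_spectra for _ in range(last - first)]
        for s in range(n_spectra):
            row_start = s * n_columns
            for c in range(first, last):
                block[c - first][s] = vp[row_start + c]
        passes += 1
        # The block is a contiguous piece of the output, written in order.
        for column in block:
            pv.extend(column)
    return pv, passes


# Toy example: 4 spectra of 8 columns, buffer of 12 values (3 columns).
vp_table = list(range(4 * 8))
pv_table, n_passes = transpose_vp_to_pv(vp_table, 4, 8, 12)
print("passes over the input:", n_passes)
print("first PV column:", pv_table[:4])
```

The smaller the buffer, the fewer columns fit in each block and the more passes over the input are needed, while the output is written only once. This is the reason a task that can use more memory needs fewer traversals.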
Table:
Gridding of a table of 61263 spectra x 20737 channels (4.8
GB). Each command was allowed to use only 2048 MB. The output cube
(1.7 GB with the command defaults) is ordered LMV so that it is
traversed only once for writing. The system cache was emptied
before each run.
Action | Table traversals | Elapsed time
XY_MAP (VP) | 1 (XYW) + 7 blocks | 8.2 min
TRANSPOSE (VP to PV) | 4 (in) + 1 (out) | 5.0 min
XY_MAP (PV) | 1 | 1.9 min
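The elapsed times above can be combined to compare the two strategies when the same table is gridded several times. The short calculation below only reuses the three measured values; the function names are hypothetical, and it assumes that each repeated gridding costs the same as the measured one and that the table does not change between runs (otherwise the transposition must be redone, see item 8).

```python
# Elapsed times in minutes, copied from the timing table.
XY_MAP_VP = 8.2
TRANSPOSE = 5.0
XY_MAP_PV = 1.9


def total_vp(n_griddings):
    """Total time when gridding the VP table directly, n times."""
    return XY_MAP_VP * n_griddings


def total_pv(n_griddings):
    """Total time for one transposition followed by n PV griddings."""
    return TRANSPOSE + XY_MAP_PV * n_griddings


for n in (1, 2, 5):
    print(n, "gridding(s):",
          "VP", round(total_vp(n), 1), "min,",
          "transpose + PV", round(total_pv(n), 1), "min")
```

For a single gridding, transposition plus PV gridding takes 5.0 + 1.9 = 6.9 min against 8.2 min for VP gridding, and the gap widens with each additional gridding since the transposition is paid only once.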
From these conclusions, there is no unique answer to what kind of
table should be fed to the gridding engine. The answer is a trade-off
between efficiency when creating or appending to the table and
efficiency when reading it. However, if the whole table fits in memory,
i.e. it can be traversed once, then it is probably better to use a VP
table. If instead it must be read in parts, the PV order should be
preferred.
Gildas manager
2023-06-01