[alma-sw-ssr] Pipeline/Offline Requirements Draft V2.2
Here is the latest draft of the document for the meeting. I will try to
produce a printed version using the tex template from the last document,
and bring this to the meeting so we will have readable copies.
This draft includes my best tries at addressing the comments received so
far, though not all the discussion on the pipeline session has been
followed through. We will have plenty to discuss at the meeting
(especially the priorities).
-Steve
--------------------------------------------
ALMA SSWG Pipeline and Offline Requirements
Draft 12-July-2001 v2.2
S. Myers, F. Gueth, B. Clark, P. Schilke,
M. Momose, K. Tatematsu
--------------------------------------------
The history of this document and correspondence can be found at
http://www.aoc.nrao.edu/~smyers/alma/offline-req/
This document is olr-2001-07-12-myers.txt
-------------------------------------------------------------------------------
Section 1: Overview
-------------------------------------------------------------------------------
Scope:
-------------
This document describes the requirements for data reduction software packages
in order to be able to handle the ALMA data output. There are two main sets
of requirements: requirements on the ALMA internal data pipelines, and
requirements on offline and external data reduction packages. In particular,
it is assumed that there will be (at least) one software package used
internally by ALMA in order to process the data in quasi-real time (the
quick-look and calibration pipelines) and also to fill the archive (the
science pipeline). This Pipeline must fulfill the Pipeline Requirements
(Section 2). Note that the tools that make up the Pipeline need not be in the
same program, and may be from different software packages. However, this
suite will be referred to as an integrated entity in this document.
In addition, there must be software available for users to reduce their own
data and/or data from the archive offline at their home institution, at an
ALMA regional center, or remotely using ALMA center computing. This suite of
tools must fulfill the Offline Data Reduction Requirements (Section 3). We
refer to the ensemble that fulfills these as the "Package". Again this might
be an assortment of different programs from different software packages
(e.g. AIPS, GILDAS, MIRIAD, aips++) but in this case it is highly desirable
that there be at least one single package that fulfills the Offline
requirements or that there be an installation that integrates the necessary
parts. Note, however, that it is highly unlikely that disparate applications
from different packages will fulfill the requirements on similar "look and
feel" and inter-connectivity, and thus it is likely that this will be a single
homogeneous suite by default.
I think it is important that the Pipeline be available for installation
and use by users, and not be merely an internal ALMA "black-ops" secret
weapon. It is desirable, though not necessary, that the Pipeline
package be part of the offline Package.
Basis:
-------------
This document assumes requirements already delineated in
ALMA-SW MEMO 11 "ALMA Software Science Requirements and Use Cases" at
http://www.alma.nrao.edu/development/computing/docs/joint/0011/ssranduc.pdf
Much of the content of this document is based on the AIPS++ User
specifications Memo 115 found at
http://aips2.nrao.edu/stable/docs/specs/specs.html
Although it was intended to be the basis for AIPS++, these are an excellent
starting point for our requirements to build upon. I do not think this makes
this too specific to AIPS++, as our document should be package-independent.
Nomenclature:
-------------
The subject of Section 2 on Pipeline Requirements is referred to as the
"Pipeline". This may be implemented as disparate tools or programs, or as
separate packages provided by different groups, or as a single package, as
long as it fulfills the requirements. At its core, the Pipeline is a set
of operations, implemented by an underlying software package,
which takes a concise description of the way these operations are to be
performed and accesses ALMA data, either from the ALMA archive or from
local files, and produces a desired data product. There are several
pipelines within the Pipeline essential to the efficient operation of ALMA.
The subject of Section 3 on Offline Data Reduction Requirements is referred to
as the "Package" or "Offline Package". This is intended as a set of tools or
programs, believed adequate for ALMA reductions, and used by end-users for
science and by ALMA staff for reductions upon which the behavior of the system
will be judged. It may consist of packages provided by different groups, with
transitions provided to integrate them into a single suite. It may be that
more than one Package fulfills the requirements of Section 3 and thus can be
considered as suitable for ALMA processing. The requirements will state that
the Package will be available for installation on the observer's own computer
systems as well as present at ALMA centers. Note that the pipeline may or
may not be based on the offline Package(s), depending on implementation.
However, the functionality of the Pipeline must be available in the offline
Package for users.
The user of the package may be referred to as "user" or "observer", and may
be the actual proposer or a staff member. In pipeline mode, the user may
actually be another tool or program.
The "archive" refers to the totality of the ALMA data storage and consists
of possibly different physical archives.
General Considerations:
-----------------------
A. Two fundamentally new aspects of ALMA are the integrated archive and
the pipeline, therefore the impact of requirements on these two areas
should be considered. In particular the Pipeline will be the most
critical aspect of ALMA given that we envision both an effective dynamically
scheduled observatory with a prompt user feedback mechanism and a
scientifically viable archive. The first substantial section deals with
the Pipeline in detail. We have also included topics "Relation to the
Pipeline" in the first section of General Requirements for the Offline
Reduction, and "Interaction with Archive" to the section on Data Handling.
B. There is a fundamental difference between running the ALMA array in
interferometric and single-dish modes. This difference may not be so
fundamental for many types of data processing, however. For example, if
several antennas observe in single-dish mode together, much of the calibration will
be done interferometrically (pointing, focus, beam shape); for
interferometric observations, temperature scale is derived from single-dish
measurements. Therefore where appropriate we split these two paths, with
inclusion of integration of the two where appropriate.
C. There is no fundamental difference between spectral-line and continuum
observations, merely the number of channels and bandwidths coming out of
the correlator. Due to the nature of the ALMA correlator, most of
the special calibration needs of traditional spectral
line observations (e.g. bandpass calibration) are also applicable to
continuum observations (continuum is built up through summation of
spectral channels taken at low resolution). We will therefore assume that
all data will be effectively taken in spectral line mode.
D. There is no fundamental distinction needed for polarization data, merely
consideration of the number of polarization products needed for processing.
There may be special modes where polarization (e.g. RR or LL) is synthesized
in hardware, but these can be treated as special cases. The model we
consider here is the processing of one or more Stokes parameters and thus
polarization is integrated into all the topics.
E. Mosaicing will likely be widespread in ALMA data reduction due to the
small fields of view. In this trial outline, we choose to include
mosaicing as cases (like spectral line) under the interferometric and
single-dish modes when the requirements are straightforward, but
also under special headings "Mosaicing considerations" where it seems
more appropriate (under the Calibration and Imaging headings).
F. We propose a system of prioritizing with codes:
1 = essential
2 = highly desirable,
3 = desirable, but not critical
These codes are enclosed in brackets [] at the end of the items they
qualify.
NOTE: No priorities have been assigned in this draft of the document.
G. Specific requirements can/should be broken into sub-cases or instances
(e.g. R1, R1.1, R1.2) for clarity.
H. The requirements for the ACA are currently placed in an Appendix, pending
executive decision on whether ACA is within the scope of ALMA. This may
be deleted (if ACA is not part of ALMA) or promoted to a Chapter at a
later date.
-------------------------------------------------------------------------------
Section 2: Pipeline Data Processing Requirements
-------------------------------------------------------------------------------
Comments to: Frederic Gueth and Peter Schilke
PL-1.0 General
--------------
We distinguish three different pipelines, the Calibration, the Quick-Look,
and the Science Pipeline. The Calibration pipeline is intended for
processing of array calibration data, usually on short turnaround
time-scales, with feedback to the online system and into the archive.
The Quick-Look pipeline has the job of providing quasi-real-time
(~minutes) or short turnaround-time (<hours) images, spectra, and
data-quality assessment for feedback to the online system and to the
observers, and possibly output to the archive. The Science pipeline
is the primary data path from the array to the archive and to the
observer, usually operating on longer timescales to produce results
after breakpoints and after completion of projects (see ALMA-SW MEMO 11).
1.0-R1 The Pipelines shall be able to process all data coming from the
array in standard modes designated by the project. It must not
constitute a bottle-neck in the data flow, meaning that several
occurrences of the same pipeline shall be able to run in parallel
if necessary.
R1.1 Some projects will require unusually high data rates or
processing requirements. These may require processing outside
of the ALMA system and should be flagged appropriately so they
are not processed by the ALMA pipeline.
1.0-R2 The Pipelines shall operate through readable and comprehensible
data reduction scripts.
1.0-R3 Sufficient recording of corrections applied and/or models used
shall be carried out so that any step can be reversed and redone
if needed without recourse to repeating an entire series of
operations or resorting to a copy of the dataset at the
intermediate state.
1.0-R4 A manual, interactive mode of operations shall be available for
debugging, technical developments, inspection by expert engineers
and astronomers on duty, etc.
1.0-R5 The Pipelines shall also be run at the Regional Centers. Some of
the actions described below are not relevant in that case:
interaction with the Dynamic Scheduler and with the Sequencer.
1.0-R6 The pipelines should output a comprehensible summary of the
operations performed with diagnostic information to allow
checking of results and a record of the processing.
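As an illustration of the requirement above on recording corrections so that any step can be reversed, the following is a minimal sketch only. It assumes corrections are simple multiplicative gains kept in an ordered log next to untouched raw data; the class `CorrectionLog` and its method names are hypothetical and are not taken from any existing package.

```python
from dataclasses import dataclass, field


@dataclass
class CorrectionLog:
    """Raw data plus an ordered record of corrections (hypothetical model).

    Assumption: each correction is a single multiplicative gain factor.
    The raw data are never modified; corrected data are derived on demand.
    """
    raw: list
    steps: list = field(default_factory=list)  # (name, gain) pairs, in order applied

    def apply(self, name, gain):
        # Record the correction instead of overwriting the data.
        self.steps.append((name, gain))

    def undo(self, name):
        # Reverse one named step without touching the others or the raw data.
        self.steps = [s for s in self.steps if s[0] != name]

    def corrected(self):
        # Re-derive the corrected data from the raw data and the recorded steps.
        out = list(self.raw)
        for _, gain in self.steps:
            out = [x * gain for x in out]
        return out


log = CorrectionLog(raw=[1.0, 2.0])
log.apply("atmosphere", 2.0)
log.apply("bandpass", 0.5)
log.undo("atmosphere")  # only the bandpass correction remains
```

Because only the log changes, reverting a step does not require a stored copy of the dataset at the intermediate state.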
PL-2.0 Calibration pipeline
---------------------------
There are three general categories of calibrations that must be
handled:
o The instrumental calibration: pointing, focus, delay, baseline,
etc. What is required here is a fast feedback to the control
software. In particular, there are critical calibrations (focus,
pointing, delay) which must be executed successfully before
telescope operations can resume - these are the most time-critical
and have highest priority.
o The calibrations that do not require a time interpolation, such as
the atmospheric or bandpass calibration: each time such a
scan is observed, something has to be derived and then stored,
to be applied to all the following observations, until a new
calibration of that kind is observed. Many of these can be
immediately processed (e.g. bandpass) and stored for subsequent use.
o The calibrations that require a time interpolation (e.g. the
phase and amplitude calibration, polarization leakage) where
a calibration curve has to be fitted using data taken over
a range in time. Some of these are local in time, and are
applied to target observations that were observed in between
calibrations. Others, once determined, are valid over a longer
time period.
The first two are clearly handled by the calibration pipeline as
described in this section. The third will likely be shared with
the other pipelines, the Science Pipeline in particular for
end-point calibration. It will probably be sufficient to include
only the simple modes in the Calibration and Quick-look Pipelines,
with more elaborate and mode-dependent calibrations occurring in
the Science Pipeline, especially for calibrations that affect only
the data in a given observation locally (e.g. fast-switching gain
calibration).
2.0-R1 The Calibration Pipeline shall be activated after each scan has
been observed.
2.0-R2 The Calibration Pipeline may also be re-invoked at any time with
updated parameters or improved data. The results should not
immediately overwrite old results so comparison is possible
before adopting the new calibration. There will need to
be a method for validation and acceptance of calibration
updates.
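One possible reading of 2.0-R2, given here only as a sketch: new calibration results are stored alongside the old ones, and a separate acceptance step decides which version is in use. The class `CalibrationStore`, its methods, and the example gain values are assumptions made for illustration.

```python
class CalibrationStore:
    """Keeps every calibration result; adoption is a separate step (hypothetical)."""

    def __init__(self):
        self.versions = {}  # kind -> list of results, oldest first
        self.accepted = {}  # kind -> index of the version currently in use

    def add(self, kind, result):
        # Store a new result without overwriting earlier ones.
        self.versions.setdefault(kind, []).append(result)
        index = len(self.versions[kind]) - 1
        # Assumption: the very first result of a kind is adopted by default.
        self.accepted.setdefault(kind, index)
        return index

    def accept(self, kind, index):
        # Validation and acceptance of an update happens here, after comparison.
        if not 0 <= index < len(self.versions[kind]):
            raise IndexError("no such calibration version")
        self.accepted[kind] = index

    def current(self, kind):
        return self.versions[kind][self.accepted[kind]]


store = CalibrationStore()
store.add("bandpass", {"gain": 1.00})
candidate = store.add("bandpass", {"gain": 1.02})  # re-invoked with improved data
in_use_before = store.current("bandpass")          # still the old result
store.accept("bandpass", candidate)
in_use_after = store.current("bandpass")
```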
2.1 Interferometric data
------------------------
2.1-R1 The Calibration Pipeline shall reduce, and store the following
results for further use:
R1.1 the receiver sideband ratio calibration
R1.2 the atmospheric calibration
The results of the atmospheric calibration shall be passed
to or made available for access by the Dynamic Scheduler
(in real-time mode).
2.1-R2 For all observations of an astronomical source, the Calibration
Pipeline shall:
R2.1 apply the atmospheric calibration to the data
R2.2 store the phase corrected for the atmospheric effect, if
required
In subsequent operations, the corrected or uncorrected phase is
to be used, depending on the selected mode.
2.1-R3 For all observations of a calibrator source, the
Calibration Pipeline shall:
R3.1 compute the phase rms on the scan timescale
R3.2 compute the antenna efficiencies, using the averaged
amplitudes
R3.3 do the previous operations both with and without the
atmospheric phase correction, and deduce from the
comparison whether the atmospheric phase correction
improves the results or not
R3.4 derive amplitude and phase time-dependent variations by
fitting smoothed curves (e.g. polynomials, splines)
using all observations of calibrators since the beginning
of the session
These results shall be passed to the Dynamic Scheduler, and
made available to the Pipeline for gain transfer to target
sources.
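The comparison in R3.1 and R3.3 can be sketched as follows. This is a simplified illustration: it assumes phases are given in degrees for one scan, that they are already unwrapped, and that "improves" simply means a smaller rms. The function names and sample values are hypothetical.

```python
import math


def phase_rms(phases_deg):
    """Rms scatter of the phases about their mean over one scan (degrees)."""
    mean = sum(phases_deg) / len(phases_deg)
    return math.sqrt(sum((p - mean) ** 2 for p in phases_deg) / len(phases_deg))


def correction_helps(raw_deg, corrected_deg):
    """True if the atmospheric phase correction lowers the scan phase rms."""
    return phase_rms(corrected_deg) < phase_rms(raw_deg)


# Made-up calibrator phases, with and without the atmospheric correction.
raw = [10.0, -10.0, 10.0, -10.0]
corrected = [2.0, -2.0, 2.0, -2.0]
use_corrected = correction_helps(raw, corrected)
```

A real implementation would also have to handle phase wrapping and per-baseline or per-antenna statistics, which this sketch ignores.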
2.1-R4 The Calibration Pipeline shall reduce the following
observations:
R4.1 pointing scans (results to be passed to the Sequencer)
R4.2 focus measurements (results to be passed to the Sequencer)
R4.3 delay calibration (results to be passed to the Sequencer)
R4.4 bandpass calibration
R4.5 baseline calibration
R4.6 holography measurement
2.2 Single-Dish data
--------------------
2.2-R1 The Calibration Pipeline shall reduce the atmospheric
calibration, using sideband ratios determined from the
most recent interferometric calibration (2.1-R1.1),
and pass the results to the dynamic Scheduler.
2.2-R2 For all observations of an astronomical source, the Calibration
Pipeline shall apply the atmospheric calibration to the data.
2.2-R3 The Calibration Pipeline shall reduce and pass the results to
the Sequencer:
R3.1 pointing
R3.2 focus
Note that in most cases this will be done interferometrically.
2.2-R4 For the pointing and focus measurements, the fitting results
should be automatically stored in the telescope
parameter file if the fitting error is less than
the system specified value. If the error
is not less than the specified value,
the pipeline will send a message to the alarm system.
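The decision described in 2.2-R4 could look roughly like the sketch below. The telescope parameter file is modelled as a plain dictionary and the alarm system as a list of messages; both, along with the function name `handle_fit` and the numbers used, are assumptions for illustration only.

```python
def handle_fit(name, value, error, max_error, parameter_file, alarms):
    """Store a pointing/focus fit result if its error is below the system limit.

    parameter_file: dict standing in for the telescope parameter file (assumption).
    alarms: list standing in for the alarm system (assumption).
    Returns True if the result was stored, False if an alarm was raised.
    """
    if error < max_error:
        parameter_file[name] = value
        return True
    alarms.append(f"{name} fit rejected: error {error} is not below limit {max_error}")
    return False


parameters = {}
alarms = []
stored_focus = handle_fit("focus_mm", 1.25, 0.02, 0.05, parameters, alarms)
stored_pointing = handle_fit("az_offset_arcsec", 3.1, 0.9, 0.5, parameters, alarms)
```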
2.2-R5 The calibration pipeline shall derive the
half-power beam size, the main-beam
efficiency, and the Moon (fss) efficiency from the calibration
scans toward planets and the Moon, and store
the successful results in the telescope parameter file.
Another derived parameter is the total forward efficiency
obtained from skydip measurements.
PL-3.0 Quick Look pipeline
-----------------------
3.0-R1 The Quick Look pipeline shall be activated after the Calibration
Pipeline has been completed.
3.0-R2 A Monitoring Tool shall be available, plotting and archiving in a
log file various results of the Calibration Pipeline:
R2.1 the results of the last pointing or focus scan
R2.2 the phase rms computed over the last scan and computed over the
current session
R2.3 the corresponding seeing
R2.4 the atmospheric opacity
...
This tool shall include a variety of options, to control the plot
parameters, to plot the variation of these results with time, to
allow the operator to monitor one antenna or baseline in
particular, etc.
Automatic checks shall be available to detect bad or degrading
results, triggering alarms if necessary.
Since it is required that the observer/operator can efficiently
check out the status of ongoing observations, all the plots by the
monitoring tool should be reasonably simple, and the plot options
should be quickly changeable by the observer/operator.
3.0-R3 A Monitoring Tool shall be available to plot the current properties
of the array, such as:
R3.1 the current instantaneous uv coverage
R3.2 the corresponding weight distribution
R3.3 the corresponding dirty beam
R3.4 the previous quantities, integrated since the beginning of the
session
R3.5 the thermal noise rms reached since the beginning of the session
...
3.0-R4 Single-Dish data: the current spectra observed on the astronomical
target shall be corrected for the emission at a reference position
or frequency (depending on the observing mode), and displayed with
various options:
R4.1 time integration
R4.2 antenna summation
R4.3 baseline fit, excluding a pre-defined window, or a window
defined by the Operator or AoD
R4.4 spectra on a pseudo-grid corresponding to position on a raster
(a "stamp" or "profile" plot)
3.0-R5 Interferometric data: the visibilities observed on a target source
shall be calibrated, using the results of the Calibration Pipeline:
R5.1 apply the current bandpass calibration
R5.2 apply the current amplitude and phase correction
R5.3 apply the flux conversion factor based on standard antenna
efficiencies
3.0-R6 Interferometric data: the current spectra observed on the
astronomical target shall be displayed (amplitude and phase) with
various options:
R6.1 time integration
R6.2 choice of the baseline(s)
R6.3 baselines summation
R6.4 intensity (amp or phase) as function of baseline and time
(for a frequency), or time and frequency (for a baseline)
3.0-R7 Interferometric data: the Quick Look Pipeline shall compute the
Fourier Transform of the visibilities, using the fastest algorithm,
and display the resulting image. Alternatively, the actual Fourier
Transform of each new visibility point can be computed and added to
the current image. This shall be done for:
R7.1 the continuum data
R7.2 the line-averaged spectra, over a pre-defined velocity range,
or possibly a velocity range defined by the Operator/AoD
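The alternative mentioned in 3.0-R7, adding the direct Fourier transform of each new visibility to the current image, can be sketched as below. This is an illustrative toy, not a proposed implementation: it assumes u and v are in wavelengths, direction cosines (l, m) are in radians on a small square grid, natural weighting, and a particular sign convention in the exponent. The function name and the sample values are hypothetical.

```python
import cmath
import math


def add_visibility(image, cell, u, v, vis):
    """Add the contribution of one visibility to a dirty image, in place.

    image: square list of lists of floats, phase centre at the middle pixel.
    cell:  pixel size in radians (assumption).
    The real part is taken, which accounts for the Hermitian-conjugate point.
    """
    n = len(image)
    centre = n // 2
    for j in range(n):
        m = (j - centre) * cell
        for i in range(n):
            l = (i - centre) * cell
            # Sign convention of the exponent is an assumption of this sketch.
            image[j][i] += (vis * cmath.exp(2j * math.pi * (u * l + v * m))).real


n = 5
image = [[0.0] * n for _ in range(n)]
# Three visibilities of a 1 Jy point source at the phase centre (made-up uv points).
for u, v in [(10.0, 0.0), (0.0, 25.0), (-15.0, 20.0)]:
    add_visibility(image, 0.01, u, v, 1.0 + 0.0j)
centre_value = image[n // 2][n // 2]
```

Each new visibility costs one pass over the image, so this approach suits a running quick-look display with modest image sizes; gridding followed by an FFT is the usual choice for large data sets.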
PL-4.0 Science Pipeline
--------------------
4.0-R1 The Science Pipeline shall be activated after reaching a
break-point (either user-defined or end-of-session normally).
4.0-R2 The Science Pipeline shall find in the Archive all data observed
during the session. It shall use the atmospheric-calibrated data
(amplitude and phase).
4.1 Interferometric data
------------------------
4.1-R1 The Science Pipeline shall use the calibrator to derive:
R1.1 the bandpass calibration
R1.2 the best phase and amplitude solution
4.1-R2 The Science Pipeline shall calibrate the source observations by
applying:
R2.1 the bandpass calibration
R2.2 the phase calibration
R2.3 the amplitude calibration
4.1-R3 The Science Pipeline shall check and correct the flux scale by
using observations of sources of known flux. Any effect due to
the source being resolved shall be taken into account.
4.1-R4 The Science Pipeline shall compute images for each
(non-blanked, possibly user-specified) frequency channel,
as well as for the continuum emission:
R4.1 extract the visibilities with the appropriate frequency
resolution + the continuum measurements
R4.2 find in the Archive the previous (calibrated) visibilities,
and check whether the data are compatible with the current
dataset
R4.3 grid the whole data set
R4.4 Fourier transform
4.1-R5 The images shall be deconvolved using the most appropriate
algorithm. In case of a complex image, it should be possible to
have several algorithms running in parallel, the best
(according to criteria TBD) image being eventually selected.
4.1-R6 Designated modes shall be supported, including:
R6.1 mosaic observations
R6.2 on-the-fly mosaics
R6.3 self calibration projects
R6.4 combination of single-dish + ALMA data (+ACA)
Comment: Careful cross calibration of the flux scales between
ALMA interferometric data and single dish data (and ACA)
is required for high fidelity imaging. This will require
careful coordination with the calibration pipeline, especially
as ACA observations may be taken at very different times than
the main array data.
4.1-R7 Subtraction of continuum level from spectral data is
required. This can be done in both Fourier and image
domain. In the case of uv-plane subtraction, flexible
setting of the frequency channel ranges for the calculation
of the continuum level should be available.
Comment: There will likely need to be a way to make trial
subtractions and select the "best" in an automated manner
for the pipeline to function.
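A minimal sketch of uv-plane continuum subtraction with user-set channel ranges follows. It assumes the simplest model, a continuum that is constant across the band (zeroth order), estimated as the mean complex visibility over the line-free channels of a single visibility spectrum. The function name and the half-open `(start, stop)` range convention are assumptions.

```python
def subtract_continuum(spectrum, line_free_ranges):
    """Subtract a constant continuum level from one complex visibility spectrum.

    spectrum:         list of complex visibilities, one per channel.
    line_free_ranges: list of (start, stop) channel ranges, stop exclusive.
    Returns (line-only spectrum, estimated continuum level).
    """
    channels = [k for start, stop in line_free_ranges for k in range(start, stop)]
    continuum = sum(spectrum[k] for k in channels) / len(channels)
    return [s - continuum for s in spectrum], continuum


# Made-up spectrum: continuum 1+1j in every channel, plus a line in channel 2.
spectrum = [1 + 1j, 1 + 1j, 3 + 1j, 1 + 1j, 1 + 1j]
line_only, continuum = subtract_continuum(spectrum, [(0, 2), (3, 5)])
```

Trial subtractions with different channel ranges, as suggested in the comment above, would amount to calling this with several candidate values of `line_free_ranges` and ranking the residuals by some criterion still to be defined.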
4.2 Single dish data
--------------------
4.2-R1 The data taken on the astronomical source shall be reduced,
depending on the observing mode. All ALMA modes (as
designated by the project) shall be supported, including:
R1.1 on/off
R1.2 nutator switch
R1.3 frequency switch
R1.4 raster maps using one of the above modes
R1.5 OTF maps using one of the above modes
4.2-R2 The resulting spectra shall be corrected for a baseline, fitted
on all spectral channels except a pre-defined window.
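The baseline correction of 4.2-R2 can be illustrated with a first-order (straight line) fit to all channels outside the window. The polynomial order, the function name, and the half-open window convention are assumptions of this sketch; the requirement itself does not fix them.

```python
def subtract_linear_baseline(spectrum, window):
    """Fit a straight line to channels outside `window` and subtract it.

    spectrum: list of floats, one per channel.
    window:   (start, stop) channel range to exclude from the fit, stop exclusive.
    """
    start, stop = window
    xs = [k for k in range(len(spectrum)) if not start <= k < stop]
    ys = [spectrum[k] for k in xs]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Ordinary least-squares slope and intercept.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return [y - (intercept + slope * k) for k, y in enumerate(spectrum)]


# Made-up spectrum: baseline 1 + 0.5*k, plus a line of height 5 in channel 3.
spectrum = [1.0, 1.5, 2.0, 7.5, 3.0, 3.5, 4.0]
flattened = subtract_linear_baseline(spectrum, (3, 4))
```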
4.2-R3 The Science Pipeline shall:
R3.1 find in the Archive previous (calibrated) observations, and
check whether the data are compatible with the current
dataset
R3.2 grid the whole data set
4.2-R4 Provision shall be made to allow running appropriate
algorithms (deconvolution, destriping), if required by the data
or by the experience gained using ALMA.
PL-5.0 Interface with the Archive --- TO BE DETAILED
------------------------------
5.0-R1 The images produced by the Science Pipeline shall be archived,
together with:
R1.1 the script that was used to produce the image
R1.2 the log file of the software
5.0-R2 cf 7.0-R3 general SSR document
5.0-R3 Also to be archived:
R3.1 data quality control:
R3.1.1 estimate of the noise
R3.1.2 seeing
R3.1.3 image fidelity based on model?
R3.2 observation quality control:
R3.2.1 baseline quality
R3.2.2 calibration quality
R3.3 telescope state: (possibly in monitor file, but accessible)
R3.3.1 telescope pointing
R3.3.2 subreflector focus
R3.3.3 monitor point (e.g. temperatures) data
Appendix: Barry Clark's list of input parameters needed for each procedure
NOW MOVED TO APPENDIX AT END OF DOCUMENT
-------------------------------------------------------------------------------
Section 3: Offline Requirements
-------------------------------------------------------------------------------
OL-1.0 General Requirements and Interaction with other ALMA elements
1.1 Goals of the Offline Package
1.1-R1 An ALMA Offline Data Reduction Package (or "the package")
is primarily intended to enable end-users of ALMA (e.g.
observers or archive users) to produce scientifically
viable results that involve ALMA data products. The secondary
use is to enable ALMA staff to assess the state of the
array and derive calibration parameters for the system.
1.1-R2 The package should be able to function (be installed) at
the user's home institution and at ALMA regional centers
(both locally and remotely). It should be portable to a
reasonable number of supported platforms,
including laptops without network connections.
1.1-R3 The performance of the package should be quantifiable and
commensurate with the data processing requirements of
ALMA output at a given time. This should be benchmarked
(e.g. "AIPSmarks") and accurately reproduce results for
a fiducial set of reduction tasks.
1.2 Relation to the Pipeline
1.2-R1 All modules available in the pipeline must be available also
as an offline analysis option.
1.2-R2 Note that not all offline analysis tools will necessarily
be in the pipeline package. For example, one of the important
differences between the pipeline and offline reduction paths is that
the offline one should have extensive interactive capabilities to
merge and compare data with different resolution, coordinate
system, data grid, and so on.
1.3 Operational Issues
1.3-R1 Installation of the package must be flexible, and possible
on non-specialized hardware by the end user, preferably
without root user permission (on Linux).
1.3-R2 Updates and new versions of the package should be backward
compatible where possible, such that user-built and observatory
provided scripts and tools should be executable with only
minor changes.
1.3-R3 User installation of the package should not be restricted
by other issues such as expensive or unduly restrictive
licenses (i.e. the package license should convey all other
necessary licenses, such as GNU).
Comment: Although it may be attractive to build upon a
commercial package such as IDL, this is likely to be
prohibitive unless a blanket license would be bought
by the project and distributable free of charge to users.
OL-2.0 Interface
2.1 General user interface requirements
2.1-R1 User must be able to choose from a variety of interface styles
R1.1 A Command Line Interface (CLI) must be provided, with access
via both an interactive input and via script
R1.2 A Graphical User Interface (GUI) must be provided for
interactive processing. Actions taken under the GUI must
be loggable and convertible into scripts executable by
the CLI.
2.1-R2 The user should be able to interact with the host operating
system with command sequences invoked from the UI.
2.1-R3 Multitasking for all interfaces should be available where
appropriate.
R3.1 It must be possible to run one or more long-running
calculations in the "background." While background tasks
are running normal interactive activities must be possible.
2.1-R4 User-selectable output display devices will be supported.
2.1-R5 User-understandable and non-destructive error handling at
all levels is highly desirable.
2.1-R6 The application of successive stages of calibration,
correction, flagging and editing should not be destructive
to the data. The package should be able to recover and
revert to earlier stages without repeating an entire series
of operations.
2.1-R7 The interface and package should function without a network
connection. Users should be able to (conveniently) run the
data processing user interface from a laptop (e.g. on an
airplane).
2.2 Graphical User Interface (GUI)
2.2-R1 The GUI should provide real-time feedback via standard
compact displays.
2.2-R2 The default look and feel of the GUI should be
uniform and familiar through the entire package.
2.2-R3 The look and feel of the GUI must be customizable
to accommodate both the expert specialist and the novice user,
with ability to hide complexity when prudent and the ability
to access deeper levels when desired. The default look to
the novice should not be overly busy, with functionality
easily apparent through labeling and built-in help facility.
2.2-R4 The user should be able to configure the window arrangement
and menu appearance to fit display characteristics. For
example, on small displays runaway window generation can
quickly become annoying, and the choice of whether to
keep to a single switched window or to spawn new windows
should be up to the user.
2.2-R5 All functionality of the CLI must also be available in GUI
mode.
2.2-R6 A graphical data-flow oriented (IDL style) tool assembler
would be desirable, perhaps as an advanced GUI for later
development.
2.3 Command Line Interface (CLI)
2.3-R1 The CLI must be usable remotely over low-speed modem lines
or network connections, with ASCII terminal emulation.
2.3-R2 The interface must have the facility to read in command files
for batch processing of a sequence of CLI commands.
2.3-R3 The CLI should have command-line recall and editing,
name completion and minimum match where appropriate.
2.3-R4 All functionality of the GUI must also be available in CLI
mode.
2.4 Interface programming, parameter passing and feedback
2.4-R1 Must have basic programming facilities such as:
R1.1 variable assignment and evaluation
R1.2 conditional statements
R1.3 control loops
R1.4 string manipulation
R1.5 user-defined functions and procedures
R1.6 standard mathematical operations
R1.7 vector and matrix manipulation
2.4-R2 Commands executed should be logged, with provision to
re-execute the session.
2.4-R3 Input parameter checking upon parsing with reporting of
incorrect, suspicious or dangerous choices should be
done before execution where possible.
2.4-R4 Parameters should be passable between applications in as
transparent a manner as possible. However, "global" variables
should not be the default, unless designated specifically by
the user-programmer.
2.4-R5 Application variables should be named consistently and as
clearly as possible indicating their intended use using
astronomical terms where possible.
2.4-R6 There shall be no hidden parameters, all changeable
parameters in all actions must be accessible. In complex
cases, this should (must?) be through an hierarchical
interface with the most important parameters accessible
directly, and the others through sub-menus.
2.5 Documentation and help facility
2.5-R1 The package creators must provide comprehensive and user
comprehensible documentation for all parts of the package.
2.5-R2 There should be a variety of help levels and documentation
formats accessible from the UI and over the Internet,
applicable to novices, experts, and technical users.
These would include:
R2.1 user cookbooks with extensive examples
R2.2 application descriptions and reference manual
R2.3 programmer references
R2.4 data format descriptions
R2.5 algorithm descriptions
R2.6 online help, FAQ, email contacts
R2.7 release history, bug reports and tracking, patch descriptions
R2.8 newsletters, email exploders, notes series
These would be maintained by the package providers, with
help from the ALMA project.
R2.9 These should be in printer-friendly formats.
R2.10 Optional native HTML formats are desirable.
2.5-R3 Help should be context-sensitive. In GUI mode, fly-over
banners should indicate use of buttons and fields, and
clickable help buttons should be available on all pages.
2.5-R4 Can direct user to Web page, although in CLI mode,
must support in-line text based help also.
2.5-R5 Full search capability must be built into documentation
library.
OL-3.0 Data Handling
3.1 General data requirements
3.1-R1 The package must support data taken in any of the available
ALMA hardware modes in the most appropriate manner.
3.1-R2 Must be able to handle the integrated data objects corresponding
to the observational programs carried out by ALMA. These
objects may be implemented in any manner appropriate, though
relations between the components of the object must be
maintained through some mechanism. These include:
R2.1 Program header information
R2.2 Observation status information (and schedules themselves)
R2.3 Field information
R2.4 Coherence function (visibility) data from interferometer
in all available polarization products, spectral channels,
frequency bands, IFs, including auto-correlations
R2.5 Auto-correlations in single-dish total power modes
R2.6 Weights and/or data uncertainties
R2.7 Flagging data or masks
R2.8 Diagnostic data and errors
R2.9 A-Priori calibration data (bandpasses, flux densities,
polarization leakages, etc.)
R2.10 Derived calibration data (gain tables, flux bootstraps,
etc.).
R2.11 Images and/or models produced from data
R2.12 Processing history
3.1-R3 Must support data taken in one or more polarization products,
spectral channels, frequency bands, IFs. Transformation must
be provided to the desired Stokes output parameter(s) with
facility for spectral or band averaging.
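Comment: As an illustration only, the transformation to Stokes
parameters and the band averaging in 3.1-R3 could be sketched as below.
This assumes ideal linear (X,Y) feeds, no leakage, zero parallactic
angle, and one common sign convention (XX = I+Q, XY = U+iV); the
function names are hypothetical and not part of any existing package.

```python
def stokes_to_linear(i, q, u, v):
    """Ideal linear-feed correlations (XX, XY, YX, YY) for given Stokes values."""
    return i + q, u + 1j * v, u - 1j * v, i - q


def linear_to_stokes(xx, xy, yx, yy):
    """Invert stokes_to_linear: recover (I, Q, U, V) from the four products."""
    i = (xx + yy) / 2
    q = (xx - yy) / 2
    u = (xy + yx) / 2
    v = (xy - yx) / 2j
    return i, q, u, v


def band_average(values, weights=None):
    """Weighted average over spectral channels (uniform weights by default)."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)
```

A real implementation would also have to apply the polarization
calibration and parallactic angle rotation before this step.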
3.1-R4 Multiple pointing centers for mosaics must be supported.
3.1-R5 Multiple subarrays must be supported.
3.1-R6 Data taken in arbitrary scanning patterns must be dealt with.
3.1-R7 The flagging mask must be maintained and associated with the
data it refers to during any subsequent operations (such as
splitting of data sets).
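Comment: A minimal sketch of what "maintained and associated" could
mean in practice: the flag column travels with the data through a
split, and flagged records can optionally be dropped on export. The
Dataset class and its fields are hypothetical, chosen only for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    times: list   # one timestamp per record
    data: list    # one data value per record
    flags: list   # True means "flagged as bad", aligned with data

    def split(self, keep):
        """Return the records for which keep(time) is true, flags included."""
        idx = [k for k, t in enumerate(self.times) if keep(t)]
        return Dataset([self.times[k] for k in idx],
                       [self.data[k] for k in idx],
                       [self.flags[k] for k in idx])

    def export(self, drop_flagged=False):
        """Return (time, value) pairs, optionally omitting flagged records."""
        return [(t, d) for t, d, f in zip(self.times, self.data, self.flags)
                if not (drop_flagged and f)]
```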
3.1-R8 Calibration and ancillary monitoring data must be preserved
3.1-R9 Comprehensive and understandable processing history information
for the data must be maintained and be exportable.
3.1-R10 Distinctions between "single-source", "multi-source",
single-dish, and interferometer datasets should be avoided
with context built into the dataset or header.
3.1-R11 When sorting or indexing is desirable for performance
enhancement, this should be carried out in a manner
transparent to the user.
3.1-R12 Concatenation of datasets should be straightforward and
robust. Extraction and reinsertion of data subsets
should be supported.
3.1-R13 Users should have access (at the manipulation level) to all
aspects of the data including the header.
3.1-R14 The package must support locking data files so that there
is no possibility of one process corrupting a file that
is also being written to by another process. The model
should be: "one writer, multiple readers."
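Comment: One possible way to realize the "one writer, multiple
readers" model on a POSIX system is advisory locking with shared and
exclusive locks. The sketch below uses Python's standard fcntl.flock;
the helper name "locked" is hypothetical, and advisory locks only
protect against processes that also use them.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def locked(path, writer=False):
    """Open path with a shared (reader) or exclusive (writer) advisory lock.

    Raises BlockingIOError at once if an incompatible lock is already held.
    """
    mode = "r+b" if writer else "rb"
    kind = fcntl.LOCK_EX if writer else fcntl.LOCK_SH
    with open(path, mode) as handle:
        fcntl.flock(handle, kind | fcntl.LOCK_NB)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Any number of readers may hold the shared lock together, but a writer
is refused while any reader or other writer holds the file.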
3.2 Data import and export
3.2-R1 Standard data formats (e.g. FITS) must be supported for both
input and output without loss of functionality or information,
though need not be the native format for both the package and
archive. The project will maintain a list of formats which the
package must support.
3.2-R2 Access to the archive must be supported, including for data
from the currently active observing session. Security and
integrity of the archive must be ensured during these
operations.
3.2-R3 Disk and offline data storage (e.g. DAT, DDS, DLT) must be
supported. The project will maintain a list of media which
the package must support.
3.2-R4 The ability to drop flagged data on export should be
included.
3.3 I/O speed and efficiency
3.3-R1 I/O of data must not be a bottleneck for processing, especially
for pipeline use. The definition of what constitutes a
"bottleneck" and what I/O throughput rate is acceptable must
be defined at each stage of ALMA operations (e.g. interim
science, full stand-alone ALMA, ALMA + ACA) and in each mode
(e.g. quick-look pipeline, offline use).
Comment: This is especially true if the native format
of the package is not used and filling/conversion is necessary.
For offline use, the intention is that users not be
faced with I/O operations that are way out of line with the
fastest equivalent times that could reasonably be achieved
with software development.
3.4 ALMA interferometer data
3.4-R1 Correlation products accumulated at multiple bit depths
(16-bit,32-bit) must be supported transparently
3.4-R2 On-line gain correction data must be carried along with
data
3.4-R3 Calibration tables and editing information must be associated
with the data and preserved on output
3.5 ALMA single dish and interferometer phased-array data
3.5-R1 Data taken with nodding secondary must be supported, as
a function of nodding phase
3.5-R2 Total power and phased array data sequences must be supported,
with the scanning pattern preserved
3.6 Images and other Data Products
3.6-R1 Standard multi-dimensional images must be supported, such as:
R1.1 Spectra and image slices (1D)
R1.2 Planar images (2D)
R1.3 Spectral and Time Cubes (3D)
3.6-R2 Standard derived data products must be supported, such as:
R2.1 Models (e.g. CLEAN models, Gaussian models, wavelets,
pixons)
3.6-R3 Blanking of pixels (magic-value) must be maintained through
the processing of images.
3.7 Foreign data
3.7-R1 Data produced by other interferometers and single dishes in
similar observing modes should be importable and processable
if provided in a standard data exchange format.
3.7-R2 Imaging data in standard formats (e.g. FITS) from astronomical
instruments at different wavelengths should be importable,
with the ability to combine (coadd) these with ALMA data where
appropriate. This should be through a set of widely used
formats, with a minimal list of supported standards
established by the project.
3.8 Interaction with the Archive
3.8-R1 The interface between the package and archive must be able to
provide data access (when such access is granted)
without interfering with other access to the archive.
3.8-R2 Products of imaging and data analysis should be able to be
archived in association with the data if desired. This means
output formats supported by the archive must be supported
by the package.
OL-4.0 Calibration and Editing
4.1 General calibration and editing requirements
4.1-R1 The package must be able to reliably handle all designated ALMA
standard calibration modes, including but not limited
to temperature controlled loads, semi-transparent vanes,
apex calibration systems, WVR data, noise injection,
fast-switching calibration transfer, planetary observations.
4.1-R2 Calibration, editing, flagging, and correction of data in
the off-line and pipeline package should be easily reversible
within the process (i.e. not requiring re-reading of the data
from the archive). Logging of individual editing steps should
be clearly marked in some sort of history table (possibly
distinct from a more readable history) with individual edits
undo-able.
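Comment: One way to make individual edits undo-able is to keep each
edit as its own entry in a history table and derive the current flag
mask from the entries still active, so the data are never modified.
This is a sketch under that assumption; the EditLog class is
hypothetical.

```python
class EditLog:
    """History table of flagging edits; the mask is the union of active edits."""

    def __init__(self, n_records):
        self.n_records = n_records
        self.edits = []

    def flag(self, indices, reason):
        """Record an edit and return its id for later reference."""
        edit_id = len(self.edits)
        self.edits.append({"id": edit_id, "reason": reason,
                           "indices": frozenset(indices), "active": True})
        return edit_id

    def undo(self, edit_id):
        """Deactivate one edit without touching any other edit."""
        self.edits[edit_id]["active"] = False

    def mask(self):
        """Current flag mask: True where any active edit covers the record."""
        flagged = set()
        for edit in self.edits:
            if edit["active"]:
                flagged |= edit["indices"]
        return [k in flagged for k in range(self.n_records)]
```

Because overlapping edits are stored separately, undoing an early edit
leaves records flagged if a later edit also covers them.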
4.1-R3 Data display and editing should be effected through generic
tools applicable to both single-dish and interferometer modes.
These should, as far as possible, present
similar interfaces to the user and have the same look-and-feel.
4.1-R4 Data editing and flagging should be possible based upon
array and environmental monitoring data.
4.1-R5 Data calibration, correction and flagging should be possible
based upon standard or user-defined models in either functional
or tabular form.
4.1-R6 Calibration should involve flexible averaging of data and
calibration quantities with user-controllable interpolation,
filtering, weighting, and application scope.
4.1-R7 Interactive data editing, calibration, and display of
calibration quantities should be largely graphical and
intuitive, with user-definable setups. Displays of
greyscale (e.g. TVFLG style) or line plots (Difmap style)
should be options. Visualization of the calibration results
will be key to obtaining robust and trustworthy calibration
with ALMA.
Specialized editing display tools should include:
R7.1 Specification of data by selection of time range, uv
range, pointing center
R7.2 Displays of spectra and spectral cubes, with time and/or
channel averaging
R7.3 For interferometer data, amplitude (phase) vs. time on
each baseline (Difmap vplot), time-baseline (TVFLG)
with interactive zoom and clipping.
R7.4 Editing based on difference from a running mean, or rms
in boxcar, or difference versus model.
R7.5 Cuts through a data plane or cube.
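Comment: As an illustration of R7.4, the sketch below flags samples
that differ from the running mean of their neighbours by more than a
fixed amount. The boxcar half-width and threshold are hypothetical
parameters, and the central sample is excluded from its own mean so a
single spike does not hide itself.

```python
def flag_by_running_mean(values, half_width, max_diff):
    """Flag samples deviating from the boxcar mean of their neighbours.

    values is a list with at least two samples; half_width must be >= 1.
    """
    flags = []
    for k, value in enumerate(values):
        lo = max(0, k - half_width)
        hi = min(len(values), k + half_width + 1)
        neighbours = values[lo:k] + values[k + 1:hi]
        mean = sum(neighbours) / len(neighbours)
        flags.append(abs(value - mean) > max_diff)
    return flags
```

The same structure would serve for clipping on the difference from a
model, by replacing the running mean with the model value.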
4.1-R8 Editing should be incorporated into most visualization tools
where data or data-derived quantities are plotted, such as
from calibration solutions, amplitude vs. uv-distance plots,
or any number of other plots. A "see-it, flag-it" capability
should be the standard within the tools.
4.1-R9 Automatic editing tools should be available in the package.
4.1-R10 Access to time history of calibration information such as
source catalogs containing flux density histories, planetary
ephemerides, noise tube values, should be built into
calibration engines. Output of calibration procedures should
be exportable into similar structures.
4.2 Interferometer data
4.2-R1 Antenna-based determination of calibration quantities such
as gains, polarization leakages, bandpasses, is the primary
form of calibration where appropriate.
4.2-R2 In addition to antenna-based calibration, baseline dependent
corrections will also be supported. For example,
coherence loss due to atmospheric phase fluctuation depends on
baseline length (this aspect will be more important at higher
frequencies) and must be taken into account if some of the
WVR corrections are deemed incorrect while others are
applied. Also, in general, the bandpasses are baseline
dependent and contain non-closing terms.
4.2-R3 Gain corrections will be made based on differences between
observed and modeled data quantities, possibly with iteration
(e.g. self-calibration and determination of gains using
calibration sources). Where solutions are discrepant or
poor, automatic editing should be possible.
4.2-R4 Redundancy (e.g. same or crossing baselines) should be used
wherever possible to increase accuracy of or check calibration
solutions. Editing based on this comparison should be possible.
4.2-R5 Calibration quantities (possibly stored in tables or data
structures) should be transferable between sources, possibly
after interpolation, extrapolation or smoothing. This will
be the primary method of phase calibration transfer using
fast-switching between source and calibrator.
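Comment: A minimal sketch of calibration transfer, assuming
antenna-based complex gains solved on two calibrator scans that
bracket the target. Amplitude and phase are interpolated linearly in
time, with the phase taken along the shortest arc so that a wrap
through 180 degrees is handled. The function names are hypothetical.

```python
import cmath


def interpolate_gain(t, t0, g0, t1, g1):
    """Interpolate a complex gain between solutions g0 at t0 and g1 at t1."""
    w = (t - t0) / (t1 - t0)
    amplitude = (1 - w) * abs(g0) + w * abs(g1)
    dphi = cmath.phase(g1 / g0)  # phase change wrapped to (-pi, pi]
    return cmath.rect(amplitude, cmath.phase(g0) + w * dphi)


def apply_gains(vis_obs, g_i, g_j):
    """Correct a visibility on baseline i-j: V = V_obs / (g_i * conj(g_j))."""
    return vis_obs / (g_i * g_j.conjugate())
```

Smoothing or extrapolation would replace the linear weight w with
another kernel; the application step stays the same.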
4.2-R6 Determination of, correction for, and examination of closure
errors should be straightforward to carry out.
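Comment: For reference, the closure phase on a triangle of antennas is
the phase of the product of the three visibilities taken around the
triangle. Antenna-based gain phases cancel in this product, so any
residual on a point source indicates a closure (baseline-based) error.
The sketch below demonstrates this with made-up gains.

```python
import cmath


def closure_phase(v12, v23, v31):
    """Phase of the triple product of visibilities around a triangle."""
    return cmath.phase(v12 * v23 * v31)


# Made-up antenna gains with arbitrary amplitudes and phases.
gains = [cmath.rect(1.1, 0.7), cmath.rect(0.9, -1.3), cmath.rect(1.0, 2.1)]


def observed(i, j, true_vis=1 + 0j):
    """Visibility on baseline i-j corrupted only by antenna-based gains."""
    return gains[i] * gains[j].conjugate() * true_vis


# Point source: antenna phases cancel and the closure phase is zero.
clean = closure_phase(observed(0, 1), observed(1, 2), observed(2, 0))

# A 0.1 rad baseline-based error on 0-1 shows up directly in the closure.
bad = closure_phase(observed(0, 1) * cmath.rect(1, 0.1),
                    observed(1, 2), observed(2, 0))
```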
4.2-R7 Determination of the complex bandpass using calibration source
observations, and transfer to target sources, should be
simple and robust.
4.2-R8 Interferometric pointing, baseline, and beam response fitting
should be available.
4.2-R9 Determination of polarization calibration quantities such as
leakage (D-term or Jones matrix) and complex gain difference
must be an integral part of the package, using both linearized
and full matrix calculations.
4.2-R10 Incorporation of standard models (e.g. planetary disks, models
for HII region structure, known source spectra) should be
easy for calibration operations.
4.3 Single dish data
4.3-R1 Processing for pointing, tipping, focusing, beam-fitting data
must be available.
4.3-R2 Straightforward and flexible fitting of spectral bandpass from
calibration source observations is required
4.3-R3 Calibration of temperature controlled loads and noise sources
from observations of celestial sources should be supported
4.3-R4 Final data scaling must be possible in
case the 1% absolute calibration fails for an
unexpected reason. A user may want to apply
their own manual scaling, e.g. by referring to
the line intensity at the map reference center, to
compensate for the daily/time variation.
4.3-R5 For single-dish OTF observations
with the two orthogonal scan directions
(e.g., N-S and E-W), the intensity scale can be
adjusted to minimize the "scan effect" due to slight
variation of gain.
4.4 Mosaicing considerations
4.4-R1 Individual data points must be associated with pointing
center information, and one must have the ability to
deal with complex scanning strategies.
4.4-R2 Determination of and correction for pointing offsets and
the beam shape is critical to the ability to reliably mosaic
using ALMA, and thus must be available in several algorithmic
forms in the package.
4.4-R3 The complex polarization response of the telescope beams must
be calibratable (though this is mostly an imaging step).
4.4-R4 Careful cross calibration of the flux scales between ALMA
interferometric data and single dish data is required for high
fidelity imaging (important and more difficult for ACA data).
There must be tools to cross-check and correct the relative
calibration between mosaics and different component
observations.
4.5 Ancillary and diagnostic data
4.5-R1 Environmental data such as weather (e.g. wind speed,
temperature, dew point) should be available for editing
or calibration procedures, and easily incorporated into
user-specified calibration models.
4.5-R2 Engineering monitoring information such as temperature
sensor readings and tilt-meter outputs, perhaps included
as ancillary tables attached to data files with special
keywords, should be readable and incorporated into the
calibration and editing process.
4.5-R3 Output from the atmospheric monitoring (e.g. WVR, FTS)
instrumentation should be processed and used by calibration
software.
4.5-R4 Pointing, focus and subreflector information must be
dealt with appropriately.
OL-5.0 Imaging
5.1 General imaging and analysis requirements
5.1-R1 Imaging data selection from any combination of ALMA exported
data, the ALMA archive, or other instruments supporting common
export formats must be provided.
5.1-R2 Efficient selection of subsets of the imaging data must be
provided.
5.1-R3 Provision must be made for the utilization and development
of a variety of imaging, deconvolution, and analysis
algorithms (e.g. flavors of CLEAN, MEM, linear and non-linear
mosaics)
5.1-R4 Astrometric accuracy must be preserved over phase-calibration
distances of at least 5 degrees.
5.1-R5 Images made with different equinoxes (e.g. B1950 and J2000),
different coordinate systems (RA,DEC and l,b), or
different projections (tangent, sinusoidal, ...)
can be merged and compared appropriately.
5.1-R6 Data cubes using different velocity definitions (optical or radio
definition for Doppler velocity) must be merged appropriately.
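Comment: The two conventions differ in the denominator: the radio
definition is v = c (f0 - f) / f0 and the optical definition is
v = c (f0 - f) / f, where f0 is the rest frequency. A sketch of the
conversion needed before merging cubes follows; the function names are
hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s


def radio_velocity(freq, rest_freq):
    """Radio-definition velocity in km/s (frequencies in the same units)."""
    return C_KMS * (rest_freq - freq) / rest_freq


def optical_velocity(freq, rest_freq):
    """Optical-definition velocity in km/s (frequencies in the same units)."""
    return C_KMS * (rest_freq - freq) / freq


def radio_to_optical(v_radio):
    """Convert a radio-definition velocity to the optical definition."""
    return v_radio / (1 - v_radio / C_KMS)
```

The two definitions agree only to first order in v/c, so channel
spacings as well as channel centers differ between such cubes.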
5.1-R7 Image pixel blanking should be supported.
5.2 Interferometer imaging
5.2-R1 High-fidelity imaging of the entire primary beam in all
Stokes parameters is the primary goal - therefore,
incorporation of the polarized primary beam response of the
array is required.
5.2-R2 Imaging must deal seamlessly with mosaiced data, with proper
gridding in the uv-plane and compensation for primary beam
effects and pointing in such a manner as to mitigate the
effects of non-coplanar baselines and sky curvature. A
variety of options for gridding and beam correction should
be available at user request.
5.2-R3 There must be seamless integration of data from multiple
epochs and configurations
5.2-R4 There must be the ability to include short-spacing data
taken in single-dish mode (both ALMA and non-ALMA data)
5.2-R5 Subtraction of continuum level from spectral data is required.
This can be done in both the Fourier and image domain.
In the case of uv-plane subtraction, flexible setting of the
frequency channel ranges for the calculation of the continuum
level (graphically as well as CLI) should be available.
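Comment: The simplest uv-plane form of 5.2-R5 estimates the continuum
as the mean of the complex visibility over user-selected line-free
channel ranges and subtracts it from every channel. The sketch below
assumes a flat (zeroth order) continuum and inclusive channel ranges;
the function name is hypothetical.

```python
def subtract_continuum(spectrum, line_free_ranges):
    """Subtract the mean over line-free channels from a visibility spectrum.

    spectrum is a list of complex visibilities for one baseline and time;
    line_free_ranges is a list of inclusive (first, last) channel pairs.
    """
    channels = [c for first, last in line_free_ranges
                for c in range(first, last + 1)]
    continuum = sum(spectrum[c] for c in channels) / len(channels)
    return [vis - continuum for vis in spectrum]
```

A higher-order fit across the band would replace the mean with a
polynomial in channel number.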
5.2-R6 The creation of 3D images for rotating objects (e.g. planets)
should be supported.
5.3 Single dish imaging
5.3-R1 The package must be able to produce an image from data
observed at different spacings (or even at random positions);
such data must be spatially interpolated or re-gridded correctly
(e.g. 1 arcmin RA-DEC grid observations and 1 arcmin l-b grid
observations; two sets of 1 arcmin grid observations using
two different map reference centers (0,0) and (0'.2, 0'.6)).
5.3-R2 Capability to produce a stamp map (profile map).
5.4 Mosaicing considerations
5.4-R1 Combination of interferometer and single-dish data into
mosaic imaging is essential.
5.4-R2 Careful (polarized) primary beam correction and pointing
correction is critical for high fidelity mosaic imaging
and must be incorporated into the mosaicing algorithms.
5.5 Inclusion of the ACA
[volunteers?]
OL-6.0 Analysis
6.1 General analysis requirements
6.1-R1 The astronomer must have the capability to develop
their own tools or tasks, with easy access to data
and images, and straightforward interface with the
package
6.1-R2 Translation between various astronomical quantities
and units (e.g. Jy and K, MHz and km/s) should be
straightforward and user selectable.
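Comment: As an example of such translations, the sketch below converts
Jy/beam to brightness temperature and a frequency width to a velocity
width. It assumes the Rayleigh-Jeans approximation (which becomes
inaccurate at the higher ALMA frequencies), a Gaussian beam, and the
radio velocity definition; the function names are hypothetical.

```python
import math

C = 299792458.0                     # speed of light, m/s
K_B = 1.380649e-23                  # Boltzmann constant, J/K
JY = 1e-26                          # 1 Jy in W m^-2 Hz^-1
ARCSEC = math.pi / (180 * 3600)     # 1 arcsec in radians


def beam_solid_angle(bmaj_arcsec, bmin_arcsec):
    """Solid angle in sr of a Gaussian beam given its FWHM axes."""
    return (math.pi * bmaj_arcsec * bmin_arcsec * ARCSEC ** 2
            / (4 * math.log(2)))


def jy_per_beam_to_kelvin(flux_jy, freq_hz, bmaj_arcsec, bmin_arcsec):
    """Rayleigh-Jeans brightness temperature for a flux density per beam."""
    omega = beam_solid_angle(bmaj_arcsec, bmin_arcsec)
    return C ** 2 * flux_jy * JY / (2 * K_B * freq_hz ** 2 * omega)


def mhz_to_kms(width_mhz, rest_freq_mhz):
    """Velocity width in km/s for a frequency width (radio definition)."""
    return (C / 1000.0) * width_mhz / rest_freq_mhz
```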
6.2 Visibility data analysis
6.2-R1 UV-plane analysis based on goodness-of-fit to a model
will be required.
6.3 Single dish data analysis
6.3-R1 Automatic measurement of line parameters (line intensity,
integrated intensity, Gaussian-fit line width,
rms noise level, ...) for a user-specified velocity (frequency)
window must be provided, with the results stored in a text-format
list file that can be output by the user if desired.
6.3-R2 Effective, robust and precise spectral baseline removal
facility is required. Fourier analysis of standing waves and
their removal will be a critical task.
6.3-R3 Fourier transform to a pseudo-uv plane "image" will be needed.
6.4 Mosaicing and combined array analysis
6.4-R1 Seamless transformation between image-plane and uv-plane
analysis is necessary.
6.5 Image analysis and manipulation
6.5-R1 The ability to extract lower-dimensional "slices" from
n-dimensional data cubes efficiently is required.
6.5-R2 The ability to collapse or integrate over sub-dimensions
of data cubes in order to form "moments" is required.
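Comment: For concreteness, the zeroth moment is the intensity
integrated over the velocity axis and the first moment is the
intensity-weighted mean velocity. The sketch below assumes a cube
stored as nested lists indexed [channel][y][x] with uniformly spaced
velocities; the function name is hypothetical.

```python
def moment_maps(cube, velocities):
    """Return (moment0, moment1) maps for a cube indexed [channel][y][x].

    moment1 is None wherever the summed intensity is zero.
    """
    dv = abs(velocities[1] - velocities[0])
    ny, nx = len(cube[0]), len(cube[0][0])
    moment0 = [[0.0] * nx for _ in range(ny)]
    moment1 = [[None] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            spectrum = [plane[y][x] for plane in cube]
            total = sum(spectrum)
            moment0[y][x] = total * dv
            if total != 0:
                weighted = sum(i * v for i, v in zip(spectrum, velocities))
                moment1[y][x] = weighted / total
    return moment0, moment1
```

In practice the sums would skip blanked pixels and apply an S/N
window, since noise-only channels degrade the first moment.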
6.5-R3 Blanking of pixels (magic-value) must be maintained through
the analysis process. It is desirable that blanks not be
destructive (the original pixel value is retained), and it be
possible to turn on and off different blanking ("mask") levels.
R3.1 Interactive and automatic facilities for setting of
blanking parameters (e.g. windowing, S/N based blanking)
to avoid degrading S/N in the analysis must be provided.
OL-7.0 Visualization
7.1 General visualization and plotting requirements
7.1-R1 Plotting and display capabilities should be integrated into
the GUI tools throughout the package. Where possible the
displays should have similar look and feel to reduce the
plotting learning curve.
7.1-R2 Plotting of data and calibration quantities as a function of
time (LST,UT,etc.) in both standard X-Y and image cube
(with additional axes) must be available.
7.1-R3 Where appropriate, editing and flagging capability should be
incorporated into all plots. Basically, if you "see-it"
you should be able to "flag-it".
7.1-R4 It should be possible to output the display in many
different designated formats, including FITS, PostScript,
PDF, GIF and JPEG.
7.2 Display appearance and interactivity
7.2-R1 User should be able to produce overlays of different data
sets of standard formats. It should be possible to place these
data sets in layers which can be switched on and off
separately. The different images should be editable, and it
should be possible to declare certain colors transparent. It
must be possible to shift, rotate and scale the images at
will.
7.2-R2 It must be possible to display and overlay data with
different coordinate systems, i.e. the coordinate system of
the display can be chosen independent of the system the
data were observed in.
7.2-R3 Both contour plots with variously colored and styled lines
and false color maps should be possible. It should also be
possible to produce RGB overlays (i.e. one layer gets
assigned intensity scales of red, another one of green,
and one of blue), or Hue/Intensity/Saturation overlays.
7.2-R4 User should be able to manipulate intensity and color
scales easily and graphically. The setup achieved should
be saveable and reloadable.
7.2-R5 User should be able to add annotation, both
interactively and through scripts, for publication quality
plots, i.e. text with various fonts (including Greek
letters), symbols (e.g. all the symbols provided by the
LaTeX package with AMSTeX extension), arrows, geometrical
figures like boxes and circles etc. Different line
styles, sizes, thicknesses and colors for all those should
be available. The various elements should be editable and
removable separately, and it should be possible to put
them in a separate layer.
7.2-R6 The display tool should have astrometry facilities, i.e.
based on catalogs and by assigning sources in the maps it
should be possible to calculate the coordinate system.
7.2-R7 The display for spectra must be linked to molecular data
bases which make identification of the lines possible.
7.3 Image-cube manipulation
7.3-R1 Interactive display of spectra corresponding to a displayed
image should be supported. For example, clicking on the image
map on the display should bring up the spectrum for the position
nearest the cursor position, and dragging a line on the map
should bring up a position-velocity diagram.
7.3-R2 Plotting of spectra on a pseudo-grid corresponding to
position on a raster (e.g. a "stamp map" or "profile map",
basically thumbnail spectra in panels corresponding to
position) should be possible.
7.3-R3 Data cubes should be viewable as movies with varying
speeds.
7.3-R4 It should be possible to view arbitrary subsets or
slices of data cubes.
OL-8.0 Simulation
8.1 General simulation requirements
8.1-R1 There must be simulation capability for interferometer and
single dish observation with ALMA in all modes, for planning
(with the ObsTool) and comparison of data with models
(for editing and correction). Various simulator components
with different levels of complexity and execution speed will
be necessary to carry out desired tasks, such as:
R1.1 Level 1 - Simple expected sensitivity levels given
integration time, configuration, mosaicing strategy,
atmospheric quality limits, used for proposal and schedule
preparation in the Observing Tool, though also useful
offline to check basic data properties. Timescale for
execution should be 0.1-5 minutes and should not require
significant computational resources.
R1.2 Level 2 - Basic dataset simulation tools to generate
fake data given observing parameters and simple models of the
instrument and atmosphere. These should include error
generation for thermal noise, pointing, primary beam,
atmosphere temperature and mean structure function,
antenna, optics, receiver, and correlator efficiencies, etc.
This is primarily for use in the pipelines and also for
testing of other software components. Timescale for
execution should be 1-30 minutes and may require significant
cpu and memory resources.
R1.3 Level 3 - More complex instrumental modeling, most useful
for project staff and engineers. Timescale of execution is
indefinite and will likely require special resources (e.g.
parallel computing).
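Comment: A Level 1 estimate can be as simple as the standard
point-source sensitivity formula for an array of N identical antennas,
sigma = 2 k Tsys / (A_eff sqrt(n_pol N (N-1) bandwidth time)). The
sketch below ignores correlator efficiency and atmospheric
attenuation; the default efficiency and the function name are
assumptions for illustration only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def point_source_rms_jy(tsys_k, n_ant, dish_diameter_m, bandwidth_hz,
                        t_int_s, efficiency=0.7, n_pol=2):
    """Thermal-noise rms in Jy for an array of identical antennas."""
    effective_area = efficiency * math.pi * (dish_diameter_m / 2) ** 2
    samples = n_pol * n_ant * (n_ant - 1) * bandwidth_hz * t_int_s
    sigma_si = 2 * K_B * tsys_k / (effective_area * math.sqrt(samples))
    return sigma_si / 1e-26
```

Such a function runs in well under a second, consistent with the
intent that Level 1 tools need no significant computational resources.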
8.1-R2 The output of the simulator must be compatible with the
rest of the offline package, and with the ALMA pipeline.
It should be available in all ALMA data format(s).
8.1-R3 The speed of the simulator must be commensurate with the
desired feedback time. For instance, if used with the
real-time system to assess quality the simulator must
respond in minutes; if used for proposer feedback in the
ObsTool it should also respond on minute
timescales for most simple experiments; while for complicated
engineering simulations it may be allowed to take
correspondingly longer.
8.1-R4 Relevant parts of the simulator (e.g. simple single field and
mosaic dataset generation with thermal noise and pointing
errors) should be available early in
the software production cycle in order to use it to test other
components of the package.
8.2 Interferometer simulation
8.2-R1 All correlator modes should be supported in the simulation.
8.2-R2 A primary use of simulation capability of the package is
to compute expected visibilities for trial models during the
calibration, imaging, or analysis stages, and thus must be
integrated to this extent with those tools.
8.2-R3 Realistic inclusion of telescope primary beam response
(with polarization), gain fluctuations, bandpasses,
atmospheric effects, correlator errors, and other effects
must be supported. It must be relatively easy for the
user to modify the simulator to include new error terms.
8.3 Single dish simulation
8.3-R1 All scanning and subreflector nutation modes must be supported.
8.3-R2 Baseline drift and standing wave effects must be included in
spectra.
8.4 Incorporation of previous or foreign data
8.4-R1 Simulation software must have the capability to incorporate
ALMA data taken previously, assuming the observer has been
granted access to this data.
8.4-R2 Simulation software must have the capability to incorporate
user-supplied input models and/or data in ALMA-supported form.
8.5 Interaction with Observing Tool
8.5-R1 The primary use of the ALMA simulator is to provide guidance
during the proposal and schedule preparation phases. As such,
it is critical that the relevant simulation software be
compatible with the Observing Tool, preferably integrated
seamlessly into its interface.
OL-9.0 Special Features
9.1 VLBI support
9.1-R1 It is assumed that the major processing of VLBI data from ALMA
will be outside the ALMA package, and only parts of the
processing necessary to produce usable single-dish or phased
array data and the associated calibration must be supported by
ALMA software.
9.1-R2 ALMA must provide the export of VLBI data in a suitable format.
9.2 Solar observing mode
9.3 Pulsar observing
9.4 Other modes
-------------------------------------------------------------------------------
Section 4: Common Algorithms
-------------------------------------------------------------------------------
Do we want a section here to point to in the previous sections? Who will
write this part?
-------------------------------------------------------------------------------
Appendices
-------------------------------------------------------------------------------
A. The ACA - to be written
B. Example pipeline parameters
C. Example calibration outline
-------------------------------------------------------------------------------
Appendix B: Barry Clark's list of input parameters needed for each procedure
-------------------------------------------------------------------------------
SCIENCE PIPELINE SCRIPTS
Script reduceSingleField
Parameters:
Field size
Pixel size
Stokes (or other polarization representations) to be imaged
Coordinate system of map (raDec, galactic, what have you)
Rotation of coordinates (in given system)
RA (of map center (probably rashift for moving objects like asteroids))
Dec (of map center (or galactic latitude, or whatever))
Gridding parameter (eg "robustness", or superuniform size)
Permissible degree of shadowing
Predeconvolution u,v taper
Deconvolution method
Deconvolution parameter(s) (eg parameter for automatic CLEAN boxing)
Restoring beamsize
Self-cal request [Y/N]
Self-cal control parameter(s) (eg, how many iterations, phase only or
amp every n cycles, etc.)
Self-cal validity parameter (One must have an algorithm for deciding if
it makes sense to apply a self-cal solution. At cm wavelengths, it
makes sense to look at the corrections computed for adjacent
integrations and reject things that look like noise. Not clear if
this works at mm wavelength, but something is desperately needed, and
it may well need a parameter to control it.)
Correct antenna beam [Y/N]
Pointer to an image in the archive to use for a starter image for selfcal
or a default image for MEM.
Number of dataSources
For each dataSource,
Data pointer sufficient to retrieve the requisite data from the archive
Format of the pointer may be a little tricky - it needs to be able
at least to handle locating data on the current source from all
sessions of the current project and also data from other projects
located by the observer at proposal phase II time. (Adding in data
from other instruments should be done outside the pipeline context,
I think.)
Also data pointers into the calibration archive to retrieve
Antenna Tsys spectra
Sideband ratios
Flux calibrations
Bandpass/delay calibrations
Phase calibration solutions
Amplitude calibration solutions
Polarization calibrations
Environment temperature and pressure
WVR data
Number of images
For each image, for each dataSource
A weight
Channel selection (I'd put a bitmap at this point, with conversion from
a more human-readable form at observeTool time, but whatever; there
are good arguments for something more sophisticated.)
Information assumed present with the correlation data
u,v
Time and date
Flagging and shadowing information
? maybe Total power from detectors
Information about the observation from archive headers
Phase stopping center
Frequency(ies) of observation
Sideband separating/sideband suppressing mode
Usage of WVR data
Image labeling information
Additional information from system catalogs
Antenna beam shape
Gain curves
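// As an illustration of the channel selection parameter above, a
// human-readable selection could be converted to a bitmap along the
// following lines. The "0-3,10" text syntax and the function names are
// hypothetical; the list does not specify either.

```python
def channels_to_bitmap(selection, n_chan):
    """Convert text such as "0-3,10" to an integer with one bit per channel."""
    bitmap = 0
    for part in selection.split(","):
        first, _, last = part.strip().partition("-")
        lo = int(first)
        hi = int(last) if last else lo
        if not 0 <= lo <= hi < n_chan:
            raise ValueError(f"bad channel range: {part!r}")
        for channel in range(lo, hi + 1):
            bitmap |= 1 << channel
    return bitmap


def bitmap_to_channels(bitmap, n_chan):
    """List the channel numbers whose bits are set."""
    return [c for c in range(n_chan) if (bitmap >> c) & 1]
```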
// We will need two types of interferometer mosaic reductions.
// For a survey mode, it is most efficient to treat the observations
// as a set of single fields, with the results simply transferred to
// a single output plane. The true mutual deconvolution mosaic requires
// a lot more computation. In any case, the mosaic reduction requires
// the same information as the single pointing one, except for a
// different form of specification of the output map. Whether one
// has a switch in the script between the two types, or has different
// scripts doesn't matter much.
Script reduceMosaic
mutualDeconvolution? [Y/N]
Description of output area (possibly a set of polygons, but whatever)
// Other parameters same as for single field, except for output field size,
// which is replaced by above
Script reduceAutocorrelationMosaic
Output pixel size
Stokes (or other polarization representations) to be imaged
Coordinate system of map (raDec, galactic, what have you)
Rotation of coordinates (in given system)
Description of area to be mosaiced (polygons or whatever)
Gridding method (eg, interpolate and add, or FFT, add, FFT back)
Permissible degree of shadowing
Number of dataSources
For each dataSource,
Data pointer sufficient to retrieve the requisite data from the archive
Also data pointers into the calibration archive to retrieve
Antenna Tsys spectra
Sideband ratios
Flux calibrations
Bandpass calibrations
Amplitude calibration solutions
Polarization calibrations
Environment temperature and pressure
WVR data
Number of images
For each image, for each dataSource
A weight
Channel selection (I'd put a bitmap at this point, with conversion from
a more human-readable form at observeTool time, but whatever; there
are good arguments for something more sophisticated.)
Information assumed present with the correlation data
Time and date
Pointing center
Flagging and shadowing information
? maybe Total power from detectors
Information about the observation from archive headers
Frequency(ies) of observation
Image labeling information
Additional information from system catalogs
Antenna beam shape
Gain curves
QUICKLOOK PIPELINE SCRIPTS
Script quickLookMosaic (or single pointing)
Parameters
Field size
Pixel size
Coordinate system of map (raDec, galactic, what have you)
Rotation of coordinates (in given system)
Permissible degree of shadowing
Gridding parameter (eg "robustness", or superuniform size)
u,v taper
Correct antenna beam [Y/N]
Minimum time per image (if the next image to be made does not have this
much time, the pipeline sleeps until it does)
Maximum time per image (limits data to keep the pipeline computing demands
within reason)
A set of images to be computed: for each,
RA (of map center (probably rashift for moving objects like asteroids))
Dec (of map center (or galactic latitude, or whatever))
(or could be just an id of a phase tracking center in a mosaic)
Stokes (or other polarization representations) to be imaged
Mask for channels to be added together before gridding
A pointer to data just taken
Also data pointers into the calibration archive to retrieve
Antenna Tsys spectra
Sideband ratios
Flux calibrations
Bandpass/delay calibrations
Phase calibration solutions
Amplitude calibration solutions
Polarization calibrations
Environment temperature and pressure
WVR data
Script quickLookAutocorrelation
Parameters
Output pixel size
Stokes (or other polarization representations) to be imaged
Coordinate system of map (raDec, galactic, what have you)
Rotation of coordinates (in given system)
Description of area to be mosaiced (polygons or whatever)
Data pointer sufficient to retrieve current data from the archive
Also data pointers into the calibration archive to retrieve
Antenna Tsys spectra
Sideband ratios
Flux calibrations
Bandpass calibrations
Amplitude calibration solutions
Polarization calibrations
List of channel selections to be imaged
CALIBRATION SCRIPTS
// For the most part, calibration scripts are invoked whenever a calibrator
// observation is finished, and have no parameters other than a pointer to
// the data just taken.
Script calibrateTsys
Parameters
Pointer to load switched data (with total power and autocorrelation)
Script calibrateSidebandRatio
Parameters
Pointer to data on strong calibrator in sideband separating mode
Script calibrateFlux
Parameters
Pointer to data on a flux calibrator
Optional pointers to phase calibrations (for a weak flux calibrator)
Script calibrateBandpass
Parameters
Pointer to bandpass calibrator observation
Optional channel mask (to flag channels with nasty lines, etc)
Represent as polynomial? [Y (how many terms)/N]
Script calibratePhase
Parameters
Pointer to phase calibrator observation
Also calibrate amplitude? [Y/N]
Optional channel mask
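Since the calibration scripts are triggered when a calibrator observation finishes and take little more than a pointer to the data, the invocation could look roughly like the following sketch. The script names are those listed above; the intent labels, the option names, and the dictionary layout are hypothetical.

```python
# Hypothetical mapping from the purpose of a finished observation to a script.
CALIBRATION_SCRIPTS = {
    "tsys": "calibrateTsys",
    "sideband_ratio": "calibrateSidebandRatio",
    "flux": "calibrateFlux",
    "bandpass": "calibrateBandpass",
    "phase": "calibratePhase",
}


def on_observation_finished(intent, data_pointer, **options):
    """Build the script invocation for a finished calibrator observation.

    Returns None when the observation is not a calibration (e.g. a science
    target), so that no calibration script is run.
    """
    script = CALIBRATION_SCRIPTS.get(intent)
    if script is None:
        return None
    # The pointer to the data just taken is the only mandatory parameter.
    return {"script": script, "parameters": {"data": data_pointer, **options}}


print(on_observation_finished("bandpass", "obs-0042", polynomial_terms=5))
```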
-------------------------------------------------------------------------------
Appendix C: ALMA Calibration Outline (SMyers)
-------------------------------------------------------------------------------
This appendix will not be included in the final draft; I include it here in
case it is useful for discussions at Berkeley on the calibration procedure.
ALMA Calibration Outline
Draft 1-May-2001 v0.3
S. Myers
-------------------------------------------------------------------------------
Before setting down the offline (and pipeline) requirements for calibration,
it would be a good idea to come up with an outline of the ALMA calibration
process. It is the goal of this outline to list the various ALMA calibration
operations, and to delineate the possible calibration procedures with either
a set of flow charts or sequences of operations and states.
-------------------------------------------------------------------------------
1. Online Calibration:
-------------------------------------------------------------------------------
These occur in the on-line system and are applied to control parameters
of the system such that the output of the correlator is a correct
representation of the correlation coefficient. We will not be concerned
with these here, but list a subset of them for completeness. To some extent,
errors made in these online can be corrected in later operations.
-------------------------------------------------------------------------------
1.1 Delay
1.2 Pointing
1.3 Focus
1.4 Level Control
1.5 Quantization Corrections
-------------------------------------------------------------------------------
2. Real Time Calibrations:
-------------------------------------------------------------------------------
These are based on environmental information, monitor data, noise tube
measurements, calibration vane measurements, WVR measurements and are
meant to be applied to the data stream (though possibly well after the
fact).
-------------------------------------------------------------------------------
Applies To: (C=continuum S=spectrum Inf=interferometer TP=total power)
        Inf  TP
Code    C S  C S  What               Comes From         Produces
------------------------------------------------------------------------------
2.1 SB  x x  x x  Sideband Ratio     CW signal?         SSB temperature
2.2 TA  x x  x x  Temperature Scale  Noise tubes/vanes  CC -> Ta*
2.3 WV  x x       Atmospheric Phase  WVR                Phase-stable
2.4 RP  x x       Polarization Cor.  Cal signal?        Orthogonal
Notes:
2.1 I am unsure just how these get applied here. They correct the DSB
temperatures measured in TA to SSB temperatures (would that be TA*SB?).
2.2 Typically from chopper wheel, includes opacity correction. Question:
will this be measured in some pseudo-continuum band at the antenna, and
could it thus be contaminated by lines, etc.?
Alternative: use tip curves or WVR+model. Chopper/Subref system should
be sufficient.
2.3 Option to apply the correction in real time, or to also record the
uncorrected data plus the corrections (useful if the integration time is
shorter than the coherence time).
2.4 This is if a cal signal is used to correct polarization products in
real time.
-------------------------------------------------------------------------------
3. A Priori Calibrations:
-------------------------------------------------------------------------------
These require some previously known calibration information (eg. from
baseline determinations, gain curves) usually not part of
the observations themselves.
-------------------------------------------------------------------------------
Applies To:
        Inf  TP
Code    C S  C S  What                 Comes From       Produces
------------------------------------------------------------------------------
3.1 GC  x x  x x  Antenna Gain         Eff. vs. Elev.   Normalized gain
3.2 IB  x x       Baseline Correction  Calibration run  Phase-stable
3.3 PB  x x  x x  Primary Beam         Holography       PB Corrected
Notes:
3.3 Incorporated into imaging, includes polarization primary beam.
-------------------------------------------------------------------------------
4. A Posteriori Calibrations:
-------------------------------------------------------------------------------
These are usually determined from the data itself or calibrations taken
along with the data, and usually require a-priori calibration of the
data before determination. Some of these (bandpass, leakage) could be
done as a-priori if they were sufficiently stable.
-------------------------------------------------------------------------------
Applies To:
        Inf  TP
Code    C S  C S  What                   Comes From             Produces
------------------------------------------------------------------------------
4.1 FL  x x  x x  Flux Scale             Source Calibration     Flux Density
4.2 GA  x x       Interferometer Gain    Phase&Amp Referencing  Phase Coherence
4.3 BP  x x  x x  Bandpass               Bright Source          Flat bandpass
4.4 PD  x x       Pol. Phase Difference  Pol. Cal. Source       Orthogonal
4.5 PL  x x  x x  Polarization Leakage   Pol. Cal. Source       Stokes
Notes:
4.1 Or conversion from Tant to Flux using aperture efficiency, as is
standard in VLBI.
4.3 On ALMA the minimum number of channels (eg. continuum mode) is
64, and thus some sort of bandpass correction will likely be done
for all data.
4.4 eg. R/L phase diff for circular. In single dish case applied online.
4.5 After conversion from polarization products to Stokes parameters.
-------------------------------------------------------------------------------
Calibration States:
-------------------------------------------------------------------------------
The data can be considered to be in one of a number of "calibration states" at
a given time, depending on what has been applied to that point. The order in
which some of these are applied is not critical (eg. RW can be applied after
RT; you might be able to skip temperature and go directly to flux if the
system were stable enough). The following is a possible flow of states
during the calibration procedure:
State    Units                      Input    Apply        Notes
-------------------------------------------------------------------------------
Raw      Correlation coefficient    raw      only online  short integ
RawPcor  Correlation coefficient    Raw      WV           longer integ
NorTemp  Antenna Temperature (Ta*)  RawPcor  TA
RawFlux  Flux density (PolProd)     NorTemp  FL
NorFlux  Flux density               RawFlux  GA,BP
NorGain  Flux density               NorFlux  GC
NorPol   Flux density (Stokes)      NorGain  PD,PL        for poln data
We can write these in operator notation, starting from Raw data or an
intermediate state:
NorTemp = TA*WV
NorFlux = GA*BP*FL*NorTemp = GA*BP*FL*TA*WV
Some alternative formulations:
Spectra in K.km/s: TemSpec = BP*GC*GA*NorTemp
One could make a set of flow charts, but I think the operator notation works
pretty well. These in some sense correspond to matrices in the measurement
equation, for example. If you think of the input streams of LTA output
(frequency channels x bands x polarization products), these are preserved
by the operations (scaling) until application of PD/PL, which mix the products
into Stokes (a rotation).
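To make the operator notation concrete, here is a small sketch that derives the operator product for any state from the state table above. The dictionary layout and the function name are illustrative assumptions; the state names and operator codes are taken from the table. The rightmost operator is applied first, as in the expressions above.

```python
# state -> (input state, operators applied to reach it), following the table.
# "Raw" is the starting point; its online-only calibrations are not modelled.
STATES = {
    "RawPcor": ("Raw", ["WV"]),
    "NorTemp": ("RawPcor", ["TA"]),
    "RawFlux": ("NorTemp", ["FL"]),
    "NorFlux": ("RawFlux", ["GA", "BP"]),
    "NorGain": ("NorFlux", ["GC"]),
    "NorPol": ("NorGain", ["PD", "PL"]),
}


def operator_chain(target, start="Raw"):
    """Return the operator product that takes `start` data to `target`."""
    groups = []
    state = target
    while state != start:
        if state not in STATES:
            raise ValueError(f"no path from {start} to {target}")
        state_input, ops = STATES[state]
        groups.append("*".join(ops))  # later steps end up further left
        state = state_input
    return "*".join(groups)


print("NorTemp =", operator_chain("NorTemp"))
print("NorFlux =", operator_chain("NorFlux"))
```

Calling operator_chain("NorFlux", start="NorTemp") gives the intermediate-state form, GA*BP*FL, to be applied to NorTemp.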
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Supplementary Info
-------------------------------------------------------------------------------
Document History:
2001.03.06 version 1.0
2001.03.08 version 1.1
2001.03.15 version 1.2
2001.05.01 slightly updated version for May SSR telecon
2001.06.12 incorporate Barry's comments from 2001.05.04
2001.06.12 three sections, incorporate Frederic & Peter's pipeline doc
2001.06.14 include Tatematsu's first single-dish contributions, other adds
2001.06.15 include Tatematsu and Momose additions, fill in some simulator stuff
2001.06.18 beef up interface section
2001.06.19 include Schilke's additions, beef up Data Handling & Calibration
2001.06.20 include Lucas's contributions
2001.06.21 version 2.0 to SSR for comments
2001.06.30 version 2.1 update offline section (smyers)
2001.07.12 version 2.2 for SSR Berkeley meeting
-------------------------------------------------------------------------
|:| Steven T. Myers |:| Associate Scientist |:|
|:| National Radio Astronomy Observatory |:| |:|
|:| P.O. Box O |:| 1003 Lopezville Rd. |:|
|:| Socorro, NM 87801 |:| Ph: (505) 835-7294 |:|
|:| smyers@nrao.edu |:| FAX: (505) 835-7027 |:|
|:| http://www.aoc.nrao.edu/~smyers |:| |:|
-------------------------------------------------------------------------