Pull master into SNAPSHOT (#84)
* SDAP-208 remove pegged numpy version in nexuscli requirements.txt
* SDAP-206 remove pegged numpy version in nexuscli requirements.txt
* SDAP-206 remove pegged numpy version in nexuscli requirements.txt
* SDAP-192 Create DOMS netCDF reader tool (#83)
* Created DOMS netCDF reader tool.
* SDAP-192 DOMS netCDF reader updated to use num2date function to get datetime object from timestamp.
* SDAP-192 Added logging to DOMS netCDF reader tool.
* SDAP-192 Miscellaneous variable updates with DOMS netCDF reader.
* SDAP-192 'try/except' blocks around 'with' statements for netCDF and CSV file handling. Changed name of CSV file output from main. Moved print statement.
* SDAP-192 Changed print to LOGGER.info in DOMS netCDF reader.
* SDAP-192 Updates to DOMS netCDF reader documentation, including a README.md
diff --git a/tools/doms/README.md b/tools/doms/README.md
new file mode 100644
index 0000000..c49fa4a
--- /dev/null
+++ b/tools/doms/README.md
@@ -0,0 +1,66 @@
+# doms_reader.py
+The functions in doms_reader.py read a DOMS netCDF file into memory, assemble a list of matches of satellite and in situ data, and optionally output the matches to a CSV file. Each matched pair contains one satellite data record and one in situ data record.
+
+The DOMS netCDF files hold satellite data and in situ data in different groups (`SatelliteData` and `InsituData`). The `matchIDs` netCDF variable contains pairs of IDs (matches), each of which references a satellite data record and an in situ data record in their respective groups. These records have a many-to-many relationship: one satellite record may match many in situ records, and one in situ record may match many satellite records. The `assemble_matches` function assembles the individual data records into pairs based on their `dim` group dimension IDs as paired in the `matchIDs` variable.
+
+## Requirements
+This tool was developed and tested with Python 2.7.5 and 3.7.0a0.
+Imported packages:
+* argparse
+* netCDF4
+* sys
+* datetime
+* csv
+* collections
+* logging
+
+
+## Functions
+### Function: `assemble_matches(filename)`
+Read a DOMS netCDF file into memory and return a list of matches from the file.
+
+#### Parameters
+- `filename` (str): the DOMS netCDF file name.
+
+#### Returns
+- `matches` (list): List of matches.
+
+Each list element in `matches` is a dictionary organized as follows:
+ For match `m`, netCDF group `GROUP` ('SatelliteData' or 'InsituData'), and netCDF group variable `VARIABLE`:
+
+- `matches[m][GROUP]['matchID']`: netCDF `MatchedRecords` dimension ID for the match
+- `matches[m][GROUP]['GROUPID']`: GROUP netCDF `dim` dimension ID for the record
+- `matches[m][GROUP][VARIABLE]`: variable value
+
+For example, to access the timestamps of the satellite data and the in situ data of the first match in the list, along with the `MatchedRecords` dimension ID and each group's `dim` dimension ID:
+```python
+matches[0]['SatelliteData']['time']
+matches[0]['InsituData']['time']
+matches[0]['SatelliteData']['matchID']
+matches[0]['SatelliteData']['SatelliteDataID']
+matches[0]['InsituData']['InsituDataID']
+```
+
+
+### Function: `matches_to_csv(matches, csvfile)`
+Write the DOMS matches to a CSV file. Include a header of column names which are based on the group and variable names from the netCDF file.
+
+#### Parameters
+- `matches` (list): the list of dictionaries containing the DOMS matches as returned from the `assemble_matches` function.
+- `csvfile` (str): the name of the CSV output file.
+
+## Usage
+The examples below read a DOMS netCDF file named `doms_file.nc`.
+### Command line
+The main function of `doms_reader.py` takes one `filename` argument (`doms_file.nc` in this example) naming the DOMS netCDF file to read. It calls `assemble_matches`, then calls `matches_to_csv` to write the matches to the CSV file `doms_matches.csv`.
+```
+python doms_reader.py doms_file.nc
+```
+```
+python3 doms_reader.py doms_file.nc
+```
+### Importing `assemble_matches`
+```python
+from doms_reader import assemble_matches
+matches = assemble_matches('doms_file.nc')
+```
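
Note on the README above: because satellite and in situ records have a many-to-many relationship, the same satellite record can appear in several entries of `matches`. A minimal sketch of grouping in situ record IDs by satellite record ID follows. It relies only on the dictionary layout documented for `assemble_matches`; the sample `matches` list and the helper `group_insitu_by_satellite` are hypothetical illustrations and are not part of `doms_reader.py`.

```python
from collections import defaultdict

# Hypothetical stand-in for the list returned by assemble_matches().
# The IDs and time values are invented for illustration only.
matches = [
    {'SatelliteData': {'matchID': 0, 'SatelliteDataID': 0, 'time': 10},
     'InsituData': {'matchID': 0, 'InsituDataID': 0, 'time': 11}},
    {'SatelliteData': {'matchID': 1, 'SatelliteDataID': 0, 'time': 10},
     'InsituData': {'matchID': 1, 'InsituDataID': 1, 'time': 12}},
    {'SatelliteData': {'matchID': 2, 'SatelliteDataID': 1, 'time': 20},
     'InsituData': {'matchID': 2, 'InsituDataID': 1, 'time': 12}},
]


def group_insitu_by_satellite(matches):
    """Map each satellite record ID to the in situ record IDs matched to it."""
    grouped = defaultdict(list)
    for match in matches:
        sat_id = match['SatelliteData']['SatelliteDataID']
        grouped[sat_id].append(match['InsituData']['InsituDataID'])
    return dict(grouped)


insitu_by_satellite = group_insitu_by_satellite(matches)
```

With a real file, the sample list would presumably be replaced by `matches = assemble_matches('doms_file.nc')`; the grouping logic stays the same.
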
diff --git a/tools/doms/doms_reader.py b/tools/doms/doms_reader.py
index ffd24b6..c8229c4 100644
--- a/tools/doms/doms_reader.py
+++ b/tools/doms/doms_reader.py
@@ -27,16 +27,20 @@
"""
Read a DOMS netCDF file and return a list of matches.
- Arguments:
- filename (string): the DOMS netCDF file name.
+ Parameters
+ ----------
+ filename : str
+ The DOMS netCDF file name.
- Returns:
- matches (list): List of matches. Each list element is a dictionary:
- For netCDF group GROUP (SatelliteData or InsituData) and group
- variable VARIABLE:
- matches[GROUP]['matchID']: MatchedRecords dimension ID for the match
- matches[GROUP]['GROUPID']: GROUP dim dimension ID for the record
- matches[GROUP][VARIABLE]: variable value
+ Returns
+ -------
+ matches : list
+ List of matches. Each list element is a dictionary.
+ For match m, netCDF group GROUP (SatelliteData or InsituData), and
+ group variable VARIABLE:
+ matches[m][GROUP]['matchID']: MatchedRecords dimension ID for the match
+ matches[m][GROUP]['GROUPID']: GROUP dim dimension ID for the record
+ matches[m][GROUP][VARIABLE]: variable value
"""
try:
@@ -74,16 +78,18 @@
LOGGER.exception("Error reading netCDF file " + filename)
raise err
-def matches_to_csv(matches, filename):
+def matches_to_csv(matches, csvfile):
"""
Write the DOMS matches to a CSV file. Include a header of column names
which are based on the group and variable names from the netCDF file.
- Arguments:
- matches (list): the list of dictionaries containing the DOMS matches as
- returned from assemble_matches.
-
- filename (string): the name of the CSV output file.
+ Parameters
+ ----------
+ matches : list
+ The list of dictionaries containing the DOMS matches as returned from
+ assemble_matches.
+ csvfile : str
+ The name of the CSV output file.
"""
# Create a header for the CSV. Column names are GROUP_VARIABLE or
# GROUP_GROUPID.
@@ -94,7 +100,7 @@
try:
# Write the CSV file
- with open(filename, 'w') as output_file:
+ with open(csvfile, 'w') as output_file:
csv_writer = csv.writer(output_file)
csv_writer.writerow(header)
for match in matches:
@@ -104,7 +110,7 @@
row.append(value)
csv_writer.writerow(row)
except (OSError, IOError) as err:
- LOGGER.exception("Error writing CSV file " + filename)
+ LOGGER.exception("Error writing CSV file " + csvfile)
raise err
if __name__ == '__main__':