SDAP-408 - Improvements to ingestion (#61)

* Writer fault tolerance

Noticed with Solr writes, but applied to all writers. The ingester process hits the underlying store very hard, which in Solr's case can cause write operations to fail. The existing implementation treats any failure as a lost connection and fails the ENTIRE pipeline. Writers now make several attempts, with some backoff between attempts.
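
A minimal sketch of the retry idea described above, not the code in this change. The helper name `write_with_retries`, its parameters, and the exponential schedule are assumptions for illustration; the commit only says "several attempts with some backoff". The real writers may be asynchronous.

```python
import time


def write_with_retries(write, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call write() until it succeeds or max_attempts is reached.

    Hypothetical helper: waits base_delay * 2**attempt seconds between
    attempts (exponential backoff is an assumption, not from the commit).
    """
    for attempt in range(max_attempts):
        try:
            return write()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` in as a parameter keeps the helper easy to exercise without real delays.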

* Don't use np.ma.filled unless needed

Xarray already fills invalid points with NaN, so we only need to grab the underlying np.ndarray from the DataArray. Calling np.ma.filled on an xr.DataArray, which I suspect data_subset is frequently if not always, is equivalent to calling np.array(data_subset).
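
A rough illustration of "only when needed", assuming float data and a NaN fill value. The helper name `to_ndarray` is hypothetical and is not taken from the processors in this change. `np.asarray` also accepts an xr.DataArray, so the sketch itself only needs NumPy.

```python
import numpy as np


def to_ndarray(data_subset):
    """Return a plain np.ndarray for data_subset (hypothetical helper)."""
    if isinstance(data_subset, np.ma.MaskedArray):
        # Only masked arrays carry a mask that has to be filled.
        return np.ma.filled(data_subset, np.nan)
    # Anything else (ndarray, xr.DataArray, list) is converted directly.
    return np.asarray(data_subset)
```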

* Worker init log msg

* Write consolidation

* Removed use of np.ma.filled with xr.DataArrays.

* Elasticsearch writer complies with abstract def but doesn't batch yet

* Updated data subset array creation for all reading processors

* Batching

* Batching of executor tasks & Cassandra writes

Cassandra writes are still individual, but they are started and awaited in batches.
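
The start-and-await-in-batches pattern can be sketched with asyncio as below. This is an illustration only: `write_in_batches`, `write_one`, and `batch_size` are hypothetical names, and the actual writer may use the Cassandra driver's own futures rather than coroutines.

```python
import asyncio


async def write_in_batches(items, write_one, batch_size=3):
    """Issue one write per item, at most batch_size in flight at a time."""
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # Each write is still individual; the batch is started together
        # and awaited together before the next batch begins.
        await asyncio.gather(*(write_one(item) for item in batch))
```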

* Raised logging level in kelvin to celsius processor to match others

* Logging formatting for time

* Logging formatting for write progress

* Improvements

* Removed commented code

Co-authored-by: rileykk <rileykk@jpl.nasa.gov>
Co-authored-by: skperez <stepheny.k.perez@jpl.nasa.gov>
12 files changed