tree b73db56dea81bf5e004a3975308d5872105f342f
parent 1dc729f2b27da46671cd33903b2ef52a0ae29ebe
author Thomas Tauber-Marshall <tmarshall@cloudera.com> 1574125046 -0800
committer Impala Public Jenkins <impala-public-jenkins@cloudera.com> 1575415585 +0000

IMPALA-9181: Serialize TQueryCtx once per query

When issuing Exec() RPCs to backends, we currently serialize the
TQueryCtx once per backend. This is inefficient, as the TQueryCtx is
the same for all backends and only needs to be serialized once.

Serializing the TQueryCtx can be expensive as it contains both the
full text of the original query and the descriptor table, which can be
quite large. In a synthetic dataset I tested with, scanning a table
with 100k partitions leads to a descriptor table size of ~20MB.

This patch serializes the TQueryCtx in the coordinator and then passes
it to each BackendState when calling Exec().
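
As a rough illustration of the serialize-once pattern (not the actual
Impala code, which is C++ and uses Thrift), the sketch below uses
hypothetical stand-ins: QueryCtx for TQueryCtx, Backend for
BackendState, JSON in place of Thrift serialization, and a counter
added only to make the number of serializations observable.

```python
import json


class QueryCtx:
    """Hypothetical stand-in for TQueryCtx: query text plus descriptor table."""

    def __init__(self, sql, descriptor_table):
        self.sql = sql
        self.descriptor_table = descriptor_table


class CountingSerializer:
    """Serializes a QueryCtx to bytes and counts how often it was asked to."""

    def __init__(self):
        self.calls = 0

    def serialize(self, ctx):
        self.calls += 1
        payload = {"sql": ctx.sql, "descriptor_table": ctx.descriptor_table}
        return json.dumps(payload).encode("utf-8")


class Backend:
    """Hypothetical stand-in for BackendState; records what it was sent."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def exec_rpc(self, serialized_ctx):
        # The already-serialized bytes are reused as-is; nothing is re-encoded.
        self.sent.append(serialized_ctx)


def exec_all(ctx, backends, serializer):
    """Serialize ctx once, then hand the same buffer to every backend."""
    serialized = serializer.serialize(ctx)  # once per query, not per backend
    for backend in backends:
        backend.exec_rpc(serialized)
    return serialized
```

With N backends, serialize() runs once instead of N times, and every
backend receives the same bytes object.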

Follow-up work might consider whether all of the info in the TQueryCtx
really needs to be distributed to all backends.

Testing:
- Passed full run of existing tests.
- Single-node perf run showed no significant change.

Change-Id: I6a4dd302fd5602ec2775492a041ddd51e7d7a6c6
Reviewed-on: http://gerrit.cloudera.org:8080/14777
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
