[SPARK-53434][SQL] ColumnarRow's get should also check isNullAt
### What changes were proposed in this pull request?
Currently, `ColumnarRow.get` does not check `isNullAt`, whereas `UnsafeRow.get` does:
https://github.com/apache/spark/blob/b177b6515c8371fe0761b46d2fa45dd5e8465910/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/SpecializedGettersReader.java#L36
Some callers also assume that `InternalRow.get` is null-safe, for example https://github.com/apache/spark/blob/5b2c4cf9ce886b69eeb5d2303d7582f6ecd763aa/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala#L377
We hit this while extending Spark to work on columnar data.
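
For illustration only, here is a minimal, Spark-free sketch of the behavior difference. `ToyColumnarRow`, `getUnchecked`, and the single-column `get(int)` signature are hypothetical stand-ins and not Spark's API. The sketch assumes a null slot still holds some placeholder value in the backing array, so an unchecked read returns that placeholder instead of `null`.

```java
/** Hypothetical stand-in for a columnar row: a null mask plus an int column. */
final class ToyColumnarRow {
    private final boolean[] nulls;
    private final int[] values;

    ToyColumnarRow(boolean[] nulls, int[] values) {
        this.nulls = nulls;
        this.values = values;
    }

    boolean isNullAt(int ordinal) {
        return nulls[ordinal];
    }

    int getInt(int ordinal) {
        return values[ordinal];
    }

    /** Old behavior (sketch): reads the slot without consulting the null mask. */
    Object getUnchecked(int ordinal) {
        return getInt(ordinal);
    }

    /** New behavior (sketch): returns null for a null slot before dispatching. */
    Object get(int ordinal) {
        if (isNullAt(ordinal)) return null;
        return getInt(ordinal);
    }
}
```

With slot 0 marked null, `getUnchecked(0)` returns the boxed placeholder from the backing array, while `get(0)` returns `null`, which is what a caller that assumes a null-safe `get` expects.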
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually, plus a new test case in `ColumnVectorSuite`.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #52175 from WangGuangxin/fix_columnarrow.
Authored-by: wangguangxin.cn <wangguangxin.cn@bytedance.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ColumnarRow.java b/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ColumnarRow.java
index ac05981..b14cd34 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ColumnarRow.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ColumnarRow.java
@@ -164,6 +164,7 @@
@Override
public Object get(int ordinal, DataType dataType) {
+ if (isNullAt(ordinal)) return null;
if (dataType instanceof BooleanType) {
return getBoolean(ordinal);
} else if (dataType instanceof ByteType) {
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala
index a0fe44b..966e892 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala
@@ -950,4 +950,27 @@
10, StructType(Seq(StructField("year", yearUDT)))) { testVector =>
assert(testVector.dataType() === StructType(Seq(StructField("year", IntegerType))))
}
+
+ testVectors("SPARK-53434: ColumnarRow.get() should handle null", 1, structType) { testVector =>
+ val c1 = testVector.getChild(0)
+ val c2 = testVector.getChild(1)
+ val c3 = testVector.getChild(2)
+
+ // For row 0, set the integer field to null, and other fields to non-null.
+ c1.putNull(0)
+ c2.putDouble(0, 3.45)
+ c3.putLong(0, 1000L)
+
+ val row = testVector.getStruct(0)
+
+ // Verify that get() on the null field returns null.
+ assert(row.isNullAt(0))
+ assert(row.get(0, IntegerType) == null)
+
+ // Verify that other fields can be retrieved correctly.
+ assert(!row.isNullAt(1))
+ assert(row.get(1, DoubleType) === 3.45)
+ assert(!row.isNullAt(2))
+ assert(row.get(2, TimestampNTZType) === 1000L)
+ }
}