  1. 4499b0b feat(spark): ZooKeeper node should hold spark app id (for helping debug when lock is held for long time) (#18123) by Krishen · 5 hours ago (master)
  2. 9ba2760 feat(table-services): Support clustering file groups with earlier instants times first (#18174) by Krishen · 5 hours ago
  3. 31b8706 feat(vector): Add further research for supporting VECTOR type to RFC-99 (#18184) by Rahil C · 6 hours ago
  4. 3dcf4c6 feat(client): Add pre-write validator framework (#18239) by Nada · 29 hours ago
  5. 3139a19 feat(table-services): Allow users to not parallelize each partition with engine context during clustering planning (#18191) by Krishen · 31 hours ago
  6. 365398a refactor: Hudi Flink source v2 with better context management (#18269) by Peter Huang · 2 days ago
  7. abd8c22 fix(table-services): When applying rollback metadata to metadata table (v6) do not rollback a metadata table deltacommit if it has been already rolled back by post-commit rollback (#18160) by Krishen · 2 days ago
  8. 43d8ed8 test(common): report jvm memory stats for unit tests (#18207) by Surya Prasanna · 4 days ago
  9. a31f15d fix: infer record merge mode for pre-v9 tables in generateRequiredSchema (#18106) by vamsikarnika · 4 days ago
  10. 1b2cee80 feat(vector): add VECTOR type to HoodieSchema (#18146) by Rahil C · 4 days ago
  11. 1203b21 feat: Use PartitionValueExtractor interface in Spark reader path (#17850) by Surya Prasanna · 4 days ago
  12. 69e24ea perf: eliminate unnecessary timeline loading for Flink append only write path (#18264) by Danny Chan · 4 days ago
  13. 65c1b12 feat: Notebooks to support multiple hudi versions (#18255) by Ranga Reddy · 4 days ago
  14. e063493 feat(metrics): emit metric for rollback failures (#18148) by Nada · 6 days ago
  15. 7e9abc7 feat(table-services): Add config to filter partitions during full clean (#17550) by Prashant Wason · 6 days ago
  16. 73e710d feat(table-services): Emit archival metrics for monitoring and debugging (#18133) by Nada · 6 days ago
  17. e1ae9c6 feat(flink): collect event time in HoodieRowDataCreateHandle for min/max event time metrics (#18250) by Jianchun Xu · 6 days ago
  18. d4ff54b fix(flink): Use timestamp based partitioning in AutoRowDataKeyGen (#18090) by Prashant Wason · 7 days ago
  19. 363f41a fix: [HUDI-CLUSTERING] Optimize binary copy performance with lazy loading, bulk reads, and double buffering (#18241) by Harsha Gudladona · 7 days ago
  20. 59a5d88 fix: add all fields in HoodieSourceSplitSerializer (#18243) by Peter Huang · 7 days ago
  21. 181af01 perf: Adding support for LatestBaseFilesPathFilter to Spark File Index (#18136) by Surya Prasanna · 7 days ago
  22. 967408c refactor: move source assign package under split (#18253) by Peter Huang · 8 days ago
  23. fb7b1a5 feat(metadata-table): Add count validation for record index bootstrap (#18029) by Prashant Wason · 8 days ago
  24. 5fb1b34 fix: Fail metadata bootstrap early in presence of 0 byte file (#18209) by Surya Prasanna · 8 days ago
  25. 7f4dfda test: Add Scala test for record index rebootstrap on non-Hoodie partitions (#18208) by Surya Prasanna · 9 days ago
  26. df5d7c8 feat(spark-datasource): support spark.hoodie.* read config overrides (#18205) by Surya Prasanna · 9 days ago
  27. 6156b74 feat(spark): Add HoodieSparkSqlUtils APIs for tooling (#18202) by Surya Prasanna · 9 days ago
  28. fd0656c fix: Use correct lastCompletedTransactionMetadata while acquiring lock for clustering (#18198) by Surya Prasanna · 9 days ago
  29. 894b817 chore(ci): cleanup for print statements, showing tables/schemas (#17771) by Tim Brown · 10 days ago
  30. 18ee6cd fix: Fix string handling on bloom index (#18240) by Tim Brown · 10 days ago
  31. f5d08ec fix: Fix SHOW PARTITIONS commands functionality for slash-separated date partitioning (#18195) by Surya Prasanna · 10 days ago
  32. 8c5f832 fix: Fix typos across codebase (#18232) by Xinli Shang · 11 days ago
  33. fbeb93d fix: Remove trailing colon from incomplete error message in HoodieTableMetadataUtil (#18233) by Xinli Shang · 11 days ago
  34. f7b8059 feat(flink): Off-heap lookup join cache backed by RocksDB (#18231) by Vova Kolmakov · 11 days ago
  35. b7f72d3 feat: support predicate push down in Hudi flink source v2 (#18212) by Peter Huang · 11 days ago
  36. 057af9e chore(ci): Add Codecov coverage report in GitHub actions (#18230) by Y Ethan Guo · 13 days ago
  37. 9bb88cf feat(blob): Blob schema definition (#18108) by Tim Brown · 14 days ago
  38. 3bab09a fix: Handle case when 0 byte completed commit files present in the timeline (#18210) by Surya Prasanna · 2 weeks ago
  39. 431b4d8 docs(spark): Update description of modules related to integration with Spark (#18219) by Geser Dugarov · 2 weeks ago
  40. eb1d772 fix(trino): Fix Docker initialization issue in the Trino plugin (#18220) by vamsikarnika · 2 weeks ago
  41. 9fa5e85 chore: Add .claude and .codex directories to .gitignore (#18213) by vinoth chandar · 2 weeks ago
  42. dfe3220 refactor: Remove not used classes from `org.apache.hudi.spark.internal` (#18211) by Geser Dugarov · 2 weeks ago
  43. bce8d59 fix(flink): Use blocking instant generation when CDC is enabled (#18206) by Shuo Cheng · 3 weeks ago
  44. a7d301a fix: revert (feat: support mini batch split reader) (#18200) by Peter Huang · 3 weeks ago
  45. 5180b49 feat(flink): lookup join with retry and async capabilities (#18193) by Vova Kolmakov · 3 weeks ago
  46. 1577d6e fix: SimpleAvro-, NonpartitionedAvro- and ComplexAvroKeyGenerator are also valid for writing by Spark when meta-fields are disabled (#18187) by Vova Kolmakov · 3 weeks ago
  47. 2462ae5 refactor: Remove redundancy in index validation logic in HoodieIndexU… (#17911) by voonhous · 3 weeks ago
  48. 1846230 fix: throw correct exception when reading hoodie.properties file without access (#18176) by Surya Prasanna · 3 weeks ago
  49. 833ef62 fix: Empty write should not cause spark analysis errors with pre-commit validators (#18128) by Krishen · 3 weeks ago
  50. 762fbea feat(blob): update approach to remove reliance on column groups, break down plan (#18013) by Tim Brown · 3 weeks ago
  51. eeb0642 fix unsigned values in proto conversion to be positive (#18186) by Tim Brown · 3 weeks ago
  52. 53ee07b fix(metadata-table): Fix failed deletes when updating MDT with clean metadata (#18035) by Prashant Wason · 3 weeks ago
  53. 1909598 perf(common): Make ThreadLocal variables in HoodieAvroDataBlock static (#18023) by Prashant Wason · 3 weeks ago
  54. 9788b8d fix: Use local engine context for clean planning on metadata and non-partitioned tables (#17942) by Surya Prasanna · 3 weeks ago
  55. 3d5cf26 feat: Publish commits to process metrics for HoodieStreamer (#17929) by Surya Prasanna · 3 weeks ago
  56. bd52f59 fix(flink): include exception stacktrace in error logs (#18091) by Prashant Wason · 3 weeks ago
  57. 513f62c fix(spark): Fix TestSparkSchemaUtils failing with Spark 3.3 due to timestamp_ntz (#17917) by Prashant Wason · 3 weeks ago
  58. ef484d4 fix: Include metadata file cache size option in the configuration for HFile reader (#18175) by Shuo Cheng · 3 weeks ago
  59. b584c3c test(concurrency): add tests for write conflicts with different conflict resolution strategies (#17501) by Surya Prasanna · 3 weeks ago
  60. 7a88b6b refactor: Add Lombok annotations to hudi-common module (part 5) (#17878) by voonhous · 3 weeks ago
  61. 012b572 [MINOR] Publish HUDI version metrics as integers (#17466) by Prashant Wason · 3 weeks ago
  62. cf4a194 feat: add Presto to Hudi Notebooks (#18078) by Ranga Reddy · 3 weeks ago
  63. 5720910 test: add unit test for multiple partition filters on same column (#17934) by Surya Prasanna · 3 weeks ago
  64. 3244013 feat: Add metadata record_index lookup command to Hudi CLI (#17940) by Surya Prasanna · 3 weeks ago
  65. 589264c fix: flink source v2 serializability (#18165) by Peter Huang · 3 weeks ago
  66. 419f232 [HUDI-9730] RFC-99 Hudi Type System (#13743) by Balaji Varadarajan · 3 weeks ago
  67. 894df49 feat(schema): Consolidate null type handling (#18163) by Tim Brown · 3 weeks ago
  68. cabaf50 [MINOR] Fix HoodieLockMetrics.createTimerForMetrics to not share metric timer (#18097) by Lokesh Jain · 3 weeks ago
  69. 1ae0d5e feat: add flink stream read metrics for hudi source v2 (#18130) by Peter Huang · 3 weeks ago
  70. 51a0e81 fix: correct unsigned int conversion in TestProtoConversionUtil (#18120) by Surya Prasanna · 3 weeks ago
  71. b7f4468 fix: interrupt storage LP when heartbeat fails (#17870) by Alex R · 3 weeks ago
  72. dbbf86d Adds a guardrail to prevent the creation of the SparkRDDWriteClient when Spark's speculative execution is enabled (#18045) by Prashant Wason · 4 weeks ago
  73. 1965271 fix: (table-services) When using multiwriter do not delete pending rollback plan if exception is thrown while reading it (#18093) by Krishen · 4 weeks ago
  74. 4c9dcb3 [MINOR] Preload file listing for partitions in BloomIndex to avoid repeated listings (#17462) by Prashant Wason · 4 weeks ago
  75. 76f8ffb refactor: Add Lombok annotations to hudi-common module (part 6) (#17880) by voonhous · 4 weeks ago
  76. 14d8894 refactor: apply lombok for flink source v2 related classes (#18122) by Peter Huang · 4 weeks ago
  77. 241450e feat: support partition pruner in Flink hudi source v2 (#18074) by Peter Huang · 4 weeks ago
  78. 72c9f6f feat(schema): Minor cleanup of Avro schema usage (#18043) by Tim Brown · 4 weeks ago
  79. d92e99b feat: Lance schema evolution (add column, type promotion) (#17904) by Rahil C · 4 weeks ago
  80. ccce6e1 feat: support flink split distribution strategy (#18082) by Peter Huang · 4 weeks ago
  81. 013a165 feat(schema): Remove usage of migrated AvroSchemaUtils and HoodieAvroUtils methods (part 1) (#18007) by Tim Brown · 4 weeks ago
  82. 63275b3 perf: Avoid re-fetching file status from FS for HFile readers (#17709) by Tim Brown · 4 weeks ago
  83. 3aa3fd1 fix: exit transaction with error in storage LP when unlock failure due to lock acquired by others (#17871) by Alex R · 4 weeks ago
  84. 695b9cf feat(schema): Remove direct reliance on Avro for schema compatibility checks (#18006) by Tim Brown · 4 weeks ago
  85. 86c5c81 fix: disable retries in s3/gcs storage lock clients for storage based LP (#17869) by Alex R · 4 weeks ago
  86. 6eaf809 fix: correct deleted keys computation in computeRevivedAndDeletedKeys (#18094) by vamsikarnika · 4 weeks ago
  87. 4c819a5 perf: Support lazy clean of the RLI cache during bucket assigning (#18018) by Shuo Cheng · 4 weeks ago
  88. a59cd12 fix: Handle Non-Null Complex Types with Nullable Elements in ParquetSchemaConverter (#18087) by Prashant Wason · 4 weeks ago
  89. 5b741b6 refactor: migrate to ScanV2Internal API and remove ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN config (#17520) by Surya Prasanna · 4 weeks ago
  90. ff4da47 fix: reload table config after record index bootstrap to avoid bloom index fallback (#17508) by Surya Prasanna · 4 weeks ago
  91. 55510c4 refactor: Add Lombok annotations to hudi-utilities (Part 2) (#17876) by voonhous · 4 weeks ago
  92. 6720849 refactor: Add Lombok annotations to hudi-common module (part 4) (#17830) by voonhous · 4 weeks ago
  93. c293098 fix: Ensure Lance works when populateMetaFields is false with user defined keygen (#18042) by Rahil C · 4 weeks ago
  94. 00f5831 fix: Add config version information to DataSourceOptions (#17733) by huangxiaoping · 5 weeks ago
  95. c8dddc0 fix(utilities): Use passed-in configs when propsFilePath is null or empty in HoodieStreamer (#17467) by Prashant Wason · 5 weeks ago
  96. df58b8b address feedback (#18063) by Tim Brown · 5 weeks ago
  97. 0432611 fix: allows eager failure from abnormals for streaming write (#12150) by fhan · 5 weeks ago
  98. 4a7b623 feat(metadata): Handle metadata table service failures gracefully and emit metrics (#17930) by Surya Prasanna · 5 weeks ago
  99. 21eb05e fix: Use TableSchemaResolver in setWriteSchemaForDeletes for better schema resolution (#18030) by Prashant Wason · 5 weeks ago
  100. 983c03d feat: Support slash separated date partitioning for Hudi tables (#17787) by Surya Prasanna · 5 weeks ago