apache / hudi / HEAD
4499b0b
feat(spark): ZooKeeper node should hold spark app id (for helping debug when lock is held for long time) (#18123)
by Krishen
· 5 hours ago
master
9ba2760
feat(table-services): Support clustering file groups with earlier instants times first (#18174)
by Krishen
· 5 hours ago
31b8706
feat(vector): Add further research for supporting VECTOR type to RFC-99 (#18184)
by Rahil C
· 6 hours ago
3dcf4c6
feat(client): Add pre-write validator framework (#18239)
by Nada
· 29 hours ago
3139a19
feat(table-services): Allow users to not parallelize each partition with engine context during clustering planning (#18191)
by Krishen
· 31 hours ago
365398a
refactor: Hudi Flink source v2 with better context management (#18269)
by Peter Huang
· 2 days ago
abd8c22
fix(table-services): When applying rollback metadata to metadata table (v6) do not rollback a metadata table deltacommit if it has been already rolled back by post-commit rollback (#18160)
by Krishen
· 2 days ago
43d8ed8
test(common): report jvm memory stats for unit tests (#18207)
by Surya Prasanna
· 4 days ago
a31f15d
fix: infer record merge mode for pre-v9 tables in generateRequiredSchema (#18106)
by vamsikarnika
· 4 days ago
1b2cee80
feat(vector): add VECTOR type to HoodieSchema (#18146)
by Rahil C
· 4 days ago
1203b21
feat: Use PartitionValueExtractor interface in Spark reader path (#17850)
by Surya Prasanna
· 4 days ago
69e24ea
perf: eliminate unnecessary timeline loading for Flink append only write path (#18264)
by Danny Chan
· 4 days ago
65c1b12
feat: Notebooks to support multiple hudi versions (#18255)
by Ranga Reddy
· 4 days ago
e063493
feat(metrics): emit metric for rollback failures (#18148)
by Nada
· 6 days ago
7e9abc7
feat(table-services): Add config to filter partitions during full clean (#17550)
by Prashant Wason
· 6 days ago
73e710d
feat(table-services): Emit archival metrics for monitoring and debugging (#18133)
by Nada
· 6 days ago
e1ae9c6
feat(flink): collect event time in HoodieRowDataCreateHandle for min/max event time metrics (#18250)
by Jianchun Xu
· 6 days ago
d4ff54b
fix(flink): Use timestamp based partitioning in AutoRowDataKeyGen (#18090)
by Prashant Wason
· 7 days ago
363f41a
fix: [HUDI-CLUSTERING] Optimize binary copy performance with lazy loading, bulk reads, and double buffering (#18241)
by Harsha Gudladona
· 7 days ago
59a5d88
fix: add all fields in HoodieSourceSplitSerializer (#18243)
by Peter Huang
· 7 days ago
181af01
perf: Adding support for LatestBaseFilesPathFilter to Spark File Index (#18136)
by Surya Prasanna
· 7 days ago
967408c
refactor: move source assign package under split (#18253)
by Peter Huang
· 8 days ago
fb7b1a5
feat(metadata-table): Add count validation for record index bootstrap (#18029)
by Prashant Wason
· 8 days ago
5fb1b34
fix: Fail metadata bootstrap early in presence of 0 byte file (#18209)
by Surya Prasanna
· 8 days ago
7f4dfda
test: Add Scala test for record index rebootstrap on non-Hoodie partitions (#18208)
by Surya Prasanna
· 9 days ago
df5d7c8
feat(spark-datasource): support spark.hoodie.* read config overrides (#18205)
by Surya Prasanna
· 9 days ago
6156b74
feat(spark): Add HoodieSparkSqlUtils APIs for tooling (#18202)
by Surya Prasanna
· 9 days ago
fd0656c
fix: Use correct lastCompletedTransactionMetadata while acquiring lock for clustering (#18198)
by Surya Prasanna
· 9 days ago
894b817
chore(ci): cleanup for print statements, showing tables/schemas (#17771)
by Tim Brown
· 10 days ago
18ee6cd
fix: Fix string handling on bloom index (#18240)
by Tim Brown
· 10 days ago
f5d08ec
fix: Fix SHOW PARTITIONS commands functionality for slash-separated date partitioning (#18195)
by Surya Prasanna
· 10 days ago
8c5f832
fix: Fix typos across codebase (#18232)
by Xinli Shang
· 11 days ago
fbeb93d
fix: Remove trailing colon from incomplete error message in HoodieTableMetadataUtil (#18233)
by Xinli Shang
· 11 days ago
f7b8059
feat(flink): Off-heap lookup join cache backed by RocksDB (#18231)
by Vova Kolmakov
· 11 days ago
b7f72d3
feat: support predicate push down in Hudi flink source v2 (#18212)
by Peter Huang
· 11 days ago
057af9e
chore(ci): Add Codecov coverage report in GitHub actions (#18230)
by Y Ethan Guo
· 13 days ago
9bb88cf
feat(blob): Blob schema definition (#18108)
by Tim Brown
· 14 days ago
3bab09a
fix: Handle case when 0 byte completed commit files present in the timeline (#18210)
by Surya Prasanna
· 2 weeks ago
431b4d8
docs(spark): Update description of modules related to integration with Spark (#18219)
by Geser Dugarov
· 2 weeks ago
eb1d772
fix(trino): Fix Docker initialization issue in the Trino plugin (#18220)
by vamsikarnika
· 2 weeks ago
9fa5e85
chore: Add .claude and .codex directories to .gitignore (#18213)
by vinoth chandar
· 2 weeks ago
dfe3220
refactor: Remove not used classes from `org.apache.hudi.spark.internal` (#18211)
by Geser Dugarov
· 2 weeks ago
bce8d59
fix(flink): Use blocking instant generation when CDC is enabled (#18206)
by Shuo Cheng
· 3 weeks ago
a7d301a
fix: revert (feat: support mini batch split reader) (#18200)
by Peter Huang
· 3 weeks ago
5180b49
feat(flink): lookup join with retry and async capabilities (#18193)
by Vova Kolmakov
· 3 weeks ago
1577d6e
fix: SimpleAvro-, NonpartitionedAvro- and ComplexAvroKeyGenerator are also valid for writing by Spark when meta-fields are disabled (#18187)
by Vova Kolmakov
· 3 weeks ago
2462ae5
refactor: Remove redundancy in index validation logic in HoodieIndexU… (#17911)
by voonhous
· 3 weeks ago
1846230
fix: throw correct exception when reading hoodie.properties file without access (#18176)
by Surya Prasanna
· 3 weeks ago
833ef62
fix: Empty write should not cause spark analysis errors with pre-commit validators (#18128)
by Krishen
· 3 weeks ago
762fbea
feat(blob): update approach to remove reliance on column groups, break down plan (#18013)
by Tim Brown
· 3 weeks ago
eeb0642
fix unsigned values in proto conversion to be positive (#18186)
by Tim Brown
· 3 weeks ago
53ee07b
fix(metadata-table): Fix failed deletes when updating MDT with clean metadata (#18035)
by Prashant Wason
· 3 weeks ago
1909598
perf(common): Make ThreadLocal variables in HoodieAvroDataBlock static (#18023)
by Prashant Wason
· 3 weeks ago
9788b8d
fix: Use local engine context for clean planning on metadata and non-partitioned tables (#17942)
by Surya Prasanna
· 3 weeks ago
3d5cf26
feat: Publish commits to process metrics for HoodieStreamer (#17929)
by Surya Prasanna
· 3 weeks ago
bd52f59
fix(flink): include exception stacktrace in error logs (#18091)
by Prashant Wason
· 3 weeks ago
513f62c
fix(spark): Fix TestSparkSchemaUtils failing with Spark 3.3 due to timestamp_ntz (#17917)
by Prashant Wason
· 3 weeks ago
ef484d4
fix: Include metadata file cache size option in the configuration for HFile reader (#18175)
by Shuo Cheng
· 3 weeks ago
b584c3c
test(concurrency): add tests for write conflicts with different conflict resolution strategies (#17501)
by Surya Prasanna
· 3 weeks ago
7a88b6b
refactor: Add Lombok annotations to hudi-common module (part 5) (#17878)
by voonhous
· 3 weeks ago
012b572
[MINOR] Publish HUDI version metrics as integers (#17466)
by Prashant Wason
· 3 weeks ago
cf4a194
feat: add Presto to Hudi Notebooks (#18078)
by Ranga Reddy
· 3 weeks ago
5720910
test: add unit test for multiple partition filters on same column (#17934)
by Surya Prasanna
· 3 weeks ago
3244013
feat: Add metadata record_index lookup command to Hudi CLI (#17940)
by Surya Prasanna
· 3 weeks ago
589264c
fix: flink source v2 serializability (#18165)
by Peter Huang
· 3 weeks ago
419f232
[HUDI-9730] RFC-99 Hudi Type System (#13743)
by Balaji Varadarajan
· 3 weeks ago
894df49
feat(schema): Consolidate null type handling (#18163)
by Tim Brown
· 3 weeks ago
cabaf50
[MINOR] Fix HoodieLockMetrics.createTimerForMetrics to not share metric timer (#18097)
by Lokesh Jain
· 3 weeks ago
1ae0d5e
feat: add flink stream read metrics for hudi source v2 (#18130)
by Peter Huang
· 3 weeks ago
51a0e81
fix: correct unsigned int conversion in TestProtoConversionUtil (#18120)
by Surya Prasanna
· 3 weeks ago
b7f4468
fix: interrupt storage LP when heartbeat fails (#17870)
by Alex R
· 3 weeks ago
dbbf86d
Adds a guardrail to prevent the creation of the SparkRDDWriteClient when Spark's speculative execution is enabled (#18045)
by Prashant Wason
· 4 weeks ago
1965271
fix: (table-services) When using multiwriter do not delete pending rollback plan if exception is thrown while reading it (#18093)
by Krishen
· 4 weeks ago
4c9dcb3
[MINOR] Preload file listing for partitions in BloomIndex to avoid repeated listings (#17462)
by Prashant Wason
· 4 weeks ago
76f8ffb
refactor: Add Lombok annotations to hudi-common module (part 6) (#17880)
by voonhous
· 4 weeks ago
14d8894
refactor: apply lombok for flink source v2 related classes (#18122)
by Peter Huang
· 4 weeks ago
241450e
feat: support partition pruner in Flink hudi source v2 (#18074)
by Peter Huang
· 4 weeks ago
72c9f6f
feat(schema): Minor cleanup of Avro schema usage (#18043)
by Tim Brown
· 4 weeks ago
d92e99b
feat: Lance schema evolution (add column, type promotion) (#17904)
by Rahil C
· 4 weeks ago
ccce6e1
feat: support flink split distribution strategy (#18082)
by Peter Huang
· 4 weeks ago
013a165
feat(schema): Remove usage of migrated AvroSchemaUtils and HoodieAvroUtils methods (part 1) (#18007)
by Tim Brown
· 4 weeks ago
63275b3
perf: Avoid re-fetching file status from FS for HFile readers (#17709)
by Tim Brown
· 4 weeks ago
3aa3fd1
fix: exit transaction with error in storage LP when unlock failure due to lock acquired by others (#17871)
by Alex R
· 4 weeks ago
695b9cf
feat(schema): Remove direct reliance on Avro for schema compatibility checks (#18006)
by Tim Brown
· 4 weeks ago
86c5c81
fix: disable retries in s3/gcs storage lock clients for storage based LP (#17869)
by Alex R
· 4 weeks ago
6eaf809
fix: correct deleted keys computation in computeRevivedAndDeletedKeys (#18094)
by vamsikarnika
· 4 weeks ago
4c819a5
perf: Support lazy clean of the RLI cache during bucket assigning (#18018)
by Shuo Cheng
· 4 weeks ago
a59cd12
fix: Handle Non-Null Complex Types with Nullable Elements in ParquetSchemaConverter (#18087)
by Prashant Wason
· 4 weeks ago
5b741b6
refactor: migrate to ScanV2Internal API and remove ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN config (#17520)
by Surya Prasanna
· 4 weeks ago
ff4da47
fix: reload table config after record index bootstrap to avoid bloom index fallback (#17508)
by Surya Prasanna
· 4 weeks ago
55510c4
refactor: Add Lombok annotations to hudi-utilities (Part 2) (#17876)
by voonhous
· 4 weeks ago
6720849
refactor: Add Lombok annotations to hudi-common module (part 4) (#17830)
by voonhous
· 4 weeks ago
c293098
fix: Ensure Lance works when populateMetaFields is false with user defined keygen (#18042)
by Rahil C
· 4 weeks ago
00f5831
fix: Add config version information to DataSourceOptions (#17733)
by huangxiaoping
· 5 weeks ago
c8dddc0
fix(utilities): Use passed-in configs when propsFilePath is null or empty in HoodieStreamer (#17467)
by Prashant Wason
· 5 weeks ago
df58b8b
address feedback (#18063)
by Tim Brown
· 5 weeks ago
0432611
fix: allows eager failure from abnormals for streaming write (#12150)
by fhan
· 5 weeks ago
4a7b623
feat(metadata): Handle metadata table service failures gracefully and emit metrics (#17930)
by Surya Prasanna
· 5 weeks ago
21eb05e
fix: Use TableSchemaResolver in setWriteSchemaForDeletes for better schema resolution (#18030)
by Prashant Wason
· 5 weeks ago
983c03d
feat: Support slash separated date partitioning for Hudi tables (#17787)
by Surya Prasanna
· 5 weeks ago