{
  "log": [
    {
      "commit": "6acfd53d69b70625f595014ec1e6b0f1a52d9073",
      "tree": "ab6bee3796a849b8c62c2b4821c36da4dd48d48b",
      "parents": [
        "eb692006e592442a2364273bd162ae241a79d094"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Tue Apr 07 10:59:54 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Apr 07 10:59:54 2026 +0800"
      },
      "message": "fix(spark): Increment segmentIndex when skipping segment due to crc check failure (#2746)\n\n### What changes were proposed in this pull request?\n\nWhen segment is skipped, we should increment the segmentIndex; otherwise, it may lead to data corruption.\n\n### Why are the changes needed?\n\nfix bug\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nadded test case"
    },
    {
      "commit": "eb692006e592442a2364273bd162ae241a79d094",
      "tree": "7eb2a74eb1b092e3e574df7e3efdc728ffc1dc74",
      "parents": [
        "ae417e63228dc46157c35c660fc5598650f6a618"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Thu Apr 02 11:45:08 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Apr 02 11:45:08 2026 +0800"
      },
      "message": "[#2739] feat(server): Trigger flush when there are too many blocks in shuffle buffer (#2744)\n\n### What changes were proposed in this pull request?\nTrigger flush when there are too many blocks in shuffle buffer.\n\n### Why are the changes needed?\nFix: #2739\nPrevent memory overflow\n\n### Does this PR introduce any user-facing change?\nset rss.server.buffer.blockCount.capacity\n\n### How was this patch tested?\nUT"
    },
    {
      "commit": "ae417e63228dc46157c35c660fc5598650f6a618",
      "tree": "36a86091fbcc4be299ae5bb5d8af69b429ca5a37",
      "parents": [
        "b80940d6eb0bc2de362834c3c3b50842c124b203"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Mar 27 17:39:55 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 17:39:55 2026 +0800"
      },
      "message": "fix(spark): potential hang with skipped segments on overlapping decompression  (#2745)\n\n### What changes were proposed in this pull request?\n\nfix the PR #2735 \n\n### Why are the changes needed?\n\nfix potential hang\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests"
    },
    {
      "commit": "b80940d6eb0bc2de362834c3c3b50842c124b203",
      "tree": "37c17d2244b2bd524de8ad38cde41c8077e34376",
      "parents": [
        "b98b488198da3c130fa86be6e8e9f09407f4f84b"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Wed Mar 25 10:34:48 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 25 10:34:48 2026 +0800"
      },
      "message": "[#2740] improvement(server) Just check block count while checking commit result (#2742)\n\n### What changes were proposed in this pull request?\nReplace Roaring64NavigableMap with AtomicLong for checking commit result\n\n### Why are the changes needed?\nFix: #2740\nTo save memory.\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nCI"
    },
    {
      "commit": "b98b488198da3c130fa86be6e8e9f09407f4f84b",
      "tree": "6c9d3785a678560a91bf6d9df9865073745b33b8",
      "parents": [
        "2f0b954ae4d5683b74c107514791d88b2cf1f181"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Thu Mar 19 16:56:54 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Mar 19 16:56:54 2026 +0800"
      },
      "message": "[#2738] feat(server): add metrics to track shuffle data block count and avg block size (#2741)\n\n### What changes were proposed in this pull request?\nAdd metrics to track shuffle data block count and avg block size.\n\n### Why are the changes needed?\nFix: #2738\nWe need to identify and stop the application that generate large amounts of blocks when heap memory is insufficient.\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nCI and manual testing"
    },
    {
      "commit": "2f0b954ae4d5683b74c107514791d88b2cf1f181",
      "tree": "d72294f59c31f7982a9b252c0f9a3176bb982089",
      "parents": [
        "2963220f20ff0dd1dd13e057b598da31276be73b"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Thu Mar 05 19:32:32 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Mar 05 19:32:32 2026 +0800"
      },
      "message": "[#2601][FOLLOWUP] fix(spark): Release segmentPermits before buffer getting to avoid deadlock in decompression worker (#2737)\n\n### What changes were proposed in this pull request?\n\nRelease segmentPermits first to avoid deadlock\n\n### Why are the changes needed?\n\n#2601 caused ShuffleReadClientImplTest to hang, see https://github.com/apache/uniffle/actions/runs/22700481546/job/65816264462?pr\u003d2736\n\nThreads dump:\n\n```\n\"main\" #1 prio\u003d5 os_prio\u003d31 tid\u003d0x0000000152014000 nid\u003d0x1a03 waiting on condition [0x000000016b5a9000]\n   java.lang.Thread.State: WAITING (parking)\n\tat sun.misc.Unsafe.park(Native Method)\n\t- parking to wait for  \u003c0x000000076c844f78\u003e (a java.util.concurrent.CompletableFuture$Signaller)\n\tat java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)\n\tat java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)\n\tat java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3334)\n\tat java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)\n\tat java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)\n\tat org.apache.uniffle.client.response.DecompressedShuffleBlock.getByteBuffer(DecompressedShuffleBlock.java:62)\n\tat org.apache.uniffle.client.response.DecompressedShuffleBlock.getUncompressLength(DecompressedShuffleBlock.java:53)\n\tat org.apache.uniffle.client.impl.DecompressionWorker.get(DecompressionWorker.java:165)\n\tat org.apache.uniffle.client.impl.ShuffleReadClientImpl.readShuffleBlockData(ShuffleReadClientImpl.java:329)\n\tat org.apache.uniffle.client.TestUtils.validateResult(TestUtils.java:55)\n\tat org.apache.uniffle.client.impl.ShuffleReadClientImplTest.readTest7(ShuffleReadClientImplTest.java:357)\n```\n\n```\n\"decompressionWorker-0\" #340 daemon prio\u003d5 os_prio\u003d31 tid\u003d0x000000011b921000 nid\u003d0x1481f waiting on condition [0x0000000329772000]\n   java.lang.Thread.State: WAITING (parking)\n\tat sun.misc.Unsafe.park(Native Method)\n\t- parking to wait for  \u003c0x000000076abc0630\u003e (a java.util.concurrent.Semaphore$NonfairSync)\n\tat java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)\n\tat java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)\n\tat java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)\n\tat java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)\n\tat java.util.concurrent.Semaphore.acquire(Semaphore.java:312)\n\tat org.apache.uniffle.client.impl.DecompressionWorker.lambda$add$0(DecompressionWorker.java:102)\n\tat org.apache.uniffle.client.impl.DecompressionWorker$$Lambda$490/0x00000008006a0028.get(Unknown Source)\n\tat java.util.concurrent.CompletableFuture$AsyncSupply.run$$$capture(CompletableFuture.java:1604)\n\tat java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:750)\n```\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "2963220f20ff0dd1dd13e057b598da31276be73b",
      "tree": "810fb3cd5c14e52c391b80e919d0c0f056682619",
      "parents": [
        "b324cc33c1457cc55eda981c577d2d5888177ed2"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Thu Mar 05 19:25:20 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Mar 05 19:25:20 2026 +0800"
      },
      "message": "fix(spark): Correct shuffle read time metrics in spark UI tab (#2736)\n\n### What changes were proposed in this pull request?\n\nCorrect shuffle read time metrics\n\n### Why are the changes needed?\n\nThe metrics for shuffle read time are mismatched.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t"
    },
    {
      "commit": "b324cc33c1457cc55eda981c577d2d5888177ed2",
      "tree": "31d63734e06c9f584c91f0f527b6904612a82753",
      "parents": [
        "4637321402dd09616e66065b600b97319dbeb144"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Mar 02 14:52:14 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Mar 02 14:52:14 2026 +0800"
      },
      "message": "[#2716] feat(spark): Introduce option of max segments decompression to control memory usage (#2735)\n\n### What changes were proposed in this pull request?\n\nIntroduce option of max segments decompression to control memory usage\n\n### Why are the changes needed?\n\nTo address issue #2716, this PR introduces an option to set the maximum number of concurrent decompression segments, allowing better control over overall memory usage. Setting this value to 1 restricts decompression to a single segment at a time. \n\n### Does this PR introduce _any_ user-facing change?\n\nYes\n\n### How was this patch tested?\n\nUnit test"
    },
    {
      "commit": "4637321402dd09616e66065b600b97319dbeb144",
      "tree": "6f5d6d63677163ffa2aa7e9af5eecd93cb39d923",
      "parents": [
        "5986591c96d9e7f628e4cfbf6c7e342841af5706"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Mon Mar 02 10:54:24 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Mar 02 10:54:24 2026 +0800"
      },
      "message": "[#2730] improvement(common): Use built-in `CompositeFileRegion` and remove unnecessary `deallocate` method (#2731)\n\n### What changes were proposed in this pull request?\n\n+ move `CompositeFileRegion` class to uniffle package\n+ do nothing in deallocate method\n\n### Why are the changes needed?\n\nSince we have already overridden the `release()/release(int decrement)` methods, we don\u0027t need to do anything in the `deallocate()` method. And after this, we no longer need to place it in the `io.netty.util` package.\n\ncloses #2730\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\n"
    },
    {
      "commit": "5986591c96d9e7f628e4cfbf6c7e342841af5706",
      "tree": "37b0a6c7a1461853b215b4b65680e8497f133b55",
      "parents": [
        "9c0c27dead125a50f4140b342c470e0ed13ec8f6"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Sat Feb 28 09:45:18 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sat Feb 28 09:45:18 2026 +0800"
      },
      "message": "[#2733] fix(spark): Calculate total value for `ShuffleReadTimesSummary` (#2734)\n\n### What changes were proposed in this pull request?\n\nCalculate total value for ShuffleReadTimesSummary.\n\n### Why are the changes needed?\n\nFix the total shuffle read time to zero in uniffle UI.\n\nCloses #2733\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nAfter:\n\u003cimg width\u003d\"1865\" height\u003d\"131\" alt\u003d\"image\" src\u003d\"https://github.com/user-attachments/assets/1ca5ee6c-24cc-45b8-b580-8d760273574a\" /\u003e\n\n"
    },
    {
      "commit": "9c0c27dead125a50f4140b342c470e0ed13ec8f6",
      "tree": "3d996303476e167b4dea4874dd79e015e84b9380",
      "parents": [
        "2731cf2df0de5c8322acbfbc2d81a35d1b7517cb"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Feb 12 15:33:26 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 12 15:33:26 2026 +0800"
      },
      "message": "[#2725] fix(spark)(partition-split): Add fallback under load-balance mode and fix stale assignment missing callback that caused timeout (#2729)\n\n### What changes were proposed in this pull request?\n\n1. Fallback to random server when no servers are available in load-balance mode\n2. Fix stale assignment missing callback in data pusher that caused the writer to hang until timeout, preventing reassign from being triggered\n\n### Why are the changes needed?\n\nfix the #2725 . Finally tracked down and fixed this tricky bug after a thorough investigation.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests\n"
    },
    {
      "commit": "2731cf2df0de5c8322acbfbc2d81a35d1b7517cb",
      "tree": "f2466db43d7ab7bfa05adebd8033e7ec4de46460",
      "parents": [
        "bdea9e675a5395d9d7c79935fa2610318b6f5b69"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Feb 12 10:10:33 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 12 10:10:33 2026 +0800"
      },
      "message": "[#2724] refactor(spark): Introduce `ReassignExecutor` to simplify shuffle writer logic (#2727)\n\n### What changes were proposed in this pull request?\n\nIntroduce `ReassignExecutor` to clarify shuffle writer logic\n\n### Why are the changes needed?\n\nfor #2724\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests\n"
    },
    {
      "commit": "bdea9e675a5395d9d7c79935fa2610318b6f5b69",
      "tree": "946671a7e8b6a811f23cbaec5dc3ab8a0b889213",
      "parents": [
        "eb53a1a7bb63dc0c7f1814e9772c1bb1e242d164"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Feb 12 10:09:38 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 12 10:09:38 2026 +0800"
      },
      "message": "chore(spark): Remove logs of successful heartbeat (#2728)\n\n### What changes were proposed in this pull request?\n\nRemove logs of successful heartbeat\n\n### Why are the changes needed?\n\nsimplify logs in spark driver side\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "eb53a1a7bb63dc0c7f1814e9772c1bb1e242d164",
      "tree": "426fbd4bd23a713e3033edd79712a9aaf57eab69",
      "parents": [
        "1f809ed263cb96dffded0988848977689c058d04"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Feb 10 19:37:13 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Feb 10 19:37:13 2026 +0800"
      },
      "message": "[#2725] feat(spark): Introduce optional fast-switch and ignore retry-count checking for stale assignment  (#2726)\n\n### What changes were proposed in this pull request?\n\n1.  Introduce option for stale assignment fast-switch. After having this, we could better to inspect some bugs if this mechanism is caused\n2. Ignore retry count checking for stale assignment to fix the multi server switch \n\n### Why are the changes needed?\n\nfor #2725 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests\n"
    },
    {
      "commit": "1f809ed263cb96dffded0988848977689c058d04",
      "tree": "fe29937189df6b7e2e1561acff7adcfb62827370",
      "parents": [
        "69b1b45bf4e6e03f4cfe3117416a80460f50537c"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Jan 30 16:46:43 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Jan 30 16:46:43 2026 +0800"
      },
      "message": "[#2718] feat(spark): Eliminate copy in WriterBuffer when compression off for Gluten (#2720)\n\n### What changes were proposed in this pull request?\n\n- This change introduces `CompositeByteBuf` in `WriterBuffer` to provide a zero-copy data view, which is especially beneficial in Gluten scenarios where compression is disabled and handled by the Gluten side. \n- In addition, the CRC32 generator is enhanced to operate directly on ByteBuf.\n\nFor better to achieve zero-copy, we\u0027d better to accept `ByteBuf` in codec, this could be finished in the next PRs.\n\n### Why are the changes needed?\n\nfor #2718 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests "
    },
    {
      "commit": "69b1b45bf4e6e03f4cfe3117416a80460f50537c",
      "tree": "7f9ee4202cdc887e211771f9e31eb1ae11ca8246",
      "parents": [
        "82ed9f8426ac12f7d3fdd3dacd55d9be00071c04"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Jan 29 10:31:06 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Jan 29 10:31:06 2026 +0800"
      },
      "message": "[#2714] feat(spark): Respect compression type when activating overlapping compression mechanism (#2715)\n\n### What changes were proposed in this pull request?\n\nRespect compression type when activating overlapping compression mechanism\n\n### Why are the changes needed?\n\n#2714 . Unify the overlapping compression checking logic into one place and also respect the compression type when activating this mechanism. \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests are enough. \n"
    },
    {
      "commit": "82ed9f8426ac12f7d3fdd3dacd55d9be00071c04",
      "tree": "c41954c8025e4a39f93129007b85a5d559d8fc1b",
      "parents": [
        "4290cfe79142878638dde41beb9c7e1602f758c9"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Jan 28 15:14:52 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Jan 28 15:14:52 2026 +0800"
      },
      "message": "[#2716] feat(client): More overlapping decompression stats to log (#2717)\n\n### What changes were proposed in this pull request?\n\nAdd more decompression stats to log\n1. peek memory used to dig when OOM happens\n2. decompression throughput \n\n### Why are the changes needed?\n\nto fix #2716 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests are enough\n\n---------\n\nCo-authored-by: Junfan Zhang \u003czhangjunfan@qiyi.com\u003e"
    },
    {
      "commit": "4290cfe79142878638dde41beb9c7e1602f758c9",
      "tree": "315d1ca5078f7774f1b0d1a2b40f201cab0b80d7",
      "parents": [
        "3525bab94eef75d1a9002fafb1579534b9d6307d"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Jan 21 15:31:48 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Jan 21 15:31:48 2026 +0800"
      },
      "message": "feat(doc): Update spark related performance guide in doc (#2713)\n\n### What changes were proposed in this pull request?\n\nUpdate spark related performance guide in doc\n\n### Why are the changes needed?\n\nTo show the latest spark performance guide in doc\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n\n---\nCo-authored-by: zhangjunfan \u003czhangjunfan@qiyi.com\u003e"
    },
    {
      "commit": "3525bab94eef75d1a9002fafb1579534b9d6307d",
      "tree": "4d3292b0a0a808fd1430e3eb38a8388c2554b1a1",
      "parents": [
        "f43f66fd4b88033689d06d5192e4a612046ef31c"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Jan 15 10:49:00 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Jan 15 10:49:00 2026 +0800"
      },
      "message": "[#2711] fix(spark): Race condition on deferred compressed block initialization (#2712)\n\n### What changes were proposed in this pull request?\n\nFix race condition on deferred compressed block initialization\n\n### Why are the changes needed?\n\nfix #2711 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting unit tests\n\n---------\n\nCo-authored-by: zhangjunfan \u003czhangjunfan@qiyi.com\u003e"
    },
    {
      "commit": "f43f66fd4b88033689d06d5192e4a612046ef31c",
      "tree": "61b36734ddc993b9dd1f04304c94fd19b17259d3",
      "parents": [
        "2d5fc0a2b8ffb0bece50cff76a864bc3a992660d"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Tue Jan 13 14:17:26 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Jan 13 14:17:26 2026 +0800"
      },
      "message": "[#2707] fix(server): Catch up on any failures in `calcTopNShuffleDataSize` (#2708)\n\n### What changes were proposed in this pull request?\n\n1. Wrap calcTopNShuffleDataSize in error handling\n2. Use `scheduleWithFixedDelay` instand of `scheduleAtFixedRate`\n\n### Why are the changes needed?\n\nFix: #2707 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nCI"
    },
    {
      "commit": "2d5fc0a2b8ffb0bece50cff76a864bc3a992660d",
      "tree": "0911832fe20a50b6ce4e5899df135f752ec93e40",
      "parents": [
        "13651156383ca0c29ab984d75c825c5a8c9145db"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Tue Jan 13 14:15:23 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Jan 13 14:15:23 2026 +0800"
      },
      "message": "[#2709] fix(spark): Fix serialization error in Spark History UI (#2710)\n\n### What changes were proposed in this pull request?\n\n+ Add cleaner ShuffleReadTimesSummary/ShuffleWriteTimesSummary entities\n+ Make ShuffleType as an enumeration\n\n### Why are the changes needed?\n\nfor #2709 \n\nFix serialization error when enabled `spark.history.store.hybridStore.enabled`\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nSuccessfully tested in internal spark history server environment.\n\n---------\n\nCo-authored-by: Junfan Zhang \u003czuston@apache.org\u003e"
    },
    {
      "commit": "13651156383ca0c29ab984d75c825c5a8c9145db",
      "tree": "416336b0bf0bc1785a18d3a151f3594c42c69709",
      "parents": [
        "cf29d36f9bdf6ab45cf2e9b2ee914122aa149ee2"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Thu Jan 08 19:20:03 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Jan 08 19:20:03 2026 +0800"
      },
      "message": "[#2705] fix(spark): Use read-write lock for `MutableShuffleHandleInfo` to avoid global locking (#2706)\n\n### What changes were proposed in this pull request?\n\nAdd read-write lock for MutableShuffleHandleInfo to avoid global locking\n\n### Why are the changes needed?\n\nTo fix a critical and potentially bug, particularly during the startup of large tasks in Spark jobs, although #2667 only partially addresses the issue.\n\ncloses #2705\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t"
    },
    {
      "commit": "cf29d36f9bdf6ab45cf2e9b2ee914122aa149ee2",
      "tree": "9b30b82c0b555adaad4d22b37ef8ce0f219c9bd4",
      "parents": [
        "7867d59b1b15d1021cb15528db89e64e42508d3b"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Mon Dec 29 16:23:46 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Dec 29 16:23:46 2025 +0800"
      },
      "message": "[#2674] improvement(client): use ack val to check the block send result (#2703)\n\n### What changes were proposed in this pull request?\nUse ack val to check the block send result.\n\n### Why are the changes needed?\nFor better performance.\nFix: #2674\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nCI"
    },
    {
      "commit": "7867d59b1b15d1021cb15528db89e64e42508d3b",
      "tree": "8b2f626ec0306c22da2d077c7890d141e407182c",
      "parents": [
        "a2c2d056ce8fd49af025eaa48785b94d21515557"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Wed Dec 24 10:36:58 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Dec 24 10:36:58 2025 +0800"
      },
      "message": "[#2701] fix(server): release the memory of duplicate blocks (#2702)\n\n### What changes were proposed in this pull request?\nRelease the memory of duplicate blocks.\n\n### Why are the changes needed?\nThe used_buffer_size is incorrect when duplicate blocks occur.\nFix: #2701\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nUT"
    },
    {
      "commit": "a2c2d056ce8fd49af025eaa48785b94d21515557",
      "tree": "ace82e772e48844d466eb37ff563a23e8f4b1bd7",
      "parents": [
        "74203516a37acc34c72e7e0fb830cd882faa74d7"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Dec 18 11:40:46 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Dec 18 11:40:46 2025 +0800"
      },
      "message": "[#2697] refactor(spark): Involve related writer stats info into ShuffleWriteTaskStats (#2698)\n\n### What changes were proposed in this pull request?\n\nInvolve related writer stats info into ShuffleWriteTaskStats for further integrity validation block checksum implementation\n\n### Why are the changes needed?\n\nfor #2697\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting unit tests\n\n---------\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e"
    },
    {
      "commit": "74203516a37acc34c72e7e0fb830cd882faa74d7",
      "tree": "b4ab75901fd64c7f8f4ebeedd9b4e8bedc4da169",
      "parents": [
        "c7e23b6b72b573f1a5019a41c1ca16a36cc3932a"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Dec 16 15:28:15 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Dec 16 15:28:15 2025 +0800"
      },
      "message": "[#2686] fix(client): Prefetch should be finished once shuffle result is empty or null (#2696)\n\n### What changes were proposed in this pull request?\n\nPrefetch should be finished once shuffle result is empty or null\n\n### Why are the changes needed?\n\nfix #2686 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests\n"
    },
    {
      "commit": "c7e23b6b72b573f1a5019a41c1ca16a36cc3932a",
      "tree": "fc437f44fbc8a15d9b8f91810f82ca02fae569c4",
      "parents": [
        "741ecba1a1157f6e2527469282f624748c9d8769"
      ],
      "author": {
        "name": "Mark Wadham",
        "email": "124088+m4rkw@users.noreply.github.com",
        "time": "Wed Dec 10 02:48:30 2025 +0000"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Dec 10 10:48:30 2025 +0800"
      },
      "message": "chore: Fix grammar in RssException message (#2695)\n\n### What changes were proposed in this pull request?\n\nCorrection of grammar.\n\n### Why are the changes needed?\n\nSo that the grammar will be correct.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nN/A"
    },
    {
      "commit": "741ecba1a1157f6e2527469282f624748c9d8769",
      "tree": "2f147d2568df8df00cff59329e52c0aa9f1d5e8f",
      "parents": [
        "4c2fd4e103b765535a685316e9cc17fcaae942f4"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Dec 09 17:04:36 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Dec 09 17:04:36 2025 +0800"
      },
      "message": "[#2691] feat(client): Introduce the `HARD_SPLIT_FROM_SERVER` response status code (#2694)\n\n### What changes were proposed in this pull request?\n\nThis PR introduces the `HARD_SPLIT_FROM_SERVER` response status code\n\n### Why are the changes needed?\n\nthe subtask for the #2691\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNo need\n"
    },
    {
      "commit": "4c2fd4e103b765535a685316e9cc17fcaae942f4",
      "tree": "6466887b2ac7b32794a728400a6746a9a1b4e8c8",
      "parents": [
        "61e47b30fe1e86598125684fdf515bd239305655"
      ],
      "author": {
        "name": "advancedxy",
        "email": "807537+advancedxy@users.noreply.github.com",
        "time": "Fri Dec 05 20:33:14 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Dec 05 20:33:14 2025 +0800"
      },
      "message": "chore: Update lz4 to address CVE-2025-12183 (#2693)\n\n### What changes were proposed in this pull request?\n1. upgrade lz to the latest version of org.lz4:lz4-java \n2. replace `fastestInstance` to `safeInstance`\n \n### Why are the changes needed?\nTo address [CVE-202512183](https://sites.google.com/sonatype.com/vulnerabilities/cve-2025-12183)\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\nExisting tests."
    },
    {
      "commit": "61e47b30fe1e86598125684fdf515bd239305655",
      "tree": "62a57268c7b0f5405482161bfbcd87f3af864a47",
      "parents": [
        "5bbe25e6cb643beaa9efe409f0e3a37ca5d86109"
      ],
      "author": {
        "name": "KCH",
        "email": "127062806+CPkch@users.noreply.github.com",
        "time": "Wed Dec 03 11:13:23 2025 +0900"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Dec 03 10:13:23 2025 +0800"
      },
      "message": "[#2672] fix(server):  NPE in PartitionedShuffleBlockIdManager (#2690)\n\n### What changes were proposed in this pull request?\n\nAdded null checks to prevent NullPointerException in PartitionedShuffleBlockIdManager.getFinishedBlockIds():\n\nAdd null check for partitionToBlockId when shuffleId doesn\u0027t exist\nAdd null check for bitmap before calling or() method\n\n### Why are the changes needed?\n\nFix: #2672 \n\nThe method was throwing NullPointerException when:\n\nshuffleId data doesn\u0027t exist in the map\nbitmap is null for a specific partition\nThese defensive checks prevent the exception and gracefully handle edge cases.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUT\n"
    },
    {
      "commit": "5bbe25e6cb643beaa9efe409f0e3a37ca5d86109",
      "tree": "a1eefd8ccaa1f675444685cacb9ae4b895f8e89b",
      "parents": [
        "8cb662524f7e215391dfb77edfefb0ade8faa745"
      ],
      "author": {
        "name": "zhan7236",
        "email": "76658920+zhan7236@users.noreply.github.com",
        "time": "Tue Dec 02 17:39:24 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Dec 02 17:39:24 2025 +0800"
      },
      "message": "[#2675] test(spark)(followup): Add tests for Roaring64NavigableMap optimization in checkSentBlockCount (#2692)\n\n### What changes were proposed in this pull request?\n\nAdd unit tests to verify the `Roaring64NavigableMap` optimization introduced in PR #2687 for filtering duplicate blockIds from multiple replicas in `RssShuffleWriter#checkSentBlockCount`.\n\n### Why are the changes needed?\n\nAs suggested by the reviewer in PR #2687, tests should be added to cover the `Roaring64NavigableMap` optimization change.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNew tests added:\n- `testRoaring64NavigableMapDeduplication`: Verifies correct deduplication of blockIds across multiple servers (replicas)\n- `testRoaring64NavigableMapWithLargeBlockIds`: Verifies behavior with large number of consecutive blockIds\n\nTest results:\n- Spark3 `RssShuffleWriterTest`: 9 tests passed (including 2 new tests)\n- Spark2 `RssShuffleWriterTest`: 5 tests passed (including 2 new tests)\n\nRelated PR: #2687"
    },
    {
      "commit": "8cb662524f7e215391dfb77edfefb0ade8faa745",
      "tree": "c5e9e7b65f3b01885fbebf2293ba4ea92a21324e",
      "parents": [
        "b53c535c538b1f93b2516b1abf8b6ef92c784b5d"
      ],
      "author": {
        "name": "zhan7236",
        "email": "76658920+zhan7236@users.noreply.github.com",
        "time": "Mon Dec 01 10:34:28 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Dec 01 10:34:28 2025 +0800"
      },
      "message": "[#1603] feat(spark): Disable dataPusher initialization for Spark Driver (#2688)\n\n### What changes were proposed in this pull request?\n\nThis PR disables the `dataPusher` initialization for Spark driver in cluster mode, as it\u0027s only needed for executors to push shuffle data.\n\n### Why are the changes needed?\n\nThe `dataPusher` is used to push shuffle data to shuffle servers, which is an executor-side operation. In cluster mode, driver does not push shuffle data, so initializing `dataPusher` (along with its thread pool) for driver is unnecessary and wastes resources.\n\n**Note**: In local mode, driver also acts as executor, so `dataPusher` is still initialized in local mode.\n\n### Does this PR introduce any user-facing change?\n\nNo.\n\n### How was this patch tested?\n\n- Compiled successfully for both Spark 2 and Spark 3 profiles\n- All unit tests pass\n- Integration tests pass (tested `CombineByKeyTest` and `GroupByKeyTest` which were previously failing)\n\n### Summary of Changes:\n1. **RssShuffleManagerBase.java**: Wrapped `dataPusher` initialization with `(!isDriver || isLocalMode)` condition, where `isLocalMode` is determined by checking if `spark.master` starts with \"local\"\n2. **RssShuffleManager.java (Spark 3)**: Added null check for `dataPusher.setRssAppId()` calls\n3. **RssShuffleManager.java (Spark 2)**: Added null check for `dataPusher.setRssAppId()` calls\n\nCloses #1603"
    },
    {
      "commit": "b53c535c538b1f93b2516b1abf8b6ef92c784b5d",
      "tree": "8f217c75359d115fc561b565bccf0504b1b5a23e",
      "parents": [
        "fa80c346e9eafc69ef7d752cee029e06913e8ad5"
      ],
      "author": {
        "name": "zhan7236",
        "email": "76658920+zhan7236@users.noreply.github.com",
        "time": "Mon Dec 01 10:32:44 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Dec 01 10:32:44 2025 +0800"
      },
      "message": "[#2675] improvement(spark): Optimize `checkSentBlockCount` by using Roaring64NavigableMap (#2687)\n\n### What changes were proposed in this pull request?\n\nReplace `HashSet\u003cLong\u003e` with `Roaring64NavigableMap` in `RssShuffleWriter#checkSentBlockCount` method for both Spark2 and Spark3 clients. This optimization uses a compressed bitmap data structure to filter duplicate blockIds from multiple replicas.\n\nChanges:\n- Added import for `org.roaringbitmap.longlong.Roaring64NavigableMap`\n- Replaced `Set\u003cLong\u003e blockIds \u003d new HashSet\u003c\u003e()` with `Roaring64NavigableMap blockIdBitmap \u003d Roaring64NavigableMap.bitmapOf()`\n- Changed `blockIds.addAll(x)` to `x.forEach(blockIdBitmap::addLong)`\n- Changed `blockIds.size()` to `blockIdBitmap.getLongCardinality()`\n\n### Why are the changes needed?\n\n`Roaring64NavigableMap` is a compressed bitmap data structure that is more memory-efficient than `HashSet\u003cLong\u003e`, especially when storing large numbers of blockIds (which are typically consecutive or near-consecutive long integers). This optimization can significantly reduce memory usage in large-scale shuffle scenarios.\n\nFix: #2675\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\n- Compiled successfully with both `spark2` and `spark3` profiles\n- All existing unit tests pass:\n  - `RssShuffleWriterTest` for Spark3: 7 tests passed\n  - `RssShuffleWriterTest` for Spark2: 3 tests passed\n- Code style verified with `mvn spotless:check`"
    },
    {
      "commit": "fa80c346e9eafc69ef7d752cee029e06913e8ad5",
      "tree": "ed607a87d06996c89e70cf1b01c953255b4b9aa6",
      "parents": [
        "43bfd2012b624b3eb91f473582073784cdcf4072"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Dec 01 10:29:08 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Dec 01 10:29:08 2025 +0800"
      },
      "message": "[#2684] fix: Infinite memory data reading due to duplicate blockId (#2685)\n\n### What changes were proposed in this pull request?\n\nThis PR introduces an additional `isEnd` flag to indicate whether the end of the in-memory data has been reached.\nFor simplicity, this flag does not guarantee precise end-of-data detection; the client should still rely on its existing logic to determine completion. However, this flag should be sufficient to cover most cases.\n\nAttention that after this PR, the upgrade should first be performed on the client side, followed by upgrading the shuffle servers.\n\n### Why are the changes needed?\n\nfor #2684 .\n\nAs we all know, reading memory data is based on `last_block_id`, which determines the starting position. However, when duplicate `block_id`s appear in the sequence, the client may read duplicate blocks. This is acceptable because the client relies on the `processed_block_ids` mechanism to handle duplicates.\n\nThe issue arises when a duplicate block appears at the end of the memory data sequence. In this case, the client keeps receiving the same last block repeatedly, causing the read operation to run indefinitely, as illustrated below.\n\nFirst round of blockIds:\n```\n703702683883796\n844440172239124\n703702683884548\n985177660594452\n1125915148949780  (duplicate)\n844440172239876\n985177660595204\n1125915148950532\n1266652637305860\n985177660594452  \n1125915148949780  (duplicate)\n```\n\nSecond round of blockIds:\n```\n844440172239876\n985177660595204\n1125915148950532\n1266652637305860\n985177660594452 \n1125915148949780  (duplicate)\n```\n\nAt this point, the duplicated last blocks cause the client to repeatedly read the same tail of the sequence, resulting in an infinite loop.\n\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nexisting unit tests.\n"
    },
    {
      "commit": "43bfd2012b624b3eb91f473582073784cdcf4072",
      "tree": "ca00ac906e2058a742fb8274cff8d856e611a575",
      "parents": [
        "b6848f86c2107fe3a5d2428e4eed1debb88f00f0"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Nov 24 10:20:40 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Nov 24 10:20:40 2025 +0800"
      },
      "message": "[#2679] fix(spark): Potential data mismatch on overlapping decompression (#2680)\n\n### What changes were proposed in this pull request?\n\nThis PR is to fix the potential data mismatch when encountering the duplicate blockIds when the overlapping decompression is enabled.\n\nIn the one batch remote fetching, if partial blocks should be filtered out due to duplicating or unneed, we should skip it to inc the `segmentIndex` \n\n### Why are the changes needed?\n\nfix the data mismatch cases, that is found by the uniffle integrity validation mechanism.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests."
    },
    {
      "commit": "b6848f86c2107fe3a5d2428e4eed1debb88f00f0",
      "tree": "01cf1d473df2b2c8556ac33d6638a23948fc7dd4",
      "parents": [
        "d6df94ca8ed313ec63986273a744fa3574e78b98"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Nov 21 16:00:30 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Nov 21 16:00:30 2025 +0800"
      },
      "message": "[#2682] feat(spark): Make shuffleWriteTaskStats visible about integrity validation for Gluten (#2683)\n\n### What changes were proposed in this pull request?\n\nMake `shuffleWriteTaskStats` visible about integrity validation for Gluten\n\n### Why are the changes needed?\n\ntracked in #2682 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t"
    },
    {
      "commit": "d6df94ca8ed313ec63986273a744fa3574e78b98",
      "tree": "517f8182a88dbb0aa62064c3fd9821c78999b33b",
      "parents": [
        "afe1b9a52d69abc28c8e693620c5170e101dbf88"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Nov 21 11:37:16 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Nov 21 11:37:16 2025 +0800"
      },
      "message": "[#2673] feat(spark)(part-2): Merge partition stats for partition split on integrity validation (#2681)\n\n### What changes were proposed in this pull request?\n\nThis PR is to fix the incorrect aggregated expected record numbers when the partition split is activate. \n\n### Why are the changes needed?\n\nIf the partition split is activate and the server management is enabled for the integrity validation, the records check will fail due to the unmerged partition stats. \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests.\n"
    },
    {
      "commit": "afe1b9a52d69abc28c8e693620c5170e101dbf88",
      "tree": "de03d9176322ee0c95b0780e4e33826b11327f49",
      "parents": [
        "de55bd90eecab19301c688a105f85a9b7873774a"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Nov 19 12:04:27 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Nov 19 12:04:27 2025 +0800"
      },
      "message": "improvement(spark): Move the reassign info logs to DEBUG to cut down on noise (#2677)\n\n### What changes were proposed in this pull request?\n\nMove the reassign info logs to DEBUG to cut down on noise\n\n### Why are the changes needed?\n\nto reduce nosiy logs\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t"
    },
    {
      "commit": "de55bd90eecab19301c688a105f85a9b7873774a",
      "tree": "15b50fa4b8ceba8a33f1b84bf72e7aab496b91e9",
      "parents": [
        "b40c5091dc1d11d39df24047613468be1b17dde3"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Nov 14 14:06:33 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Nov 14 14:06:33 2025 +0800"
      },
      "message": "[#2673] feat(spark)(part-1): Add client-side support for storing partition stats on shuffle servers (#2669)\n\n### What changes were proposed in this pull request?\n\nThis is the part-1 PR only with uniffle client changes of making the partition stats stored in the shuffle-server side to make the integrity validation mechanism more stable. BTW, the shuffle-servers side changes will be implemented in the further PRs, and this PR is also compatible with  the legacy shuffle-server protocol.\n\n### Why are the changes needed?\n\nthe subtask for the issue #2673.\n\nBy leveraging the PR #2653 , we could end-to-end ensure the data consistency. But, the partition stats stored in the spark driver side, for the normal spark stages, this design runs well. But with the 100000 tasks with 10000 partitions, this will make the Spark driver overload. From the point of cluster spark jobs, some huge jobs will hang when getting the blockManagerIds, that will cost almost 20mins for one reader task, that is unacceptable. \n\nAnd so, this PR implements the server side store the partition stats like the blockID store did.\n\n### Does this PR introduce _any_ user-facing change?\n\n`spark.rss.client.integrityValidation.serverManagementEnabled\u003dfalse`\n\n### How was this patch tested?\n\nInternal job tests.\n"
    },
    {
      "commit": "b40c5091dc1d11d39df24047613468be1b17dde3",
      "tree": "0ff00863431dc73134bce40fa5ace79f502c718c",
      "parents": [
        "d6c5988208f071d7f2e226c3c45b8f623c49ff71"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Nov 13 20:23:19 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Nov 13 20:23:19 2025 +0800"
      },
      "message": "[#2665] feat(spark): Reconstruct the shuffle handle from initial spark handle it haven\u0027t bee updated (#2667)\n\n### What changes were proposed in this pull request?\n\nReconstruct the shuffle handle information from static spark handle if it hasn’t been updated. This means the latest shuffle handle data will no longer need to be transmitted through the RPC layer, reducing the load on the driver—especially when there are a large number of tasks.\n\n### Why are the changes needed?\n\nFor issue #2665: Once partition reassignment is enabled, the shuffle information is always retrieved when a task starts, which puts significant pressure on the Spark driver. Although in most cases the shuffle information remains unchanged, this behavior provides a natural point for optimization.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests could cover this PR\u0027s change.\n"
    },
    {
      "commit": "d6c5988208f071d7f2e226c3c45b8f623c49ff71",
      "tree": "d7aa6e9125d57e2c973472e6dff50fec95a41f24",
      "parents": [
        "f736c730262855bb97ee6fc2f3f5a6a461ae164c"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Nov 13 17:01:22 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Nov 13 17:01:22 2025 +0800"
      },
      "message": "feat(spark): Show shuffle failures into spark UI (#2668)\n\n### What changes were proposed in this pull request?\n\nShow the shuffle failure reason into the spark UI\n\n### Why are the changes needed?\n\nfollowup issue #2508 to be more easier to find out the root cause of shuffle failure\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal job tests\n"
    },
    {
      "commit": "f736c730262855bb97ee6fc2f3f5a6a461ae164c",
      "tree": "f740dec639f1b053761525ac9ae74c55966b0edc",
      "parents": [
        "a37936f2bc34f6c38082ea85559ea70ebfdc0f26"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Nov 11 09:52:41 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Nov 11 09:52:41 2025 +0800"
      },
      "message": "[#2648] fix(spark): Incorrect fetched bytes metric when overlapping decompression is enabled (#2650)\n\n### What changes were proposed in this pull request?\n\nCorrect fetched bytes metric when overlapping decompression is enabled\n\n### Why are the changes needed?\n\nIn the current codebase, the shuffle read bytes will be as the uncompressed byte size, that is inconsistent with the writer side statistics.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal job tests.\n"
    },
    {
      "commit": "a37936f2bc34f6c38082ea85559ea70ebfdc0f26",
      "tree": "efd4d7e69efabe7105df371b3270c01cc31017b9",
      "parents": [
        "1f371e854314a92aaa115dedcfbb0e2194617f7b"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Nov 07 13:56:36 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Nov 07 13:56:36 2025 +0800"
      },
      "message": "[#2652] feat(spark): Add compression for task write stats (#2666)\n\n### What changes were proposed in this pull request?\n\n1. Add compression for task write stats\n2. Optional blocks number check mechanism (disabled by default)\n\n### Why are the changes needed?\n\nTo reduce the task write stats size\n\n### Does this PR introduce _any_ user-facing change?\n\n`spark.rss.client.integrityValidation.blockNumberCheckEnabled\u003dfalse`\n\n### How was this patch tested?\n\nUnit tests.\n"
    },
    {
      "commit": "1f371e854314a92aaa115dedcfbb0e2194617f7b",
      "tree": "bfd30d428bc5a0fa911b4a62c877be4301431c47",
      "parents": [
        "17d2b257e3094d3d4d285d1ff0e584e51ecefae2"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Nov 06 11:39:40 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Nov 06 11:39:40 2025 +0800"
      },
      "message": "feat(spark): Make integrity validation disabled by default (#2664)\n\n### What changes were proposed in this pull request?\n\nMake integrity validation disabled by default\n\n### Why are the changes needed?\n\nthis feature is still in experimental\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "17d2b257e3094d3d4d285d1ff0e584e51ecefae2",
      "tree": "5cf8771fe0ff549f04d9e7d3c5885c94ed9617cd",
      "parents": [
        "6e24451231d5f56c797429a6fd83dd2e0b39ce72"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Nov 04 17:11:52 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Nov 04 17:11:52 2025 +0800"
      },
      "message": "[#2649] feat(spark): Introduce timeout mechanism when getting the decompressing data (#2651)\n\n### What changes were proposed in this pull request?\n\nThis PR is to introduce the timeout mechanism when getting the overlapping decompression data.\n\n### Why are the changes needed?\n\nIf not having this PR, the blocking wait have the potential risk to forever hang of the tasks when the rpc hang\n\n### Does this PR introduce _any_ user-facing change?\n\n`rss.client.read.overlappingDecompressionFetchSecondsThreshold\u003d-1`, this mechanism will be disabled by default.\n\n### How was this patch tested?\n\nInternal job tests\n"
    },
    {
      "commit": "6e24451231d5f56c797429a6fd83dd2e0b39ce72",
      "tree": "b7885a0da2b4738164406abc4c2d0a5d105be29a",
      "parents": [
        "d9815c06fea0f98a4afe2135869a7dc443c0fc93"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Nov 03 20:14:04 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Nov 03 20:14:04 2025 +0800"
      },
      "message": "refactor: Enhance spark client logs (#2662)\n\n### What changes were proposed in this pull request?\n\nenhance the spark client logs\n\n### Why are the changes needed?\n\nThis PR aims to simplify and enhance the Spark client logs.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "d9815c06fea0f98a4afe2135869a7dc443c0fc93",
      "tree": "e16cf2c2756be1ca9dbdbfe366835b94908ab8e7",
      "parents": [
        "bef547d44650738256adac05162b2a3efacb8c74"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Nov 03 10:06:25 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Nov 03 10:06:25 2025 +0800"
      },
      "message": "[#2652] feat(spark): Add detailed integrity validation failure analysis (#2657)\n\n### What changes were proposed in this pull request?\n\nThis PR is to add the detailed integrity validation failure analysis to hopefully get the concrate upstream task attempt id for the further dig.\n\n### Why are the changes needed?\n\nthe followup for the #2653 \n\n### Does this PR introduce _any_ user-facing change?\n\n`spark.rss.client.integrityValidation.failureAnalysisEnabled\u003dfalse`\n\n### How was this patch tested?\n\nUnit tests.\n"
    },
    {
      "commit": "bef547d44650738256adac05162b2a3efacb8c74",
      "tree": "c8f3016eec22313d8816d8663a13fb326857e377",
      "parents": [
        "8124152504ee823fbfdde7646cf812290f51c8de"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Nov 03 09:51:29 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Nov 03 09:51:29 2025 +0800"
      },
      "message": "[#2654] fix(spark): NPE on adding data into overlapping decompression worker (#2661)\n\n### What changes were proposed in this pull request?\n\nfix npe on adding data into overlapping decompression worker\n\n### Why are the changes needed?\n\nfix #2654 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\ninternal job tests\n"
    },
    {
      "commit": "8124152504ee823fbfdde7646cf812290f51c8de",
      "tree": "aab7735be20a6fe4274d1636d70964b2c28d1e8a",
      "parents": [
        "8bfe1d33c08bcfb8739339d379961584995bd15c"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Nov 03 09:51:11 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Nov 03 09:51:11 2025 +0800"
      },
      "message": "improvement(spark): Simplify client output logs for writer/reader (#2660)\n\n### What changes were proposed in this pull request?\n\nSimplify client output logs for writer/reader\n\n### Why are the changes needed?\n\nToo much unncessary logs are so mess\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "8bfe1d33c08bcfb8739339d379961584995bd15c",
      "tree": "fdb7f54965f139ef59386799ae62dc060ece22ad",
      "parents": [
        "6aef84634d7abfed1f65a4e8c37cfa95b3fd7591"
      ],
      "author": {
        "name": "Ruilei Ma",
        "email": "merrily01@gmail.com",
        "time": "Wed Oct 29 19:24:57 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Oct 29 19:24:57 2025 +0800"
      },
      "message": "chore: fix typo in `applicationpage.js` (#2656)\n\n"
    },
    {
      "commit": "6aef84634d7abfed1f65a4e8c37cfa95b3fd7591",
      "tree": "81c1d1ce83a838c20e84704406bb948a110e4ec1",
      "parents": [
        "5671a05b113a09065ebd5fb51bc4d492353b9283"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Oct 29 11:34:19 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Oct 29 11:34:19 2025 +0800"
      },
      "message": "[#2652] feat(spark): Introduce partition records number check to ensure data consistency (#2653)\n\n### What changes were proposed in this pull request?\n\nThis PR ensures end-to-end data consistency by verifying the record counts of each partition.\nIn this initial step, ShuffleWriteTaskStats is introduced to store record counts for validation.\nIn the next phase, this mechanism will be extended to support row-level checksums and block count verification.\n\nI have only validated this patch on Spark 3.5.0, and this feature is enabled only for Spark versions at least `3.5.0`\n\n### Why are the changes needed?\n\nfor the #2652\n\n### Does this PR introduce _any_ user-facing change?\n\n`spark.rss.client.integrityValidation.enabled\u003dfalse` . \n\n### How was this patch tested?\n\nUnit tests.\n"
    },
    {
      "commit": "5671a05b113a09065ebd5fb51bc4d492353b9283",
      "tree": "1c94b35769f123e8228ac59cdfa7dca977b7411f",
      "parents": [
        "5edf952826aebbbbfaade287e1c39171394d6c43"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Oct 22 15:36:04 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Oct 22 15:36:04 2025 +0800"
      },
      "message": "fix(spark): decompression time is always 0 when overlapping decompression is enabled (#2647)\n\n### What changes were proposed in this pull request?\n\nThis PR is to fix the decompression time is always 0 when overlapping decompression is enabled\n\n### Why are the changes needed?\n\nWhen the overlapping decompression is enabled, the decompression time is always zero.\n\n\u003cimg width\u003d\"1915\" height\u003d\"576\" alt\u003d\"image\" src\u003d\"https://github.com/user-attachments/assets/098a95ff-6a75-4a50-80bb-b7cbef1f5d25\" /\u003e\n\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal job tests.\n"
    },
    {
      "commit": "5edf952826aebbbbfaade287e1c39171394d6c43",
      "tree": "9176be3411ea7316a4522c38ec50b3b8b998a962",
      "parents": [
        "42c5d9f79b3fc90b9bdd0a9058c0954dbc9d5510"
      ],
      "author": {
        "name": "Neo Chien",
        "email": "cchung100m@cs.ccu.edu.tw",
        "time": "Mon Oct 20 17:49:52 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Oct 20 17:49:52 2025 +0800"
      },
      "message": "[#2517] fix(client): IllegalReferenceCountException about ShuffleBlockInfo (#2638)\n\n### What changes were proposed in this pull request?\nFix `IllegalReferenceCountException` about ShuffleBlockInfo\n\n### Why are the changes needed?\nfor https://github.com/apache/uniffle/issues/2517\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\ncurrent UT"
    },
    {
      "commit": "42c5d9f79b3fc90b9bdd0a9058c0954dbc9d5510",
      "tree": "8963d2338f6c46f61d055506a926f8a88320f948",
      "parents": [
        "11881ab92706bb8cf3a2bad556ed5e545ac890cb"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Oct 20 17:49:28 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Oct 20 17:49:28 2025 +0800"
      },
      "message": "[#2644] feat(spark): Involve shuffle failure into the event logs (#2645)\n\n### What changes were proposed in this pull request?\n\nInvolve shuffle failure into the event logs\n\n### Why are the changes needed?\n\nfor #2644\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal job tests\n"
    },
    {
      "commit": "11881ab92706bb8cf3a2bad556ed5e545ac890cb",
      "tree": "b49d7a88b3d5a56a2b5438dcf85c548bcdad327d",
      "parents": [
        "895291322752741603966d55295f01b26b8379ea"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Oct 15 09:37:57 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Oct 15 09:37:57 2025 +0800"
      },
      "message": "[#2640] feat(spark): Involve background prefetch time in spark UI (#2641)\n\n### What changes were proposed in this pull request?\n\nThis PR is to Involve background prefetch time in spark UI\n\n### Why are the changes needed?\n\nfor #2640 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal job test"
    },
    {
      "commit": "895291322752741603966d55295f01b26b8379ea",
      "tree": "41459f08976effbd6d3338345d908a97e3b47861",
      "parents": [
        "1642c4dce78c8337a10b208a8389a090a906998b"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Oct 15 09:37:21 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Oct 15 09:37:21 2025 +0800"
      },
      "message": "chore: Add the space for ComposedClientReadHandler log (#2643)\n\n"
    },
    {
      "commit": "1642c4dce78c8337a10b208a8389a090a906998b",
      "tree": "b934683fe83ec95fe4b4b05d90ffb267abfdcc00",
      "parents": [
        "4805d1335aae06fa699ee3c2f17d94f599f3cb2c"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Oct 14 13:48:07 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Oct 14 13:48:07 2025 +0800"
      },
      "message": "[#2494] feat(spark): Involve background overlapping decompress time in spark UI (#2639)\n\n### What changes were proposed in this pull request?\n\nTo show the background overlapping decompress time in Spark UI\n\n### Why are the changes needed?\n\nMore easiler to observe the performance improvement ratio\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal job tests.\n"
    },
    {
      "commit": "4805d1335aae06fa699ee3c2f17d94f599f3cb2c",
      "tree": "75745e55bf5a254ccba01913546ae7871d65224d",
      "parents": [
        "872926122a67d6dc8da8fb5a9286d6dd0b86bf95"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Sep 30 14:22:27 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Sep 30 14:22:27 2025 +0800"
      },
      "message": "[#2636] feat(spark): Cache shuffle handle info for reader to reduce RPC cost when partition reassign is enabled (#2637)\n\n### What changes were proposed in this pull request?\n\nThis PR is to introduce the cache mechanism to cache the read shuffle handle info to reduce the RPC cost and driver the GC pressure when the partition reassign is enabled\n\n### Why are the changes needed?\n\nfor #2636 .\n\nFrom the cluster spark jobs, I found some tasks failed on the failure of RPC of getting shuffle handle from the driver side when the partition reassign is enabled. This is the first step to optimize shuffle info getting for the reader side. \n\n### Does this PR introduce _any_ user-facing change?\n\nYes.\n\n`rss.client.read.shuffleHandleCacheEnabled\u003dfalse`\n\n### How was this patch tested?\n\nExisting tests\n"
    },
    {
      "commit": "872926122a67d6dc8da8fb5a9286d6dd0b86bf95",
      "tree": "812c7cb1e632dbd1906b3094563856132b3f9058",
      "parents": [
        "1d162dcf65b247d298fb35d1b64e1c392f40f285"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Sep 30 10:17:39 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Sep 30 10:17:39 2025 +0800"
      },
      "message": "improvement(spark): Always reset decompression buffer with explicit position and limit (#2634)\n\n### What changes were proposed in this pull request?\n\nRefactor the uncompression buffer to reset by the explicit position\u003d0 and limit\u003dlen\n\n### Why are the changes needed?\n\nThis may be related with the bug #2630 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests\n"
    },
    {
      "commit": "1d162dcf65b247d298fb35d1b64e1c392f40f285",
      "tree": "eabe26d363ac4012fad33c5cf4e6c6c0dca0ea82",
      "parents": [
        "770eab1245d3df79d41b11f985aa387a17a6104b"
      ],
      "author": {
        "name": "yl09099",
        "email": "33595968+yl09099@users.noreply.github.com",
        "time": "Fri Sep 26 17:34:32 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Sep 26 17:34:32 2025 +0800"
      },
      "message": "[#2631] fix(server): Potential data loss due to the shuffle result report retry (#2632)\n\n### What changes were proposed in this pull request?\n\nSolve the problem of data loss caused by Spark task retries and Block metadata Report retries.\n\n### Why are the changes needed?\n\nFix: #2631 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExist UT.\n"
    },
    {
      "commit": "770eab1245d3df79d41b11f985aa387a17a6104b",
      "tree": "a5e05e7ede76a9ed9e8fa3e785479d584ee2d0bc",
      "parents": [
        "1bd7468863d1c75424c9b31f77ba9b8166933afd"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Sep 26 17:34:06 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Sep 26 17:34:06 2025 +0800"
      },
      "message": "[#2494] feat(spark): Add more statistics about overlapping decompression (#2633)\n\n### What changes were proposed in this pull request?\n\n1. Add more statistics about overlapping decompression to measure speedup ratio\n\n### Why are the changes needed?\n\n#2494 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeen\u0027t\n"
    },
    {
      "commit": "1bd7468863d1c75424c9b31f77ba9b8166933afd",
      "tree": "3b34d0d18352b04fdb79adff4589d07aadec68f7",
      "parents": [
        "96e96f8e5c2ae2c38d73313beee6b4c9cb26cbcc"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 25 10:30:40 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 25 10:30:40 2025 +0800"
      },
      "message": "[#2592] fix(spark): Skip failure when reporting shuffle write metrics to driver (#2629)\n\n### What changes were proposed in this pull request?\n\nSkip failure when reporting shuffle write metrics to driver\n\n### Why are the changes needed?\n\nfollowup the PR for the #2592 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n\nCo-authored-by: Junfan Zhang \u003czhangjunfan@qiyi.com\u003e"
    },
    {
      "commit": "96e96f8e5c2ae2c38d73313beee6b4c9cb26cbcc",
      "tree": "4983c8d3c11bee824920a20a441087ff089f08e6",
      "parents": [
        "3ccd91f1bcc85cc791ab1017ba199d38679118cd"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 25 10:30:13 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 25 10:30:13 2025 +0800"
      },
      "message": "[#2626] feat(spark): Respect rss.client.rpc.maxAttempts in ShuffleManagerClient (#2627)\n\n### What changes were proposed in this pull request?\n\nRespect `rss.client.rpc.maxAttempts` in ShuffleManagerClient\n\n### Why are the changes needed?\n\n#2626 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeen\u0027t\n\nCo-authored-by: Junfan Zhang \u003czhangjunfan@qiyi.com\u003e"
    },
    {
      "commit": "3ccd91f1bcc85cc791ab1017ba199d38679118cd",
      "tree": "a570687a3d7a0e95655907fc06bcec8d522b2a45",
      "parents": [
        "6ad3aa0f34c7d58ad3bf04697d9c738b0ec4df71"
      ],
      "author": {
        "name": "Neo Chien",
        "email": "cchung100m@cs.ccu.edu.tw",
        "time": "Mon Sep 22 17:02:42 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Sep 22 17:02:42 2025 +0800"
      },
      "message": "[#2614] improvement(client): Add test case for Incorrect header length for getLocalShuffleDataV3 (#2617)\n\n### What changes were proposed in this pull request?\nAdd test case for Incorrect header length for getLocalShuffleDataV3\n\n### Why are the changes needed?\nfor https://github.com/apache/uniffle/issues/2614\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nUT"
    },
    {
      "commit": "6ad3aa0f34c7d58ad3bf04697d9c738b0ec4df71",
      "tree": "623dce4414a9c6ace5167550c55c2f459e7d000f",
      "parents": [
        "abca5814f3c9ed6a3e5604ec102cfb37cf784133"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Sep 22 16:56:36 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Sep 22 16:56:36 2025 +0800"
      },
      "message": "[#2618] fix(spark): Invalid reassign status show in spark UI tab (#2620)\n\n### What changes were proposed in this pull request?\n\nFix the invalid reassign status show in spark UI tab\n\n### Why are the changes needed?\n\nIn the original design, the reassign info event is sent in the final stop method. However, because the post operation runs asynchronously, the listener may not receive the event before the JVM shuts down, so the event is effectively dropped.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal spark job test\n"
    },
    {
      "commit": "abca5814f3c9ed6a3e5604ec102cfb37cf784133",
      "tree": "2da303719b6b0b04531529b9035c72c2ffb3d5c2",
      "parents": [
        "9338529baec1086a46220c9b6090623d009c441d"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Sep 19 15:52:05 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Sep 19 15:52:05 2025 +0800"
      },
      "message": "[#2622] fix(spark): Make shuffleServerInfo comparable on updatePartitionSplitAssignment (#2623)\n\n### What changes were proposed in this pull request?\n\nMake shuffleServerInfo comparable for `updatePartitionSplitAssignment`\n\n### Why are the changes needed?\n\nfix 2622\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests\n"
    },
    {
      "commit": "9338529baec1086a46220c9b6090623d009c441d",
      "tree": "2ac584ae410a220892684db9580179097be73299",
      "parents": [
        "ad66fe9d55c2afd69df08d3f93f44d2ef6880fa1"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Sep 19 14:17:35 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Sep 19 14:17:35 2025 +0800"
      },
      "message": "[#2619] fix(spark): NPE in ShuffleReadTimes.merge (#2621)\n\n### What changes were proposed in this pull request?\n\nFix the NPE in `ShuffleReadTimes.merge`\n\n### Why are the changes needed?\n\nfix #2619\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeen\u0027t\n"
    },
    {
      "commit": "ad66fe9d55c2afd69df08d3f93f44d2ef6880fa1",
      "tree": "71f47b4f887d4ddc46c45b4719157b1b6c9c8ce8",
      "parents": [
        "67bd7af1d89e06f277d6e14a64f700f407ee5ce0"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 18 15:00:42 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 18 15:00:42 2025 +0800"
      },
      "message": "Revert \"Remove protected branch (#2615)\" (#2624)\n\nThis reverts commit 10aa39dccb6d3b3d9ee681ee335d5af4b00934aa."
    },
    {
      "commit": "67bd7af1d89e06f277d6e14a64f700f407ee5ce0",
      "tree": "2e054c7b29429a4ae02e153cd942d59447801186",
      "parents": [
        "e0a49b934f55e39b1d9289e021a5886a75f8438d"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 18 15:00:26 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 18 15:00:26 2025 +0800"
      },
      "message": "fix: Remove incubator to correct uniffle svn url (#2625)\n\n"
    },
    {
      "commit": "e0a49b934f55e39b1d9289e021a5886a75f8438d",
      "tree": "602949031349423b5f231c9f588402776a592478",
      "parents": [
        "10aa39dccb6d3b3d9ee681ee335d5af4b00934aa"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Sep 17 19:24:17 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Sep 17 19:24:17 2025 +0800"
      },
      "message": "Just a minor\n\n"
    },
    {
      "commit": "10aa39dccb6d3b3d9ee681ee335d5af4b00934aa",
      "tree": "c51b77daf5086e77933bb189ebfa23c083b3dbeb",
      "parents": [
        "1a46e2d0ce3775826654271e9a86e1287ea0a338"
      ],
      "author": {
        "name": "roryqi",
        "email": "roryqi@apache.org",
        "time": "Wed Sep 17 19:16:35 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Sep 17 19:16:35 2025 +0800"
      },
      "message": "Remove protected branch (#2615)\n\n"
    },
    {
      "commit": "1a46e2d0ce3775826654271e9a86e1287ea0a338",
      "tree": "7b7c137b6f596fe76d8de98bdc1d1a4f9b04031f",
      "parents": [
        "04964f30e367e5d698f2965c287474ded0f0b71b"
      ],
      "author": {
        "name": "Neo Chien",
        "email": "cchung100m@cs.ccu.edu.tw",
        "time": "Tue Sep 16 14:07:01 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Sep 16 14:07:01 2025 +0800"
      },
      "message": "[#2599] fix(spark): Fix bug the incorrect shuffle read metric for spark (#2600)\n\n### What changes were proposed in this pull request?\n\nFix bug: Incorrect shuffle read metric for Spark\n\n### Why are the changes needed?\nfor https://github.com/apache/uniffle/issues/2599\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nUT"
    },
    {
      "commit": "04964f30e367e5d698f2965c287474ded0f0b71b",
      "tree": "79350b76106ae19bc3844cef5a6051e1791f4545",
      "parents": [
        "7015613adb6759861ff85478dd0590485f131b84"
      ],
      "author": {
        "name": "l.zonghai",
        "email": "842315999@qq.com",
        "time": "Tue Sep 16 10:08:39 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Sep 16 10:08:39 2025 +0800"
      },
      "message": "[#2606] feat(mr): Add safety switch for map-stage combiner (#2607)\n\n### What changes were proposed in this pull request?\n\nIntroduce a configuration `mapreduce.rss.client.combiner.enable` to control whether the map-stage combiner runs in Uniffle MapReduce client.  \nDefault value is `false` to prevent job instability caused by large send-buffer GC storm.\n\n### Why are the changes needed?\n\nUsing map-stage combiner on large send buffer (`mapreduce.task.io.sort.mb * mapreduce.rss.client.sort.memory.use.threshold`) can trigger severe GC overhead,  \nwhich may stall MapTask and sender threads, leading to job hang. Most users do not require this by default.\n\n\n### Does this PR introduce _any_ user-facing change?\n\nYes, this adds a new optional configuration for expert users. Default behavior remains stable.\n\n### How was this patch tested?\n\nManually tested with MapReduce jobs with combiners. Verified that jobs run successfully with combiner disabled.\n1. **Combiner disabled**: MapTasks completed normally with fast GC cycles. Sample logs:\n```\n[2025-09-11 19:48:47] S0: 0MB, S1: 0MB, Eden: 299.02MB, Old: 11.53MB, ... Total: 0.101s\n...\n[2025-09-11 19:49:30] S0: 82.88MB, S1: 0MB, Eden: 160.04MB, Old: 485.99MB, ...Total: 6.683s\n...\n[2025-09-11 19:49:57] S0: 0MB, S1: 0MB, Eden: 207.66MB, Old: 532.27MB, ...Total: 12.474s\n```\n\u003e The MapTask completed successfully within 1 minute.\n\n2. **Combiner enabled**: MapTask GC cycles grew very long; job stalled and was eventually killed. Sample logs:\n\n```\n[2025-09-11 19:52:00] S0: 0MB, S1: 0MB, Eden: 80.49MB, Old: 12.86MB, ... Total: 0.054s\n[2025-09-11 19:52:08] S0: 0MB, S1: 0MB, Eden: 515.53MB, Old: 27.24MB, ... Total: 0.149s\n...\n[2025-09-11 20:01:54] S0: 0MB, S1: 0MB, Eden: 60.36MB, Old: 687.51MB, ... 
Total: 242.505s\n```\n\n\u003e The MapTask did not complete after 9 minutes.\n\nThese logs demonstrate that disabling the map-stage combiner avoids severe GC overhead and job stalls, validating the safety switch.\n\n---------\n\nCo-authored-by: Lobo2008 \u003c842315999@qq.coom\u003e"
    },
    {
      "commit": "7015613adb6759861ff85478dd0590485f131b84",
      "tree": "a0ea47f476cfec4b3d5df7dd8020903d341cae31",
      "parents": [
        "9fdde02b51a04176e29508a39583e0021547f3a0"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Sep 16 10:06:54 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Sep 16 10:06:54 2025 +0800"
      },
      "message": "[#2609] feat(spark): Expose `checkDataIfAnyFailure` method so that Gluten can invoke it to trigger reassign ASSP (#2610)\n\n### What changes were proposed in this pull request?\n\nThis PR is exposing the `checkDataIfAnyFailure` method with protected modifier, so that the Gluten can invoke this to trigger reassign mechanism as soon as possible.\n\n### Why are the changes needed?\n\nfor #2609 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeen\u0027t"
    },
    {
      "commit": "9fdde02b51a04176e29508a39583e0021547f3a0",
      "tree": "4fcf4f953dbf5cd08369423ec404fba8e8c2b8f7",
      "parents": [
        "14a50985a01244192b85ed5ff9c61625c9912319"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 11 17:43:37 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 11 17:43:37 2025 +0800"
      },
      "message": "[#2591] fix(client): Missing task_id propagation in getLocalShuffleDataV3 (#2605)\n\n### What changes were proposed in this pull request?\n\nFix missing task_id propagation in getLocalShuffleDataV3\n\n### Why are the changes needed?\n\nTask_id is invalid \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nInternal tests."
    },
    {
      "commit": "14a50985a01244192b85ed5ff9c61625c9912319",
      "tree": "1b616a96f827adc51d42e20776556092225d0441",
      "parents": [
        "96bf76cbc9d9bbe51c3e8a502ba37b32ae2ce8d6"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 11 17:43:04 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 11 17:43:04 2025 +0800"
      },
      "message": "[#2591] fix(client): Incorrect header length for getLocalShuffleDataV3 (#2604)\n\n### What changes were proposed in this pull request?\n\nFix incorrect header length for getLocalShuffleDataV3\n\n### Why are the changes needed?\n\nThis will make getLocalShuffleDataV3 invalid\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nTested in our internal jobs.\n"
    },
    {
      "commit": "96bf76cbc9d9bbe51c3e8a502ba37b32ae2ce8d6",
      "tree": "f44a9437f0b6c5a68160b5a3a7321b0bcf42ed62",
      "parents": [
        "1e48bc673d1c0ee41f889a0de6192b0fab131467"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Sep 09 19:36:18 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Sep 09 19:36:18 2025 +0800"
      },
      "message": "[#2591] feat(client): Introduce the mechanism to report localfile read plan (#2603)\n\n### What changes were proposed in this pull request?\n\nThis PR introduces a mechanism to report localfile read plan, and the changes only are scoped in the client side. More changes should be added in the shuffle server in the future.\n\n### Why are the changes needed?\n\nFor normal partitions, the reading mode is sequential, which makes read-ahead optimization feasible. This has already been verified in the Riffle project (see [issue #483](https://github.com/zuston/riffle/issues/483)).\n\nFor huge partitions, however, the reading mode becomes skippable due to the AQE skew join optimization rule. In such cases, it is difficult to predict the next read position and length.\n\nBased on this analysis, we propose introducing a fixed read plan that is propagated from the client to the server, allowing the server to recognize the next read offset and thereby benefit from read-ahead optimization.\n\n### Does this PR introduce _any_ user-facing change?\n\nYes. And this feature will be disabled by default\n\n### How was this patch tested?\n\nExisting tests.\n"
    },
    {
      "commit": "1e48bc673d1c0ee41f889a0de6192b0fab131467",
      "tree": "ec008a0f7c0ef366b6f3b3fd3b89187dbdaaabcc",
      "parents": [
        "2a32171b9724272276797ede17eb057027df5767"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Sep 05 17:02:41 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Sep 05 17:02:41 2025 +0800"
      },
      "message": "[#2601] feat(spark): Introduce overlapping decompression for shuffle read (#2602)\n\n### What changes were proposed in this pull request?\n\nThis PR is to introduce the overlapping decompression for the shuffle reading for the better shuffle speed.\n\n### Why are the changes needed?\n\nfor #2601\n\nWhen applying the https://github.com/apache/uniffle/pull/2598 into the benchmark of terasort 100g, I found some bottleneck for the decompression time in the read phase. \n\nBased on a 100 GB Terasort benchmark, the results are impressive, reducing shuffle read time by 50% after applying this PR.\n\n\u003cimg width\u003d\"1911\" height\u003d\"716\" alt\u003d\"image\" src\u003d\"https://github.com/user-attachments/assets/707598fb-0850-4d38-8453-f34f852fc6af\" /\u003e\n\n\n### Does this PR introduce _any_ user-facing change?\n\nYes\n\n### How was this patch tested?\n\n1. Unit tests.\n"
    },
    {
      "commit": "2a32171b9724272276797ede17eb057027df5767",
      "tree": "6853397a44235ae063691855409b43d3b52cdb6f",
      "parents": [
        "d5e689c32aaaa69465fe92d0c79a773321ffd1b9"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Thu Sep 04 17:26:41 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Sep 04 17:26:41 2025 +0800"
      },
      "message": "[#2569] feat(spark): Add statistic of shuffle read times (#2598)\n\n### What changes were proposed in this pull request?\n\nAdd statistic of shuffle read times to find the bottleneck for shuffle reading\n\n### Why are the changes needed?\n\nfor #2569\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nTests in cluster\n"
    },
    {
      "commit": "d5e689c32aaaa69465fe92d0c79a773321ffd1b9",
      "tree": "c3041ae6ee08b16eca808d98bed3cc1660ae5cfe",
      "parents": [
        "32f4ac6c530058f7342dc17f6e707dad428f74b1"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Aug 27 10:15:00 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Aug 27 10:15:00 2025 +0800"
      },
      "message": "[#2592] fix(spark): Ignore failure when reporting shuffle read metrics to driver (#2593)\n\n### What changes were proposed in this pull request?\n\nIgnore failure when reporting shuffle read metrics to driver\n\n### Why are the changes needed?\n\nfix #2592 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "32f4ac6c530058f7342dc17f6e707dad428f74b1",
      "tree": "4043675834b6871d278d501b79fed7397942f575",
      "parents": [
        "f3bc84fbe82f85bcc6d0be2b6a8cf17bdb1291ff"
      ],
      "author": {
        "name": "Neo Chien",
        "email": "cchung100m@cs.ccu.edu.tw",
        "time": "Tue Aug 26 14:15:46 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Aug 26 14:15:46 2025 +0800"
      },
      "message": "[#2575] fix(spark): Fix java.lang.IndexOutOfBoundsException: len is negative (#2589)\n\n### What changes were proposed in this pull request?\nFix java.lang.IndexOutOfBoundsException: len is negative\n\n### Why are the changes needed?\nfor https://github.com/apache/uniffle/issues/2575\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nUT"
    },
    {
      "commit": "f3bc84fbe82f85bcc6d0be2b6a8cf17bdb1291ff",
      "tree": "b31a4efcaaf8337959f025b57e0c5575f806654f",
      "parents": [
        "fe0ff7e60efab90a556554f4afb46f0caea6c2ce"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Aug 25 10:32:55 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Aug 25 10:32:55 2025 +0800"
      },
      "message": "[#2494] feat(spark): Enable overlapping compression by default (#2588)\n\n### What changes were proposed in this pull request?\n\n1. Enable overlapping compression by default\n2. Add doc for this feature\n\n### Why are the changes needed?\n\nfor #2494 \n\n### Does this PR introduce _any_ user-facing change?\n\nYes.\n\n### How was this patch tested?\n\nNeedn\u0027t"
    },
    {
      "commit": "fe0ff7e60efab90a556554f4afb46f0caea6c2ce",
      "tree": "8d399c7b6a4b48a91bd5cbf572bc93d7104ec1ac",
      "parents": [
        "e787d87c96f3d040e23b366afe378b657a02453d"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Fri Aug 22 10:08:12 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Aug 22 10:08:12 2025 +0800"
      },
      "message": "[#2586] fix(spark): Support writer switching servers on partition split with LOAD_BALANCE mode without reassign (#2587)\n\n### What changes were proposed in this pull request?\n\nThis PR is to support writer switching servers on partition split with LOAD_BALANCE mode without reassign. \n\n### Why are the changes needed?\n\nfix #2586 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nUnit tests\n"
    },
    {
      "commit": "e787d87c96f3d040e23b366afe378b657a02453d",
      "tree": "761bcce7ac29eb44a794939a969948e72d9258c5",
      "parents": [
        "0facb7be1913c088eab5c5d9970ab0766a884a99"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Aug 19 14:27:58 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Aug 19 14:27:58 2025 +0800"
      },
      "message": "[#2583] fix(spark): Enable taskIds filter only on AQE and multi replicas for reader (#2584)\n\n### What changes were proposed in this pull request?\n\nOnly enable taskIds filter mechanism on AQE skew hit and multi replicas. Previously, when multi shuffle servers are assigned for reader to read, it will enable this mechanism, that will hurt the shuffle-servers performance due to the bitmap check, actually there is no need to do this on partition reassign (partition split) is enabled.\n\n### Why are the changes needed?\n\nTo fix #2583 \n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests \n"
    },
    {
      "commit": "0facb7be1913c088eab5c5d9970ab0766a884a99",
      "tree": "2a35a0ed3f234152d27f799f2662409293ff6e57",
      "parents": [
        "9b611cf147f07ede5db66f682db60a1ae7e48992"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Aug 19 14:24:06 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Aug 19 14:24:06 2025 +0800"
      },
      "message": "[#2568] feat(spark): Use space-efficient protobuf for `MutableShuffleHandleInfo` to reduce RPC memory overhead (#2578)\n\n### What changes were proposed in this pull request?\n\nThis PR uses a space-efficient protobuf data structure to store the partitions-to-servers mapping, thereby reducing the RPC cost.\n\n### Why are the changes needed?\n\nThis is the part of PR for the #2568. \n\nIn large-scale Spark jobs, the number of partitions can reach up to 20K, whereas the number of assigned shuffle servers remains smaller than the total number of nodes in the Uniffle cluster.\nPrior to this PR, both the driver and the client (when reassignment was enabled) required substantial memory for RPC transfers, which could significantly increase the frequency of driver garbage collection.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo\n\n### How was this patch tested?\n\nUnit tests.\n"
    },
    {
      "commit": "9b611cf147f07ede5db66f682db60a1ae7e48992",
      "tree": "dc08c1ae69bb6e9c799a5369df43521c3a0cfc44",
      "parents": [
        "7f1586e9ca1935cac54cc3f9efd2a2e98582a2ec"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Tue Aug 19 10:47:22 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Aug 19 10:47:22 2025 +0800"
      },
      "message": "[#2527] docs: Add some docs for LAB (#2585)\n\n### What changes were proposed in this pull request?\nAdd some docs for LAB\n\n### Why are the changes needed?\nFix: #2527\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nCI"
    },
    {
      "commit": "7f1586e9ca1935cac54cc3f9efd2a2e98582a2ec",
      "tree": "ec9ae15342dae772a2f1a72062fcf3c75c61de98",
      "parents": [
        "e6f0941ad9768beb83ec330c964d20a3ce2e55e3"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Mon Aug 18 14:41:19 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Aug 18 14:41:19 2025 +0800"
      },
      "message": "[#2579] fix(spark): Correct partition length for overlapping compression (#2580)\n\n### What changes were proposed in this pull request?\n\nThis PR fixes the partition length calculation for overlapping compression. Previously, when overlapping compression was enabled, the partition length was recorded as the uncompressed length, which broke the default Spark semantics.\nThis change aligns the behavior with ESS semantics and updates the partition length only after the event has been successfully processed.\n\n### Why are the changes needed?\n\nTo fix the incorrect semantic of partition length for overlapping compression, this will effect the AQE rules.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests.\n"
    },
    {
      "commit": "e6f0941ad9768beb83ec330c964d20a3ce2e55e3",
      "tree": "71e0ca38bbc29d7c0341d4ec25633c415cb30852",
      "parents": [
        "a1974f6e3dc9272efab7b32e5865912033ff6f0e"
      ],
      "author": {
        "name": "Zhen Wang",
        "email": "643348094@qq.com",
        "time": "Thu Aug 14 11:15:12 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Aug 14 11:15:12 2025 +0800"
      },
      "message": "[#2581] fix(spark): Use `SparkContext.getActive` instead of `getOrCreate` to align with method semantics (#2582)\n\n### What changes were proposed in this pull request?\n\nUse `SparkContext.getActive` instead of `getOrCreate` to better align with the intended semantics for external invocation.\n\n### Why are the changes needed?\n\nFix #2581\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nAdded unit test"
    },
    {
      "commit": "a1974f6e3dc9272efab7b32e5865912033ff6f0e",
      "tree": "99fb086e9c694e12141c3d41aa4781c4ee90bfa5",
      "parents": [
        "4eb83eea26a02a83372f88378399ceb0075858c8"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Aug 12 10:19:58 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Aug 12 10:19:58 2025 +0800"
      },
      "message": "[#2576] fix: Warm up java version var to eliminate lock on creating concurrent hashmap (#2577)\n\n### What changes were proposed in this pull request?\n\nCache the java9+ var to eliminate lock when creating concurrent hashmap\n\n### Why are the changes needed?\n\nfix #2576 .\n\n`Enums.getIfPresent` will always using the global lock, this is unnecessary and hurt the performance\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting tests.\n"
    },
    {
      "commit": "4eb83eea26a02a83372f88378399ceb0075858c8",
      "tree": "7a85c3eec3b2e7d505cc5079d97166132f403545",
      "parents": [
        "7414ed5f541a171c7c7714648d816d838c7c365e"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Aug 05 17:40:07 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Aug 05 17:40:07 2025 +0800"
      },
      "message": "[#2571] fix(client): Race condition when adding shuffle servers (#2574)\n\n### What changes were proposed in this pull request?\n\nUsing the thread safe way to add shuffle-servers to emlinate race condition\n\n### Why are the changes needed?\n\nfix #2571\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeedn\u0027t\n"
    },
    {
      "commit": "7414ed5f541a171c7c7714648d816d838c7c365e",
      "tree": "59db0ffcffd00ad90eacda2ebe2c142a95a3bb7d",
      "parents": [
        "41d0fc53fd8ea7db164b3b510d8a351816cd1dd9"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Mon Aug 04 13:58:10 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Aug 04 13:58:10 2025 +0800"
      },
      "message": "[MINOR] chore(CI): bump dorny/paths-filter from v3.0.2 to de90cc6fb38fc0963ad72b210f1f284cd68cea36 (#2570)\n\n"
    },
    {
      "commit": "41d0fc53fd8ea7db164b3b510d8a351816cd1dd9",
      "tree": "37333780e5571aaa283960c6f43acb7832b737e3",
      "parents": [
        "99d5c3a1165ebe9f6fb024eb625a0afbf8756460"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Mon Aug 04 11:20:56 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Aug 04 11:20:56 2025 +0800"
      },
      "message": "[#2558] improvement(server): Limit the max flush event count for a single buffer (#2562)\n\n### What changes were proposed in this pull request?\nLimit the max flush event count for a single buffer.\n\n### Why are the changes needed?\nFix: #2558\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nVerify in production environment."
    },
    {
      "commit": "99d5c3a1165ebe9f6fb024eb625a0afbf8756460",
      "tree": "6659158a4304dc750b760cc43c6699c211b6f62b",
      "parents": [
        "066e71e6c50859e38457381a81393361fd400d3e"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Mon Aug 04 11:19:53 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Aug 04 11:19:53 2025 +0800"
      },
      "message": "[#2525][FOLLOWUP] fix(server): remove metric `buffer_block_size` (#2567)\n\n### What changes were proposed in this pull request?\nRemove metric `buffer_block_size`\n\n### Why are the changes needed?\nThe observations of Summary are expensive due to the streaming quantile calculation and synchronized is used in `Summary.observe`.\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nCI"
    },
    {
      "commit": "066e71e6c50859e38457381a81393361fd400d3e",
      "tree": "77ca5764bcfc4a1e8be17395d5e8f59a3aa2edfe",
      "parents": [
        "1713c1f707ca3b2d95bdbd1fb7940f6377e31a93"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Wed Jul 30 17:22:00 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Jul 30 17:22:00 2025 +0800"
      },
      "message": "[#2563] improvement(spark): Add more logs of shuffle write on reassignment failure (#2564)\n\n### What changes were proposed in this pull request?\n\nThis PR is to add more logs of shuffle write on reassignment failure for the next inspection when it occurs.\n\n### Why are the changes needed?\n\nfor #2563 .\n\n```\n25/07/29 08:13:25 ERROR TaskResources: Task 14257 failed by error: \norg.apache.uniffle.common.exception.RssException: No available replacement server for: 10.xxxxx-23100-23104\n\tat org.apache.spark.shuffle.writer.RssShuffleWriter.reassignAndResendBlocks(RssShuffleWriter.java:832)\n\tat org.apache.spark.shuffle.writer.RssShuffleWriter.collectFailedBlocksToResend(RssShuffleWriter.java:672)\n\tat org.apache.spark.shuffle.writer.RssShuffleWriter.checkDataIfAnyFailure(RssShuffleWriter.java:570)\n\tat org.apache.spark.shuffle.writer.RssShuffleWriter.checkBlockSendResult(RssShuffleWriter.java:532)\n\tat org.apache.spark.shuffle.writer.RssShuffleWriter.internalCheckBlockSendResult(RssShuffleWriter.java:518)\n```\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeen\u0027t\n"
    },
    {
      "commit": "1713c1f707ca3b2d95bdbd1fb7940f6377e31a93",
      "tree": "4c2c21250d211a6bcb6b6855b877e53f25e0d985",
      "parents": [
        "be3a1ffc027fb6e587367905d21e5224d5839b16"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Jul 29 10:28:38 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Jul 29 10:28:38 2025 +0800"
      },
      "message": "[#2549] fix(spark): Invalid remote storage configuration was propagated during application registration (#2550)\n\n### What changes were proposed in this pull request?\n\nfix invalid remote storage configuration was propagated during application registration\n\n### Why are the changes needed?\n\nfix #2549\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\nYes. existing tests and internal cluster tests.\n"
    },
    {
      "commit": "be3a1ffc027fb6e587367905d21e5224d5839b16",
      "tree": "b7ac1fd7d5276dc94f93a47305e84c0ebcd4b1ea",
      "parents": [
        "8fa51bb303e40635fcbd454687da714cb61f798a"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Jul 29 10:28:01 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Jul 29 10:28:01 2025 +0800"
      },
      "message": "[#2560] improvement(client): Fast fail on hadoop reader initialization failure (#2551)\n\n### What changes were proposed in this pull request?\n\nFast fail on hadoop reader initialization failure\n\n### Why are the changes needed?\n\nWhen the hadoop reader initialization failed, this exception will be ignored. When the final unexcepted blockIds throws, it’s hard to find out the root cause.\n\nAnd so, we should fast fail in this case.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nExisting unit tests."
    },
    {
      "commit": "8fa51bb303e40635fcbd454687da714cb61f798a",
      "tree": "a409a36ad8898edb0402fb421a9e0609ecacc327",
      "parents": [
        "a5086b305743d2a9dbd2a0e4e9698a7974abdbb9"
      ],
      "author": {
        "name": "Junfan Zhang",
        "email": "zuston@apache.org",
        "time": "Tue Jul 29 10:26:07 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Jul 29 10:26:07 2025 +0800"
      },
      "message": "[MINOR] improvement(client): Shorten log for multi replica client reader (#2561)\n\n### What changes were proposed in this pull request?\n\nWhen multi replica or partition split is enabled,  the log with blockId unexception message will always show when reading from one shuffle-server, that will make someone confuse. \n\n### Why are the changes needed?\n\nShorten log for multi replica client reader\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nNeen\u0027t\n"
    },
    {
      "commit": "a5086b305743d2a9dbd2a0e4e9698a7974abdbb9",
      "tree": "c82e21ae20f4606ceb53451a719c6ee2f64f18df",
      "parents": [
        "850db71f18ab2945fe2c2c392f1a9f9b56a65c6b"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Mon Jul 28 10:38:44 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Jul 28 10:38:44 2025 +0800"
      },
      "message": "[#2555] feat(server): support dynamically modifying the tags of shuffle server (#2557)\n\n### What changes were proposed in this pull request?\nSupport dynamically modifying the tags of shuffle server.\n\n### Why are the changes needed?\nFix: #2555\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nUT"
    },
    {
      "commit": "850db71f18ab2945fe2c2c392f1a9f9b56a65c6b",
      "tree": "ae94f499dc61408084a77a74b67ddf645597d250",
      "parents": [
        "4b256e9c5d211e5577c0902c3de551d2a9844665"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Mon Jul 28 10:37:34 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Jul 28 10:37:34 2025 +0800"
      },
      "message": "[MINOR] improvement(client): Simplify logging of heartbeat failures (#2559)\n\n### What changes were proposed in this pull request?\nSimplify logging of heartbeat failures\n\n### Why are the changes needed?\nThe stack trace of abnormal heartbeats is not only useless, but interfere with troubleshooting.\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nCI"
    },
    {
      "commit": "4b256e9c5d211e5577c0902c3de551d2a9844665",
      "tree": "84fca72e91c3df91bb56ef30c9c0c5e8a5d52a04",
      "parents": [
        "ccc534be53827889fab2ce2d98af7f974757f54b"
      ],
      "author": {
        "name": "xianjingfeng",
        "email": "xianjingfeng666@gmail.com",
        "time": "Fri Jul 25 17:35:08 2025 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Jul 25 17:35:08 2025 +0800"
      },
      "message": "[#2492][FOLLOWUP] improvement: change the default value of chunkPoolCapacityRatio (#2554)\n\n### What changes were proposed in this pull request?\nChange the default value of chunkPoolCapacityRatio\n\n### Why are the changes needed?\nThe proportion and the frequency of small blocks is not high. If this value is set too high, it may cause off-heap memory overflow.\nFix: #2492\n\n### Does this PR introduce any user-facing change?\nNo.\n\n### How was this patch tested?\nVerify in production environment.\n\n"
    }
  ],
  "next": "ccc534be53827889fab2ce2d98af7f974757f54b"
}
