These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.
Replaced the ForkJoinPool in CleanerChore with a ThreadPoolExecutor, which can limit the number of spawned threads and avoid frequent GC on the master. The replacement is an internal implementation detail of CleanerChore, so there is no config key change; upstream users can simply upgrade the hbase master without any other change.
Introduced a new config key for the snapshot taking/restoring operations at the master side: hbase.master.executor.snapshot.threads. Its default value is 3, which means we can have 3 snapshot operations running at the same time.
Added several methods to the TimeRange class to avoid using the deprecated TimeRange constructors: * TimeRange#from: represents the time interval [minStamp, Long.MAX_VALUE) * TimeRange#until: represents the time interval [0, maxStamp) * TimeRange#between: represents the time interval [minStamp, maxStamp)
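The inclusive/exclusive bound semantics above can be sketched in plain Java (a self-contained illustration of the half-open intervals, not the TimeRange implementation itself; the class and method names below are made up for the sketch):

```java
// Models the three half-open intervals the new TimeRange factory methods expose:
// lower bound inclusive, upper bound exclusive.
public class TimeRangeSemantics {
    static boolean within(long min, long max, long ts) {
        return ts >= min && ts < max;
    }
    // [minStamp, Long.MAX_VALUE)
    static boolean from(long minStamp, long ts) { return within(minStamp, Long.MAX_VALUE, ts); }
    // [0, maxStamp)
    static boolean until(long maxStamp, long ts) { return within(0L, maxStamp, ts); }
    // [minStamp, maxStamp)
    static boolean between(long minStamp, long maxStamp, long ts) { return within(minStamp, maxStamp, ts); }

    public static void main(String[] args) {
        System.out.println(from(100, 100));         // true: minStamp is inclusive
        System.out.println(until(100, 100));        // false: maxStamp is exclusive
        System.out.println(between(100, 200, 199)); // true: still inside the interval
    }
}
```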
Provided a public constructor in the MultiRowRangeFilter class to make filtering with multiple row prefixes faster: MultiRowRangeFilter expands the row prefixes into multiple row-key ranges, which is more efficient. {code} public MultiRowRangeFilter(byte[][] rowKeyPrefixes); {code}
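A minimal sketch of how one prefix expands into a half-open row-key range (illustrative only; the filter's internal helper may differ in details): the stop row is the prefix with its last non-0xFF byte incremented, trailing 0xFF bytes dropped.

```java
import java.util.Arrays;

// Sketch of prefix -> [prefix, stopRow) expansion for row-key ranges.
public class PrefixRange {
    static byte[] stopRowForPrefix(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        for (int i = stop.length - 1; i >= 0; i--) {
            if (stop[i] != (byte) 0xFF) {
                stop[i]++;                        // increment last non-0xFF byte
                return Arrays.copyOf(stop, i + 1); // drop trailing 0xFF bytes
            }
        }
        return new byte[0]; // all bytes were 0xFF: the range is open-ended
    }

    public static void main(String[] args) {
        // Rows starting with "ab" fall inside ["ab", "ac")
        System.out.println(Arrays.toString(stopRowForPrefix(new byte[] {'a', 'b'})));
    }
}
```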
Updated the base Docker image to Ubuntu 18.04 for the find-flaky-tests Jenkins job.
Adds a fixMeta method to hbck Service. Fixes holes in hbase:meta. Follow-up to fix overlaps. See HBASE-22567 also.
Follow-on is adding a client-side to hbase-operator-tools that can exploit this new addition (HBASE-22825)
Changes merge so you can merge more than two regions at a time. Currently only available inside HBase. HBASE-22827, a follow-on, is about exposing the facility in the Admin API (and then via the shell).
New shaded artifact for testing: hbase-shaded-testing-util.
We found a critical bug which can lead to WAL corruption when Durability.ASYNC_WAL is used. The reason is that we release a ByteBuffer before its content has actually been persisted into the WAL file.
The problem may lead to several errors, for example an ArrayIndexOutOfBoundsException when replaying the WAL. This happens because the ByteBuffer has been reused by others.
ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event RS_LOG_REPLAY
java.lang.ArrayIndexOutOfBoundsException: 18056
  at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
  at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
  at org.apache.hadoop.hbase.PrivateCellUtil.matchingFamily(PrivateCellUtil.java:735)
  at org.apache.hadoop.hbase.CellUtil.matchingFamily(CellUtil.java:816)
  at org.apache.hadoop.hbase.wal.WALEdit.isMetaEditFamily(WALEdit.java:143)
  at org.apache.hadoop.hbase.wal.WALEdit.isMetaEdit(WALEdit.java:148)
  at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:297)
  at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:195)
  at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:100)
It may even cause a segmentation fault and crash the JVM directly. You will see a hs_err_pidXXX.log file and usually the problem is SIGSEGV. This is usually because the ByteBuffer has already been returned to the OS and is being used for another purpose.
The problem has been reported several times in the past, and this time Wellington Ramos Chevreuil provided the full logs and analyzed them deeply so we could find the root cause. And Lijin Bin figured out that the problem may only happen when Durability.ASYNC_WAL is used. Thanks to them.
The problem only affects the 2.x releases; all users are highly recommended to upgrade to a release which has this fix, especially if you use Durability.ASYNC_WAL.
Added a new method runHbckChore in the Hbck interface and a new shell command hbck_chore_run to request that the HBCK chore run at the master side.
Adds a “CatalogJanitor hbase:meta Consistency Issues” section to the new ‘HBCK Report’ page added by HBASE-22709. This section is empty unless the most recent CatalogJanitor scan turned up problems. If so, will show table of issues found.
When CatalogJanitor runs, it now checks for holes, overlaps, empty info:regioninfo columns and bad servers. Dumps findings into log. Follow-up adds report to new ‘HBCK Report’ linked off the Master UI.
NOTE: All features but the badserver check made it into branch-2.1 and branch-2.0 backports.
This feature is enabled by default, and the hbck chore runs every 60 minutes by default. You can set “hbase.master.hbck.checker.interval” to a value less than or equal to 0 to disable the chore.
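For example, the chore could be disabled in hbase-site.xml like this (using the key name as introduced here):

```
<property>
  <name>hbase.master.hbck.checker.interval</name>
  <!-- a value <= 0 disables the hbck chore -->
  <value>0</value>
</property>
```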
Notice: the config “hbase.master.hbck.checker.interval” was renamed to “hbase.master.hbck.chore.interval” in HBASE-22737.
Upgrade jackson databind dependency to 2.9.9.1 due to CVEs
https://nvd.nist.gov/vuln/detail/CVE-2019-12814
https://nvd.nist.gov/vuln/detail/CVE-2019-12384
Added a new master web UI to show potentially problematic opened regions. There are three cases:
The config point “hbase.offheapcache.minblocksize” was wrong and is now deprecated. The new config point is “hbase.blockcache.minblocksize”.
OfflineMetaRepair is no longer supported in HBase-2+. Please refer to https://hbase.apache.org/book.html#HBCK2
This tool is deprecated in 2.x and will be removed in 3.0.
Mark the Hbck#scheduleServerCrashProcedure(List<HBaseProtos.ServerName> serverNames) as deprecated. Use Hbck#scheduleServerCrashProcedures(List<ServerName> serverNames) instead.
In HBASE-20734 we moved the recovered.edits onto the wal file system, but when constructing the directory we missed the BASE_NAMESPACE_DIR (‘data’). So when using the default config, you will find lots of new directories at the same level as the ‘data’ directory.
In this issue, we add the BASE_NAMESPACE_DIR back, and also try our best to clean up the wrong directories. But we can only clean up the region level directories, so if you want a clean fs layout on HDFS you still need to manually delete the empty directories at the same level as ‘data’.
The affected versions are 2.2.0, 2.1.[1-5], 1.4.[8-10], 1.3.[3-5].
hbase.regionserver.compaction.check.period is used for controlling how often the compaction checker runs. If unset, hbase.server.thread.wakefrequency is used as the default value.
hbase.regionserver.flush.check.period is used for controlling how often the flush checker runs. If unset, hbase.server.thread.wakefrequency is used as the default value.
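Both periods could be set explicitly in hbase-site.xml; the values below are examples, not defaults:

```
<property>
  <name>hbase.regionserver.compaction.check.period</name>
  <value>20000</value> <!-- milliseconds; example value -->
</property>
<property>
  <name>hbase.regionserver.flush.check.period</name>
  <value>20000</value> <!-- milliseconds; example value -->
</property>
```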
See the document http://hbase.apache.org/book.html#upgrade2.2 about how to upgrade from 2.0 or 2.1 to 2.2+.
HBase 2.2+ uses a new Procedure form for assigning/unassigning/moving Regions. It does not process HBase 2.1 and 2.0's Unassign/Assign Procedure types. Upgrade requires that we first drain the Master Procedure Store of old style Procedures before starting the new 2.2 Master. So you need to make sure that there is no region in transition before you kill the old version (2.0 or 2.1) Master. And once the new version (2.2+) Master is up, you can rolling upgrade RegionServers one by one.
And there is a safer way if you are running a 2.1.1+ or 2.0.3+ cluster. It needs four steps to upgrade the Master.
Then you can rolling upgrade RegionServers one by one. See HBASE-21075 for more details.
Added completebulkload short name for BulkLoadHFilesTool to bin/hbase.
Changed the default hadoop-3 version to 3.1.2. Dropped support for the releases which are affected by CVE-2018-8029; see this email https://lists.apache.org/thread.html/3d6831c3893cd27b6850aea2feff7d536888286d588e703c6ffd2e82@%3Cuser.hadoop.apache.org%3E
The CellUtil.setTimestamp method changes to be an API with audience LimitedPrivate(COPROC) in HBase 3.0. With that designation the API should remain stable within a given minor release line, but may change between minor releases.
Previously, this method was deprecated in HBase 2.0 for removal in HBase 3.0. Deprecation messages in HBase 2.y releases have been updated to indicate the expected API audience change.
The class LossyCounting was unintentionally marked Public but was never intended to be part of our public API. This oversight has been corrected and LossyCounting is now marked as Private and going forward may be subject to additional breaking changes or removal without notice. If you have taken a dependency on this class we recommend cloning it locally into your project before upgrading to this release.
Warnings for level headings are corrected in the book for the HBase Incompatibilities section.
Added hadoop 3.0.3, 3.1.1 and 3.1.2 to our hadoop check jobs.
The DumpReplicationQueues tool will now list replication queues sorted in chronological order.
Fixes a formatting issue in the administration section of the book, where listing indentation was a little bit off.
Now the default hadoop-two.version has been changed to 2.8.5, and all hadoop versions before 2.8.2 (exclusive) are no longer supported.
Removed extra + in HRegion, HStore and LoadIncrementalHFiles for branch-2 and HRegion and HStore for branch-1.
Updated metrics core from 3.2.1 to 3.2.6.
The rubocop definition for the maximum method length was set to 75.
Fixes the formatting of the “Voting on Release Candidates” to actually show the quote and code formatting of the RAT check.
The rubocop configuration in the hbase-shell module now allows a line length with 100 characters, instead of 80 as before. For everything before 2.1.5 this change introduces rubocop itself.
This change allows the system and superusers to initiate compactions, even when a space quota violation policy disallows compactions from happening. The original intent behind disallowing of compactions was to prevent end-user compactions from creating undue I/O load, not disallowing *any* compaction in the system.
Adds new configuration hbase.client.failure.map.cleanup.interval which defaults to ten minutes.
Updates libs used internally by hbase via hbase-thirdparty as follows:
* gson 2.8.1 -> 2.8.5
* guava 22.0 -> 27.1-jre
* pb 3.5.1 -> 3.7.0
* netty 4.1.17 -> 4.1.34
* commons-collections4 4.1 -> 4.3
Introduced Future<Void> createTableAsync(TableDescriptor);
Introduced these methods: void move(byte[]); void move(byte[], ServerName); Future<Void> splitRegionAsync(byte[]);
These methods are deprecated: void move(byte[], byte[])
Added a new Jenkinsfile for running the pre-commit check for GitHub PRs.
Add cloneSnapshot/restoreSnapshot with acl methods in AsyncAdmin.
When permissions are insufficient, you now get:
HTTP/1.1 403 Forbidden
on the HTTP side, and in the message
Forbidden
org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user ‘myuser', action: get, tableName: mytable, family: cf.
  at org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor.authorizeAccess(RangerAuthorizationCoprocessor.java:547)
and the rest of the ADE stack
Now we sort the javac WARNING/ERROR output before generating the diff in pre-commit, so we get stable output for error prone. The downside is that we just sort the output lexicographically, so line numbers are also sorted lexicographically, which looks a bit strange to humans.
Exposes a new configuration property “zookeeper.multi.max.size” which dictates the maximum size of deletes that HBase will make to ZooKeeper in a single RPC. This property defaults to 1MB, which should fall beneath the default ZooKeeper limit of 2MB, controlled by “jute.maxbuffer”.
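A hypothetical hbase-site.xml override lowering the batch size, for clusters where ZooKeeper's jute.maxbuffer has been reduced below its default (the value shown is an example):

```
<property>
  <name>zookeeper.multi.max.size</name>
  <!-- 512KB; default is 1MB, which stays under ZooKeeper's 2MB jute.maxbuffer default -->
  <value>524288</value>
</property>
```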
Fixed awkward dependency issue that prevented site building.
HBase 2.1.4 shipped with an early version of this fix that incorrectly altered the libraries included in our binary assembly for using Apache Hadoop 2.7 (the current build default Hadoop version for 2.1.z). For folks running out of the box against a Hadoop 2.7 cluster (or folks who skip the installation step of replacing the bundled Hadoop libraries) this will result in a failure at Region Server startup due to a missing class definition. e.g.:
2019-03-27 09:02:05,779 ERROR [main] regionserver.HRegionServer: Failed construction RegionServer
java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:644)
  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:628)
  at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:171)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:356)
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
  at org.apache.hadoop.hbase.util.CommonFSUtils.getRootDir(CommonFSUtils.java:362)
  at org.apache.hadoop.hbase.util.CommonFSUtils.isValidWALRootDir(CommonFSUtils.java:411)
  at org.apache.hadoop.hbase.util.CommonFSUtils.getWALRootDir(CommonFSUtils.java:387)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeFileSystem(HRegionServer.java:704)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:613)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:3029)
  at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:63)
  at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3047)
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.SamplerBuilder
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 26 more
Workaround via any one of the following:
* Ensure the hadoop executable is in the PATH seen at Region Server startup and that you are not using the HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP bypass.
* Make htrace-core-3.1.0-incubating.jar available to all Region Servers via the HBASE_CLASSPATH environment variable.
* Make htrace-core-3.1.0-incubating.jar available to all Region Servers by copying it into the directory ${HBASE_HOME}/lib/client-facing-thirdparty/
Added a listTableDescriptors(List<TableName>) method in the AsyncAdmin interface, to align with the Admin interface.
Add a mergeRegionsAsync(byte[][], boolean) method in the AsyncAdmin interface.
Instead of using assert, we now throw IllegalArgumentException when you try to merge fewer than 2 regions at the client side. And at the master side, instead of using assert, we now throw DoNotRetryIOException if you try to merge more than 2 regions, since we only support merging two regions at a time for now.
Added a drainXXX parameter to the balancerSwitch/splitSwitch/mergeSwitch methods in the AsyncAdmin interface, which has the same meaning as the synchronous parameter of these methods in the Admin interface.
Bulk load (HFileOutputFormat2) supports configuring the compression on the client side: you can set the job configuration “hbase.mapreduce.hfileoutputformat.compression” to override the auto-detection of the target table's compression.
Deprecated AsyncTable.isTableAvailable(TableName, byte[][]).
After HBASE-21871, we can specify a peer table name with --peerTableName in VerifyReplication tool like the following: hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --peerTableName=peerTable 5 TestTable
In addition, we can compare any 2 tables in any remote clusters by specifying both a peerId and --peerTableName.
For example: hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --peerTableName=peerTable zk1,zk2,zk3:2181/hbase TestTable
Adds below flush, split, and compaction metrics
From 2.2.0, hbase supports client login via keytab. To use this feature, the client should specify `hbase.client.keytab.file` and `hbase.client.keytab.principal` in hbase-site.xml; the connection will then contain the needed credentials, which are renewed periodically, to communicate with the kerberized hbase cluster.
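A hypothetical hbase-site.xml fragment for this feature; the keytab path and principal below are placeholders, not real values:

```
<property>
  <name>hbase.client.keytab.file</name>
  <value>/etc/security/keytabs/myclient.keytab</value> <!-- example path -->
</property>
<property>
  <name>hbase.client.keytab.principal</name>
  <value>myclient/_HOST@EXAMPLE.COM</value> <!-- example principal -->
</property>
```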
After HBASE-21410, we add a helper page to the Master UI. This helper page mainly helps HBase operators quickly find all regions and pids that are stuck. There are 2 entries to get to this page. One is in the Regions in Transition section: “num region(s) in transition” is now a link that you can click to check all regions in transition and their related procedure IDs. The other is in the table details section: the number of CLOSING or OPENING regions is now a link, which you can click to check the regions and related procedure IDs of CLOSING or OPENING regions of a certain table. In this helper page, not only can you see all regions and related procedures, there are also 2 buttons at the top which show these regions or procedure IDs in text format. This mainly aims to help operators easily copy and paste all problematic procedure IDs and encoded region names into HBCK2's command line, by which HBase operators can bypass these procedures or assign these regions.
After HBASE-21588, we introduce a new way to do WAL splitting coordination via the procedure framework. This simplifies the process of WAL splitting and removes the need to connect to zookeeper. During ServerCrashProcedure, it will create a SplitWALProcedure for each WAL that needs to be split. Then each SplitWALProcedure will spawn a SplitWALRemoteProcedure to send the request to a regionserver. At the RegionServer side, the whole process is handled by SplitWALCallable. It splits the WAL and returns the result to the master. According to my test, this patch performs better as the number of WALs that need to be split increases. And it can relieve the pressure on zookeeper.
Previously the recovered.edits directory was under the root directory. This JIRA moves the recovered.edits directory to be under the hbase.wal.dir if set. It also adds a check for any recovered.edits found under the root directory, for backwards compatibility. This gives improvements when faster media (like SSD) or a more local FileSystem is used for the hbase.wal.dir than for the root dir.
When the oldwals (and hfile) cleaner cleans stale wals (and hfiles), it will periodically check and wait for the clean results from the filesystem; the total wait time will be no more than a max time.
The periodic wait and check configurations are hbase.oldwals.cleaner.thread.check.interval.msec (default 500 ms) and hbase.regionserver.hfilecleaner.thread.check.interval.msec (default 1000 ms).
Meanwhile, the max time configurations are hbase.oldwals.cleaner.thread.timeout.msec and hbase.regionserver.hfilecleaner.thread.timeout.msec; they are set to 60 seconds by default.
All support dynamic configuration.
E.g. in the oldwals cleaning scenario, one may consider tuning hbase.oldwals.cleaner.thread.timeout.msec and hbase.oldwals.cleaner.thread.check.interval.msec
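For example, in hbase-site.xml (the values shown are the stated defaults, included only to make the knobs concrete):

```
<property>
  <name>hbase.oldwals.cleaner.thread.timeout.msec</name>
  <value>60000</value> <!-- give up waiting on a delete after 60s -->
</property>
<property>
  <name>hbase.oldwals.cleaner.thread.check.interval.msec</name>
  <value>500</value> <!-- poll the filesystem for the result every 500ms -->
</property>
```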
HBASE-21481 improves the quality of access control by strengthening the protection of superusers' privileges.
Now we have four types of RIT procedure metrics: assign, unassign, move, reopen. The meaning of assign/unassign has changed, as we will not increase the unassign metric and then the assign metric when moving a region. Also introduced two new procedure metrics, open and close, which are used to track the open/close region calls to the region server. We may send open/close multiple times to finish a RIT, since we may retry multiple times.
Problem: This is an old problem since HBASE-2231. The compaction event marker was only written to the WAL. But after a flush, the WAL may be archived, which means a useful compaction event marker is deleted too. So the compacted store files cannot be archived when the region opens and replays the WAL.
Solution: After this jira, the compaction event tracker will be written to the HFile. When the region opens and loads store files, it reads the compaction event tracker from the HFile and archives the compacted store files which still exist.
HBase contains two quota scopes: MACHINE and CLUSTER. Before this patch, set quota operations did not expose the scope option to the client api and used MACHINE as the default; CLUSTER scope could not be set or used. Shell commands are as follows: set_quota, TYPE => THROTTLE, TABLE => ‘t1’, LIMIT => ‘10req/sec’
This issue implements CLUSTER scope in a simple way: for user, namespace, and user-over-namespace quotas, use [ClusterLimit / RSNum] as the machine limit. For table and user-over-table quotas, use [ClusterLimit / TotalTableRegionNum * MachineTableRegionNum] as the machine limit. After this patch, users can set CLUSTER scope quotas, but MACHINE is still the default if the user omits the scope. Shell commands are as follows: set_quota, TYPE => THROTTLE, TABLE => ‘t1’, LIMIT => ‘10req/sec’ set_quota, TYPE => THROTTLE, TABLE => ‘t1’, LIMIT => ‘10req/sec’, SCOPE => MACHINE set_quota, TYPE => THROTTLE, TABLE => ‘t1’, LIMIT => ‘10req/sec’, SCOPE => CLUSTER
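The machine-limit arithmetic above can be sketched in plain Java (integer division is used for illustration; HBase's internal rounding may differ, and the class/method names are made up):

```java
// Sketch of how a CLUSTER-scope quota is split into per-machine limits.
public class ClusterQuota {
    // user / namespace / user-over-namespace quota: split evenly across region servers
    static long machineLimitByRs(long clusterLimit, int rsNum) {
        return clusterLimit / rsNum;
    }
    // table / user-over-table quota: split proportionally to region placement
    static long machineLimitByRegions(long clusterLimit, int totalTableRegionNum, int machineTableRegionNum) {
        return clusterLimit / totalTableRegionNum * machineTableRegionNum;
    }
    public static void main(String[] args) {
        System.out.println(machineLimitByRs(100, 4));          // 25 req/sec on each of 4 RSs
        System.out.println(machineLimitByRegions(100, 10, 3)); // 30 req/sec on an RS hosting 3 of 10 regions
    }
}
```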
Change spotbugs version to 3.1.11.
Removed bloom filter type ROWPREFIX_DELIMITED. It may be added back when we find a better solution.
Support enabling or disabling the exceed throttle quota. Exceed throttle quota means that a user can over-consume user/namespace/table quota if the region server has additional available quota because other users are not consuming at the same time. Use the following shell commands to enable/disable the exceed throttle quota: enable_exceed_throttle_quota / disable_exceed_throttle_quota. There are two limits when the exceed throttle quota is enabled:
Removed jackson dependencies from most hbase modules except hbase-rest; use shaded gson instead. The output json will be a bit different, since jackson can use getters/setters, but gson will always use the fields.
Marked HConstants.META_QOS as deprecated. It is for internal use only and is the highest priority. You should not try to set a priority greater than or equal to this value; doing so is harmless but also useless.
This patch adds the ability to disable split and/or merge for a table (By default, split and merge are enabled for a table).
Allows the shell to set Scan options previously not exposed. See the additions as part of the scan help by typing the following in the hbase shell:
hbase> help ‘scan’
We can specify a peerQuorumAddress instead of a peerId in the VerifyReplication tool, so it no longer requires a peerId to be set up when using this tool.
For example: hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication zk1,zk2,zk3:2181/hbase testTable
Introduced a VerifyWALEntriesReplicationEndpoint which replicates nothing and only verifies that all the cells are valid. It can be used to catch bugs in WAL writing, as most of the time we will not read the WALs again after writing them if there are no region server crashes.
Deprecated the HBaseConfiguration#getInt(Configuration, String, String, int) method; it is removed in the 3.0.0 version.
Introduced a new config key in this issue: hbase.regionserver.inmemory.compaction.pool.size. The default value is 10. You can configure this to set the pool size of the in-memory compaction pool. Note that all memstores in one region server will share the same pool, so if you have many regions in one region server, you need to set this larger to compact faster for better read performance.
Make StoppedRpcClientException extend DoNotRetryIOException.
To implement user permission control in Procedure V2, the grant and revoke methods are first moved from AccessController to the master. AccessController#grant and AccessController#revoke are marked as deprecated; please use Admin#grant and Admin#revoke instead.
IMPORTANT: Due to security issues, all users who use hbase thrift should avoid using releases which do not have this fix.
The affected releases are: 2.1.x: 2.1.2 and below; 2.0.x: 2.0.4 and below; 1.x: 1.4.x and below.
If you are using one of the affected releases above, please consider upgrading to a newer release ASAP.
HTableMultiplexer exposes the implementation class and is incomplete, so we mark it as deprecated and will remove it in the 3.0.0 release.
There is no direct replacement for HTableMultiplexer, please use BufferedMutator if you want to batch mutations to a table.
Introduce a BulkLoadHFiles interface which is marked as IA.Public, for doing bulk load programmatically. Introduce a BulkLoadHFilesTool which extends BulkLoadHFiles, and is marked as IA.LimitedPrivate(TOOLS), for using from command line. The old LoadIncrementalHFiles is deprecated and will be removed in 3.0.0.
Moved the two getHbck methods from ClusterConnection to Connection, and marked the methods as IA.LimitedPrivate(HBCK), as ClusterConnection is IA.Private and should not be depended on by HBCK2.
Added a clearRegionLocationCache method in Connection to clear the region location cache for all the tables. Since most of the methods in RegionLocator have a ‘reload’ parameter, which implicitly tells users that we have a region location cache, adding a method to clear the cache is fine.
Support setting a region server rpc throttle quota, which represents the read/write capacity of region servers and throttles requests when a region server's total requests exceed the limit.
Use the following shell commands to set an RS quota: set_quota TYPE => THROTTLE, REGIONSERVER => ‘all’, THROTTLE_TYPE => WRITE, LIMIT => ‘20000req/sec’ set_quota TYPE => THROTTLE, REGIONSERVER => ‘all’, LIMIT => NONE “all” represents the throttle quota of all region servers; setting a quota for a specific region server isn't supported currently.
In the shell commands “describe_namespace” and “describe”, which are used to see the descriptors of namespaces and tables respectively, quotas set on that particular namespace/table will also be printed.
Adds shell support for the following:
After HBASE-21620, filterListWithOR became a bit slow because we need to merge each sub-filter's return code, while before HBASE-21620 we would skip much of the return-code merging, though that logic was wrong. So here we choose another way to optimize the performance: removing KeyValueUtil#toNewKeyCell. Anoop Sam John suggested that KeyValueUtil#toNewKeyCell could save some GC before, because if we copy the key part of a cell into a single byte[], then the block the cell refers to is no longer referenced by the filter list, and the upper layer can GC the data block quickly. But after HBASE-21620 we update the prevCellList for every encountered cell, so the lifecycle of a cell in prevCellList for FilterList is much shorter; so we just use the cell reference to save cpu. BTW, we removed all the array stream usage in the filter list, because it was also quite time-consuming in our tests.
We found that memstore snapshotting would cost much time because of calls to the time-consuming ConcurrentSkipListMap#size, which would cause p999 latency spikes. So in this issue, we removed all ConcurrentSkipListMap#size calls in the memstore by counting the cellsCount in MemStoreSizing. As the issue describes, the p999 latency spike was mitigated.
Provides a new throttle type: capacity unit. One read/write/request capacity unit represents up to 1K of data read/written/read+written. If the data size is more than 1K, additional capacity units are consumed.
Use shell command to set capacity unit(CU): set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, USER => ‘u1’, LIMIT => ‘10CU/sec’
Use the “hbase.quota.read.capacity.unit” property to set the data size of one read capacity unit in bytes, the default value is 1K. Use the “hbase.quota.write.capacity.unit” property to set the data size of one write capacity unit in bytes, the default value is 1K.
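The consumption rule can be sketched as ceiling division over the configured unit size (an illustration of the described behavior, not HBase's internal code; the class and method names are made up):

```java
// Sketch of capacity-unit accounting: one CU covers up to unitSizeBytes of
// data; larger requests consume additional units (ceiling division).
public class CapacityUnits {
    static long consumedUnits(long dataSizeBytes, long unitSizeBytes) {
        if (dataSizeBytes <= 0) return 0; // hypothetical handling for empty requests
        return (dataSizeBytes + unitSizeBytes - 1) / unitSizeBytes;
    }
    public static void main(String[] args) {
        long unit = 1024; // default unit size: 1K
        System.out.println(consumedUnits(100, unit));  // 1: anything under 1K costs one unit
        System.out.println(consumedUnits(1024, unit)); // 1: exactly 1K still costs one unit
        System.out.println(consumedUnits(2500, unit)); // 3: ~2.4K costs three units
    }
}
```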
Does thread dump on stdout on abort.
Now all the Enum configs in ColumnFamilyDescriptor can accept lowercase config values.
Python3 support was added to dev-support/submit-patch.py. To install newly required dependencies run `pip install -r dev-support/python-requirements.txt` command.
In HBASE-21657, I simplified the path of estimatedSerializedSize() & estimatedSerializedSizeOfCell() by moving the general getSerializedSize() and heapSize() from ExtendedCell to the Cell interface. The patch also included some other improvements:
We gain almost 40% throughput improvement in the 100% scan case for branch-2 (cacheHitRatio 100%) [1]; it's a good thing. However, it's an incompatible change in some cases: if upstream users implemented their own Cells (rare, but it can happen), their code will fail to compile.
Adds task monitor that shows ServerCrashProcedure progress in UI.
Before this issue, the thrift1 server and thrift2 server were totally different servers. If a new feature was added to the thrift1 server, the thrift2 server had to make the same change to support it (e.g. authorization). After this issue, the thrift2 server inherits from thrift1; the thrift2 server now has all the features the thrift1 server has (e.g. HTTP support, which the thrift2 server didn't have before). The way to start the thrift1 or thrift2 server remains the same after this issue.
ThriftAdmin/ThriftTable are implemented based on thrift2. With ThriftAdmin/ThriftTable, people can use the thrift2 protocol just like HTable/HBaseAdmin. Example of using ThriftConnection:
{code}
Configuration conf = HBaseConfiguration.create();
conf.set(ClusterConnection.HBASE_CLIENT_CONNECTION_IMPL, ThriftConnection.class.getName());
Connection conn = ConnectionFactory.createConnection(conf);
Table table = conn.getTable(tablename);
{code}
It is just like a normal Connection, with a similar user experience to the default ConnectionImplementation.
There was a bug when scanning with the same startRow (inclusive=true) and stopRow (inclusive=false). The old incorrect behavior was to return one result. After this fix, the correct behavior is to return nothing.
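The corrected semantics follow from treating such a scan as the half-open range [startRow, stopRow); a self-contained plain-Java sketch (not HBase's scanner code):

```java
// A row is selected when startRow <= row < stopRow, so an equal start and
// stop row selects nothing.
public class ScanRangeSemantics {
    static boolean inRange(String row, String startRow, String stopRow) {
        return row.compareTo(startRow) >= 0 && row.compareTo(stopRow) < 0;
    }
    public static void main(String[] args) {
        System.out.println(inRange("row1", "row1", "row2")); // true: start is inclusive
        System.out.println(inRange("row1", "row1", "row1")); // false: empty range, the fixed behavior
    }
}
```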
Support enabling or disabling rpc throttling when the hbase quota is enabled. If the hbase quota is enabled, rpc throttling is enabled by default. When rpc throttling is disabled, HBase will not throttle any request. Use the following commands to switch rpc throttling: enable_rpc_throttle / disable_rpc_throttle.
Added a new configuration “hbase.skip.load.duplicate.table.coprocessor”. The default value is false, to keep compatibility with the old behavior. Set it to true to skip loading duplicate table coprocessors.
Added DDL operations and some other structure definitions to thrift2. Methods added: create/modify/addColumnFamily/deleteColumnFamily/modifyColumnFamily/enable/disable/truncate/delete table; create/modify/delete namespace; get(list)TableDescriptor(s)/get(list)NamespaceDescriptor(s); tableExists/isTableEnabled/isTableDisabled/isTableAvailable. And some class definitions along with those methods.
Deprecated the region coprocessor hook postMutationBeforeWAL and introduced two new region coprocessor hooks, postIncrementBeforeWAL and postAppendBeforeWAL, instead.
Use the de.skuzzle.enforcer.restrict-imports-enforcer-rule extension for the maven enforcer plugin to ban illegal imports at compile time. Now if you use illegal imports, for example import com.google.common.*, there will be a compile error instead of a checkstyle warning.
Added a sanity check when constructing a KeyValue from a byte[]. We use that constructor when we're reading KeyValues from a socket, an HFile, or the WAL (replication). The sanity check isn't designed to discover bit corruption from network transfer or disk IO; it is designed to detect bugs inside HBase in advance. And HBASE-21459 indicated that there's an extremely small performance loss for different kinds of keyvalues.
The replication UI on master will show the replication endpoint classname.
Add a SERIAL flag to the add_peer command to identify whether or not the replication peer is a serial replication peer. The default serial flag is false.
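A hedged shell sketch of the flag (the peer id and cluster key below are placeholders):

{code}
hbase> add_peer '1', CLUSTER_KEY => "zk1.example.org:2181:/hbase", SERIAL => true
{code}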
Log level of ReadOnlyZKClient moved to debug.
The HBase shell now includes a command to list regions currently in transition.
{code}
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct 8 21:05:50 UTC 2018

hbase(main):001:0> help 'rit'
List all regions in transition.
Examples:
  hbase> rit

hbase(main):002:0> create ...
0 row(s) in 2.5150 seconds
=> Hbase::Table - IntegrationTestBigLinkedList

hbase(main):003:0> rit
0 row(s) in 0.0340 seconds

hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
0 row(s) in 0.0540 seconds

hbase(main):005:0> rit
IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1. state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null
1 row(s) in 0.0170 seconds
{code}
Allow passing of -Dkey=value options to the shell to override hbase-* configuration, e.g.:

{code}
$ ./bin/hbase shell -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -Draining=false
...
hbase(main):001:0> @shell.hbase.configuration.get("hbase.zookeeper.quorum")
=> "ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org"
hbase(main):002:0> @shell.hbase.configuration.get("raining")
=> "false"
{code}
Incompatible change. Allow MasterObserver#preModifyTable to return a new TableDescriptor, which the master will then use to modify the table.
HBase clusters may experience Region Server failures due to out-of-memory errors caused by a resource leak in the following circumstances:
When there are long running scans, the Region Server process attempts to optimize access by using a different API geared towards sequential access. Due to an error in HBASE-20704 for HBase 2.0+, the Region Server fails to release the related resources when those scans finish. That same optimization path is always used for the HBase internal file compaction process.
Impact of this error can be minimized by setting the config value "hbase.storescanner.pread.max.bytes" to MAX_INT to avoid the optimization for default user scans. Clients should also be checked to ensure they do not pass the STREAM read type to the Scan API. Note that avoiding the optimization will have a severe impact on performance for long scans.
Compactions always use this sequential optimized reading mechanism, so downstream users will need to periodically restart Region Server roles after compactions have happened.
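To apply the mitigation described above, a minimal hbase-site.xml sketch (2147483647 is Integer.MAX_VALUE):

{code}
<property>
  <name>hbase.storescanner.pread.max.bytes</name>
  <value>2147483647</value>
</property>
{code}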
Add a new method preCreateTableRegionInfos to MasterObserver, which will be called before creating region infos for the given table, i.e. before the preCreateTable method. It allows you to return a new TableDescriptor to override the original one. Returning null or throwing an exception will stop the creation.
After HBASE-21492 the return type of WALCellCodec#getWALCellCodecClass has been changed from String to Class.
To prevent a race condition between an in-progress snapshot (performed by TakeSnapshotHandler) and HFileCleaner, which results in data loss, this JIRA introduced mutual exclusion between taking a snapshot and running HFileCleaner. That is, at any given moment, either a snapshot can be taken, or HFileCleaner can check for unreferenced hfiles, but not both.
Changes the group name of hbase metrics from "HBase Counters" to "HBaseCounters".
Parent issue moved hbase-spark* modules to hbase-connectors. This issue removes hbase-spark* modules from hbase core repo.
hbase-spark* modules have been cloned to https://github.com/apache/hbase-connectors All spark connector dev is to happen in that repo from here on out.
Add -Djdk.net.URLClassPath.disableClassPathURLCheck=true when executing surefire plugin.
Puts master startup into a holding pattern if meta is not assigned (previously it would exit). To make progress again, the operator needs to inject an assign (caveats and instructions can be found in HBASE-21035).
Adds scheduleServerCrashProcedure to the HbckService.
Add two new configs, hbase.regionserver.abort.timeout and hbase.regionserver.abort.timeout.task. If a region server abort times out, it will schedule an abort timeout task to run. The default abort task is SystemExitWhenAbortTimeout, which forcibly terminates the region server when the abort times out. You can configure a custom abort timeout task via hbase.regionserver.abort.timeout.task.
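An illustrative hbase-site.xml sketch (the timeout value and the task class's package are assumptions for illustration; the note above only names the class SystemExitWhenAbortTimeout):

{code}
<property>
  <name>hbase.regionserver.abort.timeout</name>
  <value>1200000</value>
</property>
<property>
  <name>hbase.regionserver.abort.timeout.task</name>
  <value>org.apache.hadoop.hbase.regionserver.SystemExitWhenAbortTimeout</value>
</property>
{code}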
Adds to bin/hbase a means of invoking hbck2. Pass the new '-j' option on the 'hbck' command with a value of the full path to the HBCK2.jar.
E.g:
$ ./bin/hbase hbck -j ~/checkouts/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar setTableState x ENABLED
Retry assigns 'forever' (or until an intervention such as a ServerCrashProcedure).
Previously, retry was a maximum of ten times, but on failure, handling was indeterminate.
The description claims the balancer is not dynamically configurable, but this is an error; it is: http://hbase.apache.org/book.html#dyn_config
Also, if the balancer is seen to be cutting out too soon, try setting "hbase.master.balancer.stochastic.runMaxSteps" to true.
Adds cleaner logging around balancer start.
HBASE-21073 | Major | “Maintenance mode” master
Instead of being an ephemeral state set by hbck, maintenance mode is now an explicit toggle set by either configuration property or environment variable. In maintenance mode, master will host system tables and not assign any user-space tables to RSs. This gives operators the ability to affect repairs to meta table with fewer moving parts.
Changed the waitTime parameter to lockWait on bypass. Changed the default waitTime from 0 -- i.e. wait forever -- to 1ms, so if the lock is held, we'll move past it, and if override is set, enforce the bypass.
bypass will now throw an Exception if passed a lockWait <= 0; i.e. bypass will prevent an operator getting stuck waiting forever on an entity lock (lockWait == 0).
Cleans up usage and docs around Canary. Does not change command-line args (though we should -- smile).
For the sub procedures which are successfully finished, do not do rollback. This is a change in rollback behavior.
State changes which are done by sub procedures should be handled by parent procedures when rolling back. For example, when rolling back a MergeTableProcedure, we will schedule new procedures to bring the offline regions online instead of rolling back the original procedures which off-lined the regions (in fact these procedures can not be rolled back...).
Scans that make use of QualifierFilter previously would erroneously return columns with an empty qualifier along with those that matched. After this change, only the columns that match are returned.
It is recommended to place the working directory on-cluster on HDFS as doing so has shown a strong performance increase due to data locality. It is important to note that the working directory should not overlap with any existing directories as the working directory will be cleaned out during the snapshot process. Beyond that, any well-named directory on HDFS should be sufficient.
This adds two extra features to the WALPrettyPrinter tool:
Output, for each cell, the combined size of the cell descriptors plus the cell value itself in a given WAL edit. This is printed in the results as "cell total size sum:" info by default;
An optional -g/--goto argument that allows seeking straight to a specific WAL file position, then sequentially reading the WAL from that point towards its end;
Local HBase cluster (as used by unit tests) wait times on startup and initialization can be configured via `hbase.master.start.timeout.localHBaseCluster` and `hbase.master.init.timeout.localHBaseCluster`
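For illustration, the two keys might be tuned in hbase-site.xml like so (the millisecond values shown are hypothetical, not shipped defaults):

{code}
<property>
  <name>hbase.master.start.timeout.localHBaseCluster</name>
  <value>30000</value>
</property>
<property>
  <name>hbase.master.init.timeout.localHBaseCluster</name>
  <value>200000</value>
</property>
{code}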
Adds anchors #tables, #tasks, etc.
Add table state column to the tables panel
Removed the abort_procedure command from shell -- dangerous -- and deprecated abortProcedure in Admin API.
Add two bloom filter types: ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED.
Adds 'raw' assigns/unassigns to the Hbck Service. Takes a list of encoded region names and bulk assigns/unassigns. Skirts the Master 'state' check and does not invoke Coprocessors. For repair only.
Here is what HBCK2 usage looks like now:
{code}
$ java -cp hbase-hbck2-1.0.0-SNAPSHOT.jar org.apache.hbase.HBCK2
usage: HBCK2 <OPTIONS> COMMAND [<ARGS>]

Options:
 -d,--debug                    run with debug output
 -h,--help                     output this help message
 --hbase.zookeeper.peerport    peerport of target hbase ensemble
 --hbase.zookeeper.quorum      ensemble of target hbase
 --zookeeper.znode.parent      parent znode of target hbase

Commands:
 setTableState <TABLENAME> <STATE>
   Possible table states: ENABLED, DISABLED, DISABLING, ENABLING
   To read current table state, in the hbase shell run:
     hbase> get 'hbase:meta', '<TABLENAME>', 'table:state'
   A value of \x08\x00 == ENABLED, \x08\x01 == DISABLED, etc.
   An example making table name 'user' ENABLED:
     $ HBCK2 setTableState users ENABLED
   Returns whatever the previous table state was.

 assign <ENCODED_REGIONNAME> ...
   A 'raw' assign that can be used even during Master initialization.
   Skirts Coprocessors. Pass one or more encoded RegionNames:
   e.g. 1588230740 is the hard-coded encoding for the hbase:meta region
   and de00010733901a05f5a2a3a382e27dd4 is an example of what a random
   user-space encoded Region name looks like. For example:
     $ HBCK2 assign 1588230740 de00010733901a05f5a2a3a382e27dd4
   Returns the pid of the created AssignProcedure or -1 if none.

 unassign <ENCODED_REGIONNAME> ...
   A 'raw' unassign that can be used even during Master initialization.
   Skirts Coprocessors. Pass one or more encoded RegionNames:
   de00010733901a05f5a2a3a382e27dd4 is an example of what a random
   user-space encoded Region name looks like. For example:
     $ HBCK2 unassign 1588230740 de00010733901a05f5a2a3a382e27dd4
   Returns the pid of the created UnassignProcedure or -1 if none.
{code}
This change ensures Append operations are assembled into the expected order.
Make it so one can run the WAL parse and load system in isolation. Here is an example:
{code}$ HBASE_OPTS=" -XX:+UnlockDiagnosticVMOptions -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:+DebugNonSafepoints" ./bin/hbase org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore ~/big_set_of_masterprocwals/ {code}
Add a new nettyDirectMemoryUsage under server's ipc metrics to show direct memory usage for netty rpc server.
Client facing artifacts are now built whenever Maven is run through the "package" goal. Previously, the client facing artifacts would create placeholder jars that skipped repackaging HBase and third-party dependencies unless the "release" profile was active.
Build times may be noticeably longer depending on your build hardware. For example, the Jenkins worker nodes maintained by ASF Infra take ~14% longer to do a full packaging build. An example portability-focused personal laptop took ~25% longer.
Allows configuration of the length of RPC messages printed to the log at TRACE level via "hbase.ipc.trace.param.size" in RpcServer.
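A hedged example of raising the TRACE truncation length (512 is an arbitrary illustrative value):

{code}
<property>
  <name>hbase.ipc.trace.param.size</name>
  <value>512</value>
</property>
{code}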
Users who have previously made use of prefix tree encoding can now check that their existing HFiles no longer contain data that uses it with an additional preupgrade check command.
hbase pre-upgrade validate-hfile
Please see the “HFile Content validation” section of the ref guide's coverage of the pre-upgrade validator tool for usage details.
Adds an HBCK Service and a first method to force-change-in-table-state for use by an HBCK client effecting ‘repair’ to a malfunctioning HBase.
Cleanup all the cluster start override combos in HBaseTestingUtility by adding a StartMiniClusterOption and Builder.
Fence out hbase-1.x hbck1 instances. Stop them making state changes on an hbase-2.x cluster; they could do damage. We do this by writing the hbck1 lock file into place on hbase-2.x Master start-up.
To disable this new behavior, set hbase.write.hbck1.lock.file to false
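A minimal hbase-site.xml fragment to opt out, using the key named above:

{code}
<property>
  <name>hbase.write.hbck1.lock.file</name>
  <value>false</value>
</property>
{code}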
Introduced a new TransitRegionStateProcedure to replace the old AssignProcedure/UnassignProcedure/MoveRegionProcedure. In the old code, MRP will not be attached to RegionStateNode, so it can not be interrupted by ServerCrashProcedure, which introduces lots of tricky code to deal with races, and also causes lots of other difficulties on how to prevent scheduling redundant or even conflict procedures for a region.
And now TRSP is the only procedure which can bring a region online or offline. When you want to schedule one, you need to check whether there is already one attached to the RegionStateNode, under the lock of the RegionStateNode. If not, just go ahead; if there is one, then you should do something, for example, give up and fail directly, or tell the TRSP to give up (this is what SCP does). Since the check and attach are both under the lock of the RSN, this greatly reduces the possible races and makes the code much simpler.
HFiles generated by 2.0.0, 2.0.1, and 2.1.0 are not forward compatible with 1.4.6-, 1.3.2.1-, 1.2.6.1-, and other inactive releases. The reason HFiles lose compatibility is that hbase in the new versions (2.0.0, 2.0.1, 2.1.0) uses protobuf to serialize/deserialize TimeRangeTracker (TRT) while old versions use DataInput/DataOutput. To solve this, we have to put HBASE-21012 in 2.x and HBASE-21013 in 1.x. For more information, please check HBASE-21008.
After HBASE-20965, we can use MasterFifoRpcScheduler in the master to route RegionServerReport requests to independent handlers. To use this feature, set "hbase.master.rpc.scheduler.factory.class" to "org.apache.hadoop.hbase.ipc.MasterFifoRpcScheduler". Use "hbase.master.server.report.handler.count" to set the RegionServerReport handler count; the default is half of the "hbase.regionserver.handler.count" value, but at least 1. The count of other handlers in the master is the "hbase.regionserver.handler.count" value minus the RegionServerReport handler count, but also at least 1.
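Putting the description above together, a sketch of the relevant hbase-site.xml entries (the handler count shown is an example, not a default):

{code}
<property>
  <name>hbase.master.rpc.scheduler.factory.class</name>
  <value>org.apache.hadoop.hbase.ipc.MasterFifoRpcScheduler</value>
</property>
<property>
  <name>hbase.master.server.report.handler.count</name>
  <value>15</value>
</property>
{code}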
In previous releases, when a Space Quota was configured on a table or namespace and that table or namespace was deleted, the Space Quota was also deleted. This change improves the implementation so that the same is also done for RPC Quotas.
After HBASE-20986, we can set different block sizes for the WAL and for recovered edits. Both default to 2 * the default HDFS block size. hbase.regionserver.recoverededits.blocksize sets the block size of recovered edits, while hbase.regionserver.hlog.blocksize sets the block size of the WAL.
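For example, to set the two block sizes independently (both byte values below are illustrative, not defaults):

{code}
<property>
  <name>hbase.regionserver.recoverededits.blocksize</name>
  <value>134217728</value>
</property>
<property>
  <name>hbase.regionserver.hlog.blocksize</name>
  <value>268435456</value>
</property>
{code}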
With this change, if a WAL's meta provider (hbase.wal.meta_provider) is not explicitly set, it now defaults to whatever hbase.wal.provider is set to. Previously, the two settings operated independently, each with its own default.
This change is operationally incompatible with previous HBase versions because the default WAL meta provider no longer defaults to AsyncFSWALProvider but to hbase.wal.provider.
The thought is that this is more in line with an operator's expectation, that a change in hbase.wal.provider is sufficient to change how WALs are written, especially given hbase.wal.meta_provider is an obscure configuration and that the very idea that meta regions would have their own wal provider would likely come as a surprise.
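As a sketch of the new behavior: with only the fragment below (no hbase.wal.meta_provider set), meta WALs now use the same provider; "filesystem" is one valid provider value, shown for illustration:

{code}
<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>
{code}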
Update hadoop-two.version to 2.7.7 and hadoop-three.version to 3.0.3 due to a JDK issue which is solved by HADOOP-15473.
Make hasLock method final, and add a locked field in Procedure to record whether we have the lock. We will set it to true in doAcquireLock and to false in doReleaseLock. The sub procedures do not need to manage it any more.
Also added a locked field in the proto message. When storing, the field will be set according to the return value of hasLock. And when loading, there is a new field in Procedure called lockedWhenLoading. We will set it to true if the locked field in proto message is true.
The reason why we can not set the locked field directly to true by calling doAcquireLock is that, during initialization, most procedures need to wait until the master is initialized. So the solution here is that we introduced a new method called waitInitialized in Procedure, and moved the wait-for-master-initialized code from acquireLock to this method. And we added a restoreLock method to Procedure: if lockedWhenLoading is true, we will call acquireLock to get the lock, but not set locked to true. Later, when we call doAcquireLock and pass the waitInitialized check, we will test lockedWhenLoading; if it is true, then we just set the locked field to true and return, without actually calling the acquireLock method, since we have already called it once.
Exposing 2 new metrics in HBase to provide ReadRequestRate and WriteRequestRate at region server level. These metrics give the rate of request handled by the region server and are reset after every monitoring interval.
Added a new command to the shell to switch compactions on/off, called "compaction_switch". Disabling compactions will interrupt any currently ongoing compactions. This setting will be lost on restart of the server. Added the configuration hbase.regionserver.compaction.enabled so users can enable/disable compactions via hbase-site.xml.
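To disable compactions persistently via configuration (the key named above), a minimal fragment; the shell command, by contrast, takes effect at runtime but is lost on restart:

{code}
<property>
  <name>hbase.regionserver.compaction.enabled</name>
  <value>false</value>
</property>
{code}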
Class org.apache.hadoop.hbase.util.Base64 has been removed in its entirety from HBase 2+. In HBase 1, unused methods have been removed from the class and the audience was changed from Public to Private. This class was originally intended as an internal utility class that could be used externally, but thinking has since changed; these classes should not have been advertised as public to end-users.
This represents an incompatible change for users who relied on this implementation. An alternative implementation for affected clients is available at java.util.Base64 when using Java 8 or newer; be aware, it may encode/decode differently. For clients seeking to restore this specific implementation, it is available in the public domain for download at http://iharder.sourceforge.net/current/java/base64/
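A hedged migration sketch using only the JDK (no HBase dependencies): java.util.Base64 on Java 8+ covers the common encode/decode round trip, but verify its output against what your clients previously produced, since the removed class may have encoded differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Migration {
    public static void main(String[] args) {
        byte[] data = "HBase".getBytes(StandardCharsets.UTF_8);

        // Encode with the JDK's built-in basic (RFC 4648) codec.
        String encoded = Base64.getEncoder().encodeToString(data);

        // Decode back to the original bytes.
        byte[] decoded = Base64.getDecoder().decode(encoded);

        System.out.println(encoded);                                     // SEJhc2U=
        System.out.println(new String(decoded, StandardCharsets.UTF_8)); // HBase
    }
}
```

java.util.Base64 also offers getUrlEncoder() and getMimeEncoder() variants, which may cover some uses of the removed class's options.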
This enhances the AccessControlClient APIs to retrieve permissions based on namespace, table name, family, and qualifier for a specific user. AccessControlClient can also validate whether a user is allowed to perform specified operations on a particular table. The following APIs have been added: