Release Notes - Hive - Version 0.8.0
** New Feature
* [HIVE-192] - Add TIMESTAMP column type for thrift dynamic_type
* [HIVE-306] - Support "INSERT [INTO] destination"
* [HIVE-788] - Triggers when a new partition is created for a table
* [HIVE-818] - Create a Hive CLI that connects to hive ThriftServer
* [HIVE-872] - Allow type widening on COALESCE/UNION ALL
* [HIVE-956] - Add support of columnar binary serde
* [HIVE-1003] - optimize metadata only queries
* [HIVE-1310] - Partitioning columns should be of primitive types only
* [HIVE-1343] - add an interface in RCFile to support concatenation of two files without (de)compression
* [HIVE-1537] - Allow users to specify LOCATION in CREATE DATABASE statement
* [HIVE-1694] - Accelerate GROUP BY execution using indexes
* [HIVE-1734] - Implement map_keys() and map_values() UDFs
* [HIVE-1735] - Extend Explode UDTF to handle Maps
* [HIVE-1803] - Implement bitmap indexing in Hive
* [HIVE-1918] - Add export/import facilities to the hive system
* [HIVE-1941] - support explicit view partitioning
* [HIVE-1950] - Block merge for RCFile
* [HIVE-2090] - Add "DROP DATABASE ... CASCADE/RESTRICT"
* [HIVE-2121] - Input Sampling By Splits
* [HIVE-2185] - extend table statistics to store the size of uncompressed data (+extend interfaces for collecting other types of statistics)
* [HIVE-2188] - Add get_table_objects_by_name() to Hive MetaStore
* [HIVE-2215] - Add api for marking / querying set of partitions for events
* [HIVE-2223] - support grouping on complex types in Hive
* [HIVE-2225] - Purge expired events
* [HIVE-2236] - Cli: Print Hadoop's CPU milliseconds
* [HIVE-2244] - Add a Plugin Developer Kit to Hive
* [HIVE-2272] - add TIMESTAMP data type
* [HIVE-2278] - Support archiving for multiple partitions if the table is partitioned by multiple columns
* [HIVE-2380] - Add Binary Datatype in Hive
* [HIVE-2500] - Allow Hive to be debugged remotely
* [HIVE-2509] - Literal bigint
* [HIVE-2561] - Allow UDFs to specify additional FILE/JAR resources necessary for execution
** Bug
* [HIVE-11] - better error code from Hive describe command
* [HIVE-106] - Join operation fails for some queries
* [HIVE-619] - Improve the error messages for missing/incorrect UDF/UDAF class
* [HIVE-1218] - CREATE TABLE t LIKE some_view should create a new empty base table, but instead creates a copy of view
* [HIVE-1302] - describe parse_url throws an error
* [HIVE-1342] - Predicate push down get error result when sub-queries have the same alias name
* [HIVE-1461] - Clean up references to 'hive.metastore.local'
* [HIVE-1538] - FilterOperator is applied twice with ppd on.
* [HIVE-1592] - ProxyFileSystem.close calls super.close twice.
* [HIVE-1595] - job name for alter table <T> archive partition <P> is not correct
* [HIVE-1631] - JDBC driver returns wrong precision, scale, or column size for some data types
* [HIVE-1675] - SAXParseException on plan.xml during local mode.
* [HIVE-1825] - Different defaults for hive.metastore.local
* [HIVE-1850] - alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?)
* [HIVE-1884] - Potential risk of resource leaks in Hive
* [HIVE-1937] - DDLSemanticAnalyzer won't take newly set Hive parameters
* [HIVE-1943] - Metastore operations (like drop_partition) could be improved in terms of maintaining consistency of metadata and data
* [HIVE-1959] - Potential memory leak when same connection used for long time. TaskInfo and QueryInfo objects are getting accumulated on executing more queries on the same connection.
* [HIVE-1963] - Don't set ivy.home in build-common.xml
* [HIVE-1965] - Auto convert mapjoin should not throw exception if the top operator is union operator.
* [HIVE-1973] - Getting error when join on tables where name of table has uppercase letters
* [HIVE-1974] - In error scenario some opened streams may not be closed in ScriptOperator.java, Utilities.java
* [HIVE-1975] - "insert overwrite directory" Not able to insert data with multi level directory path
* [HIVE-1976] - Exception should be thrown when invalid jar,file,archive is given to add command
* [HIVE-1980] - Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts
* [HIVE-1987] - HWI admin_list_jobs JSP page throws exception
* [HIVE-1988] - Make the delegation token issued by the MetaStore owned by the right user
* [HIVE-2001] - Add inputs and outputs to authorization DDL commands
* [HIVE-2003] - LOAD compilation does not set the outputs during semantic analysis resulting in no authorization checks being done for it.
* [HIVE-2008] - keyword_1.q is failing
* [HIVE-2022] - Making JDO thread-safe by default
* [HIVE-2024] - In Driver.execute(), mapred.job.tracker is not restored if one of the tasks fails.
* [HIVE-2025] - Fix TestEmbeddedHiveMetaStore and TestRemoteHiveMetaStore broken by HIVE-2022
* [HIVE-2031] - Correct the exception message for better traceability when loading into a partitioned table having 2 partitions while specifying only one partition in the load statement.
* [HIVE-2032] - create database does not honour warehouse.dir in dbproperties
* [HIVE-2033] - A database's warehouse.dir is not used for tables created in it.
* [HIVE-2034] - Backport HIVE-1991 after overridden by HIVE-1950
* [HIVE-2037] - Merge result file size should honor hive.merge.size.per.task
* [HIVE-2040] - the retry logic in Hive's concurrency is not working correctly.
* [HIVE-2042] - In error scenario some opened streams may not be closed
* [HIVE-2045] - TCTLSeparatedProtocol.SimpleTransportTokenizer.nextToken() throws Null Pointer Exception in some cases
* [HIVE-2054] - Exception on windows when using the jdbc driver. "IOException: The system cannot find the path specified"
* [HIVE-2060] - CLI local mode hit NPE when exiting by ^D
* [HIVE-2061] - Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
* [HIVE-2062] - HivePreparedStatement.executeImmediate always throws exception
* [HIVE-2069] - NullPointerException on getSchemas
* [HIVE-2080] - Few code improvements in the ql and serde packages.
* [HIVE-2083] - Bug: RowContainer was set to 1 in JoinUtils.
* [HIVE-2086] - Add test coverage for external table data loss issue
* [HIVE-2095] - auto convert map join bug
* [HIVE-2096] - throw an error if the input is larger than a threshold for index input format
* [HIVE-2098] - Make a couple of convenience methods in EximUtil public
* [HIVE-2100] - virtual column references inside subqueries cause execution exceptions
* [HIVE-2107] - Log4J initialization info should not be printed out if -S is specified
* [HIVE-2113] - In shell mode, local mode continues if a local-mode task throws exception in pre-hooks
* [HIVE-2117] - insert overwrite ignoring partition location
* [HIVE-2120] - auto convert map join may miss good candidates
* [HIVE-2122] - Remove usage of deprecated methods from org.apache.hadoop.io package
* [HIVE-2125] - alter table concatenate fails and deletes data
* [HIVE-2131] - Bitmap Operation UDF doesn't clear return list
* [HIVE-2138] - Exception when no splits returned from index
* [HIVE-2142] - Jobs do not get killed even when they created too many files.
* [HIVE-2145] - NPE during parsing order-by expression
* [HIVE-2146] - Block Sampling should adjust number of reducers accordingly to make it useful
* [HIVE-2151] - Too many open files in running negative cli tests
* [HIVE-2153] - Stats JDBC LIKE queries should escape '_' and '%'
* [HIVE-2157] - NPE in MapJoinObjectKey
* [HIVE-2159] - TableSample(percent ) uses one intermediate size to be int, which overflows for large sampled size, making the sampling never triggered.
* [HIVE-2160] - Few code improvements in the metastore,hwi and ql packages.
* [HIVE-2176] - Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus
* [HIVE-2178] - Log related Check style Comments fixes
* [HIVE-2181] - Clean up the scratch.dir (tmp/hive-root) while restarting Hive server.
* [HIVE-2182] - Avoid null pointer exception when executing UDF
* [HIVE-2183] - In Task class and its subclasses logger is initialized in constructor
* [HIVE-2184] - Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
* [HIVE-2186] - Dynamic Partitioning Failing because of characters not supported globStatus
* [HIVE-2192] - Stats table schema incompatible after HIVE-2185
* [HIVE-2196] - Ensure HiveConf includes all properties defined in hive-default.xml
* [HIVE-2197] - SessionState used before ThreadLocal set
* [HIVE-2198] - While using Hive in server mode, HiveConnection.close() is not cleaning up server side resources
* [HIVE-2199] - incorrect success flag passed to jobClose
* [HIVE-2204] - unable to get column names for a specific table that has '_' as part of its table name
* [HIVE-2211] - Fix a bug caused by HIVE-243
* [HIVE-2214] - CommandNeedRetryException.java is missing ASF header
* [HIVE-2222] - runnable queue in Driver and DriverContext is not thread safe
* [HIVE-2237] - hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
* [HIVE-2243] - Can't publish maven release artifacts to apache repository
* [HIVE-2248] - Comparison Operators convert number types to common type instead of double if possible
* [HIVE-2253] - Merge failing of join tree in exceptional case
* [HIVE-2257] - Enable TestHadoop20SAuthBridge
* [HIVE-2259] - Skip comments in hive script
* [HIVE-2260] - ExecDriver::addInputPaths should pass the table properties to the record writer
* [HIVE-2275] - Revert HIVE-2219 and apply correct patch to improve the efficiency of dropping multiple partitions
* [HIVE-2276] - Fix Inconsistency between RB and JIRA patches for HIVE-2194
* [HIVE-2281] - Regression introduced from HIVE-2155
* [HIVE-2286] - ClassCastException when building index with security.authorization turned on
* [HIVE-2287] - Error during UNARCHIVE of a partition
* [HIVE-2292] - Comment clause should immediately follow identifier field in CREATE DATABASE statement
* [HIVE-2294] - Allow ShimLoader to work with Hadoop 0.20-append
* [HIVE-2296] - bad compressed file names from insert into
* [HIVE-2298] - Fix UDAFPercentile to tolerate null percentiles
* [HIVE-2303] - files with control-A,B are not delimited correctly.
* [HIVE-2307] - Schema creation scripts for PostgreSQL use bit(1) instead of boolean
* [HIVE-2309] - Incorrect regular expression for extracting task id from filename
* [HIVE-2315] - DatabaseMetadata.getColumns() does not return partition column names for a table
* [HIVE-2319] - Calling alter_table after changing partition comment throws an exception
* [HIVE-2322] - Add ColumnarSerDe to the list of native SerDes
* [HIVE-2326] - Turn off bitmap indexing when map-side aggregation is turned off
* [HIVE-2328] - hive.zookeeper.session.timeout is set to null in hive-default.xml
* [HIVE-2331] - Turn off compression when generating index intermediate results
* [HIVE-2334] - DESCRIBE TABLE causes NPE when hive.cli.print.header=true
* [HIVE-2335] - Indexes are still automatically queried when out of sync with their source tables
* [HIVE-2337] - Predicate pushdown erroneously conservative with outer joins
* [HIVE-2338] - Alter table always throws an unhelpful error on failure
* [HIVE-2342] - mirror.facebook.net is 404ing
* [HIVE-2343] - stats not updated for non "load table desc" operations
* [HIVE-2344] - filter is removed due to regression of HIVE-1538
* [HIVE-2356] - Fix udtf_explode.q and udf_explode.q test failures
* [HIVE-2358] - JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types
* [HIVE-2362] - HiveConf properties not appearing in the output of 'set' or 'set -v'
* [HIVE-2366] - Metastore upgrade scripts for HIVE-2246 do not migrate indexes nor rename the old COLUMNS table
* [HIVE-2368] - Slow dropping of partitions caused by full listing of storage descriptors
* [HIVE-2369] - Minor typo in error message in HiveConnection.java (JDBC)
* [HIVE-2382] - Invalid predicate pushdown from incorrect column expression map for select operator generated by GROUP BY operation
* [HIVE-2383] - Incorrect alias filtering for predicate pushdown
* [HIVE-2384] - import of multiple partitions from a partitioned table with external location overwrites files
* [HIVE-2386] - Add Mockito to LICENSE file
* [HIVE-2391] - published POMs in Maven repo are incorrect
* [HIVE-2393] - Fix whitespace test diff accidentally introduced in HIVE-1360
* [HIVE-2398] - Hive server doesn't return schema for 'set' command
* [HIVE-2402] - Function like with empty string is throwing null pointer exception
* [HIVE-2405] - get_privilege does not get user level privilege
* [HIVE-2407] - File extensions not preserved in Hive.checkPaths when renaming new destination file
* [HIVE-2411] - Metastore server tries to connect to NN without authenticating itself
* [HIVE-2412] - Update Eclipse configuration to include Mockito dependency
* [HIVE-2413] - BlockMergeTask ignores client-specified jars
* [HIVE-2417] - Merging of compressed rcfiles fails to write the valuebuffer part correctly
* [HIVE-2429] - skip corruption bug that causes data not to be decompressed
* [HIVE-2431] - upgrading thrift version didn't upgrade libthrift.jar symlink correctly
* [HIVE-2451] - TABLESAMPLE(BUCKET xxx) sometimes doesn't trigger input pruning as regression of HIVE-1538
* [HIVE-2455] - Pass correct remoteAddress in proxy user authentication
* [HIVE-2459] - remove all @author tags from source
* [HIVE-2463] - fix Eclipse for javaewah upgrade
* [HIVE-2465] - Primitive Data Types returning null if the data is out of range of the data type.
* [HIVE-2466] - mapjoin_subquery dump small table (mapjoin table) to the same file
* [HIVE-2472] - Metastore statistics are not being updated for CTAS queries.
* [HIVE-2474] - Hive PDK needs an Ivy configuration file
* [HIVE-2481] - HadoopJobExecHelper does not handle null counters well
* [HIVE-2486] - Phabricator for code review
* [HIVE-2487] - Bug from HIVE-2446, the code that calls client stats publishers run() methods is in wrong place, should be in the same method but inside of while (!rj.isComplete()) {} loop
* [HIVE-2488] - PDK tests failing on Hudson because HADOOP_HOME is not defined
* [HIVE-2492] - PDK PluginTest failing on Hudson
* [HIVE-2497] - partition pruning prunes some correct partitions under specific conditions
* [HIVE-2499] - small table filesize for automapjoin is not consistent in HiveConf.java and hive-default.xml
* [HIVE-2501] - When new instance of Hive (class) is created, the current database is reset to default (current database shouldn't be changed).
* [HIVE-2510] - Hive throws Null Pointer Exception upon CREATE TABLE <db_name>.<table_name> .... if the given <db_name> doesn't exist
* [HIVE-2516] - cleanup QTestUtil: use test.data.files as current directory if one not specified
* [HIVE-2519] - Dynamic partition insert should enforce that the order of the partition spec is the same as the one in schema
* [HIVE-2522] - HIVE-2446 bug (next one) - If constructor of ClientStatsPublisher throws runtime exception it will be propagated to HadoopJobExecHelper's progress method and beyond, whereas it shouldn't
* [HIVE-2531] - Allow people to use only issue numbers without 'HIVE-' prefix with `arc diff --jira`.
* [HIVE-2532] - Evaluation of non-deterministic/stateful UDFs should not be skipped even if constant oi is returned.
* [HIVE-2534] - HiveIndexResult creation fails due to file system issue
* [HIVE-2536] - Support scientific notation for Double literals
* [HIVE-2548] - How to submit documentation fixes
* [HIVE-2550] - Provide jira_base_url for improved arc commit workflow
* [HIVE-2556] - upgrade script 008-HIVE-2246.mysql.sql contains syntax errors
* [HIVE-2562] - HIVE-2247 Changed the Thrift API causing compatibility issues.
* [HIVE-2565] - Add Java linter to Hive
* [HIVE-2568] - HIVE-2246 upgrade script needs to drop foreign key in COLUMNS_OLD
* [HIVE-2571] - eclipse template .classpath is broken
* [HIVE-2572] - HIVE-2246 upgrade script changed the COLUMNS_V2.COMMENT length
* [HIVE-2574] - ivy offline mode broken by changingPattern and checkmodified attributes
* [HIVE-2578] - Debug mode in some situations doesn't work properly when child JVM is started from MapRedLocalTask
* [HIVE-2580] - Hive build fails with error "java.io.IOException: Not in GZIP format"
* [HIVE-2581] - explain task: getJSONPlan throws a NPE if the ast is null
* [HIVE-2583] - bug in ivy 2.2.0 breaks build
* [HIVE-2588] - Update arcconfig to include commit listener
* [HIVE-2590] - HBase bulk load wiki page improvements
* [HIVE-2598] - Update README.txt file to use description from wiki
* [HIVE-2613] - HiveCli eclipse launch configuration hangs
* [HIVE-2622] - Hive POMs reference the wrong Hadoop artifacts
* [HIVE-2624] - Fix eclipse classpath template broken in HIVE-2523
* [HIVE-2625] - Fix maven-build Ant target
* [HIVE-2630] - TestHiveServer doesn't produce a JUnit report file
* [HIVE-2634] - revert HIVE-2566
* [HIVE-2643] - Recent patch prevents Hadoop confs from loading in 0.20.204
** Improvement
* [HIVE-1078] - CREATE VIEW followup: CREATE OR REPLACE
* [HIVE-1360] - Allow UDFs to access constant parameter values at compile time
* [HIVE-1567] - increase hive.mapjoin.maxsize to 10 million
* [HIVE-1644] - use filter pushdown for automatically accessing indexes
* [HIVE-1690] - HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack
* [HIVE-1731] - Improve miscellaneous error messages
* [HIVE-1740] - support NOT IN and NOT LIKE syntax
* [HIVE-1741] - HiveInputFormat.readFields should print the cause when there's an exception
* [HIVE-1784] - Ctrl+c should kill currently running query, but not exit the CLI
* [HIVE-1815] - The class HiveResultSet should implement batch fetching.
* [HIVE-1833] - Task-cleanup task should be disabled
* [HIVE-1887] - HIVE-78 Followup: group partitions by tables when doing authorizations and there is no partition level privilege
* [HIVE-1916] - Change Default Alias For Aggregated Columns (_c1)
* [HIVE-1966] - mapjoin operator should not load hashtable for each new inputfile if the hashtable to be loaded is already there.
* [HIVE-1989] - recognize transitivity of predicates on join keys
* [HIVE-1991] - Hive Shell to output number of mappers and number of reducers
* [HIVE-1994] - Support new annotation @UDFType(stateful = true)
* [HIVE-2000] - adding comments to Hive Stats JDBC queries
* [HIVE-2002] - Expand exceptions caught for metastore operations
* [HIVE-2018] - avoid loading Hive aux jars in CLI remote mode
* [HIVE-2020] - Create a separate namespace for Hive variables
* [HIVE-2028] - Performance instruments for client side execution
* [HIVE-2030] - isEmptyPath() to use ContentSummary cache
* [HIVE-2035] - Use block-level merge for RCFile if merging intermediate results are needed
* [HIVE-2036] - Update bitmap indexes for automatic usage
* [HIVE-2038] - Metastore listener
* [HIVE-2039] - remove hadoop version check from hive cli shell script
* [HIVE-2051] - getInputSummary() to call FileSystem.getContentSummary() in parallel
* [HIVE-2052] - PostHook and PreHook API to add flag to indicate it is pre or post hook plus cache for content summary
* [HIVE-2056] - Generate single MR job for multi groupby query if hive.multigroupby.singlemr is enabled.
* [HIVE-2068] - Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
* [HIVE-2070] - SHOW GRANT grantTime field should be a human-readable timestamp
* [HIVE-2082] - Reduce memory consumption in preparing MapReduce job
* [HIVE-2106] - Increase the number of operator counters
* [HIVE-2109] - No lock for some non-mapred tasks; config variable hive.lock.mapred.only.operation added
* [HIVE-2119] - Optimizer on partition field
* [HIVE-2126] - Hive's symlink text input format should be able to work with CombineHiveInputFormat
* [HIVE-2127] - Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait
* [HIVE-2128] - Automatic Indexing with multiple tables
* [HIVE-2133] - DROP TABLE IF EXISTS should not fail if a view of that name exists
* [HIVE-2134] - Remove System.exit
* [HIVE-2139] - Enables HiveServer to accept -hiveconf option
* [HIVE-2144] - reduce workload generated by JDBCStatsPublisher
* [HIVE-2147] - Add api to send / receive message to metastore
* [HIVE-2148] - Add interface classification in Hive.
* [HIVE-2154] - add exception handling to hive's record reader
* [HIVE-2155] - Improve error messages emitted during semantic analysis
* [HIVE-2156] - Improve error messages emitted during task execution
* [HIVE-2171] - Allow custom serdes to set field comments
* [HIVE-2191] - Allow optional [inner] on equi-join.
* [HIVE-2194] - Add actions for alter table and alter partition events for metastore event listeners
* [HIVE-2201] - reduce name node calls in hive by creating temporary directories
* [HIVE-2208] - create a new API in Warehouse where the root directory is specified
* [HIVE-2209] - Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object
* [HIVE-2210] - ALTER VIEW RENAME
* [HIVE-2213] - Optimize partial specification metastore functions
* [HIVE-2217] - add Query text for debugging in lock data
* [HIVE-2218] - speedup addInputPaths
* [HIVE-2219] - Make "alter table drop partition" more efficient
* [HIVE-2221] - Provide metastore upgrade script for HIVE-2215
* [HIVE-2224] - Ability to add partitions atomically
* [HIVE-2226] - Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.
* [HIVE-2233] - Show current database in hive prompt
* [HIVE-2245] - Make CombineHiveInputFormat the default hive.input.format
* [HIVE-2246] - Dedupe tables' column schemas from partitions in the metastore db
* [HIVE-2252] - Display a sample of partitions created when Fatal Error occurred due to too many partitions created
* [HIVE-2256] - Better error message in CLI on invalid column name
* [HIVE-2282] - Local mode needs to work well with block sampling
* [HIVE-2284] - bucketized map join should allow join key as a superset of bucketized columns
* [HIVE-2290] - Improve error messages for DESCRIBE command
* [HIVE-2299] - Optimize Hive query startup time for multiple partitions
* [HIVE-2346] - Add hooks to run when execution fails.
* [HIVE-2347] - Make Hadoop Job ID available after task finishes executing
* [HIVE-2350] - Improve RCFile Read Speed
* [HIVE-2354] - Support automatic rebuilding of indexes when they go stale
* [HIVE-2364] - Make performance logging configurable.
* [HIVE-2370] - Improve RCFileCat performance significantly
* [HIVE-2378] - Warn user that precision is lost when bigint is implicitly cast to double.
* [HIVE-2385] - Local Mode can be more aggressive if LIMIT optimization is on
* [HIVE-2396] - RCFileReader Buffer Reuse
* [HIVE-2404] - Allow RCFile Reader to tolerate corruptions
* [HIVE-2440] - make hive mapper initialize faster when having tons of input files
* [HIVE-2445] - The PerfLogger should log the full name of hooks, not just the simple name.
* [HIVE-2446] - Introduction of client statistics publishers possibility
* [HIVE-2447] - Add job ID to MapRedStats
* [HIVE-2448] - Upgrade JavaEWAH to 0.3
* [HIVE-2450] - move lock retry logic into ZooKeeperHiveLockManager
* [HIVE-2453] - Need a way to categorize queries in hooks for improved logging
* [HIVE-2456] - JDBCStatsAggregator DELETE STATEMENT should escape _ and %
* [HIVE-2457] - Files in Avro-backed Hive tables do not have a ".avro" extension
* [HIVE-2458] - Group-by query optimization Followup: add flag in conf/hive-default.xml
* [HIVE-2461] - Add method to PerfLogger to perform cleanup/final steps.
* [HIVE-2462] - make INNER a non-reserved keyword
* [HIVE-2467] - HA Support for Metastore Server
* [HIVE-2470] - Improve support for Constant Object Inspectors
* [HIVE-2479] - Log more Hadoop task counter values in the MapRedStats class.
* [HIVE-2484] - Enable ALTER TABLE SET SERDE to work on partition level
* [HIVE-2505] - Update junit jar in testlibs
* [HIVE-2506] - Get ConstantObjectInspectors working in UDAFs
* [HIVE-2515] - Make Constant OIs work with UDTFs.
* [HIVE-2523] - add a new builtins subproject
* [HIVE-2527] - Consecutive string literals should be combined into a single string literal.
* [HIVE-2535] - Use sorted nature of compact indexes
* [HIVE-2545] - Make metastore log4j configuration file configurable again.
* [HIVE-2546] - add explain formatted
* [HIVE-2553] - Use hashing instead of list traversal for IN operator for primitive types
* [HIVE-2566] - reduce the number of map-reduce jobs for union all
* [HIVE-2569] - Too much debugging info on console if a job failed
* [HIVE-2593] - avoid referencing /tmp in tests
* [HIVE-2605] - Setting no_drop on a table should cascade to child partitions
* [HIVE-2607] - Add caching to json_tuple
* [HIVE-2619] - Add hook to run in metastore's endFunction which can collect more fb303 counters
** Task
* [HIVE-1095] - Hive in Maven
* [HIVE-2076] - Provide Metastore upgrade scripts and default schemas for PostgreSQL
* [HIVE-2161] - Remaining patch for HIVE-2148
* [HIVE-2239] - Use the version of commons-codec from Hadoop
* [HIVE-2376] - Upgrade Hive's Thrift dependency to version 0.7.0
* [HIVE-2441] - Metastore upgrade scripts for schema change introduced in HIVE-2215
* [HIVE-2442] - Metastore upgrade script and schema DDL for Hive 0.8.0
* [HIVE-2468] - Make Hive compile against Hadoop 0.23
* [HIVE-2491] - Add pdk, hbase-handler etc as source dir in eclipse
* [HIVE-2521] - Update wiki links in README file
* [HIVE-2552] - Omit incomplete Postgres upgrade scripts from release tarball
** Sub-task
* [HIVE-559] - Support JDBC ResultSetMetadata
* [HIVE-1983] - Bundle Log4j configuration files in Hive JARs
* [HIVE-2049] - Push down partition pruning to JDO filtering for a subset of partition predicates
* [HIVE-2050] - batch processing partition pruning process
* [HIVE-2114] - Backward incompatibility introduced from HIVE-2082 in MetaStoreUtils.getPartSchemaFromTableSchema()
* [HIVE-2118] - Partition Pruning bug in the case of hive.mapred.mode=nonstrict
* [HIVE-2140] - Return correct Major / Minor version numbers for Hive Driver
* [HIVE-2158] - add the HivePreparedStatement implementation based on current HIVE supported data-type
* [HIVE-2434] - add a TM to Hive logo image
* [HIVE-2435] - Update project naming and description in Hive wiki
* [HIVE-2436] - Update project naming and description in Hive website
* [HIVE-2437] - update project website navigation links
* [HIVE-2438] - add trademark attributions to Hive homepage
* [HIVE-2476] - Update project description and wiki link in ivy.xml files
** Test
* [HIVE-2426] - Test that views with joins work properly
* [HIVE-2493] - TestLazySimpleSerde fails randomly
* [HIVE-2513] - create a test to verify that partition pruning works for partitioned views with a union
** Wish
* [HIVE-243] - ^C breaks out of running query, but not whole CLI
Release Notes - Hive - Version 0.7.0
** New Feature
* [HIVE-78] - Authorization infrastructure for Hive
* [HIVE-417] - Implement Indexing in Hive
* [HIVE-471] - Add reflect() UDF for reflective invocation of Java methods
* [HIVE-537] - Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
* [HIVE-842] - Authentication Infrastructure for Hive
* [HIVE-1096] - Hive Variables
* [HIVE-1293] - Concurrency Model for Hive
* [HIVE-1304] - add row_sequence UDF
* [HIVE-1405] - hive command line option -i to run an init file before other SQL commands
* [HIVE-1408] - add option to let hive automatically run in local mode based on tunable heuristics
* [HIVE-1413] - bring a table/partition offline
* [HIVE-1438] - sentences() UDF for natural language tokenization
* [HIVE-1481] - ngrams() UDAF for estimating top-k n-gram frequencies
* [HIVE-1514] - Be able to modify a partition's fileformat and file location information.
* [HIVE-1518] - context_ngrams() UDAF for estimating top-k contextual n-grams
* [HIVE-1528] - Add json_tuple() UDTF function
* [HIVE-1529] - Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
* [HIVE-1549] - Add ANSI SQL correlation aggregate function CORR(X,Y).
* [HIVE-1609] - Support partition filtering in metastore
* [HIVE-1624] - Patch to allow scripts in S3 location
* [HIVE-1636] - Implement "SHOW TABLES {FROM | IN} db_name"
* [HIVE-1659] - parse_url_tuple: a UDTF version of parse_url
* [HIVE-1661] - Default values for parameters
* [HIVE-1779] - Implement GenericUDF str_to_map
* [HIVE-1790] - Patch to support HAVING clause in Hive
* [HIVE-1792] - track the joins which are being converted to map-join automatically
* [HIVE-1818] - Call frequency and duration metrics for HiveMetaStore via jmx
* [HIVE-1819] - maintain lastAccessTime in the metastore
* [HIVE-1820] - Make Hive database data center aware
* [HIVE-1827] - Add a new local mode flag in Task.
* [HIVE-1835] - Better auto-complete for Hive
* [HIVE-1840] - Support ALTER DATABASE to change database properties
* [HIVE-1856] - Implement DROP TABLE/VIEW ... IF EXISTS
* [HIVE-1858] - Implement DROP {PARTITION, INDEX, TEMPORARY FUNCTION} IF EXISTS
* [HIVE-1881] - Make the MetaStore filesystem interface pluggable via the hive.metastore.fs.handler.class configuration property
* [HIVE-1889] - add an option (hive.index.compact.file.ignore.hdfs) to ignore HDFS location stored in index files.
* [HIVE-1971] - Verbose/echo mode for the Hive CLI
** Improvement
* [HIVE-138] - Provide option to export a HEADER
* [HIVE-474] - Support for distinct selection on two or more columns
* [HIVE-558] - describe extended table/partition output is cryptic
* [HIVE-1126] - Missing some Jdbc functionality like getTables getColumns and HiveResultSet.get* methods based on column name.
* [HIVE-1211] - Tapping logs from child processes
* [HIVE-1226] - support filter pushdown against non-native tables
* [HIVE-1229] - replace dependencies on HBase deprecated API
* [HIVE-1235] - use Ivy for fetching HBase dependencies
* [HIVE-1264] - Make Hive work with Hadoop security
* [HIVE-1378] - Return value for map, array, and struct needs to return a string
* [HIVE-1394] - do not update transient_lastDdlTime if the partition is modified by a housekeeping operation
* [HIVE-1414] - automatically invoke .hiverc init script
* [HIVE-1415] - add CLI command for executing a SQL script
* [HIVE-1430] - serializing/deserializing the query plan is useless and expensive
* [HIVE-1441] - Extend ivy offline mode to cover metastore downloads
* [HIVE-1443] - Add support to turn off bucketing with ALTER TABLE
* [HIVE-1447] - Speed up reflection method calls in GenericUDFBridge and GenericUDAFBridge
* [HIVE-1456] - potential NullPointerException
* [HIVE-1463] - hive output file names are unnecessarily large
* [HIVE-1469] - replace isArray() calls and remove LOG.isInfoEnabled() in Operator.forward()
* [HIVE-1495] - supply correct information to hooks and lineage for index rebuild
* [HIVE-1497] - support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES
* [HIVE-1498] - support IDXPROPERTIES on CREATE INDEX
* [HIVE-1512] - Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
* [HIVE-1513] - hive starter scripts should load admin/user supplied script for configurability
* [HIVE-1517] - ability to select across a database
* [HIVE-1533] - Use ZooKeeper from maven
* [HIVE-1536] - Add support for JDBC PreparedStatements
* [HIVE-1546] - Ability to plug custom Semantic Analyzers for Hive Grammar
* [HIVE-1581] - CompactIndexInputFormat should create split only for files in the index output file.
* [HIVE-1605] - regression and improvements in handling NULLs in joins
* [HIVE-1611] - Add alternative search-provider to Hive site
* [HIVE-1616] - Add ProtocolBuffersStructObjectInspector
* [HIVE-1617] - ScriptOperator's AutoProgressor can lead to an infinite loop
* [HIVE-1622] - Use CombineHiveInputFormat for the merge job if hive.merge.mapredfiles=true
* [HIVE-1638] - convert commonly used udfs to generic udfs
* [HIVE-1641] - add map joined table to distributed cache
* [HIVE-1642] - Convert join queries to map-join based on size of table/row
* [HIVE-1645] - ability to specify parent directory for zookeeper lock manager
* [HIVE-1655] - Adding consistency check at jobClose() when committing dynamic partitions
* [HIVE-1660] - Change get_partitions_ps to pass partition filter to database
* [HIVE-1692] - FetchOperator.getInputFormatFromCache hides causal exception
* [HIVE-1701] - drop support for pre-0.20 Hadoop versions
* [HIVE-1704] - remove Hadoop 0.17 specific test reference logs
* [HIVE-1738] - Optimize Key Comparison in GroupByOperator
* [HIVE-1743] - Group-by to determine equals of Keys in reverse order
* [HIVE-1746] - Support for using ALTER to set IDXPROPERTIES
* [HIVE-1749] - ExecMapper and ExecReducer: reduce function calls to l4j.isInfoEnabled()
* [HIVE-1750] - Remove Partition Filtering Conditions when Possible
* [HIVE-1751] - Optimize ColumnarStructObjectInspector.getStructFieldData()
* [HIVE-1754] - Remove JDBM component from Map Join
* [HIVE-1757] - test cleanup for Hive-1641
* [HIVE-1758] - optimize group by hash map memory
* [HIVE-1761] - Support show locks for a particular table
* [HIVE-1765] - Add queryid while locking
* [HIVE-1768] - Update transient_lastDdlTime only if not specified
* [HIVE-1782] - add more debug information for hive locking
* [HIVE-1783] - CommonJoinOperator optimize the case of 1:1 join
* [HIVE-1785] - change Pre/Post Query Hooks to take in 1 parameter: HookContext
* [HIVE-1786] - Improve documentation for str_to_map() UDF
* [HIVE-1787] - optimize the code path when there are no outer joins
* [HIVE-1796] - dumps time at which lock was taken along with the queryid in show locks <T> extended
* [HIVE-1797] - Compress the hashtable dump file before putting it into the distributed cache
* [HIVE-1798] - Clear empty files in Hive
* [HIVE-1801] - HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
* [HIVE-1811] - Show the time the local task takes
* [HIVE-1824] - create a new ZooKeeper instance when retrying lock, and more info for debug
* [HIVE-1831] - Add an option to run task to check map-join possibility in non-local mode
* [HIVE-1834] - more debugging for locking
* [HIVE-1843] - add an option in dynamic partition inserts to throw an error if 0 partitions are created
* [HIVE-1852] - Reduce unnecessary DFSClient.rename() calls
* [HIVE-1855] - Include Process ID in the log4j log file name
* [HIVE-1865] - redo zookeeper hive lock manager
* [HIVE-1899] - add a factory method for creating a synchronized wrapper for IMetaStoreClient
* [HIVE-1900] - a mapper should be able to span multiple partitions
* [HIVE-1907] - Store jobid in ExecDriver
* [HIVE-1910] - Provide config parameters to control cache object pinning
* [HIVE-1923] - Allow any type of stats publisher and aggregator in addition to HBase and JDBC
* [HIVE-1929] - Find a way to disable owner grants
* [HIVE-1931] - Improve the implementation of the METASTORE_CACHE_PINOBJTYPES config
* [HIVE-1948] - Have audit logging in the Metastore
* [HIVE-1956] - Provide DFS initialization script for Hive
* [HIVE-1961] - Make Stats gathering more flexible with timeout and atomicity
* [HIVE-1962] - make a libthrift.jar and libfb303.jar in dist package for backward compatibility
* [HIVE-1970] - Modify build to run all tests regardless of subproject failures
* [HIVE-1978] - Hive SymlinkTextInputFormat does not estimate input size correctly
** Bug
* [HIVE-307] - "LOAD DATA LOCAL INPATH" fails when the table already contains a file of the same name
* [HIVE-741] - NULL is not handled correctly in join
* [HIVE-1203] - HiveInputFormat.getInputFormatFromCache "swallows" cause exception when throwing IOException
* [HIVE-1305] - add progress in join and groupby
* [HIVE-1376] - Simple UDAFs with more than 1 parameter crash on empty row query
* [HIVE-1385] - UDF field() doesn't work
* [HIVE-1416] - Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode
* [HIVE-1422] - skip counter update when RunningJob.getCounters() returns null
* [HIVE-1440] - FetchOperator(mapjoin) does not work with RCFile
* [HIVE-1448] - bug in 'set fileformat'
* [HIVE-1453] - Make Eclipse launch templates auto-adjust to Hive version number changes
* [HIVE-1462] - Reporting progress in FileSinkOperator works in multiple directory case
* [HIVE-1465] - hive-site.xml ${user.name} not replaced for local-file derby metastore connection URL
* [HIVE-1470] - percentile_approx() fails with more than 1 reducer
* [HIVE-1471] - CTAS should unescape the column name in the select-clause.
* [HIVE-1473] - plan file should have a high replication factor
* [HIVE-1475] - .gitignore files being placed in test warehouse directories causing build failure
* [HIVE-1489] - TestCliDriver -Doverwrite=true does not put the file in the correct directory
* [HIVE-1491] - fix or disable loadpart_err.q
* [HIVE-1494] - Index followup: remove sort by clause and fix a bug in collect_set udaf
* [HIVE-1501] - when generating reentrant INSERT for index rebuild, quote identifiers using backticks
* [HIVE-1508] - Add cleanup method to HiveHistory class
* [HIVE-1509] - Monitor the working set of the number of files
* [HIVE-1510] - HiveCombineInputFormat should not use prefix matching to find the partitionDesc for a given path
* [HIVE-1520] - hive.mapred.local.mem should only be used in case of local mode job submissions
* [HIVE-1523] - ql tests no longer work in miniMR mode
* [HIVE-1532] - Replace globStatus with listStatus inside Hive.java's replaceFiles.
* [HIVE-1534] - Join filters do not work correctly with outer joins
* [HIVE-1535] - alter partition should throw exception if the specified partition does not exist.
* [HIVE-1547] - Unarchiving operation throws NPE
* [HIVE-1548] - populate inputs and outputs for all statements
* [HIVE-1556] - Fix TestContribCliDriver test
* [HIVE-1561] - smb_mapjoin_8.q returns different results in miniMr mode
* [HIVE-1563] - HBase tests broken
* [HIVE-1564] - bucketizedhiveinputformat.q fails in minimr mode
* [HIVE-1570] - referencing an added file by its name in a transform script does not work in hive local mode
* [HIVE-1578] - Add conf. property hive.exec.show.job.failure.debug.info to enable/disable displaying link to the task with most failures
* [HIVE-1580] - cleanup ExecDriver.progress
* [HIVE-1583] - Hive should not override Hadoop specific system properties
* [HIVE-1584] - wrong log files in contrib client positive
* [HIVE-1589] - Add HBase/ZK JARs to Eclipse classpath
* [HIVE-1593] - udtf_explode.q is an empty file
* [HIVE-1598] - use SequenceFile rather than TextFile format for hive query results
* [HIVE-1600] - need to sort hook input/output lists for test result determinism
* [HIVE-1601] - Hadoop 0.17 ant test broken by HIVE-1523
* [HIVE-1606] - For a null value in a string column, JDBC driver returns the string "NULL"
* [HIVE-1607] - Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
* [HIVE-1614] - UDTF json_tuple should return null row when input is not a valid JSON string
* [HIVE-1628] - Fix Base64TextInputFormat to be compatible with commons codec 1.4
* [HIVE-1629] - Patch to fix hashCode method in DoubleWritable class
* [HIVE-1630] - bug in NO_DROP
* [HIVE-1633] - CombineHiveInputFormat fails with "cannot find dir for emptyFile"
* [HIVE-1639] - ExecDriver.addInputPaths() error if partition name contains a comma
* [HIVE-1647] - Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )
* [HIVE-1650] - TestContribNegativeCliDriver fails
* [HIVE-1656] - All TestJdbcDriver test cases fail in Eclipse unless a property is added in run config
* [HIVE-1657] - join results are displayed wrongly for some complex joins using select *
* [HIVE-1658] - Fix describe * [extended] column formatting
* [HIVE-1663] - ql/src/java/org/apache/hadoop/hive/ql/parse/SamplePruner.java is empty
* [HIVE-1664] - Eclipse build broken
* [HIVE-1670] - MapJoin throws EOFException when the mapjoined table has 0 columns selected
* [HIVE-1671] - multithreading on Context.pathToCS
* [HIVE-1673] - Create table bug causes the row format property lost when serde is specified.
* [HIVE-1674] - count(*) returns wrong result when a mapper returns empty results
* [HIVE-1678] - NPE in MapJoin
* [HIVE-1688] - In the MapJoinOperator, the code uses tag as alias, which is not always true
* [HIVE-1691] - ANALYZE TABLE command should check columns in partition spec
* [HIVE-1699] - incorrect partition pruning ANALYZE TABLE
* [HIVE-1707] - bug when different partitions are present in different dfs
* [HIVE-1711] - CREATE TABLE LIKE should not set stats in the new table
* [HIVE-1712] - Migrating metadata from derby to mysql throws NullPointerException
* [HIVE-1713] - duplicated MapRedTask in Multi-table inserts mixed with FileSinkOperator and ReduceSinkOperator
* [HIVE-1716] - make TestHBaseCliDriver use dynamic ports to avoid conflicts with already-running services
* [HIVE-1717] - ant clean should delete stats database
* [HIVE-1720] - hbase_stats.q is failing
* [HIVE-1737] - Two Bugs for Estimating Row Sizes in GroupByOperator
* [HIVE-1742] - Fix Eclipse templates (and use Ivy metadata to generate Eclipse library dependencies)
* [HIVE-1748] - Statistics broken for tables with size in excess of Integer.MAX_VALUE
* [HIVE-1753] - HIVE-1633 hit for Stage2 jobs with CombineHiveInputFormat
* [HIVE-1756] - failures in fatal.q in TestNegativeCliDriver
* [HIVE-1759] - Many important broken links on Hive web page
* [HIVE-1760] - Mismatched open/commit transaction calls in case of connection retry
* [HIVE-1767] - Merge files does not work with dynamic partition
* [HIVE-1769] - pcr.q output is non-deterministic
* [HIVE-1771] - ROUND(infinity) chokes
* [HIVE-1775] - Assertion on inputObjInspectors.length in GroupBy operator
* [HIVE-1776] - parallel execution and auto-local mode combine to place plan file in wrong file system
* [HIVE-1777] - Outdated comments for GenericUDTF.close()
* [HIVE-1780] - Typo in hive-default.xml
* [HIVE-1781] - outputs not populated for dynamic partitions at compile time
* [HIVE-1794] - GenericUDFOr and GenericUDFAnd cannot receive boolean typed object
* [HIVE-1795] - outputs not correctly populated for alter table
* [HIVE-1804] - Mapjoin will fail if there are no files associating with the join tables
* [HIVE-1806] - The merge criteria on dynamic partitions should be per partition
* [HIVE-1807] - No Element found exception in BucketMapJoinOptimizer
* [HIVE-1808] - bug in auto_join25.q
* [HIVE-1809] - Hive comparison operators are broken for NaN values
* [HIVE-1812] - spurious rmr failure messages when inserting with dynamic partitioning
* [HIVE-1828] - show locks should not use getTable()/getPartition
* [HIVE-1829] - Fix intermittent failures in TestRemoteMetaStore
* [HIVE-1830] - mappers in group followed by joins may die OOM
* [HIVE-1844] - Hanging hive client caused by TaskRunner's OutOfMemoryError
* [HIVE-1845] - Some attributes in the Eclipse template file are deprecated
* [HIVE-1846] - change hive assumption that local mode mappers/reducers always run in same jvm
* [HIVE-1848] - bug in MAPJOIN
* [HIVE-1849] - add more logging to partition pruning
* [HIVE-1853] - downgrade JDO version
* [HIVE-1854] - Temporarily disable metastore tests for listPartitionsByFilter()
* [HIVE-1857] - mixed case tablename on lefthand side of LATERAL VIEW results in query failing with confusing error message
* [HIVE-1860] - Hive's smallint datatype is not supported by the Hive JDBC driver
* [HIVE-1861] - Hive's float datatype is not supported by the Hive JDBC driver
* [HIVE-1862] - Revive partition filtering in the Hive MetaStore
* [HIVE-1863] - Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
* [HIVE-1864] - test load_overwrite.q fails
* [HIVE-1867] - Add mechanism for disabling tests with intermittent failures
* [HIVE-1870] - TestRemoteHiveMetaStore.java accidentally deleted during commit of HIVE-1845
* [HIVE-1871] - bug introduced by HIVE-1806
* [HIVE-1873] - Fix 'tar' build target broken in HIVE-1526
* [HIVE-1874] - fix HBase filter pushdown broken by HIVE-1638
* [HIVE-1878] - Set the version of Hive trunk to '0.7.0-SNAPSHOT' to avoid confusing it with a release
* [HIVE-1896] - HBase and Contrib JAR names are missing version numbers
* [HIVE-1897] - Alter command execution "when HDFS is down" results in holding stale data in MetaStore
* [HIVE-1902] - create script for the metastore upgrade due to HIVE-78
* [HIVE-1903] - Can't join HBase tables if one's name is the beginning of the other
* [HIVE-1908] - FileHandler leak on partial iteration of the resultset.
* [HIVE-1912] - Double escaping special chars when removing old partitions in rmr
* [HIVE-1913] - use partition level serde properties
* [HIVE-1914] - failures in testhbaseclidriver
* [HIVE-1915] - authorization on database level is broken.
* [HIVE-1917] - CTAS (create-table-as-select) throws exception when showing results
* [HIVE-1927] - Fix TestHadoop20SAuthBridge failure on Hudson
* [HIVE-1928] - GRANT/REVOKE should handle privileges as tokens, not identifiers
* [HIVE-1934] - alter table rename messes the location
* [HIVE-1936] - hive.semantic.analyzer.hook cannot have multiple values
* [HIVE-1939] - Fix test failure in TestContribCliDriver/url_hook.q
* [HIVE-1944] - dynamic partition insert creating different directories for the same partition during merge
* [HIVE-1951] - input16_cc.q is failing in testminimrclidriver
* [HIVE-1952] - fix some outputs and make some tests deterministic
* [HIVE-1964] - add fully deterministic ORDER BY in test union22.q and input40.q
* [HIVE-1969] - TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
* [HIVE-1979] - fix hbase_bulk.m by setting HiveInputFormat
* [HIVE-1981] - TestHadoop20SAuthBridge failed on current trunk
* [HIVE-1995] - Mismatched open/commit transaction calls when using get_partition()
* [HIVE-1998] - Update README.txt and add missing ASF headers
* [HIVE-2007] - Executing queries using Hive Server is not logging to the log file specified in hive-log4j.properties
* [HIVE-2010] - Improve naming and README files for MetaStore upgrade scripts
* [HIVE-2011] - upgrade-0.6.0.mysql.sql script attempts to increase size of PK COLUMNS.TYPE_NAME to 4000
* [HIVE-2059] - Add datanucleus.identifierFactory property to HiveConf to avoid unintentional MetaStore Schema corruption
* [HIVE-2064] - Make call to SecurityUtil.getServerPrincipal unambiguous
** Sub-task
* [HIVE-1361] - table/partition level statistics
* [HIVE-1696] - Add delegation token support to metastore
* [HIVE-1810] - a followup patch for changing the description of hive.exec.pre/post.hooks in conf/hive-default.xml
* [HIVE-1823] - upgrade the database thrift interface to allow parameters key-value pairs
* [HIVE-1836] - Extend the CREATE DATABASE command with DBPROPERTIES
* [HIVE-1842] - Add the local flag to all the map red tasks, if the query is running locally.
** Task
* [HIVE-1526] - Hive should depend on a release version of Thrift
* [HIVE-1817] - Remove Hive dependency on unreleased commons-cli 2.0 Snapshot
* [HIVE-1876] - Update Metastore upgrade scripts to handle schema changes introduced in HIVE-1413
* [HIVE-1882] - Remove CHANGES.txt
* [HIVE-1904] - Create MetaStore schema upgrade scripts for changes made in HIVE-417
* [HIVE-1905] - Provide MetaStore schema upgrade scripts for changes made in HIVE-1823
** Test
* [HIVE-1464] - improve test query performance
* [HIVE-1755] - JDBM diff in test caused by Hive-1641
* [HIVE-1774] - merge_dynamic_part's result is not deterministic
* [HIVE-1942] - change the value of hive.input.format to CombineHiveInputFormat for tests
Release Notes - Hive - Version 0.6.0
** New Feature
* [HIVE-259] - Add PERCENTILE aggregate function
* [HIVE-675] - add database/schema support Hive QL
* [HIVE-705] - Hive HBase Integration (umbrella)
* [HIVE-801] - row-wise IN would be useful
* [HIVE-862] - CommandProcessor should return DriverResponse
* [HIVE-894] - add udaf max_n, min_n to contrib
* [HIVE-917] - Bucketed Map Join
* [HIVE-972] - support views
* [HIVE-1002] - multi-partition inserts
* [HIVE-1027] - Create UDFs for XPath expression evaluation
* [HIVE-1032] - Better Error Messages for Execution Errors
* [HIVE-1087] - Let user script write out binary data into a table
* [HIVE-1121] - CombinedHiveInputFormat for hadoop 19
* [HIVE-1127] - Add UDF to create struct
* [HIVE-1131] - Add column lineage information to the pre execution hooks
* [HIVE-1132] - Add metastore API method to get partition by name
* [HIVE-1134] - bucketing mapjoin where the big table contains more than 1 big partition
* [HIVE-1178] - enforce bucketing for a table
* [HIVE-1179] - Add UDF array_contains
* [HIVE-1193] - ensure sorting properties for a table
* [HIVE-1194] - sorted merge join
* [HIVE-1197] - create a new input format where a mapper spans a file
* [HIVE-1219] - More robust handling of metastore connection failures
* [HIVE-1238] - Get partitions with a partial specification
* [HIVE-1255] - Add mathematical UDFs PI, E, degrees, radians, tan, sign, and atan
* [HIVE-1270] - Thread pool size in Thrift metastore server should be configurable
* [HIVE-1272] - Add SymlinkTextInputFormat to Hive
* [HIVE-1278] - Partition name to values conversion method
* [HIVE-1307] - More generic and efficient merge method
* [HIVE-1332] - Archiving partitions
* [HIVE-1351] - Tool to cat rcfiles
* [HIVE-1397] - histogram() UDAF for a numerical column
* [HIVE-1401] - Web Interface can only browse default
* [HIVE-1410] - Add TCP keepalive option for the metastore server
* [HIVE-1439] - Alter the number of buckets for a table
** Bug
* [HIVE-287] - support count(*) and count distinct on multiple columns
* [HIVE-763] - getSchema returns invalid column names, getThriftSchema does not return old style string schemas
* [HIVE-1011] - GenericUDTFExplode() throws NPE when given nulls
* [HIVE-1022] - desc Table should work
* [HIVE-1029] - typedbytes does not support nulls
* [HIVE-1042] - function in a transform with more than 1 argument fails
* [HIVE-1056] - Predicate push down does not work with UDTF's
* [HIVE-1064] - NPE when operating HiveCLI in distributed mode
* [HIVE-1066] - TestContribCliDriver failure in serde_typedbytes.q, serde_typedbytes2.q, and serde_typedbytes3.q
* [HIVE-1075] - Make it possible for users to recover data when moveTask fails
* [HIVE-1085] - ColumnarSerde should not be the default Serde when user specified a fileformat using 'stored as'.
* [HIVE-1086] - Add "-Doffline=true" option to ant
* [HIVE-1090] - Skew Join does not work in distributed env.
* [HIVE-1092] - Conditional task does not increase finished job counter when filter job out.
* [HIVE-1094] - Disable streaming last table if there is a skew key in previous tables.
* [HIVE-1116] - bug with alter table rename when table has property EXTERNAL=FALSE
* [HIVE-1124] - create view should expand the query text consistently
* [HIVE-1125] - Hive CLI shows 'Ended Job=' at the beginning of the job
* [HIVE-1129] - Assertion in ExecDriver.execute when assertions are enabled in HADOOP_OPTS
* [HIVE-1142] - "datanucleus" typos in conf/hive-default.xml
* [HIVE-1167] - Use TreeMap instead of Property to make explain extended deterministic
* [HIVE-1174] - Job counter error if "hive.merge.mapfiles" equals true
* [HIVE-1176] - 'create if not exists' fails for a table name with 'select' in it
* [HIVE-1184] - Expression Not In Group By Key error is sometimes masked
* [HIVE-1185] - Fix RCFile resource leak when opening a non-RCFile
* [HIVE-1195] - Increase ObjectInspector[] length on demand
* [HIVE-1200] - Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
* [HIVE-1204] - typedbytes: writing to stderr kills the mapper
* [HIVE-1205] - RowContainer should flush out dummy rows when the table desc is null
* [HIVE-1207] - ScriptOperator AutoProgressor does not set the interval
* [HIVE-1242] - CombineHiveInputFormat does not work for compressed text files
* [HIVE-1247] - hints cannot be passed to transform statements
* [HIVE-1252] - Task breaking bug when breaking after a filter operator
* [HIVE-1253] - date_sub() function returns wrong date because of daylight saving time difference
* [HIVE-1257] - joins between HBase tables and other tables (whether HBase or not) are broken
* [HIVE-1258] - set merge files to false when bucketing/sorting is being enforced
* [HIVE-1261] - ql.metadata.Hive#close() should check for null metaStoreClient
* [HIVE-1268] - Cannot start metastore thrift server on a specific port
* [HIVE-1271] - Case sensitiveness of type information specified when using custom reducer causes type mismatch
* [HIVE-1273] - UDF_Percentile NullPointerException
* [HIVE-1274] - bug in sort merge join if the big table does not have any row
* [HIVE-1275] - TestHBaseCliDriver hangs
* [HIVE-1277] - Select query with specific projection(s) fails if the local file system directory for ${hive.user.scratchdir} does not exist.
* [HIVE-1280] - problem in combinehiveinputformat with nested directories
* [HIVE-1281] - Bucketing column names in create table should be case-insensitive
* [HIVE-1286] - error/info message being emitted on standard output
* [HIVE-1290] - sort merge join does not work with bucketizedhiveinputformat
* [HIVE-1291] - Fix UDAFPercentile IndexOutOfBoundsException
* [HIVE-1294] - HIVE_AUX_JARS_PATH interferes with startup of Hive Web Interface
* [HIVE-1298] - unit test symlink_text_input_format.q needs ORDER BY for determinism
* [HIVE-1308] - <boolean> = <boolean> throws NPE
* [HIVE-1311] - bug in use of hadoop supports splittable
* [HIVE-1312] - hive trunk does not compile with hadoop 0.17 any more
* [HIVE-1315] - bucketed sort merge join breaks after dynamic partition insert
* [HIVE-1317] - CombineHiveInputFormat throws exception when partition name contains special characters to URI
* [HIVE-1320] - NPE with lineage in a query of union alls on joins.
* [HIVE-1321] - bugs with temp directories, trailing blank fields in HBase bulk load
* [HIVE-1322] - Cached FileSystem can lead to persistent IOExceptions
* [HIVE-1323] - leading dash in partition name is not handled properly
* [HIVE-1325] - dynamic partition insert should throw an exception if the number of target table columns + dynamic partition columns does not equal the number of select columns
* [HIVE-1326] - RowContainer uses hard-coded '/tmp/' path for temporary files
* [HIVE-1327] - Group by partition column returns wrong results
* [HIVE-1330] - fatal error check omitted for reducer-side operators
* [HIVE-1331] - select * does not work if different partitions contain different formats
* [HIVE-1338] - Fix bin/ext/jar.sh to work with hadoop 0.20 and above
* [HIVE-1341] - Filter Operator Column Pruning should preserve the column order
* [HIVE-1345] - TypedBytesSerDe fails to create table with multiple columns.
* [HIVE-1350] - hive.query.id is not unique
* [HIVE-1352] - rcfilecat should use '\t' to separate columns and print '\r\n' at the end of each row.
* [HIVE-1353] - load_dyn_part*.q tests need ORDER BY for determinism
* [HIVE-1354] - partition level properties honored if it exists
* [HIVE-1364] - Increase the maximum length of various metastore fields, and remove TYPE_NAME from COLUMNS primary key
* [HIVE-1365] - Bug in SMBJoinOperator which may cause a final part of the results in some cases.
* [HIVE-1366] - inputFileFormat error if the merge job takes a different input file format than the default output file format
* [HIVE-1371] - remove blank in rcfilecat
* [HIVE-1373] - Missing connection pool plugin in Eclipse classpath
* [HIVE-1377] - getPartitionDescFromPath() in CombineHiveInputFormat should handle matching by path
* [HIVE-1388] - combinehiveinputformat does not work if files are of different types
* [HIVE-1403] - Reporting progress to JT during closing files in FileSinkOperator
* [HIVE-1407] - Add hadoop-*-tools.jar to Eclipse classpath
* [HIVE-1409] - File format information is retrieved from first partition
* [HIVE-1411] - DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH
* [HIVE-1412] - CombineHiveInputFormat bug on tablesample
* [HIVE-1417] - Archived partitions throw error with queries calling getContentSummary
* [HIVE-1418] - column pruning not working with lateral view
* [HIVE-1420] - problem with sequence and rcfiles are mixed for null partitions
* [HIVE-1421] - problem with sequence and rcfiles are mixed for null partitions
* [HIVE-1425] - hive.task.progress should be added to conf/hive-default.xml
* [HIVE-1428] - ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
* [HIVE-1435] - Upgraded naming scheme causes JDO exceptions
* [HIVE-1448] - bug in 'set fileformat'
* [HIVE-1454] - insert overwrite and CTAS fail in hive local mode
* [HIVE-1455] - lateral view does not work with column pruning
* [HIVE-1492] - FileSinkOperator should remove duplicated files from the same task based on file sizes
* [HIVE-1524] - parallel execution failed if mapred.job.name is set
* [HIVE-1594] - Typo of hive.merge.size.smallfiles.avgsize prevents change of value
* [HIVE-1613] - hive --service jar looks for hadoop version but was not defined
* [HIVE-1615] - Web Interface JSP needs Refactoring for removed meta store methods
* [HIVE-1681] - ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
* [HIVE-1697] - Migration scripts should increase size of PARAM_VALUE in PARTITION_PARAMS
** Improvement
* [HIVE-543] - provide option to run hive in local mode
* [HIVE-964] - handle skewed keys for a join in a separate job
* [HIVE-990] - Incorporate CheckStyle into Hive's build.xml
* [HIVE-1047] - Merge tasks in GenMRUnion1
* [HIVE-1068] - CREATE VIEW followup: add a "table type" enum attribute in metastore's MTable, and also null out irrelevant attributes for MTable instances which describe views
* [HIVE-1069] - CREATE VIEW followup: find and document current expected version of thrift, and regenerate code to match
* [HIVE-1093] - Add a "skew join map join size" variable to control the input size of skew join's following map join job.
* [HIVE-1102] - make number of concurrent tasks configurable
* [HIVE-1108] - QueryPlan to be independent from BaseSemanticAnalyzer
* [HIVE-1109] - Structured temporary directories
* [HIVE-1110] - add counters to show that skew join triggered
* [HIVE-1117] - Make QueryPlan serializable
* [HIVE-1118] - Add hive.merge.size.per.task to HiveConf
* [HIVE-1119] - Make all Tasks and Works serializable
* [HIVE-1120] - In ivy offline mode, don't delete downloaded jars
* [HIVE-1122] - Make ql/metadata/Table and Partition serializable
* [HIVE-1128] - Let max/min handle complex types like struct
* [HIVE-1136] - add type-checking setters for HiveConf class to match existing getters
* [HIVE-1144] - CREATE VIEW followup: support ALTER TABLE SET TBLPROPERTIES on views
* [HIVE-1150] - Add comment to explain why we check for dir first in add_partitions().
* [HIVE-1152] - Add metastore API method to drop partition / append partition by name
* [HIVE-1164] - drop_partition_by_name() should use drop_partition_common()
* [HIVE-1190] - Configure build to download Hadoop tarballs from Facebook mirror instead of Apache
* [HIVE-1198] - When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
* [HIVE-1212] - Explicitly say "Hive Internal Error" to ease debugging
* [HIVE-1216] - Show the row with error in mapper/reducer
* [HIVE-1220] - accept TBLPROPERTIES on CREATE TABLE/VIEW
* [HIVE-1228] - allow HBase key column to be anywhere in Hive table
* [HIVE-1241] - add pre-drops in bucketmapjoin*.q
* [HIVE-1244] - add backward-compatibility constructor to HiveMetaStoreClient
* [HIVE-1246] - mapjoin followed by another mapjoin should be performed in a single query
* [HIVE-1260] - from_unixtime should implement an overloaded function to accept only bigint type
* [HIVE-1276] - optimize bucketing
* [HIVE-1295] - facilitate HBase bulk loads from Hive
* [HIVE-1296] - CLI set and set -v commands should dump properties in alphabetical order
* [HIVE-1297] - error message in Hive.checkPaths dumps Java array address instead of path string
* [HIVE-1300] - support: alter table touch partition
* [HIVE-1306] - cleanup the jobscratchdir
* [HIVE-1316] - Increase the memory limit for CLI client
* [HIVE-1328] - make mapred.input.dir.recursive work for select *
* [HIVE-1329] - for ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'), change TBL_TYPE attribute from MANAGED_TABLE to EXTERNAL_TABLE
* [HIVE-1335] - DataNucleus should use connection pooling
* [HIVE-1348] - Moving inputFileChanged() from ExecMapper to where it is needed
* [HIVE-1349] - Do not pull counters of non initialized jobs
* [HIVE-1355] - Hive should use NullOutputFormat for hadoop jobs
* [HIVE-1357] - CombineHiveInputSplit should initialize the inputFileFormat once for a single split
* [HIVE-1372] - New algorithm for variance() UDAF
* [HIVE-1383] - allow HBase WAL to be disabled
* [HIVE-1387] - Add PERCENTILE_APPROX which works with double data type
* [HIVE-1531] - Make Hive build work with Ivy versions < 2.1.0
* [HIVE-1543] - set abort in ExecMapper when Hive's record reader got an IOException
* [HIVE-1693] - Make the compile target depend on thrift.home
** Task
* [HIVE-1081] - Automated source code cleanup
* [HIVE-1084] - Cleanup Class names
* [HIVE-1103] - Add .gitignore file
* [HIVE-1104] - Suppress Checkstyle warnings for generated files
* [HIVE-1112] - Replace instances of StringBuffer/Vector with StringBuilder/ArrayList
* [HIVE-1123] - Checkstyle fixes
* [HIVE-1135] - Use Anakia for version controlled documentation
* [HIVE-1137] - build references IVY_HOME incorrectly
* [HIVE-1147] - Update Eclipse project configuration to match Checkstyle
* [HIVE-1163] - Eclipse launchtemplate changes to enable debugging
* [HIVE-1256] - fix Hive logo img tag to avoid stretching
* [HIVE-1427] - Provide metastore schema migration scripts (0.5 -> 0.6)
* [HIVE-1709] - Provide Postgres metastore schema migration scripts (0.5 -> 0.6)
* [HIVE-1725] - Include metastore upgrade scripts in release tarball
* [HIVE-1726] - Update README file for 0.6.0 release
* [HIVE-1729] - Satisfy ASF release management requirements
** Sub-task
* [HIVE-1340] - checking VOID type for NULL in LazyBinarySerde
** Test
* [HIVE-1188] - NPE when running TestJdbcDriver/TestHiveServer
* [HIVE-1236] - test HBase input format plus CombinedHiveInputFormat
* [HIVE-1279] - temporarily disable HBase test execution
* [HIVE-1359] - Unit test should be shim-aware