processing/src/main/java/org/apache/druid/segment/DimensionIndexer.java - druid - Git at Google

 /*
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
  * distributed with this work for additional information
  * regarding copyright ownership.  The ASF licenses this file
  * to you under the Apache License, Version 2.0 (the
  * "License"); you may not use this file except in compliance
  * with the License.  You may obtain a copy of the License at
  *
  *   http://www.apache.org/licenses/LICENSE-2.0
  *
  * Unless required by applicable law or agreed to in writing,
  * software distributed under the License is distributed on an
  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  * KIND, either express or implied.  See the License for the
  * specific language governing permissions and limitations
  * under the License.
  */

 package org.apache.druid.segment;

 import org.apache.druid.collections.bitmap.BitmapFactory;
 import org.apache.druid.collections.bitmap.MutableBitmap;
 import org.apache.druid.query.dimension.DimensionSpec;
 import org.apache.druid.segment.column.ColumnCapabilities;
 import org.apache.druid.segment.data.CloseableIndexed;
 import org.apache.druid.segment.incremental.IncrementalIndex;
 import org.apache.druid.segment.incremental.IncrementalIndexRowHolder;

 import javax.annotation.Nullable;

 /**
  * Processing related interface
  *
  * A DimensionIndexer is a per-dimension stateful object that encapsulates type-specific operations and data structures
  * used during the in-memory ingestion process (i.e., work done by {@link IncrementalIndex}).
  *
  * Ingested row values are passed to a DimensionIndexer, which will update its internal data structures such as
  * a value->ID dictionary as row values are seen.
  *
  * The DimensionIndexer is also responsible for implementing various value lookup operations,
  * such as conversion between an encoded value and its full representation. It maintains knowledge of the
  * mappings between encoded values and actual values.
  *
  *
  * Sorting and Ordering
  * --------------------
  * When encoding is present, there are two relevant orderings for the encoded values.
  *
  * 1.) Ordering based on encoded value's order of ingestion
  * 2.) Ordering based on converted actual value
  *
  * Suppose we have a new String dimension DimA, which sees the values "Hello", "World", and "Apple", in that order.
  * This would correspond to dictionary encodings of "Hello"=0, "World"=1, and "Apple"=2, by the order
  * in which these values were first seen during ingestion.
  *
  * However, some use cases require the encodings to be sorted by their associated actual values.
  * In this example, that ordering would be "Apple"=0, "Hello"=1, "World"=2.
  *
  * The first ordering will be referred to as "Unsorted" in the documentation for this interface, and
  * the second ordering will be referred to as "Sorted".
  *
  * The unsorted ordering is used during ingestion, within the {@link org.apache.druid.segment.incremental.IncrementalIndexRow}
  * keys; the encodings are built as rows are ingested, taking the order in which new dimension values are seen.
  *
  * The generation of a sorted encoding takes place during segment creation when indexes are merged/persisted.
  * The sorted ordering will be used for dimension value arrays in that context and when reading from
  * persisted segments.
  *
  * Note that after calling the methods below that deal with sorted encodings,
  * - getUnsortedEncodedValueFromSorted()
  * - getSortedIndexedValues()
  * - convertUnsortedEncodedKeyComponentToSortedEncodedKeyComponent()
  *
  * calling processRowValsToUnsortedEncodedKeyComponent() afterwards can invalidate previously read sorted encoding values
  * (i.e., new values could be added that are inserted between existing values in the ordering).
  *
  *
  * Thread Safety
  * --------------------
  * Each DimensionIndexer exists within the context of a single IncrementalIndex. Before IndexMerger.persist() is
  * called on an IncrementalIndex, any associated DimensionIndexers should allow multiple threads to add data to the
  * indexer via processRowValsToUnsortedEncodedKeyComponent() and allow multiple threads to read data via methods that only
  * deal with unsorted encodings.
  *
  * As mentioned in the "Sorting and Ordering" section, writes and calls to the sorted encoding
  * methods should not be interleaved: the sorted encoding methods should only be called when it is known that
  * writes to the indexer will no longer occur.
  *
  * The implementations of methods dealing with sorted encodings are free to assume that they will be called
  * by only one thread.
  *
  * The sorted encoding methods are not currently used outside of index merging/persisting (single-threaded context, and
  * no new events will be added to the indexer).
  *
  * If an indexer is passed to a thread that will use the sorted encoding methods, the caller is responsible
  * for ensuring that previous writes to the indexer are visible to the thread that uses the sorted encoding space.
  *
  * For example, in the RealtimePlumber and IndexGeneratorJob, the thread that performs index persist is started
  * by the same thread that handles the row adds on an index, ensuring the adds are visible to the persist thread.
  *
  * @param <EncodedType> class of a single encoded value
  * @param <EncodedKeyComponentType> A row key contains a component for each dimension, this param specifies the
  *                                 class of this dimension's key component. A column type that supports multivalue rows
  *                                 should use an array type (e.g., Strings would use int[]). Column types without
  *                                 multivalue row support should use single objects (e.g., Long, Float).
  * @param <ActualType> class of a single actual value
  *
  */
 public interface DimensionIndexer<
     EncodedType extends Comparable<EncodedType>,
     EncodedKeyComponentType,
     ActualType extends Comparable<ActualType>>
 {

   /**
    * Encodes the given row value(s) of the dimension to be used within a row key.
    * It also updates the internal state of the DimensionIndexer, e.g. the dimLookup.
    * <p>
    * For example, the dictionary-encoded String-type column will return an int[]
    * containing dictionary IDs.
    * <p>
    *
    * @param dimValues             Value(s) of the dimension in a row. This can
    *                              either be a single value or a list of values
    *                              (for multi-valued dimensions)
    * @param reportParseExceptions true if parse exceptions should be reported,
    *                              false otherwise
    * @return Encoded dimension value(s) to be used as a component for the row key.
    * Contains an object of the {@link EncodedKeyComponentType} and the effective
    * size of the key component in bytes.
    */
   EncodedKeyComponent<EncodedKeyComponentType> processRowValsToUnsortedEncodedKeyComponent(
       @Nullable Object dimValues,
       boolean reportParseExceptions
   );

   /**
    * This method will be called while building an {@link IncrementalIndex} whenever a known dimension column (either
    * through an explicit schema on the ingestion spec, or auto-discovered while processing rows) is absent in any row
    * that is processed, to allow an indexer to account for any missing rows if necessary. Useful so that a string
    * {@link DimensionSelector} built on top of an {@link IncrementalIndex} may accurately report
    * {@link DimensionSelector#nameLookupPossibleInAdvance()} by allowing it to track if it has any implicit null valued
    * rows.
    *
    * At index persist/merge time all missing columns for a row will be explicitly replaced with the value appropriate
    * null or default value.
    */
   void setSparseIndexed();

   /**
    * Given an encoded value that was ordered by associated actual value, return the equivalent
    * encoded value ordered by time of ingestion.
    *
    * Using the example in the class description:
    *   getUnsortedEncodedValueFromSorted(2) would return 0
    *
    * @param sortedIntermediateValue value to convert
    * @return converted value
    */
   EncodedType getUnsortedEncodedValueFromSorted(EncodedType sortedIntermediateValue);


   /**
    * Returns an indexed structure of this dimension's sorted actual values.
    * The integer IDs represent the ordering of the sorted values.
    *
    * Using the example in the class description:
    *  "Apple"=0,
    *  "Hello"=1,
    *  "World"=2
    *
    * @return Sorted index of actual values
    */
   CloseableIndexed<ActualType> getSortedIndexedValues();


   /**
    * Get the minimum dimension value seen by this indexer.
    *
    * NOTE:
    * On an in-memory segment (IncrementalIndex), we can determine min/max values by looking at the stream of
    * row values seen in calls to processSingleRowValToIndexKey().
    *
    * However, on a disk-backed segment (QueryableIndex), the numeric dimensions do not currently have any
    * supporting index structures that can be used to efficiently determine min/max values.
    *
    * When numeric dimension support is added, the segment format should be changed to store min/max values, to
    * avoid performing a full-column scan to determine these values for numeric dims.
    *
    * @return min value
    */
   ActualType getMinValue();


   /**
    * Get the maximum dimension value seen by this indexer.
    *
    * @return max value
    */
   ActualType getMaxValue();


   /**
    * Get the cardinality of this dimension's values.
    *
    * @return value cardinality
    */
   int getCardinality();


   /**
    * Return an object used to read values from this indexer's column as Strings.
    *
    * @param spec Specifies the output name of a dimension and any extraction functions to be applied.
    * @param currEntry Provides access to the current Row object in the Cursor
    * @param desc Descriptor object for this dimension within an IncrementalIndex
    * @return A new object that reads rows from currEntry
    */
   DimensionSelector makeDimensionSelector(
       DimensionSpec spec,
       IncrementalIndexRowHolder currEntry,
       IncrementalIndex.DimensionDesc desc
   );

   /**
    * Return an object used to read values from this indexer's column.
    *
    * @param currEntry Provides access to the current Row object in the Cursor
    * @param desc Descriptor object for this dimension within an IncrementalIndex
    * @return A new object that reads rows from currEntry
    */
   ColumnValueSelector<?> makeColumnValueSelector(
       IncrementalIndexRowHolder currEntry,
       IncrementalIndex.DimensionDesc desc
   );

   ColumnCapabilities getColumnCapabilities();
   /**
    * Compares the row values for this DimensionIndexer's dimension from a Row key.
    *
    * The dimension value arrays within a Row key always use the "unsorted" ordering for encoded values.
    *
    * The row values are passed to this function as an Object, the implementer should cast them to the type
    * appropriate for this dimension.
    *
    * For example, a dictionary encoded String implementation would cast the Objects as int[] arrays.
    *
    * When comparing, if the two arrays have different lengths, the shorter array should be ordered first.
    *
    * Otherwise, the implementer of this function should iterate through the unsorted encoded values, converting
    * them to their actual type (e.g., performing a dictionary lookup for a dict-encoded String dimension),
    * and comparing the actual values until a difference is found.
    *
    * Refer to StringDimensionIndexer.compareUnsortedEncodedKeyComponents() for a reference implementation.
    *
    * The comparison rules used by this method should match the rules used by
    * {@link DimensionHandler#getEncodedValueSelectorComparator()}, otherwise incorrect ordering/merging of rows
    * can occur during ingestion, causing issues such as imperfect rollup.
    *
    * @param lhs dimension value array from a Row key
    * @param rhs dimension value array from a Row key
    * @return comparison of the two arrays
    */
   int compareUnsortedEncodedKeyComponents(@Nullable EncodedKeyComponentType lhs, @Nullable EncodedKeyComponentType rhs);


   /**
    * Check if two row value arrays from Row keys are equal.
    *
    * @param lhs dimension value array from a Row key
    * @param rhs dimension value array from a Row key
    * @return true if the two arrays are equal
    */
   boolean checkUnsortedEncodedKeyComponentsEqual(
       @Nullable EncodedKeyComponentType lhs,
       @Nullable EncodedKeyComponentType rhs
   );


   /**
    * Given a row value array from a Row key, generate a hashcode.
    * @param key dimension value array from a Row key
    * @return hashcode of the array
    */
   int getUnsortedEncodedKeyComponentHashCode(@Nullable EncodedKeyComponentType key);

   /**
    * Given a row value array from a Row key, as described in the documentation for
    * {@link #compareUnsortedEncodedKeyComponents}, convert the unsorted encoded values to a list of actual values.
    *
    * If the key has one element, this method should return a single Object instead of a list.
    *
    * @param key dimension value array from a Row key
    * @return single value or list containing the actual values corresponding to the encoded values in the input array
    */
   Object convertUnsortedEncodedKeyComponentToActualList(EncodedKeyComponentType key);

   /**
    * Converts dictionary-encoded row values from unspecified (random) encoding order, to sorted encoding. This step
    * is needed to be able to correctly map per-segment encoded values to global values on the next conversion step,
    * {@link DimensionMerger#convertSortedSegmentRowValuesToMergedRowValues}. The latter method requires sorted encoding
    * values on the input, because {@link DimensionMerger#writeMergedValueDictionary} takes sorted lookups as it's input.
    */
   ColumnValueSelector convertUnsortedValuesToSorted(ColumnValueSelector selectorWithUnsortedValues);

   /**
    * Helper function for building bitmap indexes for integer-encoded dimensions.
    *
    * Called by IncrementalIndexAdapter as it iterates through its sequence of rows.
    *
    * Given a row value array from a Row key, with the current row number indicated by "rowNum",
    * set the index for "rowNum" in the bitmap index for each value that appears in the row value array.
    *
    * For example, if key is an int[] array with values [1,3,4] for a dictionary-encoded String dimension,
    * and rowNum is 27, this function would set bit 27 in bitmapIndexes[1], bitmapIndexes[3], and bitmapIndexes[4]
    *
    * See StringDimensionIndexer.fillBitmapsFromUnsortedEncodedKeyComponent() for a reference implementation.
    *
    * If a dimension type does not support bitmap indexes, this function will not be called
    * and can be left unimplemented.
    *
    * @param key dimension value array from a Row key
    * @param rowNum current row number
    * @param bitmapIndexes array of bitmaps, indexed by integer dimension value
    * @param factory bitmap factory
    */
   void fillBitmapsFromUnsortedEncodedKeyComponent(
       EncodedKeyComponentType key,
       int rowNum,
       MutableBitmap[] bitmapIndexes,
       BitmapFactory factory
   );
 }
	/*
	* Licensed to the Apache Software Foundation (ASF) under one
	* or more contributor license agreements. See the NOTICE file
	* distributed with this work for additional information
	* regarding copyright ownership. The ASF licenses this file
	* to you under the Apache License, Version 2.0 (the
	* "License"); you may not use this file except in compliance
	* with the License. You may obtain a copy of the License at
	*
	* http://www.apache.org/licenses/LICENSE-2.0
	*
	* Unless required by applicable law or agreed to in writing,
	* software distributed under the License is distributed on an
	* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	* KIND, either express or implied. See the License for the
	* specific language governing permissions and limitations
	* under the License.
	*/

	package org.apache.druid.segment;

	import org.apache.druid.collections.bitmap.BitmapFactory;
	import org.apache.druid.collections.bitmap.MutableBitmap;
	import org.apache.druid.query.dimension.DimensionSpec;
	import org.apache.druid.segment.column.ColumnCapabilities;
	import org.apache.druid.segment.data.CloseableIndexed;
	import org.apache.druid.segment.incremental.IncrementalIndex;
	import org.apache.druid.segment.incremental.IncrementalIndexRowHolder;

	import javax.annotation.Nullable;

	/**
	* Processing related interface
	*
	* A DimensionIndexer is a per-dimension stateful object that encapsulates type-specific operations and data structures
	* used during the in-memory ingestion process (i.e., work done by {@link IncrementalIndex}).
	*
	* Ingested row values are passed to a DimensionIndexer, which will update its internal data structures such as
	* a value->ID dictionary as row values are seen.
	*
	* The DimensionIndexer is also responsible for implementing various value lookup operations,
	* such as conversion between an encoded value and its full representation. It maintains knowledge of the
	* mappings between encoded values and actual values.
	*
	*
	* Sorting and Ordering
	* --------------------
	* When encoding is present, there are two relevant orderings for the encoded values.
	*
	* 1.) Ordering based on encoded value's order of ingestion
	* 2.) Ordering based on converted actual value
	*
	* Suppose we have a new String dimension DimA, which sees the values "Hello", "World", and "Apple", in that order.
	* This would correspond to dictionary encodings of "Hello"=0, "World"=1, and "Apple"=2, by the order
	* in which these values were first seen during ingestion.
	*
	* However, some use cases require the encodings to be sorted by their associated actual values.
	* In this example, that ordering would be "Apple"=0, "Hello"=1, "World"=2.
	*
	* The first ordering will be referred to as "Unsorted" in the documentation for this interface, and
	* the second ordering will be referred to as "Sorted".
	*
	* The unsorted ordering is used during ingestion, within the {@link org.apache.druid.segment.incremental.IncrementalIndexRow}
	* keys; the encodings are built as rows are ingested, taking the order in which new dimension values are seen.
	*
	* The generation of a sorted encoding takes place during segment creation when indexes are merged/persisted.
	* The sorted ordering will be used for dimension value arrays in that context and when reading from
	* persisted segments.
	*
	* Note that after calling the methods below that deal with sorted encodings,
	* - getUnsortedEncodedValueFromSorted()
	* - getSortedIndexedValues()
	* - convertUnsortedEncodedKeyComponentToSortedEncodedKeyComponent()
	*
	* calling processRowValsToUnsortedEncodedKeyComponent() afterwards can invalidate previously read sorted encoding values
	* (i.e., new values could be added that are inserted between existing values in the ordering).
	*
	*
	* Thread Safety
	* --------------------
	* Each DimensionIndexer exists within the context of a single IncrementalIndex. Before IndexMerger.persist() is
	* called on an IncrementalIndex, any associated DimensionIndexers should allow multiple threads to add data to the
	* indexer via processRowValsToUnsortedEncodedKeyComponent() and allow multiple threads to read data via methods that only
	* deal with unsorted encodings.
	*
	* As mentioned in the "Sorting and Ordering" section, writes and calls to the sorted encoding
	* methods should not be interleaved: the sorted encoding methods should only be called when it is known that
	* writes to the indexer will no longer occur.
	*
	* The implementations of methods dealing with sorted encodings are free to assume that they will be called
	* by only one thread.
	*
	* The sorted encoding methods are not currently used outside of index merging/persisting (single-threaded context, and
	* no new events will be added to the indexer).
	*
	* If an indexer is passed to a thread that will use the sorted encoding methods, the caller is responsible
	* for ensuring that previous writes to the indexer are visible to the thread that uses the sorted encoding space.
	*
	* For example, in the RealtimePlumber and IndexGeneratorJob, the thread that performs index persist is started
	* by the same thread that handles the row adds on an index, ensuring the adds are visible to the persist thread.
	*
	* @param <EncodedType> class of a single encoded value
	* @param <EncodedKeyComponentType> A row key contains a component for each dimension, this param specifies the
	* class of this dimension's key component. A column type that supports multivalue rows
	* should use an array type (e.g., Strings would use int[]). Column types without
	* multivalue row support should use single objects (e.g., Long, Float).
	* @param <ActualType> class of a single actual value
	*
	*/
	public interface DimensionIndexer<
	EncodedType extends Comparable<EncodedType>,
	EncodedKeyComponentType,
	ActualType extends Comparable<ActualType>>
	{

	/**
	* Encodes the given row value(s) of the dimension to be used within a row key.
	* It also updates the internal state of the DimensionIndexer, e.g. the dimLookup.
	* <p>
	* For example, the dictionary-encoded String-type column will return an int[]
	* containing dictionary IDs.
	* <p>
	*
	* @param dimValues Value(s) of the dimension in a row. This can
	* either be a single value or a list of values
	* (for multi-valued dimensions)
	* @param reportParseExceptions true if parse exceptions should be reported,
	* false otherwise
	* @return Encoded dimension value(s) to be used as a component for the row key.
	* Contains an object of the {@link EncodedKeyComponentType} and the effective
	* size of the key component in bytes.
	*/
	EncodedKeyComponent<EncodedKeyComponentType> processRowValsToUnsortedEncodedKeyComponent(
	@Nullable Object dimValues,
	boolean reportParseExceptions
	);

	/**
	* This method will be called while building an {@link IncrementalIndex} whenever a known dimension column (either
	* through an explicit schema on the ingestion spec, or auto-discovered while processing rows) is absent in any row
	* that is processed, to allow an indexer to account for any missing rows if necessary. Useful so that a string
	* {@link DimensionSelector} built on top of an {@link IncrementalIndex} may accurately report
	* {@link DimensionSelector#nameLookupPossibleInAdvance()} by allowing it to track if it has any implicit null valued
	* rows.
	*
	* At index persist/merge time all missing columns for a row will be explicitly replaced with the value appropriate
	* null or default value.
	*/
	void setSparseIndexed();

	/**
	* Given an encoded value that was ordered by associated actual value, return the equivalent
	* encoded value ordered by time of ingestion.
	*
	* Using the example in the class description:
	* getUnsortedEncodedValueFromSorted(2) would return 0
	*
	* @param sortedIntermediateValue value to convert
	* @return converted value
	*/
	EncodedType getUnsortedEncodedValueFromSorted(EncodedType sortedIntermediateValue);


	/**
	* Returns an indexed structure of this dimension's sorted actual values.
	* The integer IDs represent the ordering of the sorted values.
	*
	* Using the example in the class description:
	* "Apple"=0,
	* "Hello"=1,
	* "World"=2
	*
	* @return Sorted index of actual values
	*/
	CloseableIndexed<ActualType> getSortedIndexedValues();


	/**
	* Get the minimum dimension value seen by this indexer.
	*
	* NOTE:
	* On an in-memory segment (IncrementalIndex), we can determine min/max values by looking at the stream of
	* row values seen in calls to processSingleRowValToIndexKey().
	*
	* However, on a disk-backed segment (QueryableIndex), the numeric dimensions do not currently have any
	* supporting index structures that can be used to efficiently determine min/max values.
	*
	* When numeric dimension support is added, the segment format should be changed to store min/max values, to
	* avoid performing a full-column scan to determine these values for numeric dims.
	*
	* @return min value
	*/
	ActualType getMinValue();


	/**
	* Get the maximum dimension value seen by this indexer.
	*
	* @return max value
	*/
	ActualType getMaxValue();


	/**
	* Get the cardinality of this dimension's values.
	*
	* @return value cardinality
	*/
	int getCardinality();


	/**
	* Return an object used to read values from this indexer's column as Strings.
	*
	* @param spec Specifies the output name of a dimension and any extraction functions to be applied.
	* @param currEntry Provides access to the current Row object in the Cursor
	* @param desc Descriptor object for this dimension within an IncrementalIndex
	* @return A new object that reads rows from currEntry
	*/
	DimensionSelector makeDimensionSelector(
	DimensionSpec spec,
	IncrementalIndexRowHolder currEntry,
	IncrementalIndex.DimensionDesc desc
	);

	/**
	* Return an object used to read values from this indexer's column.
	*
	* @param currEntry Provides access to the current Row object in the Cursor
	* @param desc Descriptor object for this dimension within an IncrementalIndex
	* @return A new object that reads rows from currEntry
	*/
	ColumnValueSelector<?> makeColumnValueSelector(
	IncrementalIndexRowHolder currEntry,
	IncrementalIndex.DimensionDesc desc
	);

	ColumnCapabilities getColumnCapabilities();
	/**
	* Compares the row values for this DimensionIndexer's dimension from a Row key.
	*
	* The dimension value arrays within a Row key always use the "unsorted" ordering for encoded values.
	*
	* The row values are passed to this function as an Object, the implementer should cast them to the type
	* appropriate for this dimension.
	*
	* For example, a dictionary encoded String implementation would cast the Objects as int[] arrays.
	*
	* When comparing, if the two arrays have different lengths, the shorter array should be ordered first.
	*
	* Otherwise, the implementer of this function should iterate through the unsorted encoded values, converting
	* them to their actual type (e.g., performing a dictionary lookup for a dict-encoded String dimension),
	* and comparing the actual values until a difference is found.
	*
	* Refer to StringDimensionIndexer.compareUnsortedEncodedKeyComponents() for a reference implementation.
	*
	* The comparison rules used by this method should match the rules used by
	* {@link DimensionHandler#getEncodedValueSelectorComparator()}, otherwise incorrect ordering/merging of rows
	* can occur during ingestion, causing issues such as imperfect rollup.
	*
	* @param lhs dimension value array from a Row key
	* @param rhs dimension value array from a Row key
	* @return comparison of the two arrays
	*/
	int compareUnsortedEncodedKeyComponents(@Nullable EncodedKeyComponentType lhs, @Nullable EncodedKeyComponentType rhs);


	/**
	* Check if two row value arrays from Row keys are equal.
	*
	* @param lhs dimension value array from a Row key
	* @param rhs dimension value array from a Row key
	* @return true if the two arrays are equal
	*/
	boolean checkUnsortedEncodedKeyComponentsEqual(
	@Nullable EncodedKeyComponentType lhs,
	@Nullable EncodedKeyComponentType rhs
	);


	/**
	* Given a row value array from a Row key, generate a hashcode.
	* @param key dimension value array from a Row key
	* @return hashcode of the array
	*/
	int getUnsortedEncodedKeyComponentHashCode(@Nullable EncodedKeyComponentType key);

	/**
	* Given a row value array from a Row key, as described in the documentation for
	* {@link #compareUnsortedEncodedKeyComponents}, convert the unsorted encoded values to a list of actual values.
	*
	* If the key has one element, this method should return a single Object instead of a list.
	*
	* @param key dimension value array from a Row key
	* @return single value or list containing the actual values corresponding to the encoded values in the input array
	*/
	Object convertUnsortedEncodedKeyComponentToActualList(EncodedKeyComponentType key);

	/**
	* Converts dictionary-encoded row values from unspecified (random) encoding order, to sorted encoding. This step
	* is needed to be able to correctly map per-segment encoded values to global values on the next conversion step,
	* {@link DimensionMerger#convertSortedSegmentRowValuesToMergedRowValues}. The latter method requires sorted encoding
	* values on the input, because {@link DimensionMerger#writeMergedValueDictionary} takes sorted lookups as it's input.
	*/
	ColumnValueSelector convertUnsortedValuesToSorted(ColumnValueSelector selectorWithUnsortedValues);

	/**
	* Helper function for building bitmap indexes for integer-encoded dimensions.
	*
	* Called by IncrementalIndexAdapter as it iterates through its sequence of rows.
	*
	* Given a row value array from a Row key, with the current row number indicated by "rowNum",
	* set the index for "rowNum" in the bitmap index for each value that appears in the row value array.
	*
	* For example, if key is an int[] array with values [1,3,4] for a dictionary-encoded String dimension,
	* and rowNum is 27, this function would set bit 27 in bitmapIndexes[1], bitmapIndexes[3], and bitmapIndexes[4]
	*
	* See StringDimensionIndexer.fillBitmapsFromUnsortedEncodedKeyComponent() for a reference implementation.
	*
	* If a dimension type does not support bitmap indexes, this function will not be called
	* and can be left unimplemented.
	*
	* @param key dimension value array from a Row key
	* @param rowNum current row number
	* @param bitmapIndexes array of bitmaps, indexed by integer dimension value
	* @param factory bitmap factory
	*/
	void fillBitmapsFromUnsortedEncodedKeyComponent(
	EncodedKeyComponentType key,
	int rowNum,
	MutableBitmap[] bitmapIndexes,
	BitmapFactory factory
	);
	}