src/main/java/org/apache/sysds/runtime/controlprogram/caching/UnifiedMemoryManager.java - systemds - Git at Google

 /*
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
  * distributed with this work for additional information
  * regarding copyright ownership.  The ASF licenses this file
  * to you under the Apache License, Version 2.0 (the
  * "License"); you may not use this file except in compliance
  * with the License.  You may obtain a copy of the License at
  *
  *   http://www.apache.org/licenses/LICENSE-2.0
  *
  * Unless required by applicable law or agreed to in writing,
  * software distributed under the License is distributed on an
  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  * KIND, either express or implied.  See the License for the
  * specific language governing permissions and limitations
  * under the License.
  */

 package org.apache.sysds.runtime.controlprogram.caching;

 import org.apache.commons.lang.NotImplementedException;

 /**
  * Unified Memory Manager - Initial Design
  *
  * Motivation:
  * The Unified Memory Manager, henceforth UMM, will act as a central manager of in-memory
  * matrix (uncompressed and compressed), frame, and tensor blocks within SystemDS control
  * program. So far, operation memory (70%) and buffer pool memory (15%, LazyWriteBuffer)
  * are managed independently, which causes unnecessary evictions. New components like the
  * LineageCache also use and manage statically provisioned memory areas. Ultimately, the
  * UMM aims to eliminate these shortcomings by providing a central, potentially thread-local,
  * memory management.
  *
  * Memory Areas:
  * Initially, the UMM only handles CacheBlock objects (e.g., MatrixBlock, FrameBlock, and
  * TensorBlock), and manages two memory areas:
  *   (1) operation memory (pinned cache blocks and reserved memory) and
  *   (2) dirty objects (dirty cache blocks that need to be written to local FS before eviction)
  *
  * The UMM is configured with a capacity (absolute size in byte). Relative to this capacity,
  * the operations and buffer pool memory areas each will have a min and max amount of memory
  * they can occupy, meaning that the boundary for the areas can shift dynamically depending
  * on the current load. Most importantly, though, dirty objects must not be counted twice
  * when pinning such an object for an operation. The min/max constraints are not exposed but
  * configured internally. An good starting point are the following constraints (relative to
  * JVM max heap size):
  * ___________________________
  * | operations  | 0%  | 70% | (pin requests always accepted)
  * | buffer pool | 15% | 85% | (eviction on demand)
  *
  * Object Lifecycle:
  * The UMM will also need to keep track of the current state of individual cache blocks, for
  * which it will have a few member variables. A queue similar to the current EvictionQueue is
  * used to add/remove entries with LRU as its eviction policy. In general, there are three
  * properties of object status to consider:
  *  (1) Non-dirty/dirty: non-dirty objects have a representation on HDFS or can be recomputed
  *      from lineage trace (e.g., rand/seq outputs), while dirty objects need to be preserved.
  *  (2) FS Persisted: on eviction, dirty objects need to be written to local file system.
  *      As long the local file representation exist, dirty objects can simply be dropped.
  *  (3) Pinned/unpinned: For operations, objects are pinned into memory to guard against
  *      eviction. All pin requests have to be accepted, and once a non-dirty object is released
  *      (unpinned) it can be dropped without persisting it to local FS.
  *
  * Thread-safeness:
  * Initially, the UMM will be used in an instance-based manner. For global visibility and
  * use in parallel for loops, the UMM would need to provide a static, synchronized API, but
  * this constitutes a source of severe contention. In the future, we will consider a design
  * with thread-local UMMs for the individual parfor workers.
  *
  * Testing:
  * The UMM will be developed bottom up, and thus initially tested via component tests for
  * evaluating the eviction behavior for sequences of API requests.
  */
 public class UnifiedMemoryManager
 {
 	public UnifiedMemoryManager(long capacity) {
 		//TODO implement
 		throw new NotImplementedException();
 	}

 	/**
 	 * Pins a cache block into operation memory.
 	 *
 	 * @param key    unique identifier and local FS filename for eviction
 	 * @param block  cache block if not under UMM control, null otherwise
 	 * @param dirty  indicator if block is dirty (subject to buffer pool management)
 	 * @return       pinned cache block, potentially restored from local FS
 	 */
 	public CacheBlock pin(String key, CacheBlock block, boolean dirty) {
 		//TODO implement
 		throw new NotImplementedException();
 	}

 	/**
 	 * Pins a virtual cache block into operation memory, by making a size reservation.
 	 * The provided size is an upper bound of the actual object size, and can be
 	 * updated on unpin (once the actual cache block is provided).
 	 *
 	 * @param key    unique identifier and local FS filename for eviction
 	 * @param size   memory reservation in operation area
 	 * @param dirty  indicator if block is dirty (subject to buffer pool management)
 	 */
 	public void pin(String key, long size, boolean dirty) {
 		//TODO implement
 		throw new NotImplementedException();
 	}

 	/**
 	 * Unpins (releases) a cache block from operation memory. Dirty objects
 	 * are logically moved back to the buffer pool area.
 	 *
 	 * @param key    unique identifier and local FS filename for eviction
 	 */
 	public void unpin(String key) {
 		//TODO implement
 		throw new NotImplementedException();
 	}

 	/**
 	 * Unpins (releases) a cache block from operation memory. If the size of
 	 * the provided cache block differs from the UMM meta data, the UMM meta
 	 * data is updated. Use cases include update-in-place operations and
 	 * size reservations via worst-case upper bound estimates.
 	 *
 	 * @param key    unique identifier and local FS filename for eviction
 	 * @param block  cache block which may be under UMM control, if null ignored
 	 */
 	public void unpin(String key, CacheBlock block) {
 		//TODO implement
 		throw new NotImplementedException();
 	}

 	/**
 	 * Removes a cache block associated with the given key from all memory
 	 * areas, and deletes evicted representations (files in local FS). The
 	 * local file system deletes can happen asynchronously.
 	 *
 	 * @param key    unique identifier and local FS filename for eviction
 	 */
 	public void delete(String key) {
 		//TODO implement
 		throw new NotImplementedException();
 	}

 	/**
 	 * Removes all cache blocks from all memory areas and deletes all evicted
 	 * representations (files in local FS). All internally thread pools must be
 	 * shut down in a gracefully manner (e.g., wait for pending deletes).
 	 */
 	public void deleteAll() {
 		//TODO implement
 		throw new NotImplementedException();
 	}
 }
	/*
	* Licensed to the Apache Software Foundation (ASF) under one
	* or more contributor license agreements. See the NOTICE file
	* distributed with this work for additional information
	* regarding copyright ownership. The ASF licenses this file
	* to you under the Apache License, Version 2.0 (the
	* "License"); you may not use this file except in compliance
	* with the License. You may obtain a copy of the License at
	*
	* http://www.apache.org/licenses/LICENSE-2.0
	*
	* Unless required by applicable law or agreed to in writing,
	* software distributed under the License is distributed on an
	* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	* KIND, either express or implied. See the License for the
	* specific language governing permissions and limitations
	* under the License.
	*/

	package org.apache.sysds.runtime.controlprogram.caching;

	import org.apache.commons.lang.NotImplementedException;

	/**
	* Unified Memory Manager - Initial Design
	*
	* Motivation:
	* The Unified Memory Manager, henceforth UMM, will act as a central manager of in-memory
	* matrix (uncompressed and compressed), frame, and tensor blocks within SystemDS control
	* program. So far, operation memory (70%) and buffer pool memory (15%, LazyWriteBuffer)
	* are managed independently, which causes unnecessary evictions. New components like the
	* LineageCache also use and manage statically provisioned memory areas. Ultimately, the
	* UMM aims to eliminate these shortcomings by providing a central, potentially thread-local,
	* memory management.
	*
	* Memory Areas:
	* Initially, the UMM only handles CacheBlock objects (e.g., MatrixBlock, FrameBlock, and
	* TensorBlock), and manages two memory areas:
	* (1) operation memory (pinned cache blocks and reserved memory) and
	* (2) dirty objects (dirty cache blocks that need to be written to local FS before eviction)
	*
	* The UMM is configured with a capacity (absolute size in byte). Relative to this capacity,
	* the operations and buffer pool memory areas each will have a min and max amount of memory
	* they can occupy, meaning that the boundary for the areas can shift dynamically depending
	* on the current load. Most importantly, though, dirty objects must not be counted twice
	* when pinning such an object for an operation. The min/max constraints are not exposed but
	* configured internally. An good starting point are the following constraints (relative to
	* JVM max heap size):
	* ___________________________
	* \| operations \| 0% \| 70% \| (pin requests always accepted)
	* \| buffer pool \| 15% \| 85% \| (eviction on demand)
	*
	* Object Lifecycle:
	* The UMM will also need to keep track of the current state of individual cache blocks, for
	* which it will have a few member variables. A queue similar to the current EvictionQueue is
	* used to add/remove entries with LRU as its eviction policy. In general, there are three
	* properties of object status to consider:
	* (1) Non-dirty/dirty: non-dirty objects have a representation on HDFS or can be recomputed
	* from lineage trace (e.g., rand/seq outputs), while dirty objects need to be preserved.
	* (2) FS Persisted: on eviction, dirty objects need to be written to local file system.
	* As long the local file representation exist, dirty objects can simply be dropped.
	* (3) Pinned/unpinned: For operations, objects are pinned into memory to guard against
	* eviction. All pin requests have to be accepted, and once a non-dirty object is released
	* (unpinned) it can be dropped without persisting it to local FS.
	*
	* Thread-safeness:
	* Initially, the UMM will be used in an instance-based manner. For global visibility and
	* use in parallel for loops, the UMM would need to provide a static, synchronized API, but
	* this constitutes a source of severe contention. In the future, we will consider a design
	* with thread-local UMMs for the individual parfor workers.
	*
	* Testing:
	* The UMM will be developed bottom up, and thus initially tested via component tests for
	* evaluating the eviction behavior for sequences of API requests.
	*/
	public class UnifiedMemoryManager
	{
	public UnifiedMemoryManager(long capacity) {
	//TODO implement
	throw new NotImplementedException();
	}

	/**
	* Pins a cache block into operation memory.
	*
	* @param key unique identifier and local FS filename for eviction
	* @param block cache block if not under UMM control, null otherwise
	* @param dirty indicator if block is dirty (subject to buffer pool management)
	* @return pinned cache block, potentially restored from local FS
	*/
	public CacheBlock pin(String key, CacheBlock block, boolean dirty) {
	//TODO implement
	throw new NotImplementedException();
	}

	/**
	* Pins a virtual cache block into operation memory, by making a size reservation.
	* The provided size is an upper bound of the actual object size, and can be
	* updated on unpin (once the actual cache block is provided).
	*
	* @param key unique identifier and local FS filename for eviction
	* @param size memory reservation in operation area
	* @param dirty indicator if block is dirty (subject to buffer pool management)
	*/
	public void pin(String key, long size, boolean dirty) {
	//TODO implement
	throw new NotImplementedException();
	}

	/**
	* Unpins (releases) a cache block from operation memory. Dirty objects
	* are logically moved back to the buffer pool area.
	*
	* @param key unique identifier and local FS filename for eviction
	*/
	public void unpin(String key) {
	//TODO implement
	throw new NotImplementedException();
	}

	/**
	* Unpins (releases) a cache block from operation memory. If the size of
	* the provided cache block differs from the UMM meta data, the UMM meta
	* data is updated. Use cases include update-in-place operations and
	* size reservations via worst-case upper bound estimates.
	*
	* @param key unique identifier and local FS filename for eviction
	* @param block cache block which may be under UMM control, if null ignored
	*/
	public void unpin(String key, CacheBlock block) {
	//TODO implement
	throw new NotImplementedException();
	}

	/**
	* Removes a cache block associated with the given key from all memory
	* areas, and deletes evicted representations (files in local FS). The
	* local file system deletes can happen asynchronously.
	*
	* @param key unique identifier and local FS filename for eviction
	*/
	public void delete(String key) {
	//TODO implement
	throw new NotImplementedException();
	}

	/**
	* Removes all cache blocks from all memory areas and deletes all evicted
	* representations (files in local FS). All internally thread pools must be
	* shut down in a gracefully manner (e.g., wait for pending deletes).
	*/
	public void deleteAll() {
	//TODO implement
	throw new NotImplementedException();
	}
	}