The Geode team is actively working on several groundbreaking features that extend the product's usefulness in transactional environments and make it highly relevant to big data applications. Here are some of the key capabilities being built into the product at this time.

HDFS Integration: Geode as a transactional layer that micro-batches data out to Hadoop. This capability makes Geode a NoSQL store that sits on top of Hadoop and parallelizes the movement of data from the in-memory tier into Hadoop, making it very useful for capturing and processing fast data while exposing that data to Hadoop jobs relatively quickly. The key requirements being met here are:

  • Ingest data into HDFS in parallel
  • Cache bloom filters to allow fast lookups of individual elements (see the sketch after this list)
  • Have programmable policies for deciding what stays in memory
  • Roll files in HDFS
  • Index data that is in memory
  • Have expiration policies that allow the transactional set to decay out older data
  • Support both replicated and partitioned regions
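
The bloom filter requirement can be made concrete with a small sketch. This uses Guava's BloomFilter purely as a stand-in for whatever per-file structure Geode ends up caching; the region keys and file semantics are illustrative, not Geode's implementation:

    import com.google.common.hash.{BloomFilter, Funnels}
    import java.nio.charset.StandardCharsets

    object BloomLookupSketch extends App {
      // One filter per rolled HDFS file: ~1M expected keys, 1% false positives.
      val keysInFile = BloomFilter.create(
        Funnels.stringFunnel(StandardCharsets.UTF_8), 1000000, 0.01)

      keysInFile.put("order-42") // recorded when the entry was flushed to HDFS

      // A negative answer is definitive, so the expensive HDFS read is skipped;
      // a positive answer may be a false positive and still needs the real lookup.
      if (keysInFile.mightContain("order-42"))
        println("possibly in this HDFS file - read it")
      if (!keysInFile.mightContain("order-99"))
        println("definitely not in this file - skip the read")
    }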

Spark Integration: Geode as a data store for Spark applications is what is being enabled here. By providing a bridge-style connector for Spark applications, Geode can store intermediate and final state for Spark jobs and give applications very efficient access to reference data held in the in-memory tier:

  • Expose Geode regions as Spark RDDs
  • Write Spark RDDs to Geode regions
  • Execute arbitrary OQL queries in your Spark applications (see the sketch after this list)
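
To give a feel for what application code against such a bridge could look like, here is a hypothetical Scala sketch. The connector import and the geodeRegion, saveToGeode, and geodeOQL methods are assumptions for illustration, not a committed API:

    import org.apache.spark.{SparkConf, SparkContext}
    // import <geode-spark-connector>._  // hypothetical connector package

    object GeodeSparkSketch extends App {
      val sc = new SparkContext(
        new SparkConf()
          .setAppName("geode-spark-sketch")
          .set("spark.geode.locators", "localhost[10334]")) // assumed property

      // Expose an existing Geode region as a Spark RDD (hypothetical method).
      val orders = sc.geodeRegion[String, Double]("orders")

      // Compute on the RDD and write results back to a region (hypothetical).
      orders.mapValues(_ * 0.9).saveToGeode("discountedOrders")

      // Run an OQL query from inside the Spark application (hypothetical).
      val big = sc.geodeOQL("SELECT * FROM /orders o WHERE o > 100")

      sc.stop()
    }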

Off-Heap Data Management: Increasing the memory density of the in-memory tier has been an important goal for customers. Moving data out of the reach of the JVM garbage collector allows for higher throughput because GC threads are no longer actively copying data from one memory space to the next. It also reduces the need to restrict JVM sizes to ensure that memory allocation and garbage generation never outrun the garbage collector, which in turn reduces complexity by shrinking cluster sizes and the number of moving parts in a running cluster. The work includes:

  • Storing values, indexes, and keys off heap (see the sketch after this list)
  • Optimizing Geode so that operations on data minimize both how much data is deserialized and how often it is deserialized
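
As a minimal illustration of the off-heap idea (not Geode's actual allocator), the sketch below keeps serialized bytes in an NIO direct buffer, outside the reach of the garbage collector, and deserializes only when the value is read back:

    import java.nio.ByteBuffer
    import java.nio.charset.StandardCharsets

    object OffHeapSketch extends App {
      // Direct buffers live outside the Java heap; GC threads never copy
      // these bytes between memory spaces during collection.
      val offHeap = ByteBuffer.allocateDirect(1024)

      val value = "serialized-region-entry".getBytes(StandardCharsets.UTF_8)
      offHeap.putInt(value.length)
      offHeap.put(value)

      // Deserialize on demand: the bytes stay off heap until actually read.
      offHeap.flip()
      val readBack = new Array[Byte](offHeap.getInt())
      offHeap.get(readBack)
      println(new String(readBack, StandardCharsets.UTF_8))
    }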

Lucene Integration: Store Lucene indexes in Geode regions so that users can run text searches on data held in Geode. One way of leveraging this work would be to use the Gem/Z connector to push data into a Geode cluster and then perform all kinds of text analysis on that data.
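
As a standalone illustration of the kind of text search this enables (assuming a Lucene 5.x-style API, with an in-memory RAMDirectory standing in for wherever Geode would actually hold the index):

    import org.apache.lucene.analysis.standard.StandardAnalyzer
    import org.apache.lucene.document.{Document, Field, TextField}
    import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig}
    import org.apache.lucene.queryparser.classic.QueryParser
    import org.apache.lucene.search.IndexSearcher
    import org.apache.lucene.store.RAMDirectory

    object LuceneSketch extends App {
      val analyzer = new StandardAnalyzer()
      val dir = new RAMDirectory() // stand-in for a region-backed directory

      // Index one document with a full-text field.
      val writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))
      val doc = new Document()
      doc.add(new TextField("body",
        "Geode captures fast data for text analysis", Field.Store.YES))
      writer.addDocument(doc)
      writer.close()

      // Search it back with a parsed text query.
      val searcher = new IndexSearcher(DirectoryReader.open(dir))
      val query = new QueryParser("body", analyzer).parse("fast data")
      val hits = searcher.search(query, 10)
      hits.scoreDocs.foreach(h => println(searcher.doc(h.doc).get("body")))
    }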

General Product Improvements:

  • Making authentication and authorization for all channels (gfsh, admin, client-server, and REST) follow the highly effective client-server model that we support today.
  • Extending the transaction mechanism in Geode to support distributed transactions with eager locking, building on the colocated transaction model already supported in the product (see the sketch after this list).
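
For reference, here is a minimal sketch of the colocated transaction API being extended, assuming the org.apache.geode package names and a peer cache with default configuration; today every key touched inside a transaction must resolve to data hosted on the same member:

    import org.apache.geode.cache.{CacheFactory, RegionShortcut}

    object TxSketch extends App {
      val cache = new CacheFactory().create() // peer cache, default config

      val orders = cache
        .createRegionFactory[String, Double](RegionShortcut.PARTITION)
        .create("orders")

      // Today both puts must resolve to colocated data on one member;
      // the planned work adds eager locking so they may span members.
      val txMgr = cache.getCacheTransactionManager
      txMgr.begin()
      try {
        orders.put("order-1", 100.0)
        orders.put("order-2", 250.0)
        txMgr.commit()
      } catch {
        case e: Exception =>
          if (txMgr.exists()) txMgr.rollback() // tx may already be over after a failed commit
          throw e
      }
    }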