Storage
Implementations for persistent and not-so-persistent (aka in-memory) data stores. The great thing about the storage implementations is that they are easy to use and interchangeable. For example, during prototyping we might want something easy to setup. For testing something without external dependencies. For production something reliable, clustered or super fast?
The storage API’s and query DSL are designed primarily for use with NoSQL databases and only support storing homogeneous objects. The storage implementations are standalone and are optional, it’s entirely possible to use MongoDB for example directly, or with the Vert.x APIs.
The API is modeled around a “queryable” map and inspired by the Hazelcast IMap interface. This results in a very convenient object store with a fast lookup on the primary key, while also supporting queries with indexes.
All examples uses the Account
class which implements Storable
.
Storage implementations
Class | Persistence | Scope | Description |
---|---|---|---|
HazelMap | memory/disk* | Cluster | Distributed cluster, backed by an IMap. |
ElasticMap | memory/disk | Server | Elasticsearch high-level transport client. |
MongoDBMap | memory*/disk | Server | MongoDB async client driver. |
IndexedMapPersisted | disk | JVM | CQEngine backed with SQLite persistence. |
IndexedMapVolatile | memory | JVM | CQEngine with in memory store and indexes. |
SharedMap | memory | JVM | Thread/Verticle concurrency safe plain map. |
JsonMap | memory | JVM | No serialization required for JsonObjects. |
PrivateMap | memory | Instance | A map that isn’t shared across loaders. |
*) Not included in the free/community version.
Persistence types indicates how the data is stored and if it survives a restart.
Type | Description |
---|---|
memory | data is stored in RAM - super fast but does not survive restarts. |
disk | data is persisted to disk and survives restarts. |
Types of scope
Type | Description |
---|---|
Cluster | requires clustering, Hazelcasts cluster members has access. |
Server | runs in a separate process or machine, shared with authorized. |
JVM | shared with services running in the same JVM. |
Instance | Not shared, even if the same Map name is used. |
For JVM scoped persistence stores the stored data is shared between all clients. This requires that the storage is loaded with the same database identifier. The database identifier is constructed from the database name and the collection name. As an example, the IndexedMapPersisted
stores data to <database name>/<collection name>.db
which is an SQLite file.
Loading a storage
Loading a storage implementation is done using a StorageLoader
.
new StorageLoader(context)
.withPlugin(HazelMap.class)
.withDB("appName", "accounts")
.withValue(Account.class)
.build(done -> {
if (done.succeeded()) {
AsyncStorage<Account> db = done.result();
} else {
// handle done.cause();
}
});
The value class has a single requirement, it must implement Storable
.
class Account implements Storable {
private String name;
@Override
public String getId() {
return name;
}
}
getId
is an optional override, if not implemented the hashCode
of the Storable
will be used. In this case make sure to use a stable hashCode
and preferably something associated with the stored objects, so that the key can be used for faster lookups.
It is important to implement hashCode
and equals
for the following storages
- IndexedMapPersisted
- IndexedMapVolatile
As these implementations are based on the Java collections API.
When using HazelMap
the following method override is required for attributes that are being sorted on.
@Override
public int compareToAttribute(Storable other, String attribute) {
Account account = (Account) other;
switch (attribute) {
case "name":
return name.compareTo(account.getName());
default:
return 0;
}
}
This can also be solved with some trickery using the Serializer.getValueByPath
but there are too many variables in this case to be solved with a default implementation.
Storage API
The storage API is modeled after a simple Map, all methods are asynchronous to avoid blocking the event loop. Some storage implementations that use in-memory and does not block can complete directly. This is an implementation detail and the API is always used asynchronously. This means Future<T>
and Handler<AsyncResult<T>>
, see the Vert.x documentation for more information on how these work.
// retrieve the object with the id of "key".
get("key", (done) -> {
if (done.succeeded()) {
Account account = done.result();
} else {
// handle done.cause();
}
});
// put the account object but ignore the result.
put(account, (done) -> {});
// check if the storage contains an object with the given key.
contains("key", (done) -> {
boolean exists = done.result();
});
// adds the account if it does not already exist.
putIfAbsent(account, (done) -> {
if (done.succeeded()) {
// inserted successfully.
} else {
// failed, if caused by ValueAlreadyPresent the
// value already exists and was not inserted.
}
});
// updates the given value but only if it already exists.
update(account, (done) -> {
if (done.succeeded()) {
// the existing value was updated.
} else {
// failed, if caused by ValueMissingException
// the value didn't not previously exist.
}
});
// retrieves all values in the store with a lazy stream.
values(done -> {
// done.result() is a Stream<Value> lazily evaluated
// depending on the storage.
});
// remove all entries from the store.
clear(done -> {
// if done.succeeded() all entires are cleared.
});
// retrieve the current number of entries in the store.
size((done) -> {
int count = done.result();
});
// add an index for a regular attribute {"petstore": {owner: "jess"}}
addIndex("petstore.owner");
// add an index for a multi-valued attribute {"petstore": {petNames: ["kitty1", "kitty2"]}}
addIndex("petstore.petNames[]");
It is recommended to add all indexes to CQEngine based disk-persistence stores, before adding any objects. As indexes are not loaded from the SQLite database on startup, as these require special accessor implementations, “Attributes”. If the application is shut down, started and an object is added without calling .addIndex that object will not be added to any indexes and cannot be found using attributes for which indexes exists for.
To solve this, call IndexedMapPersisted.reindex()
before instantiating the IndexedMapPersisted storage plugin. Objects
added before the index was added with .addIndex
can then be re-indexed, this will be done the next time .addIndex
is called and this incurs a performance penalty as the whole collection will be re-indexed. To avoid this, add all
indexes any time the application is started.
Query API
The query API is the same for all storage implementations.
// some storage we already initialized.
AsyncStorage<Account> account = db;
Query<Account> query = db.query("username")
.equalTo("admin")
.and("email")
.like("@root.com")
.or("age")
.between(32, 64)
.matches("[0-9]*")
.and("lastname")
.startsWith("duda")
.pageSize(32)
.page(4)
.orderBy("firstName")
.order(SortOrder.ASCENDING)
.name("super_advanced_account_query");
query.execute(done -> {
// done.result() all 32 matching results on page 4.
// ordered in asending order by their firstname.
});
query.poll(done -> {
// done.result() contains matches to the query
// the query will be executed every second.
}, () -> 1000);
Serializing a query to DSL, this does NOT escape inputs in any way - do NOT use for user input !!!
new Query().on("cat.type")
.in("siamese", "perser", "ragdoll")
.and("cat.color").equalTo("white")
.or("cat.lifestyle").in("amphibians", "wateranimal").matches("[water].*")
.or("cat.age").between(0L, 100L).and("cat.name").startsWith("fl")
.orderBy("cat.name").order(SortOrder.ASCENDING)
.page(3).pageSize(24)
.setName("findCatsQ")
.toString();
/*
Output:
NAMED QUERY 'findCatsQ' QUERY
ON cat.type IN (siamese,perser,ragdoll) AND cat.color EQ white
OR cat.lifestyle IN (amphibians,wateranimal) REGEX([water].*)
OR cat.age BETWEEN 0 100 AND cat.name STARTSWITH fl
ORDERBY cat.name ASCENDING PAGE 3 PAGESIZE 24
*/
Sometime in the future it might be possible to escape inputs properly. For now this must be done MANUALLY.
Query DSL
There is also a text-based query parser that can be used to send queries over the network or by reading from configuration. There is no support for prepared statements, if that is needed use the Query API instead.
Example query
NAMED QUERY 'findCats Query' ON cat.type
IN (siamese,perser,ragdoll)
AND cat.color EQ white
OR cat.lifestyle IN (amphibians,wateranimal)
AND cat.address REGEX([water ].*)
OR cat.age BETWEEN 0 100
AND cat.name STARTSWITH fl
ORDERBY cat.name ASCENDING PAGE 3 PAGESIZE 24
The query can be parsed with the QueryParser
// some database we already initialized.
AsyncStorage<Account> accounts = db;
// The query parser needs a reference to the backing storage.
QueryParser parser = new QueryParser(db::query);
// parsing the expression returns a QueryBuilder<Account>
parser.parse(expression)
.execute(done -> {
// done.result() contains matching elements.
});
Custom implementations
Providing a custom implementation is easy, just implement the AsyncStorage<T>
interface and the QueryBuilder<T>
. To help with the implementation of the query interface AbstractQueryBuilder<T>
can be used.
To conform to existing implementations make sure to extend test cases from MapTestCases
.
Versions
Aiming to keep up to the latest versions, current support is
Storage | Version |
---|---|
ElasticSeach | 7.3.0 |
MongoDB | 4.0.8 |
Hazelcast | 3.10.5 |
CQEngine | 3.4.0 |
Please submit a feature request with any ideas on how to improve the APIs or to request support for another storage.