By now you may have heard that Apache Spark is the fastest growing project in open source ‘Big Data’ community. Spark does not include its own distributed data persistence technology but can work with any Hadoop-compatible data formats. You can use the MarkLogic Connector for Hadoop as an input source to Spark and take advantage of the Spark framework to develop your ‘Big Data’ applications on top of MarkLogic.
Rebalancing in MarkLogic redistributes content in a database so that the forests that make up the database each have a similar number of documents. Spreading documents evenly across forests lets you take better advantage of concurrency among hosts.
A key feature of MarkLogic is ACID compliance. ACID stands for atomicity, consistency, isolation, and durability, and meeting these four requirements means that transactions in MarkLogic are reliable.