NoSQL


Event Logging

Applications have different event logging needs; within the enterprise, there are many different
applications that want to log events. Document databases can store all these different types of events
and can act as a central store for event data. This is especially useful when the type of data
being captured by the events keeps changing. Events can be sharded by the name of the application
where the event originated or by the type of event.
Content Management Systems, Blogging Platforms
Since document databases have no predefined schemas and usually understand JSON documents,
they work well in content management systems and publishing applications: managing websites,
user comments, user registrations, profiles, and web-facing documents.
Web Analytics or Real-Time Analytics
Document databases can store data for real-time analytics; since parts of the document can be
updated, it’s very easy to store page views or unique visitors, and new metrics can be easily
added without schema changes.
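As a rough sketch of this partial-update idea, the snippet below uses plain Java maps standing in for stored documents (a real document database such as MongoDB would achieve the same effect with an update operator like $inc); all names here are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: metrics are just fields on a schemaless document, so incrementing
// an existing metric or adding a brand-new one needs no migration.
public class AnalyticsSketch {
    static Map<String, Object> pageDoc(String url) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("url", url);
        doc.put("pageViews", 0);
        return doc;
    }

    // analogous to an in-place counter increment ($inc in MongoDB)
    static void increment(Map<String, Object> doc, String metric) {
        doc.merge(metric, 1, (a, b) -> (Integer) a + (Integer) b);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = pageDoc("/home");
        increment(doc, "pageViews");
        increment(doc, "pageViews");
        increment(doc, "uniqueVisitors"); // new metric appears with no schema change
        System.out.println(doc.get("pageViews") + " " + doc.get("uniqueVisitors"));
    }
}
```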
E-Commerce Applications
E-commerce applications often need flexible schemas for products and orders, as well as
the ability to evolve their data models without expensive database refactoring or data migration.

Documents are the main concept in document databases. The database stores and retrieves documents,
which can be XML, JSON, BSON, and so on. These documents are self-describing, hierarchical tree
data structures which can consist of maps, collections, and scalar values. The documents stored are
similar to each other but do not have to be exactly the same. Document databases store documents in
the value part of the key-value store; think about document databases as key-value stores where the
value is examinable.
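A minimal sketch of this "key-value store with an examinable value" view, using plain Java maps and lists in place of a real document database (all keys and field names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: the stored value is a self-describing tree of maps, collections,
// and scalar values, so the store can look inside it — unlike an opaque
// key-value blob.
public class DocumentSketch {
    static final Map<String, Map<String, Object>> store = new HashMap<>();

    public static void main(String[] args) {
        Map<String, Object> order = new HashMap<>();
        order.put("customerId", 99);                   // scalar value
        order.put("items", List.of("book", "pen"));    // collection
        order.put("shipping", Map.of("city", "Pune")); // nested map
        store.put("order:1001", order);

        // because the value is examinable, we can reach a field inside it
        Object city = ((Map<?, ?>) store.get("order:1001").get("shipping")).get("city");
        System.out.println(city);
    }
}
```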

Features:
Consistency
Transactions
Availability
Query Features
Scaling

For example, with the legacy MongoDB Java driver, a write can be required to reach multiple replicas before it is acknowledged:

final Mongo mongo = new Mongo(mongoURI);
mongo.setWriteConcern(WriteConcern.REPLICAS_SAFE);
DBCollection shopping = mongo.getDB(orderDatabase).getCollection(shoppingCollection);

try {
    // REPLICAS_SAFE blocks until the write has been replicated
    WriteResult result = shopping.insert(order, WriteConcern.REPLICAS_SAFE);
} catch (MongoException writeException) {
    // handle the failed write here (log, retry, or surface the error)
    // rather than silently swallowing the exception
}
Examples: Riak, Redis, Memcached (and its flavors), Berkeley DB, HamsterDB, Amazon DynamoDB, Project Voldemort.
We could also create buckets which store specific data. In Riak, these are known as domain
buckets, which let the client driver handle serialization and deserialization. Using domain
buckets, or different buckets for different object types, segments the data across buckets,
allowing you to read only the object you need without having to change the key design.
Bucket bucket = client.fetchBucket(bucketName).execute();
DomainBucket<UserProfile> profileBucket =
        DomainBucket.builder(bucket, UserProfile.class).build();

Storing Session Information: Generally, every web session is unique and is assigned a unique
sessionid value. Applications that store the sessionid on disk or in an RDBMS will greatly benefit from
moving to a key-value store, since everything about the session can be stored by a single PUT request
or retrieved using GET. This single-request operation makes it very fast, as everything about the
session is stored in a single object. Solutions such as Memcached are used by many web applications,
and Riak can be used when availability is important.
User Profiles, Preferences: Almost every user has a unique userId, username, or some other
attribute, as well as preferences such as language, color, timezone, which products the user has
access to, and so on. This can all be put into an object, so getting preferences of a user takes a single
GET operation. Similarly, product profiles can be stored.
Shopping Cart Data: E-commerce websites have shopping carts tied to the user. As we want the
shopping carts to be available all the time, across browsers, machines, and sessions, all the shopping
information can be put into the value where the key is the userId. A Riak cluster would be best suited
for these kinds of applications.
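The single-key, single-request pattern behind all three use cases can be sketched as follows; KVStore here is a stand-in for a real store such as Riak or Memcached, not an actual client API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: everything about a session, profile, or cart lives in one value
// under one key (here the user's id), so one put stores it and one get
// retrieves it — a single-request round trip.
public class CartSketch {
    static class KVStore {
        private final Map<String, Object> data = new HashMap<>();
        void put(String key, Object value) { data.put(key, value); }
        Object get(String key) { return data.get(key); }
    }

    public static void main(String[] args) {
        KVStore store = new KVStore();
        // one PUT stores the whole cart for user "u42"
        store.put("cart:u42", List.of("laptop", "mouse"));
        // one GET retrieves it, across browsers, machines, and sessions
        List<?> cart = (List<?>) store.get("cart:u42");
        System.out.println(cart.size());
    }
}
```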

Partitioning is the process of dividing the intermediate key-value pairs generated by the
Mapper phase into subsets, each of which will be processed by a specific Reducer.
Purpose of Partitioning:
To distribute the data evenly across reducers.
To ensure that all key-value pairs with the same key go to the same reducer.
Partitioner: A built-in or user-defined function determines how intermediate key-value
pairs are assigned to reducers.
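A sketch of a hash partitioner follows; this is the same hash(key) mod numReducers formula that Hadoop's default HashPartitioner uses:

```java
// Sketch: the reducer for a key is hash(key) mod numReducers, which both
// spreads keys across reducers and guarantees every pair with the same key
// lands on the same reducer.
public class PartitionSketch {
    static int partition(String key, int numReducers) {
        // mask off the sign bit so the result is never negative
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int r = 4;
        // the same key always maps to the same reducer
        System.out.println(partition("apple", r) == partition("apple", r));
    }
}
```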
Combining is an optimization step in MapReduce that reduces the amount of
intermediate data transferred between the Mapper and Reducer phases.
Purpose of Combining:
To minimize network overhead by aggregating intermediate results locally on the mapper node.
To decrease the total amount of data sent across the network (shuffle phase).
How Combining Works:
A Combiner function is applied to intermediate key-value pairs output by the Mapper.
The Combiner performs partial aggregation, similar to the Reducer, but its scope is
limited to the local mapper's output.
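A sketch of combining for the classic word-count job, with plain Java collections standing in for the MapReduce framework (method names are illustrative):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: the mapper emits ("word", 1) pairs; the combiner sums counts per
// word on the mapper node, shrinking the data shuffled to reducers. Its
// logic mirrors the reducer's, but it only sees one mapper's local output.
public class CombineSketch {
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.split("\\s+")) out.add(new SimpleEntry<>(w, 1));
        return out;
    }

    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> local = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> p : pairs)
            local.merge(p.getKey(), p.getValue(), Integer::sum);
        return local;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = map("to be or not to be");
        System.out.println(mapped.size());   // pairs before combining
        Map<String, Integer> combined = combine(mapped);
        System.out.println(combined.size()); // distinct keys after combining
    }
}
```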
