MongoDB on Red Hat
May 2012
Table of Contents
Introduction
About MongoDB
MongoDB Components
Deployment Architectures
Installing MongoDB
Storage Configuration
Securing MongoDB
Database Backups
System Monitoring
Summary
Introduction
This paper outlines the steps necessary for deploying MongoDB on RHEL 6.2,
including tuning system parameters for performance,
reliability and security. The supplied architectural
patterns and system configuration steps can serve as
a basis for deploying multi-node database clusters in
a production environment.
MongoDB Components
mongod
Mongod is the primary component of MongoDB,
responsible for the structured storage system. In single
node or replica set deployments the only component
running is mongod. Sharded configurations also require mongos and config server processes.
mongos
Mongos is a routing and coordination process that
makes the mongod nodes in the cluster look like a
single system. Mongos processes route data requests,
keeping a cached copy of config server information in
memory. Any changes that occur on the config servers
are propagated to each mongos process. Mongos
processes may be run on the shard servers themselves,
but they are lightweight enough to exist on each
application server. Many mongos processes can be
run simultaneously since these processes do not
coordinate between one another.
config
A config server is a mongod process that stores the
metadata of a sharded cluster; this state information is
replicated synchronously across the config servers.
Although a config server can run standalone, production
deployments should run three individual config server
instances, each holding a copy of the same metadata,
for data safety.
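As a rough sketch of how these process types relate, the following script prints (rather than runs) command lines for a config server and a mongos router. The host names are hypothetical, port 27019 and /data/configdb are assumed defaults, and the comma-separated --configdb list reflects MongoDB releases contemporary with this paper; check the flags against your installed version.

```shell
#!/bin/bash
# Hypothetical config server host names; substitute your own.
CONFIG_HOSTS="cfg1.example.net cfg2.example.net cfg3.example.net"
# Assumed config server port.
CONFIG_PORT=27019

# Build the comma-separated host:port list passed to mongos via --configdb.
configdb_arg() {
    local list="" host
    for host in $CONFIG_HOSTS; do
        list="${list:+$list,}${host}:${CONFIG_PORT}"
    done
    printf '%s\n' "$list"
}

# Print the command lines instead of executing them.
echo "mongod --configsvr --dbpath /data/configdb --port ${CONFIG_PORT}"
echo "mongos --configdb $(configdb_arg)"
```

Every mongos is given the same config server list, which is why many mongos processes can run side by side without coordinating with one another.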
About MongoDB
MongoDB is a scalable, high-performance, open
source NoSQL database. MongoDB is built around
storing JSON-style documents, allowing data to be
schema-free and more dynamically represented. It
offers full index support on any attribute, which makes
queries fast and flexible, and it supports atomic in-place
document updates for high-performance, consistent
updates. MongoDB is built from the ground up to scale
for high-performance database access. High availability and read scaling are accomplished via replica
sets, an asynchronous data replication mechanism
in which data is written to the primary node and replicated
to any number of secondary nodes. Applications can
be configured to read from the primary or any of the
secondaries, providing support for high-performance
reads. MongoDB also scales horizontally through sharding.
Deployment Architectures
Building upon these components, the following diagrams
illustrate the architecture of various deployment
types. Beginning with a single node, these
steps and diagrams can guide you through
a production deployment as the scale of the database
load grows. Use them as reference
designs, or starting points, for your own deployment.
[Figure: Replica Set. An application connected to a mongod primary, with mongod secondaries.]
[Figure: Tiered Replica Set. Applications with a mongod primary and mongod secondaries spread across Data Center A, Data Center B and Data Center C.]
[Figure: Sharding, using replica sets as the base for each data shard. Applications, mongos processes and config servers in front of Shard 1, Shard 2 and Shard 3, each a replica set with a mongod primary and secondaries. Config servers manage metadata for sharded configurations (shard keys, data balancing).]
[1] http://www.mongodb.org/display/DOCS/Replica+Sets
[2] http://www.mongodb.org/display/DOCS/Sharding
Installing MongoDB
journal=true
dbpath=/data
logpath=/data/log/mongod.log
Storage Configuration
At this point MongoDB has been installed, but before
we start the service we must configure the data
storage (dbpath parameter above). There are several
options to consider when configuring database
storage including volume arrangement, filesystem
and encryption.
Volume Management
With volumes, you have a choice between using logical
volumes (which map to multiple physical volumes)
or using a RAID configuration (hardware or software-based). Logical volumes (LVM) allow you to map
multiple physical volumes to a single logical device,
with options for striping or mirroring data across the
physical devices. For more information on LVM, refer
to the RHEL Logical Volume Manager [5] documentation. The recommended approach is to use a RAID10
disk configuration, which is a combination of RAID0
(striping) and RAID1 (mirroring). This configuration
typically provides the best combination of performance and reliability. To set up software-based RAID10
storage, install and use the mdadm tools to configure
the volumes.
10gen yum repository definition (for Installing MongoDB):
$ echo "[10gen]
name=10gen Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64
gpgcheck=0" | sudo tee -a /etc/yum.repos.d/10gen.repo
[3] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/entitlements.html
[4] http://www.mongodb.org/display/DOCS/Replica+Sets+-+Oplog
[5] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/ch-lvm.html
$ sudo mdadm --create -l10 -n4 /dev/md0 /dev/sda*
Using ext4:
Or using XFS:
$ echo /dev/md0 /data xfs defaults,auto,noatime,noexec 0 0 | sudo tee -a /etc/fstab
Filesystems
When using RHEL there are two choices of recommended filesystems, ext4 or XFS. Ext4 uses extents,
which improves performance when using large files
and reduces metadata overhead for large files. In
addition, ext4 also labels unallocated block groups
and inode table sections accordingly, which allows
them to be skipped during a file system check. This
makes for quicker file system checks, which becomes
more beneficial as the file system grows in size. XFS
is a scalable, high performance filesystem created to
support extremely large filesystems. XFS supports
metadata journaling, which facilitates quicker crash
recovery. The XFS file system can also be defragmented and enlarged while mounted and active. In
addition, RHEL supports backup and restore utilities
specific to XFS. Note: RHEL also supports the ext3
filesystem; however, it is not recommended for use
with MongoDB due to issues with file allocation and
large file access.
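To make the ext4-versus-XFS choice concrete, here is a minimal sketch that selects the matching mkfs command for a device. The variable names (DEVICE, FSTYPE, DRY_RUN) and the dry-run default are illustrative assumptions, not part of the original procedure; by default the script only prints what it would run.

```shell
#!/bin/bash
# Illustrative settings; /dev/md0 matches the RAID device used in this paper.
DEVICE="${DEVICE:-/dev/md0}"
FSTYPE="${FSTYPE:-ext4}"
# DRY_RUN=1 (the default here) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Map a filesystem type to its mkfs command; ext3 is rejected on purpose.
mkfs_command() {
    case "$1" in
        ext4) echo "mkfs.ext4 $2" ;;
        xfs)  echo "mkfs.xfs $2" ;;
        *)    echo "unsupported filesystem: $1" >&2; return 1 ;;
    esac
}

cmd=$(mkfs_command "$FSTYPE" "$DEVICE") || exit 1
if [ "$DRY_RUN" = "1" ]; then
    echo "would run: sudo $cmd"
else
    # Word splitting of $cmd is intentional: command name plus device.
    sudo $cmd
fi
```

Formatting destroys any existing data on the device, which is the reason for defaulting to a dry run.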
{ "_id" : ObjectId("4f64befde229ee93b9172111"), "a" : 1 }
Storage Encryption
$ sudo mount -t ecryptfs /data /data
Securing MongoDB
$ sudo iptables -A INPUT -m state --state NEW,ESTABLISHED -s 192.168.x.y -p tcp --dport 27017 -j ACCEPT
$ sudo iptables -A INPUT -m state --state NEW,ESTABLISHED -p tcp --dport 28017 -j ACCEPT
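When several application servers need access, the per-source rule above can be generated in a loop. The sketch below only prints the iptables commands; the addresses are hypothetical, and the closing DROP rule is an assumption added for illustration rather than something this paper prescribes.

```shell
#!/bin/bash
# Hypothetical application server addresses.
APP_SERVERS="192.168.1.10 192.168.1.11"
# Port used for mongod in the rules above.
MONGOD_PORT=27017

# Print one ACCEPT rule per application server, then a DROP rule for
# all other traffic aimed at the mongod port (the DROP is an assumption).
mongod_rules() {
    local ip
    for ip in $APP_SERVERS; do
        echo "iptables -A INPUT -m state --state NEW,ESTABLISHED -s $ip -p tcp --dport $MONGOD_PORT -j ACCEPT"
    done
    echo "iptables -A INPUT -p tcp --dport $MONGOD_PORT -j DROP"
}

mongod_rules
```

Rule order matters because iptables evaluates a chain top to bottom, so the ACCEPT rules are emitted before the DROP.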
Database Backups
Depending upon the configuration of your server(s),
there are multiple ways to back up the data stored in
MongoDB. One option is a volume-specific backup,
where the database is issued an fsync+lock command,
which locks the system against incoming writes; the
backup is then taken and the database is unlocked.
Note: this approach should only be used with a
secondary node within a replica set (or within a replica
set inside of a shard) in order to prevent errors in your
app when attempting to write data.
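SECONDARY> use admin
switched to db admin
SECONDARY> db.fsyncLock()
seeAlso : http://www.mongodb.org/display/DOCS/fsync+Command,
ok : 1
#!/bin/bash
# Weekday number (0-6) gives a rotating set of seven backup directories.
suffix=$(date +%w)
mkdir -p "/home/username/backup/mongo-$suffix"
# Dump the databases into the directory for today.
/usr/bin/mongodump -o "/home/username/backup/mongo-$suffix"
$ crontab -e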
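The backup script names its target directory after date +%w, which is what bounds the number of backups kept. The sketch below lists the directories that scheme can produce and shows, as a comment, what a crontab entry might look like; the script file name and the 02:00 schedule are assumptions, not values from this paper.

```shell
#!/bin/bash
# Same backup root as the script above.
BACKUP_ROOT="/home/username/backup"

# date +%w prints the day of the week as 0 (Sunday) through 6 (Saturday),
# so at most seven directories exist and each is reused a week later.
backup_dir() {
    echo "${BACKUP_ROOT}/mongo-$1"
}

for day in 0 1 2 3 4 5 6; do
    backup_dir "$day"
done

# Hypothetical crontab entry (added via crontab -e) that would run the
# backup script daily at 02:00; the script path is an assumption:
# 0 2 * * * /home/username/backup/mongo-backup.sh
```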
System Monitoring
MongoDB and RHEL include several tools to help
monitor the performance of your database instances.
MongoDB includes mongostat (exposes internal
system metrics), a query profiler and diagnostic tools
(via the mongo shell). 10gen also offers the MongoDB
Monitoring Service (MMS), a free SaaS solution for
proactive monitoring of your MongoDB cluster. MMS
requires minimal setup and can be deployed onto your
cluster quickly. To learn more about MMS, refer to the
10gen MongoDB Monitoring Service [13] site. For more
information about MongoDB monitoring tools, refer
to the MongoDB Monitoring and Diagnostics [14]
documentation.
$ sudo cp -a /etc/tune-profiles/throughput-performance /etc/tune-profiles/myprofile
sysctl.ktune
The sysctl settings used by ktune. The format is
identical to the /etc/sysconfig/sysctl file (refer to the
sysctl and sysctl.conf man pages).
ktune.sysconfig
The configuration file of ktune itself, typically /etc/sysconfig/ktune.
ktune.sh
An init-style shell script used by the ktune service
which can run specific commands during system
startup to tune the system.
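As a quick sanity check after copying a profile, the sketch below reports which of the three files named above are present in the copied directory. The function name and the PROFILE_DIR variable are illustrative; the tuned-adm command in the closing comment is how a profile is usually activated on RHEL 6, but verify it against your system before relying on it.

```shell
#!/bin/bash
# Profile directory created by the cp command above.
PROFILE_DIR="${PROFILE_DIR:-/etc/tune-profiles/myprofile}"

# Report whether each ktune-related file exists in the given directory.
check_profile() {
    local dir="$1" f
    for f in sysctl.ktune ktune.sysconfig ktune.sh; do
        if [ -e "$dir/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}

check_profile "$PROFILE_DIR"

# Once the copy has been edited, the profile could be activated with:
#   sudo tuned-adm profile myprofile
```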
[10] http://crontab.org/
[11] http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump
[12] http://www.mongodb.org/display/DOCS/Backups
[13] http://www.10gen.com/mongodb-monitoring-service
[14] http://www.mongodb.org/display/DOCS/Monitoring+and+Diagnostics
Another tool available in RHEL is SystemTap, an application profiler. It is a tracing and probing tool that
allows users to study and monitor the activities of
the operating system (particularly, the kernel) in fine
detail. It provides information similar to the output
of tools like netstat, ps, top, and iostat; however,
SystemTap is designed to provide more filtering and
analysis options for collected information. It is most
useful when other similar tools cannot precisely
pinpoint a bottleneck in the system, requiring a
deep analysis of system activity.
[15] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/SystemTap_Beginners_Guide/index.html
[16] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#nettopsect
[17] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/SystemTap_Beginners_Guide/mainsect-disk.html#disktop
[18] http://sourceware.org/systemtap/wiki/HomePage
Summary
New York Palo Alto Washington, D.C. London Dublin Barcelona Sydney
US (866) 237-8815 INTL +1 (650) 440-4474 info@10gen.com
Copyright 2013 10gen, Inc. All Rights Reserved.