Jump to content

Examine individual changes

This page allows you to examine the variables generated by the Edit Filter for an individual change.

Variables generated for this change

VariableValue
Edit count of the user (user_editcount)
192
Name of the user account (user_name)
'Vitaly Zdanevich'
Age of the user account (user_age)
333235319
Groups (including implicit) the user is in (user_groups)
[ 0 => '*', 1 => 'user', 2 => 'autoconfirmed' ]
Rights that the user has (user_rights)
[ 0 => 'createaccount', 1 => 'read', 2 => 'edit', 3 => 'createtalk', 4 => 'writeapi', 5 => 'viewmywatchlist', 6 => 'editmywatchlist', 7 => 'viewmyprivateinfo', 8 => 'editmyprivateinfo', 9 => 'editmyoptions', 10 => 'abusefilter-log-detail', 11 => 'urlshortener-create-url', 12 => 'centralauth-merge', 13 => 'abusefilter-view', 14 => 'abusefilter-log', 15 => 'vipsscaler-test', 16 => 'collectionsaveasuserpage', 17 => 'reupload-own', 18 => 'move-rootuserpages', 19 => 'createpage', 20 => 'minoredit', 21 => 'editmyusercss', 22 => 'editmyuserjson', 23 => 'editmyuserjs', 24 => 'purge', 25 => 'sendemail', 26 => 'applychangetags', 27 => 'spamblacklistlog', 28 => 'mwoauthmanagemygrants', 29 => 'reupload', 30 => 'upload', 31 => 'move', 32 => 'collectionsaveascommunitypage', 33 => 'autoconfirmed', 34 => 'editsemiprotected', 35 => 'skipcaptcha', 36 => 'transcode-reset', 37 => 'createpagemainns', 38 => 'movestable', 39 => 'autoreview' ]
Whether the user is editing from mobile app (user_app)
false
Whether or not a user is editing through the mobile interface (user_mobile)
false
Page ID (page_id)
34445650
Page namespace (page_namespace)
0
Page title without namespace (page_title)
'Amazon DynamoDB'
Full page title (page_prefixedtitle)
'Amazon DynamoDB'
Edit protection level of the page (page_restrictions_edit)
[]
Page age in seconds (page_age)
258316508
Action (action)
'edit'
Edit summary/reason (summary)
'Added link to YouTube with official lecture'
Old content model (old_content_model)
'wikitext'
New content model (new_content_model)
'wikitext'
Old page wikitext, before the edit (old_wikitext)
'{{Infobox software | logo = DynamoDB.png | name = Amazon DynamoDB | developer = [[Amazon.com]] | released = {{Release date and age|2012|01}} <ref>{{cite web|url=https://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html|title=Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications - All Things Distributed|website=www.allthingsdistributed.com}}</ref> | operating system = [[Cross-platform]] | language = English | genre = {{flatlist| * [[Document-oriented database]] * [[Key-value database]] }} | license = [[Proprietary software|Proprietary]] | website = {{URL|http://aws.amazon.com/dynamodb/}} }} '''Amazon DynamoDB''' is a fully managed proprietary [[NoSQL]] [[database]] service that supports [[key-value]] and document data structures<ref>{{cite web|url=https://aws.amazon.com/dynamodb/faqs/|title=Amazon DynamoDB - FAQs|website=Amazon Web Services, Inc.}}</ref> and is offered by [[Amazon.com]] as part of the [[Amazon Web Services]] portfolio.<ref name='ZDNet 2012-01-19'>{{cite web|url=http://www.zdnet.co.uk/news/cloud/2012/01/19/amazon-switches-on-dynamodb-cloud-database-service-40094849/|title=Amazon switches on DynamoDB cloud database service|accessdate=2012-01-21|first=Jack|last=Clark|date=2012-01-19|website=ZDNet}}</ref> DynamoDB exposes a similar data model to and derives its name from [[Dynamo (storage system)|Dynamo]], but has a different underlying implementation. Dynamo had a multi-master design requiring the client to resolve version conflicts and DynamoDB uses synchronous replication across multiple [[data center]]s<ref>{{cite web|url=https://aws.amazon.com/dynamodb/faqs/#scale_anchor|title=FAQs: Scalability, Availability & Durability|website=Amazon Web Services}}</ref> for high durability and availability. DynamoDB was announced by Amazon CTO [[Werner Vogels]] on January 18, 2012,<ref name='Vogels 2012-01-18'>{{cite web|url=http://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html|title=Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications|accessdate=2012-01-21|first = Werner|last=Vogels|date=2012-01-18|work=All Things Distributed blog}}</ref> and is presented as an evolution of [[Amazon SimpleDB]] solution.<ref>{{Cite web|url=https://aws.amazon.com/dynamodb/faqs/|title=Amazon DynamoDB - FAQs|website=Amazon Web Services, Inc.|access-date=2019-06-03}}</ref> ==Background== Vogels motivates the project in his 2012 announcement.<ref name="Vogels 2012-01-18" /> Amazon began as a decentralized network of services. Originally, services had direct access to each other's databases. When this became a bottleneck on engineering operations, services moved away from this direct access pattern in favor of public-facing [[Application programming interface|API]]<nowiki/>s. Still, third-party [[Relational database|relational database management systems]] struggled to handle Amazon's client base. This culminated during the 2004 holiday season, when several technologies failed under high traffic. Engineers were normalizing these relational systems to reduce [[data redundancy]], a design that optimizes for storage. The sacrifice: they stored a given "item" of data (e.g., the information pertaining to a product in a product database) over several relations, and it takes time to assemble disjoint parts for a query. Many of Amazon's services demanded mostly primary-key reads on their data, and with speed a top priority, putting these pieces together was extremely taxing.<ref name=":1">{{Cite journal|last=DeCandia|first=Giuseppe|last2=Hastorun|first2=Deniz|last3=Jampani|first3=Madan|last4=Kakulapati|first4=Gunavardhan|last5=Lakshman|first5=Avinash|last6=Pilchin|first6=Alex|last7=Sivasubramanian|first7=Swaminathan|last8=Vosshall|first8=Peter|last9=Vogels|first9=Werner|date=October 2007|title=Dynamo: Amazon's Highly Available Key-value Store|journal=SIGOPS Oper. Syst. Rev.|volume=41|issue=6|pages=205–220|doi=10.1145/1323293.1294281|issn=0163-5980}}</ref> Content with compromising storage efficiency, Amazon's response was [[Dynamo (storage system)|Dynamo]]: a highly available key-value store built for internal use.<ref name="Vogels 2012-01-18" /> Dynamo, it seemed, was everything their engineers needed, but adoption lagged. Amazon's developers opted for "just works" design patterns with [[Amazon S3|S3]] and SimpleDB. While these systems had noticeable design flaws, they did not demand the overhead of provisioning hardware and scaling and re-partitioning data. Amazon's next iteration of [[NoSQL]] technology, DynamoDB, automated these database management operations. ==Overview== [[File:Dynamodb-aws-console.png|alt=Web console|thumb|Web console]] DynamoDB differs from other Amazon services by allowing developers to purchase a service based on [[throughput]], rather than [[computer data storage|storage]]. If Auto Scaling is enabled, then the database will [[Scalability|scale]] automatically.<ref>{{Cite web|url=http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html|title=Managing Throughput Capacity Automatically with DynamoDB Auto Scaling|website=Amazon DynamoDB|access-date=2017-07-05}}</ref> Additionally, administrators can request throughput changes and DynamoDB will spread the data and traffic over a number of servers using [[solid-state drive]]s, allowing predictable performance.<ref name='ZDNet 2012-01-19'/> It offers integration with [[Hadoop]] via [[Amazon Elastic MapReduce|Elastic MapReduce]].<ref>{{cite news |last1=Gray |first1=Adam |title=AWS HowTo: Using Amazon Elastic MapReduce with DynamoDB (Guest Post) |url=https://aws.amazon.com/blogs/aws/aws-howto-using-amazon-elastic-mapreduce-with-dynamodb/ |accessdate=29 October 2019 |work=AWS News Blog |date=25 January 2012}}</ref> In September 2013, Amazon made a local development version of DynamoDB available so developers could test DynamoDB-backed applications locally.<ref>{{Cite web|url=https://aws.amazon.com/blogs/aws/dynamodb-local-for-desktop-development/|title=DynamoDB Local for Desktop Development|website=Amazon Web Services|date=12 September 2013|accessdate=13 September 2013}}</ref> ==Development considerations== === Data modeling === A DynamoDB [[Table (database)|table]] features items that have attributes, some of which form a [[primary key]].<ref name=":0">{{Cite web|url=https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html|title=Amazon DynamoDB Developer Guide|date=August 10, 2012|website=AWS|access-date=July 18, 2019}}</ref> In relational systems, however, an item features each table attribute (or juggles "null" and "unknown" values in their absence), DynamoDB items are schema-less. The only exception: when creating a table, a developer specifies a primary key, and the table requires a key for every item. Primary keys must be scalar ([[String (computer science)|strings]], numbers, or [[Binary number|binary]]) and can take one of two forms. A single-attribute primary key is known as the table's "partition key", which determines the partition that an item [[Hash (computer science)|hashes]] to––more on partitioning below––so an ideal partition key has a uniform distribution over its range. A primary key can also feature a second attribute, which DynamoDB calls the table's "sort key". In this case, partition keys do not have to be unique; they are paired with sort keys to make a unique identifier for each item. The partition key is still used to determine which partition the item is stored in, but within each partition, items are sorted by the sort key. === Indices === In the relational model, indices typically serve as "helper" [[data structure]]s to supplement a table. They allow the DBMS to optimize queries under the hood and they do not improve query functionality. In DynamoDB, there is no [[Query optimization|query optimizer]], and an index is simply another table with a different key (or two) that sits beside the original.<ref name=":0" /> When a developer creates an index, she creates a new copy of her data, but only the fields that she specifies get copied over (at a minimum, the fields that she indexes on and the original table's primary key). DynamoDB users issue queries directly to their indices. There are two types of indices available. A global secondary index features a partition key (and optional sort key) that's different from the original table's partition key. A local secondary index features the same partition key as the original table, but a different sort key. Both indices introduce entirely new query functionality to a DynamoDB database by allowing queries on new keys. Similar to relational database management systems, DynamoDB updates indices automatically on addition/update/deletion, so you must be judicious when creating them or risk slowing down a write-heavy database with a slew of index updates. === Syntax === DynamoDB uses [[JSON]] for its syntax because of its ubiquity.{{citation needed|date=October 2019}} The create table action demands just three arguments: TableName, KeySchema––a list containing a partition key and an optional sort key––and AttributeDefinitions––a list of attributes to be defined which must at least contain definitions for the attributes used as partition and sort keys. Whereas [[relational database]]s offer robust query languages, DynamoDB offers just Put, Get, Update, and Delete operations. Put requests contain the TableName attribute and an Item attribute, which consists of all the attributes and values the item has. An Update request follows the same syntax. Similarly, to get or delete an item, simply specify a TableName and Key. ==System architecture== === Data structures === DynamoDB uses [[hash function|hashing]] and [[B-tree]]s to manage data. Upon entry, data is first distributed into different partitions by hashing on the partition key. Each partition can store up to 10GB of data and handle by default 1,000 write capacity units (WCU) and 3,000 read capacity units (RCU).<ref name=":2">{{Cite web|url=https://shinesolutions.com/2016/06/27/a-deep-dive-into-dynamodb-partitions/|title=A Deep Dive into DynamoDB Partitions|last=Gunasekara|first=Archie|date=2016-06-27|website=Shine Solutions Group|language=en|access-date=2019-08-03}}</ref> One RCU represents one [[Strong consistency|strongly consistent]] read per second or two [[Eventual consistency|eventually consistent]] reads per second for items up to 4KB in size.<ref name=":0" /> One WCU represents one write per second for an item up to 1KB in size. To prevent data loss, DynamoDB features a two-tier backup system of replication and long-term storage.<ref name=":3">{{Citation|title=AWS re:Invent 2018: Amazon DynamoDB Under the Hood: How We Built a Hyper-Scale Database (DAT321)|url=https://www.youtube.com/watch?v=yvBR71D0nAQ|language=en|access-date=2019-08-03}}</ref> Each partition features three nodes, each of which contains a copy of that partition's data. Each node also contains two data structures: a B tree used to locate items, and a replication log that notes all changes made to the node. DynamoDB periodically takes snapshots of these two data structures and stores them for a month in [[Amazon S3|S3]] so that engineers can perform point-in-time restores of their databases. Within each partition, one of the three nodes is designated the "leader node". All write operations travel first through the leader node before propagating, which makes writes consistent in DynamoDB. To maintain its status, the leader sends a "heartbeat" to each other node every 1.5 seconds. Should another node stop receiving heartbeats, it can initiate a new leader election. DynamoDB uses the [[Paxos (computer science)|Paxos algorithm]] to elect leaders. Amazon engineers originally avoided Dynamo due to engineering overheads like provisioning and managing partitions and nodes.<ref name=":1" /> In response, the DynamoDB team built a service it calls AutoAdmin to manage a database.<ref name=":3" /> AutoAdmin replaces a node when it stops responding by copying data from another node. When a partition exceeds any of its three thresholds (RCU, WCU, or 10GB), AutoAdmin will automatically add additional partitions to further segment the data.<ref name=":2" /> Just like indexing systems in the relational model, DynamoDB demands that any updates to a table be reflected in each of the table's indices. DynamoDB handles this using a service it calls the "log propagator", which subscribes to the replication logs in each node and sends additional Put, Update, and Delete requests to indices as necessary.<ref name=":3" /> Because indices result in substantial performance hits for write requests, DynamoDB allows a user at most five of them on any given table.{{citation needed|date=October 2019}} === Query execution === Suppose that a DynamoDB user issues a write operation (a Put, Update, or Delete). While a typical relational system would convert the SQL query to [[relational algebra]] and run optimization algorithms, DynamoDB skips both processes and gets right to work.<ref name=":3" /> The request arrives at the DynamoDB request router, which authenticates––"Is the request coming from where/whom it claims to be?"––and checks for authorization––"Does the user submitting the request have the requisite permissions?" Assuming these checks pass, the system hashes the request's partition key to arrive in the appropriate partition. There are three nodes within, each with a copy of the partition's data. The system first writes to the leader node, then writes to a second node, then sends a "success" message, and finally continues propagating to the third node. Writes are consistent because they always travel first through the leader node. Finally, the log propagator propagates the change to all indices. For each index, it grabs that index's primary key value from the item, then performs the same write on that index without log propagation. If the operation is an Update to a preexisting item, the updated attribute may serve as a primary key for an index, and thus the B tree for that index must update as well. B trees only handle insert, delete, and read operations, so in practice, when the log propagator receives an Update operation, it issues both a Delete operation and a Put operation to all indices. Now suppose that a DynamoDB user issues a Get operation. The request router proceeds as before with authentication and authorization. Next, as above, we hash our partition key to arrive in the appropriate hash. Now, we encounter a problem: with three nodes in eventual consistency with one another, how can we decide which to investigate? DynamoDB affords the user two options when issuing a read: consistent and eventually consistent. A consistent read visits the leader node. But the consistency-availability trade-off rears its head again here: in read-heavy systems, always reading from the leader can overwhelm a single node and reduce availability. The second option, an [[Eventual consistency|eventually]] consistent read, selects a random node. In practice, this is where DynamoDB trades consistency for availability. If we take this route, what are the odds of an inconsistency? We'd need a write operation to return "success" and begin propagating to the third node, but not finish. We'd also need our Get to target this third node. This means a 1-in-3 chance of inconsistency within the write operation's propagation window. How long is this window? Any number of catastrophes could cause a node to fall behind, but in the vast majority of cases, the third node is up-to-date within milliseconds of the leader. ==Performance== DynamoDB exposes performance metrics that help users provision it correctly and keep applications using DynamoDB running smoothly: * Requests and throttling * Errors: ConditionalCheckFailedRequests, UserErrors, SystemErrors * Metrics related to [http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html Global Secondary Index] creation<ref>{{cite web|url=https://www.datadoghq.com/blog/top-dynamodb-performance-metrics/|title=Top DynamoDB performance metrics}}</ref> These metrics can be tracked using the [[Amazon Web Services|AWS]] Management Console, using the AWS [[Command Line Interface]], or a monitoring tool integrating with [[Amazon CloudWatch]].<ref>{{cite web|url=https://www.datadoghq.com/blog/how-to-collect-dynamodb-metrics/|title=How to collect DynamoDB metrics}}</ref> ==Language bindings== Languages and frameworks with a DynamoDB [[language binding|binding]] include [[Java (programming language)|Java]], [[JavaScript]], [[Node.js]], [[Go (programming language)|Go]], [[C Sharp (programming language)|C#]] [[.NET Framework|.NET]], [[Perl]], [[PHP]], [[Python (programming language)|Python]], [[Ruby (programming language)|Ruby]], [[Haskell (programming language)|Haskell]], [[Erlang (programming language)|Erlang]], [[Django (web framework)|Django]], and [[Grails (framework)|Grails]].<ref>{{cite web|url=https://aws.amazon.com/blogs/aws/amazon-dynamodb-libraries-mappers-and-mock-implementations-galore/|title=Amazon DynamoDB Libraries, Mappers, and Mock Implementations Galore!|website=Amazon Web Services}}</ref> == Code examples == Against [https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html HTTP API], query items:<syntaxhighlight lang="json"> POST / HTTP/1.1 Host: dynamodb.<region>.<domain>; Accept-Encoding: identity Content-Length: <PayloadSizeBytes> User-Agent: <UserAgentString> Content-Type: application/x-amz-json-1.0 Authorization: AWS4-HMAC-SHA256 Credential=<Credential>, SignedHeaders=<Headers>, Signature=<Signature> X-Amz-Date: <Date> X-Amz-Target: DynamoDB_20120810.Query { "TableName": "Reply", "IndexName": "PostedBy-Index", "Limit": 3, "ConsistentRead": true, "ProjectionExpression": "Id, PostedBy, ReplyDateTime", "KeyConditionExpression": "Id = :v1 AND PostedBy BETWEEN :v2a AND :v2b", "ExpressionAttributeValues": { ":v1": {"S": "Amazon DynamoDB#DynamoDB Thread 1"}, ":v2a": {"S": "User A"}, ":v2b": {"S": "User C"} }, "ReturnConsumedCapacity": "TOTAL" } </syntaxhighlight>Sample response:<syntaxhighlight lang="json"> HTTP/1.1 200 OK x-amzn-RequestId: <RequestId> x-amz-crc32: <Checksum> Content-Type: application/x-amz-json-1.0 Content-Length: <PayloadSizeBytes> Date: <Date> { "ConsumedCapacity": { "CapacityUnits": 1, "TableName": "Reply" }, "Count": 2, "Items": [ { "ReplyDateTime": {"S": "2015-02-18T20:27:36.165Z"}, "PostedBy": {"S": "User A"}, "Id": {"S": "Amazon DynamoDB#DynamoDB Thread 1"} }, { "ReplyDateTime": {"S": "2015-02-25T20:27:36.165Z"}, "PostedBy": {"S": "User B"}, "Id": {"S": "Amazon DynamoDB#DynamoDB Thread 1"} } ], "ScannedCount": 2 } </syntaxhighlight>GetItem in [[Go (programming language)|Go]]:<syntaxhighlight lang="go"> getItemInput := &dynamodb.GetItemInput{ TableName: aws.String("happy-marketer"), Key: map[string]*dynamodb.AttributeValue{ "pk": { S: aws.String("project"), }, "sk": { S: aws.String(email + " " + name), }, }, } getItemOutput, err := dynamodbClient.GetItem(getItemInput) </syntaxhighlight>DeleteItem in [[Go (programming language)|Go]]:<syntaxhighlight lang="go"> deleteItemInput := &dynamodb.DeleteItemInput{ TableName: aws.String("happy-marketer"), Key: map[string]*dynamodb.AttributeValue{ "pk": { S: aws.String("project"), }, "sk": { S: aws.String(email + " " + name), }, }, } _, err := dynamodbClient.DeleteItem(deleteItemInput) if err != nil { panic(err) } </syntaxhighlight> ==See also== *[[Amazon Aurora]] *[[Amazon_DocumentDB|Amazon DocumentDB (with MongoDB compatibility)]] *[[Amazon Redshift]] *[[Amazon Relational Database Service]] ==References== {{reflist}} == External links == <!-- Per [[WP:ELMINOFFICIAL]], choose one official website only --> * {{Official website|http://aws.amazon.com/dynamodb/}} {{Amazon}} {{Cloud computing}} <!--Categories--> [[Category:Amazon Web Services|DynamoDB]] [[Category:Cloud storage]] [[Category:Distributed data stores]] [[Category:Structured storage]] [[Category:NoSQL products]] [[Category:Cloud databases]] [[Category:Computer-related introductions in 2012]] [[Category:Key-value databases]]'
New page wikitext, after the edit (new_wikitext)
'{{Infobox software | logo = DynamoDB.png | name = Amazon DynamoDB | developer = [[Amazon.com]] | released = {{Release date and age|2012|01}} <ref>{{cite web|url=https://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html|title=Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications - All Things Distributed|website=www.allthingsdistributed.com}}</ref> | operating system = [[Cross-platform]] | language = English | genre = {{flatlist| * [[Document-oriented database]] * [[Key-value database]] }} | license = [[Proprietary software|Proprietary]] | website = {{URL|http://aws.amazon.com/dynamodb/}} }} '''Amazon DynamoDB''' is a fully managed proprietary [[NoSQL]] [[database]] service that supports [[key-value]] and document data structures<ref>{{cite web|url=https://aws.amazon.com/dynamodb/faqs/|title=Amazon DynamoDB - FAQs|website=Amazon Web Services, Inc.}}</ref> and is offered by [[Amazon.com]] as part of the [[Amazon Web Services]] portfolio.<ref name='ZDNet 2012-01-19'>{{cite web|url=http://www.zdnet.co.uk/news/cloud/2012/01/19/amazon-switches-on-dynamodb-cloud-database-service-40094849/|title=Amazon switches on DynamoDB cloud database service|accessdate=2012-01-21|first=Jack|last=Clark|date=2012-01-19|website=ZDNet}}</ref> DynamoDB exposes a similar data model to and derives its name from [[Dynamo (storage system)|Dynamo]], but has a different underlying implementation. Dynamo had a multi-master design requiring the client to resolve version conflicts and DynamoDB uses synchronous replication across multiple [[data center]]s<ref>{{cite web|url=https://aws.amazon.com/dynamodb/faqs/#scale_anchor|title=FAQs: Scalability, Availability & Durability|website=Amazon Web Services}}</ref> for high durability and availability. DynamoDB was announced by Amazon CTO [[Werner Vogels]] on January 18, 2012,<ref name='Vogels 2012-01-18'>{{cite web|url=http://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html|title=Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications|accessdate=2012-01-21|first = Werner|last=Vogels|date=2012-01-18|work=All Things Distributed blog}}</ref> and is presented as an evolution of [[Amazon SimpleDB]] solution.<ref>{{Cite web|url=https://aws.amazon.com/dynamodb/faqs/|title=Amazon DynamoDB - FAQs|website=Amazon Web Services, Inc.|access-date=2019-06-03}}</ref> ==Background== Vogels motivates the project in his 2012 announcement.<ref name="Vogels 2012-01-18" /> Amazon began as a decentralized network of services. Originally, services had direct access to each other's databases. When this became a bottleneck on engineering operations, services moved away from this direct access pattern in favor of public-facing [[Application programming interface|API]]<nowiki/>s. Still, third-party [[Relational database|relational database management systems]] struggled to handle Amazon's client base. This culminated during the 2004 holiday season, when several technologies failed under high traffic. Engineers were normalizing these relational systems to reduce [[data redundancy]], a design that optimizes for storage. The sacrifice: they stored a given "item" of data (e.g., the information pertaining to a product in a product database) over several relations, and it takes time to assemble disjoint parts for a query. Many of Amazon's services demanded mostly primary-key reads on their data, and with speed a top priority, putting these pieces together was extremely taxing.<ref name=":1">{{Cite journal|last=DeCandia|first=Giuseppe|last2=Hastorun|first2=Deniz|last3=Jampani|first3=Madan|last4=Kakulapati|first4=Gunavardhan|last5=Lakshman|first5=Avinash|last6=Pilchin|first6=Alex|last7=Sivasubramanian|first7=Swaminathan|last8=Vosshall|first8=Peter|last9=Vogels|first9=Werner|date=October 2007|title=Dynamo: Amazon's Highly Available Key-value Store|journal=SIGOPS Oper. Syst. Rev.|volume=41|issue=6|pages=205–220|doi=10.1145/1323293.1294281|issn=0163-5980}}</ref> Content with compromising storage efficiency, Amazon's response was [[Dynamo (storage system)|Dynamo]]: a highly available key-value store built for internal use.<ref name="Vogels 2012-01-18" /> Dynamo, it seemed, was everything their engineers needed, but adoption lagged. Amazon's developers opted for "just works" design patterns with [[Amazon S3|S3]] and SimpleDB. While these systems had noticeable design flaws, they did not demand the overhead of provisioning hardware and scaling and re-partitioning data. Amazon's next iteration of [[NoSQL]] technology, DynamoDB, automated these database management operations. ==Overview== [[File:Dynamodb-aws-console.png|alt=Web console|thumb|Web console]] DynamoDB differs from other Amazon services by allowing developers to purchase a service based on [[throughput]], rather than [[computer data storage|storage]]. If Auto Scaling is enabled, then the database will [[Scalability|scale]] automatically.<ref>{{Cite web|url=http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html|title=Managing Throughput Capacity Automatically with DynamoDB Auto Scaling|website=Amazon DynamoDB|access-date=2017-07-05}}</ref> Additionally, administrators can request throughput changes and DynamoDB will spread the data and traffic over a number of servers using [[solid-state drive]]s, allowing predictable performance.<ref name='ZDNet 2012-01-19'/> It offers integration with [[Hadoop]] via [[Amazon Elastic MapReduce|Elastic MapReduce]].<ref>{{cite news |last1=Gray |first1=Adam |title=AWS HowTo: Using Amazon Elastic MapReduce with DynamoDB (Guest Post) |url=https://aws.amazon.com/blogs/aws/aws-howto-using-amazon-elastic-mapreduce-with-dynamodb/ |accessdate=29 October 2019 |work=AWS News Blog |date=25 January 2012}}</ref> In September 2013, Amazon made a local development version of DynamoDB available so developers could test DynamoDB-backed applications locally.<ref>{{Cite web|url=https://aws.amazon.com/blogs/aws/dynamodb-local-for-desktop-development/|title=DynamoDB Local for Desktop Development|website=Amazon Web Services|date=12 September 2013|accessdate=13 September 2013}}</ref> ==Development considerations== === Data modeling === A DynamoDB [[Table (database)|table]] features items that have attributes, some of which form a [[primary key]].<ref name=":0">{{Cite web|url=https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html|title=Amazon DynamoDB Developer Guide|date=August 10, 2012|website=AWS|access-date=July 18, 2019}}</ref> In relational systems, however, an item features each table attribute (or juggles "null" and "unknown" values in their absence), DynamoDB items are schema-less. The only exception: when creating a table, a developer specifies a primary key, and the table requires a key for every item. Primary keys must be scalar ([[String (computer science)|strings]], numbers, or [[Binary number|binary]]) and can take one of two forms. A single-attribute primary key is known as the table's "partition key", which determines the partition that an item [[Hash (computer science)|hashes]] to––more on partitioning below––so an ideal partition key has a uniform distribution over its range. A primary key can also feature a second attribute, which DynamoDB calls the table's "sort key". In this case, partition keys do not have to be unique; they are paired with sort keys to make a unique identifier for each item. The partition key is still used to determine which partition the item is stored in, but within each partition, items are sorted by the sort key. === Indices === In the relational model, indices typically serve as "helper" [[data structure]]s to supplement a table. They allow the DBMS to optimize queries under the hood and they do not improve query functionality. In DynamoDB, there is no [[Query optimization|query optimizer]], and an index is simply another table with a different key (or two) that sits beside the original.<ref name=":0" /> When a developer creates an index, she creates a new copy of her data, but only the fields that she specifies get copied over (at a minimum, the fields that she indexes on and the original table's primary key). DynamoDB users issue queries directly to their indices. There are two types of indices available. A global secondary index features a partition key (and optional sort key) that's different from the original table's partition key. A local secondary index features the same partition key as the original table, but a different sort key. Both indices introduce entirely new query functionality to a DynamoDB database by allowing queries on new keys. Similar to relational database management systems, DynamoDB updates indices automatically on addition/update/deletion, so you must be judicious when creating them or risk slowing down a write-heavy database with a slew of index updates. === Syntax === DynamoDB uses [[JSON]] for its syntax because of its ubiquity.{{citation needed|date=October 2019}} The create table action demands just three arguments: TableName, KeySchema––a list containing a partition key and an optional sort key––and AttributeDefinitions––a list of attributes to be defined which must at least contain definitions for the attributes used as partition and sort keys. Whereas [[relational database]]s offer robust query languages, DynamoDB offers just Put, Get, Update, and Delete operations. Put requests contain the TableName attribute and an Item attribute, which consists of all the attributes and values the item has. An Update request follows the same syntax. Similarly, to get or delete an item, simply specify a TableName and Key. ==System architecture== === Data structures === DynamoDB uses [[hash function|hashing]] and [[B-tree]]s to manage data. Upon entry, data is first distributed into different partitions by hashing on the partition key. Each partition can store up to 10GB of data and handle by default 1,000 write capacity units (WCU) and 3,000 read capacity units (RCU).<ref name=":2">{{Cite web|url=https://shinesolutions.com/2016/06/27/a-deep-dive-into-dynamodb-partitions/|title=A Deep Dive into DynamoDB Partitions|last=Gunasekara|first=Archie|date=2016-06-27|website=Shine Solutions Group|language=en|access-date=2019-08-03}}</ref> One RCU represents one [[Strong consistency|strongly consistent]] read per second or two [[Eventual consistency|eventually consistent]] reads per second for items up to 4KB in size.<ref name=":0" /> One WCU represents one write per second for an item up to 1KB in size. To prevent data loss, DynamoDB features a two-tier backup system of replication and long-term storage.<ref name=":3">{{Citation|title=AWS re:Invent 2018: Amazon DynamoDB Under the Hood: How We Built a Hyper-Scale Database (DAT321)|url=https://www.youtube.com/watch?v=yvBR71D0nAQ|language=en|access-date=2019-08-03}}</ref> Each partition features three nodes, each of which contains a copy of that partition's data. Each node also contains two data structures: a B tree used to locate items, and a replication log that notes all changes made to the node. DynamoDB periodically takes snapshots of these two data structures and stores them for a month in [[Amazon S3|S3]] so that engineers can perform point-in-time restores of their databases. Within each partition, one of the three nodes is designated the "leader node". All write operations travel first through the leader node before propagating, which makes writes consistent in DynamoDB. To maintain its status, the leader sends a "heartbeat" to each other node every 1.5 seconds. Should another node stop receiving heartbeats, it can initiate a new leader election. DynamoDB uses the [[Paxos (computer science)|Paxos algorithm]] to elect leaders. Amazon engineers originally avoided Dynamo due to engineering overheads like provisioning and managing partitions and nodes.<ref name=":1" /> In response, the DynamoDB team built a service it calls AutoAdmin to manage a database.<ref name=":3" /> AutoAdmin replaces a node when it stops responding by copying data from another node. When a partition exceeds any of its three thresholds (RCU, WCU, or 10GB), AutoAdmin will automatically add additional partitions to further segment the data.<ref name=":2" /> Just like indexing systems in the relational model, DynamoDB demands that any updates to a table be reflected in each of the table's indices. DynamoDB handles this using a service it calls the "log propagator", which subscribes to the replication logs in each node and sends additional Put, Update, and Delete requests to indices as necessary.<ref name=":3" /> Because indices result in substantial performance hits for write requests, DynamoDB allows a user at most five of them on any given table.{{citation needed|date=October 2019}} === Query execution === Suppose that a DynamoDB user issues a write operation (a Put, Update, or Delete). While a typical relational system would convert the SQL query to [[relational algebra]] and run optimization algorithms, DynamoDB skips both processes and gets right to work.<ref name=":3" /> The request arrives at the DynamoDB request router, which authenticates––"Is the request coming from where/whom it claims to be?"––and checks for authorization––"Does the user submitting the request have the requisite permissions?" Assuming these checks pass, the system hashes the request's partition key to arrive in the appropriate partition. There are three nodes within, each with a copy of the partition's data. The system first writes to the leader node, then writes to a second node, then sends a "success" message, and finally continues propagating to the third node. Writes are consistent because they always travel first through the leader node. Finally, the log propagator propagates the change to all indices. For each index, it grabs that index's primary key value from the item, then performs the same write on that index without log propagation. If the operation is an Update to a preexisting item, the updated attribute may serve as a primary key for an index, and thus the B tree for that index must update as well. B trees only handle insert, delete, and read operations, so in practice, when the log propagator receives an Update operation, it issues both a Delete operation and a Put operation to all indices. Now suppose that a DynamoDB user issues a Get operation. The request router proceeds as before with authentication and authorization. Next, as above, we hash our partition key to arrive in the appropriate hash. Now, we encounter a problem: with three nodes in eventual consistency with one another, how can we decide which to investigate? DynamoDB affords the user two options when issuing a read: consistent and eventually consistent. A consistent read visits the leader node. But the consistency-availability trade-off rears its head again here: in read-heavy systems, always reading from the leader can overwhelm a single node and reduce availability. The second option, an [[Eventual consistency|eventually]] consistent read, selects a random node. In practice, this is where DynamoDB trades consistency for availability. If we take this route, what are the odds of an inconsistency? We'd need a write operation to return "success" and begin propagating to the third node, but not finish. We'd also need our Get to target this third node. This means a 1-in-3 chance of inconsistency within the write operation's propagation window. How long is this window? Any number of catastrophes could cause a node to fall behind, but in the vast majority of cases, the third node is up-to-date within milliseconds of the leader. ==Performance== DynamoDB exposes performance metrics that help users provision it correctly and keep applications using DynamoDB running smoothly: * Requests and throttling * Errors: ConditionalCheckFailedRequests, UserErrors, SystemErrors * Metrics related to [http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html Global Secondary Index] creation<ref>{{cite web|url=https://www.datadoghq.com/blog/top-dynamodb-performance-metrics/|title=Top DynamoDB performance metrics}}</ref> These metrics can be tracked using the [[Amazon Web Services|AWS]] Management Console, using the AWS [[Command Line Interface]], or a monitoring tool integrating with [[Amazon CloudWatch]].<ref>{{cite web|url=https://www.datadoghq.com/blog/how-to-collect-dynamodb-metrics/|title=How to collect DynamoDB metrics}}</ref> ==Language bindings== Languages and frameworks with a DynamoDB [[language binding|binding]] include [[Java (programming language)|Java]], [[JavaScript]], [[Node.js]], [[Go (programming language)|Go]], [[C Sharp (programming language)|C#]] [[.NET Framework|.NET]], [[Perl]], [[PHP]], [[Python (programming language)|Python]], [[Ruby (programming language)|Ruby]], [[Haskell (programming language)|Haskell]], [[Erlang (programming language)|Erlang]], [[Django (web framework)|Django]], and [[Grails (framework)|Grails]].<ref>{{cite web|url=https://aws.amazon.com/blogs/aws/amazon-dynamodb-libraries-mappers-and-mock-implementations-galore/|title=Amazon DynamoDB Libraries, Mappers, and Mock Implementations Galore!|website=Amazon Web Services}}</ref> == Code examples == Against [https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html HTTP API], query items:<syntaxhighlight lang="json"> POST / HTTP/1.1 Host: dynamodb.<region>.<domain>; Accept-Encoding: identity Content-Length: <PayloadSizeBytes> User-Agent: <UserAgentString> Content-Type: application/x-amz-json-1.0 Authorization: AWS4-HMAC-SHA256 Credential=<Credential>, SignedHeaders=<Headers>, Signature=<Signature> X-Amz-Date: <Date> X-Amz-Target: DynamoDB_20120810.Query { "TableName": "Reply", "IndexName": "PostedBy-Index", "Limit": 3, "ConsistentRead": true, "ProjectionExpression": "Id, PostedBy, ReplyDateTime", "KeyConditionExpression": "Id = :v1 AND PostedBy BETWEEN :v2a AND :v2b", "ExpressionAttributeValues": { ":v1": {"S": "Amazon DynamoDB#DynamoDB Thread 1"}, ":v2a": {"S": "User A"}, ":v2b": {"S": "User C"} }, "ReturnConsumedCapacity": "TOTAL" } </syntaxhighlight>Sample response:<syntaxhighlight lang="json"> HTTP/1.1 200 OK x-amzn-RequestId: <RequestId> x-amz-crc32: <Checksum> Content-Type: application/x-amz-json-1.0 Content-Length: <PayloadSizeBytes> Date: <Date> { "ConsumedCapacity": { "CapacityUnits": 1, "TableName": "Reply" }, "Count": 2, "Items": [ { "ReplyDateTime": {"S": "2015-02-18T20:27:36.165Z"}, "PostedBy": {"S": "User A"}, "Id": {"S": "Amazon DynamoDB#DynamoDB Thread 1"} }, { "ReplyDateTime": {"S": "2015-02-25T20:27:36.165Z"}, "PostedBy": {"S": "User B"}, "Id": {"S": "Amazon DynamoDB#DynamoDB Thread 1"} } ], "ScannedCount": 2 } </syntaxhighlight>GetItem in [[Go (programming language)|Go]]:<syntaxhighlight lang="go"> getItemInput := &dynamodb.GetItemInput{ TableName: aws.String("happy-marketer"), Key: map[string]*dynamodb.AttributeValue{ "pk": { S: aws.String("project"), }, "sk": { S: aws.String(email + " " + name), }, }, } getItemOutput, err := dynamodbClient.GetItem(getItemInput) </syntaxhighlight>DeleteItem in [[Go (programming language)|Go]]:<syntaxhighlight lang="go"> deleteItemInput := &dynamodb.DeleteItemInput{ TableName: aws.String("happy-marketer"), Key: map[string]*dynamodb.AttributeValue{ "pk": { S: aws.String("project"), }, "sk": { S: aws.String(email + " " + name), }, }, } _, err := dynamodbClient.DeleteItem(deleteItemInput) if err != nil { panic(err) } </syntaxhighlight> ==See also== *[[Amazon Aurora]] *[[Amazon_DocumentDB|Amazon DocumentDB (with MongoDB compatibility)]] *[[Amazon Redshift]] *[[Amazon Relational Database Service]] ==References== {{reflist}} == External links == <!-- Per [[WP:ELMINOFFICIAL]], choose one official website only --> * {{Official website|http://aws.amazon.com/dynamodb/}} *[https://www.youtube.com/watch?v=6yqfmXiZTlM <nowiki>Video: AWS re:Invent 2019: [REPEAT 1] Amazon DynamoDB deep dive: Advanced design patterns (DAT403-R1)</nowiki>] {{Amazon}} {{Cloud computing}} <!--Categories--> [[Category:Amazon Web Services|DynamoDB]] [[Category:Cloud storage]] [[Category:Distributed data stores]] [[Category:Structured storage]] [[Category:NoSQL products]] [[Category:Cloud databases]] [[Category:Computer-related introductions in 2012]] [[Category:Key-value databases]]'
Unified diff of changes made by edit (edit_diff)
'@@ -172,4 +172,5 @@ <!-- Per [[WP:ELMINOFFICIAL]], choose one official website only --> * {{Official website|http://aws.amazon.com/dynamodb/}} +*[https://www.youtube.com/watch?v=6yqfmXiZTlM <nowiki>Video: AWS re:Invent 2019: [REPEAT 1] Amazon DynamoDB deep dive: Advanced design patterns (DAT403-R1)</nowiki>] {{Amazon}} '
New page size (new_size)
20632
Old page size (old_size)
20466
Size change in edit (edit_delta)
166
Lines added in edit (added_lines)
[ 0 => '*[https://www.youtube.com/watch?v=6yqfmXiZTlM <nowiki>Video: AWS re:Invent 2019: [REPEAT 1] Amazon DynamoDB deep dive: Advanced design patterns (DAT403-R1)</nowiki>]' ]
Lines removed in edit (removed_lines)
[]
Whether or not the change was made through a Tor exit node (tor_exit_node)
false
Unix timestamp of change (timestamp)
1585482816