`articles/cosmos-db/modeling-data.md` — 7 additions, 7 deletions
```diff
@@ -67,7 +67,7 @@ Now let's take a look at how we would model the same data as a self-contained en
     ]
 }
 
-Using the approach above we have now **denormalized** the person record where we **embedded** all the information relating to this person, such as their contact details and addresses, in to a single JSON document.
+Using the approach above we have now **denormalized** the person record where we **embedded** all the information relating to this person, such as their contact details and addresses, into a single JSON document.
 In addition, because we're not confined to a fixed schema we have the flexibility to do things like having contact details of different shapes entirely.
 
 Retrieving a complete person record from the database is now a single read operation against a single collection and for a single document. Updating a person record, with their contact details and addresses, is also a single write operation against a single document.
```
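The embedded model this hunk describes can be sketched in a few lines. This is a minimal illustration, not the article's exact sample document: the field names and values are hypothetical, and a plain dict stands in for the document store.

```python
# Hypothetical denormalized person document: contact details and
# addresses are embedded, so one read returns the whole record.
person = {
    "id": "1",
    "firstName": "Thomas",
    "lastName": "Andersen",
    "addresses": [
        {"line1": "100 Some Street", "city": "Seattle", "state": "WA"}
    ],
    "contactDetails": [
        {"email": "thomas@contoso.example"},
        {"phone": "+1 555 555-5555", "extension": 5555},  # a differently shaped entry
    ],
}

# One logical read yields everything about the person; no joins needed.
def load_person(store, person_id):
    return store[person_id]

store = {"1": person}
full_record = load_person(store, "1")
```

Note how the two `contactDetails` entries have different shapes, which a fixed relational schema would not allow.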
```diff
@@ -168,7 +168,7 @@ Take this JSON snippet.
 ]
 }
 
-This could represent a person's stock portfolio. We have chosen to embed the stock information in to each portfolio document. In an environment where related data is changing frequently, like a stock trading application, embedding data that changes frequently is going to mean that you are constantly updating each portfolio document every time a stock is traded.
+This could represent a person's stock portfolio. We have chosen to embed the stock information into each portfolio document. In an environment where related data is changing frequently, like a stock trading application, embedding data that changes frequently is going to mean that you are constantly updating each portfolio document every time a stock is traded.
 
 Stock *zaza* may be traded many hundreds of times in a single day and thousands of users could have *zaza* on their portfolio. With a data model like the above we would have to update many thousands of portfolio documents many times every day leading to a system that won't scale well.
 
```
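The update fan-out the portfolio hunk warns about is simple arithmetic. The numbers below are purely illustrative (not from the article), but they show why embedding frequently-changing stock data multiplies writes:

```python
# If a stock's data is embedded in every portfolio that holds it,
# each trade must touch every one of those portfolio documents.
portfolios_holding_zaza = 10_000  # hypothetical holder count
trades_per_day = 500              # hypothetical trade volume for *zaza*

# Every trade rewrites every holding portfolio document.
document_updates_per_day = portfolios_holding_zaza * trades_per_day
```

Referencing the stock from the portfolio instead would reduce this to 500 updates per day on a single stock document.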
```diff
@@ -255,7 +255,7 @@ If we look at the JSON below that models publishers and books.
 ...
 {"id": "100", "name": "Learn about Azure Cosmos DB" }
 ...
-{"id": "1000", "name": "Deep Dive in to Azure Cosmos DB" }
+{"id": "1000", "name": "Deep Dive into Azure Cosmos DB" }
 
 If the number of the books per publisher is small with limited growth, then storing the book reference inside the publisher document may be useful. However, if the number of books per publisher is unbounded, then this data model would lead to mutable, growing arrays, as in the example publisher document above.
 
```
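The unbounded-array problem in this hunk can be sketched as follows. The structure is an assumption modeled on the article's publisher/book example; the point is that every publish mutates and grows one hot document:

```python
# Publisher document that embeds a growing "books" array.
publisher = {"id": "mspress", "name": "Microsoft Press", "books": []}

def publish_book(pub, book_id, name):
    # Each new book is a write to the *publisher* document,
    # whose embedded array grows without bound.
    pub["books"].append({"id": book_id, "name": name})

for i in range(1, 1001):
    publish_book(publisher, str(i), f"Book {i}")
```

After a thousand titles, every read or write of the publisher pays for the whole array.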
```diff
@@ -274,7 +274,7 @@ Switching things around a bit would result in a model that still represents the
 ...
 {"id": "100","name": "Learn about Azure Cosmos DB", "pub-id": "mspress"}
 ...
-{"id": "1000","name": "Deep Dive in to Azure Cosmos DB", "pub-id": "mspress"}
+{"id": "1000","name": "Deep Dive into Azure Cosmos DB", "pub-id": "mspress"}
 
 In the above example, we have dropped the unbounded collection on the publisher document. Instead we just have a reference to the publisher on each book document.
 
```
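The inverted model in this hunk, a fixed-size `pub-id` reference on each book, can be sketched like this (a list comprehension stands in for a query over the books collection):

```python
# The publisher document no longer grows; each book carries a
# small reference to its publisher instead.
publisher = {"id": "mspress", "name": "Microsoft Press"}
books = [
    {"id": "100", "name": "Learn about Azure Cosmos DB", "pub-id": "mspress"},
    {"id": "1000", "name": "Deep Dive into Azure Cosmos DB", "pub-id": "mspress"},
]

def books_by_publisher(books, pub_id):
    # Finding a publisher's books is now a query against the
    # books collection rather than an array on the publisher.
    return [b for b in books if b["pub-id"] == pub_id]
```

The trade-off: listing a publisher's books costs a query, but publishing a book never touches the publisher document.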
```diff
@@ -294,7 +294,7 @@ You might be tempted to replicate the same thing using documents and produce a d
 {"id": "b2", "name": "Azure Cosmos DB for RDBMS Users" }
 {"id": "b3", "name": "Taking over the world one JSON doc at a time" }
 {"id": "b4", "name": "Learn about Azure Cosmos DB" }
-{"id": "b5", "name": "Deep Dive in to Azure Cosmos DB" }
+{"id": "b5", "name": "Deep Dive into Azure Cosmos DB" }
 
 Joining documents:
 {"authorId": "a1", "bookId": "b1" }
```
```diff
@@ -315,7 +315,7 @@ Consider the following.
 {"id": "b1", "name": "Azure Cosmos DB 101", "authors": ["a1", "a2"]}
 {"id": "b2", "name": "Azure Cosmos DB for RDBMS Users", "authors": ["a1"]}
 {"id": "b3", "name": "Learn about Azure Cosmos DB", "authors": ["a1"]}
-{"id": "b4", "name": "Deep Dive in to Azure Cosmos DB", "authors": ["a2"]}
+{"id": "b4", "name": "Deep Dive into Azure Cosmos DB", "authors": ["a2"]}
 
 Now, if I had an author, I immediately know which books they have written, and conversely if I had a book document loaded I would know the ids of the author(s). This saves that intermediary query against the join table reducing the number of server round trips your application has to make.
 
```
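Embedding the related ids directly on each document, as this hunk's book documents do with their `authors` arrays, removes the intermediary join query. A minimal sketch (the author-side documents are an assumption mirroring the book side):

```python
# Each book lists its author ids inline; the reverse direction
# (each author listing book ids) is assumed symmetric.
books = [
    {"id": "b1", "name": "Azure Cosmos DB 101", "authors": ["a1", "a2"]},
    {"id": "b2", "name": "Azure Cosmos DB for RDBMS Users", "authors": ["a1"]},
]
authors = [
    {"id": "a1", "books": ["b1", "b2"]},  # hypothetical author document
    {"id": "a2", "books": ["b1"]},
]

# With a book document loaded, its author ids are already present:
# no scan of join documents, hence one fewer server round trip.
book = books[0]
author_ids = book["authors"]
```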
```diff
@@ -377,7 +377,7 @@ Sure, if the author's name changed or they wanted to update their photo we'd hav
 
 In the example, there are **pre-calculated aggregates** values to save expensive processing on a read operation. In the example, some of the data embedded in the author document is data that is calculated at run-time. Every time a new book is published, a book document is created **and** the countOfBooks field is set to a calculated value based on the number of book documents that exist for a particular author. This optimization would be good in read heavy systems where we can afford to do computations on writes in order to optimize reads.
 
-The ability to have a model with pre-calculated fields is made possible because Azure Cosmos DB supports **multi-document transactions**. Many NoSQL stores cannot do transactions across documents and therefore advocate design decisions, such as "always embed everything", due to this limitation. With Azure Cosmos DB, you can use server-side triggers, or stored procedures, that insert books and update authors all within an ACID transaction. Now you don't **have** to embed everything in to one document just to be sure that your data remains consistent.
+The ability to have a model with pre-calculated fields is made possible because Azure Cosmos DB supports **multi-document transactions**. Many NoSQL stores cannot do transactions across documents and therefore advocate design decisions, such as "always embed everything", due to this limitation. With Azure Cosmos DB, you can use server-side triggers, or stored procedures, that insert books and update authors all within an ACID transaction. Now you don't **have** to embed everything into one document just to be sure that your data remains consistent.
 
 ## <a name="NextSteps"></a>Next steps
 The biggest takeaways from this article are to understand that data modeling in a schema-free world is as important as ever.
```
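The pre-calculated-aggregate pattern in this hunk, insert a book *and* bump the author's `countOfBooks` as one atomic unit, can be sketched as below. In Azure Cosmos DB the two writes would run server-side inside a stored procedure's ACID transaction; here a plain function merely stands in for that transaction boundary:

```python
# Author document carrying a pre-calculated aggregate.
author = {"id": "a1", "countOfBooks": 3}
books = []

def publish_book_transaction(author, books, book):
    # Both writes succeed together, so countOfBooks always matches
    # the number of book documents for this author.
    books.append(book)
    author["countOfBooks"] += 1

publish_book_transaction(author, books, {"id": "b4", "name": "New Title"})
```

Because the aggregate is maintained at write time, a read of the author document answers "how many books?" without counting book documents.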