Commit 733c35e

Fixes mkdoc template
1 parent c9ab078 commit 733c35e

10 files changed, +576 -1 lines changed

Dockerfile

+4
@@ -0,0 +1,4 @@
+FROM sonarsource/local-travis
+
+RUN apt-get -y update && apt-get install -y build-essential cmake python python-pip
+RUN pip install redis

TODO.md

+152
@@ -0,0 +1,152 @@
+# ReJSON TODOs
+
+---
+
+# MVP milestone
+
+This is what ReJSON (https://github.com/redislabsmodules/rejson) currently has ready for the MVP:
+
+* Code is under src/
+* Building with CMake
+    - Need to verify on OSX
+    - Currently does not have an `install` option - needed?
+* Documentation
+    - docs/commands.md
+    - docs/design.md ~ 30% done
+    - README.md ~ 85% done
+        - Missing about/what is ReJSON
+        - Some notes about performance
+        - Perhaps a Node.js example
+    - Source code is about 90% documented
+* AGPLv3 license
+* Copyright
+
+## Missing misc
+
+1. Peer review of project CTO->RnD/QA?
+1. Number overflows in number operations
+1. Something is printing "inf"
+
+## Source code/style
+
+1. Review and standardize use of int/size_t/uint32...
+1. Improve/remove error reporting and logging in case of module internal errors
+
+## Benchmarks
+
+1. Need to include a simple standalone "benchmark", either w/ redis-benchmark or not ~ 30% done, need to complete some suites and generate graphs from output
+
+## Examples
+
+TBD
+
+1. A session token that also has a list of last seen times, what stack though
+1. Node.js example perhaps
+
+## Blog post
+
+References:
+
+* [Parsing JSON Is A Minefield](http://seriot.ch/parsing_json.php)
+
+---
+
+# Post MVP
+
+## Profiling
+
+1. Memory usage: implemented JSON.MEMORY, need to compile an automated reporting tool
+1. Performance with callgrind and/or gperftools
+
+## Build/test
+
+1. Run https://github.com/nst/JSONTestSuite and report like http://seriot.ch/parsing_json.php
+1. Need a dependable cycle to check for memory leaks
+1. Once we have a way to check baseline performance, add regression
+1. Fuzz all module commands with a mix of keys paths and values
+1. Memory leaks suite to run
+    `valgrind --tool=memcheck --suppressions=../redis/src/valgrind.sup ../redis/src/redis-server --loadmodule ./lib/rejson.so`
+1. Verify each command's syntax - need a YAML
+1. Add CI to repo?
+
+## Path parsing
+
+1. Add array slice
+
+## Dictionary optimizations
+
+Encode as trie over a certain size threshold to save memory and increase lookup performance. Alternatively, use a hash dictionary.
+
+## Secondary indexing
+
+Integrate with @dvirsky's `secondary` library.
+
+## Schema
+
+Support [JSON Schema](http://json-schema.org/).
+
+JSON.SETSCHEMA <key> <json>
+
+Notes:
+1. Could be replaced by a JSON.SET modifier
+2. Indexing will be specified in the schema
+3. Cluster needs to be taken into account as well
+
+JSON.VALIDATE <schema_key> <json>
+
+## Expiry
+
+JSON.EXPIRE <key> <path> <ttl>
+
+## Cache serialized objects
+
+Manage a cache inside the module for frequently accessed objects in order to avoid repetitive
+serialization.
+
+## KeyRef nodes
+
+Add a node type that references a Redis key that is either a JSON data type or a regular Redis key.
+The module's API can transparently support resolving referenced keys and querying the data in them.
+KeyRefs in cluster mode will only be allowed if in the same slot.
+
+Redis core data types can be mapped to flat (i.e. non-nested) JSON structure as follows:
+* A Redis String is a JSON String (albeit some escaping may be needed)
+* A List: a JSON Array of JSON Strings, where the indices are identical
+* A Hash: a JSON Object where the Hash fields are the Object's keys and the values are JSON Strings
+* A Set: a JSON Object where the Set members are the keys and their values are always a JSON Null
+* A Sorted Set: a JSON Array that is made from two elements:
+    * A JSON Object where each key is a member and value is the score
+    * A JSON Array of all members ordered by score in ascending order
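Reviewer sketch (not part of the commit): the KeyRef mapping listed above can be illustrated in plain Python. The function names are hypothetical, the inputs are ordinary Python values standing in for what a Redis client would return, and breaking score ties by member name is an assumption that mirrors how Redis orders sorted sets.

```python
import json


def redis_string_to_json(value):
    """A Redis String becomes a JSON String (json.dumps does the escaping)."""
    return value


def redis_list_to_json(items):
    """A List becomes a JSON Array of Strings with identical indices."""
    return list(items)


def redis_hash_to_json(fields):
    """A Hash becomes a JSON Object: fields are keys, values are Strings."""
    return dict(fields)


def redis_set_to_json(members):
    """A Set becomes a JSON Object whose values are always JSON Null."""
    return {member: None for member in members}


def redis_zset_to_json(scored_members):
    """A Sorted Set becomes [ {member: score}, [members by ascending score] ].

    `scored_members` is an iterable of (member, score) pairs.
    Assumption: ties on score are broken by member name.
    """
    ordered = sorted(scored_members, key=lambda pair: (pair[1], pair[0]))
    return [
        {member: score for member, score in ordered},
        [member for member, _ in ordered],
    ]


if __name__ == "__main__":
    print(json.dumps(redis_zset_to_json([("b", 2.0), ("a", 1.0)])))
```

Keeping every mapping flat means a KeyRef never has to recurse into the referenced key, which matches the "non-nested" wording of the TODO.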
+
+## Compression
+
+Compress (string only? entire objects?) values over a (configurable?) size threshold with zstd.
+
+## Additions to API
+
+JSON.STATS
+Print statistics about encountered values, parsing performance and such
+
+JSON.OBJSET <key> <path> <value>
+An alias for 'JSON.SET'
+
+JSON.COUNT <key> <path> <json-scalar>
+P: count JS: ? R: N/A
+Counts the number of occurrences for scalar in the array
+
+JSON.REMOVE <key> <path> <json-scalar> [count]
+P: builtin del JS: ? R: LREM (but also has a count and direction)
+Removes the first `count` occurrences (default 1) of value from array. If index is negative,
+traversal is reversed.
+
+JSON.EXISTS <key> <path>
+P: in JS: ? R: HEXISTS/LINDEX
+Checks if path key or array index exists. Syntactic sugar for JSON.TYPE.
+
+JSON.REVERSE <key> <path>
+P: reverse JS: ? R: N/A
+Reverses the array. Nice to have.
+
+JSON.SORT <key> <path>
+P: sort JS: ? R: SORT
+Sorts the values in an array. Nice to have.
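Reviewer sketch (not part of the commit): the proposed JSON.COUNT and JSON.REMOVE semantics can be modelled on a Python list. The function names are hypothetical. Two assumptions are made here: the "negative index" in the TODO is read as a negative `count` (as with LREM), and scalars are compared by type as well as value so that JSON `true` does not match the number `1`. A `count` of 0 removes nothing in this sketch, which may differ from what the module ends up doing.

```python
def _same_scalar(a, b):
    """Compare JSON scalars strictly: same Python type and equal value."""
    return type(a) is type(b) and a == b


def json_count(array, scalar):
    """Count occurrences of `scalar` in `array`."""
    return sum(1 for item in array if _same_scalar(item, scalar))


def json_remove(array, scalar, count=1):
    """Remove up to abs(count) occurrences of `scalar` from `array` in place.

    A non-negative count scans from the head, a negative count scans from
    the tail (assumed reading of the TODO). Returns the number removed.
    """
    if count >= 0:
        indices = range(len(array))
    else:
        indices = range(len(array) - 1, -1, -1)

    limit = abs(count)
    doomed = []
    for i in indices:
        if len(doomed) == limit:
            break
        if _same_scalar(array[i], scalar):
            doomed.append(i)

    # Delete from the highest index down so earlier deletions
    # do not shift the positions still to be removed.
    for i in sorted(doomed, reverse=True):
        del array[i]
    return len(doomed)
```

For example, starting from `[1, 2, 1, 3, 1]`, `json_remove(values, 1)` drops the leading `1`, and a following `json_remove(values, 1, -1)` drops the trailing one.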

benchmarks/graphs/benchmark.csv

+21
@@ -0,0 +1,21 @@
+title,size,concurrency,rate,avgLatency,50.00%-tile,90.00%-tile,95.00%-tile,99.00%-tile,99.50%-tile,100.00%-tile
+JSON.SET {key} .,380,16,52537.50,0.30,0.27,0.43,0.50,0.64,0.69,7.05
+JSON.GET {key} .,380,16,60343.12,0.26,0.24,0.36,0.44,0.59,0.65,6.41
+JSON.GET {key} sclr,380,16,98514.45,0.16,0.15,0.23,0.26,0.33,0.36,4.88
+JSON.SET {key} sclr,380,16,84124.03,0.19,0.17,0.27,0.31,0.39,0.42,7.10
+JSON.SET {key} .,1441,16,30105.82,0.53,0.47,0.75,0.90,1.12,1.20,6.80
+JSON.GET {key} .,1441,16,17754.51,0.90,0.77,1.29,1.55,1.94,2.00,7.61
+JSON.GET {key} [0],1441,16,84786.94,0.19,0.17,0.27,0.30,0.38,0.41,7.09
+JSON.SET {key} [0],1441,16,82954.30,0.19,0.18,0.27,0.31,0.39,0.43,5.71
+JSON.SET {key} .,3468,16,21529.24,0.74,0.70,0.81,0.96,1.52,1.68,5.88
+JSON.GET {key} .,3468,16,10497.95,1.52,1.45,1.58,1.99,3.14,3.51,8.57
+"JSON.GET {key} [""web-app""].servlet[0][""servlet-name""]",3468,16,83277.94,0.19,0.18,0.26,0.30,0.37,0.40,6.64
+"JSON.SET {key} [""web-app""].servlet[0][""servlet-name""] ""bar""",3468,16,84622.61,0.19,0.17,0.27,0.31,0.39,0.42,5.79
+JSON.SET {key} .,18446,16,5664.91,2.82,2.73,2.83,3.16,5.38,6.25,12.30
+JSON.GET {key} .,18446,16,1062.04,15.05,14.82,15.05,15.93,22.90,28.02,48.96
+JSON.GET {key} ResultSet.totalResultsAvailable,18446,16,92883.96,0.17,0.16,0.24,0.27,0.35,0.39,7.41
+JSON.SET {key} ResultSet.totalResultsAvailable 1,18446,16,83471.43,0.19,0.17,0.27,0.31,0.38,0.41,8.69
+JSON.SET {key} .,39491,16,2953.48,5.41,5.33,5.45,5.54,8.39,9.99,16.92
+JSON.GET {key} .,39491,16,513.05,31.13,30.18,32.18,36.13,58.42,70.90,84.10
+JSON.GET {key} message.code,39491,16,91236.38,0.17,0.16,0.24,0.27,0.34,0.38,7.73
+JSON.SET {key} message.code 1,39491,16,81865.64,0.19,0.18,0.27,0.30,0.38,0.42,5.66

benchmarks/graphs/make.py

+53
@@ -0,0 +1,53 @@
+"""
+Make charts from ReJSONBenchmark's output
+"""
+
+import matplotlib.pyplot as plt
+import numpy as np
+
+data = np.genfromtxt('benchmark.csv', delimiter=',', names=True, dtype=None)
+
+# Each JSON value has 4 operations: set root, get root, set path, get path
+NOP = 4
+
+# The number of JSON values
+N = int(len(data) / NOP)
+
+# The x locations
+ind = np.arange(N)
+
+# The bars' width
+width = (1 - .3) / NOP
+
+colors = ['r','g','b','y']
+
+fig, ax = plt.subplots()
+
+# Iterate each operation
+for oidx in range(NOP):
+    d = data[oidx::NOP]
+    # ax1 = ax[oidx]
+
+    # plt.subplot(NOP,1,oidx+1)
+
+    # Plot the rate as bars
+    plt.bar(ind+oidx*width, d['rate'], width, align='center', color=colors[oidx])
+
+    # Plot the latency as line
+    # tax = plt.twinx()
+    # tax.set_yscale('log')
+    # tax.plot(ind, d['avgLatency'], color='r')
+
+    # plt.title('{} rate and average latency'.format(d['title'][0]))
+    # ax1.set_ylabel('Rate (op/s)', color='b')
+    # ax1.set_xlabel('Object size (bytes)')
+    # for t in ax1.get_yticklabels():
+    #     t.set_color('b')
+    # ax2.set_ylabel('Average latency (msec)', color='r')
+    # for t in ax2.get_yticklabels():
+    #     t.set_color('r')
+    # plt.xticks(ind, d['size'])
+
+plt.grid(True)
+plt.xticks(ind + width, d['size'])
+plt.show()
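Reviewer sketch (not part of the commit): the stride slicing in make.py (`data[oidx::NOP]`) relies on every document size contributing exactly four consecutive rows. The same grouping can be checked without matplotlib using the standard `csv` module, which also unquotes the titles that contain doubled quotes. The helper names are hypothetical, and the sample rows are copied from benchmark.csv but truncated to the first five columns.

```python
import csv
import io

NOP = 4  # operations per JSON value, as in make.py

# Rows copied from benchmark.csv (sizes 380 and 3468), first five columns only.
SAMPLE = '''\
title,size,concurrency,rate,avgLatency
JSON.SET {key} .,380,16,52537.50,0.30
JSON.GET {key} .,380,16,60343.12,0.26
JSON.GET {key} sclr,380,16,98514.45,0.16
JSON.SET {key} sclr,380,16,84124.03,0.19
JSON.SET {key} .,3468,16,21529.24,0.74
JSON.GET {key} .,3468,16,10497.95,1.52
"JSON.GET {key} [""web-app""].servlet[0][""servlet-name""]",3468,16,83277.94,0.19
"JSON.SET {key} [""web-app""].servlet[0][""servlet-name""] ""bar""",3468,16,84622.61,0.19
'''


def load_rows(text):
    """Parse benchmark CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def series_by_operation(rows, column='rate', nop=NOP):
    """Return one list per operation slot, each ordered by document size."""
    return [[float(row[column]) for row in rows[slot::nop]]
            for slot in range(nop)]


if __name__ == '__main__':
    for slot, rates in enumerate(series_by_operation(load_rows(SAMPLE))):
        print(slot, rates)
```

In the CSV as committed, slot 2 holds the get-path rows and slot 3 the set-path rows, so the order is set root, get root, get path, set path rather than the order stated in the make.py comment.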

benchmarks/graphs/standard.csv

+32
@@ -0,0 +1,32 @@
+title,concurrency,rate,average latency,50.00%-tile,90.00%-tile,95.00%-tile,99.00%-tile,99.50%-tile,100.00%-tile
+JSON.SET {key} . {size: 2 B},16,79313.90,0.20,0.18,0.29,0.32,0.40,0.43,9.41
+JSON.GET {key} .,16,96787.14,0.16,0.15,0.23,0.26,0.33,0.36,7.23
+JSON.SET {key} . {size: 380 B},16,52537.50,0.30,0.27,0.43,0.50,0.64,0.69,7.05
+JSON.GET {key} .,16,60343.12,0.26,0.24,0.36,0.44,0.59,0.65,6.41
+JSON.GET {key} sclr,16,98514.45,0.16,0.15,0.23,0.26,0.33,0.36,4.88
+JSON.SET {key} sclr {size: 1 B},16,84124.03,0.19,0.17,0.27,0.31,0.39,0.42,7.10
+JSON.GET {key} sub_doc,16,79419.40,0.20,0.18,0.29,0.32,0.41,0.45,5.73
+JSON.GET {key} sub_doc.sclr,16,89612.06,0.18,0.16,0.25,0.28,0.35,0.39,7.44
+JSON.GET {key} array_of_docs,16,67806.08,0.23,0.21,0.34,0.40,0.51,0.57,6.45
+JSON.GET {key} array_of_docs[1],16,76575.85,0.21,0.19,0.30,0.33,0.42,0.47,5.26
+JSON.GET {key} array_of_docs[1].sclr,16,89853.27,0.18,0.16,0.25,0.28,0.35,0.38,6.96
+JSON.SET {key} . {size: 1.4 kB},16,30105.82,0.53,0.47,0.75,0.90,1.12,1.20,6.80
+JSON.GET {key} .,16,17754.51,0.90,0.77,1.29,1.55,1.94,2.00,7.61
+JSON.GET {key} [0],16,84786.94,0.19,0.17,0.27,0.30,0.38,0.41,7.09
+JSON.SET {key} [0] {size: 5 B},16,82954.30,0.19,0.18,0.27,0.31,0.39,0.43,5.71
+JSON.GET {key} [7],16,89787.85,0.18,0.16,0.25,0.28,0.34,0.38,5.81
+JSON.GET {key} [8].zero,16,89826.21,0.18,0.16,0.24,0.28,0.35,0.39,8.95
+JSON.SET {key} . {size: 3.5 kB},16,21529.24,0.74,0.70,0.81,0.96,1.52,1.68,5.88
+JSON.GET {key} .,16,10497.95,1.52,1.45,1.58,1.99,3.14,3.51,8.57
+"JSON.GET {key} [""web-app""]",16,10726.16,1.49,1.44,1.51,1.70,2.82,3.22,7.11
+"JSON.GET {key} [""web-app""].servlet[0]",16,17250.85,0.92,0.89,1.00,1.09,1.77,1.97,7.10
+"JSON.SET {key} [""web-app""].servlet[0][""servlet-name""] {size: 5 B}",16,84622.61,0.19,0.17,0.27,0.31,0.39,0.42,5.79
+"JSON.GET {key} [""web-app""].servlet[0][""servlet-name""]",16,83277.94,0.19,0.18,0.26,0.30,0.37,0.40,6.64
+JSON.SET {key} . {size: 18 kB},16,5664.91,2.82,2.73,2.83,3.16,5.38,6.25,12.30
+JSON.GET {key} .,16,1062.04,15.05,14.82,15.05,15.93,22.90,28.02,48.96
+JSON.SET {key} ResultSet.totalResultsAvailable {size: 1 B},16,83471.43,0.19,0.17,0.27,0.31,0.38,0.41,8.69
+JSON.GET {key} ResultSet.totalResultsAvailable,16,92883.96,0.17,0.16,0.24,0.27,0.35,0.39,7.41
+JSON.SET {key} . {size: 40 kB},16,2953.48,5.41,5.33,5.45,5.54,8.39,9.99,16.92
+JSON.GET {key} .,16,513.05,31.13,30.18,32.18,36.13,58.42,70.90,84.10
+JSON.SET {key} message.code {size: 1 B},16,81865.64,0.19,0.18,0.27,0.30,0.38,0.42,5.66
+JSON.GET {key} message.code,16,91236.38,0.17,0.16,0.24,0.27,0.34,0.38,7.73

design.md

+64
@@ -0,0 +1,64 @@
+# ReJSON Module Design
+
+## Abstract
+
+The purpose of this module is to provide native support for JSON documents stored in Redis, allowing
+users to:
+
+1. Store a JSON blob
+2. Manipulate just a part of the JSON object without retrieving it to the client
+3. Retrieve just a portion of the object as JSON
+
+Later on, we can use the internal object implementation in this module to produce similar modules for
+other serialization formats, namely XML and BSON.
+
+## Design Considerations
+
+* Documents are added as JSON but are stored in an internal representation and not as strings.
+* Internal representation does not depend on any JSON parser or library, to allow connecting other formats to it later.
+* The internal representation will initially be limited to the types supported by JSON, but can later be extended to types like timestamps, etc.
+* Queries that include internal paths of objects will be expressed in JSON path expressions (e.g. `foo.bar[3].baz`)
+* We will not implement our own JSON parser and composer, but use existing libraries.
+* The code apart from the implementation of the Redis commands will not depend on Redis and will be testable without being compiled as a module.
+
+## Object Data Type
+
+The internal representation of JSON objects will be stored in a Redis data type called Object [TBD].
+
+These will be optimized for memory efficiency and path search speed.
+
+See [src/object.h](src/object.h) for the API specification.
+
+## QueryPath
+
+When updating, reading and deleting parts of JSON objects, we'll use path specifiers.
+
+These too will have an internal representation disconnected from their JSON path representation.
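Reviewer sketch (not part of the commit): one way to picture a path specifier that is "disconnected" from its textual form is a flat list of steps, strings for object keys and integers for array indices. The parser below is a hypothetical Python illustration, not the module's C implementation. It covers only the child and subscript forms that appear in this commit (`foo.bar[3].baz`, `["web-app"].servlet[0]`, and `.` for the root) and assumes quoted keys contain no escaped quotes.

```python
import re

# One token per match: a bare name (optionally dot-prefixed),
# an integer subscript, or a double-quoted key in brackets.
_TOKEN = re.compile(r'''
    \.?([A-Za-z_][A-Za-z0-9_]*)
  | \[(-?\d+)\]
  | \["([^"]*)"\]
''', re.VERBOSE)


def parse_path(path):
    """Turn a path string into a list of steps (str keys, int indices)."""
    if path in ('', '.'):
        return []  # the root element
    steps = []
    pos = 0
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError('bad path at offset {}: {!r}'.format(pos, path))
        name, index, quoted = match.groups()
        if name is not None:
            steps.append(name)
        elif index is not None:
            steps.append(int(index))
        else:
            steps.append(quoted)
        pos = match.end()
    return steps


def resolve(document, steps):
    """Walk a decoded JSON value (dicts and lists) along the parsed steps."""
    for step in steps:
        document = document[step]
    return document
```

With this representation, `parse_path('foo.bar[3].baz')` yields the steps `foo`, `bar`, `3`, `baz`, and the same step list could be produced from a different path syntax without touching `resolve`.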
+
+## JSONPath syntax compatibility
+
+We only support a limited subset of JSONPath. Furthermore, jsonsl's jpr implementation may be worth looking into.
+
+| JSONPath           | rejson      | Description                                                       |
+| ------------------ | ----------- | ----------------------------------------------------------------- |
+| `$`                | key name    | the root element                                                  |
+| `*`                | N/A #1      | wildcard, can be used instead of name or index                    |
+| `..`               | N/A #2      | recursive descent a.k.a deep scan, can be used instead of name    |
+| `.` or `[]`        | `.` or `[]` | child operator                                                    |
+| `[]`               | `[]`        | subscript operator                                                |
+| `[,]`              | N/A #3      | Union operator. Allows alternate names or array indices as a set. |
+| `@`                | N/A #4      | the current element being processed by a filter predicate         |
+| `[start:end:step]` | N/A #3      | array slice operator                                              |
+| `?()`              | N/A #4      | applies a filter (script) expression                              |
+| `()`               | N/A #4      | script expression, using the underlying script engine             |
+
+ref: http://goessner.net/articles/JsonPath/
+
+1. Wildcard should be added, but mainly useful for filters
+1. Deep scan should be added
+1. Union and slice operators should be added to ARR*, GET, MGET, DEL...
+1. Filtering and scripting (min,max,...) should wait until some indexing is supported
+
+## Connecting a JSON parser / writer
+
+## Connecting Other Parsers

mkdocs.yml

+3-1
@@ -23,4 +23,6 @@ pages:
 - 'RAM usage': 'ram.md'
 - 'Developer notes': 'developer.md'
 
-google_analytics: ['UA-89559992-1', 'rejson']
+google_analytics:
+- 'UA-89559992-1'
+- 'auto'
