Graph Technology Buyers Guide EN A4
WHITE PAPER
neo4j.com
The #1 Platform for Connected Data
Buyer’s Guide
Good for?
Structure
Open Source Foundation & Community
Native Graph Storage
ACID Compliance
Graph Query Languages
Hybrid Transactional-Analytic Platforms (HTAP)
OLTP Applications
Query Traversal Performance Benchmark Comparisons
Graph Analytics
Developer Tooling
Visualization Capabilities
Conclusion
Introduction
The graph technology market is heating up with a wide variety of database and analytics
vendors staking claim to supporting graph (connected data) capabilities.
The challenge for the modern technology buyer is weeding through mountains of material
in order to make an informed buying and implementation decision. This challenge
is compounded because graph technology is a new way of managing data. Data
consistency, performance, scalability and other important characteristics of any database
technology are all much different with graph databases than with traditional data
management platforms like relational databases (RDBMS).
This buyer’s guide is designed to help expedite those decisions and explain what makes
purchasing this type of technology so different from other database purchase decisions.
For the full list of buying criteria and questions, skip to Part 2 on page XX. For readers who
aren’t yet familiar with the basics of graph technology, we’ll start with a brief introduction.
The Graph Technology Buyer’s Guide
Part 1: What Are Graph Databases Good for?
What these use cases have in common is that their success requires solving complex
problems with dynamic and interconnected datasets. To this end, we should reframe
the question, “What are graph databases good for?” as a technical one.
Viewed through a technological lens, graph databases tackle the most harrowing of
data problems – ones that often linger at the root of project failures and delays. These
include:
• Vastly different views of the data model between business and technology teams,
which result in misunderstanding and miscommunication.
• The “JOIN problem” which occurs when queries become so tangled that even
powerful databases with massive amounts of hardware resources grind to a halt in
their attempts to bring the resulting data together.
With the right choice of technologies, graph databases introduce a new way of looking
at data by promising to significantly address all of these issues. They seek to take you
beyond handling sheer volumes of relatively simple data and towards revealing the
interconnected complexities latent in your data – and deriving bottom-line value from
them.
Early on, a handful of companies have truly tapped into the power of graph technology
as the driver of their businesses, including Google, Facebook, LinkedIn and Microsoft as
well as upstarts like Airbnb and Uber. Well-established entities such as NASA, eBay, UBS
and many of their peers use graph technology to improve their customer experiences
and increase their competitiveness.
Today, graph-powered applications are used by more than 75% of the Fortune 500.
In SQL’s case, the normalization of data into a tabular schema aims to minimize storage
of duplicate data objects, types and values. These systems were born during the era of
scarce physical memory and expensive disk-based storage, designed to avoid managing
often-redundant data objects such as physical location addresses for shipping, billing,
homes, offices, destinations, businesses, etc. For example, all of these data objects
included common, redundant data such as a country and its provinces or states, or
telephone area codes and postal codes.
NoSQL systems like document, wide column and key-value data stores carry those
concepts forward (and backward) by simplifying their models in exchange for higher
levels of scale and simplicity. By eschewing data relationships and providing simple
programmatic APIs, NoSQL systems make it easy for developers and administrators to
work with simple data in a way that can easily scale.
A lack of concern about relationships leads to looser data guarantees, plain APIs and
straightforward scaling schemes. Data is easily spread out and just as easily retrieved,
without the need to maintain the integrity of related data that’s written across a
distributed storage or a cluster of machines and without needing to concern itself with
the performance of distributed JOINs across those machines.
These systems take on the “store and retrieve” problem at scale for simple data, and
their architectures reflect this, as does the set of problems they are equipped to address.
However, none of these systems focus on interrelated, contextualized data or how that
data might be traversed to reveal unobvious relationships, as explained below.
The context provided by these data connections is essential to identifying friendships,
making relevant real-time recommendations, attaching adjacent ideas and detecting fraud
by following money trails. Without relationships as first-class data entities, all of these use
cases become extremely difficult to execute.
• You can change or update a property graph easily, because its agile design
eliminates most of the structural overhead of traditional database schemas.
• You can quickly program property graphs because their query language expresses
and follows relationships.
• You can visualize and navigate property graphs efficiently by following the
relationships on their paths to context.
• You can rapidly determine data context when property graph queries are
executed in hyper-fast native graph platforms built on reliable, scalable database
architectures.
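As a rough illustration of the property graph model described above, the following Python sketch stores nodes and relationships as first-class records, each carrying its own properties. The class and method names (`PropertyGraph`, `add_node`, `add_relationship`, `neighbors`) are hypothetical and chosen for this example; they are not taken from any product's API.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy in-memory property graph; all names here are illustrative."""

    def __init__(self):
        self.nodes = {}                      # node id -> {"labels": set, "props": dict}
        self.relationships = []              # list of (start, type, end, props)
        self._outgoing = defaultdict(list)   # node id -> positions in relationships

    def add_node(self, node_id, labels=(), **props):
        self.nodes[node_id] = {"labels": set(labels), "props": dict(props)}

    def add_relationship(self, start, rel_type, end, **props):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before they are connected")
        self._outgoing[start].append(len(self.relationships))
        self.relationships.append((start, rel_type, end, dict(props)))

    def neighbors(self, node_id, rel_type=None):
        """Follow outgoing relationships, optionally filtered by type."""
        result = []
        for position in self._outgoing[node_id]:
            _, r_type, end, _ = self.relationships[position]
            if rel_type is None or r_type == rel_type:
                result.append(end)
        return result


g = PropertyGraph()
g.add_node("alice", labels=["Person"], name="Alice")
g.add_node("bob", labels=["Person"], name="Bob")
g.add_node("acme", labels=["Company"], name="Acme")
g.add_relationship("alice", "KNOWS", "bob", since=2019)
g.add_relationship("alice", "WORKS_AT", "acme")

# No schema migration is needed to attach a new property or relationship.
g.nodes["bob"]["props"]["city"] = "Malmo"
g.add_relationship("bob", "WORKS_AT", "acme", role="engineer")

print(g.neighbors("alice"))
print(g.neighbors("alice", "KNOWS"))
```

The point of the sketch is that relationships are stored as data in their own right, so adding a property or a new relationship type is an ordinary write rather than a schema change.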
• Flexibility for evolving data structures: Graph technology provides flexible
schema evolution. In a constantly changing data environment, you need the option
to add or drop data entities or relationships as well as extend or modify your data
model. Graph databases allow for evolving data structures that match today’s agile
development environments.
• Simultaneous support for real-time updates and queries: A graph data store
and its model allow real-time updates on graph data while supporting queries
concurrently.
• Better, faster and more powerful querying and analytics: Graph data stores
provide superior query performance with connected data using native storage and
native indexed data structures.
Connected data in property graphs enables you to illustrate and traverse many
relationships and find context for your next breakthrough application or analysis.
Part 2: The Buying Criteria for Graph Technology
Open source technology helps drive adoption, ensure quality and assist market fit as the
software matures version by version with the feedback – and sometimes contribution – of
a global community. Beyond Oracle, Microsoft SQL Server and a handful of other early
RDBMS, very few proprietary database technologies survive when compared to their open
source counterparts like Postgres, MySQL, MariaDB, MongoDB and many others.
The premise of using open source as a means to validate and move a market forward is
no different in the case of graph databases. Therefore, buyers should consider the size
and enthusiasm of a graph technology’s open source community as a proxy for multiple
items, including the market impact of that technology, the likely availability of skill sets
in the market, and the flexibility of the software to evolve beyond the capabilities and
personnel of a single software vendor.
Apache TinkerPop is another open source graph project. The TinkerPop graph traversal
engine’s v1 release was in 2011, and it became an Apache-incubated project in 2015.
Given its ease of integration, flexible storage options and permissive-use license, it has
become a common choice for NoSQL vendors when adding graph interfaces to their
products.
There are a number of other proprietary graph vendors as well. However, their ability to
establish market share and adoption is severely limited when compared to the Neo4j and
TinkerPop open source projects.
Independent sites like DB-Engines prove out this market reality. As of the publication of
this guide, Neo4j’s score accounts for roughly half of the total graph market popularity
score. Meanwhile, TinkerPop-based products make up approximately 40 percent of that
score. The remaining 10 percent are divided among over 20 additional vendors.
Native Graph Storage
Native graph storage ensures that relationship information – the entity that connects one
data node to another – is persisted as a primary data element.
Without native graph storage, relationship information may be lost, misconnected or
abandoned, all of which are unique symptoms of graph data corruption. Many non-native
graph solutions allow for (and sometimes create) graph data corruption, which is
unacceptable for mission-critical enterprise applications.
What to Look for: Transposition vs. Persisted Relationships
When considering non-native graph technology, a buyer should ask: Is there a graph-to-
document or graph-to-column transposition step buried in the software?
Index-free adjacency avoids the expensive, low-level transpositions that non-native
graph storage and retrieval architectures rely on. It stores fixed-size pointers from
each node to its adjacent nodes; native graph technology is therefore both fast
and predictable in its performance costs.
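The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration, not a description of any engine's storage format: Python object references stand in for the fixed-size pointers, and the names `Node`, `Relationship`, `hop_with_adjacency` and `hop_with_lookup` are invented for the example.

```python
class Node:
    """A node that holds direct references to its own relationships."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []   # direct references to Relationship objects


class Relationship:
    def __init__(self, start, rel_type, end):
        self.start, self.rel_type, self.end = start, rel_type, end
        start.outgoing.append(self)   # the link is stored on the node itself


def hop_with_adjacency(node):
    """Index-free style: follow the references the node already holds."""
    return [rel.end.name for rel in node.outgoing]


def hop_with_lookup(edge_table, node_name):
    """Non-native style: consult a global edge table on every hop."""
    return [end for start, _, end in edge_table if start == node_name]


a, b, c = Node("a"), Node("b"), Node("c")
Relationship(a, "LINKS_TO", b)
Relationship(a, "LINKS_TO", c)
Relationship(b, "LINKS_TO", c)

edge_table = [("a", "LINKS_TO", "b"), ("a", "LINKS_TO", "c"), ("b", "LINKS_TO", "c")]

print(hop_with_adjacency(a))
print(hop_with_lookup(edge_table, "a"))
```

Both functions return the same neighbors. The adjacency version does work proportional to the number of relationships on the node, while the lookup version depends on the size (or index depth) of the global table, which is why its cost grows with the dataset.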
ACID Compliance
Long considered the “gold standard” for database transactional reliability, ACID (Atomicity,
Consistency, Isolation and Durability) compliance ensures that a DBMS safely and reliably
stores transactional data as it enters and is updated within the system – especially as the
data and system grow.
While some (NoSQL) databases eschew ACID and can get away with doing so for certain
use cases, the impact of non-ACID compliance with graph datasets is grave. Because
graphs are inherently connected, proven risks arise with non-ACID transactions, where
parts of a graph transaction successfully commit while other parts fail, leaving data in a
corrupted state.
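A minimal sketch of the failure mode just described, assuming a toy in-memory store: the same two-step write is applied once atomically and once without atomicity. The `GraphStore` class and both helper functions are hypothetical, and the snapshot-and-restore rollback is only a stand-in for the logging techniques a real database would use.

```python
import copy


class GraphStore:
    """Toy store that refuses relationships to missing nodes."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()   # (start, end) pairs

    def apply(self, op):
        kind, *args = op
        if kind == "add_node":
            self.nodes.add(args[0])
        elif kind == "add_edge":
            start, end = args
            if start not in self.nodes or end not in self.nodes:
                raise ValueError(f"dangling relationship {start}->{end}")
            self.edges.add((start, end))
        else:
            raise ValueError(f"unknown operation {kind}")


def run_atomically(store, ops):
    """Apply all ops or none: restore a snapshot if any op fails."""
    snapshot = (copy.copy(store.nodes), copy.copy(store.edges))
    try:
        for op in ops:
            store.apply(op)
        return True
    except ValueError:
        store.nodes, store.edges = snapshot
        return False


def run_without_atomicity(store, ops):
    """Apply ops one by one; a failure leaves earlier ops in place."""
    try:
        for op in ops:
            store.apply(op)
        return True
    except ValueError:
        return False


# The second operation fails because "account-2" was never created.
ops = [("add_node", "account-1"), ("add_edge", "account-1", "account-2")]

safe, unsafe = GraphStore(), GraphStore()
print(run_atomically(safe, ops), safe.nodes)
print(run_without_atomicity(unsafe, ops), unsafe.nodes)
```

After the failed transaction the atomic store is unchanged, while the non-atomic store keeps a half-written node that later reads and writes may build on.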
For example, corrupted graph data can result in dangling relationships that point
nowhere, phantom properties, and nodes that can only be reached from one direction.
The ripple effect of these situations is quite staggering when you consider what happens
when subsequent reads against that corrupt data influence new updates or writes, which
further spread corruption because they are based on the original corrupt, untrustable
values. Ultimately, significant portions of the database become unusable.
For non-native graph databases built on top of other data models, ACID compliance can
be equally complex and concerning. For example, key-value stores are optimized for high
throughput and low latency on single-key lookups (reads) and writes. They support ACID
semantics only at a single-key level. As a result, a graph data store that needs transactional
and data consistency capabilities cannot be implemented directly over native key-value
stores efficiently.
Unlike the ISO/ANSI SQL that has served RDBMS users as the standard query language for 40
years, there is not yet an industry-standard query language for graphs.
What to Look for: Durability
Database systems should be evaluated and rigorously tested to ensure that they lose no
data when a systemic failure is introduced, such as a “kill -9” process interruption,
power failure or kernel panic.
It has been observed that some native graph and multi-model database systems are
unable to recover gracefully here, either by losing transactions that the system had
previously accepted or by failing consistency checks after recovering from the failure
itself (or both).
• Does the database offer any level of consistency checking for graphs?
• What safety measures are available to ensure writes are written correctly?
• Do these measures work as expected in a clustered, High Availability (HA)
environment?
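To make the durability questions above concrete, here is a small Python sketch of one common safety measure: an append-only log whose records carry a checksum and are forced to disk before a write is acknowledged. The file layout and the function names `append_record` and `recover` are assumptions made for this example, not the design of any particular database.

```python
import json
import os
import tempfile
import zlib


def append_record(path, record):
    """Append one checksummed record and force it to stable storage."""
    payload = json.dumps(record, sort_keys=True)
    line = f"{zlib.crc32(payload.encode()):08x} {payload}\n"
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())   # do not acknowledge the write before this returns


def recover(path):
    """Replay the log, stopping at the first torn or corrupt record."""
    records = []
    with open(path, encoding="utf-8") as log:
        for line in log:
            checksum, _, payload = line.rstrip("\n").partition(" ")
            intact = line.endswith("\n")
            if not intact or f"{zlib.crc32(payload.encode()):08x}" != checksum:
                break
            records.append(json.loads(payload))
    return records


log_path = os.path.join(tempfile.mkdtemp(), "graph.log")
append_record(log_path, {"op": "add_node", "id": 1})
append_record(log_path, {"op": "add_node", "id": 2})

# Simulate a crash in the middle of a third write: a partial line, no newline.
with open(log_path, "a", encoding="utf-8") as log:
    log.write('deadbeef {"op": "add_e')

recovered = recover(log_path)
print(len(recovered))
```

A durability test along these lines kills the process at arbitrary points and then checks that every acknowledged record is still present after recovery and that no partial record is treated as valid.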
These SQL-like dialects are often supported by only one database vendor, lack a
community of skilled users together with a third-party tooling ecosystem, and also tend
to miss out on the technical benefits of query languages designed from the ground up for
the property graph model.
Below is a review of more detailed considerations, but here is a summary of where most
users end up: nearly all graph database vendors offer either direct Cypher support, direct
Gremlin support or both. All graph databases that support Gremlin can now also run
Cypher, thanks to openCypher’s “Cypher for Gremlin” project, which has been donated to
the Apache Software Foundation under the official TinkerPop umbrella, making Cypher
support part and parcel of TinkerPop.
Also worth noting is that the vast majority of graph database users (which encompasses
developers as well as projects) are using Cypher. There is an active and concerted effort
within ISO (the International Organization for Standardization) and the openCypher
community to create a new standard language as a sibling language to SQL.
This language is inspired by and would be highly compatible with Cypher and its own
sibling graph query language derivatives.
Declarative query languages are more easily learned, as the query author needs only to
instruct the database about what to retrieve and not be concerned about the low-level
details of how the database should go about obtaining it.
Declarative languages are also easier to write, read and debug for end users than
imperative languages. This declarative approach has been a key factor in SQL’s popularity
over the years. Today, the most popular, fully declarative graph query language is Cypher.
Imperative languages offer fine-grained control over every behavior of the query by
instructing the system where to go as it touches every node, almost like when one learns
hopscotch. The issue with imperative languages is that the developer must be intimately
familiar with the graph schema they are querying in order to tell the system how to
execute each traversal, which increases both the learning curve for the language as well as
the complexity of the code being written.
The most popular imperative graph query language is Gremlin, though Gremlin also
combines some declarative attributes.
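The contrast between the two styles can be sketched as follows. The Cypher text is shown only for comparison and is not executed; the imperative half is plain Python over a hypothetical in-memory adjacency map named `KNOWS`, standing in for a step-by-step traversal API.

```python
# Declarative: state the pattern to match; the database plans the traversal.
# (Cypher text shown for comparison only; it is not executed here.)
CYPHER_QUERY = """
MATCH (p:Person {name: 'Alice'})-[:KNOWS]->()-[:KNOWS]->(fof)
RETURN DISTINCT fof.name
"""

# Imperative: spell out each traversal step by hand.
KNOWS = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Dave"],
    "Carol": ["Dave", "Erin"],
    "Dave": [],
    "Erin": [],
}


def friends_of_friends(knows, start):
    """Walk exactly two KNOWS hops from start, de-duplicating results."""
    found = []
    for friend in knows.get(start, []):        # first hop
        for fof in knows.get(friend, []):      # second hop
            if fof not in found:
                found.append(fof)
    return found


print(friends_of_friends(KNOWS, "Alice"))
```

In the declarative form the author names the shape of the result and leaves the traversal order to the planner. In the imperative form the author has to know how the data is laid out and decide the order of hops and the de-duplication strategy.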
Declarative languages require far more work on the part of the database vendor to build
and optimize, as they effectively render what a user is asking for into a set of low-level
imperative commands that the database carries out. TinkerPop, whose Gremlin language is
largely imperative, is by comparison relatively easy to port to any database backend. On
the other hand, supporting Cypher requires a
sophisticated combination of parsing, planning and runtime work, as well as database
statistics and cost-based optimization techniques, all of which require substantial effort
to write and maintain.
What to Look for: Compiled vs. Interpreted Instructions
Some graph query languages must be compiled into binary packages prior to execution
within the database. While this technique may be advantageous for optimizing the
performance of a single query, it undoubtedly increases debugging complexity and
extends the time it takes to write, build and execute queries.
Compiled queries are also a major obstacle when it comes to ad-hoc querying by end
users and end-user tools. Compilation effectively eliminates the ability to change a query
on the fly – perhaps to just correct a typo – or to modify an ad-hoc query parameter.
What to Look for: Graph Query Language Standardization
The relational database market of the 1980s grew much more rapidly after multiple
vendors adopted ANSI SQL as their standard query language. This made it easier to find
and hire skilled resources for nearly any project and created more application portability
across platforms.
Today, during the emergence of NoSQL and graph databases, it is not as easy to find
trained developers who know each vendor’s proprietary query language. This will remain
true until another standard language takes shape.
Developing a new standard not only requires an abundant user community, but also
cooperation among vendors, large and small. This cooperation has not happened in the
NoSQL space, as the SQL Standards oversight team chose to simply adapt the ANSI SQL
standard to accommodate new NoSQL-style syntaxes and data types.
Graph Query Language (GQL) standards development is underway. This new standard
is emerging both from the usage and adoption preferences of the graph tech user
community, as well as the adoption and implementation of open source language toolkits
within the vendor community.
Contrast this to the openCypher project with growing support from SAP HANA, Databricks
and the Apache Spark project, Redis, Memgraph, Bitnine and Cambridge Semantics.
There is also a swell of support for GQL, as the ISO committee responsible for the SQL
standard held a discussion on endorsing a SQL-compatible, yet independent, Graph Query
Language based on inputs from the Linked Data Benchmark Council (LDBC), Neo4j’s
Cypher and Oracle’s Property Graph Query Language (PGQL).
OLTP Applications
Common OLTP use cases for graphs include identity and access management, real-
time recommendations, social network publishing, fraud detection, network systems
operations, cybersecurity, Internet of Things (IoT), artificial intelligence, supply chain,
institutional knowledge recall, customer 360, digital transformation and metadata
management.
Often, these are combined or layered to feed one another. Moreover, these transactional
applications must integrate a variety of technologies beyond the underlying data
management system. Therefore, the graph technology must support integrations across
use cases, data interfaces, deployment environments, programming languages, user
experience and more.
If any vendor is unable to supply solid customer references, then there’s a high likelihood
that the implementations only exist in theory. The greater the number of example successes
for your use case, the higher the probability of your own success in implementing a graph
solution.
Be sure to ask any vendor that provides use case examples whether the referenced
company has a separate commercial relationship with the graph provider – for example,
as an investor or business partner.
The most successful graph applications help organizations innovate more quickly,
generate and optimize revenue streams, and improve the customer experience by
preventing unexpected maintenance. Make sure your graph technology vendor knows
how and why their customers and users chose graphs for their application. Beware of
vendor inexperience and “foilware.”
• Does the graph system support mapping, indexing and searching for specific text
within the graph? How is that performed?
• Does the vendor fully exercise the system or simply highlight its most favorable
tasks and results? For example, are workloads mixed as both transactional writes
and analytic reads, or is only one type of activity executed?
• Are caching strategies equal? Are both graphs fully loaded into memory (where they
will be extra fast), or is only one product fully loaded into memory while the other
swaps between cache and disk?
• Are benchmark sessions multi-user, or are they single sessions that utilize all
available CPU resources to parallelize query execution? How realistic is it that one
user will use the database at a time?
Look out for benchmarks that claim fairness by using database defaults, which are often
extremely minimal in order to support small-scale functional evaluation rather than
performance at scale.
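When repeating a vendor benchmark yourself, a harness along the following lines addresses two of the questions above: it warms the cache before measuring and runs several concurrent sessions instead of one. It is only a sketch. The function names, the in-memory `GRAPH` and the `two_hop_query` workload are invented placeholders; in practice the `query` callable would issue a real query through the driver of the database under test.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_session(query, repetitions):
    """One simulated user: run the query repeatedly and time each call."""
    latencies = []
    for _ in range(repetitions):
        started = time.perf_counter()
        query()
        latencies.append(time.perf_counter() - started)
    return latencies


def benchmark(query, sessions, repetitions, warmup=5):
    """Warm up first, then measure several concurrent sessions."""
    for _ in range(warmup):
        query()
    with ThreadPoolExecutor(max_workers=sessions) as pool:
        futures = [pool.submit(run_session, query, repetitions) for _ in range(sessions)]
        latencies = [t for future in futures for t in future.result()]
    return {
        "samples": len(latencies),
        "median_s": statistics.median(latencies),
        "worst_s": max(latencies),
    }


# Placeholder workload: a two-hop traversal over a small synthetic graph.
GRAPH = {n: [(n + 1) % 1000, (n + 7) % 1000] for n in range(1000)}


def two_hop_query(start=0):
    return {fof for friend in GRAPH[start] for fof in GRAPH[friend]}


report = benchmark(two_hop_query, sessions=4, repetitions=50)
print(report)
```

Reporting the median together with the worst case, rather than a single average, makes it harder for a favorable configuration to hide slow outliers.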
Graph analytics tasks search for pathways, clusters, similarities and contextualization.
Given the newness of graph analytic exercises, you should place a higher degree of
confidence in vendors who offer a variety of educational, implementation and integration
materials to assist data scientists and other data analysts.
What to Look for: Publications & Educational Material
• Has the vendor not only published content about graph systems, but also
published information about how to perform graph analytics exercises using graph
algorithms?
• If not, then how do they teach users how to use any graph algorithm libraries they
might offer?
When data is imported into a graph, the data’s relationships must be materialized.
Furthermore, due to the schema-optional nature of graph systems, data is easily
changed, added and removed. Vendors will outline their ability to ingest high volumes of
data rapidly, which is primarily useful when a dataset first enters the graph. Enterprises
who have invested in other big data technologies will need integration to their Hadoop
installations, ideally using already-deployed data processing technologies such as Apache
Spark.
It also may become necessary for data to be streamed or imported into the graph system
on a regular basis. Therefore, the inclusion or integration with data processing tools such
as Kettle or Kafka (open source) or commercial tools from Informatica, iWay, Trifacta or
others is desirable from any graph technology vendor.
Organizations who have invested in data lakes – and especially those who use Apache
Spark as their analytic processing engine – should consider how easily the graph system
integrates with that source data.
A well-built graph visualization tool allows you to explore the graph in a variety of ways,
such as filtering nodes and relationships, lassoing parts of the graph, reviewing and
editing properties, highlighting paths, coloring node categories and including graphic icons.
Some questions to consider for data lake integrations with graph technology:
• Can the graph system operate within the Apache Spark ecosystem?
• Can graphs be materialized from Spark DataFrames?
• Can graph structures be directly imported into the graph database system?
What to Look for: Data Integration & Ingestion Technologies
Data integration and data streaming capabilities are very desirable add-ons for graph
systems. Without them, the DBA is left with simple comma-separated value (CSV) import
functions, which could be sub-optimal.
Look for libraries, SDKs and data integration adapters to popular data management tools.
Open source tools like Talend and Kettle should be available for a given graph system,
while commercial product support from Informatica, Ab Initio, iWay, Trifacta or Tamr is
desirable for organizations that have made investments in these tools.
What to Look for: CSV Import & High-Speed Ingestion
At a minimum, most products import CSV text files. Any given graph technology vendor
should also support additional functions such as high-speed and bulk ingestion.
More advanced data population strategies include the ability to defer index building and
consistency checking until after all the data is ingested. However, development teams
should be careful not to over-consider the importance of high-speed ingestion. For most
use cases, high-volume ingestion is usually more of a setup exercise when compared to
ongoing incremental updates, which tend to be much smaller than initial data loads.
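The idea of deferring index building and consistency checking can be illustrated with a short Python sketch. The CSV layouts, the column names and the `bulk_load` function are assumptions made for the example and do not correspond to any vendor's import tool.

```python
import csv
import io

# Hypothetical input files, inlined as strings to keep the sketch self-contained.
NODES_CSV = "id,name\n1,Alice\n2,Bob\n3,Carol\n"
RELS_CSV = "start,end,type\n1,2,KNOWS\n2,3,KNOWS\n3,9,KNOWS\n"


def bulk_load(nodes_text, rels_text):
    """Load every row first, then build the index and check consistency once."""
    nodes = list(csv.DictReader(io.StringIO(nodes_text)))
    rels = list(csv.DictReader(io.StringIO(rels_text)))

    # Deferred step 1: build the id index after all rows are in memory.
    index = {row["id"]: row for row in nodes}

    # Deferred step 2: flag relationships whose endpoints were never loaded.
    dangling = [r for r in rels if r["start"] not in index or r["end"] not in index]
    valid = [r for r in rels if r not in dangling]
    return index, valid, dangling


index, valid, dangling = bulk_load(NODES_CSV, RELS_CSV)
print(len(index), len(valid), len(dangling))
```

Here the relationship pointing at node 9 is reported as dangling instead of being silently stored. Running the checks once at the end avoids paying the validation and index-maintenance cost on every row during the initial load.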
A graph platform should include features for all of these users, offering extensions,
apps, libraries, APIs and SDKs, as well as integration with a variety of complementary
technologies.
In this area, the vendor should offer a visual programming environment that allows
the developer to not only write queries, but to also see the graph upon which they are
working.
Developers need a variety of drivers and APIs (beyond just JDBC) with which to integrate
application-level functionality. Furthermore, the database should be embeddable within a
larger application as is often necessary in modern deployments.
What to Look for: Developer Launchpad Application
An ideal developer experience is one that assembles all developer tools in one
application wrapper such that the developer always enjoys the same launchpad from
which to work with their graph database system.
Neo4j is the only graph database system to offer the Neo4j Desktop: an integrated
developer launchpad for graph projects. Neo4j Desktop provides access to the Enterprise
Edition of the Neo4j Graph Database along with its accompanying algorithm and
procedure libraries, as well as graph application examples. The Neo4j Desktop is free for
development use.
Neo4j offers commercial support for Java, JavaScript, .NET, Python and Go, while the
Neo4j community offers drivers for even more languages. TinkerPop also provides a
number of community-supported language integration drivers.
Interfaces such as Java or the database’s native language are desirable, as are interfaces
and exchange formats for other common needs. Any graph technology vendor
should also offer support for GraphQL (the API query language originally developed at
Facebook) as a way of interacting with applications above the database query layer, as
well as support for SPARQL, the query language for RDF-based systems (Resource
Description Framework), REST and others.
Visualization Capabilities
Graph data visualization tools are extremely helpful in comprehending graph data
behaviors. These environments allow users to quickly see shapes within connected data
and determine where to look further.
Other systems offer redundancy of multiple, duplicate writable graphs, which adds fault
tolerance to applications, networks and operations that lie above the storage system.
For these systems, it is important that their strength of consistency be carefully evaluated.
For graphs, consistency is vitally important, and when writing to multiple duplicates,
consistency must be coordinated carefully.
Current market expectations are that graph systems should horizontally scale for writes
as easily as they might for read optimization. The reality, however, is that creating a
horizontally scalable graph system requires the user to understand and plan their
graph schema carefully. For example, a user would need to design their schema so
that the graph could be partitioned in some form and therefore fit within any hardware
configuration.
Once this takes place, the user must then recognize how each partition will be identified
and how queries will be directed to the proper partition. Finally, the system will ultimately
need a means to leap queries from one partition to an adjacent one, which no graph
database product currently supports.
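The partitioning problem described above can be made tangible with a small Python sketch that assigns nodes to partitions by a stable hash and then counts the relationships that end up spanning two partitions. The hash-based routing scheme and the names `partition_of` and `cross_partition_edges` are assumptions for illustration, not how any specific product shards a graph.

```python
import zlib


def partition_of(node_id, partitions):
    """Route a node to a partition by a stable hash of its id."""
    return zlib.crc32(node_id.encode()) % partitions


def cross_partition_edges(edges, partitions):
    """Return the edges whose endpoints live on different partitions."""
    return [
        (a, b) for a, b in edges
        if partition_of(a, partitions) != partition_of(b, partitions)
    ]


edges = [("alice", "bob"), ("bob", "carol"), ("carol", "alice"), ("dave", "alice")]

cut = cross_partition_edges(edges, partitions=2)
print(f"{len(cut)} of {len(edges)} relationships span partitions")
```

Every relationship in `cut` is a hop that a query would have to make from one partition to another. A naive hash ignores the shape of the graph, which is why the text stresses designing the schema so that most traversals stay within a single partition.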
Even within the almost wholly online world of data technology, face-to-face events,
gatherings and conferences are just as important since they establish and strengthen
relationships among users, customers, partners and employees.
Due to every vendor’s inability to easily break apart a graph, writes must scale
vertically, requiring beefier hardware or a cloud environment. Such large, vertically scaled
systems can scale writes in the following ways:
• By offering more CPU-core backed sessions to accommodate concurrent
connections
• By increasing RAM in which to cache most of the graph
• By providing enough CPU power to minimize transaction cycle times
What to Look for: Multi-Graph Support
Some graph vendors have begun to solve the above-mentioned writing challenge. These
vendors have identified means by which to isolate, view and operate upon one or many
graph clusters.
Vendors are also identifying ways in which to extend application drivers to support
directing queries to different graphs using a multi-graph routing table. While these graphs
are distinct, this is an illustration of the first stage in supporting a partitioned graph
solution.
Many graph database vendors support the ability to offer readable replicas of the main
graph. Some systems scale these replicas hierarchically. It is important to understand how
replicas stay consistent with the main writable-core instance to ensure that an application
or client can read their own writes (RYOW).
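One way to picture read-your-own-writes is with a bookmark that the client carries from a write to a later read. The Python sketch below is a simplified model under that assumption; the `Core` and `Replica` classes and the bookmark mechanics are invented for illustration, and a real system would wait for replication or reroute the read rather than pull the log synchronously.

```python
class Core:
    """Writable instance; every committed write gets an increasing tx id."""

    def __init__(self):
        self.log = []            # (key, value) pairs in commit order

    def write(self, key, value):
        self.log.append((key, value))
        return len(self.log)     # bookmark: tx id of this write


class Replica:
    """Read replica that applies the core's log some time after commit."""

    def __init__(self, core):
        self.core = core
        self.applied = 0         # number of log entries applied so far
        self.data = {}

    def catch_up(self, up_to=None):
        target = len(self.core.log) if up_to is None else up_to
        while self.applied < target:
            key, value = self.core.log[self.applied]
            self.data[key] = value
            self.applied += 1

    def read(self, key, bookmark=0):
        """Serve the read only once the replica has applied the bookmark."""
        if self.applied < bookmark:
            self.catch_up(bookmark)   # stand-in for waiting or rerouting
        return self.data.get(key)


core = Core()
replica = Replica(core)

bookmark = core.write("alice.city", "Malmo")
stale = replica.read("alice.city")             # replica has not caught up yet
fresh = replica.read("alice.city", bookmark)   # bookmark forces catch-up first
print(stale, fresh)
```

Without the bookmark the client can read from a replica that has not yet seen its own write. When evaluating a product, ask what mechanism plays the role of the bookmark and what happens to a read when the replica is behind.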
Scaling large volumes of data is accomplished via big data processing technologies such
as Apache Spark for analytic activities or Apache Kafka for data streaming.
Ask vendors about their ability to execute graph queries upon Apache Spark, or whether
they can materialize graphs from Spark DataFrames. Look for mature clustering
technologies that use tools such as the Raft protocol instead of a simpler master-slave
architecture.
• Does the vendor offer both on-premises and cloud-based versions of its software?
• Does the vendor offer support for multiple cloud vendors beyond Amazon Web
Services, such as Microsoft Azure, Google Cloud Platform or Alibaba Cloud?
• How does the environment secure users’ authentication and authorization rights?
• Does the system integrate with popular directories such as LDAP or Active
Directory?
• Is the data encrypted on disk? (This is often done at the operating system and file
system level.)
• Does the system offer the ability to check logging activity or query executions?
• Can users terminate runaway queries?
• Can users view query execution plans?
• Can administrators review usage statistics that may be used for billing or
departmental chargeback?
• Does the vendor have a wide variety of partnerships throughout both business and
technology landscapes?
• How closely does the vendor manage and work with their partners? Are they
partners in name only?
• Does the vendor offer a variety of license models to suit the buyer’s preferred
deployment and feature combinations?
Likewise, look out for aging technology that is losing its place in the market. This is often
referred to as the “Goldilocks effect,” whereby mature technologies grow stale, paving
the way for newer ones. The aging technology is then slowly put to pasture as a cash
cow or retrofitted with bolt-on updates that simply add features with little regard to their
effectiveness. The result is a Swiss Army Knife – a tool that does a little of everything but
none of it well.
In addition, a mature graph technology vendor should have the resources necessary
to support your graph database deployment should any issues or problems arise. No
technology is ever deployed perfectly (no matter whether the vendor or the buyer is
at fault), but given the newness of the graph paradigm, a vendor must have a dedicated
team of support engineers who can help with your critical issues or questions as they
arise. This team should be geographically spread out and available for emergency
support around the clock.
As the graph technology market continues to grow and mature, purchasing decisions
may become clearer as vendors coalesce around shared standards (such as query
languages and data formats). Until then, we hope that this buyer’s guide has been
helpful in clarifying which factors are most important when considering a graph database
product – and which factors might just be marketing fluff with no substance behind them.
For your convenience, we’ve included a blank checklist in Appendix A that lists all of the
factors outlined in the sections above.
If you’d like to talk to a graph expert about the advantages of the Neo4j Graph Platform,
please contact us today.
Appendix A
The Buying Criteria for Graph Technology
This appendix includes a blank checklist that readers can use to independently evaluate and compare graph database choices.
Open Source Foundation & Community
rr Is the graph database built on an open source foundation?
rr Does the graph database vendor have an active open source community?
Native Graph Storage
rr Transposition vs. Persisted Relationships
rr Is the offering a native or non-native graph database?
rr Is there a graph-to-document or graph-to-column transposition step buried in the software?
rr Index-Free Adjacency
rr Does the graph database support index-free adjacency for high-performance graph traversals across deep graph
datasets?
ACID Compliance
rr Durability
rr Is the graph database system rigorously tested for data durability?
rr Does the graph database lose data during systemic hardware failures?
rr Consistency Checks
rr Does the database offer any level of consistency checking for graphs?
rr What safety measures are available to ensure writes are written correctly?
rr Do these measures work as expected in a clustered, High Availability (HA) environment?
rr Does the database vendor’s documentation contain warnings about ghost vertices or floating (untethered) edges?
rr Did the vendor use the Linked Data Benchmark Council (LDBC) array of tests?
rr Did the vendor work with independent third parties that have performed other vendor-neutral comparisons?
rr Does database performance deteriorate exponentially when the number of traversal hops across the graph increases?
rr Does the benchmark compare a pre-compiled binary query against an interpreted query?
rr Does the vendor fully exercise the system or simply highlight its most favorable tasks and results?
rr For example, are workloads mixed as both transactional writes and analytic reads, or is only one type of activity
executed?
rr Are one-time data loading operations overemphasized relative to multi-user or multi-session tests, which better
reflect regular usage?
rr Are benchmark sessions multi-user, or are they single sessions that utilize all available CPU resources to parallelize
query execution?
rr Does the benchmark only use database defaults, rather than being optimized for performance at scale?
rr Graph Analytics (OLAP Applications)
rr Has the vendor published information about how to perform graph analytics exercises using graph algorithms?
rr Does the vendor offer support for popular graph algorithms?
rr Does the vendor include documentation explaining the appropriate use and application of various graph algorithms?
rr Data Integration & Ingestion
rr Does the graph system operate within the Apache Spark ecosystem?
rr Can graphs be materialized from Spark DataFrames?
rr Can graph structures be directly imported into the graph database system?
rr Can graph algorithms be run against data in Apache Spark?
rr Does the vendor offer libraries, SDKs and data integration adapters to popular data management tools?
Graph Platform with Tools & Support for All Types of Users
rr Developer Tooling
rr Does the platform include a launchpad application wrapper for all developer tools related to the graph database
system?
rr Does the platform offer a visual development environment to help developers write clean, clear queries and
understand what those queries return?
rr Does the visual development tool include keyword color-coding and auto-completed values drawn from the
database?
rr Does the platform offer commercial support for multiple language drivers for the most popular programming
languages?
rr Does the vendor’s wider user community offer drivers for other popular programming languages?
rr Does the graph platform offer API support for Java or the database’s native language?
rr Does the platform include support for GraphQL?
rr Does the platform support integration with SPARQL?
rr Does the platform support REST APIs and frameworks?
rr Does the vendor actively support emerging standards such as the GRANDstack?
rr Can the database be embedded within a surrounding application?
rr How easily can the graph system embed itself in larger applications?
rr Does the vendor support a robust OEM network and/or startup program?
rr Data Visualization Capabilities
rr Does the platform include a native graph data visualization tool?
rr Is the graph data visualization tool codeless for non-technical users?
rr Does the visualization tool support natural language search?
rr Does the graph visualization tool offer built-in query suggestions?
rr Is the visualization tool web-based for ease of access?
rr Does the visualization tool allow non-technical users to easily navigate and explore a given graph dataset?
rr Does the graph visualization tool allow users to link to (or embed) views of the graph?
rr Does the visualization tool let technical users execute Cypher queries or adjust perspectives for other non-technical
users?
rr Does the data visualization tool include advanced security features such as user- or group-level viewing permissions?
rr Does the visualization tool offer multi-user sharing of graph views, storyboarding, printing or other presentation
methods?
rr Does the graph platform vertically scale for your expected write load?
rr Does the graph platform include multi-graph support for both writes and reads?
rr Are multi-database features included in the vendor’s product roadmap?
rr Will the vendor’s planned system support multiple tenants, hosting multiple databases?
rr Does the graph platform support robust read scaling?
rr Does the graph database offer readable replicas of the main graph?
rr How do replicas stay consistent with the main writable-core instance to ensure that an application or client can read
their own writes (RYOW)?
rr Does the vendor offer both on-premises and cloud-based versions of its software?
rr Does the vendor offer support for multiple cloud vendors beyond Amazon Web Services, such as Microsoft Azure,
Google Cloud Platform or Alibaba Cloud?
rr Is security transparent to developers, or must developers manage security themselves within the applications they
build?
rr What sorts of face-to-face events and conferences does the vendor participate in?
rr What events does the vendor host to develop its community?
rr How many regular meetups does it participate in?
rr What breadth of geography does a vendor cover in their events and in-person outreach?
rr Does the vendor hold a regular conference dedicated to graph technology?
rr Who are the vendor’s reference customers?
rr Do they have a listing of public customers available on their website?
rr Will the vendor’s sales team allow you to speak to some of their current customers?
rr Do any of the vendor’s reference customers hold a conflict of interest, such as also being a corporate investor?
rr What is the vendor’s licensing model for their graph database software?
rr Does the vendor offer a variety of license models to suit the buyer’s preferred deployment and feature combinations?
rr Does the vendor publish their price list?
rr How long has the vendor been working on their graph database product?
rr Does it meet the Monash Rule of 5-7 years of minimum development for a stable database product?
rr Are the graph capabilities merely an add-on of an older, legacy product?
rr Does the graph technology vendor have the business staying power to build, update and support the product for years to
come?
rr Does the vendor have a team of professional support engineers to help with critical issues around the globe and
around the clock?
Neo4j is the leading graph database platform that drives innovation and competitive advantage at Airbus, Comcast,
eBay, NASA, UBS, Walmart and more. Hundreds of thousands of community deployments and more than 300
customers harness connected data with Neo4j to reveal how people, processes, locations and systems are
interrelated.
Using this relationships-first approach, applications built using Neo4j tackle connected data challenges including
artificial intelligence, fraud detection, real-time recommendations and master data. Find out more at Neo4j.com.
Questions about Neo4j?
Contact us around the globe:
info@neo4j.com
neo4j.com/contact-us
© 2019 Neo4j. All rights reserved. Front cover image: Anna Sullivan on Unsplash. neo4j.com