Graphs in Government: Fulfilling Your Mission with Neo4j
WHITE PAPER
Jason Zagalsky, Neo4j
Graphs are versatile and dynamic. The use cases for graphs in government are endless.
Graphs are the key to solving the challenges you face in fulfilling your mission.
Uncovering the relationships between data locked in various repositories requires a graph
database platform that’s flexible, scalable and powerful. A graph database platform reveals
data connectedness to achieve your agency’s mission-critical objectives – and so much more.
Using real-world government use cases, this white paper explains how graphs solve a broad
range of complex problems that can’t be solved in any other way.
A graph database enables you to discover connections among data, and do so much faster
than joining tables within a traditional relational database or even using another NoSQL
database such as MongoDB or Elasticsearch.
Neo4j is a highly scalable, native graph database that stores and manages data relationships
as first-class entities. This means the database maintains knowledge of the relationships, as
opposed to a relational database (RDBMS), which instantiates relationships using table JOINs
based on a shared key or index.
A native graph database like Neo4j offers index-free adjacency: data is inherently connected
with no foreign keys required. The relationships are stored right with the data object, and
connected nodes physically point to each other.
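The contrast between a key-based JOIN and index-free adjacency can be sketched in a few lines of Python. This is only an in-memory illustration under simplifying assumptions, not a description of Neo4j's storage format; the `Node` class, the `knows_table` rows and both helper functions are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A record that holds direct references to its neighbors."""
    name: str
    neighbors: list["Node"] = field(default_factory=list)


def neighbors_via_join(rows: list[tuple[str, str]], name: str) -> list[str]:
    """Relational style: scan a (source, target) table and match on a shared key."""
    return [target for source, target in rows if source == name]


def neighbors_via_adjacency(node: Node) -> list[str]:
    """Graph style: follow the references stored with the node itself."""
    return [neighbor.name for neighbor in node.neighbors]


# Hypothetical data: the same three relationships in both representations.
alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.neighbors.extend([bob, carol])
bob.neighbors.append(carol)
knows_table = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]

print(neighbors_via_join(knows_table, "alice"))
print(neighbors_via_adjacency(alice))
```

Both calls return the same neighbors. The difference is that the first inspects every row of the table, while the second touches only the references already attached to the starting node.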
Relational Database vs. Graph Database
The graph data model is easy to understand, as it reflects how data naturally exists: as
objects and the relationships between those objects. It’s a model that you naturally sketch
on a whiteboard when talking about data, with data elements (nodes or vertexes) and the
relationships (or edges) between them.
Each node represents an entity, and each relationship represents how two nodes
are associated. Property attributes (and indexes) can be attached to both nodes and
relationships as well.
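As a rough sketch of this model, nodes and relationships can each be written as a small record that carries its own properties. The class names, field names and sample values below are assumptions made for illustration; they are not Neo4j's API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity, with a label and free-form properties."""
    label: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed association between two nodes, with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict[str, Any] = field(default_factory=dict)


def describe(rel: Relationship) -> str:
    """Render a relationship in an arrow notation for display."""
    return f"({rel.start.properties['name']})-[:{rel.type}]->({rel.end.properties['name']})"


# Hypothetical sample data.
analyst = Node("Person", {"name": "Ana"})
agency = Node("Agency", {"name": "Example Agency"})
works_for = Relationship(analyst, "WORKS_FOR", agency, {"since": 2015})

print(describe(works_for))
```

Note that the relationship carries a property of its own (`since`), which is the point of treating relationships as first-class entities rather than as a pair of matching keys.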
By assembling the simple abstractions of nodes and relationships into connected structures,
Neo4j allows you to build sophisticated, flexible models that map closely to a problem
domain.
Data becomes more useful once its connectedness is established. Connected data is the
representation, usage and persistence of relationships between data elements. Neo4j makes
it possible to query relationships across disparate data sources, regardless of the type of
data or originating database.
Neo4j connects multiple layers of data – across processes, people, networks and things.
Once you’ve connected layers, you gain intelligence downstream and provide a connected
view of the data to analytic and operational applications. You also obtain context that allows
you to better refine the pieces of information you’re collecting. The better
your understanding of data connections, the better your downstream insights will be.
Neo4j empowers government agencies and organizations to iterate and expand on current
datasets, gaining momentum to execute on bigger and better ideas, and find deeper
contextual meaning in the data. Using graph technology, you can increase the number of
hops (the levels of connections) between data without a corresponding increase in compute
cost. As a result, you gain higher degrees of context not easily achieved by JOINing three or
four tables together in an RDBMS.
Neo4j’s architecture enables these deep, complex queries. The enterprise-grade, native
graph database is built from the ground up to traverse data connections at depth, in real
time and at scale.
NEO4J AT SCALE AT THE U.S. ARMY
Database stats:
> 1 TB of total data
> 2.1 billion nodes
> 5.9 billion relationships
Reduce Infrastructure Costs
Your government agency runs on a lean budget. Any opportunity to reduce infrastructure
spending frees up resources to focus on the core mission. A graph database does just that.
It delivers deep, complex queries with less hardware, which means reduced costs. The
standard, highly available Neo4j installation is 3-5 servers, versus an RDBMS with a graph
layer, which requires about 50 servers for the same scale. With this efficiency, Neo4j also
requires fewer licenses, further reducing database costs. Neo4j offers deployment flexibility,
with servers on premises or in the cloud.
Neo4j delivers a 1,000x performance advantage over relational and other NoSQL databases
hosting graph engines, reducing response times from minutes to milliseconds for queries of
graphs containing billions of connections.
Neo4j traverses any level of data in real time due to its native graph architecture. RDBMS and
other NoSQL databases typically see a significant performance degradation when traversing
data beyond three levels of depth.
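To make "levels of depth" concrete, the sketch below collects every node within a given number of hops of a starting node, using a breadth-first traversal over an adjacency map. It is a plain Python illustration of the idea, not Neo4j's traversal engine, and the `within_hops` function and the sample graph are hypothetical.

```python
from collections import deque


def within_hops(adjacency: dict[str, list[str]], start: str, max_hops: int) -> dict[str, int]:
    """Return every node reachable from start in at most max_hops, with its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(current, []):
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    return hops


# Hypothetical graph: a -> b -> {c, d}, c -> e -> f
graph = {"a": ["b"], "b": ["c", "d"], "c": ["e"], "d": [], "e": ["f"], "f": []}

print(within_hops(graph, "a", 3))
```

Each additional hop only visits the neighbors of nodes already reached, so the work grows with the part of the graph actually touched rather than with the total size of the dataset.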
Here are several real-world case studies to get you thinking about how a graph database
helps your agency meet its mission objectives. Don’t let these examples limit your
imagination, but rather use them to imagine other possible use cases – remember, graphs
are everywhere!
Maintenance and support for this equipment necessitates the procurement of millions
of spare parts per year. Prior to Neo4j, the Army used an aging mainframe-based system
to track parts orders and their connections to equipment systems, components and
subcomponents.
However, an increasing volume of available data and changing historical data sources made
data management increasingly difficult, resulting in unpredictable maintenance costs. It was
obvious that a system with more flexibility would offer greater performance and the ability to
add new dimensions for more insights and richer analysis.
ARMY LEADERS WANTED TO RAPIDLY QUERY CONNECTED DATA TO:
• Forecast the need for replacement parts considering their theater location and climate
• Answer vital “what-if” questions such as the cost of deploying certain forces and supporting
equipment to a new war zone
The Army recognized the need to modernize its core tracking system. Working with CALIBRE,
a leading provider of management consulting and information technology solutions, the U.S.
Army employed Neo4j as a major part of its solution for providing greater visibility into the
total costs of owning a system.
With Neo4j, the Army has a much more flexible and robust view of the parts requirements
and costs of these parts across systems, components and subcomponents. It’s also much
easier now to rapidly store, explore and visualize a wealth of logistics and cost data.
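A view of costs across systems, components and subcomponents amounts to a roll-up over a parts hierarchy. The sketch below shows one simple way to compute it, assuming an acyclic hierarchy in which each part lists its children and their quantities. The part names, quantities and prices are invented for illustration, and the function is not the Army's or CALIBRE's cost allocation algorithm.

```python
def total_cost(
    part: str,
    children: dict[str, list[tuple[str, int]]],
    unit_cost: dict[str, float],
) -> float:
    """Cost of one unit of a part: its own cost plus the quantity-weighted cost of its children."""
    cost = unit_cost.get(part, 0.0)
    for child, quantity in children.get(part, []):
        cost += quantity * total_cost(child, children, unit_cost)
    return cost


# Hypothetical hierarchy: (child, quantity) pairs per parent part.
children = {
    "vehicle": [("engine", 1), ("track", 2)],
    "engine": [("filter", 4)],
}
unit_cost = {"vehicle": 0.0, "engine": 1000.0, "track": 250.0, "filter": 10.0}

baseline = total_cost("vehicle", children, unit_cost)

# A "what-if" question: what happens if the filter price rises?
what_if = total_cost("vehicle", children, {**unit_cost, "filter": 15.0})

print(baseline, what_if)
```

Because the roll-up follows relationships from the top-level system downward, a "what-if" scenario only requires changing one value and traversing again.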
“The scale of the information Neo4j handles is vast,” said Preston Hendrickson, who
leads CALIBRE’s technical team for the U.S. Army’s Operating and Support Management
Information System. “For example, just one of the tanks we track includes ~10 million parts
records, creating more than 15 million possible relationships among the components for our
cost allocation algorithms to work through.
“The flexibility and speed at which we can now add in new data sources or make changes to
the structures of current data in Neo4j has been a real game changer for our IT team.
“Neo4j also saves our analytics team huge amounts of time. The graph now serves as an
analytics platform that is capable of housing everything they need together in one place.
“This is giving us visibility into more detailed connections within our data that were previously
much harder to find or perhaps sometimes even overlooked. Analysts can now look for
answers to their questions and perform ‘what-if’ scenarios immediately without having
to load data from multiple sources and in some cases reload a mainframe for repeat
computation.”
IQT’s ultimate aim was to break down product sets into core capabilities, both to evaluate the
merit of offering investment capital to the vendor and to understand the components
of its “technologic DNA,” which it then combines (or recombines with technology from other
vendors) to create custom technology stacks that solve complex problems. However, this
proved difficult because technologic vocabulary varied widely by industry, technology and
vendor.
“We had no way to automate these exercises,” explained CTO Ravi Pappu. “Technology
evaluation and decomposition was done manually in spreadsheets and presentation
diagrams. Tech suppliers were matched manually, and the process of identifying new product
combinations was slow and generated few ideas. We needed a common lexical catalog for all
the technology components.”
• Mapping the connections between intelligence agencies, their mission problems and startups
• Integrating masses of information drawn from different suppliers and other sources
• Quickly pinpointing significant links between different tech products to create new solutions
Pappu recognized that the best way to solve these issues was with a graph database.
“Our tools did not reflect the connectedness of our data,” he said. “That’s what we solved with
Neo4j.”
“The fundamental reason for us to choose a graph database over other systems is that there
is enormous value in the relationships between different objects,” he explained. “We have
many different data silos in our organization, and we wanted to do a JOIN across all of them.”
Thanks to Neo4j, IQT’s technical staff now develop technology solutions by searching through
a wealth of internally created data, along with information imported from sources such as the
U.S. Public Library of Science and IEEE, all integrated under one common taxonomy.
The project has produced multiple benefits, Pappu said, including better product innovations.
“Neo4j is making it easier and faster to generate new ideas to present to the government,”
Pappu said. “We are better at evaluating technology, too, and we can now see even better
into future technology trends.”
MITRE
Fighting and Tracing Cybersecurity Threats
In their efforts to stop cyber attacks, analysts track large amounts of detailed information,
such as network and endpoint vulnerabilities, firewall configurations and intrusion detection
events. The solutions used to analyze this data typically track data points. But to be
successful, analysts need to understand how those data points are related.
To address these challenges, researchers at MITRE Corporation, a U.S. federally funded,
not-for-profit company, used Neo4j to develop CyGraph, a tool for cyberwarfare analytics,
visualization and knowledge management.
CyGraph brings together isolated data and events into an ongoing overall picture for decision
support and situational awareness. It prioritizes exposed vulnerabilities, mapped to potential
threats, in the context of mission-critical assets. It also correlates intrusion alerts to known
vulnerability paths and suggests the best course of action for responding to attacks. For
post-attack forensics, CyGraph shows vulnerable paths that warrant deeper inspection.
The model schema in the CyGraph architecture is free to evolve with the available data
sources and desired analytics. The data model is based on a flexible property-graph
formulation implemented in Neo4j. REST web services provide interfaces in CyGraph for data
ingestion, analytics and graph visualization.
“Graph queries make it possible to focus our analysis on the relevant portions of attack
graphs, allowing us to pinpoint vulnerabilities and target responses,” said Steven Noel, a
cybersecurity researcher at MITRE.
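The idea of correlating an intrusion alert with known vulnerability paths can be illustrated with a small attack graph. The sketch below enumerates simple paths from an entry point to a critical asset, then checks whether an alert falls on one of them. It is not CyGraph's implementation: the function names, the host names and the edge semantics (an edge means "can be exploited to reach") are all assumptions made for the example.

```python
def attack_paths(edges: dict[str, list[str]], source: str, target: str) -> list[list[str]]:
    """Enumerate simple paths from source to target with a depth-first search."""
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        for nxt in edges.get(node, []):
            if nxt not in path:  # avoid revisiting a node on the same path
                stack.append((nxt, path + [nxt]))
    return paths


def alert_on_known_path(alert: tuple[str, str], paths: list[list[str]]) -> bool:
    """True if the alert's (from, to) hop appears as consecutive nodes on any path."""
    return any(alert in zip(path, path[1:]) for path in paths)


# Hypothetical network: an edge means "can be exploited to reach".
edges = {"internet": ["web"], "web": ["app", "mail"], "app": ["db"], "mail": []}
paths_to_db = attack_paths(edges, "internet", "db")

print(paths_to_db)
print(alert_on_known_path(("web", "app"), paths_to_db))
print(alert_on_known_path(("web", "mail"), paths_to_db))
```

An alert on the hop from `web` to `app` lies on a path to the database and would deserve priority, while an alert on the hop to `mail` does not lead to that asset in this toy graph.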
The use of Neo4j at MITRE provides insight into the mission impact of cyber activities.
Graph layers (network infrastructure, cyber defense posture, mission dependencies and so
on) define subsets of the overall model space with connections within and across each layer.
Analysts also gain visibility into operations for global situational awareness.
In the past, answering a question across systems could take weeks. Interfaces between a
few key systems were built, but those interfaces were expensive and not scalable for an
organization with thousands of datasets.
By connecting all their data, they get immediate answers to critical questions. “We have tons
of telemetry data coming in and artificial intelligence analyzing it,” said Grubbs.
“If we see a problem emerging in a particular area, we need to know everything there is to
know about that immediately,” she explained. “Who can we call? What happened in test?
What did the engineering look like? We need a quick picture of everything in order to respond
to that. We can’t wait two weeks to find out why a part is failing.”
The graph database enables LMS to perform an impact analysis to determine the
downstream result of an issue or change anywhere in the product lifecycle. For example,
if there’s a delay in engineering, how will that impact the overall schedule for the product
launch?
“In the past we’ve had someone manually identify the root cause of a failure,” Grubbs
explained. “They’d identify all the things that could have influenced that part and caused it to
fail. Is it engineering? Procurement? Supplier? Is it a vendor issue or a manufacturing defect?
The idea is to let the graph do those traversals and find variance in the process and report it
back instantaneously versus a human taking weeks to do it manually.”
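Impact analysis and root-cause search are the same traversal run in opposite directions: follow dependencies downstream to see what a change affects, or upstream to list what could have influenced a failure. The sketch below shows both under simple assumptions. The lifecycle stages, the edge meaning and the function names are hypothetical and are not taken from the system described here.

```python
def reachable(edges: dict[str, list[str]], start: str) -> set[str]:
    """Every node that can be reached by following edges from start."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in edges.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reverse(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flip every edge so that traversals run upstream instead of downstream."""
    reversed_edges: dict[str, list[str]] = {}
    for source, targets in edges.items():
        for target in targets:
            reversed_edges.setdefault(target, []).append(source)
    return reversed_edges


# Hypothetical lifecycle: an edge A -> B means "B depends on A".
feeds = {
    "engineering": ["procurement", "manufacturing"],
    "procurement": ["manufacturing"],
    "manufacturing": ["test"],
    "test": ["launch"],
}

impacted = reachable(feeds, "engineering")           # downstream impact of a delay
candidate_causes = reachable(reverse(feeds), "test")  # upstream candidates for a failure

print(sorted(impacted))
print(sorted(candidate_causes))
```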
“We use computer-aided design (CAD) systems, and there’s a complexity rating to CAD
models,” said Grubbs. “Using Neo4j, we figure out if it’s really worth spending the time and
money to get to the next rating. Will making a design more complex improve quality or not?”
“We have far more data than humans alone could ever understand or manage,” said Grubbs.
“With Neo4j, we’ve been able to put it in a perspective that makes sense to anyone all the
way up to the CEO. There’s a lot of interest from the business. They have all kinds of business
cases lined up, ready to go.”
NASA
Knowledge Graph of Historical Lessons Learned
NASA has been collecting project data since the late 1950s. Locked in that data is knowledge
that holds incredible value to help cut down on project time, enable engineers to identify
trends that can prevent disasters and incorporate lessons learned into new projects.
But accessing that information is a challenge due to silos between departments and within
individual groups, products and programs. NASA needed to break down those silos, which is
exactly what it achieved using Neo4j.
NASA’s Lessons Learned Database is part of the organization’s knowledge management
strategy for how it collects, stores and shares information. Engineers use this database to
learn about past projects, including any mistakes or successes and what actions were taken.
“We started adding lessons in about 1990, and they went up and down until around
2003, when we had a shuttle disaster that resulted because of a thermal tile malfunction,”
explained David Meza, Chief Knowledge Architect at NASA. “If we had had this information
beforehand and understood the trends better, we might have been able to prevent the
disaster from taking place.”
Previously, the database was made up of less than 1% of the organization’s 20 million
documents. With a total of 80,000 employees, the volume, variety and velocity of data were
taxing the system. NASA needed a better way for end users to access this information.
Meza started looking at how to take the documents and convert them into graphs. Because
there was a lot of metadata associated with the lessons, Meza correlated the topics
based on their self-assigned categories. He could then see each lesson with its topic, as
well as how the topics correlated with one another.
This allows users to look at trends, which can potentially help NASA engineers prevent
disasters like the aforementioned shuttle disaster.
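One simple way to correlate topics is to count how often two topics are assigned to the same lesson. The sketch below does that for a handful of invented lessons. The white paper does not describe the actual method in detail, so the co-occurrence approach, the function name and the sample categories are all assumptions.

```python
from collections import Counter
from itertools import combinations


def topic_cooccurrence(lessons: dict[str, set[str]]) -> Counter:
    """Count how many lessons each pair of topics shares."""
    pairs: Counter = Counter()
    for topics in lessons.values():
        # Sort so that each pair is always counted under the same key.
        for first, second in combinations(sorted(topics), 2):
            pairs[(first, second)] += 1
    return pairs


# Hypothetical lessons with self-assigned categories.
lessons = {
    "lesson-1": {"thermal", "materials"},
    "lesson-2": {"thermal", "materials", "testing"},
    "lesson-3": {"testing", "software"},
}
pairs = topic_cooccurrence(lessons)

print(pairs.most_common(1))
```

In a graph, each counted pair becomes a weighted relationship between two topic nodes, which is what lets a user move from one lesson to related topics and on to other lessons.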
Meza developed a simple graph model to showcase the data to end users. Engineers quickly
perform searches and pull the information they need. They also jump from one part of a
system or subsystem to another and see the connections between the subsystems. Similarly,
project managers use the system to look at information pertaining to various subsystems
handled by disparate team members and pull it all together to understand the entire system.
“When you start looking at what kind of documents you have and how you’re able to
transform those into actionable knowledge for your end users, you improve your decision
making,” said Meza. “Of course, you also leverage lessons from the past, because we tend to
make the same mistakes over and over.”
The Lessons Learned database has already generated significant value.
“This has saved us at least a year and over $2M in research and development towards our
Mission to Mars planning,” said Meza.
Going forward, Meza and his team plan to provide users with the ability to input lessons
directly into the database. They’ll also run a text analysis to find text reuse or similarity that
allows them to identify documents that are similar topically, but different enough that they
might not be caught.
“We’re constantly looking at how to redo our Lessons Learned Database,” Meza said. “One
of our problems is that we don’t read the database when we’re having issues. But part of
knowledge management is the ability to take that know-what into know-how for the end user,
and transmit that knowledge to the next generation.”
Follow the money: Governments around the world have collected more than $700 million in
fines and back taxes using connected data from the Panama Papers, stored in Neo4j.
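The text analysis planned for the Lessons Learned Database, which looks for text reuse or similarity between documents, could take many forms. As one possible sketch, the code below compares two documents by the overlap of their three-word sequences (Jaccard similarity over word shingles). The technique, the function names and the sample sentences are assumptions chosen for illustration, not the approach NASA has described.

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(first: str, second: str, size: int = 3) -> float:
    """Share of word sequences the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(first, size), shingles(second, size)
    if not a and not b:
        return 0.0  # both texts are too short to compare
    return len(a & b) / len(a | b)


# Hypothetical lesson texts.
score = jaccard("the valve failed during test", "the valve failed during launch")

print(score)
```

A score between the extremes flags documents that reuse wording without being exact copies, which is the kind of pair that might otherwise not be caught.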
• Can I use data relationships (either defined or hidden) to improve my analysis and decision
making?
• Will connecting data across systems or silos allow me to answer key mission needs?
• Can I be more efficient with a flexible data model that can answer questions much faster?
Conclusion
Now that you’ve seen the innovative ways government agencies are using graph databases
to fulfill their missions, it’s important to remember that these case studies are not all-
encompassing.
Graph databases are as versatile as the government agencies that use them.
Your use case may differ from those illustrated here, but the impact will be just as
empowering. At Neo4j, we help our customers realize the power of graph technology to
solve the most challenging and obscure problems, and we’d love to help you consider ways a
graph database helps you meet your mission objectives.
Neo4j is the leading graph database platform that drives innovation and competitive advantage at Airbus, Comcast,
eBay, NASA, UBS, Walmart and more. Hundreds of thousands of community deployments and more than 300
customers harness connected data with Neo4j to reveal how people, processes, locations and systems are
interrelated.
Using this relationships-first approach, applications built using Neo4j tackle connected data challenges including
artificial intelligence, fraud detection, real-time recommendations and master data. Find out more at Neo4j.com.
Questions about Neo4j? Contact us around the globe:
info@neo4j.com
neo4j.com/contact-us
© 2019 Neo4j. All rights reserved. Front cover image: Anna Sullivan on Unsplash.