Open Uba
Jovonni L. Pharr
jpharr2@student.gsu.edu
Atlanta, GA, USA
[v0.0.1]
Abstract
This project describes a system for analytic modeling of user and entity behavior. We
draw on the scientific computing community and the tools used within it. We describe
the conditions under which model invocation takes place, the handling of model
results, and the components of a model. Version control is described for models that
change over time. Feedback is described for both rules and models. Efficient system
storage is demonstrated for anomalies and events of interest. Various risk calculation
approaches are also described. The system takes advantage of Turing completeness for
model development, rather than limiting development freedom. Lastly, different
scenarios in which models and rules accept feedback are covered. The components
described construct a UBA system designed to be extensible, powerful, and open.
OpenUBA: A SIEM-agnostic, Open Source Framework for Modeling User Behavior
Where f is the file, and d is the encoded data. A model component contains
a set of hashes,

H := {H(f), H(d) | f ≠ ∅, d ≠ ∅}   (5)

We represent the model invocation function as

Φ = invoke(. . . )   (10)
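As an illustration of the hash set in (5), the following is a minimal sketch, not the project's implementation. It assumes SHA-256 as the hash function H (the text does not name one) and treats the file f and the encoded data d as byte strings; the names `sha256_hex` and `component_hashes` are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Assumed stand-in for H: hex-encoded SHA-256 digest of the input."""
    return hashlib.sha256(data).hexdigest()


def component_hashes(model_file: bytes, encoded_data: bytes) -> frozenset:
    """Build the set {H(f), H(d)}, requiring both f and d to be non-empty."""
    if not model_file or not encoded_data:
        # Mirrors the condition f != empty, d != empty in (5).
        raise ValueError("model file and encoded data must both be non-empty")
    return frozenset({sha256_hex(model_file), sha256_hex(encoded_data)})


if __name__ == "__main__":
    hashes = component_hashes(b"def run(ctx): ...", b"encoded-model-data")
    for h in sorted(hashes):
        print(h)
```

Using a set rather than a pair means the two digests are unordered, which matches the set notation in (5); a system that needs to tell H(f) from H(d) would likely store them under separate keys instead.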
Model groups are sets of models to run with a shared context, and shared datasets. We
define executing over model groups by

Model return types are defined at model configuration time.

ρ = result_type   (15)

and the set of result types is

where unnew is either the new risk score for user, u, or the risk score to be
used for new risk score calculation.

If there exists a model where the model version is the maximum version number in
the set of model versions, the system will infer on that model by default. Given a set
of all model versions, M_vers
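The default-version rule described above (infer on the model whose version number is the maximum in the set of model versions) can be sketched as follows. This is a hedged illustration only: the `ModelVersion` record, its integer `version` field, and the `default_version` helper are assumptions, not names taken from the system.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModelVersion:
    """Hypothetical record for one stored version of a model."""
    name: str
    version: int  # assumed to be a comparable integer


def default_version(versions: Iterable[ModelVersion]) -> ModelVersion:
    """Return the entry with the maximum version number."""
    candidates = list(versions)
    if not candidates:
        raise ValueError("no model versions available")
    return max(candidates, key=lambda m: m.version)


if __name__ == "__main__":
    m_vers = [
        ModelVersion("login_anomaly", 1),
        ModelVersion("login_anomaly", 3),
        ModelVersion("login_anomaly", 2),
    ]
    print(default_version(m_vers))
```

If versions were stored as dotted strings rather than integers, the key function would need to parse them before comparing; that detail is left open here.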
ers. "Black box" risk calculations hinder SOCs from fully understanding how
case-generating anomalies are scored, and limit their ability to concretely explain the
calculations to third parties. There is an inherent tradeoff between interpretable
risk scores and the determinism of risk score calculation. The more a model uses
stochastic and complex processes to calculate risk scores, the more SOC analysts
lose granular explainability of risk results.

The concept of a risk score is itself subjective, and is designed for human
intuition. This goal can be hindered by arbitrary vendor-driven risk score
calculations, which decrease the interpretability of model outputs. In data science,
there exists an accuracy vs. interpretability tradeoff: as more complex models are
invented and performance increases, we lose human intuition about the inner workings
of the model. This justifies enabling SOC teams to define their risk score calculation
logic as transparently as possible. Hiding concrete risk score logic only harms
day-to-day SOC decision making, and requires the SOC to place a great deal of trust in
the vendor.

We define an abstracted risk calculation function, and map it over all users

u_risk = γ(u, risk(u, t), M_ur), ∀u ∈ U   (25)

γ takes an element of the Cartesian product of the user set and two real numbers, and
yields a real number representing a new risk score for user u.

γ : U × ℝ × ℝ → ℝ   (26)

For simple risk score calculations, the function does not need u. However, the system
can consider a user's profile as context during risk calculation, as opposed to
computing a risk score independently of u.

Case Feedback

When a SOC analyst provides feedback to a model, the model adjusts its internal logic
in some meaningful way. Feedback into a model differs in several ways depending on
properties of the model. However, feedback on a rule must be kept separate from
feedback to a model.
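Returning to the abstracted risk function γ of (25) and (26), the sketch below shows one way a SOC-defined γ could be mapped over all users. It is an assumption-laden illustration: the weighted blend, the `weight` parameter, and the dictionary layout are hypothetical choices, and the text leaves the concrete form of γ to the SOC.

```python
from typing import Callable, Dict, Iterable

# gamma: (user, current risk, model result) -> new risk, as in (26).
RiskFn = Callable[[str, float, float], float]


def blend(user: str, current_risk: float, model_result: float,
          weight: float = 0.5) -> float:
    """Hypothetical gamma: weighted average of the current risk and model result.

    This simple form ignores `user`; a richer gamma could look up the user's
    profile and use it as context.
    """
    return (1.0 - weight) * current_risk + weight * model_result


def update_risks(users: Iterable[str],
                 current: Dict[str, float],
                 model_results: Dict[str, float],
                 gamma: RiskFn) -> Dict[str, float]:
    """Apply gamma to every user, mirroring the 'for all u in U' in (25)."""
    return {u: gamma(u, current[u], model_results[u]) for u in users}


if __name__ == "__main__":
    current = {"alice": 40.0, "bob": 10.0}
    results = {"alice": 80.0, "bob": 10.0}
    print(update_risks(current.keys(), current, results, blend))
```

Because γ is passed in as a plain function, the scoring logic stays visible and replaceable by the SOC, which is the transparency argument made above.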
Future Work
Herein, we have demonstrated an "Open Model" UBA solution for monitoring users and
entities. Once systems like OpenUBA are implemented, macro-level SOC activities can
be performed using the system's data and the community's data. Several threat feeds
exist today, but few, if any, focus on analytics,