Commit b06b880

document monitoring and running a full validator
1 parent f0ef0f3 commit b06b880

16 files changed

+298
-41
lines changed

README.md

+4-2
Original file line numberDiff line numberDiff line change
@@ -17,5 +17,7 @@ Alternatively you may choose to install the **stellar-quickstart** package which
1717
6. [Debug Symbols](docs/debug-symbols.md)
1818
7. [Running Horizon in production](docs/running-horizon-in-production.md)
1919
8. [Building Packages](docs/building-packages.md)
20-
9. [Publishing a History archive](docs/publishing-a-history-archive.md)
21-
10. [Testnet Reset](docs/testnet-reset.md)
20+
9. [Running a Full Validator](docs/running-a-full-validator.md)
21+
10. [Publishing a History archive](docs/publishing-a-history-archive.md)
22+
11. [Monitoring](docs/monitoring.md)
23+
12. [Testnet Reset](docs/testnet-reset.md)

docs/adding-the-sdf-stable-repository-to-your-system.md

+5-3
@@ -1,5 +1,5 @@
11
# SDF - packages
2-
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Adding the SDF stable repository to your system
1517

docs/bleeding-edge-unstable-repository.md

+5-3
@@ -1,5 +1,5 @@
11
# SDF - packages
2-
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Bleeding Edge Unstable Repository
1517

docs/building-packages.md

+6-4
@@ -1,5 +1,5 @@
1-
# SDF - packages
2-
1+
# SDF - packages
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Building Packages
1517

docs/debug-symbols.md

+6-4
@@ -1,5 +1,5 @@
1-
# SDF - packages
2-
1+
# SDF - packages
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Debug Symbols
1517

docs/installing-individual-packages.md

+6-4
@@ -1,5 +1,5 @@
1-
# SDF - packages
2-
1+
# SDF - packages
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Installing individual packages
1517

docs/monitoring.md

+94
@@ -0,0 +1,94 @@
1+
# SDF - packages
2+
3+
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
4+
2. [Quickstart](quickstart.md)
5+
3. [Installing individual packages](installing-individual-packages.md)
6+
4. [Upgrading](upgrading.md)
7+
5. [Bleeding Edge](bleeding-edge-unstable-repository.md)
8+
6. [Debug Symbols](debug-symbols.md)
9+
7. [Running Horizon in production](running-horizon-in-production.md)
10+
8. [Building Packages](building-packages.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
15+
16+
### Monitoring
17+
Monitoring `stellar-core` using Prometheus is by far the simplest solution, especially if you already have a Prometheus server within your infrastructure. Prometheus is a time-series database with a simple yet powerful query language, `PromQL`. It is also tightly integrated with Grafana, which makes it easy to render complex visualisations.
18+
19+
In order for Prometheus to scrape `stellar-core` application metrics, you will need to install the stellar-core-prometheus-exporter (`apt-get install stellar-core-prometheus-exporter`) and configure your Prometheus server to scrape this exporter (default port: `9473`).
20+
21+
#### Install a Prometheus server within your infrastructure
22+
Installing and configuring a Prometheus server is out of scope for this document, but it is a fairly simple process: Prometheus is a single Go binary which you can download from https://prometheus.io/download/.
23+
24+
#### Install the stellar-core-prometheus-exporter
25+
The stellar-core-prometheus-exporter scrapes the `stellar-core` metrics endpoint (`http://localhost:11626/metrics`) and renders these metrics in the Prometheus text-based format, ready for Prometheus to scrape and store in its time-series database.
26+
27+
The exporter needs to be installed on every stellar-core node you wish to monitor.
28+
29+
* `apt-get install stellar-core-prometheus-exporter`
30+
31+
You will need to open up port `9473` between your Prometheus server and all your `stellar-core` nodes for your Prometheus server to be able to scrape `stellar-core` metrics.
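As a quick sanity check after installing the exporter, the sketch below counts the samples it exposes. It assumes the exporter serves its output at `/metrics` on port `9473`; the path is an assumption based on the usual Prometheus convention rather than something stated here, and `count_samples` is a hypothetical helper name.

```shell
#!/bin/sh
# Count sample lines in Prometheus text-format output read from stdin.
# Lines starting with '#' are HELP/TYPE comments; blank lines are ignored.
count_samples() {
    grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}

# Usage on a stellar-core node (assumed path /metrics, default port 9473):
#   curl -s http://localhost:9473/metrics | count_samples
```

A result of `0` suggests the exporter is running but cannot reach `stellar-core`, while a connection error points at the exporter itself or at the firewall rule for port `9473`.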
32+
33+
#### Point Prometheus to stellar-core-prometheus-exporter
34+
Pointing your Prometheus instance to the exporter can be achieved by manually configuring a scrape job, but depending on the number of hosts you need to monitor this quickly becomes unwieldy. With this in mind, the process can also be automated using Prometheus' various "service discovery" plugins. For example, with AWS-hosted instances you can use the `ec2_sd_config` plugin.
35+
36+
##### Manual
37+
```yaml
38+
- job_name: 'stellar-core'
39+
  scrape_interval: 10s
40+
  scrape_timeout: 10s
41+
  static_configs:
42+
    - targets: ['core-node.example.com:9473'] # stellar-core-prometheus-exporter default port is 9473
43+
      labels: {application: 'stellar-core'}
44+
```
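The scrape job above belongs under the `scrape_configs:` key of your Prometheus configuration. Before reloading Prometheus it can help to validate the edited file. The sketch below wraps `promtool check config`, which ships alongside the Prometheus binary; the function name `check_prometheus_config` and the path in the usage line are illustrative assumptions.

```shell
#!/bin/sh
# Validate a Prometheus configuration file with promtool.
# Returns 2 when promtool is not installed, otherwise promtool's exit status.
check_prometheus_config() {
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found; skipping check" >&2
        return 2
    fi
    promtool check config "$1"
}

# Usage (the path is an assumption; use wherever your config lives):
#   check_prometheus_config /etc/prometheus/prometheus.yml
```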
45+
46+
##### Using Service Discovery (EC2)
47+
```yaml
48+
- job_name: stellar-core
49+
  scrape_interval: 10s
50+
  scrape_timeout: 10s
51+
  ec2_sd_configs:
52+
    - region: eu-west-1
53+
      port: 9473
54+
  relabel_configs:
55+
    # ignore stopped instances
56+
    - source_labels: [__meta_ec2_instance_state]
57+
      regex: stopped
58+
      action: drop
59+
    # only keep with `core` in the Name tag
60+
    - source_labels: [__meta_ec2_tag_Name]
61+
      regex: "(.*core.*)"
62+
      action: keep
63+
    # use Name tag as instance label
64+
    - source_labels: [__meta_ec2_tag_Name]
65+
      regex: "(.*)"
66+
      action: replace
67+
      replacement: "${1}"
68+
      target_label: instance
69+
    # set application label to stellar-core
70+
    - source_labels: [__meta_ec2_tag_Name]
71+
      regex: "(.*core.*)"
72+
      action: replace
73+
      replacement: stellar-core
74+
      target_label: application
75+
```
76+
77+
#### Useful Exporters
78+
79+
You may find the exporters below useful for monitoring your infrastructure, as they provide detailed insight into your operating system and database metrics. Installing and configuring these exporters is out of scope for this document but should be relatively straightforward.
80+
81+
* [node_exporter](https://prometheus.io/docs/guides/node-exporter/) can be used to track all operating system metrics.
82+
* [postgres_exporter](https://github.com/wrouesnel/postgres_exporter) can be used to monitor the local stellar-core database.
83+
84+
#### Install a Grafana server within your infrastructure
85+
Now that you have configured Prometheus to scrape and store your stellar-core metrics, you will want a convenient way to render this data for human consumption. Grafana offers the simplest and most effective way to achieve this. Again, installing Grafana is out of scope for this document but is a very simple process, especially when using the prebuilt apt packages (https://grafana.com/docs/installation/debian/#apt-repository).
86+
87+
##### Stellar Core Full dashboard
88+
We have created a `Stellar Core Full` Grafana dashboard which exposes a simple health summary as well as all `stellar-core-prometheus-exporter` metrics. This dashboard is available for installation at https://grafana.com/dashboards/10334.
89+
90+
###### Health Summary
91+
![Stellar Core Full Health Summary](../images/stellar-core-full-health-summary.png)
92+
93+
###### Overlay Metrics
94+
![Stellar Core Full Overlay Metrics](../images/stellar-core-full-overlay-metrics.png)

docs/publishing-a-history-archive.md

+6-4
@@ -1,5 +1,5 @@
1-
# SDF - packages
2-
1+
# SDF - packages
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Publishing a history archive
1517

docs/quickstart.md

+7-5
@@ -1,5 +1,5 @@
1-
# SDF - packages
2-
1+
# SDF - packages
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Quickstart
1517

@@ -86,7 +88,7 @@ As with [accessing the database directly](#accessing-the-quickstart-databases),
8688
| stellar-core | none | installs stellar-core binary, systemd service, logrotate script, documentation |
8789
| stellar-core-utils | none | installs useful command line tools (stellar-core-cmd, stellar-core-gap-detect) |
8890
| stellar-core-prometheus-exporter | none | installs a Prometheus exporter to facilitate ingesting stellar-core metrics |
89-
| stellar-core-postgres | stellar-core, PostgreSQL | configures a PostgreSQL server, creates a stellar db,role and system user |
91+
| stellar-core-postgres | stellar-core, PostgreSQL | configures a PostgreSQL server, creates a stellar db, role and system user; the default stellar-core configuration contained in this package connects to the Testnet |
9092
| stellar-archivist | none | installs stellar-archivist cli tool for managing stellar-core History archives |
9193
| stellar-horizon | none | installs stellar-horizon binary, systemd service |
9294
| stellar-horizon-utils | none | installs useful command line tools (stellar-horizon-cmd) |

docs/running-a-full-validator.md

+80
@@ -0,0 +1,80 @@
1+
# SDF - packages
2+
3+
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
4+
2. [Quickstart](quickstart.md)
5+
3. [Installing individual packages](installing-individual-packages.md)
6+
4. [Upgrading](upgrading.md)
7+
5. [Bleeding Edge](bleeding-edge-unstable-repository.md)
8+
6. [Debug Symbols](debug-symbols.md)
9+
7. [Running Horizon in production](running-horizon-in-production.md)
10+
8. [Building Packages](building-packages.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
15+
16+
## Running a Full Validator
17+
When deciding to run a Full Validator, it is important to understand the requirements and benefits of doing so. It is also worth noting that to become a **Tier 1 validator operator** one must run at least 3 full validators and have a demonstrable [high validator uptime](https://www.stellarbeat.io/nodes).
18+
19+
### Benefits
20+
Running a full validator is not only beneficial to the operator but is also a great way to contribute to the general health of the Stellar network. Full validators are the true measure of how decentralized and redundant the network is, as they are the only type of validator that performs all functions on the network. Listed below are some of the operator-level benefits.
21+
22+
* Enables deeper integrations by clients and business partners
23+
* Official endorsement of specific ledgers in real time (via signatures)
24+
* Quorum Set aligned with business priorities
25+
* Additional checks/invariants enabled
26+
* Validator can halt and/or signal, for example in the case of an issuer, that it does not agree to something
27+
28+
### Requirements
29+
Running a full validator is a fairly straightforward process and, depending on your capacity requirements, needs very few computing resources.
30+
31+
* server/instance to run the stellar-core application and corresponding postgres database
32+
* secrets management to securely store the seed(s)
33+
* access to an Object store such as S3/Spaces/Azure Blob for storing history (optional, can also be stored and served locally)
34+
* monitoring (optional, can also be plugged into existing monitoring)
35+
36+
For more in-depth requirements, please see the main [admin guide](https://www.stellar.org/developers/stellar-core/software/admin.html#full-validators).
37+
38+
#### Required Steps
39+
40+
1. install stellar-core
41+
2. configure validation
42+
3. publish history
43+
44+
### Installing stellar-core
45+
For this guide we will be documenting the process of installing the `stellar-core-postgres` Debian package. As described in more detail [here](quickstart.md#moving-on-from-quickstart), the `stellar-core-postgres` package pulls in the `stellar-core` package and configures a local PostgreSQL database server ready for use by stellar-core. Please note that by default this package connects `stellar-core` to the **Testnet** as a simple watcher node.
46+
47+
Install stellar-core and a local postgres database using PostgreSQL's `peer authentication method` for authentication/security:
48+
49+
* `apt-get install stellar-core-postgres`
50+
51+
Accessing the locally configured `stellar` PostgreSQL database and running stellar-core commands such as `new-db` is described in more detail [here](quickstart.md#accessing-the-quickstart-databases).
52+
53+
Alternatively if you prefer to install and configure PostgreSQL yourself, you will only need to install the `stellar-core` package.
54+
55+
* `apt-get install stellar-core`
56+
57+
### Configuration
58+
59+
Once you have configured your stellar-core node and its respective database, you will need to configure your stellar-core instance to become a **Full Validator**. The simplest way to create a validator configuration is to start off with a working `watcher` config and simply add the required validator configuration parameters (`NODE_IS_VALIDATOR`, `NODE_SEED`).
60+
61+
As mentioned previously, the configuration installed by the `stellar-core-postgres` package configures `stellar-core` to connect to the SDF **Testnet** as a basic watcher node. Connecting to the **Pubnet** as a watcher is simply a matter of modifying the `stellar-core` configuration to point to **Pubnet** and running `sudo -u stellar stellar-core --conf /etc/stellar/stellar-core.cfg new-db` to reset the database and buckets directory.
62+
63+
If you want to create a **Pubnet** validator, you can use the [**Pubnet Watcher**](stellar-core_pubnet_watcher.cfg) config.
64+
65+
#### Validation
66+
Configuring a node to participate in SCP and sign messages is a three-step process consisting of securely generating a `seed`, adding this seed to your configuration file and finally setting the `NODE_IS_VALIDATOR=true` parameter.
67+
68+
* create a keypair `stellar-core gen-seed`
69+
* add `NODE_SEED="SD7DN..."` to your configuration file
70+
* add `NODE_IS_VALIDATOR=true` to your configuration file
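
The last two steps can be sketched as a small helper that turns a working watcher config into a validator config. This is an illustration under stated assumptions, not the packaged tooling: `make_validator_config` and the file paths in the usage line are hypothetical, and the seed is passed in by hand after running `stellar-core gen-seed`. Because the configuration file is TOML, top-level keys must appear before any `[TABLE]` header, so the sketch writes the new keys at the top of the file rather than appending them.

```shell
#!/bin/sh
# Build a validator config from a watcher config.
# Any existing NODE_SEED / NODE_IS_VALIDATOR lines are dropped so the keys
# are not defined twice, then the new values are written at the top.
make_validator_config() {
    watcher_cfg=$1   # existing, working watcher config
    seed=$2          # secret seed printed by `stellar-core gen-seed`
    out_cfg=$3       # where to write the validator config

    [ -r "$watcher_cfg" ] || { echo "cannot read $watcher_cfg" >&2; return 1; }

    {
        printf 'NODE_SEED="%s"\n' "$seed"
        printf 'NODE_IS_VALIDATOR=true\n'
        # grep exits non-zero when nothing is left to print; that is not an error here
        grep -v -E '^[[:space:]]*(NODE_SEED|NODE_IS_VALIDATOR)[[:space:]]*=' "$watcher_cfg" || true
    } > "$out_cfg"
}

# Usage (hypothetical paths; the seed shown is the truncated placeholder from the list above):
#   make_validator_config /etc/stellar/stellar-core.cfg "SD7DN..." /tmp/stellar-core-validator.cfg
```

The resulting file contains the secret seed, so it should be readable only by the user that runs `stellar-core`.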
71+
72+
You will most likely want to share the public portion of your keypair with other validator operators so that they can add your nodes to their quorum sets. This is best achieved by publishing a [.well-known/stellar.toml](https://www.stellar.org/.well-known/stellar.toml) on your home domain. You can also use it to share other aspects of your node's configuration, such as history archive location, organisation name, etc.
73+
74+
#### Quorum Set
75+
The quorum set is used by stellar-core to define which nodes on the network you trust as well as to configure stellar-core's behaviour during arbitrary node failures. More information can be found in the [quorum set](https://www.stellar.org/developers/stellar-core/software/admin.html#crafting-a-quorum-set) section of the main admin guide.
76+
77+
### Publishing History
78+
Full validators participate in consensus and publish their history data to a blob store such as S3, Spaces or Azure Blob. This aspect of the full validator is incredibly important, as it enables other network users to connect and sync to the network using your validator, thus increasing network resiliency and decentralisation.
79+
80+
The process of publishing `stellar-core` history is described in detail [here](publishing-a-history-archive.md).

docs/running-horizon-in-production.md

+6-4
@@ -1,5 +1,5 @@
1-
# SDF - packages
2-
1+
# SDF - packages
2+
33
1. [Adding the SDF stable repository to your system](adding-the-sdf-stable-repository-to-your-system.md)
44
2. [Quickstart](quickstart.md)
55
3. [Installing individual packages](installing-individual-packages.md)
@@ -8,8 +8,10 @@
88
6. [Debug Symbols](debug-symbols.md)
99
7. [Running Horizon in production](running-horizon-in-production.md)
1010
8. [Building Packages](building-packages.md)
11-
9. [Publishing a History archive](publishing-a-history-archive.md)
12-
10. [Testnet Reset](testnet-reset.md)
11+
9. [Running a Full Validator](running-a-full-validator.md)
12+
10. [Publishing a History archive](publishing-a-history-archive.md)
13+
11. [Monitoring](monitoring.md)
14+
12. [Testnet Reset](testnet-reset.md)
1315

1416
## Running Horizon in production
1517
