Commit 5234962

Peter Bengtsson and gracepark authored
README for the src/shielding subject folder (#38707)
Co-authored-by: Grace Park <gracepark@github.com>
1 parent a391eb1 commit 5234962

File tree

2 files changed

+37
-1
lines changed

src/search/README.md

Lines changed: 1 addition & 1 deletion
@@ -125,5 +125,5 @@ Each record represents a section of a page. Sections are derived by splitting up
125125
- It's not strictly necessary to set an `objectID` as the search index will create one automatically, but by creating our own we have a guarantee that subsequent invocations of this upload script will overwrite existing records instead of creating numerous duplicate records with differing IDs.
126126
- Our search querying has typo tolerance. Try spelling something wrong and see what you get!
127127
- Our search querying has lots of controls for customizing each index, so we can add weights to certain attributes and create rules like "title is more important than body", etc. But it works pretty well as-is without any configuration.
128-
- Our search querying has support for "advanced query syntax" for exact matching of quoted expressions and exclusion of words preceded by a `-` sign. This is off by default but we have it enabled in our browser client. The settings in the web interface can be overridden by the search endpoint. See [middleware/search.js]([middleware/search.js).
128+
- Our search querying has support for "advanced query syntax" for exact matching of quoted expressions and exclusion of words preceded by a `-` sign. This is off by default, but it is enabled in our browser client. The settings in the web interface can be overridden by the search endpoint. See [middleware/search.js](middleware/search.js).
129129
- When needed, the Docs Engineering team can commit updates to the search index, as long as the label `skip-index-check` is applied to the PR.

src/shielding/README.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
# Shielding
2+
3+
## Overview
4+
5+
Essentially, this is the code in our server that prevents "junk requests": scripted HTTP requests to endpoints that are *not* made by regular browser users.
6+
7+
For example, there's middleware code that sees if a `GET` request
8+
comes in with a bunch of random-looking query string keys. This would cause a PASS on the CDN but would not actually matter to the rendering. In this
9+
case, we spot this early and return a redirect response to the same URL
10+
without the unrecognized query string keys so that if the request follows
11+
redirects, the eventual 200 would be normalized by a common URL so the CDN
12+
can serve a HIT.
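
The redirect described above could look roughly like the following. This is a minimal sketch, not the actual middleware: the `RECOGNIZED_KEYS` allow-list, the function name, and the `302` status code are all assumptions made for illustration. It only relies on the Express-style `(req, res, next)` signature and `res.redirect(status, path)`.

```javascript
// Hypothetical allow-list. The real set of recognized keys is not listed in this README.
const RECOGNIZED_KEYS = new Set(['query', 'page', 'version'])

// Express-style middleware. The name is illustrative only.
function stripUnrecognizedQueryKeys(req, res, next) {
  if (req.method !== 'GET') return next()

  // req.originalUrl holds path + query string; the base is only needed for parsing.
  const url = new URL(req.originalUrl, 'http://localhost')
  const unknownKeys = [...url.searchParams.keys()].filter((key) => !RECOGNIZED_KEYS.has(key))

  // Nothing unusual: let the request continue to normal rendering.
  if (unknownKeys.length === 0) return next()

  // Drop the unrecognized keys and redirect to the normalized URL, so that
  // clients that follow redirects all land on one cacheable URL.
  for (const key of unknownKeys) url.searchParams.delete(key)
  res.redirect(302, url.pathname + url.search)
}
```

With this sketch, a request for `/en/search?query=git&abc123=xyz` would be redirected to `/en/search?query=git`, while a request that carries only recognized keys passes straight through to `next()`.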
13+
14+
Here's a point-in-time discussion post that summarizes the *need* and many of the
15+
recent things we've done to fortify our backend servers to avoid unnecessary
16+
workloads:
17+
18+
**[How we have fortified Docs for better resiliency and availability (June 2023)](https://github.com/github/docs-engineering/discussions/3262)**
19+
20+
## How it works
21+
22+
At its root, the `src/shielding/middleware/index.js` is injected into our
23+
Express server. From there, it loads all its individual middleware handlers.
24+
25+
Each middleware is one file that focuses on a single use-case. The
26+
use-cases are born from studying log files (CDN and Azure App Service) to
27+
spot patterns of request abuse.
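
As a rough illustration of the "one entry point, many single-purpose handlers" layout, here is a dependency-free sketch. The handler names, their rules, and `composeMiddleware` are hypothetical and are not taken from `src/shielding/middleware/index.js`; in an Express app the same chaining would more typically be done with `express.Router()` and `router.use(handler)`.

```javascript
// Hypothetical handler: reject absurdly long URLs (the 2000 limit is an assumption).
function rejectOverlongUrls(req, res, next) {
  if (req.originalUrl.length > 2000) {
    return res.status(400).send('URL too long')
  }
  next()
}

// Hypothetical handler: reject URLs containing an encoded null byte.
function rejectNullBytes(req, res, next) {
  if (req.originalUrl.includes('%00')) {
    return res.status(400).send('Invalid URL')
  }
  next()
}

// Chain handlers so each runs only if the previous one called next().
function composeMiddleware(handlers) {
  return function shielding(req, res, next) {
    let index = 0
    function run(err) {
      if (err) return next(err)
      const handler = handlers[index++]
      if (!handler) return next() // all handlers passed; continue to the app
      handler(req, res, run)
    }
    run()
  }
}

// The single middleware an index file could export under these assumptions.
const shielding = composeMiddleware([rejectOverlongUrls, rejectNullBytes])
```

Keeping each rule in its own function (and, in the real code, its own file) means a new abuse pattern spotted in the logs can be handled by adding one handler to the list without touching the others.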
28+
29+
## Notes
30+
31+
- The best place to do shielding is as close to the client(s) as possible,
32+
i.e. in the CDN or in Azure Front Door. Having the code in our own backend
33+
has the advantage that it's easier to write custom business logic
34+
along with end-to-end tests.
35+
- Some shielding "tricks" appear in other places throughout the code
36+
base such as controlling the 404 response for `/assets/*` URLs.

0 commit comments
