Commit 4c7d446

Script to parse page-requests.txt
1 parent ca0ceb5 commit 4c7d446

1 file changed: +18 -0 lines changed

scripts/pageviews.py

+18
@@ -0,0 +1,18 @@
+from pprint import pprint
+
+results = {}
+
+# page-requests-20200504-20200510.txt
+# is a file containing stats from pageviews on the official Python documentation
+# (including different versions and languages)
+# grep -E '\.html$' page-requests-20200504-20200510.txt | grep -v tutorial | sed 's/3\..\///g' | sed 's/3\///g' | sed 's/2\///g' > pageviews.txt
+pages = open('pageviews.txt').readlines()[:-1]
+for p in pages:
+    count, key = int(p.split()[0]), p.split()[-1].strip()
+    if key in results:
+        results[key] += count
+    else:
+        results[key] = count
+
+for p in sorted(list(results.items()), key=lambda x: x[1], reverse=True)[50:100]:
+    print(p[1], p[0][1:])
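
Note: the grep/sed preprocessing described in the script's comment could also be done in Python, which would keep the whole pipeline in one file. The following is only a sketch, not part of this commit. It assumes each input line has the form `<count> <path>`, as the script's `split()` calls suggest. The names `normalize` and `aggregate` and the sample lines are hypothetical. Unlike the sed commands, the substitutions here are applied to the path only, not to the whole line.

```python
import re
from collections import Counter

# Mirrors the three sed substitutions, applied in the same order:
# s/3\..\///g, then s/3\///g, then s/2\///g
VERSION_PATTERNS = [re.compile(r"3\../"), re.compile(r"3/"), re.compile(r"2/")]


def normalize(path):
    """Remove version segments such as '3.8/', '3/' and '2/' from a path."""
    for pattern in VERSION_PATTERNS:
        path = pattern.sub("", path)
    return path


def aggregate(lines):
    """Sum counts per normalized path, keeping only non-tutorial .html pages."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        path = fields[-1]
        # Equivalent of: grep -E '\.html$' ... | grep -v tutorial
        if not path.endswith(".html") or "tutorial" in line:
            continue
        totals[normalize(path)] += int(fields[0])
    return totals


# Hypothetical sample data, not taken from the real stats file.
sample = [
    "120 /3/library/os.html",
    "30 /3.8/library/os.html",
    "15 /2/library/os.html",
    "99 /3/tutorial/index.html",
    "7 /3/_static/style.css",
    "50 /fr/3/library/re.html",
]
totals = aggregate(sample)
for key, count in totals.most_common():
    print(count, key[1:])  # drop the leading slash, as the script does
```

`Counter` replaces the explicit `if key in results` branch, and `most_common()` returns the entries sorted by descending count, so slicing its result (for example `[50:100]`) would select the same rank range as the script's `sorted(...)` call.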

0 commit comments