Python tool to easily report on data fetched from GitHub's GraphQL API
A project to explore the GitHub GraphQL API for fun and to output data on repos, users and commits for reporting.
Why GraphQL and not the REST API? This project arose because of speed and rate limit issues when using the REST API for large volumes of commit data. GraphQL is about 100 times faster in cases such as fetching commits, where a single GraphQL request returns a page of 100 commits while the REST API commit endpoint returns one commit per request. See the Datasources doc's GraphQL benefits section for more details.
The aim of this project is to fetch stats about GitHub repos of interest and generate reports.
The GraphQL API is used to get this data at scale, which enables quick reporting on even a large GitHub organization or user account with many repos. This project's reports generally fetch data in a single request that would otherwise take 100 or more requests to the REST API. Additionally, some of the report scripts in this project have pagination built in, to get data beyond the first page.
The response data is parsed and then printed on the screen or written to CSV reports.
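As a rough illustration of the parse-then-write step, here is a minimal sketch using only the standard library. The response shape, the `FIELDS` columns and the function names `parse_repos` and `write_report` are assumptions made for this example and are not taken from the project's scripts.

```python
import csv
import sys

# Hypothetical report columns, chosen for this example only.
FIELDS = ["name", "stars", "updated_at"]


def parse_repos(data):
    """Flatten the "data" part of a response into a list of row dicts.

    Assumes the query asked for repositoryOwner.repositories.nodes with
    the fields name, stargazerCount and updatedAt.
    """
    rows = []
    for node in data["repositoryOwner"]["repositories"]["nodes"]:
        rows.append(
            {
                "name": node["name"],
                "stars": node["stargazerCount"],
                "updated_at": node["updatedAt"],
            }
        )
    return rows


def write_report(rows, out=None):
    """Write rows as CSV to a file-like object, or to the screen by default."""
    if out is None:
        out = sys.stdout
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

To write to a file instead of the screen, open it with `newline=""` as the `csv` module recommends, for example `with open("report.csv", "w", newline="") as f: write_report(rows, f)`.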
Here is an outline of the report scripts available in this project, with links to them in the usage document.
- Demo reports - A few simple scripts.
- Repo summary reports - Get metadata about repos or counts of commits.
- Commit reports - Get the git commit history across multiple repos to see how your organization or team members contribute.
Another aim of this project is to explore how to run GraphQL queries with Python. The work here can be used as a reference by programmers new to this area. What you learn from querying GitHub can be applied to other GraphQL APIs.
No library specific to GraphQL or GitHub is used. Rather, this project's scripts use the Python requests library to send a query string and optional query parameters (GraphQL variables) to the GraphQL API.
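The approach can be sketched roughly as follows. This is a minimal example, not the project's actual code: the names `build_payload`, `run_query` and `VIEWER_QUERY` are hypothetical, and `session` is assumed to be a `requests.Session` (or anything with a compatible `post` method). The endpoint URL and the `Authorization` header follow GitHub's public GraphQL API documentation.

```python
GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Illustrative query, not taken from this project: fetch the login of
# the user that owns the token.
VIEWER_QUERY = """
query {
  viewer {
    login
  }
}
"""


def build_payload(query, variables=None):
    """Build the JSON body: the query string plus optional variables."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return payload


def run_query(session, token, query, variables=None):
    """POST a query and return the "data" part of the response.

    `session` is assumed to be a requests.Session.
    """
    response = session.post(
        GITHUB_GRAPHQL_URL,
        json=build_payload(query, variables),
        headers={"Authorization": f"bearer {token}"},
        timeout=30,
    )
    response.raise_for_status()
    body = response.json()
    # GraphQL can report errors in the body even when the HTTP status is 200.
    if "errors" in body:
        raise RuntimeError(f"GraphQL errors: {body['errors']}")
    return body["data"]
```

A call might look like `run_query(requests.Session(), token, VIEWER_QUERY)` after `import requests`, where `token` holds your API token. Passing the session in as an argument keeps the function easy to test without network access.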
The project includes Python scripts and GraphQL queries of varying complexity. Some reports fetch multiple pages of data. Some accept command-line arguments. One of them reads required report data from a config file.
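Fetching multiple pages generally means following the cursor that the API returns in `pageInfo`. The sketch below shows one way to do that for commit history. It is an assumption-laden example rather than this project's implementation: `COMMITS_QUERY` and `iter_commits` are hypothetical names, `fetch(query, variables)` stands in for whatever function sends the request and returns the "data" part of the response, and the repo is assumed to have a default branch.

```python
# Illustrative query: up to 100 commits per page from the default branch.
COMMITS_QUERY = """
query ($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    defaultBranchRef {
      target {
        ... on Commit {
          history(first: 100, after: $cursor) {
            pageInfo { hasNextPage endCursor }
            nodes { oid committedDate messageHeadline }
          }
        }
      }
    }
  }
}
"""


def iter_commits(fetch, owner, name):
    """Yield commit nodes one page at a time until no pages remain.

    `fetch(query, variables)` is a hypothetical callable that returns
    the "data" part of a GraphQL response as a dict.
    """
    cursor = None  # None requests the first page.
    while True:
        variables = {"owner": owner, "name": name, "cursor": cursor}
        data = fetch(COMMITS_QUERY, variables)
        history = data["repository"]["defaultBranchRef"]["target"]["history"]
        yield from history["nodes"]
        page_info = history["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        # Continue from where the previous page ended.
        cursor = page_info["endCursor"]
```

Because `iter_commits` is a generator, a caller can stop early (for example after a date cutoff) without requesting the remaining pages.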
You need the following to run this project:
- GitHub account
- GitHub API token with access to repos
- Internet connection
- Python 3.6+
See the following guides so you can use this project to generate some reports for yourself on users or repos you are interested in. Note that these only cover the case of a Unix-style environment.
- Installation - Set up the project environment and configs.
- Usage - Run scripts to generate reports.
- Datasources - Info and links around APIs.