Commit e27255a

docs: document and announce the proposed architectural redesign
1 parent df3ea35 commit e27255a

4 files changed

+368
-0
lines changed

README.md

Lines changed: 34 additions & 0 deletions
@@ -11,6 +11,7 @@
1111
[![Build Status](https://github.com/ruby-git/ruby-git/workflows/CI/badge.svg?branch=main)](https://github.com/ruby-git/ruby-git/actions?query=workflow%3ACI)
1212
[![Conventional Commits](https://img.shields.io/badge/Conventional%20Commits-1.0.0-%23FE5196?logo=conventionalcommits&logoColor=white)](https://conventionalcommits.org)
1313

14+
- [📢 Architectural Redesign 📢](#-architectural-redesign-)
1415
- [📢 We Now Use RuboCop 📢](#-we-now-use-rubocop-)
1516
- [📢 Default Branch Rename 📢](#-default-branch-rename-)
1617
- [📢 We've Switched to Conventional Commits 📢](#-weve-switched-to-conventional-commits-)
@@ -24,6 +25,39 @@
2425
- [Ruby version support policy](#ruby-version-support-policy)
2526
- [License](#license)
2627

28+
## 📢 Architectural Redesign 📢
29+
30+
The git gem is undergoing a significant architectural redesign for the upcoming
31+
v5.0.0 release. The current architecture has several design challenges that make it
32+
difficult to maintain and evolve. This redesign aims to address these issues by
33+
introducing a clearer, more robust, and more testable structure.
34+
35+
We have prepared detailed documents outlining the analysis of the current
36+
architecture and the proposed changes. We encourage our community and contributors to
37+
review them:
38+
39+
1. [Analysis of the Current Architecture](redesign/1_architecture_existing.md): A
40+
breakdown of the existing design and its challenges.
41+
2. [The Proposed Redesign](redesign/2_architecture_redesign.md): An overview of the
42+
new three-layered architecture.
43+
3. [Implementation Plan](redesign/3_architecture_implementation.md): The step-by-step
44+
plan for implementing the redesign.
45+
46+
Your feedback is welcome! Please feel free to open an issue to discuss the proposed
47+
changes.
48+
49+
> **DON'T PANIC!**
50+
>
51+
> While this is a major internal refactoring, our goal is to keep the primary public
52+
API on the main repository object as stable as possible. Most users who rely on
53+
documented methods like `g.commit`, `g.add`, and `g.status` should find the
54+
transition to v5.0.0 straightforward.
55+
>
56+
> The breaking changes will primarily affect users who have been relying on the
57+
internal `g.lib` accessor, which will be removed as part of this cleanup. For more
58+
details, please see the "Impact on Users" section in [the redesign
59+
document](redesign/2_architecture_redesign.md).
60+
2761
## 📢 We Now Use RuboCop 📢
2862

2963
To improve code consistency and maintainability, the `ruby-git` project has now

redesign/1_architecture_existing.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
# Analysis of the Current Git Gem Architecture and Its Challenges
2+
3+
This document provides an in-depth look at the current architecture of the `git` gem, outlining its primary components and the design challenges that have emerged over time. These challenges are the key motivation for the proposed architectural redesign.
4+
5+
- [1. Overview of the Current Architecture](#1-overview-of-the-current-architecture)
6+
- [2. Key Architectural Challenges](#2-key-architectural-challenges)
7+
- [A. Unclear Separation of Concerns](#a-unclear-separation-of-concerns)
8+
- [B. Circular Dependency](#b-circular-dependency)
9+
- [C. Undefined Public API Boundary](#c-undefined-public-api-boundary)
10+
- [D. Slow and Brittle Test Suite](#d-slow-and-brittle-test-suite)
11+
12+
## 1. Overview of the Current Architecture
13+
14+
The gem's current design is centered around three main components: the `Git` module and the `Git::Base` and `Git::Lib` classes.
15+
16+
- **`Git` (Top-Level Module)**: This module serves as the primary public entry point for creating repository objects. It contains class-level factory methods like `Git.open`, `Git.clone`, and `Git.init`. It also provides an interface for accessing global git configuration settings.
17+
18+
- **`Git::Base`**: This is the main object that users interact with after creating or opening a repository. It holds the high-level public API for most git operations (e.g., `g.commit`, `g.add`, `g.status`). It is responsible for managing the repository's state, such as the paths to the working directory and the `.git` directory.
19+
20+
- **`Git::Lib`**: This class is intended to be the low-level wrapper around the `git` command-line tool. It contains the methods that build the specific command-line arguments and execute the `git` binary. In practice, it also contains a significant amount of logic for parsing the output of these commands.
21+
22+
## 2. Key Architectural Challenges
23+
24+
While this structure has been functional, several significant design challenges make the codebase difficult to maintain, test, and evolve.
25+
26+
### A. Unclear Separation of Concerns
27+
28+
The responsibilities of `Git::Base` and `Git::Lib` are "muddy" and overlap significantly.
29+
30+
- `Git::Base` sometimes contains logic that feels like it should be lower-level.
31+
32+
- `Git::Lib`, which should ideally only be concerned with command execution, is filled with high-level logic for parsing command output into specific Ruby objects (e.g., parsing log output, diff stats, and branch lists).
33+
34+
This blending of responsibilities makes it hard to determine where a specific piece of logic should reside, leading to an inconsistent and confusing internal structure.
35+
36+
### B. Circular Dependency
37+
38+
This is the most critical architectural flaw in the current design.
39+
40+
- A `Git::Base` instance is created.
41+
42+
- The first time a command is run, `Git::Base` lazily initializes a `Git::Lib` instance via its `.lib` accessor method.
43+
44+
- The `Git::Lib` constructor is passed the `Git::Base` instance (`self`) so that it can read the repository's path configuration back from the object that is creating it.
45+
46+
This creates a tight, circular coupling: `Git::Base` depends on `Git::Lib` to execute commands, but `Git::Lib` depends on `Git::Base` for its own configuration. This pattern makes the classes difficult to instantiate or test in isolation and creates a fragile system where changes in one class can have unexpected side effects in the other.
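
The coupling can be illustrated with a minimal sketch. `SketchBase` and `SketchLib` are hypothetical, simplified stand-ins invented for this example; they are not the gem's real `Git::Base` and `Git::Lib` and only mirror the shape of the dependency described above.

```ruby
# Hypothetical stand-in for Git::Lib (not the gem's real class).
class SketchLib
  def initialize(base)
    # The low-level object reads its configuration back from
    # the high-level object that is creating it.
    @working_dir = base.working_dir
  end

  # Builds (but does not execute) a command line, for illustration only.
  def command_line(subcommand)
    "git -C #{@working_dir} #{subcommand}"
  end
end

# Hypothetical stand-in for Git::Base (not the gem's real class).
class SketchBase
  attr_reader :working_dir

  def initialize(working_dir)
    @working_dir = working_dir
  end

  # Lazily creates the low-level object and passes `self` to it,
  # which closes the loop: Base -> Lib -> Base.
  def lib
    @lib ||= SketchLib.new(self)
  end

  def status_command
    lib.command_line('status')
  end
end

base = SketchBase.new('/tmp/project')
puts base.status_command
```

In this sketch, `SketchLib` cannot be constructed without an object that behaves like `SketchBase`, which is the difficulty with isolated instantiation and testing noted above.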
47+
48+
### C. Undefined Public API Boundary
49+
50+
The gem lacks a formally defined public interface. Because `Git::Base` exposes its internal `Git::Lib` instance via the public `g.lib` accessor, many users have come to rely on `Git::Lib` and its methods as if they were part of the public API.
51+
52+
This has two negative consequences:
53+
54+
1. It prevents the gem's maintainers from refactoring or changing the internal implementation of `Git::Lib` without causing breaking changes for users.
55+
56+
2. It exposes complex, internal methods to users, creating a confusing and inconsistent user experience.
57+
58+
### D. Slow and Brittle Test Suite
59+
60+
The current testing strategy, built on `TestUnit`, suffers from two major issues:
61+
62+
- **Over-reliance on Fixtures**: Most tests depend on having a complete, physical git repository on the filesystem to run against. Managing these fixtures is cumbersome.
63+
64+
- **Excessive Shelling Out**: Because command execution and output parsing are tightly coupled, nearly every test must shell out to the actual `git` command-line tool.
65+
66+
This makes the test suite extremely slow, especially on non-UNIX platforms like Windows where process creation is more expensive. The slow feedback loop discourages frequent testing and makes development more difficult. The brittleness of filesystem-dependent tests also leads to flaky, unreliable test runs.

redesign/2_architecture_redesign.md

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
1+
# Proposed Redesigned Architecture for the Git Gem
2+
3+
This document outlines a proposal for a major redesign of the `git` gem, targeted for version 5.0.0. The goal of this redesign is to modernize the gem's architecture, making it more robust, maintainable, testable, and easier for new contributors to understand.
4+
5+
- [1. Motivation](#1-motivation)
6+
- [2. The New Architecture: A Three-Layered Approach](#2-the-new-architecture-a-three-layered-approach)
7+
- [3. Key Design Principles](#3-key-design-principles)
8+
- [A. Clear Public vs. Private API](#a-clear-public-vs-private-api)
9+
- [B. Dependency Injection](#b-dependency-injection)
10+
- [C. Immutable Return Values](#c-immutable-return-values)
11+
- [D. Clear Naming for Path Objects](#d-clear-naming-for-path-objects)
12+
- [4. Testing Strategy Overhaul](#4-testing-strategy-overhaul)
13+
- [5. Impact on Users: Breaking Changes for v5.0.0](#5-impact-on-users-breaking-changes-for-v500)
14+
15+
## 1. Motivation
16+
17+
The current architecture, while functional, has several design issues that have accrued over time, making it difficult to extend and maintain.
18+
19+
- **Unclear Separation of Concerns**: The responsibilities of `Git`, `Git::Base`, and `Git::Lib` are "muddy." `Git::Base` acts as both a high-level API and a factory, while `Git::Lib` contains a mix of low-level command execution and high-level output parsing.
20+
21+
- **Circular Dependency**: A key architectural flaw is the circular dependency between `Git::Base` and `Git::Lib`. `Git::Base` creates and depends on `Git::Lib`, but `Git::Lib`'s constructor requires an instance of `Git::Base` to access configuration. This tight coupling makes the classes difficult to reason about and test in isolation.
22+
23+
- **Undefined Public API**: The boundary between the gem's public API and its internal implementation is not clearly defined. This has led some users to rely on internal classes like `Git::Lib`, making it difficult to refactor the internals without introducing breaking changes.
24+
25+
- **Slow and Brittle Test Suite**: The current tests rely heavily on filesystem fixtures and shelling out to the `git` command line for almost every test case. This makes the test suite slow and difficult to maintain, especially on non-UNIX platforms.
26+
27+
## 2. The New Architecture: A Three-Layered Approach
28+
29+
The new design is built on a clear separation of concerns, dividing responsibilities into three distinct layers: a Facade, an Execution Context, and Command Objects.
30+
31+
1. The Facade Layer: `Git::Repository`
32+
33+
This is the primary public interface that users will interact with.
34+
35+
**Renaming**: `Git::Base` will be renamed to `Git::Repository`. This name is more descriptive and intuitive.
36+
37+
**Responsibility**: It will serve as a clean, high-level facade for all common git operations. Its methods will be simple, one-line calls that delegate the actual work to an appropriate command object.
38+
39+
**Scalability**: To prevent this class from growing too large, its methods will be organized into logical modules (e.g., `Git::Repository::Branching`, `Git::Repository::History`) which are then included in the main class. This keeps the core class definition small and the features well-organized. These categories will be inspired by (but not slavishly follow) the git command-line reference on [this page](https://git-scm.com/docs).
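
A rough sketch of how a facade might be split into included modules follows. All names here (`SketchFacade`, `SketchRepo`, `run_command`, and the method names) are hypothetical placeholders chosen for illustration; the proposal does not specify the real module contents.

```ruby
# Hypothetical feature modules; the names and methods are illustrative only.
module SketchFacade
  module Branching
    # One-line facade method that delegates the real work elsewhere.
    def branch_names
      run_command(:branch_list)
    end
  end

  module History
    def log(count)
      run_command(:log, count)
    end
  end
end

# Hypothetical facade class that stays small by including feature modules.
class SketchRepo
  include SketchFacade::Branching
  include SketchFacade::History

  private

  # Placeholder for delegation to a command object. Here it simply
  # echoes the request so the sketch runs without git installed.
  def run_command(name, *args)
    [name, *args]
  end
end

repo = SketchRepo.new
p repo.log(5)
p repo.branch_names
```

Each module groups related methods, while the class body itself contains only the `include` lines and shared plumbing.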
40+
41+
2. The Execution Layer: `Git::ExecutionContext`
42+
43+
This is the low-level, private engine for running commands.
44+
45+
**Renaming**: `Git::Lib` will be renamed to `Git::ExecutionContext`.
46+
47+
**Responsibility**: Its sole purpose is to execute raw git commands. It will manage the repository's environment (working directory, `.git` path, logger) and use the existing `Git::CommandLine` class to interact with the system's `git` binary. It will have no knowledge of any specific git command's arguments or output.
48+
49+
3. The Logic Layer: `Git::Commands`
50+
51+
This is where all the command-specific implementation details will live.
52+
53+
**New Classes**: For each git operation, a new command class will be created within the `Git::Commands` namespace (e.g., `Git::Commands::Commit`, `Git::Commands::Diff`).
54+
55+
**Dual Responsibility**: Each command class will be responsible for:
56+
57+
1. **Building Arguments**: Translating high-level Ruby options into the specific command-line flags and arguments that git expects.
58+
59+
2. **Parsing Output**: Taking the raw string output from the `ExecutionContext` and converting it into rich, structured Ruby objects.
60+
61+
**Handling Complexity**: For commands with multiple behaviors (like `git diff`), we can use specialized subclasses (e.g., `Git::Commands::Diff::NameStatus`, `Git::Commands::Diff::Stats`) to keep each class focused on a single responsibility.
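
The dual responsibility of a command class can be sketched as follows. `BranchListCommand`, `FakeExecutionContext`, and the `run` method are hypothetical names assumed for this example; the proposal does not define the actual interface between command classes and `Git::ExecutionContext`. The fake context returns canned text, so no `git` process is started.

```ruby
# Hypothetical stand-in for an execution context. It records the arguments
# it receives and returns canned output instead of running git.
class FakeExecutionContext
  attr_reader :calls

  def initialize(output)
    @output = output
    @calls = []
  end

  def run(*args)
    @calls << args
    @output
  end
end

# Hypothetical command class (not the gem's actual implementation).
class BranchListCommand
  Branch = Struct.new(:name, :current, keyword_init: true)

  def initialize(context)
    @context = context
  end

  def call(all: false)
    parse(@context.run(*build_args(all: all)))
  end

  private

  # Responsibility 1: translate Ruby options into command-line arguments.
  def build_args(all:)
    args = ['branch', '--list']
    args << '--all' if all
    args
  end

  # Responsibility 2: turn raw output into structured Ruby objects.
  # Each line of `git branch` output starts with a two-character marker.
  def parse(output)
    output.each_line.map do |line|
      Branch.new(name: line[2..-1].strip, current: line.start_with?('* ')).freeze
    end
  end
end

context = FakeExecutionContext.new("* main\n  feature/redesign\n")
branches = BranchListCommand.new(context).call(all: true)
branches.each { |b| puts "#{b.name} (current: #{b.current})" }
```

Because the context is passed in, the parsing logic can be exercised with any string, which is the property the proposed unit tests rely on.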
62+
63+
## 3. Key Design Principles
64+
65+
The new architecture will be guided by the following modern design principles.
66+
67+
### A. Clear Public vs. Private API
68+
69+
A primary goal of this redesign is to establish a crisp boundary between the public API and internal implementation details.
70+
71+
- **Public Interface**: The public API will consist of the `Git` module (for factories), the `Git::Repository` class, and the specialized data/query objects it returns (e.g., `Git::Log`, `Git::Status`, `Git::Object::Commit`).
72+
73+
- **Private Implementation**: All other components, including `Git::ExecutionContext` and all classes within the `Git::Commands` namespace, will be considered internal. They will be explicitly marked with the `@api private` YARD tag to discourage external use.
74+
75+
### B. Dependency Injection
76+
77+
The circular dependency will be resolved by implementing a clear, one-way dependency flow.
78+
79+
1. The factory methods (`Git.open`, `Git.clone`) will create and configure an instance of `Git::ExecutionContext`.
80+
81+
2. This `ExecutionContext` instance will then be injected into the constructor of the `Git::Repository` object.
82+
83+
This decouples the `Repository` from its execution environment, making the system more modular and easier to test.
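
A minimal sketch of this one-way flow is shown below. `SketchGit`, `SketchContext`, and `InjectedRepository` are hypothetical names used only for illustration, and the `run` method is an assumed interface; it returns a description string instead of spawning a process so the example runs anywhere.

```ruby
# Hypothetical execution context: it knows the environment but has no
# reference to the repository object that uses it.
class SketchContext
  attr_reader :working_dir

  def initialize(working_dir:)
    @working_dir = working_dir
  end

  # Describes the command rather than executing it (illustration only).
  def run(*args)
    "would run: git #{args.join(' ')} (in #{working_dir})"
  end
end

# Hypothetical repository facade: it receives its context from outside.
class InjectedRepository
  def initialize(context)
    @context = context
  end

  def status
    @context.run('status')
  end
end

# Hypothetical factory: builds the context first, then injects it.
module SketchGit
  def self.open(working_dir)
    context = SketchContext.new(working_dir: working_dir)
    InjectedRepository.new(context)
  end
end

puts SketchGit.open('/tmp/project').status
```

The dependency now points in one direction only: the repository depends on the context, and a test can pass in any object that responds to `run`.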
84+
85+
### C. Immutable Return Values
86+
87+
To create a more predictable and robust API, methods will return structured, immutable data objects instead of raw strings or hashes.
88+
89+
This will be implemented using `Data.define` or simple, frozen `Struct`s.
90+
91+
For example, instead of returning a raw string, `repo.config('user.name')` will return a `Git::Config::Value` object containing the key, value, scope, and source path.
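
As a rough illustration of the frozen `Struct` option, the sketch below defines a hypothetical `ConfigValue` with the four fields mentioned above. The field names, the `build_config_value` helper, and the sample values are assumptions made for this example, not the gem's actual `Git::Config::Value` definition. On Ruby 3.2 or later, `Data.define` could be used in place of the frozen `Struct`.

```ruby
# Hypothetical value object; field names are assumed for illustration.
ConfigValue = Struct.new(:key, :value, :scope, :source_path, keyword_init: true)

# Hypothetical helper that always hands back a frozen instance.
def build_config_value(**attrs)
  ConfigValue.new(**attrs).freeze
end

name = build_config_value(
  key: 'user.name',
  value: 'Jane Doe',
  scope: 'global',
  source_path: '/home/jane/.gitconfig'
)

puts "#{name.key} = #{name.value} (#{name.scope})"

begin
  # Attempting to reassign a member of a frozen Struct raises FrozenError.
  name.value = 'Someone Else'
rescue FrozenError => e
  puts "rejected: #{e.class}"
end
```

Note that `freeze` is shallow: it prevents reassigning the struct's members, but the member strings themselves would need to be frozen separately to be fully immutable.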
92+
93+
### D. Clear Naming for Path Objects
94+
95+
To improve clarity, all classes that represent filesystem paths will be renamed to follow a consistent `...Path` suffix convention.
96+
97+
- `Git::WorkingDirectory` -> `Git::WorkingTreePath`
98+
99+
- `Git::Index` -> `Git::IndexPath`
100+
101+
- The old `Git::Repository` (representing the `.git` directory/file) -> `Git::RepositoryPath`
102+
103+
## 4. Testing Strategy Overhaul
104+
105+
The test suite will be modernized to be faster, more reliable, and easier to work with.
106+
107+
- **Migration to RSpec**: The entire test suite will be migrated from TestUnit to RSpec to leverage its modern tooling and expressive DSL.
108+
109+
- **Layered Testing**: A three-layered testing strategy will be adopted:
110+
111+
1. **Unit Tests**: The majority of tests will be fast, isolated unit tests for the `Command` classes, using mock `ExecutionContext` objects.
112+
113+
2. **Integration Tests**: A small number of integration tests will verify that `ExecutionContext` correctly interacts with the system's `git` binary.
114+
115+
3. **Feature Tests**: A minimal set of high-level tests will ensure the public facade on `Git::Repository` works end-to-end.
116+
117+
- **Reduced Filesystem Dependency**: This new structure will dramatically reduce the suite's reliance on slow and brittle filesystem fixtures.
118+
119+
## 5. Impact on Users: Breaking Changes for v5.0.0
120+
121+
This redesign is a significant undertaking and will be released as version 5.0.0. It includes several breaking changes that users will need to be aware of when upgrading.
122+
123+
- **`Git::Lib` is Removed**: Any code directly referencing `Git::Lib` will break.
124+
125+
- **`g.lib` Accessor is Removed**: The `.lib` accessor on repository objects will be removed.
126+
127+
- **Internal Methods Relocated**: Methods that were previously accessible via `g.lib` will now be private implementation details of the new command classes and will not be directly reachable.
128+
129+
Users should only rely on the newly defined public interface.
130+
