Diffstat (limited to 'fi-prune-empty2/HACKING.md')
-rw-r--r-- | fi-prune-empty2/HACKING.md | 161 |
1 files changed, 161 insertions, 0 deletions
diff --git a/fi-prune-empty2/HACKING.md b/fi-prune-empty2/HACKING.md
new file mode 100644
index 0000000..da6ed91
--- /dev/null
+++ b/fi-prune-empty2/HACKING.md
@@ -0,0 +1,161 @@
+# DESCRIPTION
+
+Command fi-prune-empty is a filter that prunes a fast-import stream,
+removing empty commits and empty merges. It writes `git replace`
+references to map between the original git objects and the resulting
+git objects.
+
+Like all of the filters in this repository, fi-prune-empty works by
+implementing a `fiutil.Handler` interface that `main.go` passes to
+`fiutil.RunHandler()`. However, there's an additional wrinkle:
+because fi-prune-empty doesn't just care whether something *is* empty
+but cares whether that thing *became* empty as a result of
+filtering the input stream, it needs a way (the `--srcdir=` option) to
+bypass the input filtering and ask questions directly of the original
+source repo. So there are two sources of input. And the "output"
+stream actually has information flow bi-directionally, since we can
+send `get-mark` queries to it; so kinda three sources of input.
+
+# CODE LAYOUT
+
+`main.go`
+: Parses the arguments, and sets the whole thing running by calling
+  `fiutil.RunHandler()`.
+
+`prune.go`
+: Contains logic for deciding if a commit should be pruned or not,
+  which primarily consists of the `Pruner` object.
+
+`replace.go`
+: Contains the logic for writing `git replace` refs, which primarily
+  consists of the `Replacer` object.
+
+`stream.go`
+: Contains the logic for calling to the `Pruner` and to the
+  `Replacer` (which `main.go` aggregates together into an
+  implementation of the `stream.go:Driver` interface) and then
+  applying the results to the stream. This primarily consists of the
+  `Handler` object.
+
+`srcrepo.go`
+: Contains the systems-y helper functions for interacting directly
+  with the source repo.
+
+# ARCHITECTURE OVERVIEW
+
+The fi-prune-empty `Handler` is one of the largest handlers that I've
+implemented, and it is sort-of spread across several files.
+
+The `stream.go:Handler` struct contains the system logic of processing
+the input stream and emitting the output stream, calling to an inner
+`stream.go:Driver` interface for all of the business logic.
+
+```ascii
+                                                      +-[fi-prune-empty]------------------------------------------------+
+                                                      |                                                                 |
++----------+                                          |               +-[fiutil.RunHandler]-----------------+           |                               +----------+
+| src repo |    +-[git fast-export]-+    +-[???]-+    |  {frontend}   |     +-{stream.go:Handler}---+       |           |                               | dst repo |
+|          >--->|                   |>-->|       |>-->|>------------->|>-,  |                       |       |           |                               |          |
+|          |    +-------------------+    +-------+    |               |  `->|>--,  +-{Driver}--+    |       |           |                               |          |
+|          |                                          |               |     |   `->|           |<><>|<>-,   | {backend} |      +-[git fast-import]-+    |          |
+|          |               {--srcdir=}                | {args.srcdir} |     |      |           |    |   }-------------<>|<>--<>|                   |>--->          |
+|          >---------------------------------------------------------------------->|           |>-------'   |           |      +-------------------+    |          |
+|          |                                          |               |     |      +-----------+    |       |           |                               |          |
++----------+                                          |               |     |                       |       |           |                               +----------+
+                                                      |               |     +-----------------------+       |           |
+                                                      |               +-------------------------------------+           |
+                                                      |                                                                 |
+                                                      +-----------------------------------------------------------------+
+```
+
+The `Handler` contains a lot of boring glue, but it's *not just*
+boring glue. It contains all of the logic for keeping track of which
+marks and refs have been emitted, what each ref currently points to,
+and what's in the tree of the current commit. It deals with a lot of
+state. State is gross and kind of wants to gobble up all of the other
+logic. Don't let it! Keep it easy to look at how decisions are made
+by forcing it to call out to an external business-logic `Driver` for
+those decisions!
+
+> First of all, I want to say that the `Handler` was designed by
+> writing the "business-logic" how I thought best, and then
+> implementing the "system-logic" around that to make it work.
+> It was designed in a business→system order. But I'm going to explain it in
+> a system→business order.
+
+The `Driver` interface is relatively simple (I've ordered the methods
+in the rough order that they get called in):
+
+```ascii
+                 +-{Driver}------------+
+                 |                     |
+(input-commit) >---> ProcessCommit() >---> (output-commit)
+                 |                     |   - Do we even output this commit?
+                 |                     |   - Do we need to change this
+                 |                     |     commit's list of parents?
+                 |                     |
+                 |   GotMark() <---------< (output-mark, output-hash)
+                 |                     |
+  (input-mark) >---> FixMark() >---------> (output-mark)
+                 |                     |
+  (input-mark) >---> FixCommitIsh() >----> (output-mark)
+                 |                     |
+                 |   HandleDone() >------> (arbitrary stream output)
+                 |                     |
+                 +---------------------+
+```
+
+`GotMark()` is mostly just called immediately after `Handler` deals
+with the result of `ProcessCommit()`, but there are a few other times
+it can get called as well.
+
+The `Driver` is actually implemented in two parts:
+
+ 1. The `Replacer`, which mostly just listens until the very end, when
+    `HandleDone()` is called and it emits a bunch of `refs/replace/`
+    refs that record everything that happened.
+ 2. The `Pruner`, which takes an active role in deciding what and how to
+    prune.
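Read as Go, the method list in the diagram above might look roughly like the sketch below. Only the five method names come from this document; the `Mark` and `Commit` types, every signature, and the `passthrough` implementation are assumptions made up for illustration, and the real `stream.go:Driver` may well differ.

```go
package main

import "fmt"

// Mark and Commit are hypothetical stand-ins for the real stream types.
type Mark int

type Commit struct {
	Mark    Mark
	Parents []Mark
}

// Driver lists the five methods from the diagram; the signatures are guesses.
type Driver interface {
	// ProcessCommit decides whether to output the commit and with which parents.
	ProcessCommit(in Commit) (out Commit, keep bool)
	// GotMark reports the hash that an emitted mark resolved to.
	GotMark(out Mark, hash string)
	// FixMark and FixCommitIsh translate input marks to output marks.
	FixMark(in Mark) Mark
	FixCommitIsh(in Mark) Mark
	// HandleDone returns any extra lines to write at the end of the stream.
	HandleDone() []string
}

// passthrough is a do-nothing Driver: it keeps every commit unchanged.
type passthrough struct{ hashes map[Mark]string }

func (p *passthrough) ProcessCommit(in Commit) (Commit, bool) { return in, true }
func (p *passthrough) GotMark(out Mark, hash string) { p.hashes[out] = hash }
func (p *passthrough) FixMark(in Mark) Mark { return in }
func (p *passthrough) FixCommitIsh(in Mark) Mark { return in }
func (p *passthrough) HandleDone() []string {
	return []string{fmt.Sprintf("# saw %d marks", len(p.hashes))}
}

func main() {
	var d Driver = &passthrough{hashes: map[Mark]string{}}
	// Same order as the diagram: ProcessCommit, then GotMark, then the fixers.
	out, keep := d.ProcessCommit(Commit{Mark: 1})
	if keep {
		d.GotMark(out.Mark, "placeholder-hash")
	}
	fmt.Println(d.FixMark(1), d.HandleDone())
}
```

The point of the shape is that the caller owns all of the stream state and only asks the `Driver` for decisions, so a trivial implementation like `passthrough` is enough to exercise the calling side.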
+
+`main.go` contains the aggregate `Driver` implementation; its `type
+driver struct` aggregates the `Replacer` and the `Pruner` together
+into a complete `Driver` implementation:
+
+```ascii
+                +-{main.go:driver}---------------------+
+                |                                      |
+                |   +-{Replacer}----+-{Pruner}-----+   |
+                |   |               :              |   |
+                |   |               :              |   |
+                |   |               :              |   |
+                |  ,---> Replacer.ProcessCommit()  |   |
+(input-commit) >-------> Pruner.ProcessCommit() >-------> (output-commit)
+                |   |               :              |   |
+                |   |               :              |   |
+                |   |               :              |   |
+                |   |               :              |   |
+                |   |   Replacer.GotMark() <---------, |
+                |   |    Pruner.GotMark() <-------------< (output-mark, output-hash)
+                |   |               :              |   |
+                |   |               :              |   |
+                |   | +-------------+              |   |
+                |   | |                            |   |
+                |   | |                            |   |
+ (input-mark) >--------> Pruner.FixMark() >--------------> (output-mark)
+                |   | |                            |   |
+                |   | |                            |   |
+                |   | |                            |   |
+                |   | |                            |   |
+ (input-mark) >--------> Pruner.FixCommitIsh() >---------> (output-mark)
+                |   | |                            |   |
+                |   | |                            |   |
+                |   | +--------------------------+ |   |
+                |   |                            | |   |
+                |   |                            | |   |
+                |   |   Replacer.HandleDone() >----------> (arbitrary stream output)
+                |   |                            | |   |
+                |   |                            | |   |
+                |   +----------------------------+-+   |
+                |                                      |
+                +--------------------------------------+
+```
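The delegation in the diagram above can be sketched as a small Go program. This is a minimal illustration and not the code in `main.go`: the `Mark` and `Commit` types, the `Empty` flag (the real `Pruner` has to work emptiness out for itself), all signatures, and the bodies of `Replacer` and `Pruner` are assumptions. Only the routing of each `driver` method to `Replacer`, `Pruner`, or both follows the diagram.

```go
package main

import "fmt"

// Mark and Commit are hypothetical stand-ins for the real stream types.
type Mark int

type Commit struct {
	Mark    Mark
	Parents []Mark
	Empty   bool // assumption: the real Pruner determines this itself
}

// Replacer only listens; it speaks up in HandleDone.
type Replacer struct {
	inputs  []Mark
	outputs []string
}

func (r *Replacer) ProcessCommit(in Commit) { r.inputs = append(r.inputs, in.Mark) }
func (r *Replacer) GotMark(out Mark, hash string) { r.outputs = append(r.outputs, hash) }
func (r *Replacer) HandleDone() []string {
	return []string{fmt.Sprintf("# %d commits in, %d marks out", len(r.inputs), len(r.outputs))}
}

// Pruner drops empty single-parent commits and remembers where their marks now point.
type Pruner struct{ remap map[Mark]Mark }

func (p *Pruner) ProcessCommit(in Commit) (Commit, bool) {
	if in.Empty && len(in.Parents) == 1 {
		p.remap[in.Mark] = p.FixMark(in.Parents[0])
		return Commit{}, false
	}
	out := Commit{Mark: in.Mark}
	for _, parent := range in.Parents {
		out.Parents = append(out.Parents, p.FixMark(parent))
	}
	return out, true
}
func (p *Pruner) GotMark(out Mark, hash string) {}
func (p *Pruner) FixMark(in Mark) Mark {
	if out, ok := p.remap[in]; ok {
		return out
	}
	return in
}
func (p *Pruner) FixCommitIsh(in Mark) Mark { return p.FixMark(in) }

// driver routes each call the way the diagram shows.
type driver struct {
	Replacer *Replacer
	Pruner   *Pruner
}

func (d driver) ProcessCommit(in Commit) (Commit, bool) {
	d.Replacer.ProcessCommit(in) // the Replacer listens first
	return d.Pruner.ProcessCommit(in)
}
func (d driver) GotMark(out Mark, hash string) {
	d.Replacer.GotMark(out, hash)
	d.Pruner.GotMark(out, hash)
}
func (d driver) FixMark(in Mark) Mark { return d.Pruner.FixMark(in) }
func (d driver) FixCommitIsh(in Mark) Mark { return d.Pruner.FixCommitIsh(in) }
func (d driver) HandleDone() []string { return d.Replacer.HandleDone() }

func main() {
	d := driver{Replacer: &Replacer{}, Pruner: &Pruner{remap: map[Mark]Mark{}}}
	commits := []Commit{{Mark: 1}, {Mark: 2, Parents: []Mark{1}, Empty: true}, {Mark: 3, Parents: []Mark{2}}}
	for _, c := range commits {
		if out, keep := d.ProcessCommit(c); keep {
			d.GotMark(out.Mark, fmt.Sprintf("hash-of-%d", out.Mark))
			fmt.Println("emit", out.Mark, "parents", out.Parents)
		}
	}
	fmt.Println(d.FixMark(2), d.HandleDone())
}
```

In this toy run the empty commit `:2` is dropped and `:3` is re-parented onto `:1`, which is the kind of parent rewriting that `ProcessCommit()` is described as deciding.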