.. SPDX-License-Identifier: GPL-2.0

=====================================
Handling messy pull-request diffstats
=====================================

Subsystem maintainers routinely use ``git request-pull`` as part of the
process of sending work upstream.  Normally, the result includes a nice
diffstat that shows which files will be touched and how much of each will
be changed.  Occasionally, though, a repository with a relatively
complicated development history will yield a massive diffstat containing a
great deal of unrelated work.  The result looks ugly and obscures what the
pull request is actually doing.  This document describes what is happening
and how to fix things up; it is derived from The Wisdom of Linus Torvalds,
found in Linus1_ and Linus2_.

.. _Linus1: https://lore.kernel.org/lkml/CAHk-=wg3wXH2JNxkQi+eLZkpuxqV+wPiHhw_Jf7ViH33Sw7PHA@mail.gmail.com/
.. _Linus2: https://lore.kernel.org/lkml/CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com/

A Git development history proceeds as a series of commits.  In a simplified
manner, mainline kernel development looks like this::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN

If one wants to see what has changed between two points, a command like
this will do the job::

  $ git diff --stat --summary vN-rc2..vN-rc3

Here, there are two clear points in the history; Git will essentially
"subtract" the beginning point from the end point and display the resulting
differences.  The requested operation is unambiguous and easy enough to
understand.

When a subsystem maintainer creates a branch and commits changes to it, the
result in the simplest case is a history that looks like::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                          |
                          +-- c1 --- c2 --- ... --- cN

If that maintainer now uses ``git diff`` to see what has changed between
the mainline branch (let's call it "linus") and cN, there are still two
clear endpoints, and the result is as expected.  So a pull request
generated with ``git request-pull`` will also be as expected.  But now
consider a slightly more complex development history::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |
                |         +-- c1 --- c2 --- ... --- cN
                |                   /
                +-- x1 --- x2 --- x3

Our maintainer has created one branch at vN-rc1 and another at vN-rc2; the
two were subsequently merged into c2.  Now a pull request generated for cN
may end up being messy indeed, and developers often end up wondering why.

What is happening here is that there are no longer two clear endpoints for
the ``git diff`` operation to use.  The development culminating in cN
started in two different places; to generate the diffstat, ``git diff``
ends up having to pick one of them and hope for the best.  If the diffstat
starts at vN-rc1, it may end up including all of the changes between there
and the second starting point (vN-rc2), which is certainly not what our
maintainer had in mind.  With all of that extra junk in the diffstat, it
may be impossible to tell what actually happened in the changes leading up
to cN.
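
One way to check whether a real branch is in this situation is to ask Git
for all of the merge bases it shares with mainline.  The helper below is
only a sketch: the name ``has_multiple_bases`` is made up for illustration,
and it assumes that the "two different places" described above correspond
to what Git calls merge bases::

  # Hypothetical helper: succeed if <upstream> and <topic> share more
  # than one merge base, as reported by "git merge-base --all".
  has_multiple_bases() {
      [ $(git merge-base --all "$1" "$2" | wc -l) -gt 1 ]
  }

If ``has_multiple_bases linus for-next`` succeeds (``for-next`` being a
placeholder branch name), the diffstat in the pull request may well include
unrelated changes.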

Maintainers often try to resolve this problem by, for example, rebasing the
branch or performing another merge with the linus branch, then recreating
the pull request.  This approach tends not to lead to joy at the receiving
end of that pull request; rebasing and/or merging just before pushing
upstream is a well-known way to get a grumpy response.

So what is to be done?  The best response when confronted with this
situation is indeed to do a merge with the branch you intend your work
to be pulled into, but to do it privately, as if it were a source of
shame.  Create a new, throwaway branch and do the merge there::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |                                      |
                |         +-- c1 --- c2 --- ... --- cN           |
                |                   /               |            |
                +-- x1 --- x2 --- x3                +------------+-- TEMP

The merge operation resolves all of the complications resulting from the
multiple beginning points, yielding a coherent result that contains only
the differences from the mainline branch.  Now it will be possible to
generate a diffstat with the desired information::

  $ git diff -C --stat --summary linus..TEMP

Save the output from this command, then simply delete the TEMP branch;
definitely do not expose it to the outside world.  Take the saved diffstat
output and edit it into the messy pull request, yielding a result that
shows what is really going on.  That request can then be sent upstream.
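
The steps above can be sketched as a small shell function.  This is an
illustration rather than an established tool: the name ``clean_diffstat``
is made up, and it assumes that no branch called TEMP already exists, that
the working tree is clean, and that the merge completes without conflicts::

  # Hypothetical helper: print a diffstat for <topic> relative to
  # <upstream> by way of a throwaway merge, then clean up.
  clean_diffstat() {
      upstream=$1
      topic=$2
      # Remember the current branch (or commit, if HEAD is detached).
      orig=$(git symbolic-ref --short -q HEAD || git rev-parse HEAD) || return
      # Do the merge on a temporary branch, never on the real one.
      git checkout -q -b TEMP "$topic" || return
      if git merge -q --no-edit "$upstream"; then
          git diff -C --stat --summary "$upstream"..TEMP
      else
          echo "merge failed; giving up" >&2
          git merge --abort
      fi
      # Go back to where we started and discard the throwaway branch.
      git checkout -q "$orig"
      git branch -q -D TEMP
  }

Something like ``clean_diffstat linus for-next > diffstat.txt`` (with
``for-next`` and ``diffstat.txt`` as placeholder names) would then leave the
text to be edited into the pull request in ``diffstat.txt``.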