.. SPDX-License-Identifier: GPL-2.0

=====================================
Handling messy pull-request diffstats
=====================================

Subsystem maintainers routinely use ``git request-pull`` as part of the
process of sending work upstream. Normally, the result includes a nice
diffstat that shows which files will be touched and how much of each will
be changed. Occasionally, though, a repository with a relatively
complicated development history will yield a massive diffstat containing a
great deal of unrelated work. The result looks ugly and obscures what the
pull request is actually doing. This document describes what is happening
and how to fix things up; it is derived from The Wisdom of Linus Torvalds,
found in Linus1_ and Linus2_.

.. _Linus1: https://lore.kernel.org/lkml/CAHk-=wg3wXH2JNxkQi+eLZkpuxqV+wPiHhw_Jf7ViH33Sw7PHA@mail.gmail.com/
.. _Linus2: https://lore.kernel.org/lkml/CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com/

A Git development history proceeds as a series of commits. In a simplified
manner, mainline kernel development looks like this::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN

If one wants to see what has changed between two points, a command like
this will do the job::

  $ git diff --stat --summary vN-rc2..vN-rc3

Here, there are two clear points in the history; Git will essentially
"subtract" the beginning point from the end point and display the resulting
differences. The requested operation is unambiguous and easy enough to
understand.

When a subsystem maintainer creates a branch and commits changes to it, the
result in the simplest case is a history that looks like::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                          |
                          +-- c1 --- c2 --- ... --- cN

If that maintainer now uses ``git diff`` to see what has changed between
the mainline branch (let's call it "linus") and cN, there are still two
clear endpoints, and the result is as expected. So a pull request
generated with ``git request-pull`` will also be as expected. But now
consider a slightly more complex development history::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |
                |         +-- c1 --- c2 --- ... --- cN
                |                   /
                +-- x1 --- x2 --- x3

Our maintainer has created one branch at vN-rc1 and another at vN-rc2; the
two were subsequently merged into c2. Now a pull request generated for cN
may end up being messy indeed, and developers often end up wondering why.

What is happening here is that there are no longer two clear end points for
the ``git diff`` operation to use. The development culminating in cN
started in two different places; to generate the diffstat, ``git diff``
ends up having to pick one of them and hope for the best. If the diffstat
starts at vN-rc1, it may end up including all of the changes between there
and the second origin end point (vN-rc2), which is certainly not what our
maintainer had in mind. With all of that extra junk in the diffstat, it
may be impossible to tell what actually happened in the changes leading up
to cN.

Maintainers often try to resolve this problem by, for example, rebasing the
branch or performing another merge with the linus branch, then recreating
the pull request. This approach tends not to lead to joy at the receiving
end of that pull request; rebasing and/or merging just before pushing
upstream is a well-known way to get a grumpy response.

So what is to be done? The best response when confronted with this
situation is indeed to do a merge with the branch you intend your work
to be pulled into, but to do it privately, as if it were the source of
shame. Create a new, throwaway branch and do the merge there::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |                                      |
                |         +-- c1 --- c2 --- ... --- cN           |
                |                   /               |            |
                +-- x1 --- x2 --- x3                +------------+-- TEMP

The merge operation resolves all of the complications resulting from the
multiple beginning points, yielding a coherent result that contains only
the differences from the mainline branch. Now it will be possible to
generate a diffstat with the desired information::

  $ git diff -C --stat --summary linus..TEMP

Save the output from this command, then simply delete the TEMP branch;
definitely do not expose it to the outside world. Take the saved diffstat
output and edit it into the messy pull request, yielding a result that
shows what is really going on. That request can then be sent upstream.
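
The whole procedure can be tried out safely in a scratch repository. The
following is only a sketch under stated assumptions: it builds a toy history
shaped like the diagrams above, and the branch names ``topic`` and ``side``
as well as every file name are hypothetical stand-ins rather than anything
used in a real tree. Only ``linus`` and ``TEMP`` follow the names in the
text.

```shell
#!/bin/sh
# Sketch only: a toy repository whose shape mimics the history above.
# Branch names "topic" and "side" and all file names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus      # mainline branch, as in the text
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

commit() {                                  # usage: commit <file> <message>
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

# Mainline: vN-rc1 --- vN-rc2 --- vN
commit core.c "vN-rc1"; git tag vN-rc1
commit core.c "vN-rc2"; git tag vN-rc2
commit core.c "vN";     git tag vN

# The first branch starts at vN-rc1 (x1).
git checkout -q -b side vN-rc1
commit helper.c "x1"

# The second branch starts at vN-rc2 (c1), merges the first one (c2),
# and then continues up to cN.
git checkout -q -b topic vN-rc2
commit driver.c "c1"
git merge -q --no-edit side
commit driver.c "cN"

# The private merge: a throwaway branch at cN, merged with mainline.
git checkout -q -b TEMP topic
git merge -q --no-edit linus

# The diffstat to edit into the pull request.
diffstat=$(git diff -C --stat --summary linus..TEMP)
printf '%s\n' "$diffstat"

# Discard the throwaway branch; it must never be pushed anywhere.
git checkout -q topic
git branch -q -D TEMP
```

In this toy history the saved diffstat mentions only ``driver.c`` and
``helper.c``, the two files touched on the maintainer's branches; the
mainline-only file ``core.c`` does not appear. In a real tree the private
merge may of course hit conflicts, which would have to be resolved on TEMP
before the diffstat can be generated.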