[PATCH 4 of 5] changegroup: use filecloser for closing filelog handles

Gregory Szorc gregory.szorc at gmail.com
Wed Sep 30 00:36:30 CDT 2015


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1443552606 25200
#      Tue Sep 29 11:50:06 2015 -0700
# Node ID a8d8f0d70593aaa6fb2b86dc77d29c021d83277d
# Parent  4ade2d6126d9b0f67ea898ab0d021abb88d157dd
changegroup: use filecloser for closing filelog handles

On a large repository like mozilla-central, writing filelogs during
changegroup application during a clone opens (and closes) over 200,000
files. On Windows, file closes are slow and block the main thread while
they are closing.

Moving filelog file closes to a background thread on Windows frees up
the main thread to process the next filelog before waiting for the close
to complete. The end result is faster filelog application.

On my Windows 10 machine (not a VM) and with a modern SSD, this patch
has a drastic impact on unbundling a snapshot of mozilla-central:

Before: ~910s
After:  ~609s

That's a 301s or 5:01 wall time reduction!

And I'm convinced this isn't the upper limit of Windows performance: I
made some local changes to measure time spent waiting for the file
closer thread's queue to unblock and it said we still spend ~34s of main
thread time waiting on file closes. I also haven't really experimented
with the various wait intervals and limits in the file closer patch. But
since we're already looking at a 5 minute performance win, I'm content
with where things are.

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -657,12 +657,14 @@ def changegroup(repo, basenodes, source)
     # to avoid a race we use changegroupsubset() (issue1320)
     return changegroupsubset(repo, basenodes, repo.heads(), source)
 
 def addchangegroupfiles(repo, source, revmap, trp, pr, needfiles):
-    if True:
+    closer = util.filecloser(repo.ui)
+    try:
         revisions = 0
         files = 0
         while True:
+            closer.verifystate()
             chunkdata = source.filelogheader()
             if not chunkdata:
                 break
             f = chunkdata["filename"]
@@ -670,9 +672,9 @@ def addchangegroupfiles(repo, source, re
             pr()
             fl = repo.file(f)
             o = len(fl)
             try:
-                if not fl.addgroup(source, revmap, trp):
+                if not fl.addgroup(source, revmap, trp, closecb=closer.close):
                     raise util.Abort(_("received file revlog group is empty"))
             except error.CensoredBaseError as e:
                 raise util.Abort(_("received delta base is censored: %s") % e)
             revisions += len(fl) - o
@@ -687,8 +689,10 @@ def addchangegroupfiles(repo, source, re
                         raise util.Abort(
                             _("received spurious file revlog entry"))
                 if not needs:
                     del needfiles[f]
+    finally:
+        closer.cleanup()
         repo.ui.progress(_('files'), None)
 
     for f, needs in needfiles.iteritems():
         fl = repo.file(f)


More information about the Mercurial-devel mailing list