<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Feb 25, 2015 at 12:44 PM, Ryan McElroy <span dir="ltr"><<a href="mailto:rm@fb.com" target="_blank">rm@fb.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>

On 2/24/2015 1:11 AM, Gregory Szorc wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

# HG changeset patch<br>

# User Gregory Szorc <<a href="mailto:gregory.szorc@gmail.com" target="_blank">gregory.szorc@gmail.com</a>><br>

# Date 1424769075 28800<br>

#      Tue Feb 24 01:11:15 2015 -0800<br>

# Branch stable<br>

# Node ID 7f1904705c29ebe7de3874f2f03c42e261ed1c96<br>

# Parent  7d72752b8da5bb2482e6eac47545a78ed3fff592<br>

tags: preserve filtered .hgtags filenodes in tags cache (issue4550)<br>

<br>

If the tags cache is populated on an unfiltered repository and later<br>

populated on a filtered repository, .hgtags filenode entries for<br>

filtered revisions will disappear from the tags cache because the tags<br>

cache code currently filters out filenode entries for revisions not<br>

known to the current repo object. This behavior results in potentially<br>

expensive recalculation of .hgtags filenode values for filtered<br>

revisions. For evolution users, who create many hidden changesets and<br>

heads, this could result in gradual slowdown, as each hidden head will<br>

add overhead to resolving tags on an unfiltered repo.<br>

<br>

This patch makes the tags cache filtered revision aware. Filenode<br>

entries for filtered revisions are preserved during reading and writing.<br>

Entries are only dropped from the tags cache if they don't correspond to<br>

a head, filtered or otherwise.<br>

<br>

diff --git a/mercurial/tags.py b/mercurial/tags.py<br>

--- a/mercurial/tags.py<br>

+++ b/mercurial/tags.py<br>

@@ -246,12 +246,15 @@ def _readtagcache(ui, repo):<br>

          return (None, None, tags, False)<br>

      if cachefile:<br>

          cachefile.close()               # ignore rest of file<br>

-    repoheads = repo.heads()<br>

+    ourheads = repo.heads()<br>

+    repo = repo.unfiltered()<br>

+    allheads = repo.heads()<br>

+<br>

      # Case 2 (uncommon): empty repo; get out quickly and don't bother<br>

      # writing an empty cache.<br>

-    if repoheads == [nullid]:<br>

+    if allheads == [nullid]:<br>

          return ([], {}, {}, False)<br>

        # Case 3 (uncommon): cache file missing or empty.<br>

@@ -268,14 +271,14 @@ def _readtagcache(ui, repo):<br>

      # exposed".<br>

      if not len(repo.file('.hgtags')):<br>

          # No tags have ever been committed, so we can avoid a<br>

          # potentially expensive search.<br>

-        return (repoheads, cachefnode, None, True)<br>

+        return (ourheads, cachefnode, None, True)<br>

        starttime = time.time()<br>

        newheads = [head<br>

-                for head in repoheads<br>

+                for head in allheads<br>

                  if head not in set(cacheheads)]<br>

        # Now we have to lookup the .hgtags filenode for every new head.<br>

      # This is the most expensive part of finding tags, so performance<br>

@@ -297,9 +300,9 @@ def _readtagcache(ui, repo):<br>

             len(cachefnode), len(newheads), duration)<br>

        # Caller has to iterate over all heads, but can use the filenodes in<br>

      # cachefnode to get to each .hgtags revision quickly.<br>

-    return (repoheads, cachefnode, None, True)<br>

+    return (ourheads, cachefnode, None, True)<br>

    def _writetagcache(ui, repo, heads, tagfnode, cachetags):<br>

        try:<br>

@@ -309,29 +312,39 @@ def _writetagcache(ui, repo, heads, tagf<br>

        ui.log('tagscache', 'writing tags cache file with %d heads and %d tags\n',<br>

              len(heads), len(cachetags))<br>

-    realheads = repo.heads()            # for sanity checks below<br>

+    # We want to carry forward tagfnode entries that belong to filtered revs,<br>

+    # even if they aren't in the explicit list of heads. Since entries in the<br>

+    # cache must be in descending revlog order, we need to merge the sets<br>

+    # before writing.<br>

+    #<br>

+    # When choosing what filenode entries to write, we must consider both the<br>

+    # filtered and unfiltered views. Otherwise, valid entries may be dropped.<br>

+    revs = {}<br>

+    ourheads = set(repo.heads())<br>

+    repo = repo.unfiltered()<br>

+    unfilteredheads = set(repo.heads())<br>

+    allheads = ourheads | unfilteredheads<br>

</blockquote></div></div>

/me confused: Why isn't unfilteredheads == allheads?<br><div><div></div></div></blockquote><br></div><div class="gmail_quote">There are two flavors of DAG heads: unfiltered and filtered. If a true DAG head is hidden, its parent also needs to be recorded, since that parent can appear as a head on filtered repos.<br><br></div><div class="gmail_quote">This case is not explicitly tested by this patch series, and this code may even cause some heads to get dropped from the cache. However, that scenario should be rare and would only lose a handful of heads, not the potentially hundreds that would be lost if every hidden DAG head needed to be recomputed.<br></div></div></div>
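<div>To make the difference between the two head sets concrete, here is a minimal toy sketch. It does not use any Mercurial API: the <code>heads()</code> helper, the <code>parents</code> mapping and the <code>hidden</code> set are all hypothetical stand-ins for what <code>repo.heads()</code> returns on a filtered versus an unfiltered repo.<br></div>

```python
def heads(parents, hidden=frozenset()):
    """Return the visible revisions that have no visible children.

    `parents` maps each revision to a list of its parent revisions.
    `hidden` is the set of revisions filtered out of the view.
    This is a toy model, not Mercurial's implementation.
    """
    visible = set(parents) - set(hidden)
    has_visible_child = set()
    for rev in visible:
        for parent in parents[rev]:
            has_visible_child.add(parent)
    return {rev for rev in visible if rev not in has_visible_child}


# Toy linear history of three revisions (0, then 1, then 2),
# where revision 2 is hidden (for example, obsoleted by evolution).
parents = {0: [], 1: [0], 2: [1]}
hidden = {2}

unfilteredheads = heads(parents)       # heads when nothing is filtered
ourheads = heads(parents, hidden)      # heads as the filtered view sees them
allheads = ourheads | unfilteredheads  # union, as in the patch

print(sorted(unfilteredheads), sorted(ourheads), sorted(allheads))
```

<div>In this toy history the unfiltered view has head 2, while the filtered view has head 1, so the union contains both and is not equal to <code>unfilteredheads</code>.<br></div>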