Issue1273

Title: hg status got very slow with changes introduces in 1e2850ed8171
Priority: bug
Status: resolved
Superseder:
Nosy List: abuehl, avermel, mpm, p.f.moore, pko, pmezard, tksoh, tonfa
Assigned To:
Topics: patch

Created on 2008-08-21.21:10:56 by pko, last changed 2008-10-18.15:58:25 by mpm.

Messages
msg6927 (view) Author: tonfa Date: 2008-09-03.00:56:09
I was thinking more of the filename processing than the stat stuff :)

But your remarks are interesting. Implementing osutil for Windows shouldn't be
very hard for a skilled Windows dev.
msg6925 (view) Author: pko Date: 2008-09-03.00:44:15
In general, status can be made much faster on Win32, but it requires taking
advantage of Windows-specific FindFirstFile/FindNextFile behavior. The Win32
find iterator structure returns almost all the info required by stat (date/size)
except the chmod-related values, which on Windows are made up anyway. With some
extra arithmetic for date conversion, the full fstat result can be easily produced.
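
The "extra arithmetic for date conversion" can be sketched as follows. This is only an illustration, not code from the thread: it assumes the find data exposes timestamps as a Windows FILETIME (two 32-bit halves counting 100-nanosecond ticks since 1601-01-01 UTC), and the function name filetime_to_unix is hypothetical.

```python
# Windows FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC,
# while stat-style timestamps count seconds since 1970-01-01 UTC.
EPOCH_DIFF_SECONDS = 11644473600  # seconds from 1601-01-01 to 1970-01-01
TICKS_PER_SECOND = 10000000       # 100 ns ticks in one second


def filetime_to_unix(high, low):
    """Convert the two 32-bit halves of a FILETIME to whole Unix seconds.

    The name and signature are illustrative only (not a Mercurial API).
    """
    ticks = (high << 32) | low
    return ticks // TICKS_PER_SECOND - EPOCH_DIFF_SECONDS
```

The file size would be assembled from its high and low 32-bit halves in the same way, without the epoch shift.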

In a quick test on a moderately large tree (29k files / 1300 directories), C code
that walks the whole tree and produces information equivalent to stat-generated
data takes about 140ms to complete. The equivalent Python code using ctypes takes
about 400ms. Current hg stat with the recent fix spends about 1.6s in the
dirstate.status call for the same tree.

For a larger tree (76k files / 2800 directories), the status vs non-status walk in
pure C takes 386ms vs 2900ms respectively, so the overhead of the extra status
calls is quite obvious.

So eliminating stat and moving some of the Python code to C would definitely be
an option if performance on large trees matters.

Looking at the bzr source tree, they actually seem to be doing exactly this:
taking advantage of the native Win32 trick when walking the tree, in place of
separate stat calls. They use Pyrex instead of a directly implemented C module
for the native code.
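
For readers on a current Python, the same idea can be sketched with os.scandir, which was added to the standard library long after this thread and is not what hg or bzr used at the time. According to the Python documentation, on Windows the directory entries come from FindFirstFile/FindNextFile, so entry.stat() usually needs no extra system call. The helper name walk_with_stat is hypothetical.

```python
import os


def walk_with_stat(root):
    """Yield (path, size, mtime) for every regular file under root.

    Illustrative sketch only: stat data is taken from the directory
    entries themselves, so no separate os.stat(path) call is written.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Descend later; symlinked directories are skipped.
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    st = entry.stat(follow_symlinks=False)
                    yield entry.path, st.st_size, st.st_mtime
```

On other platforms entry.stat() still performs a system call per file, so any speedup from this approach is mostly a Windows effect.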
msg6922 (view) Author: tonfa Date: 2008-09-02.23:35:37
in crew 5e1a867e5d65

It still needs a cleanup or something. We can make that faster on Windows
(although I can't test).
msg6890 (view) Author: abuehl Date: 2008-09-01.13:35:00
The important point is what Andrei Vermel (avermel) wrote in
http://www.selenic.com/pipermail/mercurial-devel/2008-August/007371.html:

On 02.08.2008 23:55, Andrei Vermel wrote:
> Note that os.path.normcase(os.path.normpath(path)) is applied to map keys
> when _foldmap is originally filled. When _foldmap is indexed in
> _normalize(), apparently the same needs to be done.
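
A minimal sketch of the point quoted above, assuming a plain dict stands in for _foldmap. It uses ntpath (the Windows flavour of os.path, importable on any platform) so the behaviour is the same wherever it is run; fold_key and remember are hypothetical helper names.

```python
import ntpath  # Windows path rules, available on every platform

foldmap = {}


def fold_key(path):
    # The same normalization must be used when filling and when looking up.
    return ntpath.normcase(ntpath.normpath(path))


def remember(path_on_disk):
    # Fill step: the key is normalized, the value keeps the on-disk spelling.
    foldmap[fold_key(path_on_disk)] = path_on_disk


remember("Src/Main.py")

raw_hit = "src/main.py" in foldmap                   # lookup with the raw path
normalized_hit = fold_key("src/main.py") in foldmap  # lookup with the same key function
```

Here raw_hit is False and normalized_hit is True: the stored key is lowercased and uses backslashes, so only a lookup that applies the same normalization can find it.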
msg6888 (view) Author: abuehl Date: 2008-09-01.12:33:05
Checkin also got considerably slower with 1e2850ed8171.

hg ci of a single changed file of the netbeans repo
http://hg.netbeans.org/main/misc/
went from 2.5 sec (version d8367107da05 of hg) to 4.7 sec (version
1e2850ed8171) on my WinXP box.

Bisecting this speed regression leads to 1e2850ed8171 as well.
msg6829 (view) Author: abuehl Date: 2008-08-22.07:42:26
It looks like avermel's and pko's patches are logically
identical, solving two independent problems with the same
code change.

I'm adding p.f.moore to the nosy list. And mpm, as he was the one
who did the refactoring in 1e2850ed8171 (dirstate: simplify normalize logic).

It's unfortunate that Windows testing is so much harder (compared
to how easy it is to run the testsuite on unix).
msg6828 (view) Author: avermel Date: 2008-08-22.05:34:52
The same bug also causes this:

D:\distribs\mercurial\qqq>hg init
D:\distribs\mercurial\qqq>echo 1>aaa.txt
D:\distribs\mercurial\qqq>echo 1>BBB.txt
D:\distribs\mercurial\qqq>hg add
adding BBB.txt
adding aaa.txt
D:\distribs\mercurial\qqq>hg ci -m 1
D:\distribs\mercurial\qqq>mv aaa.txt AAA.txt
D:\distribs\mercurial\qqq>mv BBB.txt bbb.txt
D:\distribs\mercurial\qqq>hg stat
? AAA.txt                    // aaa.txt and bbb.txt behave differently!

See http://www.selenic.com/pipermail/mercurial-devel/2008-August/007367.html
msg6827 (view) Author: pko Date: 2008-08-21.21:10:55
The denormalization of the path causes most of the calls to _normalize to miss
the fold map.

On large repos this causes a significant slowdown (factor of 3 on 29k files where
stat goes from 1.7s to 9s).

The following patch fixes the problem:

diff -r 6dcbe191a9b5 mercurial/dirstate.py
--- a/mercurial/dirstate.py     Mon Aug 18 16:50:36 2008 -0500
+++ b/mercurial/dirstate.py     Thu Aug 21 17:05:34 2008 -0400
@@ -344,11 +344,12 @@
             self._ui.warn(_("not in dirstate: %s\n") % f)

     def _normalize(self, path):
-        if path not in self._foldmap:
+        npath = os.path.normcase(os.path.normpath(path))
+        if npath not in self._foldmap:
             if not os.path.exists(path):
                 return path
-            self._foldmap[path] = util.fspath(path, self._root)
-        return self._foldmap[path]
+            self._foldmap[npath] = util.fspath(path, self._root)
+        return self._foldmap[npath]

     def clear(self):
         self._map = {}
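
To make the cost of the missed lookups concrete, here is a rough sketch that counts how often the expensive path runs with and without the normalized key. It is a simplified model, not the dirstate code: FoldCache and _slow_fspath are hypothetical names, the os.path.exists check is omitted, and ntpath is used so the Windows normalization rules apply on any platform.

```python
import ntpath


class FoldCache:
    """Toy model of a fold map whose keys are normalized when it is filled."""

    def __init__(self, tracked, normalize_lookups):
        self.misses = 0
        self.normalize_lookups = normalize_lookups
        self.foldmap = {self._key(p): p for p in tracked}

    @staticmethod
    def _key(path):
        return ntpath.normcase(ntpath.normpath(path))

    def _slow_fspath(self, path):
        # Hypothetical stand-in for the expensive on-disk case lookup.
        self.misses += 1
        return path

    def normalize(self, path):
        # Before the patch the raw path was used as the key; after it,
        # the key is normalized the same way as when the map was filled.
        key = self._key(path) if self.normalize_lookups else path
        if key not in self.foldmap:
            self.foldmap[key] = self._slow_fspath(path)
        return self.foldmap[key]


tracked = ["Src/File%d.py" % i for i in range(1000)]
before = FoldCache(tracked, normalize_lookups=False)
after = FoldCache(tracked, normalize_lookups=True)
for p in tracked:
    before.normalize(p)
    after.normalize(p)
```

In this model before.misses ends at 1000 (one slow lookup per file) while after.misses stays at 0, which is the kind of difference that would show up as a status slowdown on a large tree.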
History
Date                 User     Action  Args
2008-10-18 15:58:25  mpm      set     status: testing -> resolved; nosy: mpm, tonfa, tksoh, pmezard, p.f.moore, abuehl, avermel, pko
2008-10-18 15:58:10  mpm      set     nosy: mpm, tonfa, tksoh, pmezard, p.f.moore, abuehl, avermel, pko
2008-09-04 12:57:47  tksoh    set     nosy: + tksoh
2008-09-03 00:56:09  tonfa    set     nosy: mpm, tonfa, pmezard, p.f.moore, abuehl, avermel, pko; messages: + msg6927
2008-09-03 00:44:16  pko      set     nosy: mpm, tonfa, pmezard, p.f.moore, abuehl, avermel, pko; messages: + msg6925
2008-09-02 23:35:37  tonfa    set     status: in-progress -> testing; nosy: mpm, tonfa, pmezard, p.f.moore, abuehl, avermel, pko; messages: + msg6922
2008-09-02 14:52:19  tonfa    set     nosy: + tonfa
2008-09-01 13:35:01  abuehl   set     nosy: mpm, pmezard, p.f.moore, abuehl, avermel, pko; messages: + msg6890
2008-09-01 12:33:05  abuehl   set     nosy: mpm, pmezard, p.f.moore, abuehl, avermel, pko; messages: + msg6888
2008-08-22 07:42:26  abuehl   set     nosy: + p.f.moore, mpm; messages: + msg6829
2008-08-22 06:44:30  pmezard  set     nosy: + pmezard
2008-08-22 05:34:54  avermel  set     nosy: + avermel; messages: + msg6828
2008-08-21 22:30:25  abuehl   set     topic: + patch
2008-08-21 22:29:59  abuehl   set     nosy: + abuehl
2008-08-21 21:14:51  pko      set     title: hg status got flow with changes introduces in 1e2850ed8171 -> hg status got very slow with changes introduces in 1e2850ed8171
2008-08-21 21:10:56  pko      create