Created on 2008-08-21.21:10:56 by pko, last changed 2008-10-18.15:58:25 by mpm.
| msg6927 |
Author: tonfa |
Date: 2008-09-03.00:56:09 |
|
I was thinking more of the filename processing than the stat stuff :)
But your remarks are interesting. Implementing osutil for Windows shouldn't be
very hard for a skilled Windows dev.
|
| msg6925 |
Author: pko |
Date: 2008-09-03.00:44:15 |
|
In general, status can be made much faster on Win32, but it requires taking
advantage of the Windows-specific FindFirstFile/FindNextFile behavior. The Win32
find iterator structure returns almost all the info required by stat (date/size)
except the chmod-related values, which on Windows are made up anyway. With some
extra arithmetic for date conversion, the full fstat result can easily be produced.
In a quick test on a moderately large tree (29k files/1300 directories), C code
that walks the whole tree and produces information equivalent to stat-generated
data takes about 140ms to complete. The equivalent Python code using ctypes takes
about 400ms. Current hg stat with the recent fix spends about 1.6s in the
dirstate.status call for the same tree.
For a larger tree (76k files / 2800 directories), the status vs non-status walk in
pure C takes 386ms vs 2900ms respectively, so the overhead of the extra status
calls is quite obvious.
So eliminating stat and moving some of the Python code to C would definitely be
an option if performance on large trees is a concern.
Looking at the bzr source tree, they actually seem to be doing exactly this:
taking advantage of the native Win32 trick when walking the tree during stat
calls. They use Pyrex instead of a directly implemented C module for the native code.
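
A minimal sketch of the "extra arithmetic" mentioned above, assuming the find data exposes the usual Win32 FILETIME halves (dwHighDateTime/dwLowDateTime) and the split 64-bit size (nFileSizeHigh/nFileSizeLow). The function names are illustrative only and are not taken from Mercurial or bzr:

```python
# FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC, while stat
# reports seconds since 1970-01-01 UTC. This constant is the distance
# between the two epochs, expressed in ticks.
EPOCH_DELTA_TICKS = 116444736000000000
TICKS_PER_SECOND = 10_000_000


def filetime_to_unix(high, low):
    """Convert the two 32-bit halves of a FILETIME to Unix seconds.

    `high` and `low` stand for dwHighDateTime and dwLowDateTime as
    returned in the find data (hypothetical helper, for illustration).
    """
    ticks = (high << 32) | low
    return (ticks - EPOCH_DELTA_TICKS) / TICKS_PER_SECOND


def file_size(size_high, size_low):
    """Combine nFileSizeHigh and nFileSizeLow into one 64-bit size."""
    return (size_high << 32) | size_low
```

With these two values the mtime and size fields of a stat-like result can be filled in without a separate stat call per file; the mode bits would still have to be synthesized.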
|
| msg6922 |
Author: tonfa |
Date: 2008-09-02.23:35:37 |
|
In crew 5e1a867e5d65.
It still needs a cleanup or something. We can make that faster on Windows
(although I can't test).
|
| msg6890 |
Author: abuehl |
Date: 2008-09-01.13:35:00 |
|
The important point is what Andrei Vermel (avermel) wrote in
http://www.selenic.com/pipermail/mercurial-devel/2008-August/007371.html:
On 02.08.2008 23:55, Andrei Vermel wrote:
> Note that os.path.normcase(os.path.normpath(path)) is applied to map keys
> when _foldmap is originally filled. When _foldmap is indexed in
> _normalize(), apparently the same needs to be done.
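
The point quoted above can be illustrated with a small sketch: a map whose keys were folded when it was filled only answers lookups that fold the key the same way. The code uses `ntpath` so the Windows rules apply on any platform; `fold_key`, `lookup_raw` and `lookup_folded` are hypothetical names, not Mercurial functions.

```python
import ntpath


def fold_key(path):
    # Same transformation on fill and on lookup: collapse "." / "..",
    # unify separators, and lower-case (Windows semantics via ntpath).
    return ntpath.normcase(ntpath.normpath(path))


# Fill the map with folded keys, remembering the stored spelling.
foldmap = {}
for stored in ["aaa.txt", "BBB.txt", "src/Main.py"]:
    foldmap[fold_key(stored)] = stored


def lookup_raw(path):
    # Indexes with the unfolded path: misses whenever case or
    # separators differ from the folded key.
    return foldmap.get(path, path)


def lookup_folded(path):
    # Indexes with the folded path: finds the stored spelling.
    return foldmap.get(fold_key(path), path)
```

Here `lookup_raw("AAA.txt")` falls through and returns the path unchanged, while `lookup_folded("AAA.txt")` returns the stored `aaa.txt`.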
|
| msg6888 |
Author: abuehl |
Date: 2008-09-01.12:33:05 |
|
Checkin also got considerably slower with 1e2850ed8171.
hg ci of a single changed file in the netbeans repo
http://hg.netbeans.org/main/misc/
went from 2.5 sec (version d8367107da05 of hg) to 4.7 sec (version
1e2850ed8171) on my WinXP box.
Bisecting this speed regression leads to 1e2850ed8171 as well.
|
| msg6829 |
Author: abuehl |
Date: 2008-08-22.07:42:26 |
|
It looks like avermel's and pko's patches are logically
identical, solving two independent problems with the same
code change.
I'm adding p.f.moore to the nosy list. And mpm, as he was the one
who did the refactoring in 1e2850ed8171 (dirstate: simplify normalize logic).
It's unfortunate that Windows testing is so much harder (compared
to how easy it is to run the testsuite on unix).
|
| msg6828 |
Author: avermel |
Date: 2008-08-22.05:34:52 |
|
The same bug also causes this:
D:\distribs\mercurial\qqq>hg init
D:\distribs\mercurial\qqq>echo 1>aaa.txt
D:\distribs\mercurial\qqq>echo 1>BBB.txt
D:\distribs\mercurial\qqq>hg add
adding BBB.txt
adding aaa.txt
D:\distribs\mercurial\qqq>hg ci -m 1
D:\distribs\mercurial\qqq>mv aaa.txt AAA.txt
D:\distribs\mercurial\qqq>mv BBB.txt bbb.txt
D:\distribs\mercurial\qqq>hg stat
? AAA.txt // aaa.txt and bbb.txt behave differently!
See http://www.selenic.com/pipermail/mercurial-devel/2008-August/007367.html
|
| msg6827 |
Author: pko |
Date: 2008-08-21.21:10:55 |
|
The denormalization of the path causes most of the calls to _normalize to miss
the fold map.
On large repos this causes a significant slowdown (factor of 3 on 29k files where
stat goes from 1.7s to 9s).
The following patch fixes the problem:
diff -r 6dcbe191a9b5 mercurial/dirstate.py
--- a/mercurial/dirstate.py Mon Aug 18 16:50:36 2008 -0500
+++ b/mercurial/dirstate.py Thu Aug 21 17:05:34 2008 -0400
@@ -344,11 +344,12 @@
             self._ui.warn(_("not in dirstate: %s\n") % f)
 
     def _normalize(self, path):
-        if path not in self._foldmap:
+        npath = os.path.normcase(os.path.normpath(path))
+        if npath not in self._foldmap:
             if not os.path.exists(path):
                 return path
-            self._foldmap[path] = util.fspath(path, self._root)
-        return self._foldmap[path]
+            self._foldmap[npath] = util.fspath(path, self._root)
+        return self._foldmap[npath]
 
     def clear(self):
         self._map = {}
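
To show why the unpatched lookup is slow, here is a rough, self-contained model of the two variants that counts how often the expensive path is taken. It is only a sketch: `FoldCache` and `_slow_fspath` are hypothetical stand-ins (the latter for `util.fspath`), the `os.path.exists` check is left out, and `ntpath` is used so the Windows folding rules apply on any platform.

```python
import ntpath


class FoldCache:
    """Toy model of the fold map; not the real dirstate class."""

    def __init__(self, tracked):
        # Keys are folded when the map is filled, as described above.
        self._foldmap = {
            ntpath.normcase(ntpath.normpath(f)): f for f in tracked
        }
        self.slow_calls = 0

    def _slow_fspath(self, path):
        # Stand-in for the expensive filesystem lookup; only counts calls.
        self.slow_calls += 1
        return path

    def normalize_unpatched(self, path):
        # Looks up the raw path, so folded keys are never found.
        if path not in self._foldmap:
            self._foldmap[path] = self._slow_fspath(path)
        return self._foldmap[path]

    def normalize_patched(self, path):
        # Folds the path first, so the pre-filled entries are hit.
        npath = ntpath.normcase(ntpath.normpath(path))
        if npath not in self._foldmap:
            self._foldmap[npath] = self._slow_fspath(path)
        return self._foldmap[npath]


tracked = ["Dir/File%d.txt" % i for i in range(100)]

before = FoldCache(tracked)
for f in tracked:
    before.normalize_unpatched(f)

after = FoldCache(tracked)
for f in tracked:
    after.normalize_patched(f)
```

In this model the unpatched variant takes the slow path once per file, while the patched variant never does, which matches the direction of the slowdown reported in this message.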
|
|
| Date | User | Action | Args |
| 2008-10-18 15:58:25 | mpm | set | status: testing -> resolved nosy: mpm, tonfa, tksoh, pmezard, p.f.moore, abuehl, avermel, pko |
| 2008-10-18 15:58:10 | mpm | set | nosy: mpm, tonfa, tksoh, pmezard, p.f.moore, abuehl, avermel, pko |
| 2008-09-04 12:57:47 | tksoh | set | nosy: + tksoh |
| 2008-09-03 00:56:09 | tonfa | set | nosy: mpm, tonfa, pmezard, p.f.moore, abuehl, avermel, pko messages: + msg6927 |
| 2008-09-03 00:44:16 | pko | set | nosy: mpm, tonfa, pmezard, p.f.moore, abuehl, avermel, pko messages: + msg6925 |
| 2008-09-02 23:35:37 | tonfa | set | status: in-progress -> testing nosy: mpm, tonfa, pmezard, p.f.moore, abuehl, avermel, pko messages: + msg6922 |
| 2008-09-02 14:52:19 | tonfa | set | nosy: + tonfa |
| 2008-09-01 13:35:01 | abuehl | set | nosy: mpm, pmezard, p.f.moore, abuehl, avermel, pko messages: + msg6890 |
| 2008-09-01 12:33:05 | abuehl | set | nosy: mpm, pmezard, p.f.moore, abuehl, avermel, pko messages: + msg6888 |
| 2008-08-22 07:42:26 | abuehl | set | nosy: + p.f.moore, mpm messages: + msg6829 |
| 2008-08-22 06:44:30 | pmezard | set | nosy: + pmezard |
| 2008-08-22 05:34:54 | avermel | set | nosy: + avermel messages: + msg6828 |
| 2008-08-21 22:30:25 | abuehl | set | topic: + patch |
| 2008-08-21 22:29:59 | abuehl | set | nosy: + abuehl |
| 2008-08-21 21:14:51 | pko | set | title: hg status got flow with changes introduces in 1e2850ed8171 -> hg status got very slow with changes introduces in 1e2850ed8171 |
| 2008-08-21 21:10:56 | pko | create | |
|