Avoid 00changelog.i corruption

Ingo Proetel proetel at aicas.de
Mon Jul 20 02:56:56 CDT 2009


Hi,

I came across a problem while using VirtualBox running Ubuntu on a Windows XP host. When pushing a change from inside
the VirtualBox guest to a repo that resides on a shared folder (a folder that exists on the underlying NTFS and is
mounted in Ubuntu), I got a corrupted 00changelog.i. I found that Python has a problem with opening files for appending
in this setup: apparently NUL characters are included when writing into such a file. While a broken append function is
not a Mercurial problem, having corrupted data is. Data corruption is not something you want to have with a version
control system. A clean failure and rollback is acceptable, though.
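
A quick way to check whether a given directory is affected might look like the sketch below. It is only an
illustration, written for Python 3: the function name append_is_intact and the test payloads are made up, and because
the data is read back right away it may not catch corruption that is hidden by caching.

```python
import os
import tempfile


def append_is_intact(directory, first=b"abc", second=b"def"):
    """Write `first`, append `second`, and compare the bytes read back.

    Hypothetical helper, not part of Mercurial.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        with open(path, "wb") as fp:
            fp.write(first)
        # Reopen in append mode, which is the operation suspected of misbehaving.
        with open(path, "ab") as fp:
            fp.write(second)
        with open(path, "rb") as fp:
            data = fp.read()
        # Any extra bytes (such as NULs) or missing bytes make this False.
        return data == first + second
    finally:
        os.remove(path)
```

Calling append_is_intact() with the path of the shared folder should return True on a file system where append works.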

So I would suggest the following patch (or something like it) to try to make sure that the written data is what is
expected.

$ hg log -p -r 9046
changeset:   9046:91b293cf8a5e
tag:         tip
user:        Ingo Proetel <proetel at aicas.com>
date:        Sat Jul 18 01:19:03 2009 +0200
summary:     Make writing of changelog index more robust against data corruption.

diff -r 996c1cd8f530 -r 91b293cf8a5e mercurial/changelog.py
--- a/mercurial/changelog.py	Tue Jul 14 14:05:07 2009 +0200
+++ b/mercurial/changelog.py	Sat Jul 18 01:19:03 2009 +0200
@@ -7,7 +7,7 @@

 from node import bin, hex, nullid
 from i18n import _
-import util, error, revlog, encoding
+import util, error, revlog, encoding, os

 def _string_escape(text):
     """
@@ -107,9 +107,22 @@
         if self._delayname:
             util.rename(self._delayname + ".a", self._delayname)
         elif self._delaybuf:
-            fp = self.opener(self.indexfile, 'a')
-            fp.write("".join(self._delaybuf))
-            fp.close()
+            fin = self.opener(self.indexfile,'rb')
+            fout = self.opener(self.indexfile+".a",'wb')
+            startsize=os.path.getsize(fin.name)
+            fout.write(""+fin.read())
+            fout.write("".join(self._delaybuf))
+            fout.flush()
+            endsize=os.path.getsize(fout.name)
+            fout.close()
+            fin.close()
+
+            if (endsize - startsize) == len("".join(self._delaybuf)):
+                util.rename(fout.name, fin.name)
+            else:
+                os.remove(fout.name)
+                raise error.RevlogError(_("unexpected amount of data written to %s") % self.indexfile)
+
             self._delaybuf = []
         # split when we're done
         self.checkinlinesize(tr)
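
The idea behind the patch (copy, append, verify the size, then rename) can also be sketched outside of Mercurial. The
following is a minimal standalone illustration for Python 3, not the patch itself: the name verified_append is made up,
the ".a" suffix just mirrors the one used above, and os.replace and IOError stand in for util.rename and
error.RevlogError.

```python
import os


def verified_append(path, data):
    """Append `data` (bytes) to `path` via a temporary copy and a size check.

    Hypothetical helper; the original file is only replaced when the
    temporary copy has exactly the expected size.
    """
    tmp = path + ".a"  # assumed temporary name, mirroring the patch
    with open(path, "rb") as fin:
        old = fin.read()
    with open(tmp, "wb") as fout:
        fout.write(old)
        fout.write(data)
        fout.flush()
        os.fsync(fout.fileno())  # push the data to disk before measuring
    expected = len(old) + len(data)
    if os.path.getsize(tmp) != expected:
        os.remove(tmp)
        raise IOError("unexpected amount of data written to %s" % path)
    # Swap the verified copy into place.
    os.replace(tmp, path)
```

Note that a size check alone only catches extra or missing bytes; corruption that keeps the length unchanged would need
a comparison of the content as well.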

Cheers,
Ingo

