Issue1003

Title URLError: <urlopen error (10053, 'Software caused connection abort')> in mercurial\keepalive.pyc#_start_transaction
Priority urgent Status need-eg
Superseder Nosy List ThomasAH, VaanDeFanel, abuehl, aeomer, azraiyl, blacktrash, bos, jglick, jk, madesroches, mathos, mzlamal, pmezard
Assigned To ThomasAH Topics http_proto, windows

Created on 2008-02-25.14:44:27 by jglick, last changed 2008-11-21.07:41:20 by VaanDeFanel.

Messages
msg7951 Author: VaanDeFanel Date: 2008-11-21.07:41:19
The error appears when I try to push a file bigger than 10 MB.
msg7924 Author: mathos Date: 2008-11-19.10:48:56
I am still puzzled here. Where are the PY scripts in a TortoiseHG install?
msg7858 Author: azraiyl Date: 2008-11-06.16:16:09
A second quick fix that helps here: add the following line to the method
_start_transaction in keepalive.py (this fix may effectively reduce it to no
keepalive at all).

h.connect()
msg7857 Author: azraiyl Date: 2008-11-06.15:56:12
Happens here too (with crew) if I "push -f" about 80 MB from Windows to a Linux
server (I can reproduce it). I haven't tested it thoroughly, but if I replace

class HTTPHandler(KeepAliveHandler, urllib2.HTTPHandler):

with

class HTTPHandler(urllib2.HTTPHandler):

in keepalive.py the push finishes without an error.
msg7693 Author: aeomer Date: 2008-10-28.17:39:33
Note: improper use of the value returned by 'recv' can cause erroneous early
termination of a client/server conversation.

Many developers use the value returned by the recv functions to determine the
end of a transmission. The TCP specification shows this is an erroneous
assumption. It is quite possible for recv to return a block shorter than the
supplied buffer while there is still data at the socket waiting to be read.
The specification even allows recv to return a data length of zero even
if data is waiting to be read, but for the last 20 years or so all TCP stacks
have implemented a return value of zero as meaning the data is complete.

1. Please check that you are not assuming the communication is complete when
the returned data is smaller than the buffer.

2. Check that you are not using an overly large read buffer. Buffers are
commonly quite small on some hardware (4K). Although the OS can often prevent
issues here, the TCP spec says the code, rather than the TCP stack, needs to
take account of the issue. That seems daft to me, but that's how it goes
sometimes. Setting a buffer size of 32766 rarely goes wrong on Windows.

I have had exactly this problem in my encrypted comms server. Changing the 
logic to only terminate read on recv() returning zero fixed it.
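
As an illustration of the advice in this message, here is a minimal sketch of a read loop that stops only when recv() returns an empty result, never on a short read. It is written for Python 3 and is not taken from Mercurial or from the server mentioned above; the names recv_until_closed and demo, the 4096-byte buffer, and the use of socket.socketpair() for a local test are all assumptions made for the example.

```python
import socket
import threading


def recv_until_closed(sock, bufsize=4096):
    """Read from sock until the peer closes its end.

    A short read (fewer bytes than bufsize) is NOT treated as the end of
    the data; only an empty result from recv() ends the loop.
    """
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:  # empty bytes: the peer has closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def demo(payload_size=100000):
    """Send payload_size bytes over a local socket pair and count them."""
    left, right = socket.socketpair()
    payload = b"x" * payload_size

    def writer():
        # sendall() keeps sending until every byte has been handed over.
        left.sendall(payload)
        left.close()

    thread = threading.Thread(target=writer)
    thread.start()
    received = recv_until_closed(right, bufsize=4096)
    thread.join()
    right.close()
    return len(received)


if __name__ == "__main__":
    print(demo())
```

The payload is far larger than the read buffer, so the loop sees many partial reads before the writer closes its end. A loop that stopped at the first short read could return early with incomplete data.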
msg7593 Author: mathos Date: 2008-10-20.14:26:20
I have noticed this every time I do a large push from Windows (TortoiseHG or the
binaries from CLI) towards a Lighttpd or Apache2 server.

I did not reproduce this using the hg serve webserver.

If there is any information you require, I would be happy to provide it.
msg6046 Author: ThomasAH Date: 2008-05-20.08:50:26
I just got an independent report of this problem when pushing a single 5MB
changeset via plain http to freehg.org using TortoiseHG, so maybe I can use
that server to reproduce the problem.
msg6025 Author: jglick Date: 2008-05-13.00:23:08
Continues to be observed periodically by Windows users pushing large numbers of
changesets at once.
msg5678 Author: ThomasAH Date: 2008-03-21.20:58:08
I wasn't able to reproduce this either.
I used Windows XP with the TortoiseHG 0.3 installer as client and Mercurial
0.9.5 with hgweb.cgi with apache2 2.2.3-4+etch4 on the server.

I've noticed that Mercurial tries to reuse connections, but does not send
Connection: keep-alive. If keep-alive is sent, the server switches to chunked
transfer encoding with sometimes extremely small chunks, e.g. 

9
<a href="
13
/cgi-bin/hgweb.cgi?
9
shortlog/
c
509de8ab6f31
2
">
3
-60
5
</a> 


But this only happens if a proxy is used which translates the hg client's request.

mzlamal, jglick: I don't think a proxy is involved with https for you, right?
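
For readers unfamiliar with the chunked sample in this message: each chunk is preceded by its length in hexadecimal (so "13" means 19 bytes). The sketch below, in Python 3, rebuilds that sample and decodes it again. It is only an illustration of the wire format, not code from Mercurial or Apache. The function names encode_chunked and decode_chunked are made up here, and the decoder ignores trailers and discards chunk extensions.

```python
def encode_chunked(chunks):
    """Encode a list of byte strings as an HTTP/1.1 chunked body."""
    out = []
    for chunk in chunks:
        size_line = format(len(chunk), "x").encode("ascii")  # hex length
        out.append(size_line + b"\r\n" + chunk + b"\r\n")
    out.append(b"0\r\n\r\n")  # zero-length chunk terminates the body
    return b"".join(out)


def decode_chunked(raw):
    """Decode a chunked body back into the original payload."""
    body = []
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        # Drop any chunk extension after ';' and parse the hex size.
        size_field = raw[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        body.append(raw[pos:pos + size])
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return b"".join(body)


# The tiny chunks quoted in the message, as byte strings.
SAMPLE = [
    b'<a href="',
    b"/cgi-bin/hgweb.cgi?",
    b"shortlog/",
    b"509de8ab6f31",
    b'">',
    b"-60",
    b"</a> ",
]

if __name__ == "__main__":
    wire = encode_chunked(SAMPLE)
    print(wire)
    print(decode_chunked(wire))
```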
msg5628 Author: bos Date: 2008-03-17.15:04:29
Backed out in c92cf92c55a6.
msg5627 Author: blacktrash Date: 2008-03-17.11:27:19
The httplib of the Python versions I have here (2.5, 2.3)
only provides a send method, not sendall.
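
The observation above can be checked with a few lines. This sketch assumes Python 3, where httplib is named http.client, so it does not reproduce the 2.3/2.5 environment from this thread. The helper name send_methods is made up for the example.

```python
import http.client  # Python 3 name of the httplib module discussed here
import socket


def send_methods(cls):
    """Return which of 'send' and 'sendall' the given class provides."""
    return [name for name in ("send", "sendall") if hasattr(cls, name)]


if __name__ == "__main__":
    print("HTTPConnection:", send_methods(http.client.HTTPConnection))
    print("socket.socket: ", send_methods(socket.socket))
```

The connection class exposes only send, while the underlying socket class provides both, which is consistent with the AttributeError reported for 91ac1726730a.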
msg5626 Author: blacktrash Date: 2008-03-17.10:58:56
91ac1726730a breaks pull here
(because remote doesn't provide connection.sendall?):

pulling from http://selenic.com/repo/hg
** unknown exception encountered, details follow
** report bug details to http://www.selenic.com/mercurial/bts
** or mercurial@selenic.com
** Mercurial Distributed SCM (version 5f404dc8acd0)
Traceback (most recent call last):
  File "/usr/local/bin/hg", line 20, in <module>
    mercurial.dispatch.run()
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 20, in run
    sys.exit(dispatch(sys.argv[1:]))
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 29, in dispatch
    return _runcatch(u, args)
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 45, in _runcatch
    return _dispatch(ui, args)
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 364, in _dispatch
    ret = _runcommand(ui, options, cmd, d)
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 417, in _runcommand
    return checkargs()
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 373, in checkargs
    return cmdfunc()
  File "/usr/local/lib/python2.5/site-packages/mercurial/dispatch.py", line 356, in <lambda>
    d = lambda: func(ui, repo, *args, **cmdoptions)
  File "/usr/local/lib/python2.5/site-packages/mercurial/commands.py", line 2034, in pull
    modheads = repo.pull(other, heads=revs, force=opts['force'])
  File "/usr/local/lib/python2.5/site-packages/mercurial/localrepo.py", line 1455, in pull
    fetch = self.findincoming(remote, heads=heads, force=force)
  File "/usr/local/lib/python2.5/site-packages/mercurial/localrepo.py", line 1288, in findincoming
    heads = remote.heads()
  File "/usr/local/lib/python2.5/site-packages/mercurial/httprepo.py", line 367, in heads
    d = self.do_read("heads")
  File "/usr/local/lib/python2.5/site-packages/mercurial/httprepo.py", line 351, in do_read
    fp = self.do_cmd(cmd, **args)
  File "/usr/local/lib/python2.5/site-packages/mercurial/httprepo.py", line 305, in do_cmd
    resp = urllib2.urlopen(request(cu, data, headers))
  File "/usr/local/lib/python2.5/urllib2.py", line 121, in urlopen
    return _opener.open(url, data)
  File "/usr/local/lib/python2.5/urllib2.py", line 374, in open
    response = self._open(req, data)
  File "/usr/local/lib/python2.5/urllib2.py", line 392, in _open
    '_open', req)
  File "/usr/local/lib/python2.5/urllib2.py", line 353, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python2.5/site-packages/mercurial/httprepo.py", line 108, in http_open
    return self.do_open(httpconnection, req)
  File "/usr/local/lib/python2.5/site-packages/mercurial/keepalive.py", line 241, in do_open
    self._start_transaction(h, req)
  File "/usr/local/lib/python2.5/site-packages/mercurial/keepalive.py", line 313, in _start_transaction
    h.request(req.get_method(), req.get_selector(), body, headers)
  File "/usr/local/lib/python2.5/httplib.py", line 862, in request
    self._send_request(method, url, body, headers)
  File "/usr/local/lib/python2.5/httplib.py", line 885, in _send_request
    self.endheaders()
  File "/usr/local/lib/python2.5/httplib.py", line 856, in endheaders
    self._send_output()
  File "/usr/local/lib/python2.5/httplib.py", line 728, in _send_output
    self.send(msg)
  File "/usr/local/lib/python2.5/site-packages/mercurial/httprepo.py", line 99, in _sendfile
    connection.sendall(self, data)
AttributeError: class HTTPConnection has no attribute 'sendall'
msg5624 Author: bos Date: 2008-03-17.06:01:36
I haven't been able to reproduce this, but I found a simple and very obvious bug
that I think is probably responsible.  Please try 91ac1726730a and let me know
if it makes the problem go away.
msg5436 Author: bos Date: 2008-02-29.16:43:44
Thanks, Michal and Jesse.  I'll take a stab at reproducing this in my local
environment tonight.
msg5434 Author: mzlamal Date: 2008-02-29.15:18:57
Here is one of debug outputs:
[C:/Sun/rave/main] hg push -v --debug
using https://hg.netbeans.org/main/
http auth: user quynguyen, password *******
pushing to https://quynguyen:*******@hg.netbeans.org/main/
sending capabilities command
capabilities: unbundle=HG10GZ,HG10BZ,HG10UN lookup forests changegroupsubset
sending heads command
searching for changes
common changesets up to 8a23d48ae31f
9 changesets found
List of changesets:
7cb6c863cceeee62c038a712680ad3e466bea98d
05ea6b12eb5b7de517b8514a13510b6d7cdb384e
4184ef7b5e72c0c62cead3393089338eb6d286ed
14e9cb229811ae99e5c5a8da32ed81ac868e4009
94c759beb7e7b4a0d225ef65c04e7081e53f8d7e
2abc3e434950dbba356b1dceada09283b89dd4c5
0962e89828617d0d9c8ca644f4bc745664ee99ea
cb022e43b17204cf522cfe5e5262fc75c788d87d
0545645c7a042d0dcc71b4add586ee5cedcf4189
sending unbundle command
sending 32967 bytes
abort: error: Software caused connection abort

BTW, if you would like more details about the server-side setup, feel free to
ask for specific parts. I should be able to provide them to you then.
msg5431 Author: jglick Date: 2008-02-29.14:54:06
mzlamal may be able to add more, but:

SunOS hg 5.11 snv_62 i86pc i386 i86pc
Mercurial Distributed SCM (version 0.9.5)
Apache (some version), serving HTTPS

I don't know of any particular correlation to the amount of data pushed, though
that would certainly be a plausible place to look. I have heard people say that
they had been pushing fine for some weeks and one day the push failed with this
message, then a while later it started working again. I can't imagine they were
pushing _less_ when it started working again; if anything they would be pushing
slightly more (another merge changeset).
msg5422 Author: bos Date: 2008-02-29.05:02:21
Jesse, could you tell me a little about your server-side setup, so I can try to
reproduce this?

Also, do you know of any relationship between the amount of data you have to
push and how likely this crash is?
msg5385 Author: jglick Date: 2008-02-25.14:44:23
It seems a lot of Windows users get a sporadic error pushing over HTTPS:

C:\....>hg push --traceback
pushing to [.....]
searching for changes
Traceback (most recent call last):
 File "mercurial\dispatch.pyc", line 45, in _runcatch
 File "mercurial\dispatch.pyc", line 348, in _dispatch
 File "mercurial\dispatch.pyc", line 401, in _runcommand
 File "mercurial\dispatch.pyc", line 357, in checkargs
 File "mercurial\dispatch.pyc", line 340, in <lambda>
 File "mercurial\commands.pyc", line 2147, in push
 File "mercurial\localrepo.pyc", line 1385, in push
 File "mercurial\localrepo.pyc", line 1465, in push_unbundle
 File "mercurial\httprepo.pyc", line 425, in unbundle
 File "mercurial\httprepo.pyc", line 302, in do_cmd
 File "urllib2.pyc", line 130, in urlopen
 File "urllib2.pyc", line 358, in open
 File "urllib2.pyc", line 376, in _open
 File "urllib2.pyc", line 337, in _call_chain
 File "mercurial\httprepo.pyc", line 119, in https_open
 File "mercurial\keepalive.pyc", line 224, in do_open
 File "mercurial\keepalive.pyc", line 272, in _reuse_connection
 File "mercurial\keepalive.pyc", line 315, in _start_transaction
URLError: <urlopen error (10053, 'Software caused connection abort')>
abort: error: *Software caused connection abort*

No obvious way to reproduce, just happens a lot when trying to push.

A quick search turned up

http://www.velocityreviews.com/forums/t329892-socket-error-10053-software-caused-connection-abort.html

which implies a bug in Mercurial socket-handling code, though I know no more
than that.
History
Date User Action Args
2008-11-21 07:41:20 VaanDeFanel set nosy: + VaanDeFanel
messages: + msg7951
2008-11-19 10:48:56 mathos set nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal, madesroches, azraiyl, jk, mathos, aeomer
messages: + msg7924
2008-11-06 16:16:10 azraiyl set nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal, madesroches, azraiyl, jk, mathos, aeomer
messages: + msg7858
2008-11-06 15:56:14 azraiyl set nosy: + azraiyl
messages: + msg7857
2008-10-28 17:39:37 aeomer set nosy: + aeomer
messages: + msg7693
2008-10-20 14:26:24 mathos set nosy: + mathos
messages: + msg7593
2008-07-28 21:37:03 jk set nosy: + jk
2008-06-18 21:46:44 mpm set status: chatting -> need-eg
nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal, madesroches
2008-05-23 02:53:42 madesroches set nosy: + madesroches
2008-05-20 08:50:26 ThomasAH set nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal
messages: + msg6046
assignedto: bos -> ThomasAH
2008-05-13 00:23:11 jglick set nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal
messages: + msg6025
2008-03-21 20:58:11 ThomasAH set nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal
messages: + msg5678
2008-03-17 20:35:55 bos set status: testing -> chatting
nosy: bos, ThomasAH, blacktrash, pmezard, jglick, abuehl, mzlamal
2008-03-17 15:16:28 pmezard set nosy: + pmezard
2008-03-17 15:04:29 bos set nosy: bos, ThomasAH, blacktrash, jglick, abuehl, mzlamal
messages: + msg5628
2008-03-17 13:02:26 ThomasAH set nosy: + ThomasAH
2008-03-17 11:27:19 blacktrash set nosy: bos, blacktrash, jglick, abuehl, mzlamal
messages: + msg5627
2008-03-17 10:58:59 blacktrash set nosy: + blacktrash
messages: + msg5626
2008-03-17 06:01:36 bos set status: chatting -> testing
nosy: bos, jglick, abuehl, mzlamal
messages: + msg5624
2008-02-29 16:43:44 bos set nosy: bos, jglick, abuehl, mzlamal
messages: + msg5436
2008-02-29 15:18:57 mzlamal set nosy: bos, jglick, abuehl, mzlamal
messages: + msg5434
2008-02-29 14:54:06 jglick set nosy: + mzlamal
messages: + msg5431
2008-02-29 05:03:12 bos set nosy: bos, jglick, abuehl
assignedto: bos
2008-02-29 05:02:21 bos set status: unread -> chatting
nosy: + bos
messages: + msg5422
2008-02-25 15:48:54 abuehl set nosy: + abuehl
2008-02-25 14:44:27 jglick create