2011/10/23 Greg Ward <span dir="ltr">&lt;<a href="mailto:greg@gerg.ca">greg@gerg.ca</a>&gt;</span><br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Hi all --<br>

<br>

a few days ago, Eli Carter perfectly described the confusion about<br>

stores and caches with largefiles:<br>

<br>

  <a href="http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/44912" target="_blank">http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/44912</a><br>

<br>

The ensuing thread got us somewhere, and I think the patches sent by<br>

Benjamin as a result helped. But I'm still confused about a rather<br>

fundamental point: on the client, why do we need *both* a user cache<br>

(currently ~/.cache/largefiles) *and* a local store (.hg/largefiles)?<br>

<br>

The server-side is fairly clear: we must have a complete and canonical<br>

store containing every revision of every large file in history. That<br>

is what .hg/largefiles is for *on the server* (right?). And there is<br>

no need for a cache on the server, because no one has a working dir on<br>

the server. (And if they did, I suppose you could just take large file<br>

revs straight from the store.)<br>

<br>

(Yes, all this talk of clients and servers is unorthodox in DVCS<br>

circles, but you know what I mean. There's a big difference between<br>

the repo on your local disk where you work, and the repo out there on<br>

the network that you push to. largefiles just takes this existing<br>

informal distinction and makes it a little more formal; it injects a<br>

little old-fashioned client/server talk from the bygone days of CVCS<br>

into a modern DVCS. Yes it's impure, but "practicality beats purity".)<br>

<br>

But back on the client, where I pull and push and update and commit,<br>

what purpose does .hg/largefiles serve? Having a local cache is<br>

obviously a good thing, although it's not essential. (I never got<br>

around to implementing caching with bfiles, and we've lived without<br>

it. It wastes bandwidth and increases network uptime requirements, but<br>

our LAN at work is fast and reliable. And our biggest bfile is ~30 MB:<br>

peanuts by game developer standards.)<br>

<br>

More importantly, the very meaning of .hg/largefiles appears to be<br>

inconsistent from reading hgext/largefiles/design.txt: on the server,<br>

it contains every revision of every largefile ("complete and<br>

canonical"). But on the client, it's just a subset of that. So ...<br>

it's ... like ... a cache. Except it's not called a cache; that's what<br>

~/.cache/largefiles is. Huh?<br>

<br>

The only reason I can see for having something other than a cache is<br>

for outgoing revs: if I do<br>

<br>

  hg add --large largefile<br>

  hg commit<br>

  # modify largefile<br>

  hg commit<br>

  # modify largefile<br>

  hg commit<br>

  hg push<br>

<br>

Then push has to send 3 revs to the server. Only one of them is in the<br>

working dir, and even that's not guaranteed: the user might have<br>

modified it post-commit. So those committed revs have to come from<br>

somewhere. Keeping them in the cache is risky, because users are<br>

allowed to nuke their ~/.cache directory whenever they like. (And<br>

maybe someday we'll add LRU semantics with maximum disk space to the<br>

largefiles cache.)<br>

<br>

If *that* is the purpose of .hg/largefiles on the client, then I<br>

understand. But I think it's dangerous using .hg/largefiles as a<br>

complete canonical store on the server, and as<br>

subset-of-the-store-that-is-kinda-like-a-cache-but-not-really-and-oh-by-the-way-also-holds-outgoing-revs<br>

on the client.<br>

<br>

Why not .hg/lfoutgoing?<br>

<br>

Greg<br clear="all"></blockquote></div><br>I think the fundamental thing you are missing is that a user can easily have multiple clones that share the same set of largefiles.  For a team that uses branch-by-cloning, this is almost *certainly* the case.  Our team does, and I'm sure others do too.  That will remain true until either feature branching with named branches is no longer discouraged, or bookmarks are actually supported in the real world (that is, by hosting solutions, continuous integration solutions, etc.).<br>

<br>Because a copy of every largefile is kept in a local cache, a user who makes a new branch clone, or updates to a revision that needs a largefile already used by another clone, can simply copy it out of the cache rather than re-download it.  That saves bandwidth, which is one of the goals of this extension anyway.<br>
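<br>To make the lookup order concrete, here is a rough sketch in Python.  It is not how the extension is actually implemented: <code>fetch_largefile</code>, its parameters, and the <code>download</code> callback are all hypothetical names, and it assumes each largefile is addressed by the SHA-1 of its contents.  The idea is only that a clone looks in its own store first, then in the shared user cache, and goes to the network last.<br>
<pre>
```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def fetch_largefile(sha1: str, user_cache: Path, repo_store: Path,
                    download: Callable[[str], bytes]) -> Path:
    """Return the path of largefile `sha1` inside `repo_store`.

    Hypothetical helper: checks the clone's own store, then the shared
    per-user cache, and only calls `download` when both miss.
    """
    stored = repo_store / sha1
    if stored.exists():
        # This clone already has the revision.
        return stored

    cached = user_cache / sha1
    if not cached.exists():
        # Cache miss: this is the only path that costs bandwidth.
        data = download(sha1)
        if hashlib.sha1(data).hexdigest() != sha1:
            raise ValueError("downloaded data does not match " + sha1)
        user_cache.mkdir(parents=True, exist_ok=True)
        cached.write_bytes(data)

    # Cache hit (or freshly filled): copy into this clone's store.
    repo_store.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(cached, stored)
    return stored
```
</pre>
With two clones pointed at the same cache directory, only the first call for a given hash reaches <code>download</code>; the second clone is served by a local copy.<br>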

<br>Cheers,<br>Na'Tosha<br><br>-- <br><div><div><span style="color: rgb(153, 153, 153);"><b>Na'Tosha Bard</b></span></div><div><font color="#999999">Build &amp; Infrastructure Developer | Unity Technologies</font></div>

<div><font color="#999999"><br></font></div><div><font color="#999999"><b>E-Mail:</b> <a href="mailto:natosha@unity3d.com" target="_blank">natosha@unity3d.com</a></font></div><div><font color="#999999"><b>Skype:</b> natosha.bard</font></div>

</div><br>