Monday, January 04, 2010

Give in to the Darcs Side

Continuing from my last post, I want to talk about the SCM systems I will not be considering:

Darcs
  • Written in Haskell, considered the poster-program for Haskell. Haskell is a functional language, similar to ML. Given the horror of ML and the behavior of Darcs, I am not motivated.
  • Operates on a "theory of patches": finding the right set of patches and applying them in the right order. I don't believe it is possible for a machine (currently) to handle this level of complexity. Heck, I don't know that I can handle the complexity!
  • Ate our repo, twice. Losing the repo is an unforgivable sin for SCM.
  • Was really slow. They are supposed to have fixed the number of "exponential merges" (nice). But it was slow all the time. The exponential merges just added insult to injury.
BitKeeper
  • Commercial software. No way I am going to pay for software for my personal projects.
  • Also ate the repo. Maybe twice.
SVN
  • No distributed model
  • Logs and history are not available offline
  • While creating branches is now easier than in CVS, merging them back is still just as hard (this is a hard problem, so I don't blame them)

3 comments:

kowey said...

Ned: thanks for your post. I stumbled on it recently on my semi-regular trawling of Google's blogsearch on Darcs.

Indeed, it is unforgivable for Darcs to have eaten your repos. Sorry.

If you still have them around somewhere, we'd love to look into them (bugs@darcs.net). Perhaps the breakage is due to something we've already fixed (and we've been doing a lot of fixing since the last time you used Darcs). Some causes for errors might be corruption in your pristine directory (which Darcs 2 now more skillfully avoids by hashing that dir), or pending patch errors (which David Roundy has squashed).

Now to address some other points:

1. Perhaps a better poster-program for Haskell nowadays would be xmonad. Oh, we'll keep working on Darcs, continuously removing suckage, even if it takes us a decade to get there. Hopefully Haskell will be vindicated when we do :-)

2. The theory of patches stuff is mostly just boring accounting work. All it cares about are straightforward dependencies: if I apply this patch, will I be able to apply that other patch? There's no voodoo there, just induction.

I don't think that Darcs does anything necessarily harder than what Git and the others do; we just try to be more precise about it. Rather than guessing where patches apply with fuzz factors and what not, we simply keep track of the previous patches and see if the new patch lines up. That's it.

My point is that what the theory of patches does, ultimately, is replace guesswork with tracking. No magic...
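To make "tracking instead of guessing" a bit more concrete, here is a toy sketch in Python. It is not how Darcs is actually implemented: the Hunk type and the lines_up, apply_hunk, replay and can_apply helpers are all made up for this illustration. The idea is only that a patch records exactly what it expects to find, and we check it against the state produced by the patches already recorded, with no fuzzy matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hunk:
    """Hypothetical patch: at 0-based `line`, replace the lines `old` with `new`."""
    name: str
    line: int
    old: tuple
    new: tuple


def lines_up(doc, hunk):
    """True only if the document contains exactly `old` at the recorded position."""
    end = hunk.line + len(hunk.old)
    return 0 <= hunk.line <= len(doc) and tuple(doc[hunk.line:end]) == hunk.old


def apply_hunk(doc, hunk):
    """Apply the hunk with no fuzz: refuse unless it lines up exactly."""
    if not lines_up(doc, hunk):
        raise ValueError(f"patch {hunk.name!r} does not line up")
    end = hunk.line + len(hunk.old)
    return doc[:hunk.line] + list(hunk.new) + doc[end:]


def replay(history):
    """Rebuild the document by applying the recorded patches in order."""
    doc = []
    for hunk in history:
        doc = apply_hunk(doc, hunk)
    return doc


def can_apply(history, hunk):
    """Answer: given the patches already applied, can this patch be applied?"""
    return lines_up(replay(history), hunk)


# Two made-up patches: the second depends on the first having been applied.
add_greeting = Hunk("add-greeting", 0, (), ("hello", "world"))
fix_case = Hunk("fix-case", 0, ("hello",), ("Hello",))
```

In this sketch, can_apply([add_greeting], fix_case) holds because the first patch created the line the second one edits, while can_apply([], fix_case) does not, since nothing has produced that line yet. The dependency falls out of the bookkeeping rather than from guessing where the change might fit.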

True, things do get tricky with conflicts, and this is still a bit of an open question for us, but for a lot of use cases (including our own), what we have works well enough. We will have to do some more long-term conflicts work, however.

3. Slow: It's still slow, and you're exactly right that the exponential merge stuff is not the key problem.

There are a lot of areas we've been making progress in: ssh connection sharing, faster repo-local operations, global caching and lazy fetching of patches. These are the kinds of things that have been happening since Darcs 2 and up.

We still have a long way to go. We still need to make darcs get a lot faster (by borrowing the notion of packs, for example). We still need to reduce the amount of memory we're using. We still need to tweak the complexity of some of our inventory-manipulating operations. But we will get there.

Hope some of this has been useful :-)

nedbrek said...

Hello, welcome!

It was for a company project, and I no longer work there...

kowey said...

That's a shame. Oh well, thanks anyway.