
Linus definitely did his own thing with git. The general ideas came from BK: BK gave you clone/pull/push/commit as the model. Everyone copied that because it just makes sense. The all-or-nothing clone model came from BK too.

How it is all glued together differs quite a bit. BK has the concept of a revisioned file; git does not, it versions trees. That's why Linus thinks renames are silly: he doesn't care about them, he cares about the tree.

The graph of commits comes straight from BK. That's BK's changeset file, which is sort of neat in that it is a version controlled file itself. BK is the only system that I know of that uses a versioned file to store the metadata.

OK, so on the business model thing, I'm not sure. The way we did the old format is compatible but pretty slow: it converts to the new format in memory and then converts back if you write it out. It's slower than the older implementation (but this way we have one in-memory format, fewer bugs). I thought it was good enough for small projects, my team overrode me and said "too slow".

As for enterprise customers "happily paying", um, no. We constantly get whacked with "if you don't do this or that we're moving to git". Which could be viewed as a good thing, we have to keep making it better, but it gets tiresome.



Thanks! Renames make archeology difficult in git. I've become reluctant to change {file,directory} names, even when it's clearer...

BTW: What are the benefits of versioning changesets themselves? Isn't it rare to only change the changeset?

Chained conversions are elegant, but slower code is unsatisfying... I guess such hobbling is the essence of open-source-as-freemium. :(

I meant they "happily pay" for full over free versions. (For them, it's also paying for "new" features!)


Renames are a thing and git made the wrong choice there. It's not like we are perfect but we are way closer.

So on versioning changesets I didn't really explain. Lemme try again.

In any DVCS you have a bill of materials, that's what describes the tree. Git's is different than ours because they don't version files, we do. So our bill of materials looks like:

  path/to/file <version>
  path/to/different_file 1.1
  path/different_dir/a_file 1.19

If you "cat" the changeset file as of any version you get what the tree looks like, a list of files and a list of revisions.
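
To make the bill-of-materials idea concrete, here is a minimal Python sketch. It is not BK's file format or code; `changeset_history` and `cat_changeset` are made-up names, and the revision numbers are invented sample data. It only shows the idea of asking "what did the tree look like as of changeset version X".

```python
# Hypothetical model (not BK's real storage): each version of the
# changeset file maps file paths to the file revision in that tree.
changeset_history = {
    "1.1": {"path/to/file": "1.1"},
    "1.2": {"path/to/file": "1.2", "path/to/different_file": "1.1"},
}


def cat_changeset(history, version):
    """Return the bill of materials for one changeset version,
    one "path revision" line per file, sorted by path."""
    tree = history[version]
    return [f"{path} {rev}" for path, rev in sorted(tree.items())]


if __name__ == "__main__":
    for line in cat_changeset(changeset_history, "1.2"):
        print(line)
```

The point of the sketch is only that the tree at any commit is a flat list of (file, revision) pairs, so anything that can show one version of a file can show the tree.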

Of course it doesn't work like that because, um, reality and merges and parallel development. We have UUIDs for each file and each version so it looks like

UUID_for_a_file UUID_for_a_version

and our UUIDs are pretty sweet, not sha1 or some other useless thing, they are

user@host|path/to/where/it/was|YYYYMMDDHHMMSS|checksum

Those are for each node in the graph. For the very first node, which is the UUID for the file, there is a "|<64 bits of /dev/random>" appended.
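
As a rough illustration of that layout, the Python sketch below splits such a key into its fields. This is an assumption-laden toy, not BK's parser: the `Key` class, the `parse_key` function, and the sample values are all hypothetical, the checksum and random suffix are kept as opaque strings because their encoding isn't described here, and it assumes no field contains a "|".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Key:
    user_host: str                     # user@host
    path: str                          # path/to/where/it/was
    timestamp: datetime                # parsed from YYYYMMDDHHMMSS
    checksum: str                      # kept opaque (encoding assumed unknown)
    random_bits: Optional[str] = None  # only on a file's very first node


def parse_key(text: str) -> Key:
    """Split a '|'-separated key into fields (assumes no '|' inside a field)."""
    parts = text.split("|")
    if len(parts) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(parts)}")
    user_host, path, stamp, checksum = parts[:4]
    random_bits = parts[4] if len(parts) == 5 else None
    timestamp = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return Key(user_host, path, timestamp, checksum, random_bits)


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    print(parse_key("alice@example.com|src/main.c|20160510123000|12345"))
```

Unlike a bare hash, every field in a key like this is readable on its own: who made the node, where the file lived, and when.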

So the changeset file is just a list of

UUID UUID

Not sure if that helps.

The benefit of versioning the file that holds all that data is we can use BK to ask it stuff. Want to see the history of the repo?

  bk revtool ChangeSet

Want to see what files changed in a commit?

  bk diffs -r$commit ChangeSet

Yeah, we have to process all the UUIDs and turn them into pathnames and revisions but we can do that and do it fast. So it works.
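
For readers who want to see the shape of that processing, here is a small Python sketch. It is not how `bk diffs` is implemented; `changed_files`, `describe`, and the lookup tables are hypothetical, and short placeholders like "F1" and "V1" stand in for the full keys. It compares two snapshots of the "UUID UUID" list and turns the changed entries back into pathnames and revisions.

```python
def changed_files(before, after):
    """Return the file keys whose version key differs between two
    snapshots, including files added or removed."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))


def describe(file_key, snapshot, paths, revisions):
    """Turn a (file key, version key) pair into 'pathname revision'."""
    version_key = snapshot.get(file_key)
    if version_key is None:
        return f"{paths[file_key]} (removed)"
    return f"{paths[file_key]} {revisions[version_key]}"


if __name__ == "__main__":
    # Placeholder keys and made-up lookup tables, for illustration only.
    before = {"F1": "V1", "F2": "V2"}
    after = {"F1": "V3", "F2": "V2", "F3": "V4"}
    paths = {"F1": "path/to/file", "F2": "path/to/different_file",
             "F3": "path/different_dir/a_file"}
    revisions = {"V1": "1.1", "V2": "1.1", "V3": "1.2", "V4": "1.1"}
    for key in changed_files(before, after):
        print(describe(key, after, paths, revisions))
```

Because the snapshot is keyed by file identity rather than by path, a rename in this toy model shows up as the same file key resolving to a different pathname, not as a delete plus an add.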

All the tools we built to look at stuff can look at the metadata. That's worked out well.


Thanks!



