Phew, I finally did it. Migrating part of an svn-repository to a new repository while changing the destination paths isn’t as trivial as the svn manual implies. The workflow according to the book should look like this:
- Dump the repository while filtering with svndumpfilter
- Edit the dumpfile to change paths
- Load the dumpfile into the new repository
While this is basically what I did, it wasn’t as straightforward as it sounds…
Dumping and filtering the old repository isn’t that hard. The fact that you have to dump the whole repository and just snatch the relevant parts makes it annoying and slow, but it’s still doable:
svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/family/clockwork > clockwork.svndump
So, now we have the relevant portions of the old repository neatly in a dumpfile, let’s start editing. I wanted to move the project to a new repository in /group/work/svnrepo/ into a parent directory projects/department/clockwork/client. Ok, that’s easy. Just open the dumpfile in an editor and search-replace family with department and clockwork with clockwork/client. So far so good.
After checking and double checking that all the paths were right, it was now time for the third part: importing (this is going really smoothly, right?).
Here is where the troubles started piling up. If the repository contains binary files, svn calculates an md5 checksum of each one and adds it to the dump. Apparently, when editing the text part of the dumpfile, something in the binary data gets screwed up as well, so the file no longer matches its recorded md5 sum. This results in svn throwing the following error at you:
svnadmin load /group/work/svnrepo/ < clockwork.svndump
svnadmin: Checksum mismatch, file '/projects/family/clockwork/clockwork.jpg':
expected: 7d72863dab994058bbca4622d54fe21d
actual: 0bf7ea2bc06df62039375bb2cb3ffbd2
Crap. So, what to do? Of course I tried dumping the repo again, using another editor to edit the file, filtering the dumpfile through sed, checking the validity of the repository, etc etc. Nothing worked. Even Google didn’t want to give me any answers, so I was getting quite desperate.
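One way to narrow down a failure like this is to check whether an edit touched anything besides the path headers. The sketch below is not part of the original workflow: the function name `only_path_headers_differ` is made up for illustration, and it assumes a grep that supports `-a` (GNU and BSD grep do). It strips the `Node-path:` and `Node-copyfrom-path:` lines from both dumps and compares what is left byte for byte.

```shell
# Succeeds (exit 0) when two dump files are byte-identical once the
# Node-path / Node-copyfrom-path header lines are ignored.
only_path_headers_differ() {
    original=$1
    edited=$2
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }

    # -a makes grep treat binary data as text; LC_ALL=C keeps it byte-oriented.
    LC_ALL=C grep -a -v -E '^Node-(copyfrom-)?path: ' "$original" > "$tmp_a"
    LC_ALL=C grep -a -v -E '^Node-(copyfrom-)?path: ' "$edited" > "$tmp_b"

    if cmp -s "$tmp_a" "$tmp_b"; then status=0; else status=1; fi
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}
```

If this fails for a dump that was only supposed to get new paths, the editor or filter changed something else too, for example line endings or bytes inside file content.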
As a final trick, I tried to filter the stream coming from svnadmin dump directly through sed before redirecting the output to a file. There was quite a lot of sed’ing so the command grew into a monster:
svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/family/clockwork | sed 's/\/family/\/department/g' | sed 's/\/clockwork\//\/clockwork\/client\//g' > clockwork.svndump
Now, after this seemingly equivalent operation to that of streaming the dumpfile saved on disk through sed, the svnadmin load suddenly worked!
svnadmin load /group/work/svnrepo/ < clockwork.svndump
Amazing… I don’t know why this works like this, but I’m happy that it does. Actually I would be even happier if it had worked the way we all think it should…
But it didn’t stop here. Nooo. While I was comfortably assured that I had solved the deepest mysteries of svn, I was actually in line for a new ride on the horror train.
The next project I was about to migrate was actually an immigrant in a sense. It had started off in /projects/rolex, but after an svn copy operation it had moved to /projects/other/rolex. Now this was a true problem for svndumpfilter while filtering the dumpfile.
Naturally, since the project at the time of the dump resides in /projects/other/rolex, this is what we want to dump. Like this:
svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/other/rolex > rolex.svndump
Svn wasn’t at all happy with this. For me rolex’ past was long forgotten, svn on the other hand had an elephant’s memory:
svndumpfilter: Invalid copy source path '/projects/rolex'
Now what’s this all about? Some googling again didn’t reveal much (other than some patch proposals for svndumpfilter). Hmm, apparently svndumpfilter doesn’t like to dump a directory that has been copied in from somewhere else without also dumping the originating folder. So I tried to include the source folder as well:
svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/other/rolex /projects/rolex > rolex.svndump
This actually worked! Now, remembering the sed-pipe-disaster from the last project, the final command looked like this:
svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/other/rolex /projects/rolex | sed 's/\/rolex/\/server/g' | sed 's/\/other\//\/clockwork\//g' | sed 's/projects\///g' > rolex.svndump
Now loading the dumpfile into the new repository worked like a charm:
svnadmin load /group/work/svnrepo/ --parent-dir projects < rolex.svndump
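For a project with copies in its history, it can help to see up front which copy sources the dump refers to, so you know which extra paths svndumpfilter needs. The following is only a sketch: `list_copy_sources` is a made-up name, it assumes a grep with `-a`, and it could in principle also pick up file content that happens to start with the same header text.

```shell
# Print the distinct copy-source paths recorded in a dump file.
list_copy_sources() {
    LC_ALL=C grep -a '^Node-copyfrom-path: ' "$1" \
        | LC_ALL=C sed -e 's/^Node-copyfrom-path: //' \
        | LC_ALL=C sort -u
}
```

Running it on an unfiltered dump (for example `list_copy_sources full.svndump`, where `full.svndump` is a placeholder) gives a list of candidate paths to pass to `svndumpfilter include`.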
And now, I’m going to have a well deserved cup of coffee!





svnadmin load has a --parent-dir argument you can use to put the loaded paths under a specific directory in the new repo. I find it easiest to use that and then, if necessary, do an svn move to get things into the right place in the new repo.
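A rough sketch of that load-then-move approach follows. Everything in it is an assumption for illustration: the function name `load_then_move`, the `DRY_RUN` switch, and the placeholder paths. It also assumes the repository is reachable through a `file://` URL and that the staging directory does not exist yet.

```shell
# Load a dump under a staging directory, then move one subtree into place.
# Set DRY_RUN=1 to print the commands instead of running them.
load_then_move() {
    repo=$1     # filesystem path of the target repository
    dump=$2     # dump file to load
    staging=$3  # directory in the repository to load under
    from=$4     # loaded path, relative to the staging directory
    to=$5       # final path, relative to the repository root

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "svn mkdir --parents -m 'Create staging directory' file://$repo/$staging"
        echo "svnadmin load $repo --parent-dir $staging < $dump"
        echo "svn move --parents -m 'Move into place' file://$repo/$staging/$from file://$repo/$to"
        return 0
    fi

    # --parent-dir needs the directory to exist before loading.
    svn mkdir --parents -m 'Create staging directory' "file://$repo/$staging" &&
    svnadmin load "$repo" --parent-dir "$staging" < "$dump" &&
    svn move --parents -m 'Move into place' \
        "file://$repo/$staging/$from" "file://$repo/$to"
}
```

The move happens inside the new repository, so the dump file itself never has to be edited; the price is one extra revision and a history that starts under the staging path.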
But you just lucked out on this one. I’m trying to deal with a much more complex situation: a repository that has had a lot of copies done all over it, and has IP that I need to get rid of before I can import what code is mine into my repo.
It’s pretty inconvenient that svndumpfilter can do only includes or only excludes in a pass, but not both. It’s even more inconvenient that if you do multiple passes where svndumpfilter doesn’t fail, it can still produce a dump that can’t be loaded. I may just have to abandon most of my change history on this one, which is fairly frustrating.
Hi,
Thanks a ton for posting this. This was exactly what I was looking for and you really saved my day.
I also tried using sed to change file paths in an SVN dump, and ran into the same import error. I initially tried your suggestion to pipe through sed during the original dump command rather than on the saved file, and it didn’t work for me.
After a little further investigation I realized there is a ‘-b’ flag for sed to treat the input stream as binary. When I added this flag svnadmin load worked perfectly, even after running sed on the saved dump file!
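For context: `-b` (`--binary`) is a GNU sed option, and it mainly matters on platforms that distinguish text from binary files, such as Windows, where line endings may otherwise be converted. A quick way to see whether sed on your system passes a dump through unchanged is to run it with a script that changes nothing and compare the result. This is just a sketch; `sed_is_byte_transparent` is a made-up name, and any arguments after the file name are handed to sed as extra options.

```shell
# Succeeds (exit 0) when sed copies the file through without altering a byte.
# Usage: sed_is_byte_transparent FILE [extra sed options, e.g. -b]
sed_is_byte_transparent() {
    file=$1
    shift
    # 'sed -n p' prints every line unchanged; cmp compares stdin to the file.
    LC_ALL=C sed "$@" -n 'p' "$file" | cmp -s - "$file"
}
```

If `sed_is_byte_transparent clockwork.svndump` fails but `sed_is_byte_transparent clockwork.svndump -b` succeeds, the flag is what makes the difference on that machine.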
Don’t do string substitutions with "sed 's/\/family/\/department/g'"!
NEVER!
Because you are also modifying actual file content: binary data, as you discovered, but also text, for example if someone writes an essay about families.
Just search for “Node-path: family” or “Copyfrom-path: family” and replace this. See SVN book chapter 5 “repository administration” for more details.
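A header-only substitution along those lines could look like the sketch below. The function name `rewrite_path_headers` is made up, and it assumes the old and new paths are plain names without regex metacharacters or `#`. It rewrites a path prefix only on lines that start with `Node-path: ` or `Node-copyfrom-path: `, and only when the prefix is followed by `/` or the end of the line. A content line that happens to start with one of those headers would still be changed, so this is safer than a blanket replace but not a real dump parser.

```shell
# Rewrite a path prefix in the path headers of a dump read from stdin.
# Usage: rewrite_path_headers OLD_PREFIX NEW_PREFIX < in.dump > out.dump
rewrite_path_headers() {
    old=$1
    new=$2
    # \1 is the header name; \3 is the "/" (or nothing) after the old prefix.
    LC_ALL=C sed -E \
        "s#^(Node-(copyfrom-)?path: )${old}(/|\$)#\\1${new}\\3#"
}
```

For the clockwork example this would be `rewrite_path_headers projects/family/clockwork projects/department/clockwork/client`. The parent directory of the new prefix still has to exist in the target repository before loading.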
I have been visiting this site a lot lately, so I thought it would be a good idea to show my appreciation with a comment.
Thanks,
Jim Mirkalami
Thank you! I’m happy that my random thoughts are of use to someone.
HA. you’re hilarious. “svn had an elephants memory.” great hacking, and thanks for posting bud.
I’ve used this tip twice now, much thanks!
Maki brings up a good point with specific searching for svn node paths. I was bitten by this for about half an hour, each time getting svnadmin: Dumpstream data appears to be malformed. Turns out my sed query found other unrelated items and replaced them as well.
When I switched to searching for Node-path:, Copyfrom-path: and Node-copyfrom-path:, all was well.
Hi Guys,
Imagine the following repository structure:
“svn://localhost/Hardware/Circuits/trunk”
“svn://localhost/Hardware/Circuits/tags”
“svn://localhost/Hardware/Circuits/branches”.
When dumping the repo, I want the dump to be laid out so that, when it is loaded into a new repository, the directories “trunk, tags, and branches” end up at the root.
And I don’t want the directory “Circuits” in the new repository.
I modified the dump file, changing the “Node-path” and “Node-copyfrom-path” entries, but when I tried to load the dump file it said “Checksum error”. It was expecting one checksum value for the file but the actual checksum was different.
Then I thought of changing the content of the dump before it was written to a file.
Below is the script I wrote:
$ svnadmin dump C:/\svn_repository/\Hardware | svndumpfilter include /Circuits/trunk | sed -e "s/Node-path: Circuits\/trunk/Node-path: trunk/g" | sed -e "s/Node-copyfrom-path: Circuits\/trunk/Node-copyfrom-path: trunk/g" > hardwarecircuits.dmp
Even then I was not able to solve the checksum error issue.
I would appreciate any help. Hope to hear from you soon.
Regards
Kranthi
This works pretty nicely for me:
cat /backup/svn/svn.dump | svndumpfilter --drop-empty-revs --renumber-revs include frank | sed 's/Node-path: FOLDERNAME\//Node-path: /g' | sed 's/Node-copyfrom-path: FOLDERNAME\//Node-copyfrom-path: /g' | gzip -9 > FOLDERNAME.dump.gz
There was an error on my last post, this is correct:
cat /backup/svn/svn.dump | svndumpfilter --drop-empty-revs --renumber-revs include FOLDERNAME | sed 's/Node-path: FOLDERNAME\//Node-path: /g' | sed 's/Node-copyfrom-path: FOLDERNAME\//Node-copyfrom-path: /g' | gzip -9 > FOLDERNAME.dump.gz