The results are in, insights into improving NautilusSvn’s performance.

Based on the results so far from the performance poll (a sample of 130 users), it seems 38% of NautilusSvn users are not satisfied with the current performance of the extension, while about 43% could be considered satisfied. Another 19% rated performance as acceptable, but it’s a thin line both ways, so I won’t count them in either camp. Very few people flat out refuse to consider NautilusSvn because of this issue; if I had to take a guess I’d say some 11% ;-)

If I had to place a wager, I would bet that most people don’t mind if NautilusSvn takes a while to determine the status of entire working copies (to a certain extent), as long as it doesn’t overheat their CPU, doesn’t consume too much memory and, most importantly, doesn’t hang Nautilus. If anybody disagrees with this, please leave a comment.

However, judging by some of the comments, it seems that some people just have extremely large working copies, or tend to collect a lot of extremely large working copies in a single project directory. Based on some tests timing the PySVN status method and the command-line Subversion client, I would have to say that there’s really not a lot that can be improved with regard to actual performance. I’m sorry to have to say it, but that’s just the way it is.

Allow me to elaborate.

First, note that on average initial status checks take about 10x longer than subsequent ones. For example, the initial check for the entire TortoiseSVN working copy takes 8309.0079 milliseconds, while subsequent ones take 865.9279 milliseconds. That’s roughly 8.3 seconds compared to 0.9 seconds, and I think everybody will agree that that’s quite a difference. However, as I see it there’s simply no way to speed up that initial status check. So, say you have 15 working copies the size of TortoiseSVN organized in a single directory: upon entering that directory it would take NautilusSvn some 2 minutes (15 × 8.3 seconds ≈ 125 seconds) just to figure out the statuses.
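The arithmetic behind that estimate, and one way such timings could be taken, can be sketched roughly as follows. The helper names `time_call` and `estimate_total_seconds` are made up for illustration; in practice the callable passed to `time_call` would wrap whatever status call is being measured.

```python
import time


def time_call(func, *args):
    """Return how long func(*args) takes, in milliseconds."""
    start = time.perf_counter()
    func(*args)
    return (time.perf_counter() - start) * 1000.0


def estimate_total_seconds(per_copy_ms, copies):
    """Estimate the total time if working copies are checked one after another."""
    return per_copy_ms * copies / 1000.0


# Figures quoted above for the TortoiseSVN working copy.
INITIAL_MS = 8309.0079
SUBSEQUENT_MS = 865.9279

ratio = INITIAL_MS / SUBSEQUENT_MS              # a bit under 10x
total = estimate_total_seconds(INITIAL_MS, 15)  # a bit over 2 minutes, in seconds
```

Calling `time_call` twice in a row on the same working copy would give the initial and the subsequent figure respectively, assuming nothing else has warmed the disk cache in between.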

Also note that if we want to properly keep working copy and directory emblems up-to-date, we’ll also have to recursively register watches on each working copy. Initially registering those watches using inotify takes quite a few seconds in the case of the TortoiseSVN working copy (I didn’t time it).

There’s still room left to make NautilusSvn more efficient and perhaps a bit more snappy here and there. The status logic in particular can still be improved, and implementing a proper cache will certainly help make subsequent status checks even faster. But none of this will result in substantial improvements in the area of initial status checks.

So if there isn’t a way to substantially improve the performance of initial status checks, what can we do? What I know we can do is create the illusion of performance, or possibly degrade some functionality. Here are a few things that come to mind:

  • Pre-loading working copies. Do the initial status checks when the user isn’t looking. It’s probably a good idea to not do this immediately after booting. We would also have to make sure the computer is not on battery power.
  • Let the user disable NautilusSvn for certain directories.
  • Do some scheduling tricks and progressively check parts of a working copy. This will also help prevent sustained 100% CPU usage (which leads to overheating).
  • Execute the status checks asynchronously.
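As a rough sketch of how the progressive and asynchronous ideas above might fit together: walk a working copy one directory at a time in a background thread, pausing between directories so the CPU is never pinned. Everything here is illustrative; `check_status` and `on_result` are hypothetical callables supplied by the caller, not part of NautilusSvn or PySVN, and a real extension would presumably have to hand results back to the GUI main loop rather than call `on_result` directly from the worker thread.

```python
import os
import threading
import time


def iter_directories(root):
    """Yield directories under root one at a time, skipping .svn metadata."""
    for dirpath, dirnames, _filenames in os.walk(root):
        # Prune in place so os.walk does not descend into .svn directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        yield dirpath


def check_progressively(root, check_status, on_result, pause=0.05):
    """Check one directory at a time, sleeping in between to limit CPU usage."""
    for path in iter_directories(root):
        on_result(path, check_status(path))
        time.sleep(pause)


def start_background_check(root, check_status, on_result, pause=0.05):
    """Run the progressive check in a daemon thread so the caller never blocks."""
    worker = threading.Thread(
        target=check_progressively,
        args=(root, check_status, on_result, pause),
        daemon=True,
    )
    worker.start()
    return worker
```

The `pause` value trades total scan time for lower CPU load; it does nothing to make the underlying status call any faster.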

However, let me point out that doing everything asynchronously (i.e. in the background) will only obscure performance issues. Sure, Nautilus wouldn’t hang anymore, but NautilusSvn would still be hacking away in the background (possibly causing your CPU temperature to rise to unacceptable levels). Especially when developing, I find Nautilus hanging a very useful indicator of whether or not progress is being made.

In the end, irrespective of the performance issues, what I’m most interested in is having an elegant, flexible, maintainable and robust codebase.

Any thoughts?

P.S.

I hope to be posting more blog entries of this type, that is, if people are interested in hearing me talk about this. :-) Now, back to pointless, incessant barking.

8 responses to “The results are in, insights into improving NautilusSvn’s performance.”

  1. The different suggestions you make about improving, or hiding from the user, the performance issues seem like a good idea to me. Well, I mean, they’re maybe not the best way, and other users’ comments with other helpful hints could be of help too, but I think that the little tricks such as running the initial checks in the background or executing the status checks asynchronously could be something to consider, as long as it is possible to enable/disable that feature in the settings, since not everybody would like it. Personally, I would be pleased with something that would stop Nautilus from hanging even if the performance stayed the same; the statuses would just periodically refresh while that initial status check does its job. I use SVN for cross-platform and cross-computer file sharing and backups (yeah, people can argue about that, but I like the way I can just commit when I have deleted old music or such), so my directories are enormous, and loading time is a killer on my home.

    Anyway, keep up the good work. When I was Googling for a TortoiseSVN-like application for Nautilus, stumbled on your application and installed it, I was astonished by the number of features implemented for the beta release. Moreover, when I saw that a repo-browser was on the road-map, my choice was clear. Just keep up the good work, and we’ll see from the other comments what others think :) I give an A+ for NautilusSVN!

  2. Carlos Tasada says:

    Hi Bruce,

    The main issue with the performance, from my point of view, is the continuous freezes in Nautilus when NautilusSVN is updating (mainly after an svn commit or an svn update).

    If this issue can somehow be fixed, it doesn’t matter if it still takes time, as long as you can continue working with the system; then I think it will be good enough for an official 0.12 release.

    Just tell me if I can help with anything ;)

  3. I’m unfortunately one of the 11%, since we have to deal with a huge projects root folder containing a lot of projects/repos – which is simply too much for NautilusSVN.

    I’ve quickfixed the issue now by excluding that folder (hard-coded), using your hint:

    > if (os.path.dirname(realpath(gnomevfs.get_local_path_from_uri(uri)))
    > == "/media/Sites"): return False

    However, it would be great if

    a) excluded folders were configurable
    b) it scanned/updated subfolders of excluded folders the moment one clicks on them
    (e.g. /media/Sites/xyz)

    I don’t need every subfolder/project folder updated all the time – just the ones I currently work with. This could solve the issue for the unfortunate 11% with too big projects root folders elegantly.

    • Hey Martin, I think I remember our discussion on the mailing list. :-) Both your suggestions are great. So basically a tab in the settings dialog where you have a table to add paths and set a state of “always check”, “only check on click”, “never check”, “never check children” would be sufficient for you?

      For bonus points it would be really cool if NautilusSvn were able to automatically detect which working copies you work the most with.

      If you can file two separate requests for enhancement I’ll be sure to implement both.
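      To make the idea a bit more concrete, here is a minimal sketch of how such a table of paths and states might be consulted. The function names and the exact semantics are assumptions made purely for illustration: the most specific matching path wins, and “never check children” is taken to apply to everything below a path but not to the path itself.

```python
import os

ALWAYS = "always check"
ON_CLICK = "only check on click"
NEVER = "never check"
NEVER_CHILDREN = "never check children"


def policy_for(path, rules, default=ALWAYS):
    """Return the state of the most specific rule that applies to path."""
    path = os.path.normpath(path)
    best, best_len = default, -1
    for rule_path, state in rules.items():
        rule_path = os.path.normpath(rule_path)
        is_same = path == rule_path
        is_child = path.startswith(rule_path.rstrip(os.sep) + os.sep)
        # Assumed semantics: NEVER_CHILDREN covers descendants only.
        applies = is_child if state == NEVER_CHILDREN else (is_same or is_child)
        if applies and len(rule_path) > best_len:
            best, best_len = state, len(rule_path)
    return best


def should_check(path, rules, clicked=False):
    """Decide whether a status check should run for path right now."""
    state = policy_for(path, rules)
    if state == ALWAYS:
        return True
    if state == ON_CLICK:
        return clicked
    return False


# Hypothetical configuration mirroring the /media/Sites example above.
rules = {"/media/Sites": ON_CLICK}
```

      With these rules, `should_check("/media/Sites/xyz", rules)` is false until the user actually clicks the folder, at which point the caller would pass `clicked=True`.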

  4. Thierry Bothorel says:

    Could the use of an interpreted language such as Python be the bottleneck, compared to a compiled program such as Tortoise, or is this irrelevant?

  5. Consider my suggestion at comment
    http://cobradragon.com/nautilussvn/archives/73/comment-page-1#comment-2467

    I made just a minute ago. Basically it’s similar to what Martin Bachmann said, but only for the root folder.

    I distinctly recall that TortoiseSVN never checks all the folders recursively; it only updates the emblems on opening the directory itself. I like the idea that NautilusSVN really does a recursive check, and therefore only propose to disable the real-time emblems for the root checkout directory.
    Then Nautilus won’t lock up simply because one needed to edit a file outside of the SVN directory but inside the same project directory…

    /home/jere/project1/mySVN/ -> update emblems
    /home/jere/project1/ -> don’t update emblems, just show an old/cached/general SVN emblem

    I think a lot of users would find it acceptable to have slow performance if it’s really restricted to the inside of the svn-directories (“cd”-ing). Merely “ls”-ing the project directory shouldn’t lock up the browser.

  6. skorka says:

    First of all, I’m new here and want to congratulate you on the initiative, which fills a real gap.
    Speaking from personal experience, I’m dealing with a repository of 4GB containing about 25000 files. It is hosted on an external server, and I’m accessing it from the office under Windows XP / TortoiseSVN and at home from an Ubuntu box.
    From the office, directory access is quasi-immediate. Update/commit take a reasonable time.
    From Ubuntu, I tried NautilusSvn and it proved unusable in my case. It takes about 10 to 15 minutes just to open 4 nested directories (and sometimes seems to hang).
    Good luck, and I hope that the performance issue will be among your priorities… in the meantime I had to fall back to a simpler tool (eSvn), the use of which is quite fluid.