From: Archie Cobbs (archie@whistle.com)
Date: Sat Dec 19 1998 - 03:59:36 EST
Godmar Back writes:
> there is one fundamental reason to oppose your string internalization
> project. I think this is the reason that Java's designers made
> string internalization optional even though java strings were defined
> to be immutable.
>
> The reason is that you pay an overhead at creation time: you must
> look up the string, which involves grabbing a lock (*),
> walking a hashtable, comparing the contents, etc.
> Plus, you'll have to remove the entries from the table upon
> deallocation.
>
> You will get a payback for this overhead if and only if you either
> intern that string or if you perform equals on it. Now without
> traces, I cannot tell you whether that will be a win for most
> string creation, but there is reason to doubt it.
Yes, good point. This is the kind of question that can be debated,
but not really answered except in light of some specific measurable
goals. I mean, what kind of apps is kaffe optimizing for?
If your application is not string intensive, then such a change
wouldn't matter. If it is string intensive, then is it likely that
many strings are duplicates? (the more duplicates, the more memory
saved). Are there a lot of string comparisons? (the more comparisons,
the more time saved). Hmm.
Probably a good bit of string storage is String constants from
class files, too.
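To make the trade-off concrete, here is a minimal sketch of an intern table in plain Java. It is only an illustration under assumptions: the class name InternSketch, the TABLE map, and the intern() helper are hypothetical and say nothing about how kaffe actually stores strings. It shows the creation-time cost (a lock plus a hash lookup) and the payback (duplicates share one object, so equality becomes a reference compare).

```java
import java.util.HashMap;
import java.util.Map;

public class InternSketch {
    // Hypothetical intern table, for illustration only; not kaffe's data structure.
    private static final Map<String, String> TABLE = new HashMap<>();

    // Creation-time cost: take a lock, hash the contents, compare on a hash match.
    static synchronized String intern(String s) {
        String canonical = TABLE.get(s);
        if (canonical == null) {
            TABLE.put(s, s);   // first occurrence becomes the canonical copy
            canonical = s;
        }
        return canonical;
    }

    public static void main(String[] args) {
        String a = intern(new String("kaffe"));
        String b = intern(new String("kaffe"));
        String c = new String("kaffe");      // never interned

        System.out.println(a == b);          // true: duplicates share one object
        System.out.println(a == c);          // false: c is a separate object
        System.out.println(a.equals(c));     // true: contents still match
    }
}
```

Note that this sketch never removes entries, so it ignores the deallocation cost mentioned above; a real table would need weak references or a hook in the collector.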
It would be nice to come up with a mixed test suite of representative
applications (a compiler, a GUI app, etc.) and take some measurements.
How many newly created strings are duplicates, etc.
Would make for an interesting mini-research project.
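As a rough sketch of what such a measurement could look like, the following computes the fraction of created strings whose contents had already been seen. The DuplicateStats class and the sample trace are made up for illustration; a real experiment would need the VM to log every string creation, which this does not attempt.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateStats {
    // Fraction of strings in the trace whose contents appeared earlier in the trace.
    static double duplicateFraction(List<String> created) {
        if (created.isEmpty()) {
            return 0.0;
        }
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (String s : created) {
            if (!seen.add(s)) {   // add() returns false if already present
                duplicates++;
            }
        }
        return (double) duplicates / created.size();
    }

    public static void main(String[] args) {
        // Made-up trace, standing in for strings a compiler might create.
        List<String> trace = Arrays.asList("int", "x", "int", "y", "int", "x");
        System.out.println(duplicateFraction(trace));   // prints 0.5
    }
}
```

A high fraction would suggest memory savings from interning; it says nothing about comparison frequency, which would have to be counted separately.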
Also, since strings are often created from other strings (e.g., using
String.substring()), what about storing all strings in a suffix tree?
Interesting to think about.
-Archie
___________________________________________________________________________
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com