Monday, October 15, 2007

String to char-array in Java

For those curious about Java versus LotusScript (LS): it's lightning fast.

String to char[]: 15ms, string length, 700 000

Added to the string concatenation "benchmark"

17 comments:

Charles Robinson said...

That's bleeding fast! What about stepping through it a character at a time? :-)

Tommy Valand said...

StringBuffer-concatenation (75 000 concatenations): 62ms, string length, 300 000

String to char[]: 0ms, string length, 300 000

Step through char[]: 78ms, char[] length, 300 000

I updated the "benchmark"-script, if you want to take a look at it.
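For readers without the benchmark script at hand, the three measured steps might look roughly like the sketch below. This is not the script from the post: the class name, the 4-character piece "abcd" (chosen only so that 75 000 appends give a 300 000-character string) and the counting loop are assumptions made for illustration.

```java
final class ConcatAndStep {
    // Appends `piece` to a StringBuffer `count` times and returns the result.
    static String build(String piece, int count) {
        StringBuffer sb = new StringBuffer(piece.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    // Steps through the array one char at a time, counting matches of `target`.
    static int countChar(char[] chars, char target) {
        int hits = 0;
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == target) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        long t0 = System.currentTimeMillis();
        String s = build("abcd", 75000);      // hypothetical 4-char piece
        long t1 = System.currentTimeMillis();
        char[] chars = s.toCharArray();       // String to char[]
        long t2 = System.currentTimeMillis();
        int hits = countChar(chars, 'a');     // step through char[]
        long t3 = System.currentTimeMillis();

        System.out.println("StringBuffer concatenation: " + (t1 - t0) + " ms, length " + s.length());
        System.out.println("String to char[]: " + (t2 - t1) + " ms");
        System.out.println("Step through char[]: " + (t3 - t2) + " ms, hits " + hits);
    }
}
```

Printing the count gives the stepping loop a visible result, which matters for the reasons raised further down in the comments.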

Charles Robinson said...

I think I need to learn some Java. That's simply amazing.

Tommy Valand said...

I'm currently in that same process myself.

One of the (few) advantages in Notes lagging behind in the Java runtime versions is that there are a lot of cheap (used) books on Java 1.4 in Amazon Marketplace..

Java 5 is apparently included in Notes/Domino 8.

tbahn said...

Hi Tommy,

are you really sure that your program actually did the conversion, the stepping-through and so on? ;-)

I ask, because I heard Brian Goetz from Sun speaking about "Java Performance Myths" at the JAX 2007 conference.

He spoke about the achievements of the Java engineers in optimizing the just-in-time compiler, especially in removing "useless" code, like adding a constant number in a loop to a variable, that isn't used afterwards!

Conclusion was: the usual micro-benchmarks don't work anymore in a world of optimizing (JIT) compilers.

Just my 2 cents
Thomas
http://www.assono.de/blog/
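One common way to reduce the risk described here is to make every measured result feed into a value that is printed at the end, and to run a few untimed warm-up rounds before timing. The sketch below illustrates that idea only. The class and method names and the round counts are assumptions, and it offers no guarantee against every JIT optimization.

```java
import java.util.Arrays;

final class ConsumedResultBenchmark {
    // Sums the char codes, so every element has to be read.
    static long checksum(char[] chars) {
        long sum = 0;
        for (int i = 0; i < chars.length; i++) {
            sum += chars[i];
        }
        return sum;
    }

    // Converts `input` to a char[] `rounds` times and folds each result into one total.
    static long run(String input, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            total += checksum(input.toCharArray());
        }
        return total;
    }

    public static void main(String[] args) {
        char[] filler = new char[300000];
        Arrays.fill(filler, 'x');
        String input = new String(filler);

        // Untimed warm-up rounds give the JIT a chance to compile the hot loop first.
        long warmup = run(input, 200);

        long start = System.nanoTime();
        long result = run(input, 100);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;

        // Printing both totals keeps the computed values observable.
        System.out.println("warm-up checksum: " + warmup);
        System.out.println("timed checksum: " + result + " in " + elapsedMs + " ms");
    }
}
```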

Tommy Valand said...

I have no idea.

I'm just getting started with Java, so I'm not surprised if my benchmarks are bad.

If you see flaws in my script, you are more than welcome to write your own/give me pointers on what I'm doing wrong.. :)

tbahn said...

Hi Tommy,

sorry, but I'm also not too deep in the Java performance optimization and benchmarking thing.

See http://www.xiguaforever.net/confluence/download/attachments/3342381/PerformanceMyths.pdf?version=1 for the presentation I meant (go directly to page 27 for the summary).

Dwight Wilbanks said...

Just a quick reality check.

When I ran your Java agent it "seemed" like more than 62 ms, and the LS agent really did seem like 200 ms.

So, I created a third agent that times each agent from start to finish. The cost of starting a Java agent needs to be considered.

On my system, it's about 950 ms to start a Java agent.

Sub Initialize
    Dim session As New NotesSession
    Dim currdb As NotesDatabase
    Dim javaagt As NotesAgent
    Dim lsagt As NotesAgent
    Dim javastart As Single
    Dim javaend As Single
    Dim LSstart As Single
    Dim LSEnd As Single
    Dim outmsg As String

    Set currdb = session.CurrentDatabase
    Set javaagt = currdb.GetAgent("StringBuffer_Java")
    Set lsagt = currdb.GetAgent("StringBuffer_LS")

    ' Time the LotusScript agent from start to finish (Timer returns seconds)
    LSstart = Timer()
    Call lsagt.Run()
    LSEnd = Timer()

    ' Time the Java agent the same way, including its startup cost
    javastart = Timer()
    Call javaagt.Run()
    javaend = Timer()

    outmsg = "Run Time is" & Chr(13)
    outmsg = outmsg & "Java=" & (javaend - javastart) & Chr(13)
    outmsg = outmsg & "LS=" & (LSEnd - LSstart)
    Msgbox outmsg
End Sub

I also improved Julian's code with a function whose appendstr parameter is a String instead of a Variant. That improves his time by about 10%.

Tommy Valand said...

@Thomas: You are allowed to post <a href>'s

Thomas' link - Java Performance Myths

I'll take a look at the link, thank you.

Tommy Valand said...

@Dwight:
I'm not sure if you agree on this, but 1s startup time of an agent that is going to process extremely large strings (exports/etc) isn't really much of a problem, if the trade-off is greatly increased processing speed while the agent is running.

An unknown in Java, though (at least to me), is how fast you can get to the Notes data. If you lose milliseconds on every document, compared to doing the same routine with a LotusScript agent, it won't matter if the StringBuffer is ultra-fast.

kerr said...

Doh, just added a comment on the previous post and now find the same basic point has also been made here. When performance testing Java, you need to make sure the test is real-world enough that the JVM can't optimise the test out of existence.

It's not too difficult to do this with randomly generated strings etc.
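A minimal sketch of that suggestion, assuming lowercase ASCII letters and a fixed seed so that runs are repeatable. The class name, the seed and the 4-character piece length are illustrative choices, not taken from the benchmark script.

```java
import java.util.Random;

final class RandomStrings {
    // Returns a string of `length` random lowercase letters drawn from `rnd`.
    static String randomString(Random rnd, int length) {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = (char) ('a' + rnd.nextInt(26));
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42L); // fixed seed: same input on every run

        // Generate the pieces up front so only the appends are timed.
        String[] pieces = new String[75000];
        for (int i = 0; i < pieces.length; i++) {
            pieces[i] = randomString(rnd, 4);
        }

        long start = System.currentTimeMillis();
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < pieces.length; i++) {
            sb.append(pieces[i]);
        }
        String result = sb.toString();
        long elapsed = System.currentTimeMillis() - start;

        // Printing part of the result keeps the work observable.
        System.out.println(result.length() + " chars in " + elapsed + " ms, last char: "
                + result.charAt(result.length() - 1));
    }
}
```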

kerr said...

Oh yeah, forgot to mention, in Java the String object is backed by a char array anyway, so toCharArray() just does an arraycopy(), which is JVM specific, very fast code.
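The copy behaviour is easy to observe: changing the returned array leaves the original String untouched. The small sketch below shows only that visible behaviour, not how any particular JVM implements the copy. The class and method names are made up for the example.

```java
final class ToCharArrayCopy {
    // Replaces the first char of a copy of `s` and returns the copy as a new String.
    static String withFirstChar(String s, char replacement) {
        char[] copy = s.toCharArray(); // independent copy of the characters
        copy[0] = replacement;         // modifies the copy only
        return new String(copy);
    }

    public static void main(String[] args) {
        String original = "notes";
        String changed = withFirstChar(original, 'v');

        System.out.println(original); // prints "notes"
        System.out.println(changed);  // prints "votes"
    }
}
```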

Dwight Wilbanks said...

@Tommy,
At this point it's completely academic, but kinda fun, so I'm going to kick this dead horse a little more.

On my system, the inner loop of 75,000 iterations takes 160 ms in LS and 31 ms in Java, which gives Java an advantage of 1.72 microseconds per iteration. Starting an agent takes an average of 975 ms in Java and 26 ms in LS.
Assuming a linear relationship, you need about 546,000 iterations to break even. My actual tests showed 470,000 iterations. All this assumes an additional string size of 1 char (i.e. String str = "a", far from realistic).

When I switched to a more realistic (?) 32 char string, java ran out of memory and failed to complete. My upper limit in java was 426,000, but at that level java had lost quite a bit of its advantage.

So, based on my very unscientific and biased findings, I was not able to iterate the concatenation of a single 32-char string enough times in the Java code to overcome the "agent running cost".

Also, to eliminate the potential of compiler optimization, I wrote the result out to a file in both agents.

I now declare the dead horse, still dead.
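The linear break-even estimate in this comment can be reproduced from the quoted timings. The sketch below simply divides the extra Java startup cost by the time saved per iteration. With the rounded figures from the comment it comes out at roughly 550,000 iterations, in the same range as the 546,000 quoted. The gap may be down to rounding in the measured values, but that is a guess. The class and method names are invented for the example.

```java
final class BreakEven {
    // Iterations needed before the per-iteration saving pays back the extra startup time.
    static double breakEvenIterations(double lsLoopMs, double javaLoopMs, int iterations,
                                      double lsStartMs, double javaStartMs) {
        double savedPerIterationMs = (lsLoopMs - javaLoopMs) / iterations;
        double extraStartupMs = javaStartMs - lsStartMs;
        return extraStartupMs / savedPerIterationMs;
    }

    public static void main(String[] args) {
        // Figures as quoted in the comment: loop times over 75,000 iterations, then agent startup times.
        double savedMicros = (160.0 - 31.0) / 75000 * 1000.0;
        double breakEven = breakEvenIterations(160.0, 31.0, 75000, 26.0, 975.0);

        System.out.println("Saved per iteration: " + savedMicros + " microseconds");
        System.out.println("Break-even iterations: " + Math.round(breakEven));
    }
}
```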

Tommy Valand said...

After reading the PDF posted by Thomas, I see that my way of benchmarking string concatenation in Java isn't any good.

Random strings, as kerr suggests, would be more realistic.

I'm throwing in the towel in this "fight", and going back to other experiments.. :)

kerr said...

@dwight,

I think your method of determining Java agent startup speed is flawed. I have done a test with a servlet running on a separate box, calling simple Domino agents. I wrote one agent in Java and another in LS. They both simply printed out the evaluation of @unique. The servlet recorded the response times for calling and getting the full result from each agent. There was some variance in time, but it averaged 45 ms, regardless of whether it was a Java or LS agent being called.

"Also, to eliminate the potential of compiler optimization, I wrote the result out to a file in both agents."

I would not underestimate the complexity of some of the JVM HotSpot tricks. It's entirely possible for a good, warmed-up JVM to inline that operation.

Dwight Wilbanks said...

@Kerr
I've re-implemented my test cases as WebQueryOpens (I have a utility that tells me response times) and partially agree with your statements. When the agents are server-based the times are very similar: slight advantage LotusScript, but very slight.

So, for the WQO agents the startup performance is about even, but Java has an advantage in string manipulation.

The part of my statement that I stand by is that when running a client-based agent there is a 1 second delay. This is significant in a UI.

My testing was done with 7.02

kerr said...

@dwight, I wasn't particularly surprised to see a lag when calling a Java agent from within a chunk of LS, but I've never had a performance issue with using Java agents, so I wanted to examine my major use case to put some numbers to my experience.

Again, from experience I don't see anything like a 1 sec wait for an agent to start when run on the client from an action button, but I don't have a reliable way to profile that. I'd be interested to see some results. I think you are reaching a little to extrapolate anything from your original results other than that loading a Java agent via LS is going to take a performance hit. It would be interesting to see the performance of calling LS and Java agents from within another Java agent.

I just don't want to see the old Java == slow meme rear its ugly head again.