In my opinion, there are two general classes of code optimizations: those
that can be spotted simply by looking at the code, and those that can
only be found by measuring performance. In the past I have spent a great
deal of time fixing code that I felt was not optimized, only to discover
after measuring that the changes I made (and spent a lot of time on)
really didn’t change much. However, I have also changed code and
increased performance by orders of magnitude. Usually in the latter
case, the code optimizations are blatantly obvious.
For example, I can tell you from experience that if string
concatenation is being done with four or more strings (in .NET), using
a StringBuilder is going to be more efficient. However, even changing
that obvious case may not make any overall difference in the
performance of the application unless it’s in a critical block of code
that is called often. The only reason I would consider changing the
code in that case is so that it matches coding best practices, so that
junior programmers who come along and read the code don't learn
something they shouldn’t.
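Here is a minimal sketch of the two approaches side by side. It is written in Java, whose StringBuilder closely mirrors the .NET class of the same name; the class and method names (ConcatSketch, joinWithConcat, joinWithBuilder) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class ConcatSketch {
    // Each += builds a brand-new String and copies everything accumulated so far.
    static String joinWithConcat(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into one growable buffer and creates the String once at the end.
    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("SELECT ", "name ", "FROM ", "users");
        // Both lines print: SELECT name FROM users
        System.out.println(joinWithConcat(parts));
        System.out.println(joinWithBuilder(parts));
    }
}
```

Both methods return the same string; only the amount of intermediate copying differs, and that only matters when the code sits on a hot path.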
In talking about performance, I think that something we as programmers
often forget is that we are abstracted quite a few layers away from the
actual code that gets run on the CPU. Each layer applies its own
optimizations.
For example (I'm sure the Java stack is similar):
C# code
-> gets compiled to MSIL
-> which gets just-in-time compiled to native code at runtime
-> which runs inside the .NET VM
-> which runs native CPU instructions
-> on the CPU, which itself may have abstraction layers.
For example, my laptop has a single Hyper-Threaded Pentium 4 processor.
Windows thinks it’s running on two processors, which both show up in
Task Manager, but the actual CPU is a single processor that has been
designed to run two threads at once when it has spare execution
resources.
That’s why sometimes code we think is optimized is not, and other code
we think is not optimized is. I've sometimes "optimized" code only to
find out that the result runs more slowly.
For what it's worth, when I have a junior programmer under my wing,
here are the points I try to get across to them in their daily coding
practices:
Cross-process calls (remoting, database server, ZIS, etc.) are at least
an order of magnitude slower than in-process method calls. These calls
are almost always the performance bottleneck in a client-server type
application. Make sure that you don't make a round-trip call to the
database any more often than you have to. Sometimes that means fetching
several rows in one call rather than making two separate calls, or
efficiently caching frequently used data. Using SQL [out] parameters is
much faster than getting a single-row resultset.
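To make the caching idea concrete, here is a small sketch in Java. Everything in it is hypothetical: CachedLookup is an invented class, and the remoteLoader function merely stands in for whatever database or remoting call the application really makes. It counts how many times the "remote" side is actually hit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class CachedLookup {
    // Stand-in for an expensive cross-process call (database, remoting, ...).
    private final Function<Integer, String> remoteLoader;
    private final Map<Integer, String> cache = new HashMap<>();
    private int roundTrips = 0;

    CachedLookup(Function<Integer, String> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    // Returns the cached value if present; otherwise makes one round trip and remembers the result.
    // Note: a null result is not cached, so it would be fetched again on the next call.
    String get(int id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        roundTrips++;
        String value = remoteLoader.apply(id);
        cache.put(id, value);
        return value;
    }

    int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        CachedLookup names = new CachedLookup(id -> "customer-" + id);
        for (int i = 0; i < 1000; i++) {
            names.get(i % 3); // only ids 0, 1 and 2 are ever requested
        }
        System.out.println(names.roundTrips()); // prints 3
    }
}
```

This sketch is not thread-safe and never evicts or refreshes entries, so it only suits data that changes rarely. A real cache needs an invalidation policy.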
Learn to use StringBuilder instead of String when doing repetitive string concatenations.
Readable, maintainable code is better than "optimized" code in most
cases, except for bottleneck areas of the code, which can usually only
be found by profiling.
If performance is an issue at this point, profile the code and look for
methods that allocate lots of memory, or that are called often and take
up the bulk of the processing time, to see if you can find areas that
might need optimization. After optimizing, measure again to see if you
really fixed the problem. Roll back your changes if they didn't improve
much and the code is harder to understand.
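A real profiler is the right tool for finding bottlenecks, but for a quick before-and-after check a rough timing harness is sometimes enough. The sketch below is one way to do that in Java, under some assumptions: TimingSketch, averageNanos, concat, and build are invented names, and the warm-up loop is there because the JIT layer described earlier can make the first few runs unrepresentative.

```java
import java.util.function.Supplier;

class TimingSketch {
    // Writing results to a volatile field keeps the JIT from discarding the work as dead code.
    static volatile int sink;

    // Runs the task `warmups` times untimed, then returns the average nanoseconds per call over `runs` timed calls.
    static long averageNanos(Supplier<String> task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            sink = task.get().length();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink = task.get().length();
        }
        return (System.nanoTime() - start) / runs;
    }

    // "Before" version: repeated String concatenation.
    static String concat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // "After" version: the same output built with StringBuilder.
    static String build(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long before = averageNanos(() -> concat(2000), 20, 100);
        long after = averageNanos(() -> build(2000), 20, 100);
        System.out.println("concat:  " + before + " ns per call");
        System.out.println("builder: " + after + " ns per call");
    }
}
```

The numbers vary from machine to machine and run to run, so treat them as a rough comparison. If the two figures come out close, that is the signal to roll the change back.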