Science and Technology

From 2004 Guide to the IEEE Software Engineering Body of Knowledge:

1.3. Constructing for Verification [Ben00; Hun00; Ker99; Mag93; McC04] /></a></b></p> <p align=

Anticipating change is supported by many specific techniques summarized in topic 3.3 Coding.

Many other citations on the page show similar escaped (pun intended) HTML. Is it intentionally ironic that the SWEBOK site itself is broken?

方舟子 strikes again:



As I mentioned, I’ll probably be stuck forever as a n00b, so I have absolutely no intention of comparing the real merits of languages. It’s just that I finally got to learn Ruby (and then Python) for real, and it’s so different from C++ and Java that I think it’s worth documenting.

I don’t have any emotion for or against C++ since it’s just too big. I’ve used Java for just over 2 years and like its relative simplicity. I’ve been learning Ruby for a day, and would take

a = { 1 => "blah" }


over

Map map = new HashMap(); map.put(1, "blah");

at any given time.






Java: single base class, multiple interfaces
Ruby: single base class, multiple mixins

C++: throw try catch
Java: throw try catch finally
Ruby: raise begin rescue ensure; catch throw (with label, not exception)
Python: raise try except finally

Java: Object …
Python: *args, **args

inner class
anonymous class
Python: lambda (limited)

Java: final (not quite)

C++: enum: int only
Java: enum: full class, but no more inheritance
N/A (type-safety is moot)

C++: pointer, iterator
Java: JRE, javadoc, IDE
Ruby: yield, block/closure
Python: list comprehension
hash table, regexp, thread

C++: complex and cryptic
Java: String ==
Ruby: elsif, no ++
Python: elif, no switch/case

This seems to explain things pretty clearly.

Definition                               Task Manager   Process Explorer   vadump -s
Physical memory in use                   Mem Usage      Working Set
Private (no DLL) VM allocated/committed  VM Size        Private Bytes      PagefileUsage
Total VM (including mmap, dll, etc.)     N/A            Virtual Size       (Image + Priv + Mapped) Commitment + Dynamic Reserved Memory

On 32-bit Windows, the max user address space (virtual size) of a process is 2GB by default.

The 3 types of commitment in vadump -s are:

  • Image: process executable code
  • Mapped: memory mapped stuff like files
  • Private: process heap (and stack?)

vadump -so has a section that breaks down the working set. Two entries are important: Heap is the Windows native heap, and Other Data would be, e.g., CLR and JVM stuff.

No, it’s not about religion, or reality TV, or random rant.

Those are names in Java’s memory management model.

In the beginning, Java gc was simple and stupid: run when heap is full. So your app happily gobbles up memory until… a… lo…ng pau…se.

Then the Java guys found a common pattern among most apps: most objects die young (used only for a short time), but those that survive live long. The X axis of the graph measures object life span not in time, but in the number of bytes allocated between birth and death.

Therefore Java memory is now divided into 3 generations:

  1. Young
    1. eden
    2. two survivor spaces
  2. Tenured
  3. Permanent (and code cache): stores JVM’s own stuff

Heap = young + tenured. It starts at physical memory / 64, and max is min(mem/4, 1GB), unless you specify -Xms and -Xmx. Default perm size is 64MB (-XX:MaxPermSize). Default code cache is 32MB (-XX:ReservedCodeCacheSize).
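The sizing rule above is just arithmetic, so here is a minimal Python sketch of it. The function name default_heap_sizes is made up for this example, and the code only encodes the numbers quoted above; it does not query a real JVM, and other JVM versions may use different defaults.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def default_heap_sizes(physical_bytes):
    """Return (initial, maximum) heap size in bytes, per the rule quoted above.

    Hypothetical helper: initial = physical memory / 64,
    maximum = min(physical memory / 4, 1GB).
    """
    initial = physical_bytes // 64
    maximum = min(physical_bytes // 4, 1 * GB)
    return initial, maximum


if __name__ == "__main__":
    for ram_gb in (1, 2, 8):
        initial, maximum = default_heap_sizes(ram_gb * GB)
        print(f"{ram_gb}GB RAM: initial {initial // MB}MB, max {maximum // MB}MB")
```

So on a 2GB box the heap would start at 32MB and top out at 512MB unless -Xms and -Xmx say otherwise.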

Now object life cycle is like this:

  1. Objects are always allocated to eden.
  2. When eden fills up, a fast but not comprehensive gc (minor collection) is run over the young generation only.
  3. All survivors are moved into one survivor space, plus everything from the other survivor space (survivors from the previous minor collection).
  4. When objects in the survivor space are old enough (or the survivor space fills up), they are moved to tenured.
  5. When tenured fills up, a major collection is run that is comprehensive: all heap, all objects.
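The life cycle above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the class ToyGenerationalHeap, its parameters, and the explicit release() call that stands in for an object becoming unreachable. Sizes count objects rather than bytes, the two survivor spaces are collapsed into the one that is occupied, and survivor overflow is ignored. It is nowhere near how the real JVM works inside.

```python
class ToyGenerationalHeap:
    """Toy model of eden -> survivor -> tenured promotion (hypothetical)."""

    def __init__(self, eden_size=3, tenure_age=2, tenured_size=4):
        self.eden_size = eden_size        # objects eden can hold
        self.tenure_age = tenure_age      # minor collections survived before promotion
        self.tenured_size = tenured_size  # objects tenured can hold
        self.live = set()                 # names still reachable
        self.eden = []
        self.survivor = {}                # name -> minor collections survived
        self.tenured = []
        self.log = []                     # "GC" or "Full GC", in order

    def allocate(self, name):
        # Step 1 and 2: allocate in eden; a full eden triggers a minor collection.
        if len(self.eden) >= self.eden_size:
            self.minor_collect()
        self.eden.append(name)
        self.live.add(name)

    def release(self, name):
        # Stand-in for the object becoming unreachable.
        self.live.discard(name)

    def minor_collect(self):
        # Step 3: live objects from eden and the old survivor space are gathered.
        to_space = {n: 1 for n in self.eden if n in self.live}
        for name, age in self.survivor.items():
            if name in self.live:
                to_space[name] = age + 1
        self.eden = []
        self.survivor = {}
        # Step 4: objects that are old enough move to tenured.
        for name, age in to_space.items():
            if age >= self.tenure_age:
                self.tenured.append(name)
            else:
                self.survivor[name] = age
        self.log.append("GC")
        # Step 5: an over-full tenured generation triggers a major collection.
        if len(self.tenured) > self.tenured_size:
            self.major_collect()

    def major_collect(self):
        # Comprehensive: every generation is checked for liveness.
        self.eden = [n for n in self.eden if n in self.live]
        self.survivor = {n: a for n, a in self.survivor.items() if n in self.live}
        self.tenured = [n for n in self.tenured if n in self.live]
        self.log.append("Full GC")


def demo():
    heap = ToyGenerationalHeap(eden_size=3, tenure_age=2, tenured_size=4)
    heap.allocate("config")              # long-lived object
    for i in range(6):                   # short-lived garbage
        name = f"tmp{i}"
        heap.allocate(name)
        heap.release(name)
    return heap


if __name__ == "__main__":
    h = demo()
    print("tenured:", h.tenured, "survivor:", h.survivor, "log:", h.log)
```

In the demo, the temporaries die in eden and only the long-lived object works its way through the survivor space into tenured, which is the "most objects die young" pattern in miniature.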

Run java with -verbose:gc (or -Xloggc:file) and it prints stuff like this:

[GC 15081K->14088K(20988K), 0.0110810 secs]
[Full GC 15078K->13996K(20988K), 0.1845024 secs]

GC = minor collection and Full GC = major. The numbers are heap in use pre gc -> post gc (total committed heap), followed by how long the collection took.
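To pull numbers out of lines like the two above, a small Python sketch might look like this. The regular expression and the parse_gc_line helper are my own, written against just the two sample lines shown here; other collectors or flags print different formats that this will not match.

```python
import re

# Matches only the simple "[GC ...]" / "[Full GC ...]" format shown above.
GC_LINE = re.compile(
    r"\[(?P<kind>Full GC|GC) (?P<before>\d+)K->(?P<after>\d+)K"
    r"\((?P<total>\d+)K\), (?P<secs>[\d.]+) secs\]"
)


def parse_gc_line(line):
    """Return a dict for one gc log line, or None if it does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    before = int(m.group("before"))
    after = int(m.group("after"))
    return {
        "major": m.group("kind") == "Full GC",
        "before_kb": before,
        "after_kb": after,
        "freed_kb": before - after,
        "total_kb": int(m.group("total")),
        "pause_secs": float(m.group("secs")),
    }


if __name__ == "__main__":
    samples = [
        "[GC 15081K->14088K(20988K), 0.0110810 secs]",
        "[Full GC 15078K->13996K(20988K), 0.1845024 secs]",
    ]
    for line in samples:
        print(parse_gc_line(line))
```

For the samples, the minor collection frees 993K and the major one frees 1082K, but the major one pauses for more than ten times as long.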

Stevey “Long Long” Yegge strikes again.

And this time it cuts twice: first it wasted a lot of my time as usual, then it showed why I suck at being a developer: I can’t get over the n00b mentality and capacity.

OK it’s not that bad. I’ll never be able to write a compiler or understand LISP, but I don’t think my code has that much of a chance to be honored on the Daily WTF, either.

This is the paragraph that makes me sweat (bold by me):

A programmer with a high tolerance for compression is actually hindered by a screenful of storytelling (referring to long comments). Why? Because in order to understand a code base you need to be able to pack as much of it as possible into your head. If it’s a complicated algorithm, a veteran programmer wants to see the whole thing on the screen, which means reducing the number of blank lines and inline comments – especially comments that simply reiterate what the code is doing. This is exactly the opposite of what a n00b programmer wants. n00bs want to focus on one statement or expression at a time, moving all the code around it out of view so they can concentrate, fer cryin’ out loud.

I would always cringe at the LISP code segment he quoted, not just because it’s LISP, but mostly because it’s too terse and dense, exactly as he wants it. My current team lead writes code just like that, and now I can understand his smirk whenever I ask him for more comments.

Stevey’s take-home messages are clear, balanced, and easy to follow, though. The essence is the same as the original Agile Manifesto: working code (including test code) is the one and only true goal of development, not any kind of artifact (Stevey calls it metadata) like process, model, and documentation.

When in doubt, don’t model it. Just get the code written, make forward progress. Don’t let yourself get bogged down with the details of modeling a helper class that you’re creating for documentation purposes.

If it’s a public-facing API, take a lesson from doc-comments (which should be present even in seasoned code), and do model it. Just don’t go overboard with it. Your users don’t want to see page after page of diagrams just to make a call to your service.

Lastly, if you’re revisiting your code down the road and you find a spot that’s always confusing you, or isn’t performing well, consider adding some extra static types to clarify it (for you and for your compiler). Just keep in mind that it’s a trade-off: you’re introducing clarifying metadata at the cost of maintenance, upkeep, flexibility, testability and extensibility. Don’t go too wild with it.

I mentioned this before, and this blog confirmed my complaints:


I have maybe 1/3 of these and I just can’t keep track of them.

How hard would it be to create a personal mashup portal to integrate all these?

Maybe Facebook’s platform can do it already.

Maybe not.

I was looking for a personal library site. Facebook has some apps that suck. Douban is pretty good but I want more.

I asked for an integrated portable device.

Now I’m asking again (see bottom) for a personal mashup portal.

Then I will access my PMP anytime, anywhere on my IPD.

I won’t ask for anything any more, I promise.

Can you hear me now, Google?
