Science and Technology


From 2004 Guide to the IEEE Software Engineering Body of Knowledge:

1.3. Constructing for Verification [Ben00; Hun00; Ker99; Mag93; McC04] /></a></b></p> <p align=

Anticipating change is supported by many specific techniques summarized in topic 3.3 Coding.

Many other citations on the page show similar escaped (pun intended) HTML. Is it intentionally ironic that the SWEBOK site itself is broken?

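Fang Zhouzi (方舟子) strikes again:

The "Biography of Zhang Heng" in the Book of the Later Han (《后汉书·张衡传》) holds that it worked, and records this famous story: once the seismoscope's mechanism was triggered, but nobody felt the ground move, and the scholars in the capital (Luoyang) all blamed it for a false report. A few days later a courier arrived, and sure enough an earthquake had occurred in Longxi. Everyone then admired its ingenuity, and from then on the emperor had the court historians record the direction in which earth movements occurred.

But this account is quite problematic. According to the "Biography of Zhang Heng", the seismoscope was completed in the first year of Yangjia (132 CE), and Zhang Heng died in the fourth year of Yonghe (139 CE). For that period the Book of the Later Han records only one Longxi earthquake, that of the third year of Yonghe (138 CE). It is generally assumed that this is the earthquake the seismoscope detected. But the "Treatise on the Five Phases" in the Book of the Later Han (《后汉书·五行志》) states clearly that this Longxi earthquake was felt in the capital and did serious damage, "cracking city walls, wrecking houses, and crushing people to death", so the scholars in the capital would not have been puzzled when the seismoscope's mechanism was triggered. That contradicts the story in the "Biography of Zhang Heng". Evidently the seismoscope cannot have detected this earthquake.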

As I mentioned, I’ll probably be stuck forever as a n00b, so I have absolutely no intention of comparing the real merits of languages. It’s just that I finally got to learn Ruby (and then Python) for real, and it’s so different from C++ and Java that I think it’s worth documenting.

I don’t have any emotion for or against C++ since it’s just too big. I’ve used Java for just over 2 years and like its relative simplicity. I’ve been learning Ruby for a day, and would take

a = { 1 => "blah" }

over

Map map = new HashMap(); map.put(1, "blah");

any given time.

| | C++ | Java | Ruby | Python |
|---|---|---|---|---|
| Inheritance | multiple | single base class, multiple interfaces | single base class, multiple mixins | multiple |
| Syntax | throw try catch | throw try catch finally | raise begin rescue ensure; catch throw (with label, not exception) | raise try except finally |
| | #include | import | require/load | import |
| | dynamic_cast/typeid | instanceof | kind_of? | isinstance |
| | varargs | Object... | *args | *args, **args |
| | NULL | null | nil | None |
| | inner class | anonymous class | block | lambda (limited) |
| | const | final (not quite) | object.freeze | N/A |
| | enum: int only | enum: full class, but no more inheritance | N/A (type-safety is moot) | N/A |
| Hits | pointer, iterator | JRE, javadoc, IDE | yield, block/closure | list comprehension |
| Misses | hash table, regexp, thread | verbose | performance? | |
| Idiosyncrasy | complex and cryptic | String == | elsif, no ++ | elif, no switch/case |
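For anyone wondering what "String ==" is doing in the Java column: `==` on two Strings compares object references, not contents. A minimal sketch, where the class and method names are made up for illustration:

```java
public class StringEquality {
    // == asks "are these the very same object?"
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // equals() asks "do these hold the same characters?"
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "blah";
        String built = new String("blah"); // new String(...) always creates a distinct object

        System.out.println(sameReference(literal, built)); // false
        System.out.println(sameContents(literal, built));  // true
    }
}
```

Two identical literals may well compare equal with `==` because the compiler interns them, which is exactly why the bug tends to hide until the strings come from somewhere else.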

This seems to explain things pretty clearly.

| Definition | Task Manager | Process Explorer | vadump -s |
|---|---|---|---|
| Physical memory in use | Mem Usage | Working Set | WorkingSetSize |
| Private (no DLL) VM allocated/committed | VM Size | Private Bytes | PagefileUsage |
| Total VM (including mmap, DLL, etc.) | N/A | Virtual Size | (Image + Priv + Mapped) Commitment + Dynamic Reserved Memory |

On 32-bit Windows, the max user-mode address space (virtual size) of a process is 2GB by default.

The 3 types of commitment in vadump -s are:

  • Image: process executable code
  • Mapped: memory mapped stuff like files
  • Private: process heap (and stack?)

vadump -so has a section that breaks down the working set. Two entries are important: Heap is the Windows native heap, and Other Data would be, e.g., CLR and JVM stuff.

No, it’s not about religion, or reality TV, or random rant.

Those are names in Java’s memory management model.

In the beginning, Java gc was simple and stupid: run when heap is full. So your app happily gobbles up memory until… a… lo…ng pau…se.

Then the Java guys found a common pattern among most apps: most objects die young (used only for a short time), but those that survive live long. The X axis of the graph measures object life span not in time, but in the number of bytes allocated between an object's birth and its death.

Therefore Java memory is now divided into 3 generations:

  1. Young
    1. eden
    2. two survivor spaces
  2. Tenured
  3. Permanent (and code cache): stores JVM’s own stuff

Heap = young + tenured. It starts at physical memory / 64, and max is min(mem/4, 1GB), unless you specify -Xms and -Xmx. Default perm size is 64MB (-XX:MaxPermSize). Default code cache is 32MB (-XX:ReservedCodeCacheSize).
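To make the default sizing concrete, here is a small sketch that just applies the two formulas quoted above. The class and method names are made up, and the 2GB machine is a hypothetical input. Actual defaults depend on the JVM version and machine class, so the last line prints what the running JVM really chose.

```java
public class DefaultHeap {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Initial heap as stated in the post: physical memory / 64. */
    static long initialHeap(long physicalBytes) {
        return physicalBytes / 64;
    }

    /** Max heap as stated in the post: min(physical memory / 4, 1GB). */
    static long maxHeap(long physicalBytes) {
        return Math.min(physicalBytes / 4, GB);
    }

    public static void main(String[] args) {
        long phys = 2 * GB; // hypothetical machine with 2GB of RAM
        System.out.println("initial = " + initialHeap(phys) / MB + "MB"); // initial = 32MB
        System.out.println("max     = " + maxHeap(phys) / MB + "MB");     // max     = 512MB

        // What this JVM actually settled on (reflects -Xmx if one was given).
        System.out.println("actual max = " + Runtime.getRuntime().maxMemory() / MB + "MB");
    }
}
```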

Now object life cycle is like this:

  1. Objects are always allocated to eden.
  2. When eden fills up, a fast but not comprehensive gc (minor collection) is run over the young generation only.
  3. All survivors are moved into one survivor space, plus everything from the other survivor space (survivors from the previous minor collection).
  4. When objects in survivor space are old enough (or the survivor space fills up), they are moved to tenured.
  5. When tenured fills up, a major collection is run that is comprehensive: all heap, all objects.
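One way to watch the life cycle above from inside a program is to churn through short-lived objects and then ask the JVM how many collections it ran. This is only a sketch: the class name and the allocation sizes are arbitrary choices of mine, and the collector names and counts it prints depend on which JVM and GC you run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcCounts {
    /** One line per collector the JVM exposes: name, collection count, total time. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate many small arrays that become garbage immediately,
        // so the young generation keeps filling up.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[16 * 1024];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }
        System.out.println("checksum " + checksum); // keeps the loop from being optimized away

        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

If most objects really do die young, the young-generation collector should show far more collections than the old-generation one.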

Run java with -verbose:gc (or -Xloggc:file) and it prints stuff like this:


[GC 15081K->14088K(20988K), 0.0110810 secs]
[Full GC 15078K->13996K(20988K), 0.1845024 secs]

GC = minor collection and Full GC = major. Numbers are pre gc -> post gc (total committed heap).
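If you want to crunch these lines rather than eyeball them, a regex is enough. A rough sketch that handles only the two line shapes shown above (other JVM versions and flags print different formats); the class and field names are mine:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogLine {
    // Matches e.g. "[GC 15081K->14088K(20988K), 0.0110810 secs]"
    private static final Pattern LINE = Pattern.compile(
            "\\[(GC|Full GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean major;   // true for "Full GC"
    final long beforeKb;   // heap in use before the collection
    final long afterKb;    // heap in use after the collection
    final long totalKb;    // total committed heap
    final double seconds;  // pause time

    private GcLogLine(boolean major, long beforeKb, long afterKb, long totalKb, double seconds) {
        this.major = major;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.totalKb = totalKb;
        this.seconds = seconds;
    }

    static GcLogLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a recognized GC line: " + line);
        }
        return new GcLogLine(
                m.group(1).equals("Full GC"),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5)));
    }

    long freedKb() {
        return beforeKb - afterKb;
    }

    public static void main(String[] args) {
        GcLogLine minor = parse("[GC 15081K->14088K(20988K), 0.0110810 secs]");
        GcLogLine full = parse("[Full GC 15078K->13996K(20988K), 0.1845024 secs]");
        System.out.println("minor freed " + minor.freedKb() + "K"); // minor freed 993K
        System.out.println("full freed " + full.freedKb() + "K");   // full freed 1082K
    }
}
```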

Stevey “Long Long” Yegge strikes again.

And this time it cuts twice: first it wasted a lot of my time as usual, then it showed why I suck at being a developer: I can’t get over the n00b mentality and capacity.

OK it’s not that bad. I’ll never be able to write a compiler or understand LISP, but I don’t think my code has that much of a chance to be honored on the Daily WTF, either.

This is the paragraph that makes me sweat (bold by me):

A programmer with a high tolerance for compression is actually hindered by a screenful of storytelling (referring to long comments). Why? Because in order to understand a code base you need to be able to pack as much of it as possible into your head. If it’s a complicated algorithm, a veteran programmer wants to see the whole thing on the screen, which means reducing the number of blank lines and inline comments – especially comments that simply reiterate what the code is doing. This is exactly the opposite of what a n00b programmer wants. n00bs want to focus on one statement or expression at a time, moving all the code around it out of view so they can concentrate, fer cryin’ out loud.

I would always cringe at the LISP code segment he quoted, not just because it’s LISP, but mostly because it’s too terse and dense, exactly as he wants it. My current team lead writes code just like that, and now I can understand his smirk whenever I ask him for more comments.

Stevey’s take-home messages are clear, balanced, and easy to follow, though. The essence is the same as that of the original Agile Manifesto: working code (including test code) is the one and only true goal of development, not any kind of artifact (Stevey calls it metadata) such as process, model, or documentation.

When in doubt, don’t model it. Just get the code written, make forward progress. Don’t let yourself get bogged down with the details of modeling a helper class that you’re creating for documentation purposes.

If it’s a public-facing API, take a lesson from doc-comments (which should be present even in seasoned code), and do model it. Just don’t go overboard with it. Your users don’t want to see page after page of diagrams just to make a call to your service.

Lastly, if you’re revisiting your code down the road and you find a spot that’s always confusing you, or isn’t performing well, consider adding some extra static types to clarify it (for you and for your compiler). Just keep in mind that it’s a trade-off: you’re introducing clarifying metadata at the cost of maintenance, upkeep, flexibility, testability and extensibility. Don’t go too wild with it.

I mentioned this before, and this blog confirmed my complaints:

mess

I have maybe 1/3 of these and I just can’t keep track of them.

How hard would it be to create a personal mashup portal to integrate all these?

Maybe Facebook’s platform can do it already.

Maybe not.

I was looking for a personal library site. Facebook has some apps that suck. Douban is pretty good but I want more.

I asked for an integrated portable device.

Now I’m asking again (see bottom) for a personal mashup portal.

Then I will access my PMP anytime, anywhere on my IPD.

I won’t ask for anything any more, I promise.

Can you hear me now, Google?
