Monday, February 27, 2006

Media Tax

In Finland the state carries a tax on blank media, a fraction of which supposedly goes to the artists. For a write-once DVD this is roughly 60-70% of the disk's street price, or some 21 sn per gigabyte. You pay this tax regardless of whether you use the disk to share art with your friends or whether you back up your own works on it.

But let's take the media tax to the logical extreme, and then a little beyond. First, think about paper. High-quality laser printers claim a 600 dpi resolution, but let's assume only 150 dpi would be recoverable and hence usable to store data reliably. With margins a normal A4 sheet of paper could then hold roughly 250 kilobytes of data. A box of 2000 sheets of copier paper could with double-sided printing hold one gigabyte. That's only 1% of the street price of the paper, but think about the volumes: roughly 300 000 metric tons is produced yearly, or approximately 6.5 billion euros worth of taxes. Worth lobbying, don't you think?

Ok, what about empty space. Waves propagating through space is also a storage medium. Suppose I send a laser beam one km away, where it is reflected back to a point one millimeter right of the origin, where it is reflected back one millimeter left of the first reflection point, etc. turning the beam gradually around. Approximately five seconds and 1.5 million reflections later the laser beam reaches its origin. With 10 ns pulses this device stores 62 megabytes of data. But the fun has just begun, because we can use many laser colors and stack such laser disks (SIC!) on top of each other. Even more efficient would be to construct a sphere instead of a ring of mirrors. The sphere would contain some 3*10^12 mirrors and the laser would reflect through all of them in four months. With, say, fourty different laser colors we could store over 5000 terabytes of data in the sphere. That's worth a million in media taxes!

The German 3G licenses for 20 years cost over 3 billion (10^9) euros each. But the German airspace's volume could easily hold millions of laser spheres, each worth one million in media tax. This sounds the expensive 3G licenses of empty space were actually a bargain. Oh, the lasers can be a little bright, so remember to wear a tin foil hat should you enter the German airspace.

Now, let's resort to my pet peeve, homeopathy. Homeopaths claim that even if you dilute water so far that it no longer contains a single atom of the original remedy, the "memory of the water" will remember what it is supposed to cure. Logically we can extend this not only to vials of water, but also single water atoms. Hahnemann listed 217 remedies, but I don't know whether a water molecule can "remember" only one of them or any combination of these remedies. If the former, a water molecule can store 7.76 bits instead of 217 bits of information. If we assume the former, a litre of water can hold so much information that its media tax is worth 1200 times the whole world's GDP.

Taking off the tin foil hat, empty space is not economic and homeopathy is not possible as a storage medium. But media tax will nevertheless have to be moderated at some point. Otherwise future storage media, such as the HVD will cost nearly 1000 euros per disk.

Friday, February 24, 2006

Five-in-a-Row on a Torus

Work has the bad habit of to disturb blogging, but recently I had a nice chance to combine them.

Some time ago I got a sudden inexplicable urge to implement MTD(f), the reportedly fastest two-player game tree search algorithm. Having some background in AI and games it was enjoyable to implement and tweak. The only issue the description doesn't quite mention is how to catch the best move in the root, let alone the principal variation, but after a few trials I got that working as well.

To get something concrete, I decided to apply MTD(f) to the game mentioned in the subject: five-in-a-row, but on the surface of a torus instead of a flat sheet of paper. Classic five-in-a-row favors the first player, and in fact a strategy stealing argument shows that with correct play, the first player will not lose. More advanced variations of five-in-a-row restrict the first player's moves in various ways, but many of them, including Go-moku and free Renju, have still been proven with deep computer analyses to be wins for the first player.

On a torus, it is easy to check manually that if a 2-by-2 grid or 3-by-3 grid is drawn on the torus, the first player wins in two and four moves, respectively. With this presumption I tested my program on 4-by-4 and 5-by-5 grids, and to my dismay the program reported them as draws. I fell into a bogus bughunt, until I humiliatingly realized that those variations are indeed draws! Because the surface is a torus, it is possible to block a three-in-a-row or four-in-a-row with a single stone from both sides.

Solving the 6-by-6 game wasn't going to be easy, so I was forced against my sincere will (yeah, right!) to have some serious fun: threat-space search, good move ordering heuristics, a reasonable evaluation function, and rote learning as in Samuel's Checkers program. The positions were stored in the transposition and rote learning tables in a format that canonified equivalent positions obtained by reflection, translation and transposing into identical ones.

Playing against this beast was fun athough it demonstrated human fallibility in very convincing ways. But if I wanted the game solved in a time shorter than the average interval between power outages (roughly 6 months), I had to add a few extra tricks. Firstly, I changed the move ordering for the second player so that it actively seeks draws by preferring preventively blocking moves to attacking moves. Secondly, I added an algorithm which checks, with a number of pessimistic assumptions, if the position will unavoidably end in a draw even if there are empty places left. Thirdly, I split the transposition table in two: one which replaces based on recency of access, one which replaces based on effort needed to determine the position's value. Without the latter hash table, the long-running search algorithm would replace the valuable upper level nodes before returning from a deep part of the search.

Several days worth of clock cycles later the program reported that the first player can always win, but only barely because the board (the torus, I mean) is almost full when the fifth-in-a-row is played. In a few days I should be able to show the optimal play.

Oh, and the way this fun relates to work is through an course exercise: a simplified version of this code along with a little bit of network code, HTTP parsing and HTML generation serves as the basis for that course's programming exercise.

Sunday, February 19, 2006

Hedgehog-2.1 and thoughts about programming language ecosystems

I finally made version 2.1 of Hedgehog, a small tactical Lisp system for low-end embedded systems. The most significant additions in this version are the SHA-256 cryptographic hash function, a beefed up version of XXTEA cipher, and a 256-bit shared secret between the interpreter and compiler (or rest of the host system).

Hedgehog is a programming environment deliberately and rather single-mindedly targeted for a small niche market: very small, remotely controlled and updated M2M-boxes and telematics applications. In general purpose programming languages C is the assembler, Java the COBOL, and Python the Basic of this era. They and their equivalents in other systems all have a strong position and it seems it will take some time, perhaps a decade, before significant new revolutions can happen. Therefore I predict that in the near future various niches is where we will see most of new practical programming language development. Mainstream general-purpose programming systems, such as Java, take a few hundred person-years to develop when you include not only the core language but also libraries, tools and documentation, and ten times more if you consider marketing, community building and piloting. Yet they don't fit every machine and every programming task. There one often needs to design something very uncompromising and occasionally the outcome is something elegant and interesting.

The Hedgehog Lisp dialect was designed by Lars Wirzenius, and my role has merely been to (re)implement his design. Ten years ago the position was reversed. I had come across the idea that if one removed in-place mutation entirely from the programming language, one can find significantly improved solutions for generational garbage collection, persistence (backing up to computation and heap disk), and replication (backing up and cross-checking the computation among several machines). On top of that we (a group of researchers including Lars) implemented a number of index structures and a more modern functional programming language. Technically I think we succeeded in most of criteria, specifically in reaching good performance, but eventually IPR problems killed the possibility to continue with the work. But persistent programming languages and fail-safe computing is an interesting niche I someday hope to return to.

Future computational devices, such as mixtures of FPGA and normal CPUs, GPU computing, and of course massively multicore chips will also offer potential for new programming language designs. XML processing languages, Links, Javascript and other examples of languages that have a niche.

Friday, February 03, 2006

YASSSP = Yet Another Seven Secret of Successful Programmers

Lars Wirzenius wrote in his loki a better set of seven secrets of successful programmers than the original. But feeling slightly cynical today and having served as a mental shoulder for many now former programmers over the past years, here's my list. It's written more from the point of view of a programmer's profession rather than programming itself. Not that I've fully followed all of these rules, but nevertheless:

  1. Love programming, problem solving, good "hacks", ideas that seemingly effortlessly get something done well, and read a lot of other people's code. Find the principles of aesthetics in all of this.

  2. Don't fixate to any one programming environment if you want to make a long-term living. You will need to learn many new programming languages and tools, some far worse than what you were used to. Ideally you pick you tools according to the problem at hand, but sometimes they are mandated by management.

  3. Don't be disproportionately good in any one thing, otherwise you will be kept doing it until the thing is obsolete and you're fired. Becoming a specialist tends to be a career-freezer. If you are the best a company has for something, learn also other skills.

  4. Have a healthy amount of cynicism. You should be loyal to your employer, but only to a reasonable limit, otherwise you will be exploited. For example, if the the company can pay dividends, it can also pay overtime compensation. Or, if the employer requests for your services after you have been fired, don't work for your prior hourly salary. Instead, require typical consulting fees, for example 100 e/h.

  5. But don't be malicious or extort your employer either. For example, don't leave deliberate bugs, trapdoors, or purposefully write unmaintainable code. Otherwise you risk not getting another project from them.

  6. Have a life outside of programming. Don't use your friends, spouse, or children as lightning rods, but let them take your mind off troubled times at work and have a chance to relax. Hobbies and physical exercise are good for this as well.

  7. Don't fixate on programming, since times are tough for programmers and will become tougher for some time. Very few programmers make a living comparable to upper management or similarly deeply skilled individuals in many other professions. Those programmers who have made money, have many other talents and insights in technology and business as well. The only chance to make a fortune by being predominantly a programmer, it seems, is to become an entrepeneur some years before a bubble, and take all the risks and other less nice things that come with it. So unless you're willing to go up the management ladder, at least consider having wealthy spouse or parents, some savings, or a second profession, such as plumbing.

The last rule may appear controversial, but its justified by noting that economic success often follows from taking some risks, and if you have a Plan B other than living on the street, you are better able to take risks.

Wednesday, February 01, 2006

Less than half of people are stupider than the average

Ok, I'm just nitpicking on an overused joke, but there is a clear distinction between average and median. Half of people are indeed stupider than the median, but the average IQ is probably a little less than the median IQ because there are more very stupid people than very smart people. Depending on the chosen measure, roughly one in 200 people have an IQ at least 75 points below the average and are in practice institutionalized, while only one million have an IQ of 75 or more above the average - ironically, many of them spend much of their time in various institions as well.

Why this asymmetry? High intelligence is reached when many factors coexist, whereas mental retardedness may be caused by only a single factor, such as a genetic error. So in the extremes of the distribution, additional intelligence and stupidity are caused by different random processes.

Back to the main point: we can analoguously clearly see that with the possible exception of Vatican, most of you earn or own far less than the average simply because the income and wealth statistics are very skewed in all countries except the Vatican.