Tuesday, March 9, 2010

Elements and Attributes in XML

Today I found myself wondering about when to use elements and when to use attributes in XML. The general understanding is that the question is largely philosophical: it depends on how your system is designed and, even more, on how you understand the data in your application.

However, the subject isn't new and there are a lot of articles on the web that already deal with the issue. So, instead of repeating what other, more experienced people have already written, I'd rather point readers in the right direction:

Friday, February 12, 2010

What About Google Wave?

I'm a technology enthusiast. Let me rephrase that: I'm a software enthusiast. Every time I can lay my fingers on a new piece of software, I do. I have installed and run VirtualBox on my computer, just for the sake of it; I use Cygwin daily. I don't need OneNote, but I use it because the concept is cool. I'm currently learning PowerShell scripting and my browser of choice is Chrome, just because Google decided to use processes instead of threads. And, of course, when Google Wave was announced at Google I/O, I couldn't wait to get my hands on it!

At first I was, of course, astonished. Google delivered almost everything the presentation first promised, which was indeed a lot. At first look, I found the tool to be very similar to a web forum. I even guess that was one of the goals: bringing the community activity of forums to e-mail. As a tool, everything was sleek and had that "Google touch", which means good quality, a lightweight interface, strange coloring, etc. The site makes extensive use of AJAX to offer a seamless interface and dynamic loading of data. The load time for the items is as expected: good as long as they aren't cluttered with posts.

The best features are the phenomenal integration with YouTube and Gears, the possibility of uploading files to a persistent location, which makes sharing that much easier (even though I haven't really discovered the size limit for each file), and the formatting, which is very present, so users are encouraged to write visually rich "articles". Finally, some of its real-time features can add something to the experience, even though I think that is far from being one of the product's best qualities.

But whereas technically Google Wave stands out as a very polished tool filled with features, conceptually it's flawed. Google aimed high and wide: Wave was going to replace e-mail, forums, IM, etc. -- almost every well-established electronic communication tool out there. And the application had a list of features to make up for each of those tools that we use. The problem is that, as is usually the case with the Swiss-army-knife kind of software (or any product, for that matter), Wave accomplishes almost none of the replacements it set out to achieve:

When compared to forums, you will discover that not being "public" makes Wave less than useful. It's cool to discuss that new movie with your friends, but community boards will enable you to get a different point of view every now and then, which is so important to discussions.

And then Wave should replace e-mail. E-mail chains are like stacks of information. Everyone adds something to the top, and you "can't" change what has already been written (in some cases, e-mail is more of a tree, because of simultaneous replies, but that doesn't happen very often). Wave, however, treats information like a "live" document, which is compatible with the Collaboration Age (or Collaboration Wave, as one might call it...). So what you said yesterday may change today, or even cease to exist. It's confusing. There's a feature that tries to solve (patch) that, called "history", but it's a pain to use it to check what you missed. In an e-mail, you just have to scroll down the screen a bit and there it is, that last piece of info that you forgot to read. In Wave's history, you will keep going back and forth until you discover what was there but isn't anymore.

And IM? Well, Google shouldn't even have proposed that. IM works because you have a small icon on your desktop that you can double-click in order to get a list of contacts and start talking with whoever is online. It's that simple. On Wave, to IM, you use the IM panel, which is just like a scaled-down version of Google Talk... So, why would you? Isn't it easier to use Google Talk anyway?

All things said, I must add that the worst part of Google Wave is that, for a dynamic application, the topics are treated very statically. Let's say you send your pals an e-mail about that latest game for PlayStation. You start talking about how it reminded you of a certain movie, and before you know it, your friends have started talking about movies. After a while, the subject changes slightly, after another friend adds that this actor was once arrested for drug use, and then you're discussing the lives of the famous. That is a bit chaotic, indeed, but so are conversations. They rarely ever turn into what we wanted in the first place... once they begin, they become living things that change whenever something different is said, because that's the way the brain works, by making free associations of ideas.

When you try to have this same conversation in Wave, however, things get a bit confusing. Since Waves are so clearly persistent, you look at them that way. You start a Wave about a trip to Europe and you hope people only talk about that in there, which doesn't happen. Someone ends up saying something about a bar or restaurant, someone remembers another one in your own city, and all of a sudden your Wave about Europe is now arranging a dinner with your friends. And worse, since you can post anywhere (Waves are flattened-down trees), the subjects are intertwined. One can argue that this would resemble our thinking process even better than e-mail, but as a tool, it gets messy.

All in all, Google has offered us a nice product, with nice features, that is useless only because all its older brothers are more effective at what it does.

Thursday, January 21, 2010

Why COBOL Is Bad For Your Health

Before you think that I'm going to continue one of the eternal developer discussions like Windows x Linux, or C# x Java or even OGL x DX, I'm not. COBOL is a useful language and will remain that way for a very long time. It has served and keeps serving its purpose, which is to be a language targeted at non-programmers, mostly business analysts, with little or no programming knowledge whatsoever. What I'm about to state here are the deficiencies of COBOL: being business oriented has its cost and COBOL pays dearly for it. Also appreciate that I have good knowledge of COBOL, mainframe, and batch architecture. However, I was groomed in C/C++ and specialize in distributed systems, so I have a reasonable understanding of both worlds.


Recently a fellow on my team asked me why I hated COBOL so much. To keep things short, my answer was that I did not hate COBOL at all, but I thought that there were better languages which could do COBOL's work better; I stated that COBOL syntax might be easy and simple, but COBOL programs are semantically obscure and can often lead to very bad algorithms. I'm now going to explain why I think that.
Remember that I'm not a doom-sayer. COBOL isn't dead, nor is it going to die. It has its purpose, and it serves it well enough. People will keep learning COBOL for a long time, and many enterprises will continue to grow their mainframe platforms.
So, without further ado, here are my reasons for believing that COBOL is bad for your health:


SECTION A - CODE SAFETY:

I. All Variables Are Global
It probably goes without saying (at least to any seasoned, non-COBOL programmer) that you shouldn't use global variables in your programs. Globals are bad for your health because it's hard to predict their value, the reason being that every single instruction inside the program might modify them.


If your program has fewer than two hundred lines, its variables follow good naming rules, and you're not using REDEFINES, maybe you can find out where a variable is accessed and predict its value. But in a world where the average COBOL program has way more than ten times that number of lines, you are in for a very hard "mind compiling" experience.


Moreover, there's a side effect: variable cramming. Whenever a programmer needs to extend or fix the source of a COBOL program, instead of using the variables that are already there (since he can't know where a given variable is accessed), he declares a new variable. The effect of this is that the source ends up having more variables than necessary and gets even harder to read. Add to that the fact that COBOL doesn't allow variables to be declared within procedure code (like old C), and now you have this programmer hell: many variables whose declarations are very far in the code from the spot where they are used, which means a lot of scrolling up and down the source.

II. Variables Aren't Type Safe
Type safety is a very complex subject. Many languages that are perceived as type safe actually aren't -- in C/C++ any object pointer can be cast to void*, and a void* can be cast back to any pointer type -- but COBOL goes way beyond that when it gives programmers REDEFINES. REDEFINES allows anything to be seen as a different type at compile time, and is, in many ways, a cast. One can argue that C/C++ presents us with a similar structure in unions. However, for some reason, C/C++ programs hardly ever use unions, preferring to have a bytestream that is then copied to a new instance of a certain type.


Besides, COBOL also has "untyped" variables, called group items. Group items in COBOL are similar to C structs, being a definition of a group of variables that are laid out together in memory. However, in COBOL, those group items don't have a defined type, and the compiler allows any data to be moved to or from such a group. There is no runtime boundary checking either, so you can easily overflow the area. What COBOL does for you instead is truncate the data to fit the area. The responsibility of knowing that the types fit is left entirely to the programmer.*

III. Variables Aren't Really Typed
This one will probably be the most controversial point here. COBOL uses a typing system that includes mainly two types of variables, numeric and text. Numeric variables can be of COMPUTATIONAL type, which means that they hold numeric data but such data is stored in a different way -- compacted. The first criticism here is that, for a so-called 3GL, COBOL exposes a lot of the underlying implementation to its programmer, who has to know the differences between compacted COMPUTATIONAL data and "common" data. Of course, there are historical factors that led to this implementation, namely, the fact that storage was way more expensive when COBOL was conceived. But using this as an excuse only proves that COBOL is obsolete and should be dumped.


SECTION B - CODE STRUCTURE:


I. Where You Write Your Code Matters
COBOL still inherits a lot from the punched-card days. In COBOL, code can only be written between columns 8 and 72, and column 7 is reserved for "indicators", which tell the compiler that the line is a comment or a continuation of the previous line. Add to that the fact that some commands need to start in what COBOL calls Area B, which starts at column 12. This means that you have only 61 characters (columns 12 through 72) to write commands, which are very long in nature already (you need at least 11 characters to write an assignment, for example). And remember that variables tend to have lots of prefixes and suffixes, because the COBOL scope member operator OF is never used.


II. Periods Are Both Scope And Statement Terminators
Another one of COBOL's strange behaviors that will make you shiver. In COBOL, you can finish statements with a period ("."). You can, because most of the time you don't need to. Most of the time, because sometimes they are necessary. Already confused? Well, it gets worse. In COBOL, you also close scope with a period. So, if you begin an IF construct and stick a period just after the first statement, the scope is terminated and whatever comes afterwards is considered to be outside the IF.


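So, if you decide to stick periods after every statement, you can't, because a period inside a scope would close it. So you decide to abandon periods and be on the safe side, never ending a loop or scope accidentally... but, just like we said, you can't do that either, because sometimes they are required.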


III. Idiosyncrasies
COBOL is a champion when it comes to idiosyncrasies. For example, assignments in COBOL are written as MOVE variable TO variable. Moving is usually understood as taking something from one place and putting it somewhere else, leaving nothing behind. Assignment, however, works by copying the value of one variable into another, and copying, not moving, is exactly what the MOVE statement does in COBOL.


In sum...
There are a lot of reasons why COBOL should be avoided at your enterprise. Sure, you can have a person trained in COBOL in less than a week, but how long will it take you to remove the bugs from his code? How many bugs will appear in the future? COBOL is a counter-productive language that encourages bad developers to write bad code.


* COBOL has evolved over the years, and so have the compilers. I wouldn't be surprised if there were a compiler directive that allowed such checking to be made, but I must say that I have never seen anyone using it.



Friday, October 2, 2009

Variable Scoping In Bash Functions

Quick and dirty, people. Today I was testing some complex bash scripts that we wrote on my current project assignment -- those which should have been Python or Perl scripts to begin with -- and bumped into a problem concerning functions and variable scope.

Since I'm mainly a C* programmer (C, C++ and C#), my shell scripts look too much like common programs, admittedly. This means that I use a lot of functions. But there's a catch: variables assigned in bash are global by default, even when the assignment happens inside a function. So, if I execute the portion of code below:

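#!/bin/bash
# The function is named print_value rather than "test", so that it
# doesn't shadow the shell's built-in "test" command.
function print_value()
{
    value=123          # no "local": this overwrites the global variable
    echo "$value"
}
value=321
echo "$value"
print_value
echo "$value"          # the global was changed by the function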

I will get the following output:

321
123
123

Which isn't what I intended. So, I found this nice command that was unknown to me, called "local". Variables in bash can be declared, and their scope can be restricted to the enclosing function by using this command. So, the following script:

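#!/bin/bash
# The function is named print_value rather than "test", so that it
# doesn't shadow the shell's built-in "test" command.
function print_value()
{
    local value=123    # "local": only visible while the function runs
    echo "$value"
}
value=321
echo "$value"
print_value
echo "$value"          # the global is untouched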


produces:

321
123
321

Which is exactly what I wanted.
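One caveat worth knowing, as far as I can tell: "local" in bash doesn't behave quite like a local variable in C. The variable is also visible to any function called from the one that declared it. The sketch below is only an illustration of that behavior; the function names show_value and with_local are made up for the example.

```shell
#!/bin/bash
# Hypothetical function names, for illustration only.

show_value() {
    # No declaration here: this reads whatever "value" is visible at call time.
    echo "$value"
}

with_local() {
    local value=123    # scoped to with_local and to the functions it calls
    show_value
}

value=321
inside=$(with_local)   # show_value sees the caller's local variable
outside=$(show_value)  # called from the top level, it sees the global

echo "inside with_local: $inside"
echo "at top level:      $outside"
```

Here the first line should report 123 and the second 321. So "local" protects the global from being overwritten, but a helper function can still pick up the caller's local by name.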

Wednesday, August 19, 2009

ThinkPad T400

Today I got my "new" Lenovo ThinkPad T400 as part of a hardware upgrade program from Accenture. Every 3 years (in my case, 2.5) Accenture upgrades our notebooks, since, of course, they age rather quickly under intense use. And, in my case, they also become outdated because customers like cutting-edge technology, and cutting-edge technology needs cutting-edge hardware to be developed upon.

So, I got this business-oriented Lenovo ThinkPad T400. And I've decided to blog about it, because I'm tired of complaining to the walls (or to my girlfriend). Considering just the hardware, this notebook is top value. Good HD (160GB 7200RPM), nice RAM (3GB), the case seems sturdy, and the battery life is amazing. Besides that, the screen has a cool lamp that lights the keyboard, in case you have to type in complete darkness and can't find the keys. Cool, but I'm not sure if it's useful.

Now, for the rant. The keyboard plainly sucks. I have no better term so, sorry purists, but this is it: the keyboard sucks. Every key that isn't part of the standard QWERTY block has a terrible placement. Lenovo has put a lot of work into making this notebook very small, and reducing the space the keyboard takes up was probably the #1 concern. To achieve this, some keys are way smaller than the others. The Windows key, for example, is narrower than the rest and may escape your fingers (if you type in such a hurry as I do).

The worst, however, is reserved for the "Fn" and ESC keys. Fn, for those who are new to the notebook world, is a special function key that works similarly to CTRL or ALT -- you need to hold it and press a certain key to access special notebook functions. The Fn key was placed just below the left SHIFT key, right where CTRL should be. Now, for those who, like myself, use the side of the little finger to hold CTRL, this is a nightmare. Fn is used much less than CTRL, yet it sits in the more reachable place.

In my case, the worst part is that I use CTRL+Right and CTRL+Left a lot to skip words when typing, but on the Lenovo, Fn+Right and Fn+Left skip a track on whatever multimedia player is running (they act as Windows multimedia keys). I still haven't learned how to disable this function.

Also, the ESC key sits at the top of the board, above the F1 key. I find myself reaching for ESC and hitting F1 instead, which usually means opening Help. It's annoying. Since ESC is a big part of Windows, this is almost worse than the CTRL problem. Luckily, I use it less than I thought and it's easier to get used to.

Finally, there is the TrackPoint in the middle of the keyboard. This one is hard to judge -- there are people who love it, but in my opinion, I have the touch pad, thank you very much. My complaint is that I end up touching it when travelling over the keyboard. This is annoying, but if it were the only problem, I wouldn't care about it.

For those of you using a desktop, think how awful it would be if you couldn't change the keyboard, if you were stuck with an ESC in the wrong position. Now, think about it, but without a mouse. Sounds like hell, doesn't it?