I'm Ryan Lowe, a Software Engineering graduate living in Ottawa, Canada. I like agile software development and Ruby on Rails.
I write this blog in Canadian English and don't use a spell checker. Typos happen.
# Free Software Realities
James Robertson has linked to a few capitalist rants by Clemens Vasters: one and two. Ponder this quote from Clemens: "selfish is not the one who wants to get a tangible reward for his work. Selfish is the one who denies that reward."
I'm glad I'm not alone in disagreeing with him and I didn't think I would be. I was going to write a comment on James' blog about his latest post but it got long so I'll post it here instead:
Isn't it kind of silly to be having a conversation about this? I mean, do people expect open source developers to say "hey, ya ... you know what? I'm wasting my time. You're right. OK, let's go back to the old way so everyone can get a paycheck. We all deserve it." It ain't gonna happen. There will always be enough control freaks and freedom idealists to commoditize the next software market with free software.
There's clearly a difference of opinion here. On one hand we have people who are looking for something in return for their investment of time. Open source has some qualities that allow those people to do that (free marketing, public speaking fees, increased feedback, bugfixes, whatever). That's fine, that's great, I can appreciate that. You need to support a family and I need to support my Mac hardware fetish.
But there is a segment of the free software world that just doesn't see it that way. They pump out code for the greater good, or to boost their egos and not their wallets. These people will not be convinced by capitalists ranting about losing earning potential ... in fact they will be driven the other way and be motivated to stick it to you. As software developers looking to get paid I think we owe it to ourselves to understand the rationale behind this "competition". I think it's great that I can get paid for my hobby and call it a career ... but I'm also realistic. Enough hobbyists working for free will marginalize me, so I'll just have to watch out and stay ahead of them. Free software has too much momentum now.
I know I'm naive -- and I can conveniently use my youth as an excuse. But I'm also realistic about free software. It's serious competition ... and you aren't going to convince too many free software developers to start coding only for dollars just because you think it's "selfish" not to charge for their work. Oh no, they'll just see it the other way ... it's selfish for you to expect to get paid for something someone else will do for FREE in their spare time. That's life in the software industry of the next 30 years ... that's what I'm expecting.
Naive would be thinking that free software won't impact your market niche or that you can convince people to change their minds. It will have an impact and you can't change enough minds to make a difference at this point. People are getting a taste of software freedom (that's the free that should be emphasized, not the code) and they like it. So you might as well start figuring out how to make money despite free software's existence. I'm naive about business but I know that much ... and I can see it coming from a mile away. Oh, and so can IBM.
Software developers seem to have it in their heads that they will always have jobs and it's just not the case. Remember, this "market" didn't exist 50 years ago. If you lose your job because of free software don't blame free software, blame yourself for not having the foresight to move on to greener pastures and better opportunities. If you want a secure job, the software "industry" is not a good place to look for one. Things change too quickly.
Another choice quote from Clemens, linked by Dave, was: "If you want to put your skills to work and you need to support a family, your work and work results can't be free." What if I'm a juggler? I could be the best juggler in Canada, yet I can't get a job juggling to support my family. Jugglers used to be popular in circuses and carnivals. Now kids play on their Xboxes. Progress is a wonderful thing, right?
Update Monday 1:05 PM: Clemens got linked by slashdot.
# Removable Media
For the sake of this post, let's say that removable media refers to any storage device that is not permanently connected to your machine. If you have a better word for it I'm very open to suggestions. Here are some examples of removable media:
- CD-R/CD-RW/DVD-R discs
At any given time your computer may or may not have access to the information contained on this media. But you'd still like to know you have it around, right? So AudioMan will help you keep track of it. I'll run through a scenario:
You put a CD-R disc in the drive. It contains 100 mp3s that you ripped from the original audio CDs. AudioMan crawls the whole CD and adds all of the songs to AudioMan. It also makes a playlist that you can select to see just that disc in the view. So the playlist tree might look like:
You take the disc out so the songs can't be played from the disc any more. Earlier, you copied 50 of the 100 mp3s on that CD to your hard drive. When you scanned the CD in, AudioMan matched up the files on the CD to the ones on your hard drive and consolidated them into one entry so you don't have duplicates in the view. With the CD-R you just added as the selected playlist, you can see which ones can be played off the hard drive instead.
You can also toggle between showing only available playable songs and all of the songs you have in your entire collection, including all of your removable media. If you need to know which disc of 30 your Led Zeppelin songs are on, you just browse in AudioMan to Led Zeppelin and show the info for the entry. It tells you they are on disc 12. A lot easier than searching through a stack of CD-Rs manually.
You could also put in a regular audio CD. It would download the track list from freedb and then match up those entries to mp3s you ripped to your hard drive three weeks ago.
Let's say you download a lot of singles from the iTunes Music Store or Napster. You are responsible for backing them up, and once you scan all of your burned discs, AudioMan would give you a list of mp3 files that haven't been burned to CD yet to make this easier. Then you can stay on top of backups.
What do you guys think of these ideas?
# Getting Ahead of MP3
To make test files for AudioMan, I often only need the first 20k or so of a 5MB file. So I transfer the file to my Mac and use the command:
head -n 50 file.mp3 > clip.mp3
The Unix utility head sends the first N lines of a text file to stdout, so here it counts 50 end-of-line characters into the original file and (after stdout is redirected) writes the result into clip.mp3. Odds are the original file will have bytes that are the same as end-of-line characters, so all you have to do is adjust the -n number to get the output file size you want. Then you get the head of the mp3 file without modifying it.
I originally tried tools like mp3splt to get the start of an audio file but it modified the tag of the output file. So it wasn't much use to me when I wanted to keep iTunes tags, for instance.
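Because head -n counts line endings rather than bytes, the size of the clip is only approximate. Most versions of head also accept -c with a byte count, which gives an exact size. The same idea as a minimal Java sketch, assuming Java 11 or later; the class name ClipHead and the method clip are made up for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ClipHead {
    /** Copies at most maxBytes from the start of source into target; returns the bytes written. */
    static int clip(Path source, Path target, int maxBytes) throws IOException {
        try (InputStream in = Files.newInputStream(source)) {
            // readNBytes stops early if the file is shorter than maxBytes
            byte[] head = in.readNBytes(maxBytes);
            Files.write(target, head);
            return head.length;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: ClipHead <source> <target>");
            return;
        }
        int written = clip(Path.of(args[0]), Path.of(args[1]), 20 * 1024);
        System.out.println("wrote " + written + " bytes");
    }
}
```

The bytes are copied verbatim, so whatever tag sits at the start of the file comes along untouched.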
# Keeping Unused Fields Around
Users don't care about tagging that much but as a developer of a tagging tool I have to care. :) So here's what I'm dealing with right now:
In the newer tagging formats the fields are arranged in an expandable list. There are standard fields to use (for MP3 id3v2 and OGG comments) but there is no restriction on how many times a field, such as a comment, can be used. Also, the ordering of the fields only matters between fields of the same name.
For example, iTunes uses the id3v2.2 comment field (COM) three times. The user's comment is the last one. The other two are the file checksum and a CDDB lookup number used by iTunes (when you rip a file from CD). When I'm reading in these file formats I'm only concerned about a few of the fields, like that last comment.
When a user edits a file in AudioMan, even though I never use the other two comment fields I don't want to erase or modify them. So on a file write I will have to keep track of the fields I don't modify and rewrite them back to the file in the same order. Then I won't screw up fields that AudioMan isn't concerned about (yet) and that other applications use.
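Here is a rough sketch of that write strategy, not AudioMan's actual code. The Frame class and replaceLast method are hypothetical, and the frame values are placeholders. Only the last field with a given name is replaced; everything else is copied through in its original order:

```java
import java.util.ArrayList;
import java.util.List;

class FramePreserver {
    /** One tag field as read from the file: an identifier such as "COM" plus its text. */
    static final class Frame {
        final String id;
        final String value;
        Frame(String id, String value) { this.id = id; this.value = value; }
    }

    /**
     * Returns a copy of frames in which only the LAST frame with the given id
     * has its value replaced. Every other frame is kept untouched and in order.
     */
    static List<Frame> replaceLast(List<Frame> frames, String id, String newValue) {
        List<Frame> out = new ArrayList<>(frames);
        for (int i = out.size() - 1; i >= 0; i--) {
            if (out.get(i).id.equals(id)) {
                out.set(i, new Frame(id, newValue));
                return out;
            }
        }
        out.add(new Frame(id, newValue)); // no such field yet: append one
        return out;
    }

    public static void main(String[] args) {
        List<Frame> tag = new ArrayList<>();
        tag.add(new Frame("TT2", "Black Dog"));
        tag.add(new Frame("COM", "checksum placeholder"));
        tag.add(new Frame("COM", "CDDB id placeholder"));
        tag.add(new Frame("COM", "old user comment"));
        for (Frame f : replaceLast(tag, "COM", "new user comment")) {
            System.out.println(f.id + " = " + f.value);
        }
    }
}
```

Writing the returned list back in order leaves the two application-specific comment fields exactly where another program expects to find them.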
# Abstracting MP3 Tagging
If there is anything I've learned from Apple and Microsoft, it's that users don't care about technical details that much. Regular users love easy, manageable abstractions even if they are somewhat broken. It's the geeks that complain about missing details and they are by far the minority.
That's why I'm going to hide those details from them in AudioMan. If a geek is really that concerned about minute details of tags there are plenty of picky tools written by picky geeks that do that. Most of the ones I've seen aren't intuitive at all, which is my best guess as to why no one has heard of a mainstream tag editor for MP3 files.
iTunes doesn't do very much automatic tag editing, even though it would probably get the product more users. My best guess is that they are afraid that if they have features like that the record companies will see it as tacit approval of illegal downloading. Apple is tip-toeing around the record companies as it is. It's all garbage though -- people just want to be in control of their files.
In AudioMan you'll write the details for the file in a small, set number of fields and it writes them in both id3v1 and id3v2 for MP3s (we are also looking to support other formats like WMA, OGG, Real audio, AAC). When it reads files, the fields in the id3v2 tag get priority because they can be longer and more descriptive. That's about it. The list of supported fields might increase over time but they will be relegated to an advanced tab or something similar .... something out of the way of most people. It's funny, the fields used in id3v1 (artist name, album name, track name, year, genre, comment, track number) were good enough but they just weren't long enough.
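The read priority can be sketched in a few lines. This assumes both tags have already been parsed into simple maps keyed by field name; TagMerge and its merge method are hypothetical names, not part of AudioMan:

```java
import java.util.HashMap;
import java.util.Map;

class TagMerge {
    /** Merges two field maps; a non-empty id3v2 value wins over the id3v1 value. */
    static Map<String, String> merge(Map<String, String> v1, Map<String, String> v2) {
        Map<String, String> merged = new HashMap<>(v1);
        for (Map.Entry<String, String> e : v2.entrySet()) {
            String value = e.getValue();
            if (value != null && !value.trim().isEmpty()) {
                merged.put(e.getKey(), value); // the longer, more descriptive tag takes priority
            }
        }
        return merged;
    }
}
```

Fields that exist only in the id3v1 tag, or that are blank in id3v2, fall back to the id3v1 value.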
id3v2 is one of the most frustratingly incompatible standards I've ever read, even though it was supposed to be designed to be extendable. Maybe I just haven't read enough standards yet. :) It will take a while to fully support all of the versions in AudioMan, if ever. We'll just do it one field at a time, starting with the basic seven from id3v1. If users want to see other fields we'll gladly take that feedback.
# Smalltalk Eclipse IDE Presentation Post Mortem
Last night Roy and I went to the Ottawa Carleton Smalltalk Users Group meeting, where the presentation was "Smalltalk in Eclipse: How to build your favorite language IDE" by John Duimovich of IBM. It was for Smalltalk developers so John thought we would all know Smalltalk and not much about Eclipse. There were a few of us where the opposite was true. :) But that was still OK, I still learned a lot about Eclipse and Smalltalk.
Unfortunately it looks like a Smalltalk IDE in Eclipse from IBM won't be coming out for a while. One of the reasons is that IBM is wary of releasing new code as open source after the whole SCO mess started going down, which is fair enough. That code included the Eclipse plugins themselves, a Smalltalk library and IBM's Smalltalk VM (which he mentioned they would have to release as binary probably). Another is they already have another Smalltalk IDE, developed by the same company (OTI, bought by IBM) that ended up writing Eclipse. There doesn't seem to be much demand for Smalltalk in Eclipse either, so without a business case the effort might have to be more community-driven. It seemed like John was testing the waters with these Smalltalk developers and although there were no boisterous bouts of spontaneous applause they seemed impressed by Eclipse's features.
The talk was based on a presentation John made at EclipseCON 2004 less than a month ago, after which he demo'd his Smalltalk Eclipse IDE. It is language aware, constructing a DOM out of the Smalltalk code, showing it to the user in the outline view and using it for code completion.
The difficulty of making an Eclipse IDE for Smalltalk was the fact that Smalltalks don't use files, they use an image. The analogy of an image being like an old carpet was pretty good -- it catches everything that walks over it -- all of your edits. So he had to make a system where an image could be reconstructed and then saved, split the Smalltalk source into files on the file system and create equivalents to the Java tools java, javac and javaw. So it appears to me, though I am new to Smalltalk, that this wouldn't be completely like a traditional Smalltalk IDE -- it's morphed to fit into the Eclipse paradigm.
Part of the talk included a very interesting tangent on a concept called "doits" (pronounced "do its", not "doyts"). I'm not sure if it only applied to the Smalltalk Eclipse IDE or Eclipse in general though. The idea is to put a simple scripting language in Eclipse that you can use to perform quick tasks in the IDE. The example doit he gave was if someone else found a bug they could attach a doit to the bug report that would download the project from CVS, compile it and then highlight the line that refers to the bug. A scriptable IDE sounds really cool indeed. I don't think he mentioned a possible language it would use if they made it though. Maybe python?
In the future, after a complete Smalltalk IDE was made, the next step might be to let people write Eclipse plugins in Smalltalk. Obviously this would require Java (the language Eclipse is written in) and Smalltalk to play nicely with each other. The Eclipse developers have been down this path before though when they made VisualAge for Java, except it was the other way around. VAJava is written in Smalltalk.
So in that situation the Java and Smalltalk VMs would be going at the same time. The conversation at the talk then went into Microsoft and the .NET Common Language Runtime (CLR) and how it was able to execute all of these different languages in the same runtime. Would something like this be possible for Java and Smalltalk? John said that it was done before and called the Universal VM but it wasn't an ideal solution because the UVM didn't support C++ from the beginning. He also mentioned that Microsoft's work on the CLR may have started because they heard about the UVM and thought they had better make something to compete with it! So now Microsoft has the CLR and the rest of us have, well, nothing ... unless you count Mono and I don't because Java doesn't run on Mono.
As someone new to Smalltalk I'm a little worried about all of the dialects of the language. Do they fragment the Smalltalk community? It seems as though they would make developing a Smalltalk IDE in Eclipse harder. John mentioned that for his Smalltalk Eclipse IDE he would like the various Smalltalks to agree on an output format for the images and such. What if this doesn't happen easily? It could slow down development more.
At any rate, Smalltalk is interesting stuff and I'll probably be hacking on it soon even though the Eclipse IDE isn't coming out for a while. Any recommendations for a good (and free) Smalltalk IDE for a Java developer slash Smalltalk noob like myself?
# AudioMan 0.2.0 Released
What's coming up in the 0.3 stream? I'll be putting TRM calculations back in to uniquely identify songs by their audio, which will lead to other features. Reading (but not writing yet) id3v2 tags is a good candidate. Maybe support for removable media like CD-Rs.
I'm only going to add critical fixes to the 0.2 branch and release updates as necessary. Don't worry, you won't have to wait long for the next stable branch to come out. I'd like to release 0.4.0 in about a month so people can get their hands on new features.
Is there anything you want to see in AudioMan? You can leave a comment or put it in the project's Bugzilla repository.
# CVS Branching
Here's a summary of my understanding of CVS branching. If anything is wrong please let me know. Note the difference between the words revision and release.
CVS has a notion of a main development trunk (tree analogy) also called HEAD. You may branch off from the trunk and check in changes only to the branch without affecting the trunk. You may also merge the branch back to the trunk one or many times.
All of the new AudioMan development will happen on the trunk. I'll branch off 0.2 so that I can make maintenance fixes. Then I'll tag releases 0.2.1, 0.2.2, etc on that branch. Concurrently I'll also be tagging releases 0.3.1, 0.3.2, etc on the trunk.
For each source code file, CVS stores all of the revision information for the trunk and the branches in the same place. It can tell changes from the trunk apart from changes to branches by using a hierarchical system of revision numbers.
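To make the numbering concrete: trunk revisions have two parts (1.7), while a revision on a branch adds a pair of numbers for each level of branching (1.7.2.3 is the third revision on a branch that sprouted from 1.7). A small sketch that reads those numbers; CvsRevision and its methods are made-up helper names, not part of CVS:

```java
class CvsRevision {
    /** Trunk revisions have two numeric parts ("1.7"); branch revisions have four or more ("1.7.2.3"). */
    static boolean isOnBranch(String revision) {
        return revision.split("\\.").length > 2;
    }

    /** For a branch revision, returns the revision the branch sprouted from ("1.7.2.3" gives "1.7"). */
    static String branchPoint(String revision) {
        String[] parts = revision.split("\\.");
        if (parts.length <= 2) {
            return revision; // already on the trunk
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length - 2; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```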
For each file tags are kept as well. Tagging a file is a good way to mark a specific product release even though individually all of the source files probably have different revision numbers at the time you tagged them.
When you create a branch it's important to realise it's being done in the repository on each file. Branches always have a root, which is a tag on the trunk marking where the branch came from on the trunk. It is used to merge changes back later.
For example I could tag AudioMan version 0.2.0 on the trunk and then branch off from it, continuing 0.2 development from that point on the branch. The trunk would then be used for 0.3. When I wanted to merge a minor bugfix I made in the 0.2 branch back to the trunk, it would use the root tag as a comparison point to figure out the difference between the branch, the file at the time of the root tag and the present trunk.
Incidentally it looks like I'll tag 0.2.0 when it is ready and then wait until the first development release to tag 0.3.0. Sometimes the first development release is x.y.1 instead but it doesn't really matter. For example there didn't seem to be a 2.5.0 release of the Linux kernel but there was a 2.3.0 release.
For stable releases Bugzilla has 2.16 instead of 2.16.0. I think it's a bit confusing not having the last digit because 2.16 could refer to the series of releases rather than a specific version.
I'm setting up the AudioMan CVS server for this and the module AudioMan might be in flux for a bit. I'll post an update when it is ready. Until then use module AudioManPoint2.
Update 9:09 AM: OK, module AudioMan is ready now. Rather than synchronize with this new version of the module, you are better off deleting what you have and checking AudioMan out again.
# Having Stable and Development Branches
Given that AudioMan works on people's data and could -- in the future when it starts dealing with id3v2 for example -- corrupt or delete files, I think I'm going to have to have at least two active branches going at the same time. A lot of projects follow this system, including the Linux kernel and Bugzilla.
Version 0.2.0 will be the first release in the stable 0.2 branch and after that it will be updated with bug fixes only. This way people can use releases in the 0.2 branch knowing that it has predictable, though limited, behaviour. Not everyone wants to be on the bleeding edge.
At the same time I release 0.2.0 I will start the 0.3 branch, which is where all of the new development will go. The bug fixes that are done to the 0.2 branch will be done on 0.3 if they still apply. The odd second number means that the 0.3 branch is frequently unstable and is under development. When interim releases like 0.3.1, 0.3.2, etc are done they will be in a semi-stable state to get user feedback.
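The odd/even convention is easy to check mechanically. A tiny sketch, with VersionScheme and isStable as made-up names:

```java
class VersionScheme {
    /** True when the second number is even (0.2.x, 0.4.x), which marks a stable branch under this scheme. */
    static boolean isStable(String version) {
        String[] parts = version.split("\\.");
        int second = Integer.parseInt(parts[1]);
        return second % 2 == 0;
    }

    public static void main(String[] args) {
        System.out.println("0.2.0 stable? " + isStable("0.2.0"));
        System.out.println("0.3.1 stable? " + isStable("0.3.1"));
    }
}
```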
When the 0.3 branch is mature and stable enough it will branch off to 0.4 and 0.5. Until the 0.4 branch has been used a lot and trusted people may still be cautious so the 0.2 branch will have to be maintained during that time. Eventually 0.2 will either run out of bugs to fix or users and will be retired.
Like 0.3, the 0.5 branch is where the new features will go after 0.4 branches off. Then the cycle repeats itself. I'd like to keep the iterations fairly short, so that stable branches are released regularly. This way people can get new features more often and don't have to take their chances on the unstable branch. I'm thinking maximum two months between stable versions unless a major change of architecture takes place.
When does AudioMan go to 1.0? That's hard to say but I definitely have some features in mind that I want to get in before 1.0 is released. Right now it's just too far away to think about, so I'll just keep iterating and hopefully people will keep giving me feedback.
I'm not sure how to do branches in CVS so it could be very fun figuring that out. Eclipse seems to have branch support but I have never used it. When I get it sorted I'll be sure to blog it.
# Updating From File Changes Made Outside of AudioMan
I'm on the home stretch with AudioMan 0.2.0 and I need to write this out to get it organised in my head. To get AudioMan to function like it did in 0.1.2, I need to get two things working that are kind of related: updating songs in AudioMan when they were edited by another application and automatic capitalization.
If you refresh your memory about how components in AudioMan communicate you'll see that step B2 involves getting a bunch of "records" from the repository that match what the user wants to see.
After it gets these records it has to check each one to make sure the file hasn't changed since the last time we read the information. The repository stores a time stamp for each file so AudioMan just has to compare it against the file's last modified time.
If the file has changed, AudioMan has to read the new data from the file (B3) and then update the repository (B4). After that the models can be safely updated (B5).
The old way used to do this one record at a time and then return the finished array but I want to be more flexible. If a file has changed AudioMan might have to do some long operation (like recompute the TRM) which could delay updating the model and affect UI responsiveness. Speaking of that, I have an idea of how to do those long TRM calculations in the background but I'll talk about that in another post.
So what I could do is add all of the records to the model and then update all of the stale ones via the TracksMutator afterwards. The models listen to the TracksMutator so they would be updated as well.
This also has the advantage of keeping all the write operations in the mutators and all of the read operations in the controller, which makes flow control nice and simple.
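The read-only half of that plan might look roughly like this. It is a sketch under assumptions: Record is a hypothetical stand-in for whatever the repository returns, and the mutator that would consume the stale list is left out:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class StaleCheck {
    /** Hypothetical repository record: a file plus the time stamp stored when it was last read. */
    static final class Record {
        final File file;
        final long timeStamp;
        Record(File file, long timeStamp) { this.file = file; this.timeStamp = timeStamp; }
    }

    /** A record is stale when its file is readable and was modified after the stored time stamp. */
    static boolean isStale(Record r) {
        return r.file.canRead() && r.file.lastModified() > r.timeStamp;
    }

    /** Read-only pass: collect the stale records so they can be handed to a mutator afterwards. */
    static List<Record> findStale(List<Record> records) {
        List<Record> stale = new ArrayList<>();
        for (Record r : records) {
            if (isStale(r)) {
                stale.add(r);
            }
        }
        return stale;
    }
}
```

Nothing is written during this pass, so the model can be loaded right away and the slow updates can happen afterwards.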
Automatic capitalization squeezes into this update procedure as well. I see the logic going something like:
//one record of many, inside a loop
//copy the original AudioData so we can compare later
Long timeStamp = (Long)ad.get(AudioDataKey.TIME_STAMP);
if (f.canRead() && (f.lastModified() > timeStamp.longValue()))
{
    //the file changed on disk: re-read its data into newAd
}
//check the formatting
if (!newAd.equals(ad))
{
    //newAd differs from the original, so this record needs updating
}
The format() method does the capitalization and it returns the AudioData object unmodified if no formatting needed to be done.
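The "return it unmodified" contract is worth spelling out, because it is what makes the equals() comparison cheap. A simplified sketch that works on a plain String instead of an AudioData object; the capitalization rule here (first letter of each space-separated word) is an assumption, not necessarily AudioMan's actual rule:

```java
class Capitalizer {
    /**
     * Capitalizes the first letter of each space-separated word.
     * Returns the very same String object when nothing needed to change.
     */
    static String format(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        boolean startOfWord = true;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            sb.append(startOfWord ? Character.toUpperCase(c) : c);
            startOfWord = (c == ' ');
        }
        String result = sb.toString();
        return result.equals(text) ? text : result;
    }

    public static void main(String[] args) {
        System.out.println(format("led zeppelin"));
    }
}
```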
After looping through all of the records the model is loaded with all of the records. Then the TracksMutator is used to update the records needing updating.
//update the stale tracks
Explicitly synchronizing with the file system like this will all go away once I figure out how to use file system listeners. You can attach listeners to directories to see if files are added/removed/modified, etc. In AudioMan, in order to see a change in the file system you have to select another view and then come back. With file listeners I can update the data in the model and view as soon as the files are changed, just like Windows Explorer does.
I know this is possible because Eclipse has done it since 3.0 M7.
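For readers on a current JDK, one way to get directory listeners is java.nio.file.WatchService, which has been in the standard library since Java 7. This is a swapped-in technique, not what Eclipse uses internally, and DirWatch is a made-up wrapper name:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class DirWatch implements AutoCloseable {
    private final WatchService watcher;

    /** Starts watching one directory for created, modified and deleted files. */
    DirWatch(Path dir) throws IOException {
        watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY,
                StandardWatchEventKinds.ENTRY_DELETE);
    }

    /** Waits up to timeoutMs for changes and returns them as "KIND name" strings (empty if none). */
    List<String> poll(long timeoutMs) throws InterruptedException {
        List<String> changes = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (key != null) {
            for (WatchEvent<?> event : key.pollEvents()) {
                changes.add(event.kind().name() + " " + event.context());
            }
            key.reset(); // re-arm the key so later changes are reported too
        }
        return changes;
    }

    @Override
    public void close() throws IOException {
        watcher.close();
    }
}
```

A background thread could call poll() in a loop and pass each change on to the mutators, so the model and view update without the user switching views.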
# More Music
The great music just keeps coming in. It's a good thing too, at the rate I'm listening to music now I'd get sick of the stuff I have. Here are some other great albums I've been listening to:
Teitur - Poetry & Aeroplanes
# Playlist Support Back In
AudioMan is almost ready for the 0.2.0 release and you can check out another new developer build today if you want. Since the last update I've put support for playlists back in and cleaned up the GUI a bit.
I put over 1700 songs in it and there is only a small delay switching between All Artists and a specific artist. The UI is definitely a lot better architected this way. Even though it took me a month to do the rewrite it was well worth it. I'd hate to imagine doing it in my spare time while I was working though. It would have been more like 4 months. I couldn't imagine doing a change like this incrementally either -- it was too much of a major change of architecture.
I'd like to get as close as possible to the old AudioMan 0.1.2 functionality before I release 0.2.0, after which I'll start taking bug reports again. What's left includes the preference dialog, quality preference, automatic capitalization, updates from the files if they have been changed outside of AudioMan, exporting to M3U and HTML, UI tweaks and probably other stuff. I'll see if I can get that done tonight.
# SWT AssertionFailedException
If you're getting this error in your SWT program:
org.eclipse.core.internal.runtime.AssertionFailedException: assertion failed: The application has not been initialized.
then code that SWT is calling is throwing an exception that is not being caught, probably a runtime exception like NullPointerException. SWT seems to catch all exceptions and then throw this generic one.
Why throw this cryptic general error message with no hint as to the real problem? I don't know but it's really not that helpful for debugging the problem. There's probably a good reason for it though because it's not an ideal solution. Ha, does that make sense? :)
To find out what's wrong you can wrap calls to methods outside of SWT in try/catch blocks. Normally I wouldn't advocate catching Exception but in this situation wrapping it in a try/catch block temporarily is a good way to figure out what's wrong. Usually I do something like:
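A minimal sketch of that kind of temporary wrapper, written without SWT so it runs on its own. The loadTracks() method is a hypothetical stand-in for whatever non-SWT code your listener calls:

```java
class DebugWrapper {
    /** Hypothetical application code that fails with an unchecked exception. */
    static void loadTracks() {
        String name = null;
        name.length(); // throws NullPointerException
    }

    /** Runs the call and reports whether it failed, printing the full stack trace if so. */
    static boolean failed(Runnable call) {
        try {
            call.run();
            return false;
        } catch (Exception e) {
            // Temporary debugging aid only: take this out once the real cause is found.
            e.printStackTrace();
            return true;
        }
    }

    public static void main(String[] args) {
        failed(DebugWrapper::loadTracks);
    }
}
```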
This will print a full stack trace in the Eclipse console view. Then after you debug the problem take the try/catch block out. It's not going to do much good to you in production when users can't see the command line or a console view (stdout).
So I guess this is where the log file solution comes in that I talked about earlier. You could catch all exceptions (even runtime) coming from outside SWT and log the exceptions with stack traces to a file. This is what Eclipse does if I'm not mistaken -- at least it did in 2002 when I worked at Rational.
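Logging to a file needs very little code. A sketch, with ErrorLog and the log file path chosen by the caller as assumptions; this is not how Eclipse implements its own log:

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Date;

class ErrorLog {
    /** Appends the exception and its stack trace to the given log file. */
    static void log(String logFile, Throwable t) {
        // The second FileWriter argument (true) means append rather than overwrite.
        try (PrintWriter out = new PrintWriter(new FileWriter(logFile, true))) {
            out.println(new Date() + " " + t);
            t.printStackTrace(out);
        } catch (IOException io) {
            io.printStackTrace(); // last resort if the log itself cannot be written
        }
    }
}
```

A catch block around calls made from SWT listeners would then call ErrorLog.log(...) instead of printing to a console the user never sees.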
I wonder if you could use AspectJ to do something like "every time I call outside of SWT wrap it in a try/catch block 'automatically' and log the exceptions to a file." Then you wouldn't have to clutter up your code with explicit try/catch blocks and better yet you wouldn't have to remember to try/catch every time. That kind of thing would be much less error prone. Unfortunately I don't know enough about AspectJ to know if this is possible. Time to find out I think...
# Thinking About the Sweet Spot
People have been asking me where I'm going with technology, so that got me thinking about how a person would go about thinking about it (metathinking, if you will).
First off, besides all other considerations, what interests you personally? I shouldn't have to go into a big rant about being happy at work, this is just obvious. Do you like playing with databases, languages, architecture? If you don't like technology and you're just in it for the money, that's your own problem. :) If you like a segment of technology that isn't doing too well, you better be competitive or you won't make it in, much less survive there. Be realistic.
Then do the inevitable "what do I have to do to get there" planning. You can't start as an architect unless you can really talk your way into a job -- just like you can't start as a middle manager with a business degree unless you're the CEO's son. You have to do your time coding and testing, often unglamorous work at first depending on how much you like coding. But keep reading about architecture, keep yourself interested, figure out the architecture (if any) of the code you're working on and critique it. Most of all, and this will be hard at first, keep your naive opinions about architecture to yourself. Don't make too many suggestions to your managers or they will think you're trying to flip them the bird. You can't just shrug off office politics if you want to be an architect or a team lead ... you have to embrace politics.
Right now I'm interested in programming but I group languages into three buckets: dead/dying, alive and future. Notice this is different from "what are people using today" thinking. For example, C++ is a dying language. There may be a lot of C++ jobs out there right now, but there won't be in 5-10 years. Memory management, security and pointers are some of the biggest reasons. That's not to say people aren't going to be using C++ for many years, but the market will be just like assembly programming is today -- fairly niche. Even Microsoft has given up on C++, once their bread and butter, only supporting it for legacy purposes in .NET. They are heading in the managed code direction.
Java, .NET (C# and VB.NET) and other object oriented languages with garbage collection are alive today. Chances are you will use these languages in your career over the next 20 years. Unless you are a fantastic C++ programmer, why compete against people with 10+ years of experience when C++ jobs are becoming more scarce?
The future languages and software engineering techniques should always be on your radar. It's good to know when a good job can be done even better, easier or quicker by a leap in technology so keep your ear to the ground. Do you know the differences between Python and Ruby? Have you heard of Aspect Oriented Programming and how does it apply to your favourite language? Have you tried Ant? What's a functional programming language and when should you use it?
Software engineering concepts like patterns are also gaining mainstream popularity, so you should pay attention to those. The main thing is that they give everyone a common frame of reference to have an intelligent, brief conversation with academics and architects, which may be important for your career. Testing (especially automating manual testing) is becoming a big deal because programs are becoming too large to maintain otherwise. Developers are becoming more involved in the testing process instead of insulated from it.
Another tip: don't get too specific. Learn how to go end-to-end. When is the last time you wrote a SQL query or designed (or critiqued) a database? Do you know how to set up the tools you use on your project like CVS and Bugzilla? Speaking of CVS, what are the alternatives and what advantages do they have? CVS is dying too. Tying yourself to one platform or vendor is also dangerous. What's the difference between Java and C#? Have you tried using a Mac? How about a Linux distribution? What advantages do these platforms have for the people that use them? What niche do they fill? The answers can sometimes expose new opportunities for you.
Technology trends are another thing to keep in your head: don't operate in a vacuum. Do you know what RSS is and what it's used for? It could be the web of the future so you should keep your eye on it. In the same vein, what is social software? What are the Google APIs? Or even: what are web services and how would I use them in my applications? Who are the best bloggers to read for the technology I'm interested in? What are all of the platforms your favourite language runs on? Who knows, you could be making Java cell phone games one day.
That's just off the top of my head. Anyone else have ideas?
# Does Java Need a Higher Level Pattern Validation?
After reading this interview with Charles Simonyi, I wondered: What if there was a design pattern language built on top of Java? Java code is verified by the Java compiler and maybe as it is written in an IDE like Eclipse. The Java programming language is quite flexible and generally portable.
If you wanted an object to represent part of a design pattern, couldn't you do it in a language built on top of Java which would verify that the class was properly structured to represent that pattern after it compiled correctly? In a regular or legacy Java compiler these high language directives would be ignored.
Simple examples would be immutable objects and singletons. For immutable objects, the design pattern verifier would ensure that all of the member variables were private final, and that they were set only by a constructor and had the proper get accessor methods. Otherwise, you'd get a warning or error. For singleton classes it would ensure that the constructor was private, there was a private static member variable representing the singleton in the class and that a static getInstance() method was present.
The logical extension of this system would then be wizards that pump out immutable objects and singletons, based on a few pieces of information. The design pattern verifier enforces the design pattern contract you are trying to use, or at the very least warns you when you go astray.
It might also make it easier to refactor one design pattern to another: going from a singleton class to a factory class, for example.
Some people might be saying "so what? I can already write my patterns in Java". You're right, you can. The power comes from the machine verification of the design pattern. If the patterns are rigidly enforced, quality of the code will improve because you're no longer breaking basic pattern contracts and abstractions needed to assemble larger systems. However, building it on top of Java ensures that the parts of the patterns that need implementing can be done in a flexible way and in a language that many people are familiar with.
What would a "language" like this look like? Well obviously it would have to be in Java comments so that it could be safely ignored by standard issue Java compilers. It might look something like Javadoc. Who's going to build it? Do we wait for Sun to step up to the plate or start a community effort?
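As a rough sketch of what Javadoc-style directives like this might look like, here are a singleton and an immutable class marked with a made-up `@pattern` tag. The tag name and the class names are purely hypothetical. No such verifier exists here, and a standard Java compiler just sees ordinary comments.

```java
/**
 * @pattern singleton   (hypothetical tag: a verifier would check for the
 *                       private constructor, the private static instance
 *                       and the static getInstance() method)
 */
class SongRepository {
    private static final SongRepository INSTANCE = new SongRepository();

    private SongRepository() {
    }

    static SongRepository getInstance() {
        return INSTANCE;
    }
}

/**
 * @pattern immutable   (hypothetical tag: a verifier would check that every
 *                       field is private final, set only in the constructor
 *                       and exposed through a get accessor)
 */
final class TrackInfo {
    private final String artist;
    private final String title;

    TrackInfo(String artist, String title) {
        this.artist = artist;
        this.title = title;
    }

    String getArtist() {
        return artist;
    }

    String getTitle() {
        return title;
    }
}
```

Both classes compile as plain Java today; the only new part is the comment that a pattern-aware tool could read.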
# Seeing the Forest for the Tests
"How long would it take you to write method X if you didn't unit test it?"
It's a fair question, often asked by managers looking to cut corners. Not that I don't have confidence in my coding skills but I wouldn't consider writing code without unit tests any more. Here's why ...
First of all test-driven development (TDD), when done properly, gives you the minimum amount of code needed to satisfy the requirements of the method. This minimizes cruft gathering in the code. For those that are unfamiliar with test-driven development, here's how I interpret and use TDD.
Say you have a function you know will have 3 parameters and return one value. Start with the first parameter and ask yourself a few simple questions: what are the ranges of values this Object (or primitive value like int) can represent? Can I pass in null as a legal value, and if so what is the output? Is this parameter used in combination with any of the other parameters to produce the return value or is it used in the programming logic?
I usually write the null tests first, because they are the most basic requirements. When you pass an Object to a method chances are you want to use one of its methods. If the parameter is null when you try to call the method the VM is just going to throw NullPointerException anyway.
So write one test at a time. Make sure it fails and then change the method code to make the test pass. Having the test initially fail is crucial because you could have a bug in your code if you expect a test to fail and it unexpectedly passes. As you add tests, the tests you have already written act as regression sensors. So if while figuring out how to make test 9 pass you break tests 4 and 5 you'll know right away. Regression tests are one of the simplest ways to make strong code.
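To make that concrete, here is a minimal sketch of the first two steps for a made-up method, `countWords(String)`. The class and method names are invented for illustration, and the tests are plain Java checks rather than a real test framework, just to keep the example self-contained.

```java
class WordCounter {
    // Just enough code to make the two tests below pass.
    static int countWords(String text) {
        if (text == null) {
            throw new IllegalArgumentException("text must not be null");
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }
}

class WordCounterTest {
    // Test 1, written first: null is treated as a programming error.
    static void testNullIsRejected() {
        try {
            WordCounter.countWords(null);
        } catch (IllegalArgumentException expected) {
            return; // pass
        }
        throw new AssertionError("expected IllegalArgumentException for null");
    }

    // Test 2, added next; test 1 now acts as a regression sensor.
    static void testCountsWordsSeparatedBySpaces() {
        int actual = WordCounter.countWords("one two  three");
        if (actual != 3) {
            throw new AssertionError("expected 3 but got " + actual);
        }
    }

    static void runAll() {
        testNullIsRejected();
        testCountsWordsSeparatedBySpaces();
    }
}
```

Each test would have been run against an empty `countWords()` first to see it fail, and only then would the method body grow by the few lines needed to make it pass.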
Consider the alternative: writing code from your brain to the editor, figuring out all of the possible paths in your head and keeping track of possible regressions as you modify and add to the method. Figuring out obscure bugs by using the debugger and printf()s. It's just too much to worry about. No wonder some coders can't think long term: they are too worried about breaking something in the method they are working on in the short term. If you have really great unit regression tests, you worry less about that and can free your mind up for the big picture.
Using TDD I find that I use the debugger and printf()s very rarely. Unit tests usually isolate a problem with code well enough that you don't need to. When you're starting out with TDD you might get a bit ahead of yourself and write a lot of code sometimes and personally I find this is when the program gets buggy again. Back up and take it one test at a time. It's not glamourous but it gets the job done right.
When you come back to the code in six months if you have unit tests you can read them to figure out the expected behaviour of the method. Unit tests also give a good idea of how to properly use the method in production. Without unit tests you have to grok the code, and comments if there are any and maybe read some design documents. Unit tests are a great way of documenting code without writing artifacts. It's laziness but it's constructive laziness.
So when a manager asks me "How long would it take you to write method X if you didn't unit test it?" I'll probably reply "about the same amount of time". However, I'll add that while I'll guarantee the tested version will run as expected because I'm writing a unit testing contract at the same time I'm writing the code, there's no quick way to guarantee the behaviour of the untested version even if it is actually 100% correct. You'd have to take the code and audit it line by line, using the design documents (assuming you made them) as reference. A manager that studies software development will probably let you take the time to write the tests and invest in the long term future of the project.
# Asynchronous View Update with setInput() Works Well
I have changed around AudioMan so that it does what I blogged about yesterday: it fills the models in a new thread and then uses setInput() to attach the new model into the view. The old way was just adding and removing items from the same model, which took a long time for the UI to update.
The switch between All Artists and a specific artist is almost instantaneous. I've also changed the way include directory adds new songs. They will appear in the view as the crawling function finds them.
# Include Directory and Threading
The Include Directory action in AudioMan can be quite a long operation. The way I do it now, the crawler code descends into the directory recursively, finds all of the MP3 files and returns an array of tags at the end of the crawl. Then I iterate through this array of tags and add them to the repository one at a time.
There are problems with this approach though. Crawling through the directories takes much longer than adding a file to the repository, so the UI appears to do nothing for a long time and then is suddenly loaded with many songs. I could keep a count of found MP3 files in the status bar as I find them but all of the songs would still appear all at once in the end.
I also have to consider the TRM calculation (unique IDs based on the audio) that I want to do later, which is long and requires a lot of I/O. I could do those after I've read all of the tags though (in the background) so that the user gets all of the tag information as soon as possible. This is the same way that Windows Explorer in Windows XP calculates the length of large movies, one at a time after the window appears. Once the TRM is calculated I only have to recalculate it if the file's last modified date changes. So it's expensive but not done very often.
A better option would be two threads: one crawling the directories and the other adding them to the repository as they are found. The second thread would give a handler to the first thread (using the Command pattern again) so it can tell the other thread when an MP3 file is found. Then the view will start updating immediately and the user will see new files added as the crawler finds them.
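Here is a rough sketch of that idea, not AudioMan's actual code. The names `FileFoundHandler` and `DirectoryCrawler` are made up for illustration. The crawler runs on its own thread and calls the handler for each MP3 as soon as it sees it, instead of returning one big array at the end.

```java
import java.io.File;

// The command handed to the crawler: called once per MP3 file found.
interface FileFoundHandler {
    void fileFound(File mp3);
}

class DirectoryCrawler {
    // Walks the tree depth-first and reports each MP3 as soon as it is seen.
    static void crawl(File dir, FileFoundHandler handler) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, or it could not be read
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                crawl(entry, handler);
            } else if (entry.getName().toLowerCase().endsWith(".mp3")) {
                handler.fileFound(entry);
            }
        }
    }

    // Starts the crawl on its own thread so the caller is not blocked.
    static Thread crawlInBackground(final File root, final FileFoundHandler handler) {
        Thread crawler = new Thread(new Runnable() {
            public void run() {
                crawl(root, handler);
            }
        }, "crawler");
        crawler.start();
        return crawler;
    }
}
```

Note that `fileFound()` is called on the crawler thread, so whatever the handler touches (the repository, and eventually the view) has to be safe to use from there.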
The problem with crawling a directory recursively is that you don't know how many files you'll find or how long it will take so a progress bar doesn't really fit. The user interface should give the user feedback some other way though. I could put the directory currently being looked at in the status bar but then they would just zip by. Same goes for the song filenames as I include them -- they would just go by too quickly for the user to read them.
On the other hand that's not really the point. The point is to let the user know that something is happening so they don't feel like the operation didn't work -- the UI needs to give feedback. Remember, when you do an include directory now you can still browse around and use the application. When you do a long include directory it seems as though nothing is happening until, much later, a whole bunch of songs are added. To show the directory names in the status bar the UI will have to send a handler to the include directory operation to update the GUI using the SWT thread.
I also have to figure out why my Java directory crawling code is so slow. I've seen Windows programs like WinAmp (likely written in C++) crawl directories recursively much much faster. I wonder if it is a limitation of the Java VM or just the way my code is written.
# Throws Object
Jim talks about throwing Throwable and using exceptions in private methods. Joshua Bloch covers this pretty well in Effective Java.
Using exceptions when checking parameters is OK. It all depends on the type of error. A programming error, like passing in null when the method doesn't like it, should throw an unchecked (runtime) exception. These types of errors should be noticed immediately by the programmer, and unchecked exceptions do that -- the application quits and a stack trace is printed. IllegalArgumentException is another useful unchecked exception to indicate a programming error.
It is OK in my opinion for private methods to throw exceptions, as long as they are used for exceptional cases, just like public methods. private methods are often refactorings of common code used by many public methods, or just a block of code you want to take out of a public method to simplify it. So they are like extensions to a public method used from the outside. If a private method throws an unchecked exception indicating a programming error then it is like the public method that is using the private method throws that exception, which is just as effective.
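A small sketch of what I mean, with made-up names (`PlaylistNamer`, `displayName`, `clean`): the private helper does the parameter check, and the unchecked exception surfaces through whichever public method called it.

```java
class PlaylistNamer {
    // Public entry point: a caller passing null sees the unchecked
    // exception right away, with this method in the stack trace.
    static String displayName(String artist, String album) {
        return clean(artist) + " - " + clean(album);
    }

    // Private helper factored out of the public method. Throwing here is
    // as good as throwing from displayName() itself.
    private static String clean(String value) {
        if (value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        return value.trim();
    }
}
```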
On the other side, if I remember correctly another rule of thumb that Effective Java used was if the situation is recoverable you should use a checked exception. For example, if a file isn't found throw FileNotFoundException, catch it and recover. However the code using the method that throws FileNotFoundException shouldn't depend on the exception for its main flow control. It should look more like:
File f = new File("somefile.png");
if (f.exists() && f.canRead()) {
    // try { ...read() the file... } catch (IOException e) { ...recover... }
}
The IOException covers an exceptional case here: if the file goes missing or becomes unreadable after the if (...) but before the read(), which could happen in a multithreaded application that uses files a lot or even by another application if a context switch happened there. This kind of race condition would be a real pain in the ass to track down, so exceptions save you from having to synchronize this bit (though you probably should anyway, or lock the file somehow).
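A fuller sketch of that check-then-read shape, under some assumptions: `FirstByteReader` and `readFirstByte()` are made-up names, and reading a single byte stands in for whatever the real read would be.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class FirstByteReader {
    // Returns the first byte of the file, or -1 if the file is missing,
    // unreadable or empty.
    static int readFirstByte(File f) {
        // Normal flow control: ask first instead of relying on the exception.
        if (!f.exists() || !f.canRead()) {
            return -1;
        }
        try (InputStream in = new FileInputStream(f)) {
            return in.read();
        } catch (IOException e) {
            // Exceptional case: the file vanished or became unreadable
            // between the check above and the read.
            return -1;
        }
    }
}
```

The `if` handles the cases you expect; the `catch` is only there for the window between the check and the read.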
But like I said above, it's for exceptional corner cases. You should not be using exceptions for flow control because they are computationally expensive and harder to trace than normal procedural code. Add multithreading into the mix and you can get really whacked in the side of the head.
Throwing and catching Throwable isn't just ugly, it's wrong. The only thing worse is throwing Object (and I've seen that too). The reason you don't want to do this is that you catch unchecked exceptions too, which are programming errors you want to propagate up the stack so that the VM catches them, prints a stack trace and terminates the program immediately. Unchecked exceptions, being the opposite of checked exceptions, indicate that the program is in an unrecoverable state. So it doesn't make much sense to catch unchecked exceptions at all.
Unchecked exceptions in Java are the RuntimeException class and its children, which you'll notice extends Object, Throwable and Exception. Code should never explicitly catch Object, Throwable or Exception and if it does it's a dead giveaway that the person who wrote it doesn't have a clue how to program in Java.
There is one slight exception though. Sometimes programs catch all exceptions (including unchecked runtime exceptions) near the top of the stack and log them in a file, especially applications that are GUI-based since it can be hard to see stdout while it's running. This is a good strategy for logging a GUI-based app but it presents its own problems. What if the log file has an IOException, where is that written to? :)
# Phone Geeking
Cell phone numbers aren't in phone books which makes reverse lookups impossible. You can still get two pieces of information: 1) if the number is in fact a cell phone number and 2) who the provider is. If the number is a land line, you can find out what town or city the number is in.
You get this information from the Central Office (CO) Code availability page. Each Central Office code covers 10,000 phone numbers and they are allocated in chunks that large. For example on the 613 page you can see that Bell Mobility has CO codes 858, 859, 921, 922 and others.
The allocation scheme also means that a lot of numbers are unused (but at the same time allow for future growth). For example, the town of Pakenham and the surrounding area has all of code 624 and a lot less than 10,000 residents.
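If you wanted to automate the lookup, a minimal sketch could look like this. The class name `CentralOfficeLookup` is made up, and the table only holds the handful of 613 codes mentioned in this post; a real one would be filled in from the availability page.

```java
import java.util.HashMap;
import java.util.Map;

class CentralOfficeLookup {
    // Only the 613 entries named in this post.
    private static final Map<String, String> CODES_613 = new HashMap<String, String>();
    static {
        CODES_613.put("858", "Bell Mobility");
        CODES_613.put("859", "Bell Mobility");
        CODES_613.put("921", "Bell Mobility");
        CODES_613.put("922", "Bell Mobility");
        CODES_613.put("624", "Pakenham");
    }

    // Strips punctuation and returns the ten digits of the number.
    private static String digitsOf(String phoneNumber) {
        String digits = phoneNumber.replaceAll("\\D", "");
        if (digits.length() != 10) {
            throw new IllegalArgumentException("expected a 10-digit number: " + phoneNumber);
        }
        return digits;
    }

    // The CO code is the three digits after the area code.
    static String coCode(String phoneNumber) {
        return digitsOf(phoneNumber).substring(3, 6);
    }

    // Returns who holds the CO code, or null if the code isn't in the
    // table or the number isn't in area code 613.
    static String lookup(String phoneNumber) {
        String digits = digitsOf(phoneNumber);
        if (!digits.startsWith("613")) {
            return null;
        }
        return CODES_613.get(digits.substring(3, 6));
    }
}
```

The last four digits are the line number, 0000 to 9999, which is where the 10,000 numbers per CO code come from.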
The last column on the availability table, Remarks, is interesting too. All of the Hull CO codes that have recently been changed to local calls in the 819 area code are now reserved in the 613 area code. There are numbers for special use (200, 311, 400, 411, 911, 939, 976) and even misdials -- check out 912, 914, 915 but not 913 (the 3 is far enough away from the 1 to reduce the chance of a misdial).
# Multithreaded Use of a TableViewer Content Provider's setInput() Method
So far things are good with the new AudioMan architecture. I've taken playlists out of the picture to simplify the application while I play with SWT TableViewers. If you download the latest developer build that I released today you'll notice that the UI is much more responsive, owing to the fact that I now spawn asynchronous threads for the two longest running operations: Include File and Include Directory.
The Include Directory operation has no progress bar yet but I do plan to use one. For now the UI just sits there (though it's not frozen, you can still use it and browse the collection). I would like to allow the user to use the GUI while the include operations are taking place because they can be quite long. What I would like to do is use the status bar at the bottom to show the include progress and songs will be added to the model as they are included.
Another thing you'll notice is that the models are never cleared -- entries are added and removed as needed. If you add a few hundred songs and browse around, the time it takes to reload the All Artists track list is quite long as is the time to filter down to a specific artist again from the All Artists list. Though the UI effect is pretty neat -- seeing the items added and removed -- this is not the response we want.
The reason why it's doing that is because I set the TableViewer content provider's pointer to the model once (setInput()) and as I mentioned before, add and remove records as needed. The call to the database/repository runs asynchronously in its own thread, requesting the tracks that match what the user wants (artist, album), loading them into the model and then finally removing the ones that don't match.
Instead what I think I will do is every time the user changes the artist or album I will make new models with the returned artists, albums and tracks (think recordsets) and use the content providers' setInput() method to swap the views' model listeners from the old models to the new models. Then the change in the UI will appear to be (in theory, hopefully) instantaneous, instead of the way it is now where you can see the items being added and removed.
Given that the call to the controller will be in its own thread, when the database/repository returns with the recordsets (they are arrays, but the analogy is helpful) I can construct the new models in the thread but I still need to call the content providers' setInput() method from the SWT thread. As mentioned in my previous post, all SWT calls must be done with the SWT thread (or threads maybe, I'm not sure. I've read that Swing supports more than one UI thread so maybe SWT does too).
Then the user interface will have to be notified asynchronously when the controller has loaded the new models and also given pointers to these new models. That can be done two ways: by registering a listener with the controller or by passing a handler as a parameter which implements the Runnable interface, which is executed when the models are ready. The handler option is what's known as the Gang of Four Command pattern, which is like passing a function pointer in C++ as a callback method.
Registering listeners is useful when you have more than one thing interested in being notified. In this case, the only thing that needs to be notified when the new models are ready is the view (GUI) so the overhead of registering and keeping track of listeners is unnecessary though the architecture might be easier to grok. For now I will use the Command pattern with the Runnable parameter and see how well that works.
I am most interested to see if using setInput() with the new models will result in more instantaneous loading of the track list when the artist or album is switched. There could be a long delay extracting an Object array from a large model in the getElements() method, which is called by setInput() when there is a model change. Storing the models as arrays isn't an option because they have to be able to grow and shrink. The conversion from collection to array has to happen.
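Roughly, the handler option could be sketched like this. None of it is AudioMan's real code: `ModelsReadyHandler`, `TrackController` and the fake `queryTracks()` are names invented for the example, and there is no SWT in it so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The command: a Runnable that is handed the new model before it is run.
abstract class ModelsReadyHandler implements Runnable {
    private List<String> tracks;

    void setTracks(List<String> tracks) {
        this.tracks = tracks;
    }

    List<String> getTracks() {
        return tracks;
    }
}

class TrackController {
    // Stand-in for the database/repository call; returns an array, like a recordset.
    private String[] queryTracks(String artist) {
        return new String[] { artist + " - Track 1", artist + " - Track 2" };
    }

    // Builds a brand new model on a worker thread, then runs the handler.
    Thread loadTracks(final String artist, final ModelsReadyHandler handler) {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                List<String> newModel =
                        new ArrayList<String>(Arrays.asList(queryTracks(artist)));
                handler.setTracks(newModel);
                // Still on the worker thread here. A real view would hop to
                // the SWT thread inside run() before calling setInput().
                handler.run();
            }
        });
        worker.start();
        return worker;
    }
}
```

The view would pass in an anonymous `ModelsReadyHandler` whose `run()` swaps the new model in, so the controller never needs to know anything about the GUI.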
By the way, Allen Holub's book Taming Java Threads is quite good. He gets into the details about how Java threads are (what I would describe as) a leaky abstraction (*cough*deprecated methods*cough*), especially from platform to platform. It's particularly good with the technical details of thread safety, which is exactly what I was looking for. It's a relatively old book though (2000) and that makes me wonder how much the virtual machine implementations have changed with regard to threading since then (approx Java VM 1.2).
# Eclipse Gets a Facelift
I have two comments about Eclipse's new user experience, to be released in Eclipse 3.0 M8, shown below:
1. Why would you go to all of that trouble to make the SWT widget toolkit use native widgets and then not use them (or put a skin over them)? The SWT team should be crying bloody murder on this one, no? Eclipse is SWT's biggest customer right now and if Eclipse goes through with this change it is marginalizing one of the main selling points of the toolkit, namely the native look and feel (at the same time we should realise that this transition required the help of the SWT team). What is SWT's focus? To support Eclipse and its direct plugins or to be a general toolkit?
2. Just at first glance it seems ugly and unnecessary to me. No, I have not used the demo yet. Of course I thought the same thing about the new MSN Messenger interface and when I go back to the old one now it seems quaint in comparison. So I'm not sure ... it would be nice if they would let you toggle it off and on but that likely won't happen. The present "native" flat interface on Win32 looks more professional. If they are making this change just to show off the skinning abilities of SWT I think that's a big mistake. Where is the need?
Judging from the comments left in the bug related to this plan many people aren't happy about the proposed new look and feel. This change could disappoint a lot of people -- it's risky. I sure hope the Eclipse usability team knows what they are doing. Just remember who your customers are: engineers and coders don't need bells and whistles, they just get in the way.
But these are just my first impressions. I'm definitely not a usability person, I just know what I like. :) I'll try to get my hands on the demo and give it a go. It appears as though the plan is to introduce the look and feel into the main development branch post-M7.
Update 12:26 Looks like they are changing the Mac interface too:
Skinning a Win32 interface isn't unheard of, but Carbon? Honestly, going anything but native on Mac OS X is just asking for it. I can't imagine you'd get Apple's support. Do you think it follows Apple's Human Interface Guidelines?
When it comes to GUIs it seems like everyone has an opinion. This should be a fun ride.
# Black Box vs. Crystal Box Development and Security
Dana Epp has written an interesting post about security of closed source versus open source projects. The main issue I have with it is that he groups all open source projects together. As a whole, they all have different reasons for being and different sizes of teams. They also have different security requirements, which cannot just be generalized under one heading.
While I generally agree as a software engineer that the Build-it Fix-it (BIFI) approach that open source software development uses is not at all secure by itself, the fact that most people don't read the code doesn't make it less secure either.
Some enterprising young hacker would love to be able to brag that he wrote a Linux/Apache/SAMBA exploit and it's getting more attractive every day. It's the same kind of challenge that attracts people to write Windows viruses and worms. That kind of attention on *open* source code only makes it that much better for these projects, even though they use a BIFI process.
The truth is that code audits ARE done on important OSS projects by many people, experienced and not, and these audits take place over months and years. Many eyeballs make all bugs shallow. The newest Linux kernel won't be considered to be production quality for at least a year but many people will still use it for less than mission-critical things to try to break it. If you want extreme security you stick with Linux kernel 2.4 because it's seen so many eyeballs.
Comparing the two processes then isn't really valid. Projects like the Linux kernel, SAMBA and Apache follow a long, iterative open development process with many developers, testers and (maybe most importantly) users on the bleeding edge. Black box projects like Windows have internal testing, a few external test releases and a handful of outside (but not unpaid) code reviews before they are quite suddenly released into the wild to thousands of users all at the same time, updated on a monthly patch schedule. So of course the black box development style requires a more constructed and thought out process and security auditing because it can't possibly compete with open source software otherwise from a security standpoint. It doesn't seem to have the same level of iterations of development and use, nor the eyeballs on its code.
He does have a point about the rest of the smaller open source projects with fewer resources being much less secure though. Maybe those projects should be using (generally) sandboxed languages like Java or C#, instead of "vulnerable" languages like C++ (buffer overflows, etc). The authors of these smaller projects should be able to write software without having to worry about larger security concerns, don't you think? A sandbox is the way to go.
# The SWT Thread
One of the key things no one seemed to tell us about SWT was that it ran in one thread by default. This has the unfortunate side-effect of freezing the GUI completely when you call a lengthy method from an event handler. This may seem like a big duh! to some people but I don't recall Microsoft Visual Studio GUI event handlers requiring explicit threading. It seems like a definite control versus ease-of-use tradeoff.
This is exactly where a university education falls short: practical information. We covered the basics of threads in second year but even then we only brushed on it briefly, and never with a GUI. When you step in and try to use SWT in a production environment you're completely clueless. No wonder the application was thrashing so badly, geez. Now because I'm using another thread the GUI is as smooth as silk.
The mutator event listeners that I implemented all cascade on the same thread: from the mutator to the controller to the model to the view when a change is made by a mutator. You want to call the mutator methods in a new thread from a GUI event handler so that the GUI won't freeze. But when the listeners are notified that the mutator made a change and the notification gets back to the view, SWT throws SWTException because you tried to access/modify the GUI objects from a thread that is not the SWT thread -- the same thread that does the mutating also notifies the listeners and runs their handlers.
So the last listener method (which still runs on the mutator thread) in the view has to notify the SWT thread that it wants to make a change. It does this by passing a Runnable object to SWT's display object:
if (tableViewer != null)
Notice the way that Runnable interface's run() method is implemented inline. The objects that I used in the inline (tableViewer and me) must be final (constants) or the compiler will complain.
There were two separate ways I got around that. First, the me object was passed as a parameter to the listener method that contains this inline object. So I just made the parameter final in the method header. Second, the viewer changes in the object that contains this inline object, so before I make the inline object I just make a new final copy of it to tableViewer and use it instead of the non-final version.
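Here is a sketch of the same shape that runs without SWT. The single-thread executor is only a stand-in for the SWT thread (with real SWT the call would be `display.asyncExec(runnable)`), the class name `TrackListView` is made up, and a plain list plays the part of the TableViewer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TrackListView {
    // Stand-in for the SWT thread: every GUI change is queued onto it.
    private final ExecutorService uiThread = Executors.newSingleThreadExecutor();

    // Only ever touched from uiThread, like a real widget.
    private final List<String> rows = new ArrayList<String>();

    // Non-final field that can change, like the viewer in the post.
    private List<String> viewer = rows;

    // Listener method: called on the mutator's thread, not the UI thread.
    void added(final String me) {                   // parameter made final
        final List<String> tableViewer = viewer;    // final copy of the field
        uiThread.execute(new Runnable() {
            public void run() {
                if (tableViewer != null) {
                    tableViewer.add(me);
                }
            }
        });
    }

    // Reads the rows on the UI thread and waits for the answer.
    List<String> snapshot() throws Exception {
        return uiThread.submit(new Callable<List<String>>() {
            public List<String> call() {
                return new ArrayList<String>(rows);
            }
        }).get();
    }

    void dispose() {
        uiThread.shutdown();
    }
}
```

Since Java 8 the compiler also accepts locals that are only effectively final, but spelling out `final` as above works on every version.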
Note that having the asyncExec() call in the view (actually it's the TableViewer's content provider) puts all of the SWT code together in the user interface, rather than having some code below the GUI talking to the SWT thread, which would increase coupling by putting SWT objects in the application code.
If you want to see how I implemented threading in AudioMan, it's in the CVS repository under the project AudioManPoint2. There are all kinds of race conditions now because methods that should be synchronized, especially in the repository, aren't yet. I'll be working on that this week -- I just wanted to see if the asyncExec() call worked, and it does.
# Java Microbenchmarks are Evil
I tried to make a benchmark to compare returning objects vs. throwing exceptions but the Java virtual machine is a very hard thing to benchmark because of the optimizations it does. See this old Q+A for more information, and optimizations have probably improved since then.
I wanted to compare the numbers to Andrew's numbers from .NET that he wrote in my comments but they are probably skewed/optimized too.
For example I wrote two functions
private static Exception returnException()
private static void throwsException() throws Exception
and called them a million times in a loop. With the first one I assigned the result to an Exception variable inside the loop. The second I put in a try/catch block inside the loop and caught the exception. When I timed both I got around the same time (I did about 10 of each and recorded the high and low):
returnException: 3755-3766 ms
So this can make it look like there is no performance difference between throwing an exception and returning a value. Nope, not so fast. The interpreter/compiler is optimizing the returnException() call. Because it's such a small function, it's just inlining it into the loop itself and removing the overhead of having a function call (using the call stack). The second function that throws an exception is likely inlined as well.
The compiler is apparently also smart enough not to generate code for variables that aren't used, like my Exception variable that holds the result of the method call (that has since been optimized out). But that made me wonder: what's taking so long then? All you would have is an empty loop. I compared returning new Exception() to returning just boolean true and it was 1000 times slower than boolean. But apparently the allocation using new can't be optimized away, even though the variable is never used.
The whole point of testing this was to show that when an exception is thrown it has to navigate back up the call stack (cleaning up the stack as it goes) to find the correct catch block. This is what is expensive about throwing exceptions compared to calling functions, which only have to push and pop a few values on and off the stack to return and don't have to manage the try/catch logic. If the compiler optimizes the code you can't compare them fairly.
So beware of microbenchmarks, which is exactly what the five year old Q+A linked above said. The only way to fairly test the speed of code generated from an optimizing compiler is to test it in a large product. I wonder if anyone has done a return Object versus throw Exception comparison on a larger scale.
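For reference, the test described above would have looked roughly like this. Only the two method signatures come from the post; the bodies, the class name `ExceptionTiming` and the timing helpers are assumptions filled in for illustration. As the post says, the numbers it produces shouldn't be trusted.

```java
class ExceptionTiming {
    private static Exception returnException() {
        return new Exception();
    }

    private static void throwsException() throws Exception {
        throw new Exception();
    }

    // Milliseconds taken to call returnException() `iterations` times.
    static long timeReturn(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Exception e = returnException(); // result assigned but never used
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    // Milliseconds taken to throw and catch `iterations` exceptions.
    static long timeThrow(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            try {
                throwsException();
            } catch (Exception e) {
                // caught and ignored, as in the loop described above
            }
        }
        return (System.nanoTime() - start) / 1000000L;
    }
}
```

One thing worth knowing when reading the results: both loops construct a new Exception every time, and the constructor records a stack trace, so a good part of what gets measured is that rather than the return or the throw.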
# Down to Two Blog Columns
No, the blog isn't broken -- I switched the layout. I've incorporated the two right columns into one column instead. Now the main blog has more horizontal space for code and stuff. It also looks a lot better on my horizontal screenspace-challenged Mac but the light colours are still too faded out on the Mac's LCD screen. The light orange background looks almost the same colour as the white unless I tip the screen back.
That doesn't happen on my excellent Viewsonic LCD though -- the colours are very distinct. I've never been so impressed with an LCD monitor and I think I'll wait until the PowerBook's LCD looks this good before I splurge for one. Maybe it already does look better than the iBook's ... Jamie, what's the verdict?
Just as a general warning to everyone: tweaking your blog is addictive. If you have a blog you just can't stop playing with the layout. It's a serious time sink and if there's any good reason not to get a blog, this might be one of them. I'm just glad I settled on Movable Type for blogging software. If I was playing with the blog backend too I'd never have time for other projects.
# Microsoft Windows Source Leak Could Improve Security?
Dave Winer says: "Everyone's so worried about the Microsoft source leak. "It could open new security holes!" they say. But check this out, the source for Linux, a popular Microsoft competitor, has always been available, and this is promoted by its advocates saying it makes Linux more secure, not less."
That's true but Windows wasn't written to be open source. So there could be (and probably are) errors that are hidden because no one outside of Microsoft and its partners has seen the code. If security holes are being discovered regularly without the code, imagine how much easier it will be with it.
There's a whole class of developers that try to hack/break Windows just for the sheer challenge of it. The NY Times found that out when they interviewed virus writers recently. Would these guys turn into white hats to improve Windows security? Maybe some would but others would still like the "coolness" of creating a virus and seeing it wreak havoc. It's fun for them.
So sure, in the long run Windows could be more secure if the source was open. Let's say they released all of the Windows 2000 code under the GPL today. In the next few years we'd see more security exploits while all of the bugs were found by curious hackers. The nice ones would notify Microsoft and the evil ones would write damaging viruses for all of us to enjoy. The problems would eventually be fixed but it would be an incredibly painful period for Windows.
That's why Windows can never be open source even if they wanted it to be. There are just not enough Windows developer eyeballs at Microsoft to compete with all of the malicious hackers out there. Heck, Windows is closed source right now and they still can't respond quickly (see Microsoft Sits on Security Flaw for Six Months and 200 days to fix a broken Windows).
Bottom line: security by obscurity doesn't work very well. The Linux crowd has been preaching that for years. As soon as the code is released or leaked you have a major problem on your hands. The cat gets out of the bag and never returns.
Update 4:21am CNN reports that the leaked code is full of profanities. Does this surprise anyone in the software world? Not really. To outsiders though it might be publicly embarrassing to Microsoft.
# Smalltalk Eclipse IDE Presentation
James Robertson points out that the Ottawa Carleton Smalltalk User Group's next meeting will be about creating a Smalltalk IDE with Eclipse on Wednesday, February 25th at 7 PM. The presentation is by the project lead so it will probably be pretty decent.
I don't know much about Smalltalk or the Eclipse plugin architecture so it could be a great opportunity to learn about both. Anyone else interested in going? We have to RSVP.
# Java Unit Testing Exceptions with JUnit
When writing unit tests for a method you want to make sure you cover all of the boundary cases and possible inputs. Some inputs will cause the method to throw an exception and you'll want to verify that output.
Exceptions are caught using a try/catch block. You're trying out a bit of code in the try block and if an exception is thrown by a line in the try block, the catch blocks are there to catch certain types of exceptions you are interested in.
There are two types of exceptions: checked and unchecked exceptions. The differences are outlined extremely well by Josh Bloch in his book Effective Java. This book really is a must for any Java programmer and I can't recommend it enough.
Checked exceptions are for expected errors and should be dealt with programmatically. For example, the checked exception FileNotFoundException happens when you try to open a file that cannot be found. You want your code to work around this situation because users can easily enter bogus files or paths. If a function throws a checked exception, it must indicate that in its method header:
public int read(byte[] b) throws IOException
Unchecked exceptions on the other hand are for programming errors and are also called runtime exceptions. All runtime exceptions are subclasses of the RuntimeException class and do not have to be declared in the method headers of methods that throw those exceptions. They also do not have to be (and should not be) explicitly caught by the code that calls the method. Runtime exceptions should be allowed to trickle all the way up to the virtual machine uncaught and terminate the program because they indicate situations that the program is not meant to recover from. So here's a good general rule: unless you are debugging, never catch Exception because you'll catch all of the runtime exceptions too.
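To make the distinction concrete, here's a minimal sketch. The class and method names are made up for illustration and aren't from AudioMan. canOpen has to deal with the checked FileNotFoundException because the compiler insists on it, while percent throws the unchecked IllegalArgumentException without declaring anything in its header.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

class CheckedVsUncheckedSketch {

    // Checked: a missing file is an expected error, so the code works around it.
    static boolean canOpen(String path) {
        try (FileInputStream in = new FileInputStream(path)) {
            return true;
        } catch (FileNotFoundException e) {
            return false; // bogus path entered by a user: recover and carry on
        } catch (IOException e) {
            return false; // closing the stream can fail with a checked IOException too
        }
    }

    // Unchecked: a value outside 0..100 is a programming error, so nothing is declared
    // and callers are not forced to catch anything.
    static int percent(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("percent out of range: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(canOpen("no-such-file.mp3"));
        System.out.println(percent(42));
    }
}
```

If you delete the two catch blocks in canOpen the class stops compiling, which is the whole point of a checked exception. Nothing similar happens if a caller ignores the possibility of IllegalArgumentException.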
As a side note, Eclipse is great at telling you when you need to catch checked exceptions as you edit code. It even knows that subclasses of RuntimeException do not need to be explicitly caught. It's a fairly simple thing but it saves tons of time because you don't have to wait for the "oops!" at compile time.
Now, one of Bloch's good pieces of advice is "only throw exceptions in exceptional cases". Part of the rationale is that it creates weird flow in your programs, which can be hard to debug (Joel Spolsky wrote a good article on the downsides of exceptions). But the main reason is even simpler: throwing and catching exceptions is a computationally expensive operation and so it should be done rarely.
In tests however, you should catch all exceptions to make sure they are working properly. For example, my code from yesterday called a method with null to make sure a NullPointerException was thrown:
public void testSetupBuilder_Null()
Let's examine the flow. The call we are interested in is repository.setupBuilder(null);, which is surrounded by a try/catch block. If a NullPointerException is thrown by the method call, the flow will go into the catch block and hit the return; call, ending the test method (by default tests pass if no assertions are made). If NullPointerException is not thrown the catch block will be skipped and JUnit's fail() method will be called, which fails that test immediately.
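The flow described above can be sketched in plain Java so it runs without JUnit on the classpath. Everything here is a stand-in: Repository and LenientRepository are hypothetical classes, and the local fail() method only imitates JUnit's fail() by throwing an error.

```java
class ExpectedExceptionSketch {

    // Hypothetical stand-in for the repository class under test.
    static class Repository {
        void setupBuilder(Object factory) {
            if (factory == null) {
                throw new NullPointerException("factory must not be null");
            }
        }
    }

    // A deliberately broken variant that accepts null, to show the test failing.
    static class LenientRepository extends Repository {
        @Override
        void setupBuilder(Object factory) {
            // no null check: nothing is thrown
        }
    }

    // Imitates JUnit's fail(): abort the test by throwing an error.
    static void fail(String message) {
        throw new AssertionError(message);
    }

    static void testSetupBuilder_Null(Repository repository) {
        try {
            repository.setupBuilder(null);
        } catch (NullPointerException expected) {
            return; // the expected exception arrived, so the test passes
        }
        // Only reached when the catch block was skipped.
        fail("setupBuilder(null) should throw NullPointerException");
    }

    public static void main(String[] args) {
        testSetupBuilder_Null(new Repository());
        System.out.println("test passed");
    }
}
```

Passing a LenientRepository to the same test method skips the catch block and lands on fail(), which is the failing path the paragraph describes.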
Couldn't the fail() call be in the try block below the call to repository.setupBuilder(null);?
public void testSetupBuilder_Null()
//other runtime exceptions are caught by JUnit
If an exception isn't thrown the fail() call will be hit. True, but let's say hypothetically for a moment that repository.setupBuilder(); also throws the unchecked exception IllegalArgumentException. If that exception is thrown instead of NullPointerException the test will pass because we're only catching NullPointerException and the fail() is in the try block and is missed. We don't want the test to pass in that situation because the wrong exception was thrown. Having the fail() call at the end of the test method is more comprehensive.
What I wrote in the paragraph above isn't right, so I've crossed it out. The Eclipse JUnit plugin will actually catch the thrown IllegalArgumentException and fail the test, not pass it. Depending on the robustness of your unit testing framework this may work as well. Having the fail() call at the end is more explicit but allowing the exception to be thrown outside of the test will let you examine the stack trace, like I do in the next example. Thanks Andrew for correcting me in the comments.
I feel like making this post longer, so let's do another example. Let's say I have a method call that throws a checked exception but I want to verify an unchecked exception. Here's the (admittedly fictitious) method header:
public int read(File f) throws IOException
and the test method for an input of null:
public void testRead_Null() throws IOException
Notice that because we're not explicitly catching the checked IOException we have to make our test method throw it. That's OK though because in the event that an IOException occurs instead of NullPointerException the JUnit testing framework will display a proper error message. The JUnit plugin in Eclipse even shows a stack trace for you. So it's actually to your advantage to write the test methods like this when using Eclipse.
But there is another way. You can handle the IOException yourself which takes a bit more work:
public void testRead_Null()
This will print the stack trace to standard error, and then hit the fail() call at the end. I prefer the other way because it handles the checked exception better with JUnit. It might depend more on your unit testing framework and how robust it is so I've included this alternate way.
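Here are the two styles side by side as a rough sketch in plain Java, again without JUnit. The read method is a made-up stand-in for the fictitious one above, and an AssertionError takes the place of fail().

```java
import java.io.File;
import java.io.IOException;

class CheckedInTestSketch {

    // Made-up method under test: declares a checked exception,
    // but a null argument produces an unchecked one.
    static int read(File f) throws IOException {
        if (f == null) {
            throw new NullPointerException("f must not be null");
        }
        if (!f.exists()) {
            throw new IOException("cannot read " + f);
        }
        return 0;
    }

    // Style 1: let the checked exception escape so the test runner reports it.
    static void testRead_Null() throws IOException {
        try {
            read(null);
        } catch (NullPointerException expected) {
            return;
        }
        throw new AssertionError("expected NullPointerException");
    }

    // Style 2: handle the checked exception yourself, then fall through to the failure.
    static void testRead_Null_Handled() {
        try {
            read(null);
        } catch (NullPointerException expected) {
            return;
        } catch (IOException e) {
            e.printStackTrace(); // wrong exception: show it, then fail below
        }
        throw new AssertionError("expected NullPointerException");
    }

    public static void main(String[] args) throws IOException {
        testRead_Null();
        testRead_Null_Handled();
        System.out.println("both tests passed");
    }
}
```

The second style needs one more catch block and loses the runner's own reporting of the IOException, which is why the first reads a little cleaner.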
I hope this post helps people out. If you have any comments or corrections let me know.
# Code Covering ParserConfigurationException
Java 1.4 has a new library that deals with XML built into it called JAXP. In order to parse an XML document, you need a Document object. You can get a Document object from a DocumentBuilder object and you get DocumentBuilder objects from a DocumentBuilderFactory object. Yeah, I'm not sure if I like that either but it's the way they do it.
This is where my single line of uncovered code is. When you use a DocumentBuilderFactory (which you get from its static newInstance() method) to create a new DocumentBuilder and a "builder cannot be created with the configuration requested", the newDocumentBuilder() method throws ParserConfigurationException. This is a checked exception, so it should be dealt with rather than left to terminate the program.
Here was my method. How could I throw that exception and cover the catch line?
protected void setupBuilder()
I'm going to turn it into an unchecked runtime exception by catching it and throwing IllegalArgumentException. In the context I'm using the factory it would be a programming error to misconfigure it and then try to use it to create a builder. Programming errors, like using a method improperly, usually throw unchecked runtime exceptions like IllegalArgumentException.
You might be wondering why I don't just make this function throw ParserConfigurationException. Because then I'd have to explicitly use a try/catch whenever I used it. If I change the exception to a runtime exception, that is not the case. So here's what I came up with:
protected void setupBuilder(DocumentBuilderFactory factory)
The key is to give the method a parameter that will change its behaviour or output. Without an input parameter you cannot control the output and have good unit testing. Here are the test methods I wrote to completely cover the method:
public void testSetupBuilder_Null()
public void testSetupBuilder_DefaultFactory()
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
public void testSetupBuilder_InvalidFactory()
DocumentBuilderFactory factory = new DocumentBuilderFactory()
public void setAttribute(String name, Object value) throws IllegalArgumentException
public Object getAttribute(String name) throws IllegalArgumentException
The last test just implements a custom factory that throws ParserConfigurationException when newDocumentBuilder() is called. It's a pretty simple way to cover the catch block. So now I have 100% code coverage for AudioMan. Sweeeet.
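For anyone who wants to try the trick, here's a self-contained sketch of the same idea. It is not the AudioMan code: the class name is invented, this setupBuilder returns the builder instead of storing it, and a current JDK also makes you implement setFeature and getFeature in the custom factory.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

class BuilderSetupSketch {

    // Turns the checked ParserConfigurationException into an unchecked one,
    // so callers do not need a try/catch of their own.
    static DocumentBuilder setupBuilder(DocumentBuilderFactory factory) {
        if (factory == null) {
            throw new NullPointerException("factory must not be null");
        }
        try {
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalArgumentException("misconfigured factory", e);
        }
    }

    // A factory whose only job is to fail, so the catch block above gets covered.
    static DocumentBuilderFactory failingFactory() {
        return new DocumentBuilderFactory() {
            @Override
            public DocumentBuilder newDocumentBuilder() throws ParserConfigurationException {
                throw new ParserConfigurationException("always fails (test double)");
            }

            @Override
            public void setAttribute(String name, Object value) {
            }

            @Override
            public Object getAttribute(String name) {
                return null;
            }

            @Override
            public void setFeature(String name, boolean value) {
            }

            @Override
            public boolean getFeature(String name) {
                return false;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(setupBuilder(DocumentBuilderFactory.newInstance()) != null);
        try {
            setupBuilder(failingFactory());
        } catch (IllegalArgumentException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Passing the original exception as the cause keeps the stack trace of the real configuration problem available to whoever sees the IllegalArgumentException.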
# Access to Information
It's funny how ordinary sources of information when looked at differently can be a privacy concern. Take a phone book for example. Even though it connects names to addresses and phone numbers people don't worry about it. You have to know a person's name to get anywhere unless you want to search the whole book by hand.
Well welcome to the Internet age my friends. Not a lot of people know that you can enter someone's phone number and get their name and address. Or even enter their address and get their name and phone number.
The most interesting one on this page? Find out all of the names and phone numbers of the people on any Canadian street (leave the house number blank). You should open the link in Internet Explorer though -- it doesn't seem to like Mozilla.
# 1 of 1531
I've released a developer build of what I have so far for AudioMan. If you've used it before you'll notice that playlists aren't there. They are next on my TODO list.
It was a pretty painful refactoring and it took me longer than I thought it would. But I did end up with 99.93% code coverage at the end: I'm only missing 1 jcoverage line out of 1531.
# Selecting with SWT Table and JFace TableViewer
A few nasty bugs were coming up in my GUI code and they were hard to recognize as GUI errors. I'm using the model-view-controller pattern for the three views so far in AudioMan, which use JFace TableViewers. TableViewers are an abstraction around an SWT Table that allow a programmer to use an SWT Table more conveniently.
The problems I ran into dealt with having a selection on an SWT TableViewer. If an item was selected in a TableViewer and I called a method to add to or remove from the model (which would then notify the TableViewer's content provider via a listener), I would get a duplicate of the selected item which would linger around and never go away. The solution was to deselect all of the elements in the TableViewer before doing additions or removals but that has to be done against the interior Table like:
So after the operation was finished I wanted to select an item again. If I selected it with the TableViewer's select methods I would fire the selection listener attached to the TableViewer (and cause a stack overflow through endless recursion, because of the way I called my methods). To get around that I went back down to the interior SWT Table again to select the item, which doesn't fire the TableViewer's selection listener:
It's one of those things that is a "duh!" moment once you figure it out.
By the way, I'm just finishing up the unit testing on the AudioMan stuff I have so far and I should release the first developer build tomorrow afternoon.
# Is a Hashtable of <String, Vector> Threadsafe?
I have a Java question but here's some background first. A Vector and an ArrayList both implement the List interface except a Vector is threadsafe because it has synchronized methods. Likewise, a Hashtable is a threadsafe implementation of the Map interface and HashMap is its thread unsafe compadre.
ArrayList and HashMap exist because synchronized methods are expensive. So if you just need a Map and you're not going to use more than (the) one (main) thread then you can get a performance boost from HashMap.
So here's the question: If I have a Hashtable where the keys are Strings and the values are Vectors of immutable objects, is that threadsafe? I'm guessing probably not because all of the methods that manipulate this structure extract a Vector from the Hashtable, add or remove objects and then replace the Vector in the Hashtable where it was. It looks like I'm going to have to make all of the mutator methods synchronized and take the performance hit.
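Here's a small sketch of what "make all of the mutator methods synchronized" could look like. The class, its methods and the String track names are invented for illustration. Each individual Hashtable or Vector call is already atomic, but the get-then-modify-then-put sequence is not, so the whole sequence is guarded by one lock.

```java
import java.util.Hashtable;
import java.util.Vector;

class TrackIndexSketch {

    private final Hashtable<String, Vector<String>> byArtist = new Hashtable<>();

    // The lookup, the optional creation and the add happen under a single lock,
    // so two threads cannot both create a Vector for the same new artist.
    synchronized void add(String artist, String track) {
        Vector<String> tracks = byArtist.get(artist);
        if (tracks == null) {
            tracks = new Vector<>();
            byArtist.put(artist, tracks);
        }
        tracks.add(track);
    }

    synchronized boolean remove(String artist, String track) {
        Vector<String> tracks = byArtist.get(artist);
        if (tracks == null) {
            return false;
        }
        boolean removed = tracks.remove(track);
        if (tracks.isEmpty()) {
            byArtist.remove(artist); // drop empty entries
        }
        return removed;
    }

    synchronized int count(String artist) {
        Vector<String> tracks = byArtist.get(artist);
        return tracks == null ? 0 : tracks.size();
    }

    // Starts several threads that all add to the same artist and returns the final count.
    static int hammer(int threadCount, int perThread) {
        TrackIndexSketch index = new TrackIndexSketch();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    index.add("artist", "track-" + id + "-" + i);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return index.count("artist");
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 1000));
    }
}
```

Once the outer methods are synchronized, the inner Hashtable and Vector locks are redundant, so a HashMap of ArrayLists would behave the same way here as long as nothing else touches the structure.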
Speaking of threads, I'm probably going to pick up Taming Java Threads by Allen Holub soon. Has anyone used this book?
# 0.2.0 Update
I've finally squashed a major bug that has been dogging me for days now. Just when I think I have SWT figured out it teaches me a lesson. Anyway, AudioMan is using the model-view-controller design pattern now and the views are updated as files are added and removed, which is pretty neat.
I'll probably release a developer build tomorrow so people can check it out and then work on getting playlists back in for the rest of the week. I'll release AudioMan 0.2.0 once everything works pretty much like it did before the refactoring. Phew, almost done.
# AudioMan Gets a New Home
I have moved the AudioMan site to its new home: http://www.audioman.org.
I'll be writing more on the site about AudioMan's new architecture next week though a lot of it will be blogged here to work the kinks out. Judging by the response so far not a lot of you are interested in it. :)
Ah well, I'll keep writing about it anyway.
# When Does It End?
People are having a field day with the MyDoom email virus. Not only are they pumping up the animosity between SCO and the Linux community but they are also now blaming the users for their ignorance.
Who do I blame? I blame Microsoft. There's such a thing as being a responsible monopoly -- and that includes making software that people can't cut their own hands off with. Outlook should not have scripting support, period. End of discussion. It doesn't matter how many cool new whiz-bang features it enables, it's just too dangerous for the average user. Yet it's still there because it might break something if they took it out.
Ars Technica recommends patience when dealing with users that just can't learn not to open email attachments. I think rather than allow the possibility of people mass infecting one another inadvertently it should just not be possible. Why blame the users when the system is flawed? Not just the email system itself but also many of the email clients that are built on top of it, especially Microsoft Outlook.
The MyDoom virus also opens a convenient back door for other infections. Spammers could use an infected machine to send anonymous spam email or organize a distributed denial of service (DDOS) attack on the web sites of unpopular companies like SCO or Microsoft.
Never mind the spyware that gets installed without our knowledge and tracks us wherever we go, sending the data back to companies that sell it for profit.
Is this what we really want? Anti-virus, anti-spam, anti-spyware industries built around protecting an insecure operating system from viruses that exploit the latest hole, the latest bit of spying software? Where does it end? When do we wake up and say "gee, you know ... if we did this right the first time maybe we wouldn't be having all of these problems and subsequent band-aid solutions over and over".
I just don't buy the argument that says if Mac OS X or Linux were the dominant operating system, there would be more viruses for those systems. Unix architecture was made with security in mind from the beginning. It was designed to interact with other computers, not all of which could be trusted.
The same can't be said for Windows NT or Microsoft Outlook. They were both designed to sell software, not to be secure. Security takes a back seat to marketing. Even with the new Microsoft initiative to secure their operating system, viruses still occur on a massive scale, and worms from infected systems running Microsoft software ravage the Internet and slow it down for everyone, not just Windows users. Our home computers send millions of spam emails right back to us and we don't even know about it. Spyware still spies on us without our knowledge. I'm tired of companies hiding behind their EULAs.
When does it end?
Update 7:18pm The NY Times has an interesting and long article about virus writers.
# AudioMan's Controller
Things are progressing well with AudioMan but I'm going slowly making sure I don't miss any tests. I've also cleaned up a lot of the APIs for the repository and file packages that had gathered cruft.
I'm starting on the controller part of the program now. If you remember the diagram from a few days ago, the controller is told by the GUI what to put in the models, which are in turn displayed by the GUI. So the controller controls what goes in the models and that's what the GUI displays (the view).
Remember when I was talking about updating the models from the mutators? Well this didn't make much sense. The controller updates the model, so the mutator should tell the controller something happened and then the controller will respond to that.
So far in the controller I've invented an object called TrackFilter based on the FilenameFilter interface in the Java libraries. As an aside, it's useful to read about these libraries because they can give you good ideas for your own programs. Anyway, the TrackFilter constructor takes three parameters: playlist, artist and album. It also has a method named boolean accept(AudioData), which takes a track as input and decides whether or not it matches the criteria given in the constructor.
So for example, if the GUI wants to display all of the Ricky Martin *cough* songs I have in my entire collection, I'll make this object:
TrackFilter tfRicky = new TrackFilter(IRepository.COLLECTION, "Ricky Martin", null);
and pass it to the controller. The first parameter is the playlist, in this case the whole collection (as a constant). The second parameter is the artist and the third is the album (where null is "all").
The first thing the controller will do is clear the artist, album and track models. Then it will ask the repository for all of the Ricky Martin tracks in the collection and add them one by one to the models.
So let's say I want to add another Ricky Martin track while the view is already displaying Ricky Martin tracks. The view needs to be updated by the models which are updated by the controller to show this new track. The mutator, which does additions and removals from the collection, will tell the controller that a new track has been added. The controller then uses the boolean accept(AudioData) method of the TrackFilter that was passed to it to determine whether or not the models should be notified about that new track being added. If so, it adds the track to the models.
So while a view is being displayed it doesn't have to be completely reloaded if an addition or removal occurs. If the track passes the controller's filter it will be added to the models and appear in the user interface.
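Here's roughly how the filter and controller fit together, as a simplified sketch rather than the real AudioMan classes. AudioData is reduced to three fields, COLLECTION is a local stand-in for IRepository.COLLECTION, the repository is just a List, and a single track list stands in for the artist, album and track models.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class TrackFilterSketch {

    // Local stand-in for IRepository.COLLECTION: "every track, whatever playlist".
    static final String COLLECTION = "*collection*";

    // Simplified track record; the fields are assumptions for this sketch.
    static final class AudioData {
        final String playlist;
        final String artist;
        final String album;

        AudioData(String playlist, String artist, String album) {
            this.playlist = playlist;
            this.artist = artist;
            this.album = album;
        }
    }

    static final class TrackFilter {
        private final String playlist;
        private final String artist; // null means "all artists"
        private final String album;  // null means "all albums"

        TrackFilter(String playlist, String artist, String album) {
            this.playlist = Objects.requireNonNull(playlist, "playlist");
            this.artist = artist;
            this.album = album;
        }

        // Decides whether a track matches the criteria given in the constructor.
        boolean accept(AudioData track) {
            return (COLLECTION.equals(playlist) || playlist.equals(track.playlist))
                    && (artist == null || artist.equals(track.artist))
                    && (album == null || album.equals(track.album));
        }
    }

    static final class Controller {
        final List<AudioData> trackModel = new ArrayList<>();
        private TrackFilter filter;

        // Clear the model, then reload it with the tracks that pass the new filter.
        void show(TrackFilter newFilter, List<AudioData> repository) {
            filter = newFilter;
            trackModel.clear();
            for (AudioData track : repository) {
                if (filter.accept(track)) {
                    trackModel.add(track);
                }
            }
        }

        // Called by the mutator after an addition: no full reload needed.
        void trackAdded(AudioData track) {
            if (filter != null && filter.accept(track)) {
                trackModel.add(track);
            }
        }
    }
}
```

Removals would work the same way in reverse: the mutator tells the controller, and the controller only touches the model if the removed track passes the current filter.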
# CD Run
Since I'm coding all day now, I'm listening to a lot of music. I needed to replenish my supply so I picked up:
I've also recently picked up:
They are both great albums. DCFC is emo-like, but still with a punk rock influence. Yellowcard is a punk band with a violinist ... a really unique sound.
Oh, and CD Warehouse in Kanata overcharged me by about 7 bucks. I only noticed when I checked the receipt in the parking lot and went back in for a credit. Of course the CDs were scanned into a computer, where the prices are updated by humans. You should always check your receipts -- errors happen all the time.
# Making a Listenable Model Out of Vector
To implement the model-view-controller (MVC) design pattern for AudioMan I need a model. The Vector class is good enough but I also need to be able to attach listeners to a Vector so that the user interface (the TableViewer's content provider) is notified when the model changes. To do that I needed to subclass Vector and add the notification code.
The problem with subclassing Vector is the number of mutator methods it has: well over a dozen. Mutator methods change the state of an object. So if I have listeners attached to my subclass of Vector, every time one of those mutator methods is used I have to notify the listeners. Overriding all of those mutator methods would be a big PITA.
So I chose three methods: add(Object), remove(Object) and clear(). I would only use these three methods to mutate my new class VectorModel, a subclass of Vector. I override every other mutator method and make it throw UnsupportedOperationException.
I got this idea from the Collections class, which contains handy static methods for manipulating Collections. In it there are a bunch of methods that return unmodifiable (read only) versions of Collections. For example:
public static List unmodifiableList(List list)
takes a List as input and returns an unmodifiable version of that List. They do it by wrapping the List inside of another List that has all of the mutators throwing UnsupportedOperationException.
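A quick way to see that behaviour for yourself (the class and method names here are just for the demonstration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableSketch {

    // Returns true if adding through the read-only wrapper is rejected.
    static boolean rejectsAdd() {
        List<String> tracks = new ArrayList<>();
        tracks.add("one");
        List<String> readOnly = Collections.unmodifiableList(tracks);
        try {
            readOnly.add("two");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // The wrapper is a view: changes made to the wrapped list show through it.
    static int sizeSeenThroughWrapper() {
        List<String> tracks = new ArrayList<>();
        List<String> readOnly = Collections.unmodifiableList(tracks);
        tracks.add("one");
        tracks.add("two");
        return readOnly.size();
    }

    public static void main(String[] args) {
        System.out.println(rejectsAdd());
        System.out.println(sizeSeenThroughWrapper());
    }
}
```

Because the wrapper is only a view, whoever still holds the original List can keep changing it. The wrapper blocks mutation through itself, nothing more.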
To complete VectorModel I need the code to add and remove listeners and to notify the listeners. Java has a handy class for managing a list of listeners called EventListenerList.
private EventListenerList listenerList;
I wrote public addListener and removeListener methods like:
public void addListener(ModelListener listener)
public void removeListener(ModelListener listener)
though you may want to do more parameter checking (I've removed it for simplicity). You may have noticed a new class in the above code: ModelListener. It's an interface that extends the EventListener interface, used for event handling. In that interface you write abstract method headers for the events you want to notify the listeners about. These usually correlate with the mutator methods. For example, mine were:
public interface ModelListener extends EventListener
When you mutate the VectorModel you notify the listeners by iterating through the list of added listeners, like so:
EventListener[] listeners = listenerList.getListeners(ModelListener.class);
for (int i = 0; i < listeners.length; i++)
The classes you want listening in turn have to implement the ModelListener interface and write some code for the three methods. In my case that was JFace's TableViewer content provider. The TableViewer class has methods to add and remove objects so this was fairly simple.
The last piece of the puzzle is the ModelEvent class, which gets sent with the event to the listener. When I notify the user interface that some object has been added I want to send that object, so that the user interface knows what to add. The ModelEvent constructor takes the object in question as a parameter.
public ModelEvent(Object source, Object item)
The listener then extracts the added or removed object from the immutable ModelEvent object by using the getItem() accessor method.
So that's how I'm doing my model. Any questions or suggestions?
Update 10:12 PM Yeah, it's not quite that simple. The Vector class has a lot of methods that do the same thing, left over from the old days when it didn't implement the List interface. To simplify their lives the implementors of Vector just used other methods sometimes. So I can't use remove(Object) if I override removeElementAt(int) and make it throw UnsupportedOperationException because remove(Object) uses that method. I had to go right to the top and see which methods didn't call any other methods. Those ones turn out to be:
public synchronized void addElement(Object o)
So why do I have two remove...() methods? Well I need to be able to remove objects by reference, but I don't want to have to call indexOf(Object) every time. The other method, removeElementAt(int) actually does the removal and is called by removeElement(Object). So now I have to test it too. Oh joy.
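Pulling the pieces of this post together, here is a compact sketch of a listenable Vector. It is an approximation, not the AudioMan source: the listener method names (itemAdded, itemRemoved, cleared) are guesses, and it sidesteps the problem from the update by overriding only add(Object), remove(Object) and clear() and leaving every other Vector method alone instead of making it throw.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;
import java.util.Vector;
import javax.swing.event.EventListenerList;

class VectorModelSketch {

    // Carries the added or removed object to the listener.
    static class ModelEvent extends EventObject {
        private final Object item;

        ModelEvent(Object source, Object item) {
            super(source);
            this.item = item;
        }

        Object getItem() {
            return item;
        }
    }

    // Method names are assumptions; one per mutator.
    interface ModelListener extends EventListener {
        void itemAdded(ModelEvent e);
        void itemRemoved(ModelEvent e);
        void cleared(ModelEvent e);
    }

    static class VectorModel extends Vector<Object> {
        private final EventListenerList listenerList = new EventListenerList();

        public void addListener(ModelListener listener) {
            listenerList.add(ModelListener.class, listener);
        }

        public void removeListener(ModelListener listener) {
            listenerList.remove(ModelListener.class, listener);
        }

        @Override
        public synchronized boolean add(Object item) {
            boolean changed = super.add(item);
            if (changed) {
                ModelEvent event = new ModelEvent(this, item);
                for (ModelListener l : listenerList.getListeners(ModelListener.class)) {
                    l.itemAdded(event);
                }
            }
            return changed;
        }

        @Override
        public synchronized boolean remove(Object item) {
            boolean changed = super.remove(item);
            if (changed) { // only notify when something was really removed
                ModelEvent event = new ModelEvent(this, item);
                for (ModelListener l : listenerList.getListeners(ModelListener.class)) {
                    l.itemRemoved(event);
                }
            }
            return changed;
        }

        @Override
        public synchronized void clear() {
            super.clear();
            ModelEvent event = new ModelEvent(this, null);
            for (ModelListener l : listenerList.getListeners(ModelListener.class)) {
                l.cleared(event);
            }
        }
    }

    // Exercises the model and returns what a listener saw, in order.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        VectorModel model = new VectorModel();
        model.addListener(new ModelListener() {
            public void itemAdded(ModelEvent e) { log.add("added " + e.getItem()); }
            public void itemRemoved(ModelEvent e) { log.add("removed " + e.getItem()); }
            public void cleared(ModelEvent e) { log.add("cleared"); }
        });
        model.add("a");
        model.add("b");
        model.remove("a");
        model.remove("missing"); // not in the model, so no event
        model.clear();
        return log;
    }
}
```

The catch with this lighter approach is that mutators like addElement or removeElementAt still work and change the Vector without telling any listeners, so it relies on discipline instead of UnsupportedOperationException. Wrapping a private Vector instead of subclassing it would close that gap.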