I'm Ryan Lowe, a Software Engineering graduate living in Ottawa, Canada. I like agile software development and Ruby on Rails.
I write this blog in Canadian English and don't use a spell checker. Typos happen.
# Extreme Programming (XP) Artifacts
What's an artifact? A software artifact is a document or diagram (like UML) describing code. Depending on the level of detail, artifacts can be a communication tool between developers and customers. Most often though, they are a communication tool from developers to other developers and aren't written in a style that customers can understand. In extreme programming (XP), user stories and other XP practices fill this function and replace the need for artifacts.
XP says that maintaining artifacts is A Bad Thing. You have to spend the time keeping them up to date, which is a lot of work in a lightweight process that encourages change, refactoring and code turnover. This weighs down your lightweight process and takes time away from implementing new stories and fixing bugs.
But managers want to hedge their bets. What if a developer on the team dies or gets in an accident or gets sick? How can you maintain the code if the design for that iteration is in his brain and not on paper? XP solves this problem using several complementary practices.
The beauty of not having code ownership (where one person owns a component and maintains it) is that everyone knows a little something about the whole system. Sure, there are experts on your team who write a lot of database code, and you pair up with one of them when a story needs database work. But when you pair up, the expert shares some of that database knowledge with you. He still knows more than you, but now you know something about the database.
XP code is written in a way that it is maintainable. Developers do this because of the nature of XP: they never know when they will have to go back to that code and change it. So there is a lot of emphasis on creating readable, quickly understandable and easily modifiable code. Now, don't confuse easily modifiable with flexible. Flexible often means more design than you need, and violates YAGNI. XP does not deliver flexible code in that sense. Anyway, I digress ...
The unit testing that is done in XP is another self-documenting feature. If you want to figure out how a method is used, grep for the tests that use it.
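To make the "tests as documentation" idea concrete, here is a minimal sketch in Python (used only for brevity). The `parse_track_length` helper and its behaviour are invented for this example. The point is that reading the test names and bodies tells you how the method is meant to be called and what it does with bad input.

```python
import unittest


def parse_track_length(text):
    """Convert a 'minutes:seconds' string such as '3:45' into total seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


class ParseTrackLengthTest(unittest.TestCase):
    # Each test reads like a line of documentation for the method.

    def test_minutes_and_seconds_become_total_seconds(self):
        self.assertEqual(parse_track_length("3:45"), 225)

    def test_text_without_a_colon_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_track_length("225")
```

Searching the test tree for `parse_track_length` turns up both the normal call and the failure case without anyone having written a separate document.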
If you're new on an XP team, these features of XP plus pair programming will help you get in the loop faster. As a new team member you can pair up with an experienced team member as they show you the ropes. The best way to do that is to start driving the keyboard right away. You learn more by doing than watching. It makes implementing the task slower (and if you were assigned the task, you should adjust the estimate if you are paired with a newbie), but the learning process is quick for the new guy and everyone wins.
Maybe you don't buy all of these XP arguments. I'll admit it, I don't either. Sometimes I just want to see a high level architecture diagram/spec and identify software patterns. XP does not produce these documents for the reasons I wrote above.
However, like I said earlier in my post about XP quality, the customer rules. If the customer wants to hedge his bets with a high level architecture diagram, then make a story for it. The customer pays (literally) for every story he gets implemented. So if he wants to spend the money on an architecture diagram, he can.
When's the best time to make an architecture diagram? Well the XP philosophy doesn't have exact release schedules. In fact, you should be able to use the contents of the source code repository without any (major) problems at any time. Pretty demanding eh? That's the kind of quality XP demands from its developers, and why there is so much focus on testing and iteration. No broken windows allowed (unless by accident, we're all human).
Realistically though, especially if you are working on a COTS product, your users like to see a release schedule. So you might release to your customer (and maybe a subset of your users) after every iteration (length of two weeks) but make a "stable branch" in your source code repository only every 4, 6 or 12 months. Then you release this stable branch to your users. It really depends on how much your users can handle change.
If they are computer savvy they'll probably be drooling for new features and want new releases as soon as you can deliver them. If they are luddites, they'll never want to change and will only change when you make them update (i.e. if you drop support for an old version or introduce a new feature they desperately need). These are the same people still running Windows 95 because, well ... it works for them. So you have to take your users into account -- and ask them yourself, don't take your customer's word for it.
Back to artifacts: a good time to make an artifact is when you do that stable branch. That branch will go into "maintenance mode" and not change much after that. Maintenance will be bugfixes, not refactorings or new features. True, branching is not part of "pure XP" but it's creative license I think you can take for a COTS product that needs a maintenance branch. Just after the stable branch the customer can make a story to request a high level diagram and spec for that branch. Since the branch isn't going to change much there's less artifact maintenance to slow the bugfixing process for that branch. Depending on how high level the artifacts are, the bugfixes may not even impact your artifacts.
A completely different option is to put the watching pair programmer to work while the other guy is driving. XP advocates getting the watching guy to think about high level concerns anyway, so why can't he be updating a wiki with a high level representation of the code in it, maybe even a UML model? This slows down every task implementation because the guy riding shotgun is distracted but it's another option if you want artifacts with XP.
Disclaimer: this is my understanding of XP and I don't speak for the official XP guys. If there are mistakes in this blog post, please correct me in my comments. Thanks!
# Ottawa Inline Skating Trails Report
OK folks, the sweepers have not been out on the bike paths yet. But I have seen them on the roads and sidewalks, so they are bound to be out soon. The path on the east side of the Rideau Canal is quite bad for gravel/sand, but I did manage to make it from Hog's Back to just past the Canal Ritz (on the east side, naturally). After that, the path is under construction: they are replacing the wall of the canal and the railing, which was falling apart and causing the bike path to sink in places.
My bro reports that the trail on the Quebec side of the Ottawa River (called the Champlain Trail, he tells me) still has snow on it. Not so good for rollerblading but excellent for cross country skiing. I take that trail west from the Champlain Bridge past the beach in Aylmer until it ends, which is about 20k for the round trip. The trail also goes further east all the way to Hull apparently but I've only been a few kilometers that way. The Champlain Trail is great because it's in the woods -- you're not near traffic and you only cross roads a few times. It stays close to the Ottawa River but it's not a great trail for "the view" though ... all you see is trees. :)
A better trail for a view is the trail on the south side of the Ottawa River from the Champlain Bridge east towards downtown (there's parking at the Champlain Bridge). You can even go all the way to the base of the Rideau Canal locks on that trail, which is cool. Great views of the cliffs, the new war museum and Hull. The trail is in decent shape, but isn't as good as the Champlain Trail on the Quebec side.
Today I skated from near the Hurdman bus stop to Sussex Drive on the bike path on the east side of the Rideau River. It's in amazing shape for this time of year, and looks fairly new -- nice and smooth for inline skating. Until the other trails are clear, this is the one to take. You can park near Billings Bridge or in the Du Barry parking lot at Montreal Road, both close to this path.
# Extreme Programming (XP) Quality
I've been reading about extreme programming (XP) quality lately. First off, for people who are unfamiliar or only a little familiar with XP, it may seem like an ad hoc process. How can quality be important in an ad hoc process? That question rests on a false assumption: quality is actually often more important in XP than in traditional software development. XP is not ad hoc, it's lightweight.
Quality comes from a few different places in XP:
The reality is that most of us are not going to be building NASA rocket engines or airplane flight control systems. These types of systems demand a level of quality and peer review that XP cannot possibly deliver (that's just my personal opinion). Waterfall and processes like the CMM have a place in constructing these systems. Our friends over at Motorola India have it down pat. Personally, I have no interest in engineering these systems.
Unfortunately for the rest of the software development world these heavyweight processes are not agile. They cannot respond rapidly to changing market conditions or bugs found in the platform or other factors. They also take a long time to produce a working product that users can poke at because all of the design work has to be done up front in people's brains and on paper. If they deliver something that the customer doesn't want, the developers won't find that out until the customer has the cell phone in their hands saying "this sucks!" They can put the feedback on that product into the next model, but the damage is done. That kind of investment in code is risky.
So the first step we as software engineers have to do involves two admissions:
1. All software contains defects
This may seem pretty bizarre, coming from a profession that wants to be known as "engineering" but the truth is this: the defects that we can ignore have little to no impact on the system. We will fix them in a priority that suits the customer, mixed with new features. As well, bugs or defects are usually just deviations from the requirements. If your requirements are wrong in the old process, you have to go back up the waterfall and fix them. In XP you just make a new user story, change the acceptance tests, fix it and move on.
You should know though, that this is just my personal opinion. The XP literature rarely admits that unimportant bugs can be sitting in the code for a long period of time but I think it's an obvious assumption given the XP process: if the customer never tells you to fix a bug, it will never get fixed. My theory is that the XP guys don't like to admit that maybe the process has flaws like implicitly allowing bugs. Of course it does: the flaws are the people, the engineers building the system. We make mistakes all the time, and the process we use has to be able to manage these mistakes in a graceful way. Changing a bunch of artifacts up the waterfall is painful and takes time -- waterfall doesn't account for mistakes made by humans, it punishes them. At every step you have to be perfect and know exactly what the customer wants. Iteration is a lot less painful.
In the agile world, if a bug is important enough the customer can vote to get rid of it in as quickly as one iteration (usually two weeks). Ultimately quality is up to the customer in an agile process, so it's their call -- but don't worry, they won't let the quality get too bad. Remember that the customer has a stake in great quality too. The ability for the customer to make the call about quality is what makes agility so great.
Let's not forget about new features -- they go in on demand every two weeks as well. The customer gets her most important features sooner, reducing the risk that the implementation may not solve the business problem at hand. The feedback from that feeds right into the next iteration. There isn't much connection to the past, just to the present: not "oh great, the current code doesn't match design document X or interaction diagram Y, we have to go back and fix it," just "let's change the code and tests to work like the customer wants it NOW." There are no artifacts to go back and change. Remind me to talk about artifacts in XP some time.
Bugs are a funny thing in XP. Because of the high code turnover rate, a bug you may know about that has been in the system for a few months may be nuked by a refactoring for a new feature at any time. You never know what the customer is going to request next, you are not a mind reader, so don't even try. Why would you spend the time to fix a bug that will just be nuked? If the bug never affects the customer in a big way, or a workaround is manageable, then everything is cool for them. They may not want to have to pay to fix it if it doesn't impact them that much.
If the code for that area is relatively stable and the customer thinks the defect is important enough, sure, you can write a user story to fix it and she'll prioritize it so you can go in and fix it. The customer rules in XP.
So a few conclusions: the software most of us write does not have to be spaceship quality. Just hold yourself down and get yourself to admit that we aren't sending people to the moon. Defects are prevented in XP by additional testing not found in regular software development BUT the iterations and process ensure that when a defect is found it can be fixed quickly. Responsiveness is more important! And finally: in XP the customers have more control. As a customer, wouldn't you like to have more control over where your money goes and what the software does? Yeah, I thought you would.
Disclaimer: this is my understanding of XP and I don't speak for the official XP guys. If there are mistakes in this blog post, please correct me in my comments. Thanks!
# Eclipse 3.0 M8 Comments
When Eclipse starts for the first time it asks you where your workspace is or should go. Windows users don't have to set their workspace at the command line and Mac users don't have to dig into configuration files. Very nice.
The SWT people have improved the SWT Table to allow for virtual tables. As I understand it, the model data will be with the table but TableItem objects (rows) will only be created as they are needed (when the row is displayed in the UI), which is great for large tables. Seems like it will be done by M9 and I'm anxious to see if it will improve performance for tables in AudioMan, which start getting slow to load after a few hundred rows.
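Here is a rough sketch of the idea as I describe it above, written in Python for brevity. This is not SWT's API: `VirtualTable`, `make_row` and `rows_created` are names invented for the illustration. The model holds every record, but the (presumably expensive) row object is only built the first time a row is asked for.

```python
class VirtualTable:
    """Holds the full model but builds a row object only when it is first shown."""

    def __init__(self, model, make_row):
        self._model = model        # all of the data, cheap to keep around
        self._make_row = make_row  # the expensive step, e.g. building a widget
        self._rows = {}            # rows created so far, keyed by index

    def __len__(self):
        return len(self._model)

    def row(self, index):
        # Create the row lazily and remember it for next time.
        if index not in self._rows:
            self._rows[index] = self._make_row(self._model[index])
        return self._rows[index]

    def rows_created(self):
        return len(self._rows)


songs = ["Song %d" % i for i in range(10000)]
table = VirtualTable(songs, make_row=lambda title: {"text": title})

# Only the 20 rows that fit on screen are ever built.
visible = [table.row(i) for i in range(20)]
```

With 10,000 songs in the model, only the rows that were actually requested exist, which is where the load-time saving for a large table would come from.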
The new look and feel of Eclipse has been toned down a bit. Feedback on the new M7 look and feel was very mixed -- people either loved it, hated it or took a while to get used to it. I'm still not used to it but I haven't used it that much yet. It didn't help that Ant support in M7 was broken and I had to go back to M6. If you think the Windows look and feel is weird, you should see it on the Mac! I'll give them a break though -- it's a work in progress. At least they are listening to their users' feedback and have responded in less than six weeks: that's the most important thing. This is why regular incremental releases and iterative development are a good thing: your customers get what they want, and quickly.
When I try to use the Ant plugin in Eclipse 3.0 M8 it is still broken out of the box. If the Ant plugin comes with the junit JAR file why can't it find the junit task without any help? Seems like basic seamless integration to me but they are taking the stance that developers will manually add junit.jar if they require it. The Ant plugin could watch for optional tasks in the build.xml file and pull the JAR files that it already has in automatically as they are required. That would be a lot more friendly. I understand the stance that they can't pull all of them in by default because it would be a waste but a little code intelligence might solve this problem.
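The "little code intelligence" could be as simple as scanning the build file for optional task elements. Below is a minimal sketch in Python, not the Eclipse Ant plugin's actual code: the `OPTIONAL_TASK_JARS` mapping, the `jars_needed` function and the sample build file are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from optional Ant task names to the JARs they need.
OPTIONAL_TASK_JARS = {"junit": "junit.jar"}


def jars_needed(build_xml_text):
    """Return the JARs required by optional tasks found anywhere in the build file."""
    root = ET.fromstring(build_xml_text)
    found = {
        OPTIONAL_TASK_JARS[element.tag]
        for element in root.iter()
        if element.tag in OPTIONAL_TASK_JARS
    }
    return sorted(found)


# An invented build file that uses the junit task inside a target.
BUILD_XML = """
<project name="audioman" default="test">
  <target name="test">
    <junit printsummary="yes">
      <test name="AllTests"/>
    </junit>
  </target>
</project>
"""
```

Calling `jars_needed(BUILD_XML)` reports that `junit.jar` is wanted, while a build file with no optional tasks gets an empty list, so nothing extra would be pulled in by default.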
Update 7:12 PM After reading more comments on bugs, it seems that while many people don't like the new look and feel the Eclipse UI team may be resisting either 1) a change back to the old native look and feel or 2) supporting both look and feels because either option would be a lot of work and/or complicate Eclipse. They are currently gauging support for either option with Bugzilla votes.
Probably the worst part about this new look and feel is that there was no warning of it (at least not outside of their bug repository) or solicitation of comments. If they had proposed the new look and feel before committing the resources to implementing it they may not have had to backtrack like they may do now. The concerns are especially true for people already invested in the Rich Client Platform which uses the new look and feel. The decision to develop the new look and feel without input from everyone was risky -- I wouldn't blame (not that they are) the users (whether they be Eclipse developers, users of SWT, or users of RCP) for giving them something they didn't want.
Update Wed 5:55 AM: the Eclipse team responded to feedback from changes in M7 with these comments.
# Coke vs Coffee
I've stopped drinking Coke for a few reasons: acidity, carbonation and sugar. Of course I can't go without caffeine so I started drinking coffee instead. Here's a general comparison (numbers extrapolated from Tim Horton's web site) between a 600ml bottle of Coke and a large double cream double sugar coffee from Tim's:
Coffee packs a better punch, has less sugar but more fat and fewer calories overall. Both probably stain your teeth equally well though ... and let's not forget about coffee breath. Maybe I should move on to caffeine mints. heh
# Strong vs Weak Typing
As someone who has used strongly typed programming languages for most of his short career, I'm finding it hard to grasp the advantages of weakly-typed ones.
Paul Vick makes a good point when he says that generics make strongly typed languages even stronger, and this seems to go straight against the new wave of weakly typed languages like Python and older ones like Smalltalk.
It seems somewhat related to the enabling vs. directing thread that was going around recently. Strongly typed languages direct you down an inflexible path: you may only assign a value of type A to a variable of type A or one of its superclasses or interfaces. Once you do that it may be painful to change the type later in the code because you have to edit all of the types. Weakly typed languages enable agile processes where a value's type may change often and variables don't have type. Good unit testing with weakly typed languages prevents programmers from making mistakes with types.
Refactoring support in IDEs like Eclipse digs into this advantage of weak typing though. If I can change the strong type of a variable everywhere in the code with one action then isn't this the same difficulty as changing a weak type except now I have a more rigid type contract? If the tools enable the same type flexibility as weakly typed languages wouldn't a strongly typed language be better?
Maybe I need a few examples. I haven't seen the light yet.
Update 7:30 PM as James Robertson points out I'm way off base here. That'll teach me for thinking out loud. :) I won't correct my post though -- it's already been linked to.
Yep, I really meant to ask: what is the advantage of dynamic typing over declarative typing?
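One commonly cited answer, and only one of several, is that a function in a dynamically typed language can accept any object that has the right attributes without a shared declared type. A small Python sketch, with class and function names invented for the example:

```python
class Mp3File:
    def __init__(self, seconds):
        self.seconds = seconds


class CdTrack:
    def __init__(self, seconds):
        self.seconds = seconds


def total_seconds(tracks):
    # No declared parameter type: anything with a .seconds attribute will do.
    return sum(track.seconds for track in tracks)


# The two classes share no superclass or interface, yet both work here.
mixed = [Mp3File(200), CdTrack(180)]
```

The trade-off is the one raised above: passing something without a `seconds` attribute is not caught until the code runs (it raises `AttributeError`), which is why the unit tests carry more of the load.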
# Urgent My Foot!
So I get a letter in the mail from "AppleCare Warranty" in Markham, Ontario. On the outside of the letter, just above the destination address is: Urgent! Open Immediately.
Of course a person would be made curious by such an envelope, and I opened it immediately. So what was it? My AppleCare Protection Plan Certificate and the AppleCare Terms and Conditions. WTF? How is that urgent? I already have the warranty, I don't need a piece of paper confirming it. And of course my Dad was in a mad rush to give it to me: "You have urgent mail here from Apple! Your iBook will explode if ...!"
Reminds me of people misusing Outlook's urgent mail flag: !. I have a short attention span, just like everyone else. The envelope worked. But why, dear Apple, must you exploit this vulnerability?
Soon every piece of mail will be labeled Urgent! and I'll never be able to tell which mail is really urgent (none of it) and which mail is actually ... well, not urgent. Can you imagine the anarchy this will cause? I might actually have to open every piece of mail I receive.
Apple's warranty department probably uses the same envelopes for new warranty certificates and warranty expiry notices. Having a warranty expire on you might be urgent. But can't a company like Apple afford two different envelopes? :)
Wow, I think I actually wasted half an hour thinking about this. I need a hobby .... oh, wait.
# AudioMan's Repository to Use a Database
I think you guys are right: I've hit the tipping point with the AudioMan repository. Using objects and the Java Collections Framework was good to start but now the features I need, like thread safety, would take too much custom development and maintenance time to be practical.
So I will concentrate on getting a database in there for the next little while. The most promising database is HSQL, formerly Hypersonic, a database completely written in Java that supports SQL. Thanks Jim, for finding it.
Kibbee also suggested Xindice because AudioMan currently uses XML. That's true, but the XML is only in the repository implementation and would be completely replaced by a new database. So Xindice would have to offer compelling features over HSQL besides its XML buzzword compliance (now with 50% more XPath!).
If you find any other databases that would work well with AudioMan, let me know. Here's what the ideal database would look like:
1. Callable from Java but still performant
Of course, easy and small are relative terms ... but they are something to aim at.
Preferably the user shouldn't even be aware that AudioMan uses a database. Apparently iTunes uses a database and exposes XML files to the file system transparently. Obviously AudioMan can't do that but it gives you an idea of the kind of seamless integration that joe sixpack users like.
Remember that AudioMan is an end-user application targeted to regular people. We can't have them installing or configuring a database themselves. The installation should be so painless that they don't mind doing it often (like Mozilla's installer: download... click-click-click-click-done).
# Agile Data Serialization Management
Admittedly I don't know much about databases, so I'm looking for some input in this area. I do know a bit about agility though, and using a database for AudioMan concerns me on that level. I'm looking specifically for pointers to something like agile database development concepts.
Databases are a persistence mechanism, like regular files, and are basically a form of object serialization. You take an object and put it into permanent storage in some known format. Then you can go back to the storage and retrieve the object you put in from that known format.
The problem with agile systems is that the objects are changing much more often than in a system made with a non-agile big design up front (BDUF) process like waterfall. When you serialize objects they must be in a format that can be understood by some parser to deserialize them. If you serialize in format A and deserialize in format A' after a refactor in an agile process, you have breakage. So the serialization mechanism also needs to be agile.
Databases can be the same way. If you add or remove columns in a table the code can be adversely affected and break. You can detect these breakages with good unit testing -- that's not much of a problem. But the issue is creating a database schema that promotes agility. You don't want to be prevented from adding new features or refactoring when you need to because you are hamstrung by a restrictive database schema.
There is also the issue of updating the storage data on the user's machine, whether it be a database or a file format. In an agile system you need to be able to convert an old iteration's serialization or database format to the new one fairly painlessly. This is all part of maintenance and allowing for agility. If you cannot easily convert the formats, updating them will be too much of a pain for the users and the users will actively resist change. How can you get their feedback in a timely manner if they constantly resist change? In an agile process, you want your users to actively encourage change based on their feedback! Having a painless and transparent update to the serialization format is ideal.
There are probably many agile projects that have to manage this issue but one that I'm familiar with is the Eclipse project and its workspace directory. Eclipse has a one-way conversion tool that runs whenever you install a new version of Eclipse and it converts the old workspace data into the new serialization format used by the new version. You can't go back to the old format from the new format but that is rarely an issue. You could always just back up the old workspace before you update Eclipse and then restore it from backup if you needed it.
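One common way to get that kind of painless, transparent update is to stamp saved data with a format version and chain small one-way upgrade steps. The sketch below is in Python for brevity and is only an illustration under assumptions: the record fields, the version numbers and the two format changes are invented, and this is not how AudioMan or Eclipse actually store data.

```python
import json

CURRENT_VERSION = 3


def _v1_to_v2(data):
    # v2 renamed "name" to "title" (an invented change for this example).
    data["title"] = data.pop("name")
    return data


def _v2_to_v3(data):
    # v3 added a play count, defaulting to zero for old records.
    data["play_count"] = 0
    return data


# One small one-way step per format change, keyed by the version it upgrades from.
UPGRADES = {1: _v1_to_v2, 2: _v2_to_v3}


def load(text):
    """Parse a saved record and upgrade it, one version at a time, to CURRENT_VERSION."""
    data = json.loads(text)
    version = data.get("version", 1)
    while version < CURRENT_VERSION:
        data = UPGRADES[version](data)
        version += 1
        data["version"] = version
    return data


old_file = '{"version": 1, "name": "Blue Train"}'
upgraded = load(old_file)
```

Each refactoring that changes the format adds one small function to `UPGRADES`, so a user who skipped several iterations still ends up on the current format without doing anything.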
The Eclipse workspace directory contains all of the projects you're working on and all of your preferences, as well as local change history and lots of other working data. Given the size and scope of the preferences and other features that use the workspace directory, it seems like the Eclipse project would be a good case study in agile data serialization management. They seem to have it figured out pretty well.
The workspace stuff seems to be in the Eclipse "core" component.
# Repository Locks
I finally split up AudioMan's repository into four manageable chunks: core, reader, mutator and status. The reader gets data from the repository, the mutator (writer) changes the repository and status just checks certain conditions without writing or returning data. The core component is a unifying component for the other three and its only methods save and restore the repository from hard disk.
Splitting it up like this makes each component smaller and more manageable, true. But the main reason I did it is to introduce the repository locking mechanism I talked about before.
Databases (and also files) usually have rules for reading and writing. During a write to the database, which could be several steps, no other write or read operations should occur because the database could be in an inconsistent state. So the database will lock before the write and unlock after it. This is important in a multi-threaded or multi-client environment where more than one thread may be reading or writing. Multiple threads reading the database, however, can happen concurrently without any problems since the data is not changing.
So in the case of AudioMan, I need a lock for the write operation. Before the write the lock is acquired and after the write it is released. The lock will prevent other threads from reading or writing while the database is being written to.
But I will also need a lock for the read operation. The writing operation will have to acquire the write and read locks before it is allowed to write. The read lock will be shared by all threads that are currently reading the database, possibly in a sort of pool. If no threads are reading then the lock is available. A write operation has to wait (maybe in a priority queue) for all of the other threads to stop reading before it may write.
This is similar to what happens when reading from and writing to files. The C stdio library makes the programmer state the intent explicitly, opening a file for writing ("w") or only for reading ("r") with the same fopen() function call. The mode alone does not lock the file, though: whether another thread or process can read or write it at the same time depends on the operating system. Windows can refuse to share a file that is already open, while POSIX systems generally allow concurrent opens and rely on advisory locks such as flock() or fcntl(). The rule you want is the same either way: while a file is being written nobody else should read or write it, and while it is being read others may read but not write.
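The scheme above is usually called a readers-writer lock. Here is a minimal sketch of one in Python (used for brevity; AudioMan itself is Java). It is an illustration under assumptions, not AudioMan's code: the class and method names are invented, and instead of two separate lock objects it uses a single condition variable with a reader count. New readers also hold back while a writer is waiting, which is a simple stand-in for the priority queue idea.

```python
import threading


class ReadWriteLock:
    """Many concurrent readers, or exactly one writer, never both."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # threads currently reading
        self._writing = False      # True while a writer holds the lock
        self._writers_waiting = 0  # writers queued up behind readers

    def acquire_read(self):
        with self._cond:
            # Wait out any active writer, and let queued writers go first.
            while self._writing or self._writers_waiting > 0:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting writer may now proceed

    def acquire_write(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                # Wait for the other writer and all readers to finish.
                while self._writing or self._readers > 0:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()  # wake both readers and writers
```

A reader would wrap its work in `acquire_read()` / `release_read()` and the mutator in `acquire_write()` / `release_write()`, ideally inside `try`/`finally` so the lock is released even if the operation fails.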
# iTunes: Playlist Algebra
What I would like to do is be able to tell two things apart in iTunes: files that are in albums and those that are singles. I'm an album listener most of the time, so the singles tend to add up, get ignored/forgotten and generally just cruft up my collection and waste space on my iBook. It would be nice to have a playlist that contains all of the singles in the collection so I can manage them.
iTunes has two types of playlists: regular (static) and smart (dynamic). You have to add songs to a regular playlist explicitly and the playlist never changes unless you add or remove songs to or from the playlist or remove a song from the collection (which removes it from the playlist automatically).
Smart playlists are more like queries and do not contain the same songs all of the time. For example, I have a smart playlist called "100 most played" which changes continually based on the number of times the songs in the collection have been played. I have another called "recently added" which shows the 50 most recently added songs, again depending on which songs have the latest added date.
Every time I add an album to my collection, I add all of its tracks to a regular static playlist called "albums". It's much easier to browse the artist and album lists when you're only dealing with the albums in the collection. In my case, the artist and album lists are both reduced to 1/4 the size when I'm only looking at whole albums. I play whole albums from beginning to end a lot, so this works well.
I would also like a playlist that displays a list of all of the singles in the collection. I need a smart playlist that says "OK, the user has all of the albums in a playlist so the singles must be everything else." The playlist has to be smart (dynamic) to account for new singles being added all of the time. I could make a static playlist called "singles" and add all of my singles to it manually but that would be a lot of work! What I'm getting at is that I'd like to be able to exclude one playlist from another. That feature could also be extended to allow the user to combine regular or smart playlists together in arbitrary ways as well as exclude. Pretty much doing playlist algebra.
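The "algebra" here is ordinary set arithmetic. A small Python sketch, with made-up track names and function names, and nothing to do with how iTunes works internally:

```python
library = {"Album Track 1", "Album Track 2", "Single A", "Single B"}
albums = {"Album Track 1", "Album Track 2"}  # the static "albums" playlist


def singles(library, albums):
    """Exclude: everything in the library that is not in the albums playlist."""
    return library - albums


def combine(*playlists):
    """Union: every track that appears in at least one playlist."""
    return set().union(*playlists)


def common(first, *others):
    """Intersection: only the tracks that appear in every playlist."""
    return first.intersection(*others)
```

Because `singles` is computed from the library each time it is called, a newly added single shows up on the next evaluation without any manual bookkeeping, which is the "smart" part.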
Apple likes to keep things simple, and the current smart playlist "query" interface is pretty easy to understand. I don't think introducing a feature like playlist algebra (with a better name, of course) would be that difficult to understand. That seems to be a major barrier to getting features introduced into Apple products but that's good: Apple products are made for everyday users, not power users like me. The power users can suggest the future features that the everyday users might like and use.
# Broken Window Theory
The so-called Pragmatic Programmers have some interesting views. I've only read a couple of interviews with them but they seem to be interested in agile development while also being rational enough to say "sometimes pure extreme programming (XP) doesn't work in all situations, so use your head." Like XP and most agile processes, they seem to prefer enabling over directing.
The broken window theory is one of the more interesting ideas. The gist of it: if you compromise on quality even a little bit, the project could spiral out of control. So you have to stay on top of little problems and not let them hang around -- because if a lot of little problems are in the code, then developers will just say "well, what's one more problem? It will save me five minutes" and get lazy.
Quality includes not just fixing bugs, but also the code conventions and little things. Code readability can impact how quickly you can understand the code and refactor. Agile processes are maintenance processes, with high code turnover ... you need to write your code to be easily understandable, refactorable and maintainable. You should be able to go to a piece of code and not be able to tell who wrote that code. Everyone on the team should be writing in the same way, so that they may read it in the same way. Most people are not in that mindset. They just want to write a piece of code and move on to something else. Maybe they'll come back to fix a bug but otherwise the code is perfect, right? In XP you will almost always come back to refactor the code, so it instills a sense of responsibility to keep the code maintainable.
If you rigorously fix "broken windows" -- small problems -- just like the pragmatic guys say, then the quality will stay high because no one wants to be the first one to start writing crappy code. Just like in XP where no one wants to check in a test that doesn't pass because it breaks the build. If you start allowing broken tests, then the value of your whole test suite decreases.
Where this might differ from XP is that XP has a focus on the riskiest problems. Some developers might interpret that as being "I have new features that are more important or risky than these little bugfixes, so I have to implement them now instead of fixing bugs."
But if you take the pragmatic view into account, then the bugs themselves become very high risk to the project on the overall quality scheme of things for the reasons I talked about above. So they become the riskier issues you should be attacking first, at the expense of new features but for the long-term benefit of the project.
I shouldn't say that it is a complete replacement though, some of the stuff on everything2 is just completely off the wall, rude, sexist, opinionated, gross (it could be archived as a fairly accurate picture of mostly young male geeks during and after the high tech bubble).
It replaced it for me though, because all of the extra stuff in everything2 is a distraction from the information that is actually useful -- the good parts of everything2 cut through the bull and gave a nice brief bit of information on most subjects, especially technology. It was really nice to be able to go to everything2 and type in an acronym and get a nice description back. Its moderation system is restrictive because of the people that use the site, and it also prevents the site from gaining momentum. Eventually the moderators can't keep up. Wikipedia is a more serious and formal attempt and therefore doesn't have the same misuse issues as everything2.
So what is Wikipedia? Well, first you should understand what a wiki is: a web site that lets you edit its contents along with other people. So if you see a mistake, you can just go in and fix it right away. The moderation system consists of thousands of people that use the site and correct the mistakes of others as they come across them. They can also add more information and increase the size of the encyclopedia, through easy linking of pages within the wiki and elsewhere. The result is a fairly accurate and very cross-linked source of information.
Because Wikipedia is backed by a wiki, if someone comes along and erases a page or vandalizes it, the page can be put back to its original proper state very easily. All of the past versions of the page are stored by the wiki and can be compared to the current version of the page.
I have to stop myself after browsing Wikipedia for a while. It's like the web was originally: addictive to keep following links around and exploring. It's really easy to get off-track, just like in everything2. Granted, everything2 is often more amusing than interesting.
It's interesting to note that the inventors of HTTP originally saw the web being this way in the first place: more collaborative. Instead the web is mostly read-only web pages which largely serve the interests of their owner. Only now, when social software is starting to become mainstream, do we realise the power that a community of people can have. The Wikipedia is a great example: in three years they already have over 200,000 articles -- and that's just the English ones. It's a pretty amazing demonstration of the power of collaboration, community self-moderation and great software.
# Tim Bray Hired By Sun
One of the more visible technical bloggers out there, Tim Bray, landed a job at Sun Microsystems. Congratulations Tim, and you're absolutely right -- when someone can make a decision and make things happen quickly, that's when you know it's a company you want to work for.
You can bet that his blog has something to do with why he got hired, and I like the fact that his new job meshes well with his blogging lifestyle. He says "...this job is going to be pretty public-facing, so you'll find out lots of what I'm doing more or less as I do it." Looks like Sun is starting to ride the cluetrain too, and they get a great voice in Tim. If only all of us bloggers could blog like that and find jobs like that. :)
The rest of his post is a rather interesting dump on Microsoft and .NET, which you should read if you're a .NET developer (read his disclaimers, they are interesting as well). On one hand I try to keep my distance from the .NET platform because of some of those arguments. But on the other I read about and use C#, the Common Language Runtime (CLR) and the Framework Class Library (FCL) so I'll know what other languages like Java, its JVM and class libraries are competing against and in which areas they are falling behind .NET.
# Freedom in the News
I'm going to echo this one, because it's important. Scoble points to an article in the Chicago Sun-Times about a girl not disciplined for her opinions on her blog. Robert is right too: it was handled in the best way. Why are these parents still mad? Would they rather they only had a right to an opinion so long as it agreed with the majority? If so, who is more ignorant ... them or the girl with her own opinions?
It's timely for this blog because I was just talking about freedom last week. Any freedom comes with the good and the bad. You can't be free to express your ideas without a good chance of running into other ideas you disagree with. How you deal with those different ideas shows what kind of person you are. Can you counter them with more convincing arguments? Will you just ignore the ideas or will you consider them? Can you explain your side without flying off the handle?
The free software guys know this too: the benefit of them releasing a free operating system outweighs the negative impact that say, a terrorist organization could have from using it. Free encryption libraries are under this microscope as well.
Freedom is all or nothing. If we deny freedom to people we disagree with then we are only undermining it for ourselves as well.
# AudioMan's TrackFilter
AudioMan's immutable TrackFilter object is used to filter tracks for the models. The models receive notification any time a mutator changes the repository. Some of these notifications don't apply to what the model is currently holding, like albums for a specific artist for example.
The TrackFilter object's constructor takes three parameters: playlist, artist and album; and has two methods: acceptAdd() and acceptRemove() that both return boolean.
If acceptAdd(track) is true, then the track should be added to the model. If acceptRemove(track) is true, then the track should be removed from the model. I used to have one method called accept() but then during unit testing I noticed a slight difference in filtering when a track was added and when it was removed, so I needed a method for each situation. If you are curious, check out the code for the exact situation. :)
The accept...(track) methods determine their boolean output by comparing the track parameter against the parameters given in the constructor. If the playlist, artist and album names all match, then the method returns true. If any of the parameters are null then that means "all".
So when each model is made it is given a TrackFilter to use. Each new model has a new TrackFilter to determine what notify messages reach it and actually change the model. The rest of the notifications are ignored.
The TrackFilter class was inspired by Java's FileFilter interface.
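Here is a minimal sketch of a filter along the lines described above, not AudioMan's actual code. `SketchTrack` is a made-up stand-in for the real track class, and it assumes each track carries a single playlist name. The post says `acceptAdd()` and `acceptRemove()` differ slightly in the real class but doesn't say how, so in this sketch both apply the same matching rule.

```java
// Hypothetical stand-in for a track; AudioMan's real class is AudioData.
final class SketchTrack {
    final String playlist;
    final String artist;
    final String album;

    SketchTrack(String playlist, String artist, String album) {
        this.playlist = playlist;
        this.artist = artist;
        this.album = album;
    }
}

// Immutable filter: a null constructor argument means "all".
final class TrackFilter {
    private final String playlist;
    private final String artist;
    private final String album;

    TrackFilter(String playlist, String artist, String album) {
        this.playlist = playlist;
        this.artist = artist;
        this.album = album;
    }

    /** True if a newly added track belongs in the model using this filter. */
    boolean acceptAdd(SketchTrack track) {
        return matchesAll(track);
    }

    /** True if a removed track should also be removed from the model. */
    boolean acceptRemove(SketchTrack track) {
        // Assumption: same rule as acceptAdd(); the real difference is not described.
        return matchesAll(track);
    }

    private boolean matchesAll(SketchTrack track) {
        return matches(playlist, track.playlist)
            && matches(artist, track.artist)
            && matches(album, track.album);
    }

    private static boolean matches(String wanted, String actual) {
        return wanted == null || wanted.equals(actual);
    }
}
```

For example, `new TrackFilter("albums", "Rush", null)` would accept every track by Rush in the "albums" playlist, whatever the album.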
# The Freedom to Blog
Why are blogs great? Because people straight out of school can give their opinion on something in the same medium that people with 20 years' experience can and they can discuss it from different points of view. No one has a turf advantage in the "blogosphere". There is no blogger "old boys club".
Sure, I'm opinionated -- I hope I never lose the ability to think for myself and express myself freely. I make predictions. Some are serious, some are outrageous. I might be way off base sometimes but I don't see a big problem with it. I'm not a reporter and I don't formally fact check -- I just give my opinion.
If you think you're reading all facts when you read blogs I'm afraid you're overestimating them. Blogs are temporary, error-prone and biased. But nevertheless blogs are still interesting and useful.
The great thing about blogs is: if you don't like what I say on my blog you can check out thousands of other blogs and completely ignore mine ... millions even. You don't have to waste your time reading about this small fish. Better yet, you can comment here or post on your own blog about how I'm dead wrong. I assure you I'll read it, I check my comments and referrers all the time. You might even (*gasp*) change my mind. I'll probably blog that too.
So get out there boys and girls and exercise your freedom to have an opinion. You'll be glad you did, it's liberating. Who knows, someone might even read it. ;)
# AudioMan's Models
I usually talk about AudioMan's models like they are all one component, but in this post I'll get into what they are all about. The models are one part of the model-view-controller triple that I talked about before and then updated more recently. They contain the data that is displayed in the view that the user can see. The controller takes direction from the GUI and manipulates the contents of the models, like when a user is browsing a collection.
There are four models, one each for the playlists, artists, albums and tracks. The contents of each subsequent model in that order depends on what is selected in the previous model. In that way, they sort of cascade. The playlists model is just a Vector of name/value pairs. The name is the label you see in the view and the value is the playlist's ID number.
The artist list contains a list of all of the artists in the selected playlist. The album list contains a list of all of the albums in the selected playlist and by the selected artist.
Finally, the track list contains all of the tracks that fit the playlist, artist and album that are selected. Like the playlists, it is also just a Vector but of AudioData objects representing tracks.
The artist and album models are the most interesting. They receive add/remove notifications from the tracks mutator just like the tracks model does, in order to add and remove artist and album names on the fly. To do this accurately, these models needed to keep a count of the number of tracks that apply to a specific artist or album name.
So I made the artist and album models Hashtables. The key in the Hashtable is the artist or album name and the value is the number of tracks that use that name. When these models receive an add notification from the tracks mutator, if the name isn't in the list it's added to the Hashtable with a value of 1. If it is in the list, the value is incremented. When these models receive a remove notification from the tracks mutator, if the name is in the list but the value is only 1 then the name is removed. Otherwise if the entry exists the count is decremented.
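The counting scheme above can be sketched in a few lines. This is an illustration rather than AudioMan's actual model class: `NameCountModel`, `trackAdded()` and `trackRemoved()` are made-up names, and the view notification side is left out.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical artist/album model: name -> number of tracks using that name.
final class NameCountModel {
    private final Map<String, Integer> counts = new Hashtable<>();

    /** Add notification: insert the name with a count of 1, or increment it. */
    void trackAdded(String name) {
        Integer current = counts.get(name);
        counts.put(name, current == null ? 1 : current + 1);
    }

    /** Remove notification: drop the name when its last track goes, else decrement. */
    void trackRemoved(String name) {
        Integer current = counts.get(name);
        if (current == null) {
            return; // name not in the model, nothing to do
        }
        if (current <= 1) {
            counts.remove(name);
        } else {
            counts.put(name, current - 1);
        }
    }

    boolean contains(String name) {
        return counts.containsKey(name);
    }

    int count(String name) {
        Integer current = counts.get(name);
        return current == null ? 0 : current;
    }
}
```

Keeping the count is what lets the model remove an artist or album from the list only when its last track is removed.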
How does a model know if it should listen to a notification from a mutator? It can't listen to them all because some additions and removals don't apply to what it's showing. That's where the TrackFilter object comes in and I'll explain that one in another post.
# VS.NET Background Compilation Issues Explained
James Robertson links to an interesting blog post from Paul Vick about Visual Studio's background compilation. That post was preceded by an even more interesting introductory post that explained some history. Summary: The old Visual Basic was completely compiled in the background while .NET languages are not; they are only almost completely "compiled". Paul is in the midst of explaining why.
You'll remember that the flawed background compilation was one of my major gripes about Visual Studio .NET so I'm interested in seeing this fixed. It would also probably affect other gripes I had, including code formatting, refactoring and code completion.
For comparison, Eclipse keeps a sort of "DOM-like tree" of the code you're working on as far as I understand it. This seems to be much different than the Microsoft file-based approach, where everything is recompiled based on files being "dirty" or not, covered in Part 1 of Paul's explanation. All speculation on my part, though.
Besides being more comprehensive than Visual Studio (Paul admits that .NET does not compile the code in the background with a full compiler), Eclipse seems to be more efficient. The delays on Eclipse seem shorter, and having a tree of the entire workspace in memory instead of relying on files would also explain why Eclipse is such a RAM hog. Trading resources for speed or features seems like a reasonable compromise to make for a developer workstation. RAM is cheap these days.
But to get back to what James said about Smalltalk: there's a good chance the Eclipse guys were inspired by Smalltalk's image idea -- the same people that wrote Eclipse also worked on Visual Age. I don't know much about the Smalltalk image, though ... except that it is not file-based.
As for C#'s less than perfect background compiling: I can't stand it. There's nothing worse than editing a file in a *modern* IDE, seeing no errors, then compiling/running it and finding compile errors. It's a complete waste of my time and it only gets worse as the project gets larger and more complex. I know we've come a long way from the dark ages of the command line interface but I'm a new school hacker -- I'm more demanding, I'm writing agile code and I'd rather not wait for: code, complete re-compile, run test suite, code, complete re-compile, run test suite, etc... :)
I will be following Paul Vick's blog now to get the reason, though I'm betting it will be the result of a tough engineering compromise (like keeping the file paradigm around instead of hogging RAM). Microsoft's transparency through blogs in this case helps me understand these compromises. If I can grok the compromise I will respect it more as an engineer and also get a handle on when I expect the situation to improve. Definitely sweet. Blogging is changing the software business on a grass-roots level like this ... when will other companies catch the cluetrain? Who knows ...
# MVC Part II: Attack of the Mutators
Here's an updated architecture sketch of AudioMan:
I've been thinking a lot about multithreading AudioMan, and here's what I've come up with so far. Note the diagram is different from the old diagram -- it hasn't even been two weeks yet. That's why I don't put a lot of effort into making them look nice: they change too much. If you are lost reading this post, you might want to try reading the old post first -- it may help.
The major multithreaded situation I have right now is when someone wants to browse the collection while a long Include Directory operation is going on in the background. My solution to this problem is to temporarily make the repository (dB in diagram, it can be thought of as a database) read-only while I'm updating the UI and then unlock it when I'm done.
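One way to get this kind of temporary read-only window is a read/write lock. The sketch below is only an illustration of the idea, not AudioMan's code: `LockableRepository`, `addTrack()` and `whileReadOnly()` are made-up names, and it uses `java.util.concurrent.locks.ReentrantReadWriteLock` as a stand-in for whatever locking the repository really uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical repository wrapper: writers wait while the UI holds it read-only.
final class LockableRepository {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> tracks = new ArrayList<>();

    /** Called by a mutator, e.g. from the Include Directory thread. */
    void addTrack(String track) {
        lock.writeLock().lock(); // blocks while any reader holds the repository
        try {
            tracks.add(track);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Runs uiUpdate with the repository read-only, then unlocks it again. */
    void whileReadOnly(Consumer<List<String>> uiUpdate) {
        lock.readLock().lock();
        try {
            uiUpdate.accept(Collections.unmodifiableList(tracks));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

One caveat with this particular lock: code running inside `whileReadOnly()` must not call `addTrack()` on the same thread, because the read lock cannot be upgraded to a write lock.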
Using the numbers in the diagram above, here's a typical scenario:
A1. Include Directory X
A lot of this architecture already exists in AudioMan, I just haven't had a chance to talk about it yet. There are two potential problems that I don't have enough information about which could be issues:
1. When the Include Directory resumes and the controller gives its list of files to update to the mutators, will there be a "collision" of sorts since both threads are using the mutators? I don't think so. The mutators are singletons but they do not contain any instance specific information. They do not have state. So the threads can context switch using the same mutator singleton without any problems as far as I know ... but I am still unsure. I will have to look into this more.
2. After the controller unlocks the repository in B6, the mutators will resume and could notify the models of changes before the models are properly connected to the view and can notify it. This by itself seems to be OK since the models hold onto these updates, so when the UI is ready to load the models (with setInput()) they will be up to date.
However, there could be a short period of time during the connection process where the models have already been connected to the UI but aren't notifying it of changes yet. If that is the case, the view will miss updates. A solution could be to attach the view's model listener first and then load the model into the UI. Of course, this may give the view false updates just before the new model is connected because it is listening prematurely. However, the view will be cleared when the new model is connected to it anyway and the model has those updates as well. So I believe everything will appear correct in the end.
Any comments? I realise this may be a lot to digest. If I didn't explain anything well enough please let me know.
Wow, been a pretty slow week. Two things I'm working on for AudioMan right now: threading and file library issues.
The file library issues are easy to fix but will take some refactoring. The code that reads the file meta data has the AudioMan object AudioData throughout it. From a coupling standpoint, this is bad. So even though it will make the libraries less efficient (copying data), I'm making libraries that can be used on any other project not just AudioMan. I may even release them under the LGPL later, assuming I can get everyone's permission.
The threading issue is really sticky. Once you start to get more than one thread going in a program the complexity skyrockets. You have to think of all of the situations a simple operation could be interrupted and how events can conflict or resources can deadlock. There's no question that AudioMan should be multi-threaded, the only question is how and how much.
The major problem causer at the moment is letting people browse the collection while files are being included, on a long Include Directory operation that crawls all of your directories. While the view is switching models, the mutators are sending messages to the models and they get shown by the wrong views. Other tracks are getting lost. I think I will have to suspend the mutators while the view is swapping ... but that's not as easy as it sounds.
At any rate, I'm taking some time to figure out those things before I attack. These are both basic problems that need to be addressed right now.
# Software Needs Communication
I've talked to a few people lately about communication overhead and how that can affect projects. Being one of the users of AudioMan and also the only programmer at the moment means that I can operate largely in a vacuum, coming up for air every once in a while to find out what people thought of the latest milestone. As long as I release milestones often, the risk is low that I will start going off track too far. But projects with one developer are limited simply by resources.
Now I want to start to think about getting other people to contribute, so I created the AudioMan developer part of the site and put effort into making instructions for developers so they could get started, at least. Once that happens, they may have a few problems.
The big one is that they don't know my "process", the informal way I do things now that is locked in my brain. Because I worked by myself it didn't make much sense to document a lot of it because it would have just been unnecessary overhead work. Now those documents will facilitate communication between team members -- learning how to set up tools for the project, using those tools and the processes that we've agreed on. They'll speak for me while I'm busy doing other things.
The process is a big deal because an agreed upon process can streamline development. If everyone is on the same page you spend less time fixing other people's "mistakes" or watching over each other too much. I'm going to start writing more pages about certain processes, like what to do when you want to make a patch or how to build a release. Then people can review them and improve them.
This blog has become a good place to learn about AudioMan's technology and to keep up even if you aren't working on the code. Since the code turnover rate can be higher than most projects it would take a lot longer to maintain up-to-date UML diagrams for everything. When I can just write informally about problems I'm having, the background I give then contains a lot of information about the current state of the project. Information that might be stale in two months ... so the artifacts are informal.
So is there anything about AudioMan you want me to write about?
# Let's Talk About the "I" Word: Innovation
Mike Bell says in my comments: "As an economics student, I have to say after reading all this stuff about how programmers are so confused and out of touch with reality that it's not even funny anymore... it's sad." Sad is expecting a job in a market that can't support it. Reality is free software is here to stay, so let's kick its ass by innovating.
Just like there are economists willing to give away free knowledge about the economy, or doctors and lawyers that do pro bono work, there are people that are going to write free software. I think professionals owe it to society in general to share the wealth once in a while.
What makes any of those other careers possible despite that? It's value added. Software companies can throw far more resources against a problem in a shorter amount of time because they are paying people. Free software has a horrible time to market and besides a few exceptions the process does not scale well. So corporations still have the advantage on big projects.
I write free software but I don't advocate it in all situations. I'm an idealist about freedom but I'm a capitalist. I'm also a realist. Free software isn't going away just because we want it to or because it might put all of us out of a job one day. I *know* free software takes jobs away from people. Is this fair? Yes, because the jobs they had were so "trivial" they are replaced by someone's work in their spare time. Do I have a lot of sympathy for people marginalized by free software? Not really.
Do factory workers fear automation? You bet -- but it still happens. Do programmers get an exception just because we think of ourselves as more trained or more intelligent? It can't happen to us! Right? No. Progress will happen but the factory workers still get jobs doing other things and guess what -- so will we.
You know, the whole high tech dream world is interesting. It's almost like they expect regular economics or politics (globalization) to NOT affect them, to go around the whole industry. HELLO? Your job is not safe. It doesn't matter if it's outsourcing or free software. No one owes you a job just because you have a computer science degree.
Having said all of that, I don't think free software will affect us as much as people are worried about. A few segments are more vulnerable than others. Traditional consumer COTS software is the most targeted because it is used by more people that have the technical know-how to replace it. Stuff like Windows, Office and all of the common Windows apps. They will all be replaced by free alternatives if they just stagnate for the next few years and don't add value. Why? Because free software will be good enough to do the job AND because it was given the time to catch up in features. Who do I blame for that? The companies for not staying a step ahead and consistently adding value. Having no innovation for years and years lets the free software guys catch up and eat your lunch.
Server software is another free software area ... except this one is now being sponsored by companies! The rationale is a little different -- instead of freedom for individuals, companies see free software and open standards as a way to level the playing field. Now you have to add value somewhere else (an easier to use front-end GUI, for instance -- like Mac OS X), and the power is swinging back in the direction of the agile little company that innovates, after being wayyyyy in the other direction for so long. The large companies still have the advantage because of sheer resources but they still complain about losing their massive market position and ability to have vendor lock-in. Freedom makes vendor lock-in go bye bye. Find another way to make money.
As much as you guys groan about the computer service industry, custom software will never go away. The honest truth is: as human beings that are now technologically enlightened, we will never run out of processes to automate or improve. Even if the government goes Linux straight across the board they will still need people to write custom apps for the platform to do the tasks. To hand out marriage licenses, take payments for parking tickets or issue passports. These systems will also have to be maintained. Never mind that once the public gets a taste of this software, they'll want more of it! It's easier to deal with a web form than to have to travel down to the passport office every five years. Not all service/consulting jobs are travel all the time, either. Just look at the city of Ottawa: there are tons of contractors doing work for the federal government and they never leave Ottawa.
Embedded applications are niche and will continue to be written by closed source companies that understand the advances in hardware. Free software can't keep up (a few use Linux but the apps are custom, you get the idea). That includes cell phones and PDAs. Same with networking equipment made by Cisco and Nortel ... it's too niche and there is too much cutting edge research involved. Yes, you've realised it ... these segments innovate.
I still see plenty of opportunity for paying software jobs out there. Young (and old) programmer, instead of bellyaching about not being able to find a job in your favourite field because now people make the software for free, pick a field where they still add value over free software. Maybe this is exactly the kick in the pants the industry needs to INNOVATE AGAIN. Harsh competition is usually a great motivator. We have to show that we can produce better quality results than free software or outsourcing in yet another "new economy". These are exciting times ... no more sitting back on our heels.
The factory workers got over it because now they get their goods at half the price they used to. They found a new job putting together photonics equipment and got paid twice as much. Then they lost their jobs again and are doing something completely different. They learned a new skill and adapted when the economy changed. We should adapt too.
# Roll Your Own ApplicationWindow OnOpenListener
JFace's ApplicationWindow object doesn't seem to have a listener that is triggered when it is opened. This is unfortunate because there are lots of things that you might want to do when the "UI is ready", like load it with data.
Warning, the following explanation is from 30,000 feet ...
ApplicationWindow uses the method open() to show itself. Developers usually make this method call block, meaning that code after open() doesn't execute until after the window is closed. After open(), everything is event-driven. So you can't make a call to your database after open() and expect it to affect the GUI. There is a way to get around it though.
The windowing toolkit that JFace uses, SWT, is a layer on top of each operating system it is written to. SWT receives messages from the operating system and stuffs them into a queue. These are messages about if the mouse pointer moved, if a click happened or if a letter was typed. SWT then processes these events and updates the application's GUI.
Do you remember me blogging about asyncExec()? This method call just puts a Runnable on SWT's queue, usually from a non-UI thread because SWT only allows the SWT thread to modify it. But who says you can't use it from the UI thread too? Here's how I did it:
public static void main(String[] args)
Now the task runs right after the window appears. The trick is window.create(). Usually you don't have to call it but it will initialize the Shell for you (I found that out poking around in the SWT code). Otherwise the Shell is null until open() (too late!). Without window.create(), you can't getDisplay() from the window's Shell and you can't use asyncExec().
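To show why queuing the task before the blocking open() works, here is a toy model of the event queue in plain Java. It is not SWT or JFace code: `ToyEventQueue` and `OnOpenDemo` are made-up classes standing in for the Display and the ApplicationWindow. In the real thing the calls in order would be window.create(), window.getShell().getDisplay().asyncExec(...) and then window.open().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy stand-in for the UI event queue; this is NOT SWT's Display.
final class ToyEventQueue {
    private final Queue<Runnable> queue = new ArrayDeque<>();

    /** Like asyncExec(): queue the task and return immediately. */
    void asyncExec(Runnable task) {
        queue.add(task);
    }

    /** Like the loop behind a blocking open(). The real loop runs until the
     *  window closes; this toy one stops when the queue is empty. */
    void runLoop() {
        Runnable next;
        while ((next = queue.poll()) != null) {
            next.run();
        }
    }
}

final class OnOpenDemo {
    /** Returns the order in which the steps happened. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ToyEventQueue display = new ToyEventQueue();

        log.add("create");                             // window.create(): the queue now exists
        display.asyncExec(() -> log.add("load data")); // queued, not run yet
        log.add("open");                               // window.open(): window shown
        display.runLoop();                             // event loop picks up the queued task
        log.add("closed");                             // code after open() only runs now
        return log;
    }
}
```

The point of the sketch is the ordering: the queued task runs after the window is shown but before anything written after the blocking call.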
BTW, I used this technique to put progress bars in AudioMan when it loads and saves the repository on open and exit. It looks sweet! I'll be releasing a new version of AudioMan (0.3.1) shortly.