| «« Unit Testing a Database Abstraction | Mac OS X 10.4 Tiger Announced »» |
|
About
I'm Ryan Lowe, a Software Engineering graduate living in Ottawa, Canada. I like agile software development and Ruby on Rails.
I write this blog in Canadian English and don't use a spell checker. Typos happen.
Projects
» Full-time Ruby on Rails freelancer
» Full-time with Rails since May 2005 » Former committer for RadRails (now Aptana) » I also have a few Rails side-projects in development: 1. wheretogoinTO.com Toronto nightlife 2. Hey Heads Up! TODO list and sharing 3. Layered Genealogy family history research 4. foos for foosball scoring 5. fanconcert for music fans (on hold) Hiring Rails developers? I can telecommute by the hour from Ottawa, Canada »» Email: rails AT ryanlowe DOT ca
BulletBlog
Now hosted on Hey! Heads Up -- check it out!
Syndication
Pings
Recent
Derek Lowe's (Ryan's older brother) words at Ryan's funeral
blog@ryanlowe.ca no more Forging Email Headers: Good, Bad or Ugly? Sarcastic Dictionary (Part 1 of Many) Tags Hierarchies Twisting Rails is Risky Business Risky Business? My Take on Early Alphas Whoa, it's August 2007 Closing Comments A Postscript to "Growth at the grassroots" »» All Blog Posts
Linkage
del.icio.us/ryanlowe
technorati/ryanlowe.ca/blog Aurora Roy Jim Andrew Trasker Travis Kibbee Karen Dr. Unk Ayana Van Bloggers Joel Spolsky Robert Scoble Tim Bray Dave Winer Raymond Chen James Robertson Ruby/Rails Bloggers rubyonrails.org weblog David Heinemeier Hansson Dave Thomas James Duncan Davidson Mike Clark Jamis Buck Signal vs. Noise Tobias Luetke Amy Hoy: (24)slash7 Jeremy Voorhis Eclipse Bloggers Planet Eclipse EclipseZone Luis de la Rosa Eclipse Foundation Kim Horne Billy Biggs Ian Skerrett Mike Milinkovich Bjorn Freeman-Benson Denis Roy
Archives
|
Database Perf and Object Creation
Premature optimization is bad, sure. But no optimization is bad too, if what you end up with is unusable. I've been thinking a lot about the performance implications of using a database with AudioMan, especially with object creation and destruction. You see, with the old repository objects were kept in collections. "Indexes" -- which allowed me to get certain tracks based on artist or album -- were just lists of pointers to objects. The track objects already existed. The database abstraction consists of methods that take parameters and returns lists of objects. For the implementation it would use a SQL query, execute it and then take the resulting recordset and make a list of track objects out of it. All of these objects would be new, and creating them would be a serious performance issue. The worse case for this performance is when the "Whole Collection" is visible in the track list. Collections can be 500, 1000 or even 5000 songs or more. Creating this many new objects in a reasonable amount of time just doesn't seem possible. I don't even think I need to try it to prove that to myself. In fact, the performance for loading existing objects into the track list is already borderline on about 1000 songs. So if I can't create these objects on the fly I better keep them around in RAM, already created. However, the great thing about the database is that I can do all sorts of SQL queries on the tracks' values and such, which I can't do easily with collections. So I want to keep the database and SQL queries around too. But then I had an idea ... I'll SQL query the database, except instead of returning all of the track data I'll just return the tracks' unique key numbers. I'll also keep a map of all of the track objects in the collection. Then I can take the key numbers returned from the SQL query and get the objects I need from the map without having to recreate them. The only problem with that solution of course is that the objects in the map have to be kept in sync with the data in the database. There are two ways to do this:
To me, the second option is preferable because there's less chance of a bug causing synchronization problems -- the object will always have the same data as the database. This is starting to look more and more like a persistence layer, isn't it? Are there any ideas I should borrow from known implementations/APIs of persistence layers? I don't want to reinvent the wheel here. What do you think? Posted at June 28, 2004 at 01:02 AM ESTLast updated June 28, 2004 at 01:02 AM EST Comments
Wierd... I was talking my last post about your last point along this same line. Probably in september I will implement caching in the following manner (#1,2,3 are already in place, #4 is the missing element)... 1) Create db interface The cache will be a cross breed that uses sql and inmemory to do the persisting and this implementation is only worried about caching. The cache will probably have a bit of sql glue to balance the trade off between being in memory and being 100% accurate (aka the update issue). » Posted by: aforward at July 2, 2004 03:34 PMI recommend re-inventing the wheel. Think of Joel's 'what you do is what you should do yourself' Persistence frameworks (at a coding level) are brittle, very brittle. I see them as being object-oriented databases that sit on top of relational databases. Sorry, can no longer concentrate. The short of it... capuring the requirements of what to do when data changes is too difficult to configure (based on my knowledge of DataSets.Net which is similar to some 3rd party java persistence mechanisms) » Posted by: aforward at July 2, 2004 03:41 PMFirst, My experience says that the speed problem does not come from creating the objects. Object creation with a semi-space garbage collector is extremely fast. The problem is retrieving that many rows from the database. You need to use a database cursor to retrieve pages of objects and create them as you need them. Take a look at http:www.glorp.org if you don't want to write all this yourself. Second, you are making a grave error by assuming that you know where the speed problems lie; Until you profile the code, you simply have no idea. I don't either, since I can't profile your code, so I've just given you my best guess based on my experience. You really need to profile your code and then fix the actual problem, not what you imagine or suppose to be the problem. » Posted by: anthony at July 6, 2004 09:37 AMYou're right Anth! :) » Posted by: Ryan at July 6, 2004 11:39 AM |