ShoeChicken: September 2005

As Michael had already posted, I spent some time this weekend working on the database, only to discover Michael was working on it too. We'll have to work on that whole planning thing. We've got the empty shell of the database in place now though.

I've also been thinking about the internals of the rating agent and some of the conceptual functionality of shoechicken. One thing I was thinking about is whether the rating agent could be thought of as a system service. This would allow news to be aggregated even when the GUI (Frying Pan) was not active. Since the RSS/Atom feeds cycle through articles, having the service running in the background would allow the retrieval and reclocation of articles that could otherwise be lost. At first this seemed like an alright idea, but when I talked it over with Rachel, she pointed out that it may put some people off the application. Also, if someone doesn't check the news very often, they're probably not too interested in what transpired two weeks ago.

Part of what got me thinking about this was the possibility of re-reclocating articles that had not yet been read by the user. This doesn't seem to make sense if the article has already been posted in the Frying Pan as articles could jump around in the list too much. However, if the article had not yet been posted to the GUI, it would seem plausible to reclocate that article again to find its most appropriate ordered location in the unread article set. The problem is that this situation is not likely to occur if Shoechicken only runs when the GUI is active.

So, my final thoughts (thanks largely to Rachel) are that the user could be given the option to have shoechicken collect news in the background, or just when the GUI is open. I know this is thinking a little way down the road, but my goal is to not prevent the former from happening through design choices.

On to the Shoechicken rating agent internals.

1. Shoechicken should receive FeedItemEvents from the Bufunkalo. He can then build a vector representation of this article with each component being a normalized TF-IDF weight. The IDF part will be calculated from the set of all unread articles.

2. The TF-IDF weight vector can then be passed on to an LSI component that can put the new vector into the item vector space and reduce the dimensionality. Although calculating this can be expensive, I'm not sure it will be that bad for our data set, so it will be easiest to just recompute the SVD each time instead of folding the new vector in. This can be changed in future implementations but isn't necessary for us right now.

3. The reduced new vector should end up at the reclocating component, which can compute the item's rating based on the user model. The rating will then be stored in the database and the UI Agent can then be notified.
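
To make step 1 a bit more concrete, here is a rough sketch of how a normalized TF-IDF vector could be built for a new article, with the IDF part taken over the set of unread articles. This is only an illustration and not the Shoechicken code: the function name `tfidf_vector`, the token-list input, the plain logarithmic IDF, and the choice of unit-length (L2) normalization are all assumptions of mine.

```python
import math
from collections import Counter


def tfidf_vector(article_tokens, unread_articles):
    """Return a unit-length TF-IDF vector (term -> weight) for one article.

    Hypothetical helper: `article_tokens` is the new article as a list of
    words, `unread_articles` is a list of token lists for the other unread
    articles. The new article is counted as part of the unread set.
    """
    docs = list(unread_articles) + [article_tokens]
    n_docs = len(docs)

    # Document frequency: how many unread articles contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    # Term frequency within the new article, scaled by article length.
    tf = Counter(article_tokens)
    weights = {}
    for term, count in tf.items():
        idf = math.log(n_docs / df[term])  # assumed IDF variant, no smoothing
        weights[term] = (count / len(article_tokens)) * idf

    # Normalize to unit length so long and short articles are comparable.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0.0:
        return weights  # every term is in every article, or article is empty
    return {term: w / norm for term, w in weights.items()}
```

Because the IDF depends on the whole unread set, the weights of older unread articles go stale as new ones arrive, which fits with recomputing everything (including the SVD) each time rather than updating incrementally.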

The feedback/AI component isn't addressed here as I think this is best left to a later date. Laying out the AI shouldn't alter the design of this part too much. It's more the case that the AI will tweak each component to try to improve its effectiveness.

This is what I've thought up so far. Michael and I are going to meet on Tuesday and will probably talk over the ideas in more detail. Hopefully I can start on some coding mid next week!

Busy day with Shoechicken.

The Shoechicken Database has been committed to the CVS repository, along with the scripts used to create it. The Shoechicken project is using HSQLDB. The database in the repository is a skeleton of a complete database: it contains no data, only the table definitions. Sad thing is that both James and I constructed the database at the same time today, so there was a bit of overlapping work and some of it was not used (sorry James). The good thing is both James and I got some experience using HSQLDB, which will pay off in the coming weeks of development.

The Shoechicken website has been updated with new documentation on the database. Additionally, some dead links on the site were fixed. The most important update was the addition of the Database Design page under the documentation section. The page should serve as a reference for any designers who need to check the table designs, data types, and constraints used in the database.

Monday, September 26, 2005

Thoughts about the rating agent

Sunday, September 25, 2005

Shoechicken Database Completed
