Friday, February 16, 2007

Shoechicken Goal Changes

Some fundamental assumptions as to how Shoechicken will be used have changed. We've decided to move from a personal computer application to a more service-oriented approach. Because these assumptions underpin the design, much of the code will have to change. New code is being developed in Ruby as both a learning exercise and an attempt to speed up development.

Tuesday, October 31, 2006

Rerating unread but unrated items

Well, it's been a long time since I last posted, but let's just pick up where we left off. A good deal of work has continued on shoechicken. We have possibly gained another developer, who's currently looking into integrating Hibernate into shoechicken to abstract the database more than our current "helper" O.R. mapping classes do.

As noted by this post's title, here's a current question that needs answering. If items have already been rated for a user, but he or she has not yet read them, should those items be rerated when the next batch rating occurs? Here's a mini brain dump on the issue.

To start this off, a few basic assumptions can be made. First, the reclocation of an item should really represent the item's place within the set of items it is presented with. Second, it seems reasonable to presume that GUIs are going to have a difficult time presenting every single item in a potentially large set of items; thus, we can assume that an item for which shoechicken has received no kind of event has not been viewed by a user. Based on these assumptions, every time a user requests a set of rated items, shoechicken should do its best to reclocate each article, thereby taking into account relationships among the items.

Thinking over these terms leads to the thought that not rerating previously rated items would leave those items with ratings that are not relevant to the current data set. If we do rerate them, on the other hand, all unread articles are reclocated relative to one another.
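To make the "rerate everything unread" option concrete, here is a minimal sketch in modern Java. It is not the actual shoechicken code: the `Item` class, the `Rater` interface, and the `rerateUnread` method are all hypothetical names invented for illustration, and the rating function is left as a pluggable stand-in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names come from the shoechicken code base.
class RerateSketch {
    static class Item {
        final String title;
        final boolean read;   // assumed: true once any event has been received for the item
        double rating;        // last rating assigned, possibly from an older batch

        Item(String title, boolean read, double rating) {
            this.title = title;
            this.read = read;
            this.rating = rating;
        }
    }

    // Stand-in for the real rater: it sees the item and the whole unread set,
    // so a rating can depend on relationships among the items.
    interface Rater {
        double rate(Item item, List<Item> unreadSet);
    }

    // Rerates every unread item against the current unread set, whether or not
    // it was rated in an earlier batch. Read items are left untouched.
    static List<Item> rerateUnread(List<Item> items, Rater rater) {
        List<Item> unread = new ArrayList<>();
        for (Item item : items) {
            if (!item.read) {
                unread.add(item);
            }
        }
        // Compute all new ratings first so no rating is based on a half-updated set.
        double[] fresh = new double[unread.size()];
        for (int i = 0; i < unread.size(); i++) {
            fresh[i] = rater.rate(unread.get(i), unread);
        }
        for (int i = 0; i < unread.size(); i++) {
            unread.get(i).rating = fresh[i];
        }
        // Highest rating first.
        unread.sort(Comparator.comparingDouble((Item i) -> i.rating).reversed());
        return unread;
    }
}
```

With this shape, an old rating never survives into a new batch for an unread item, which is the behaviour the second option above argues for.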

Monday, January 30, 2006

LaTeX and Genetic Algorithms

Spent a lot of time this week getting the paper ready for the conference. I decided to use LaTeX as the conference put an outline file out as .tex. I'd only had a brief foray into using LaTeX, so formatting the paper had a steep learning curve. I do have to say, however, that it was quite fruitful. I now appreciate the power of LaTeX and will probably use it again for further papers.

Anyway, moving past the paper, I started working on the shoechicken. Much of the work was re-familiarizing myself with the research I performed last year. I am really quite excited about this part of the project! Since a lot of the work has been laying out the foundations, there isn't really much progress to report on the shoechicken.

I did have what could turn out to be quite a good idea though. We'd assumed a fairly traditional machine learning approach for shoechicken. The system would employ some kind of reinforcement learning to build a user model. However, it may be a good idea to look into genetic algorithm techniques. We could let the Shoechickens breed!!! Sounds a little crazy, but there could be a great deal of merit to this approach. I've only just thought about it, but I will research the topic with some zest over the next week or so and let you all know how it goes!

Saturday, January 21, 2006

Completed the FeedSubscriptionService, Bufunkalo, and Shredder

The FeedSubscriptionService, Bufunkalo, and Shredder are all completed and tested! The next items on the hit list are the Coordinator (previously known as UIAgent) and the Shoechicken itself. The UIAgent was renamed because its role now is more of a centralized module that ties all other pieces together.

Michael and I have begun laying out the internals of both the Coordinator and the Shoechicken. The Coordinator is going to involve working out some messy synchronization issues between the modules, but we think we've got the main operations mapped out.

I've started writing code for the Shoechicken. I will probably start by creating a test case for the Shoechicken and build the internal event chain gradually. The first step will be to pull the data from the database and build the Article/Keyword matrix. I am going to have to create some fake user models with which to compute the ratings, so that will probably be next. Then, after combining the new data with the user model, we'll start working on using LSI to row reduce the data.
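As a rough illustration of that first step, here is a minimal sketch of building an Article/Keyword matrix from per-article keyword counts. It assumes rows are articles and columns are keywords, and it takes the counts from an in-memory map rather than the database; the class and method names are hypothetical and not taken from the shoechicken code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch; the real Shoechicken would read these counts from the database.
class ArticleKeywordMatrixSketch {

    // Collects every keyword seen in any article, sorted, to fix the column order.
    static List<String> keywordColumns(Map<String, Map<String, Integer>> countsByArticle) {
        TreeSet<String> all = new TreeSet<>();
        for (Map<String, Integer> counts : countsByArticle.values()) {
            all.addAll(counts.keySet());
        }
        return new ArrayList<>(all);
    }

    // Builds the matrix: one row per article (in the map's iteration order),
    // one column per keyword, each cell holding the keyword's count in that article.
    static int[][] build(Map<String, Map<String, Integer>> countsByArticle, List<String> columns) {
        int[][] matrix = new int[countsByArticle.size()][columns.size()];
        int row = 0;
        for (Map<String, Integer> counts : countsByArticle.values()) {
            for (int col = 0; col < columns.size(); col++) {
                matrix[row][col] = counts.getOrDefault(columns.get(col), 0);
            }
            row++;
        }
        return matrix;
    }
}
```

Passing a `LinkedHashMap` keeps the row order predictable. A matrix like this is the usual input to LSI, though the storage format would likely need to be sparse for a realistic number of keywords.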

My wife suggested that we look into possibly adapting Shoechicken to perform a similar task for video footage. Shoechicken could recommend new movie trailers, news footage, and other kinds of video. After talking it over for a while, it really seems like a possible direction for Shoechicken in the future. It would of course require delving into video image processing, which sounds like a good challenge. Let the fun begin!!

Sunday, January 01, 2006

Shoechicken Paper Accepted!

Here's the first update of 2006; Happy New Year everyone!

Well the first piece of news is that the paper Michael and I submitted to CATA-2006: ISCA 21st International Conference on Computers and their Applications was accepted!!! Go Shoechicken! Now the battle to come up with funding to get us to Seattle begins ...

As far as the project itself goes, over Christmas I've implemented but have not written test cases for the Bufunkalo module. That leaves test cases for the Shredder and Bufunkalo modules as the next two goals. After that we can move on to the User Interface Agent (which could possibly be renamed now that its function has changed), and the Shoechicken itself.

One point that came up while implementing the Bufunkalo was that in many cases I needed to listen for either FeedEvents or FeedErrorEvents exclusively. I worked around this by just leaving the unnecessary method unimplemented; however, Michael and I have spoken about the possibility of separating the FeedListener into a FeedListener and another interface, FeedErrorListener. We will discuss this in more detail after Christmas.
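A minimal sketch of what that split might look like follows. Only the names FeedListener, FeedErrorListener, FeedEvent, and FeedErrorEvent come from the project; the method names, the event fields, and the `FeedSource` class are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed split; signatures and fields are assumed.
class FeedListenerSketch {
    static class FeedEvent {
        final String feedUrl;
        FeedEvent(String feedUrl) { this.feedUrl = feedUrl; }
    }

    static class FeedErrorEvent {
        final String feedUrl;
        final String message;
        FeedErrorEvent(String feedUrl, String message) {
            this.feedUrl = feedUrl;
            this.message = message;
        }
    }

    // One interface per kind of event, so a class implements only what it needs.
    interface FeedListener {
        void feedUpdated(FeedEvent event);
    }

    interface FeedErrorListener {
        void feedFailed(FeedErrorEvent event);
    }

    // Stand-in for whatever module fires the events.
    static class FeedSource {
        private final List<FeedListener> listeners = new ArrayList<>();
        private final List<FeedErrorListener> errorListeners = new ArrayList<>();

        void addFeedListener(FeedListener listener) { listeners.add(listener); }
        void addFeedErrorListener(FeedErrorListener listener) { errorListeners.add(listener); }

        void fireUpdated(String feedUrl) {
            FeedEvent event = new FeedEvent(feedUrl);
            for (FeedListener listener : listeners) {
                listener.feedUpdated(event);
            }
        }

        void fireFailed(String feedUrl, String message) {
            FeedErrorEvent event = new FeedErrorEvent(feedUrl, message);
            for (FeedErrorListener listener : errorListeners) {
                listener.feedFailed(event);
            }
        }
    }
}
```

A class that only cares about errors registers a FeedErrorListener and never sees normal updates, so no method has to be left empty. A class that wants both can simply implement both interfaces.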

Tuesday, November 15, 2005

State of the Chicken

Wow, it's been a long time since I posted. A lot has happened with the Shoechicken project in the past few weeks.

James and I have completed another draft of the paper. Hopefully, we'll have it up on the website soon. This draft will be our submission to the 21st International Conference on Computers and their Applications.

I have been tinkering with the design (well, not design per se) of the Shoechicken database lately. Most of what I changed was just nitpicking, but since I'm the one working with it at the moment I figured I would go ahead and make the changes. First, the table names have been changed to the singular (e.g. "Feeds" is now "Feed"). From a programming standpoint it just didn't feel right working with it in the plural when dealing with individual records. Second, I changed the auto-numbered *Id fields of the tables to simply "id." It was just redundant to type "Feed.feedId," so the new "Feed.id" is much better. Foreign keys are still referenced using the old *Id names, since they refer to the "id" fields in different tables. Lastly, I added a field for the title of a feed to the Feed table. The database design website has been updated to reflect the changes. The scripts for applying the changes have been committed to the repository, along with a copy of the empty shoechicken database.

I have given up on the Wiki for writing class specifications. Half of this comes from the fact that it didn't like me saving changes, so it was wasting my time without getting anything published. The other half comes from the fact that I can write code a lot faster than I can specify what it should do. All in all, Shoechicken just won't be completed in any reasonable time if we switch to this approach now. It was a beautiful dream though...

Well, all of the database changes weren't made for nothing. A lot of my time has been spent working on the parts of the Shredder that deal with the database. I have been developing the ArticleDatabaseHelper class, which will be used to add/retrieve/remove data related to the article tables in the database (Author, Context, Feed, Item, ItemToKeyword, and Keyword). The class is nearly completed, with only the methods related to the ItemToKeyword table outstanding. The class already weighs in as a beast of nearly 1,700 lines of code, which I'm sure it will break before it is completed. James and I still need to discuss the approach the ArticleDatabaseHelper will take when dealing with the inability to obtain Connection objects from the ConnectionManager. Additionally, I have added its related exception, the DatabaseProcessingException, to the repository. At this time, no test cases exist for either of these classes. I will work on test cases as soon as I have access to the completed ConnectionManager class.

In addition to the above-mentioned classes, I have implemented the Shredder classes which use the ArticleDatabaseHelper to add their appropriate "shreds" to the database. I have committed the ContextEntryCreator, ItemEntryCreator, KeywordEntryCreation, and KeywordCountEntryCreator classes to the repository. These classes have not been tested either, since they are dependent on the completed ArticleDatabaseHelper class. Additionally, James and I need to discuss how these classes will handle exceptions, and that functionality will need to be added to them.

Wednesday, October 12, 2005

Having Fun with Testing

Well, I just finished writing my first JUnit test case. It took quite a bit of effort, a bit more than I'd expected to be honest. However, I really do think it's worth the effort in the end. There is no way that I would perform all the test cases by hand otherwise, particularly not on a regular basis. I tried to follow the philosophy of writing the unit test before writing any implementation of the code. I did write the skeleton of ConnectionManager beforehand, though, so I could use auto-completion while writing the test.


On another completely different note, I'm writing this blog entry using a toolbar applet for Gnome. It allowed me to enter the blog info. for the shoechicken blog and then bring up a mini blog entry window with a single click. So, if this entry comes out all garbled up, you know why.


I also realized today that Java 1.5 added a thread pool implementation in java.util.concurrent. This could be very useful when implementing the Bufunkalo. I'm also wondering, way ahead of when I need to be, whether java.nio might be useful for the rating part of the Shoechicken; I'll have to look further into that, but I just wanted to have it out there to remind us.
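For reference, here is a minimal sketch of the java.util.concurrent thread pool in use, written in modern Java syntax. It assumes, purely for illustration, that the pooled work is fetching a list of feeds; the class name, the `fetchAll` method, and the placeholder `fetch` method are hypothetical and say nothing about how the Bufunkalo actually works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; only the java.util.concurrent calls are real APIs.
class FeedPoolSketch {

    // Runs one task per URL on a fixed-size pool and returns the results
    // in the same order as the input list.
    static List<String> fetchAll(List<String> feedUrls, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : feedUrls) {
                tasks.add(() -> fetch(url));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching feeds", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Placeholder for the real work (for example, downloading a feed).
    static String fetch(String url) {
        return "fetched " + url;
    }
}
```

The pool size caps how many tasks run at once, which is the main attraction over starting one thread per feed.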


I spent some time looking into writing an Ant build script for Shoechicken this week. I help maintain Ant build files at work, but haven't ever written one from scratch. Luckily, one for Shoechicken would be fairly rudimentary, and it may be worth the time. Currently Michael and I are both using Eclipse to build the project, but a standardized build file from which to create the project may be a good idea. I'm thinking that the build file would only take a few hours to make, too. Michael and I can talk about it when we meet tomorrow.


I've been checking the search engines and we're still only up on Yahoo, Altavista, and AllTheWeb. I make sure to click on our links on each search engine; I'm not sure that it does anything, but it would seem logical for the search engines to keep track of what people actually click on.


That's all for now, but there should be plenty more to post after Michael and I meet tomorrow.