Thoughts about the rating agent
As Michael had already posted, I spent some time this weekend working on the database, only to discover Michael was working on it too. We'll have to work on that whole planning thing. We've got the empty shell of the database in place now though.
I've also been thinking about the internals of the rating agent and some of the conceptual functionality of Shoechicken. One thing I was thinking about is whether the rating agent could be thought of as a system service. This would allow news to be aggregated even when the GUI (Frying Pan) was not active. Since the RSS/Atom feeds cycle through articles, having the service running in the background would allow the retrieval and reclocation of articles that could otherwise be lost. At first this seemed to be an alright idea, but when I talked with Rachel she pointed out that a permanently running service may put some people off the application. Also, if someone doesn't check the news very often, they're probably not too interested in what transpired two weeks ago.
Part of what got me thinking about this was the possibility of re-reclocating articles that had not yet been read by the user. This doesn't seem to make sense if the article has already been posted in the Frying Pan as articles could jump around in the list too much. However, if the article had not yet been posted to the GUI, it would seem plausible to reclocate that article again to find its most appropriate ordered location in the unread article set. The problem is that this situation is not likely to occur if Shoechicken only runs when the GUI is active.
So, my final thoughts (thanks largely to Rachel) are that the user could be given the option to have Shoechicken collect news in the background, or just when the GUI is open. I know this is thinking a little way down the road, but my goal is to avoid design choices that would rule out the background option.
On to the Shoechicken rating agent internals.
1. Shoechicken should receive FeedItemEvents from the Bufunkalo. He can then build a vector representation of this article with each component being a normalized TF-IDF weight. The IDF part will be calculated from the set of all unread articles.
2. The TF-IDF weight vector can then be passed on to an LSI component that can put the new vector into the item vector space and reduce the dimensionality. Although calculating this can be expensive, I'm not sure that it will be that bad for our data set, so it will be easiest to just recompute the SVD each time instead of folding the new vector in. This can be changed in future implementations but isn't necessary for us right now.
3. The reduced new vector should end up at the reclocating component, which can compute the item's rating based on the user model. The rating will then be stored in the database, and the UI Agent can then be notified.
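To make steps 1 and 3 a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not a design decision: the function names (`tokenize`, `tfidf_vectors`, `rate`) are made up, the user model is assumed to be a single vector in the same space as the articles, the rating is assumed to be a cosine similarity, and the LSI/SVD reduction from step 2 is left out entirely so the example stays dependency-free.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a real tokenizer would do more."""
    return text.lower().split()


def tfidf_vectors(unread_articles):
    """Return (vocabulary, L2-normalized TF-IDF vectors).

    The IDF part is computed over the unread articles only (step 1).
    """
    docs = [Counter(tokenize(article)) for article in unread_articles]
    vocab = sorted({term for doc in docs for term in doc})
    n = len(docs)
    # Document frequency: how many unread articles contain each term.
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    idf = {t: math.log(n / df[t]) for t in vocab}
    vectors = []
    for doc in docs:
        vec = [doc[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in vec))
        # Leave an all-zero vector untouched to avoid dividing by zero.
        vectors.append([w / norm for w in vec] if norm > 0 else vec)
    return vocab, vectors


def rate(item_vector, user_vector):
    """Cosine similarity between an item and a hypothetical user-model vector."""
    dot = sum(a * b for a, b in zip(item_vector, user_vector))
    item_norm = math.sqrt(sum(a * a for a in item_vector))
    user_norm = math.sqrt(sum(b * b for b in user_vector))
    if item_norm == 0 or user_norm == 0:
        return 0.0
    return dot / (item_norm * user_norm)


# Made-up articles; pretend the user model is just the first article's vector.
articles = [
    "python release notes",
    "python conference news",
    "football match report",
]
vocab, vectors = tfidf_vectors(articles)
user_vector = vectors[0]
ratings = [rate(v, user_vector) for v in vectors]
```

With these toy articles, the second article shares a term with the user vector and so gets a higher rating than the third, which shares none. In the real pipeline the vectors would pass through the LSI component before being rated.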
The feedback/AI component isn't addressed here, as I think this is best left to a later date. Laying out the AI shouldn't alter the design of this part too much. It's more the case that the AI will tweak each component to try to improve its effectiveness.
This is what I've thought up so far. Michael and I are going to meet on Tuesday and will probably talk over the ideas in more detail. Hopefully I can start on some coding mid next week!