Monday, March 28, 2011

Paper Reading #17: Personalized Reading Support for Second-Language Web Documents by Collective Intelligence

Reference Information:
Title:  Personalized Reading Support for Second-Language Web Documents by Collective Intelligence
Authors: Yo Ehara, Nobuyuki Shimizu, Takashi Ninomiya, Hiroshi Nakagawa
Presentation Venue: IUI’10, February 7–10, 2010, Hong Kong, China

Summary:

This paper is about an interface that predicts which words on an English web page the user might not understand. If the user clicks on one of these words, then its definition is displayed. This can be more efficient than previous glossing systems because it glosses only the words predicted to be unfamiliar, rather than every single word in the text regardless of whether the user knows it.

Logistic regression is used to model the system and predict the words unknown to the user. The next part of the paper gives a very technical, mathematical explanation of this model.
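To make the idea a bit more concrete, here is a minimal sketch of logistic regression predicting whether a word is unknown to a user. This is not the authors' model: the two features (log of a word's frequency rank and scaled word length), the toy training data, and the function names train and predict_unknown are all my own assumptions for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.1, epochs=2000):
    """Fit weights and a bias with plain stochastic gradient descent."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of the log loss w.r.t. the score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict_unknown(w, b, x):
    """Estimated probability that the user does NOT know the word."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical features per word: (log10 of frequency rank, length / 10).
# These are assumptions, not the features used in the paper.
samples = [
    (0.0, 0.3),  # very common, short word
    (1.0, 0.4),
    (2.0, 0.5),
    (3.5, 0.9),
    (4.0, 1.1),
    (4.3, 1.2),  # rare, long word
]
labels = [0, 0, 0, 1, 1, 1]  # 1 = this user did not know the word

w, b = train(samples, labels)
common = predict_unknown(w, b, (0.5, 0.4))
rare = predict_unknown(w, b, (4.1, 1.0))
print(f"common word: {common:.2f}, rare word: {rare:.2f}")
```

In a glossing interface, only words whose predicted probability passes some threshold (say 0.5, again my assumption) would be marked for the reader.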

For testing purposes, a database of 12,000 words was created that lists the most fundamental words that an English speaker should know. For the first study, each of the 16 subjects answered 12,000 questions, one for each word in the database. Users were asked to match definitions with words.

Encouragingly, the studies show that the algorithm adapts quickly to new users. This is helpful to know because it shows that a system like this can be useful to first-time users, as opposed to a system that relies on a static, predefined list of "difficult" words.


Discussion:

I think something like this would be pretty helpful in certain situations. In most cases, I personally would not want it taking up space on my screen. However, if I were reading a technical document or something for school, I would like to have it available. I like that a system can be created to adapt to each user's vocabulary.

What I didn't like about this paper is that, even after reading it fully, I'm not 100% sure I understand how the model works or what the results of the studies mean. This sounds weird, but the middle of the paper is cluttered with field-specific terms and mathematics. It was hard to follow because I don't fully understand what the authors are talking about, even after they try to give a short description. I think this is just because the paper is meant for people already familiar with the subject.

3 comments:

  1. I agree, I don't think that I would want this impeding my reading. However, this would probably be quite helpful to those learning a language.

  2. I agree, I see this being useful for learning languages or for technical documents. Other than that though, I think it might just get annoying, especially if the prediction system isn't very good.

  3. Right on, I think this should be used in moderation and that having it in every text reader would be very distracting. PS: the paper was very hard to follow.
