It's not ironic - my post was a musing inspired by your work. I guess I
wasn't sure if I understood your results. You were looking at the overall
POS usage in the entire texts as a possible way of ranking the texts. I was
wondering about POS of particular search terms - those that could take on
several POS. A related question - does Solr use stemming to widen the search
to various POS? Then would it be meaningful to rank the given texts by the
POS of the actual search terms? And has anyone looked at samples of user
search terms - are they almost always noun phrases? I just want to
understand what you have explored. And I probably should have added to your
thread on NGC4LIB, rather than Code4lib - I tend to conflate them.
Cindy Harper, Systems Librarian
Colgate University Libraries
315-228-7363
On Sat, Feb 19, 2011 at 5:42 PM, Eric Lease Morgan wrote:
> On Feb 19, 2011, at 11:26 AM, Cindy Harper wrote:
>
> > I just was testing our discovery engine for any technical issues after a
> > reboot. I was just using random single words, and one word I used was
> > "correct". Looking at the first ranked items, I wondered if there's some
> > role for parts-of-speech in ranking hits - are nouns and, in this case,
> > adjectives more indicative of aboutness than verbs? The first items were
> > "Miss Manners ... excruciatingly correct behavior", then a bunch of govdocs
> > on "an act to correct....". I don't think there's any reason to prefer
> > nouns over verbs, but I thought I'd throw the thought at you anyway.
>
>
>
> Ironically, I was playing with parts-of-speech (POS) analysis the other
> day. [1]
>
> Using a pseudo-random sample of texts, I found there to be surprisingly
> similar POS usage between texts. With such similarity, I thought it would be
> difficult to use general POS as a means for ranking or sorting. On the other
> hand, specific POS may be useful. For example, Thoreau was dominated by
> first-person male pronouns but Austen was dominated by second-person female
> pronouns.
>
> I think there is something to be explored here.
>
> [1] POS - http://bit.ly/hsxD2i
>
> --
> Eric "Still Counting Tweets and Chats" Morgan
>