Currently, when I search for information on the net, search engines definitely do a good job and show me numerous results (37,999 results matched!).
However, when I start reading each of the links in the top 10, I see that most of them overlap in content. I spend 10 minutes reading each page, only to find, at the end of the 10 links, that the information I have really gained amounts to maybe 1.75 pages.
Now, search engines hog the bandwidth of websites and download their complete data. However, the real power of this complete data is not harnessed.
If there were a search engine with a reader attached to it, one that could show me snippets or excerpts of the information on each page, or at least cluster results based on content overlap, that would be cool.
Now, the challenges for this –
What exactly is information and how do you find its overlap?
Is this computationally feasible?
If we add another layer (a reader layer) between the search engine and the documents, will it still be usable?
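As a rough illustration of the first challenge, one crude proxy for "information overlap" is textual overlap: compare the word shingles of two pages with the Jaccard similarity, and greedily group results whose similarity crosses a threshold. This is only a sketch, not a real search-engine API; the page data, the shingle size, and the 0.3 threshold are all arbitrary assumptions for illustration.

```python
# Sketch: estimate "information overlap" between result pages via
# Jaccard similarity over word shingles, then greedily cluster pages.
# All inputs and the 0.3 threshold are illustrative assumptions.

def shingles(text, k=3):
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster_results(pages, threshold=0.3):
    """Greedily group pages whose overlap with a cluster's first
    member exceeds the threshold; otherwise start a new cluster."""
    clusters = []  # each cluster: list of (title, shingle set)
    for title, text in pages:
        s = shingles(text)
        for cluster in clusters:
            if jaccard(s, cluster[0][1]) >= threshold:
                cluster.append((title, s))
                break
        else:
            clusters.append([(title, s)])
    return [[title for title, _ in c] for c in clusters]

# Two near-duplicate pages (A, B) and one unrelated page (C).
pages = [
    ("A", "python is a popular programming language for data science"),
    ("B", "python is a popular programming language used in data science"),
    ("C", "the history of the roman empire spans many centuries"),
]
print(cluster_results(pages))  # A and B group together, C stands alone
```

A real system would of course need something far better than surface shingles (the same fact can be worded very differently), which is exactly why the "what is information overlap?" question above is hard.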