Sunday, November 8, 2009

Reading Notes for this week

Web Search Engines: Part 1 and Part 2

I was unable to find these articles online. They didn't come up when I followed the link, and I couldn't find them by searching the journal website by article title or author name.

Current developments and future trends for the OAI protocol for metadata harvesting

- OAI = Open Archives Initiative
- the original purpose of the protocol was to facilitate access to "diverse e-print archives"
- serves as a way to distribute content
- requires metadata in specified formats, which makes it possible to search the "invisible web"
- OAI wasn't originally meant for libraries but has proven beneficial to libraries and archives
- Several libraries, sometimes working in collaboration, have started initiatives that use the protocol
- There are some problems with the OAI registry of data providers, including completeness and search/browse capabilities
- ERRoLs = Extensible Repository Resource Locators
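
To make the harvesting idea in the notes above more concrete, here is a minimal Python sketch of what a harvester might do: build a ListRecords request and pull Dublin Core titles out of the response. The repository URL, the sample record, and the function names are assumptions made up for illustration; only the `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; a real data provider publishes its own base URL.
BASE_URL = "https://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL (illustrative helper, not a library API)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

# Invented sample response, trimmed to a single record for the example.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def extract_titles(xml_text):
    """Return every Dublin Core title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.iterfind(".//dc:title", NS)]

print(list_records_url(BASE_URL))
print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also fetch the URL over HTTP and follow resumption tokens for long result lists; both are left out to keep the sketch short.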


The Deep Web: Surfacing Hidden Value

- regular search engines cannot retrieve pages that sit in the part of the web known as the deep web
- In the deep web, results are only returned in response to a specific search query
- search engines usually find web pages in two ways: authors submit their pages, or the engine crawls documents from one "hypertext link to another"
- the study presented in the article set out to estimate the size of the deep Web and the relevance of its content
- sites are assigned to one of twelve arbitrary categories
- deep websites receive, on average, about 50% more monthly traffic than surface websites
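
The link-following point in the notes above can be sketched with a toy crawler. This is only an illustration under invented assumptions: the page names and the link graph are made up, and a real search engine crawler is far more involved. It shows that a page no other page links to is never reached, which is roughly why database-backed content stays in the deep web.

```python
from collections import deque

# Hypothetical toy site: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "search-form"],
    "about": ["home"],
    "search-form": [],      # results only appear after a query is submitted
    "record-42": ["home"],  # database-backed page; nothing links to it
}

def crawl(links, seed):
    """Breadth-first crawl that follows hypertext links from a seed page."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

surface = crawl(LINKS, "home")  # pages a link-following crawler can reach
deep = set(LINKS) - surface     # pages it never discovers

print(sorted(surface))
print(sorted(deep))
```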
