My blog has moved!

You will be automatically redirected to the new address. If that does not occur, visit
http://www.kdmcgregor.wordpress.com
and update your bookmarks.

Friday, December 19, 2008

On vacation

I am on vacation in sunny Jamaica, enjoying the beach and the 30-35 °C weather.
I am seeing the snow in Canada... brr!
I am a poor blogger when it comes to regular updates, so this vacation is going to make it worse.


See you in January, Canada... brr!

Friday, December 5, 2008

Searching the web - Part II & III

In my continuing series on searching the web, I will look at the ARC (Automatic Resource Compilation) algorithm and the SALSA algorithm.

ARC Algorithm
ARC is an extension of the HITS algorithm and uses the same notion of hubs and authorities. Like HITS, it uses a term-based search engine to create a root set. The main difference is that ARC performs textual analysis of the web pages and weights the hub and authority scores based on that analysis: a link whose surrounding text matches the search terms counts for more than a link whose text does not.
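
To make the weighting idea concrete, here is a minimal sketch in Python. It is not the published ARC implementation. It assumes a simple rule (a link's weight is 1 plus the number of query terms found in the text near the link) and a tiny hand-made graph. The names link_weight, weighted_hits and example_links are my own, made up for illustration.

```python
import math

def link_weight(nearby_text, query_terms):
    """Assumed rule: 1 plus the number of words near the link that are query terms."""
    wanted = {term.lower() for term in query_terms}
    return 1 + sum(1 for word in nearby_text.lower().split() if word in wanted)

def normalize(scores):
    """Scale the scores so that their squares sum to 1."""
    length = math.sqrt(sum(v * v for v in scores.values()))
    return {page: v / length for page, v in scores.items()}

def weighted_hits(links, query_terms, iterations=50):
    """links maps (source, target) to the text found near that link."""
    weights = {edge: link_weight(text, query_terms) for edge, text in links.items()}
    pages = {page for edge in links for page in edge}
    hub = {page: 1.0 for page in pages}
    authority = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: weighted sum of the hub scores of pages linking in.
        authority = {page: 0.0 for page in pages}
        for (source, target), w in weights.items():
            authority[target] += w * hub[source]
        authority = normalize(authority)
        # Hub score: weighted sum of the authority scores of pages linked to.
        hub = {page: 0.0 for page in pages}
        for (source, target), w in weights.items():
            hub[source] += w * authority[target]
        hub = normalize(hub)
    return hub, authority

# Hypothetical example: two hub pages linking to two candidate authorities.
example_links = {
    ("h1", "a1"): "best python tutorial",
    ("h1", "a2"): "click here",
    ("h2", "a1"): "python docs",
}
hub, authority = weighted_hits(example_links, ["python"])
```

In this toy graph the links into a1 mention the query term and the link into a2 does not, so a1 ends up with the higher authority score. With every weight set to 1 the loop reduces to plain HITS.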

SALSA Algorithm
SALSA (Stochastic Approach for Link-Structure Analysis) also builds on the HITS algorithm and uses the concepts of hub and authority pages; however, it uses the theory of Markov chains to perform two random walks on the web graph. One walk is conducted on the authority side of the web graph (the authority chain) and the other on the hub side (the hub chain). The algorithm creates a matrix that consists of the links between pages. This link matrix is used to build the hub and authority matrices, which are then applied in an iterative manner. What this produces is the principal eigenvector of each matrix. The web pages with the largest entries in these eigenvectors are ranked highest.
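
The authority-side walk can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the graph is small, connected and has no duplicate links, and the function name salsa_authority_scores and the example graph are made up. Each step of the walk goes backward along a random in-link to a hub, then forward along a random out-link of that hub.

```python
from collections import defaultdict

def salsa_authority_scores(links, iterations=100):
    """Power iteration on the authority chain of a list of (source, target) links."""
    in_links = defaultdict(list)   # page -> hubs that link to it
    out_links = defaultdict(list)  # hub -> pages it links to
    for source, target in links:
        in_links[target].append(source)
        out_links[source].append(target)

    authorities = sorted(in_links)
    # Start the walk from a uniform distribution over the authorities.
    score = {page: 1.0 / len(authorities) for page in authorities}
    for _ in range(iterations):
        next_score = {page: 0.0 for page in authorities}
        for page in authorities:
            # Step backward along a randomly chosen in-link...
            share = score[page] / len(in_links[page])
            for hub in in_links[page]:
                # ...then forward along a randomly chosen out-link of that hub.
                for neighbour in out_links[hub]:
                    next_score[neighbour] += share / len(out_links[hub])
        score = next_score
    return score

# Hypothetical example: three hubs, two authorities, one connected component.
example_links = [
    ("h1", "a1"), ("h1", "a2"),
    ("h2", "a1"),
    ("h3", "a1"), ("h3", "a2"),
]
scores = salsa_authority_scores(example_links)
```

For this graph the scores settle at 0.6 for a1 and 0.4 for a2, which is proportional to the number of links pointing at each page (3 and 2). The hub chain works the same way with the two steps swapped.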

I have not found any practical applications that use these algorithms. As soon as I find them, I will post the links.