My blog has moved!

You will be automatically redirected to the new address. If that does not occur, visit
http://www.kdmcgregor.wordpress.com
and update your bookmarks.

Friday, December 19, 2008

On vacation

I am on vacation in sunny Jamaica, enjoying the beach and the 30-35 degree weather.
I am seeing the snow in Canada....brr!!!!!
I am a poor blogger when it comes to regular updates...so this vacation is going to make it worse.


See you in January, Canada....brr!!!!!!!

Friday, December 5, 2008

Searching the web - Part II & III

In my continuing series on searching the web, I will look at the ARC (Automatic Resource Compilation) algorithm and the SALSA algorithm.

ARC Algorithm
ARC is an extension of the HITS algorithm and uses the same notion of hubs and authorities. Like HITS, it uses a term-based search engine to create a root set. The main difference is that ARC also performs textual analysis of the web pages and uses the results to weight the hub and authority scores.
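
To make the weighting idea concrete, here is a minimal sketch in Python of a HITS-style iteration where each link carries a weight. It is not taken from the ARC authors; the weighting rule (1 plus the number of query terms found in the link text), the function names and the tiny three-link graph are all my own assumptions for illustration.

```python
import math

def anchor_weight(anchor_text, query):
    """Hypothetical weighting rule: 1 plus the number of query terms in the link text."""
    words = anchor_text.lower().split()
    return 1 + sum(1 for term in query.lower().split() if term in words)

def normalize(scores):
    """Scale the scores so that their squares sum to 1."""
    length = math.sqrt(sum(v * v for v in scores.values()))
    return {k: v / length for k, v in scores.items()} if length else scores

def weighted_hits(links, iterations=50):
    """links maps (source, target) to a link weight; returns (hub, authority) scores."""
    pages = sorted({page for edge in links for page in edge})
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # A page's authority is the weighted sum of the hub scores pointing at it.
        auth = {page: 0.0 for page in pages}
        for (src, dst), weight in links.items():
            auth[dst] += weight * hub[src]
        # A page's hub score is the weighted sum of the authorities it points to.
        new_hub = {page: 0.0 for page in pages}
        for (src, dst), weight in links.items():
            new_hub[src] += weight * auth[dst]
        auth = normalize(auth)
        hub = normalize(new_hub)
    return hub, auth

# Made-up example: three links and the text that appears on each of them.
query = "agile development"
anchors = {
    ("a", "c"): "agile development guide",
    ("b", "c"): "click here",
    ("a", "d"): "my photos",
}
links = {edge: anchor_weight(text, query) for edge, text in anchors.items()}
hub, auth = weighted_hits(links)
```

In this toy graph, page c ends up with a higher authority score than page d, partly because the link from a to c has text that matches the query.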

SALSA algorithm
SALSA (Stochastic Approach for Link Structure Analysis) is also an extension of the HITS algorithm. It uses the same concepts of hub and authority pages; however, it uses the theory of Markov chains to perform two random walks on the web graph. One walk is conducted on the authority side of the web graph (the authority chain) and the other on the hub side (the hub chain). The algorithm builds a matrix of the links between pages and applies it to the hub and authority matrices iteratively. The result is the principal eigenvector of each matrix, and the web pages with the largest entries in those eigenvectors are ranked highest.
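
The authority-side random walk can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the function name and the toy graph are made up, the walk is simulated by repeatedly pushing probability around rather than by building matrices, and the graph is assumed to be connected. One step of the walk goes backwards from a page to a random page that links to it, then forwards along a random link of that page.

```python
from collections import defaultdict

def salsa_authority(links, iterations=50):
    """links is a list of (source, target) pairs; returns authority scores that sum to 1."""
    out_links = defaultdict(list)  # page -> pages it links to
    in_links = defaultdict(list)   # page -> pages that link to it
    for src, dst in links:
        out_links[src].append(dst)
        in_links[dst].append(src)

    authorities = sorted(in_links)
    # Start the walk with equal probability on every page that has incoming links.
    score = {page: 1.0 / len(authorities) for page in authorities}

    for _ in range(iterations):
        nxt = {page: 0.0 for page in authorities}
        for page in authorities:
            # Step back: pick one of the pages linking here, uniformly at random.
            back = score[page] / len(in_links[page])
            for hub in in_links[page]:
                # Step forward: follow one of that page's links, uniformly at random.
                forward = back / len(out_links[hub])
                for target in out_links[hub]:
                    nxt[target] += forward
        score = nxt
    return score

# Made-up graph: h1 and h2 link to both a and b, h3 links only to a.
links = [("h1", "a"), ("h1", "b"), ("h2", "a"), ("h2", "b"), ("h3", "a")]
scores = salsa_authority(links)
```

For this toy graph the walk settles at 0.6 for page a and 0.4 for page b, which matches each page's share of the five incoming links. The hub chain works the same way with the two steps swapped.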

I have not found any practical applications that use these algorithms. As soon as I find them I will post the links.

Tuesday, November 25, 2008

Searching the web 1.5

As the semantic web forges ahead, there exists a semantic search engine called hakia. I have not tried it out as yet, so I guess a hakia one week challenge is necessary.

Wednesday, November 12, 2008

vmware 2.0

I finally upgraded to vmware 2.0 on my Debian file server. I'm loving it. It uses a web interface.
One thing I like about it is that your mouse moves freely between the host computer and the virtual machine. In the older version of vmware, when your mouse was in the virtual machine you had to type a control sequence to go back to the host machine.
When you install vmware 2.0 and start it, you have to enter a user name and password. The default will be the administrative credentials for the host machine (root for Linux). You may need to install a plugin in the browser to start the vm.

I am going to need more memory.

Saturday, November 8, 2008

Searching the web Part I

This week I took a cuil one week challenge. I was curious to find out how this search engine stacks up against Google's search engine. My aim was to use no other search engine apart from cuil. However, within three days of the challenge, I had to switch. Search results were poor. A search on, say, 'agile development' resulted in no Wikipedia hits. There was no spell checker, and there were no add-ons for Firefox.

One reason for the abrupt end to my cuil one week challenge was that I was introduced to another search engine called clusty. Clusty not only returned better results than cuil, but instead of delivering millions of search results in one long list, it grouped similar results together into clusters. Clusters help you see your search results by topic so you can zero in on exactly what you’re looking for or discover unexpected relationships between items. What is great is that you can search within clusters.

Clusty uses a clustering algorithm for its search engine. Everyone is familiar with Google's PageRank algorithm; clustering is a different approach that separates unrelated documents and groups related documents together. Using the contents of web pages and their link information, the content-link hypertext clustering algorithm groups similar web pages into clusters that can be searched or combined into larger clusters. To generate clusters, the algorithm uses two similarity functions: one that examines the hyperlinks of the pages and one that examines the contents of the pages. Combining the hyperlink and content similarity functions iteratively produces clusters of similar web pages.
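
Here is a small Python sketch of the two-similarity idea. It is not Clusty's actual algorithm, which I have not seen. The choice of cosine similarity for content, shared-link overlap for hyperlinks, the equal weighting, the threshold and the sample pages are all my own assumptions for illustration.

```python
import math
from collections import Counter

def content_similarity(text_a, text_b):
    """Cosine similarity between the word counts of two pages."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def link_similarity(links_a, links_b):
    """Overlap (Jaccard index) between the sets of URLs two pages link to."""
    a, b = set(links_a), set(links_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def combined_similarity(page_a, page_b, content_weight=0.5):
    """Weighted mix of the content and hyperlink similarity functions."""
    return (content_weight * content_similarity(page_a["text"], page_b["text"])
            + (1 - content_weight) * link_similarity(page_a["links"], page_b["links"]))

def cluster_pages(pages, threshold=0.3):
    """Keep merging clusters that contain a sufficiently similar pair of pages."""
    clusters = [[name] for name in pages]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(combined_similarity(pages[x], pages[y]) >= threshold
                       for x in clusters[i] for y in clusters[j]):
                    clusters[i] += clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

# Made-up pages: two about agile development, one about cooking.
pages = {
    "agile1": {"text": "agile software development sprint",
               "links": ["wiki/Agile", "wiki/Scrum"]},
    "agile2": {"text": "agile development with scrum sprint",
               "links": ["wiki/Agile", "wiki/Scrum"]},
    "cook1": {"text": "jerk chicken recipe", "links": ["recipes/jerk"]},
}
clusters = cluster_pages(pages)
```

With these sample pages the two agile pages land in one cluster and the cooking page stays on its own.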

Other web search algorithms of note are:
HITS - Hypertext Induced Topic Selection
ARC - Automatic resource compilation
SALSA - Stochastic Approach for Link Structure Analysis

These I will discuss in my next blog post.

Tuesday, September 30, 2008

Time

In my first blog post, I said that I was in the process of creating my own website. As I said, my charter, scope and WBS have been created. However, nothing has transpired since then. I started my blog, and this is my first post in weeks.
As a result, the thought came to me: how do IT professionals find the time to upgrade their skills?
The IT industry constantly changes: new programming languages, new platforms, and new paradigms emerge at a rapid rate. To ensure you are not a dinosaur in the IT industry, you have to keep up.
However, most IT professionals have a life outside IT. They have other responsibilities: wives, children, house, etc. Even at work one is confined to the projects and deadlines.
I myself try to find the time to learn. I listen to IT podcasts when I exercise or do stuff around the house. When it comes to hands-on learning, I usually have a laptop with me while listening to the TV.
It would be interesting to hear feedback from other IT professionals as to how they find that balance.

Thursday, August 28, 2008

IM + Email + Social Networks

Here is a neat tool that manages all your existing IM, email, and social network accounts. It is called Digsby. IM accounts that can be added are:
  • AIM
  • MSN
  • Yahoo
  • Facebook chat
  • ICQ
  • Jabber

Social network sites that can be added are:

  • Facebook
  • MySpace
  • Twitter

I have been looking for a tool for Twitter, and this seems to be what I have been looking for. This tool is in beta, and support for other social networks is coming soon.

Friday, August 22, 2008

Network tools for Linux

Here are a few network tools for working with the Linux OS:
  • ifconfig : This tool can be used to display the current configuration of a network interface. A privileged user can also use it to change any parameter of a network interface, be it an Ethernet card, a serial PPP link, or the loopback interface. For example, to show the configuration of all network interfaces on the host, we can use: ifconfig -a

  • netstat : This tool can extract many different kinds of information about all network interfaces or just one. A short rundown of some of netstat's arguments:

    Argument    Effect
    (nothing)   Display open connections (sockets)
    -a          Also show listening and non-listening sockets
    -c          Redisplay selected table continuously
    -i          Display network interfaces
    -n          Display IP addresses, don't resolve names
    -r          Display network routes
    -s          Display network statistics
    -v          Provide verbose information

  • snoop and tcpdump : Both of these utilities enable an administrator to examine the packets being sent on a network.
    Either tool allows packets to be examined as they appear on the network. Various options allow packets to be filtered according to source IP address and port, destination IP address and port, protocol, message type, and so on. For example, Apache's communications could be monitored on port 80, filtered down to data packets.

  • spray : A variant of ping. spray floods a destination server with ping packets to test the handling capacity of the network and server. The higher the percentage of packets that reaches the destination, the better the network. This is an unfriendly thing to do to a network that is handling real network traffic, so it should be used with caution.
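
The kind of filtering described for snoop and tcpdump can be sketched in Python. This does not capture any real traffic. The Packet record, its field names and the sample packets are all made up for illustration, and the point is only to show how filters on protocol, port and host combine.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """Hypothetical summary of one captured packet."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    payload_bytes: int

def matches(packet, protocol=None, port=None, host=None, data_only=False):
    """Return True if the packet passes every filter that was given."""
    if protocol is not None and packet.protocol != protocol:
        return False
    if port is not None and port not in (packet.src_port, packet.dst_port):
        return False
    if host is not None and host not in (packet.src_ip, packet.dst_ip):
        return False
    if data_only and packet.payload_bytes == 0:
        return False
    return True

# Made-up capture: a web request and response, plus some unrelated traffic.
captured = [
    Packet("192.0.2.10", 51000, "192.0.2.1", 80, "tcp", 0),     # handshake, no data
    Packet("192.0.2.10", 51000, "192.0.2.1", 80, "tcp", 312),   # HTTP request
    Packet("192.0.2.1", 80, "192.0.2.10", 51000, "tcp", 1460),  # HTTP response
    Packet("192.0.2.10", 53124, "192.0.2.53", 53, "udp", 40),   # DNS lookup
]

# Apache-style monitoring: TCP traffic on port 80, data packets only.
web_data = [p for p in captured if matches(p, protocol="tcp", port=80, data_only=True)]
```

With the real tool, the port part of this filter would be written as a filter expression such as: tcpdump tcp port 80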

Thursday, August 21, 2008

The Life long learner

For those lifelong learners like myself, check out iTunes U. There is a wide range of topics from different universities. It's all free to download. The format is mostly audio; however, there are some video presentations as well.
You will need to install iTunes to download the content. Hey, it beats paying college fees.

Tuesday, August 12, 2008

The Linux Desktop - reality ?

I recently completed a course called Organizational and Business Communication at Fanshawe College. One of the course requirements was to do a presentation. My presentation was on Data Mining; more on this in a later blog post.
One of the class members did a presentation on Linux and showed some open source alternatives to the common Windows applications. Most members of the class were intrigued; however, one class member asked the following questions:
  • Can you have MSN Messenger?
  • Does the open source equivalent have the same functionality as MSN Messenger?
  • Does iTunes work on Linux?
These questions showed that the Linux desktop has a long way to go before it is adopted as a desktop equivalent to Windows. Sure, you can use a Windows compatibility layer like Wine, but which non-geek computer user will have the time to configure a Windows app to run on Linux?
I have three Linux boxes at home, and I too use Windows sometimes. Mind you, this is running in a VM on Debian.
From the view of outsiders, Linux is for geeks who relish writing complex command line statements. It will take a while to convince individuals born into the world of Microsoft Windows.
Hey, I can't convince my wife to switch.

Sunday, July 27, 2008

Experts Exchange

As IT professionals, we usually search the web for solutions to unfamiliar IT problems. One source of solutions is Experts Exchange. Although Experts Exchange is a subscription-based website, if its link comes up in a search result for your problem, select the link and scroll to the very bottom of the page. You may find the answer to your problem.

Wednesday, June 18, 2008

DMZ and you

As I mentioned before, I am in the process of creating my personal website. This website will be hosted on my machine in my basement. My network infrastructure at work involves a router that lies between the Internet and my computers (5). Hosting a web server, or even a game server, usually requires enabling the DMZ on your router. This is not a secure option, as I expose not only my web server to the outside world, but my other computers as well.
One option is to use two routers: one router with a DMZ for my web server, and the other router for my other computers. See this article for more information. This scenario provides a more secure infrastructure.

Monday, May 12, 2008

Hello World!

This is my first venture into the blogosphere. Most programmers will be familiar with my title. It is usually the first output you produce when you learn a new programming language.

You may be wondering about the kdmcgreg in the url http://kdmcgreg.blogspot.com/. Well, kdmcgreg made up my first email address. That was 10 years ago when I started my Computer Science degree.


This blog will act as an anchor for my real venture of creating my own website. I am using only open source tools and technologies for my website. More later.


I am doing a project management course, and that course has truly opened my eyes. For the past 10 years, I have been a techie: nothing but learning the latest programming language and cool tools, getting work packages from work and implementing them. However, this project management course has shown me a new side to the IT field: project charter, project scope, Work Breakdown Structure (WBS), and network diagrams. These are some of the tools in project management that help ensure that your project will be successful. As a result, I am using my newly developed skills to build my website. I have created a charter, a scope, and a WBS. This has helped me with my planning. Now on to execution!!!!