All of us who have access to the Internet will have come across search engines at some point. They are the indispensable pages which let us search the whole web for articles on topics that we specify. Before search engines were around, Internet users' only help in finding topics was manually built directories of pages. If these did not take you where you wanted to go, the only other choice was to follow links until you either found the information you were after or just gave up. Now, though, all we have to do is enter a keyword or two, press enter, and hey presto, 10 million pages that contain just what we want. Hmmm? Read on to discover a few straightforward tips to help you find the information you want.
The main problem we face is that when we do a search we are returned vast numbers of pages, most of which are either partly or totally irrelevant. We must therefore make our searches as specific as possible. The most obvious methods are to include more keywords and to make sure our existing keywords are not too general, but we do have quite a few other options available to us.
Most of the popular search engines, e.g. Infoseek, Lycos, Yahoo!, Excite, WebCrawler and AltaVista, have some type of syntax we can utilise to help us. Fortunately the syntax for simple operations is fairly uniform across the various engines. The most commonly used features are Boolean operations and the grouping of terms into phrases. Beyond these, some engines offer case-sensitive searching, wildcards, name searches and searching within specific fields of the document.
Let's run through a few examples. (A complete table of search engine operations is provided for reference.)
Yamaha AND keyboards AND synthesizers
This forces the results to contain all three words. We can use + instead of AND directly in front of a word, e.g.
+Yamaha +keyboards +synthesizers
which has exactly the same result.
If you did not want information about Yamaha musical instruments then you could try:
yamaha -music
or
yamaha NOT music
depending on the search engine, to prevent music from being in our results.
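To make the effect of these operators concrete, here is a minimal Python sketch of required and excluded keywords. It is only an illustration of the idea, not how any of the engines above actually work: the function name matches and the three made-up pages are assumptions introduced for this example.

```python
def matches(text, required=(), excluded=()):
    """Return True if text contains every required word and no excluded word."""
    words = set(text.lower().split())
    has_all = all(word.lower() in words for word in required)
    has_none = not any(word.lower() in words for word in excluded)
    return has_all and has_none


# A made-up three-page "web", for illustration only.
pages = {
    "page1": "Yamaha keyboards and synthesizers for sale",
    "page2": "Yamaha motorcycles and engines",
    "page3": "Yamaha music school keyboards",
}

# Roughly equivalent to: +Yamaha +keyboards +synthesizers
all_three = [name for name, text in pages.items()
             if matches(text, required=["Yamaha", "keyboards", "synthesizers"])]

# Roughly equivalent to: yamaha -music
no_music = [name for name, text in pages.items()
            if matches(text, required=["yamaha"], excluded=["music"])]

print(all_three)  # ['page1']
print(no_music)   # ['page1', 'page2']
```

Note that this sketch compares whole words only and ignores case; real engines differ in how they treat case and punctuation.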
Matching any of the entered keywords (OR) is normally the default and therefore requires no special syntax.
To search for an exact phrase, enclose the words in quotation marks:
"return of the native"
Using Infoseek we get 260 useful results, as opposed to 22 million if we did not use the quotation marks.
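To see why quoting a phrase narrows the results so sharply, compare the two checks in the Python sketch below. The function names contains_phrase and contains_any are hypothetical, and the matching rules (whole words, case ignored) are assumptions made for the example rather than a description of Infoseek.

```python
def contains_phrase(text, phrase):
    """True only if the words of phrase appear consecutively, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def contains_any(text, phrase):
    """True if at least one word of phrase appears anywhere in text."""
    words = set(text.lower().split())
    return any(word in words for word in phrase.lower().split())


review = "a review of The Return of the Native"
birds = "the native birds return each spring"

print(contains_phrase(review, "return of the native"))  # True
print(contains_phrase(birds, "return of the native"))   # False
print(contains_any(birds, "return of the native"))      # True
```

The second page shares three words with the phrase, so an unquoted search would return it, but it is clearly not about the book.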
WebCrawler's NEAR operator can be used as follows:
pentium NEAR/10 performance
This means that the word pentium must be no more than 10 words away from performance.
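A proximity test of this kind can be sketched in a few lines of Python. This is an illustration only: the function name near is made up, and measuring distance as the difference between word positions is an assumption, since the exact counting rule WebCrawler uses is not stated here.

```python
def near(text, first, second, max_distance):
    """True if some occurrence of first lies within max_distance words of second."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in first_positions
               for j in second_positions)


sentence = "the pentium chip gives good performance"

print(near(sentence, "pentium", "performance", 10))  # True
print(near(sentence, "pentium", "performance", 3))   # False
```

Here pentium and performance are four positions apart, so they pass a limit of 10 but fail a limit of 3.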
AltaVista, Infoseek and Yahoo! enable the user to use wildcards, similar to those found in DOS and UNIX for file searches. This is helpful if you are not sure of an exact spelling. E.g. in Yahoo! try
Tchaiko*
to find pages about the composer Tchaikovsky.
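Since these wildcards behave much like file-name wildcards, Python's standard fnmatch module gives a reasonable feel for them. The sketch below is an approximation under that assumption; the helper name wildcard_matches and the word list are invented for the example.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, words):
    """Return the words that match a file-style wildcard pattern, ignoring case."""
    pattern = pattern.lower()
    return [word for word in words if fnmatchcase(word.lower(), pattern)]


# Made-up index terms, including two spellings of the composer's name.
terms = ["Tchaikovsky", "Tchaikowsky", "Chopin", "Tchaikovsky's"]

print(wildcard_matches("Tchaiko*", terms))
# ['Tchaikovsky', 'Tchaikowsky', "Tchaikovsky's"]
```

Both spellings are picked up, which is exactly the case where a wildcard saves us from guessing.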
AltaVista and WebCrawler also have syntax to state that two words must appear together.
These are ; and ADJ respectively. They are equivalent to enclosing the words in quotes as a phrase, e.g. "Yamaha pianos" = Yamaha;pianos = Yamaha ADJ pianos
So far, the searches have been of whole documents, which tends to yield large numbers of results. Infoseek, Yahoo! and AltaVista now offer us the choice of searching just the document titles or the URL (Uniform Resource Locator). Title queries are useful if we want pages solely on a topic and not those that just mention it. URL queries are useful if we are looking for a web page belonging to a company but do not know its URL, and do not want hundreds of pages that merely mention the company or, if the company has an unfortunate name, are totally unrelated. For instance, searching for 'digital' using AltaVista returns 8 million entries. If we use its URL search as follows:
URL:digital
we are returned 20,000 entries and, more importantly, the host address.
Infoseek and AltaVista also provide searches for sites and links. So searching for
link:ic.ac.uk
will return all the sites which have links to Imperial College.
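The idea behind a link search can be sketched with Python's standard HTML parser: collect every link on a page and check whether any of them points at the host we care about. The class LinkCollector and function links_to are hypothetical names, and treating sub-hosts such as www.ic.ac.uk as matches for ic.ac.uk is an assumption made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_to(html, host):
    """True if the page contains a link to host or to one of its sub-hosts."""
    collector = LinkCollector()
    collector.feed(html)
    for link in collector.links:
        link_host = urlparse(link).hostname or ""
        if link_host == host or link_host.endswith("." + host):
            return True
    return False


page = '<p>See <a href="http://www.ic.ac.uk/">Imperial College</a></p>'
print(links_to(page, "ic.ac.uk"))  # True
```

A search engine would run a check like this over every page in its database and return those for which it succeeds.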
Obviously these searches are most useful when used in conjunction with other terms. E.g.
host:digital alpha
will find pages about Digital's Alpha chip from Digital's own web site.
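The difference between matching anywhere in the URL and matching only the host can be shown with Python's urllib.parse module. This is a rough sketch: the function names are made up, the two addresses are illustrative, and simple substring matching is an assumption rather than AltaVista's documented behaviour.

```python
from urllib.parse import urlparse


def url_contains(url, term):
    """True if term appears anywhere in the address."""
    return term.lower() in url.lower()


def host_contains(url, term):
    """True if term appears in the host name part of the address only."""
    host = urlparse(url).hostname or ""
    return term.lower() in host.lower()


# Illustrative addresses only.
company_page = "http://www.digital.com/alpha/index.html"
other_page = "http://www.example.com/digital/cameras.html"

print(url_contains(other_page, "digital"))     # True
print(host_contains(other_page, "digital"))    # False
print(host_contains(company_page, "digital"))  # True
```

Restricting the match to the host is what filters out pages that merely have the word somewhere in their path.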
AltaVista currently goes further with searches for anchors, text, applet names, image names (rather handy), ActiveX object names, host names and domains.
So, for example, if you wanted a picture of a chip manufactured by Digital you could try
host:digital image:chip*
This returns images of any type with chip in their name which are on a page with host digital.
For a quick reference to the various search engines' features I have provided a table.
What's the difference between a search engine and a directory? It lies mainly in how the database of web pages is collated. A directory relies on authors actually submitting their pages, which are then manually and accurately categorised and placed into the directory structure. A search engine forms its database automatically by using 'robots' or 'spiders' which traverse the web, finding pages and adding them to the database. Due to the lack of manual input, these pages can only be approximately categorised. The advantage the search engine has is its size: the big engines currently have 50+ million pages each.
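The way a spider traverses the web can be sketched as a breadth-first walk over links. The Python below is a toy under stated assumptions: the web is a small made-up dictionary rather than real pages, and the function name crawl is invented. It only shows the traversal idea, not how any particular engine's robot is built.

```python
from collections import deque


def crawl(start, links):
    """Visit every page reachable from start, following links breadth-first."""
    seen = {start}
    queue = deque([start])
    visited_order = []
    while queue:
        page = queue.popleft()
        visited_order.append(page)  # a real spider would index the page here
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited_order


# A made-up web: each page maps to the pages it links to.
web = {
    "home": ["news", "about"],
    "news": ["home", "story"],
    "about": [],
    "story": ["about"],
    "orphan": ["home"],
}

print(crawl("home", web))  # ['home', 'news', 'about', 'story']
```

The page called orphan is never found because nothing links to it, which is one reason authors still submit their pages by hand.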
Directory listings are too often dismissed as a slow way of finding information, when in many cases they are more effective. For instance, someone wanting to learn HTML would be well advised to look through the directories first. These will more than likely give lots of useful information. A search engine could then be used if more information is still required.
Yahoo! is in fact a directory as opposed to a search engine. The easy way to tell is that a directory will always have lists of categories you can select and browse through, as well as letting you search the entire database. A search engine cannot offer the same useful facility because its pages are not as accurately categorised. Having said this, all of the previously mentioned search engines, except AltaVista, do now let you search through a structured directory tree. These directories normally contain reviewed pages which constitute only a small portion of the engine's entire database. If you are a user of Infoseek, you may have noticed that when you do a search, as well as displaying the results, it also offers a list of related topics in its directory. For example, if you search for star wars you are given a choice of lists of pages about Harrison Ford, Games, Star Wars fan clubs etc.
Some of you may have encountered MetaCrawler or SavvySearch. These appear to be search engines but in fact use other search engines to do their dirty work. They are very useful, retrieving, say, the 10 most relevant results in parallel from several of the most popular search engines: 7 in MetaCrawler's case and 19 in SavvySearch's. In use, they are surprisingly quick, often feeling quicker than the individual search engines.
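The parallel querying is probably what makes these services feel quick, and the general pattern can be sketched in Python with a thread pool. Everything here is hypothetical: engine_a and engine_b are stand-ins that return canned results, and the merging rule (keep the first occurrence of each address) is an assumption, not a description of how MetaCrawler or SavvySearch rank their results.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(query):
    """Stand-in for a real search engine; returns canned results."""
    return ["http://a.example/1", "http://shared.example/x"]


def engine_b(query):
    """Second stand-in engine with one overlapping result."""
    return ["http://shared.example/x", "http://b.example/2"]


def meta_search(query, engines, per_engine=10):
    """Query every engine at once, then merge the results without duplicates."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(
            pool.map(lambda engine: engine(query)[:per_engine], engines)
        )
    merged = []
    seen = set()
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


print(meta_search("star wars", [engine_a, engine_b]))
# ['http://a.example/1', 'http://shared.example/x', 'http://b.example/2']
```

Because the engines are queried at the same time, the total wait is roughly that of the slowest engine rather than the sum of all of them. The sketch expects at least one engine in the list.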
Many of the search engines now offer more than just web site searching. Now we can use them to search for newsgroups, e-mail addresses, business addresses etc. We can also find out about the weather, get maps of cities and find out about our stocks and shares. If you are using WebCrawler and include an American city in a search it automatically gives you the option to view an interactive map of that city. Unfortunately most of these services only apply to the US.
Very recently, Netscape and Yahoo! have formed a slightly new service. Anyone using Netscape Navigator can press the destinations button and be taken to a customizable search page. You select your own username and password, which are stored locally as a cookie, and you can then select the subjects you want to include from seven given topics: computing, business, entertainment etc. When you then press destinations on your Netscape browser you can view the latest news on your chosen topics, search through the index, or just browse the information at hand.
Here are a couple of useful hints for searching for that important info. Firstly, if the site you are going to is on CompuServe, e.g. Intel, instead of entering http://www.intel.com, you can type just intel. This is very useful for going to the search engines, e.g. just type excite or altavista or yahoo etc. Secondly, to aid your searching, if you are given an address such as
http://orpheus.amdahl.com/doc/products/bsg/intra/adapt.html#RTFToC3
it is often handy to snip the end off to go upwards in the site, e.g.
http://orpheus.amdahl.com/doc/products/
Now that you all have the potential to be expert World Wide Web explorers, why not go and try out your new-found talents, not forgetting to use the features table as a reference to what each search engine can do?
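For anyone who would rather not trim addresses by hand, the snipping hint above can be sketched in Python using the standard urllib.parse module. The function name go_up and its levels parameter are inventions for this example; it simply drops the fragment and the last few path segments.

```python
from urllib.parse import urlsplit, urlunsplit


def go_up(url, levels=1):
    """Drop the fragment and the last `levels` path segments from an address."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    kept = segments[:max(len(segments) - levels, 0)]
    path = "/" + "/".join(kept) + ("/" if kept else "")
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


address = "http://orpheus.amdahl.com/doc/products/bsg/intra/adapt.html#RTFToC3"

print(go_up(address, levels=3))
# http://orpheus.amdahl.com/doc/products/
```

Asking for more levels than the address has simply returns the top of the site.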