This is for sure one of the most important parts of searchlores,
and you would be well advised to try some of the incredibly powerful search engines listed below.
If you limit yourself to google (which in November 2004 suddenly doubled its index to 8 billion documents)
or to Yahoo (which in July 2005 claimed to have indexed almost 20 billion documents) you'll cover less than
one half of the visible web (and not even 1/100 of the hidden one).
Just copy this page onto your harddisk as
c:\main.htm (or whatever), and then bookmark it there and
use it (after having edited or thrown away anything you fancy)
in order to perform effective searches on the web
using any main search engine and starting from an unpolluted jumping off place, a page that has as few frills as
possible and as many useful forms as we know of. A page that you can modify -and ameliorate- yourself (feedback, in that case,
would be appreciated).
The main reason you should use more than one main search engine is that
search engines' results overlap FAR less than you would think. Recent studies
point out that around 3/4 of the results of a given search are UNIQUE for each search engine.
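
You can check the overlap claim yourself on your own queries. What follows is only a minimal sketch: the engine names and URLs are made-up placeholders, and it assumes two results are "the same" when their URL strings match exactly (real SERPs would need some URL normalisation first).

```python
def unique_fraction(results_by_engine):
    """For each engine, return the share of its results that no other engine returned."""
    shares = {}
    for engine, urls in results_by_engine.items():
        own = set(urls)
        others = set()
        for other, other_urls in results_by_engine.items():
            if other != engine:
                others.update(other_urls)
        # Guard against an engine that returned nothing at all
        shares[engine] = len(own - others) / len(own) if own else 0.0
    return shares


# Hypothetical result lists for the same query, for illustration only
sample = {
    "engine_a": ["u1", "u2", "u3", "u4"],
    "engine_b": ["u4", "u5", "u6", "u7"],
    "engine_c": ["u7", "u8", "u9", "u10"],
}
shares = unique_fraction(sample)
for engine, share in sorted(shares.items()):
    print(engine, share)
```

Paste in the first few SERPs of each engine you use and see how many of "your" results the other engines never showed you.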
Remember that search engines list only the first part of any BIG DOCUMENT:
the size varies.
Google had a famous limit of 101K, which was abolished in January 2005, the new limit should be around 150K. These
limits are very annoying when dealing with large documents (or on-line books).
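
To see what such a limit means in practice, here is a small sketch. It assumes that "K" means 1024 bytes and that the engine simply cuts the document at that byte offset; both are assumptions, since the engines do not document the exact rule. The function names are hypothetical.

```python
def indexed_part(document: bytes, limit_kb: int = 101) -> bytes:
    """Return the leading part of a document that a size-limited indexer would keep."""
    return document[: limit_kb * 1024]


def phrase_is_searchable(document: bytes, phrase: bytes, limit_kb: int = 101) -> bool:
    """True if the phrase lies entirely inside the part that gets indexed."""
    return phrase in indexed_part(document, limit_kb)


# A fake 120K document with the interesting phrase at the very end
big_doc = b"a" * (120 * 1024) + b"needle"
print(phrase_is_searchable(big_doc, b"needle", limit_kb=101))
print(phrase_is_searchable(big_doc, b"needle", limit_kb=150))
```

With the assumed 101K cut the phrase at the end of the document is invisible to the engine; with a 150K cut it becomes findable.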
Note also that just because one, a hundred, or a thousand pages from a given site are crawled and
made searchable through one of the main search engines, this does not guarantee that
every page from an indexed site has really been crawled and indexed. This shortcoming
hits not only 'new' pages, that can take MONTHS to be indexed: beehives
of spiders harvesting
a site often MISS whole subdirectories, old and new. Useful material may
be all but invisible to those that only use 'main' search tools to seek.
Moreover anyone that regularly uses google (for instance, but other search engines are
not that different) will have noticed how polluting commercial sites' results have become nowadays.
Should a search engine introduce a new, simple "please hide all commercial sites from your SERPs" (Search
Engine Result Pages) option, or switch, or slider, it would probably become king of the hill in a couple of months.
Therefore, given the commercially oriented pollution of the web, you are
well advised to use regional engines, usenet and other
specialized or targeted search tools and combing
techniques, and also to rely on your own bots, when searching your various targets.
[Only 400 results viewable]
Index now provided by Yahoo.
(search for links to 'text')
(search for links with the description 'text')
(search for given text in the url)
(search files within 'targetdomain')
(search files on 'hostname')
(search 'text' inside the title tags)
(search Java applets named 'text')
(search images with such 'filename')
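
The eight field searches listed above can also be typed by hand into any query box. The sketch below assumes the historical AltaVista keywords (link:, anchor:, url:, domain:, host:, title:, applet:, image:). These are recalled from memory and are not given on this page, so check them against the engine's own help before relying on them. The dictionary keys and the function name are hypothetical.

```python
# Assumed keyword for each kind of field search; not taken from this page
FIELD_KEYWORDS = {
    "links_to": "link",
    "link_description": "anchor",
    "in_url": "url",
    "within_domain": "domain",
    "on_host": "host",
    "in_title": "title",
    "applet_name": "applet",
    "image_filename": "image",
}


def field_query(kind: str, text: str) -> str:
    """Build a 'keyword:text' query string for one of the field searches."""
    return f"{FIELD_KEYWORDS[kind]}:{text}"


print(field_query("in_title", "searchlores"))
print(field_query("within_domain", "targetdomain"))
```

Keeping the keywords in one table means you only have to edit one place when an engine changes or drops an operator.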
Read the Altavista in-depth page! Spammed as if there were no tomorrow and
very badly commercialized. The idiots behind altavista's marketing managed to
ruin the best search engine of the mid-nineties. It is still THE ONLY
search engine which is TRULY BOOLEAN, hence offering truly amazing opportunities to real seekers...
once you have taken care of the spam yourself.
Its main drawback is that it is very easy to spam, so you'll
get the most useless results in the first positions: "hic alta, hic salta" (a seekers' proverb)...
experienced searchers mostly jump directly into the middle of altavista's results.
Altavista is the 'dead links champion' among the 'main' search engines.
Use the Simple search (which defaults to OR) ONLY if you
really know what you are doing :-)
A "Graphical" search engine, with rather interesting result clusters. Here follows the text search form,
but by all means try its cartographic interface instead!
Another "Graphical" search engine by the guys at kartoo, rather interesting
result clusters. Here follows a raw text search form,
but by all means try its cartographic interface instead!
Another "clustering" search engine... associated phrases and related keywords galore!
Dicy is a powerful and unique search engine that searches the Internet with a graphical "flower"
format and retrieves, on the fly, user-related and possible alternative relevant searches.
(Here, for interested seekers, the very structure of their spider, captured on my servers :-)
Mooter "The power of relevance"
Another "clustering" search engine, from Oz. "Starburst" technique. Original keywords are highlighted.
By all means, do click on "next clusters" once you get your first SERP.
The "open directory project". The best and
most authoritative directory on the web; it can be quite useful, especially when starting a broad query.
The alpha and omega of all relevant searches. Copied and scraped by all web-thieves and search
engines. Gems lurk inside dmoz. Be careful and always avoid all useless clown-clones à la http://www.answers.com/
The powerful chinese Google alternative... with CACHE!
"...the world's second largest independent search engine..."
(a compound engine with some own and blog results)
IceRocket uses innovative metasearch technology to search the Internet's top search engines, including WiseNut,
Yahoo, MSN, Teoma, Altavista, Alltheweb, Lycos,
and many more.... Based in Dallas, so beware :-)
(hard to say if this is useful or not)
"Save, search and share your Personal Web. Furl it"
"Furl saves a personal copy of any page on the Web and lets you to find it again instantly, from any computer.
Share the sites you find, and discover useful new sites. Become a member to start building your Personal Web"
Fact is you can use some of the 'comments' this s.e. will dig.
This is -for some queries- a very useful search engine, highlights query words in the result snippets
and clusters on request results from the same server. Check it!
The Wayback machine
This is not only a -powerful- search engine,
but also an incredible stalking tool! Explore the Net as it was!
Visit the ad hoc YAHOO page
WARNING: Yahoo has been moved to its specific page, where you will find a
wealth of information. Here only a few masks and some info:
[Once Yahoo had only 677 results viewable, now the SERPs stop at 1000]
For info on Yahoo's (Inktomi's) rich syntax, see Nemo's essay (September 2005)
Yahoo is now one of the three "big players" (google, MSN and
Yahoo) and claimed, at the beginning of September 2005, to
have indexed 19 billion documents (against google's 8 billion). A few weeks later google claimed 25 billion docs (against Yahoo's 20 billion).
Since the Web runs to around 500 billion docs (and growing) the 'race' is rather pointless :-)
(More on google's ad hoc section)
[Only 4011 results viewable]
AND, OR, (), NOT, "". Excite is a classic
example of just another
'ignoble corporate merger'. Just click on the link above and look at it! See?
An idiotic and useless, obsolete (late-nineties)
'portal' approach. As a consequence
it ceased to be a major player in January 2002, when Infospace killed it by injecting tons of
paid search results. This applies to all mergers btw: attempts to escape
the fate of all pyramid schemes
that always forebode catastrophes. Recently the Italians and Germans at Tiscali
have tried to revamp this engine on the sunset boulevard. It is still full of
pay-per-click crap, so
no one in his right mind uses it.
On 10/NOV/2004, probably as a counter to Microsoft's new MSN beta "super" search, google
*doubled* its indexed pages, now claiming a total of up
to 8 billion pages, which should correspond, approximately, to
1/4 of the web (around 35 billion pages according to our own data). One wonders
where they hid all these billions of pages until November 2004 :-)
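
The "1/4 of the web" figure is simple arithmetic on the two numbers quoted above. A tiny sketch, using only those figures (the function name is hypothetical, and the 35 billion estimate is this page's own, not a measured fact):

```python
def coverage(indexed_billions: float, web_billions: float) -> float:
    """Fraction of the estimated web that a given index claims to cover."""
    return indexed_billions / web_billions


# Figures quoted in the text: 8 billion indexed, web estimated at 35 billion pages
share = coverage(8, 35)
print(f"{share:.1%}")
```

The result is a little under one quarter, which is why the text says "approximately". Swap in any other claimed index size and web estimate to redo the sum.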
LYCOS [As many results viewable as you get!]
"Part Man, Part Machine" ~ Open Directory & DMOZ used. Uses
index, with updates at greater intervals than FAST. Major sin: Has closed the VERY useful Trondheim
This was the old ask search engine, which
started in 1997 and has gone through many changes over the years. In the fall of 2001
Ask Jeeves purchased the teoma search
technology and incorporated it into their search engine. Subject-specific popularity, organizing the web into topic clusters.
It has some meta search engine aspects: some answers come from dogpile and about.