~ Main search engines at searchlores ~
to basic

Updated October 2006, version 1.65
       
SEARCH ESSAYS OF CHOICE

Long term searching: rules and advice ~ Yahoo's (Inktomi's) search syntax ~ Powerbrowsing ~ Polylinguistic search ~ Learning to transform questions into effective queries ~ Search Engines Anti-Optimization ~ Fishing for troubles ~ Music searching ~ Catching the rabbit's ears ~ When your search fails ~ Follow Links in the Underground ~ Google's wild side ~ Using Fuzzy Logic ~ A Re-ranking trilogy ~ Searching scarcity ~ Searching Historical Information




INTRODUCTION 
(read this)
QUICK FORMS  (use them)


Fravia's searching MAPA (masks and pages)          cache : has cache             ¤ : special page
Best s.e.: Ask Jeeves cache (date) ~ MSNsearch cache ~ Yahoo! cache ¤[YAHOO] ~ Google cache ¤[GOOGLE]
Second Tier: Alexa cache ~ Hotbot ¤[HOTBOT] ~ A 9 (google's cache) ~ Teoma/Ask ¤[TEOMA] ~ Fast ¤[FAST] ~ Alta! Adva Simple
Useful s.e.: Wisenut ~ Baidu cache ~ Gigablast cache ~ IceRocket (webarchive) ~ Exalead (date)
Graph s.e.: Kart00 (graph) ~ Touch (graph) ~ Ujiko (graph) ~ Dicy (cluster) ~ Mooter (cluster)
Other: Lycos ~ Looksmart ~ Excite (ill)
Other: Entireweb ~ Wayback (past) ~ Factbite (ency) ~ dmoz (directory) ~ Furl (webarchive) ~ [FTPSEARCH]
@ PHP: [Our searching scrolls!] ~ [600 engines for next to nothing]
@ fravia's: Targets ~ Local ~ Regional ~ Compound ~ Usenet ~ Accmail
@ fravia's: Live searches ~ Page Providers ~ Combing ~ Details ~ Databases ~ Allinones
@ fravia's: Images ~ Books ~ Laws ~ Files ~ Filez ~ Passwords
Bogus & crap: look.com ~ 2020search ~ tygo.com



Instructions & Caveats

This is for sure one of the most important parts of searchlores, and you would be well advised to try some of the incredibly powerful search engines listed below. If you limit yourself to google (which in November 2004 suddenly doubled its index to 8 billion documents) or to Yahoo (which in July 2005 claimed to have indexed almost 20 billion documents) you'll cover less than one half of the visible web (and not even 1/100 of the hidden one).

Just copy this page onto your hard disk as c:\main.htm (or whatever), bookmark it there, and use it (after having edited or thrown away anything you fancy) to perform effective searches on the web using any main search engine, starting from an unpolluted jumping-off place: a page that has as few frills as possible and as many useful forms as we know of. A page that you can modify -and ameliorate- yourself (feedback, in that case, would be appreciated).

The main reason you should use more than one main search engine is that search engines' results overlap FAR less than you would think. Recent studies point out that around 3/4 of the results of a given search are UNIQUE for each search engine.
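If you want to check the overlap yourself, a minimal sketch follows. It assumes you have already collected the result URLs of each engine by hand or with your own bots; the engine names and URLs below are made up for illustration, and `unique_share` is a hypothetical helper, not something any search engine offers.

```python
def unique_share(results_by_engine):
    """For each engine, return the fraction of its result URLs
    that no other engine in the comparison returned."""
    shares = {}
    for engine, urls in results_by_engine.items():
        own = set(urls)
        # Collect everything the *other* engines returned.
        others = set()
        for other_engine, other_urls in results_by_engine.items():
            if other_engine != engine:
                others.update(other_urls)
        shares[engine] = len(own - others) / len(own) if own else 0.0
    return shares


# Made-up result lists, for illustration only.
sample = {
    "engine_a": ["u1", "u2", "u3", "u4"],
    "engine_b": ["u4", "u5", "u6", "u7"],
    "engine_c": ["u7", "u8", "u9", "u10"],
}

for engine, share in unique_share(sample).items():
    print(f"{engine}: {share:.0%} of results are unique")
```

Run the same query on several engines, feed the lists in, and you will see for your own targets how little the engines agree.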

Remember that search engines index only the first part of any BIG DOCUMENT: the size varies.
Google had a famous limit of 101K, which was abolished in January 2005; the new limit should be around 150K. These limits are very annoying when dealing with large documents (or on-line books).
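To see what such a limit means in practice, here is a rough sketch. It assumes the engine simply cuts the raw document at a fixed byte count and that 1K means 1024 bytes; both are assumptions for illustration, since the engines never published their exact truncation rules. The function names are hypothetical.

```python
def split_at_index_limit(document: bytes, limit_kb: int = 101):
    """Split a document into the part an engine would index and the
    part it would ignore, assuming a plain byte cutoff (an assumption)."""
    limit = limit_kb * 1024
    return document[:limit], document[limit:]


def is_findable(document: bytes, phrase: bytes, limit_kb: int = 101) -> bool:
    """True if the phrase lies entirely inside the indexed part."""
    indexed, _ignored = split_at_index_limit(document, limit_kb)
    return phrase in indexed


# A fake 120K 'on-line book' with the interesting phrase at the very end.
book = b"x" * (120 * 1024) + b"rare phrase"

print(is_findable(book, b"rare phrase", limit_kb=101))
print(is_findable(book, b"rare phrase", limit_kb=150))
```

With the old 101K cutoff the phrase at the end of the book is invisible; with a 150K cutoff it becomes searchable. Anything that sits deep inside a large file is worth fetching and grepping yourself.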

Note also that just because one, a hundred, or a thousand pages from a given site are crawled and made searchable through one of the main search engines, this does not guarantee that every page from an indexed site has really been crawled and indexed. This shortcoming hits not only 'new' pages, which can take MONTHS to be indexed: beehives of spiders harvesting a site often MISS whole subdirectories, old and new. Useful material may be all but invisible to those that only use 'main' search tools to seek. Moreover, anyone who regularly uses google (for instance, but other search engines are not that different) will have noticed how polluting the results from commercial sites are nowadays. Should a search engine introduce a new, simple "please hide all commercial sites from your SERPs" (Search Engine Result Pages) option, or switch, or slide, it would probably become king of the hill in a couple of months.

Therefore, given the commercially oriented pollution of the web, you would be well advised to use regional engines, usenet and other specialized or targeted search tools and combing techniques, and also to rely on your own bots, when searching your various targets.

Note that you can also easily search and find targets that do not exist any more :-)


A useful tool to compare results in google and yahoo:
http://www.langreiter.com/exec/yahoo-vs-google.html?q=searchlores


SEARCH ENGINES FORMS
(Use the MAPA to navigate)



ALTAVISTA ADVANCED SEARCH [Only 400 results viewable]
Index now provided by Yahoo.
AND,OR,(),NOT,NEAR,",*
link:text (search for links to 'text')
anchor:text (search for links with the description 'text')
url:text (search for given text in the url)
domain:targetdomain (search files within 'targetdomain')
host:hostname (search files on 'hostname')
title:text (search 'text' inside the title tags)
applet:text (search Java applets named 'text')
image:filename (search images with such 'filename')
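The operators above combine into boolean query strings. A small sketch of a string builder follows; it only glues text together using the operators listed on this page, the helper names (`field`, `phrase`, `all_of`, `any_of`) are hypothetical, and how the engine actually parses the result is not something this sketch can guarantee.

```python
# Field operators exactly as listed above.
FIELD_OPERATORS = {"link", "anchor", "url", "domain", "host",
                   "title", "applet", "image"}


def field(operator, text):
    """Build an 'operator:text' term, refusing operators not listed above."""
    if operator not in FIELD_OPERATORS:
        raise ValueError(f"unknown field operator: {operator}")
    return f"{operator}:{text}"


def phrase(text):
    """Wrap text in double quotes for an exact-phrase match."""
    return f'"{text}"'


def all_of(*terms):
    """Join terms with AND inside parentheses."""
    return "(" + " AND ".join(terms) + ")"


def any_of(*terms):
    """Join terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


query = all_of(
    phrase("search engines"),
    any_of(field("title", "seekers"), field("anchor", "searchlores")),
) + " AND NOT " + field("domain", "com")

print(query)
```

The printed string is what you would paste into the Boolean query box; the example terms are arbitrary, change them to fit your own target.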

Read the Altavista in depth page!
Spammed as if there were no tomorrow & very badly commercialized.
The idiots behind altavista's marketing managed to ruin the best search engine of the mid-nineties.
It is still THE ONLY search engine which is TRULY BOOLEAN, hence offering truly amazing opportunities to real seekers... once you have taken care yourself of the spam.

Altavista algos' main drawback is that they are very easy to spam, so you'll get mostly useless results in the first 20-30 positions: "hic alta, hic salta" (a seekers' proverb)... experienced searchers mostly jump directly to the middle of altavista's results lists.
Altavista is the 'dead links champion' among the 'main' search engines. Use the Simple search (which defaults to OR) ONLY if you really know what you are doing :-)
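Why the OR default matters can be shown with a toy example. The three 'pages' below are invented, and the matcher is a naive whole-word test, nothing like a real engine's ranking; it only illustrates how OR widens a result set and AND narrows it.

```python
# Invented mini 'index', for illustration only.
docs = {
    "page1": "fravia teaches web searching",
    "page2": "cheap web hosting offers",
    "page3": "searching for rare books",
}


def search(docs, terms, mode="OR"):
    """Return the names of pages matching any (OR) or all (AND) terms."""
    combine = any if mode == "OR" else all
    hits = []
    for name, text in docs.items():
        words = set(text.lower().split())
        if combine(term.lower() in words for term in terms):
            hits.append(name)
    return hits


print(search(docs, ["web", "searching"], mode="OR"))
print(search(docs, ["web", "searching"], mode="AND"))
```

With OR every page that mentions either word comes back, hosting spam included; with AND only the page containing both words survives.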




ALTAVISTA SIMPLE SEARCH [Only 400 results viewable]
For boolean operators, and more info, use Advanced Altavista instead!

Index now provided by Yahoo.





Altavista's ad hoc strings

One of Altavista's most SPECIFIC features is the anchor: operator, which will allow patient searchers to find relevant pages through the anchor tag.
For instance: anchor:snowflakes or anchor:posette or anchor:beria or anchor:kafka will give you a series of noise reduction arrows...
of course you can extend the trick to whatever...
anchor:warez or anchor:gamez or anchor:whatever :-)


Kart00
A "Graphical" search engine, rather interesting result clusters.
Here follows the text search form, but by all means try its cartographic interface




Ujiko
Another "Graphical" search engine by the guys at kartoo, rather interesting result clusters.
Here follows a raw text search form, but by all means try its cartographic interface instead!




Dicy
Another "clustering" search engine... associated phrases and related keywords galore!
Dicy is a powerful and unique search engine that searches the Internet with a graphical "flower" format and retrieves on the fly related and possible alternative relevant searches for the user.

(Here, for interested seekers, the very structure of their spider, captured on my servers :-)



Mooter
"The power of relevance"
Another "clustering" search engine, from Oz. "Starburst" technique. Original keywords are highlighted.
By all means, do click on "next clusters" once you get your first SERP.

This is a VERY GOOD search engine, developed by a Hamey guy who's in the "neural network" path.
Its results are brilliant. Servers seem weak (slow), though :-(

 


Dmoz
The "open directory project". The best and most authoritative directory on the web; it can be quite useful, especially when starting a broad query.

The alpha and omega of all relevant searches. Copied and scraped by all web-thieves and search engines-spammers: real gems lurk inside dmoz.
Be careful and always avoid all useless clown-clones à la http://www.answers.com/





Baidu
The powerful Chinese Google alternative... with CACHE!
"...the world's second largest independent search engine..."



BAIDU ADVANCED
DIQU BAIDU (regional)


Looksmart ~ For instance: searchlores
Quite commercially oriented... powered by Inktomi... but uses its own databases!

IceRocket

(a compound engine with some own and blog results)
IceRocket uses innovative metasearch technology to search the Internet's top search engines, including WiseNut, Yahoo, MSN, Teoma, Altavista, Alltheweb, Lycos, and many more.... Based in Dallas, so beware :-)


Furl

(hard to say if this is useful or not)
"Save, search and share your Personal Web. Furl it"
"Furl saves a personal copy of any page on the Web and lets you to find it again instantly, from any computer. Share the sites you find, and discover useful new sites. Become a member to start building your Personal Web"

Fact is you can use some of the 'comments' this s.e. will dig.


The Entireweb
This is -for some queries- a very useful search engine: it highlights query words in the result snippets and, on request, clusters results from the same server. Check it!

 
 

The Wayback machine
This is not only a -powerful- search engine, but also an incredible stalking tool! Explore the Net as it was!


YAHOO

20 BILLION DOCUMENTS (End-September 2005)

Visit the ad hoc YAHOO page
WARNING: Yahoo has been moved to its specific page, where you will find a wealth of information. Here only a few masks and some info:



YAHOO [Once Yahoo had only 677 results viewable, now the SERPs stop at 1000]

For info on Yahoo's (Inktomi's) rich syntax, see Nemo's essay (September 2005)

Yahoo is now one of the three "big players" (google, MSN and Yahoo) and claimed, at the beginning of September 2005, to have indexed 19 billion documents (against google's 8 billion). A few weeks later Google claimed 25 billion docs (against Yahoo's 20 billion). Since the Web runs around 500 billion docs (and growing) the 'race' is rather pointless :-)
(More on google's ad hoc section)
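The arithmetic behind 'rather pointless' is quick to check. The sketch below uses only the figures quoted above (25 and 20 billion claimed, against a web of around 500 billion documents); `coverage` is a hypothetical helper name, and the figures themselves are the engines' own claims and this page's estimate, not measurements.

```python
def coverage(indexed_billions, web_billions):
    """Fraction of the web covered by an index (both sizes in billions)."""
    return indexed_billions / web_billions


WEB_SIZE = 500  # billions of documents, the estimate quoted above
claims = {"google": 25, "yahoo": 20}  # billions, as claimed in September 2005

for engine, size in claims.items():
    print(f"{engine}: {coverage(size, WEB_SIZE):.0%} of the web")
```

Even taking the claims at face value, the 'winner' covers about one twentieth of the whole, and the gap between the two rivals is a single percentage point.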

Advanced Yahoo search


Note that there are some direct addresses for yahoo (see google's UF, point 14), for instance: http://216.109.117.135/search.
There is an interesting "MSN alike" Yahoo slider tool you should be aware of: Yahoo Mindset, try for instance fravia

EXCITE [Only 4011 results viewable]
AND,OR,(),NOT,,",
Excite is a classical example of just another 'ignoble corporate merge'. Just click on the link above and look at it! See? Idiotic & useless, obsolete (late-nineties) 'portal' approach. As a consequence it ceased to be a major player in January 2002, when Infospace killed it by injecting tons of paid search results. This applies to all merges btw: attempts to escape the fate of all pyramid schemes that always forebode catastrophes. Recently the Italians and Germans at Tiscali have tried to revamp this engine on the sunset boulevard. It is still full of pay-per-click crap, so no one in his right mind uses it.



excite image search (powered by fast)



Google

25 BILLION DOCUMENTS (End-September 2005)


Visit the ad hoc GOOGLE page
WARNING: Google has been moved to its specific page, where you will find a wealth of information. Here only a few masks:

Google shoots for the lowest common denominator zombie being able to find stuff, yet allows power users to take advantage of the hidden advanced features

Simple Google


        
Advanced GOOGLE
(only 3% of users take advantage of it, poor 97% zombies :-)

G. scholar  ~  G. Univ search  ~  G. Classical :-)

and a nice "GoogleRanking" bookmarklet: internet+searching


Googlette:


On 10/NOV/2004, probably as a counter to Microsoft's new MSN beta "super" search, google *doubled* its indexed pages, claiming now a total of up to 8 billion pages, which should correspond, approximately, to 1/4 of the web (around 35 billion pages according to our own data). One wonders where they hid all these billions of pages until November 2004 :-)

LYCOS [As many results viewable as you get!]
AND,OR,(),NOT,NEAR,",

"Part Man, Part Machine" ~ Open Directory & DMOZ used. Uses especially Fast's index, with updates at greater intervals than FAST. Major sin: Has closed the VERY useful Trondheim ftp-search facility.

Lycos advanced: fields    Lycos advanced: language    Lycos advanced: link referrals
Lycos help page
Gigablast
Most recent search engine, quite good, it seems. HEY! It has a cache, like Google!
Search for...
all of these words
this exact phrase
and this exact phrase
any of these words
none of these words
Sort by date
Restrict to this Site
Restrict to this URL
Pages that link to this URL
Site Clustering yes   no
Number of summary excerpts 0   1   2   3   4
Results per Page 10   20   30   40   50


TOUCHGRAPH

A graphical map of incoming and outgoing links, still in beta, uses google.
http://www.touchgraph.com/TGGoogleBrowser.html

FACTBITES

Factbites, a quite interesting Australian aggregator, more encyclopedia than search engine


ASK JEEVES

This was the old Ask search engine, which started in 1997 and has gone through many changes over the years. In the fall of 2001 Ask Jeeves purchased the Teoma search technology and incorporated it into their search engine. It ranks by subject-specific popularity, organizing the web into topic clusters. It has some meta search engine aspects: some answers come from dogpile and about.