Searching Hints & Tips
How Do Search Engines Work?

Search engines use software to spider the Web and build their databases.
This software sets up connections (structured indexes) to information on the Web
for fast access to that information.

Web crawling & web scraping are concepts that can be confused.
The main difference is that while web crawling is about finding and indexing webpages,
web scraping is about extracting the data found on one or more webpages.

The programs that spider the Web are also sometimes called robots, web crawlers, or worms.
The web pages they retrieve are then indexed by the search engine.

When you enter a query at a search engine, your input is checked against the search engine's keyword indexes.
The best matches are then returned to you as results.
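The keyword lookup described above can be pictured with a tiny index that maps each word to the pages containing it. This is only a minimal sketch: the page names, the page texts and the function names are made up for illustration, and real search engines also rank their results, which this example does not attempt.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word of the query, sorted by name."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample pages, standing in for what a crawler would have fetched.
PAGES = {
    "page1": "sailing yacht for sale",
    "page2": "motor yacht charter",
    "page3": "used cars for sale",
}

INDEX = build_index(PAGES)
print(search(INDEX, "yacht sale"))  # only page1 contains both words
```

The point of the index is speed: the query is answered by looking up words, not by rereading every page.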

Browse means to scan, skim or read. In ordinary English, outside computing, to browse is to look through something loosely and superficially (for example a marketplace, a store, a collection, a book, a newspaper or an article) in order to see quickly what it is or what it is about, without going into the details. For grass- or plant-eating animals, and in agriculture, browse can also mean eating, grazing, pasture, or crop.



Technically, a browser (or web browser) is software that is used to retrieve (get) information from the World Wide Web.





No. 1: No search engine covers very much of the vast material available on the Internet!
Imagine millions and millions of documents & files of various kinds.

Some search engines are good for some things, others for other things. If you know that the material you are looking for was written and published (digitally) in Sweden, then you can search at home, using a Swedish search engine or search engine version such as www.yahoo.se or www.google.se

If you are looking for material that you believe is located abroad and/or is published in a language other than Swedish, then you should of course search in a language other than Swedish, and you should definitely not bother with Swedish search engines or search engine versions.
Look at the choice of language and, where relevant, the country code (the URL ends with .se, and so on).
There are several criteria that help you determine where a page is based.

Also remember that different engines work more or less well with various search techniques, such as using quotation marks (" "), plus signs (+ +), minus signs (- -) or the asterisk, or "star" (*). The "star", properly called an asterisk, is something of a joker in the pack and can sometimes be used to good effect when you are unsure of the exact spelling or variant of a word, or of the exact content of a phrase.

 " " Quotation marks are what you use when you need to find an exact expression or a given name where the search target contains several words. Here quotation marks are usually a good approach that works well.
 I'd like to find me a
"yacht of more than 30 feet"

* * Asterisk ("star"): klan* may catch 'klant', 'klang' & 'Klangehamn', so use the asterisk with good sense & care.

+ + Plus sign: +saltgurka +falukorv automatically catches the pages that contain exactly the words 'saltgurka' & 'falukorv', in that form.

- - Minus sign: -kalle +Sigrid automatically eliminates the pages that are about the whole gang of 'Sigrid' & 'Kalle' & 'Halta Lotta', and sifts out, with finesse & flair, only the interesting pages you want, where only Sigrid is described or mentioned.
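The quotation-mark, plus and minus techniques above can be tried out in a few lines of code. This is just a sketch of the general idea, not how any particular search engine works: the function names are invented for the example, the asterisk is left out, and words typed without an operator are simply ignored.

```python
import re


def parse_query(query):
    """Split a query into exact phrases, required words and excluded words."""
    phrases = re.findall(r'"([^"]+)"', query)
    rest = re.sub(r'"[^"]*"', " ", query)  # drop the quoted parts
    tokens = rest.split()
    required = [t[1:] for t in tokens if t.startswith("+") and len(t) > 1]
    excluded = [t[1:] for t in tokens if t.startswith("-") and len(t) > 1]
    return phrases, required, excluded


def matches(text, query):
    """Return True if the text satisfies every operator in the query."""
    phrases, required, excluded = parse_query(query.lower())
    lowered = text.lower()
    words = set(re.findall(r"\w+", lowered))
    return (
        all(phrase in lowered for phrase in phrases)
        and all(word in words for word in required)
        and not any(word in words for word in excluded)
    )


print(matches("Sigrid went sailing", "-kalle +Sigrid"))            # True
print(matches("Sigrid and Kalle went sailing", "-kalle +Sigrid"))  # False
```

Note that the sketch compares everything in lowercase, so +Sigrid and +sigrid behave the same here; real engines may differ on that point.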


...and, as mentioned, since no search engine or search technique works across the board, it is important to know some of the basics, and you cannot manage without the ones mentioned above.
The rest is legwork, legwork, and time & patience combined with imagination & persistence.
Naturally this is easier if you have a certain interest in the details of language.


- For those not used to searching and search functions in a computing environment, searching on the Internet & searching in a database sometimes need some extra consideration & thought before getting started.
Learn about 'wildcards' & 'wildcard characters' before you find yourself in a very time-consuming situation: confused, out of time, lagging behind, knowing little or nothing,
sometimes with a growing sense
that there is really no information to be found.


A 'wildcard' is a method that lets an operating system perform utility functions on multiple files
with related names without the user or programmer having to specify each file by its full & unique name.
Search tools borrow the same idea, which is why you sometimes might end up
finding "Donna", "Dona", "Donald" and "Don" when you've searched for "Don".
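To see why a search for "Don" can bring back more than you bargained for, it helps to treat it as if a wildcard had been added at the end (Don*). The sketch below uses Python's standard fnmatch module, which follows file-name wildcard rules; the list of names and the helper function are made up for the example, and whether a given search tool really adds the wildcard for you is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical list of names to search through.
NAMES = ["Don", "Dona", "Donna", "Donald", "Ronald", "Gordon"]


def wildcard_matches(pattern, names):
    """Return the names that match the wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(wildcard_matches("Don*", NAMES))  # ['Don', 'Dona', 'Donna', 'Donald']
print(wildcard_matches("Don", NAMES))   # ['Don']
```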


Remember also to get your 'slash' right...
When you're in system folders and directories (on Windows) you want the slash tilted to the left ( \ ), the backslash,
whereas when you're on the Internet or on your local intranet you want to use "the right one" ( / ), the forward slash.
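The two kinds of slash can be compared side by side in a short sketch using Python's standard library. The folder path and the web address are invented examples, and the backslash convention shown is the Windows one.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlsplit

# A hypothetical Windows folder path, written with backslashes.
LOCAL = PureWindowsPath(r"C:\Documents\reports\search-tips.txt")

# A hypothetical web address, written with forward slashes.
URL = urlsplit("http://www.example.com/reports/search-tips.txt")

print(LOCAL.name)        # search-tips.txt
print(LOCAL.as_posix())  # C:/Documents/reports/search-tips.txt
print(URL.path)          # /reports/search-tips.txt
```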

Quite frequently colleagues turn to me complaining about a lack of information.
But that assumption is hardly ever correct.

Good searching requires common sense, creativity, technique and an on-line computer.

 -That's about it!

Technique can be acquired by reading thoroughly through the instructions found at major search sites.
Also, there's usually very good information to be found in your own Web Browser under Help.


It is also important, of course, to be aware of "the plus" (+).
If you want to find something consisting of two words where the word order doesn't matter,
you type a plus (+) before each word (+bilar +Chevrolet).
If you want pages or websites containing information in Swedish about Chevrolet cars ('bilar' is Swedish for 'cars'), this is it.

If the word order is significant (important), you
put a quotation mark at each end (before the first word and after the last word). This way you make sure you end up finding pages where these words appear exactly the way you wanted them.
Type mainly in lowercase letters unless circumstances require otherwise,
because 'peanut' will find 'Peanut', but 'Peanut' will never find 'peanut' (groundnut)!
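The lowercase rule can be written down as a small function. It models only the rule as stated here (a lowercase query ignores case, a query with capitals must match exactly); the function name is made up, and individual search engines may well behave differently.

```python
def case_rule_match(query, word):
    """Lowercase queries ignore case; queries with capitals must match exactly."""
    if query == query.lower():
        return query == word.lower()
    return query == word


print(case_rule_match("peanut", "Peanut"))  # True
print(case_rule_match("Peanut", "peanut"))  # False
```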



Different wildcard characters and their meanings

?     Any single character in this position

*     Any number of characters in this position

#     A single number (digit) in this position

[]    Find these characters [characters]

[!]   Don't find these characters [!characters]

If you make a habit of using these characters in your searches, you can rest assured that you are making the best of your limited time.


If you were to search for h?ll you would find hall, hell, hill, hull, håll, häll, höll, but you wouldn't find 'hålla', 'hälla' etc., which would require the following search text: h?ll?

Similarly, you would find 'the', 'true', 'tongue' but not 'tea' if you were to search for t*e
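The wildcard examples above can be checked with a small translator from wildcard patterns to regular expressions. Treat it as a sketch under assumptions: the function names and word lists are invented, the single-digit wildcard is written # here (following the Microsoft Access convention, which is an assumption), and the bracket forms accept a plain list of characters only, not ranges such as a-z.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ?, *, #, [abc] and [!abc] into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            parts.append(".")        # any single character
        elif ch == "*":
            parts.append(".*")       # any number of characters
        elif ch == "#":
            parts.append(r"\d")      # a single digit
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            negate = body.startswith("!")
            chars = body[1:] if negate else body
            if chars:
                parts.append("[" + ("^" if negate else "") + re.escape(chars) + "]")
                i = end              # skip past the closing bracket
            else:
                parts.append(re.escape(ch))  # empty brackets: treat "[" literally
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


def find(pattern, words):
    """Return the words that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.fullmatch(word)]


WORDS = ["hall", "hell", "hill", "hull", "håll", "hålla", "hälla"]
print(find("h?ll", WORDS))    # the five four-letter words
print(find("h?ll?", WORDS))   # ['hålla', 'hälla']
print(find("t*e", ["the", "true", "tongue", "tea"]))  # 'tea' is left out
```

Because the whole word has to match the pattern, h?ll stops at four letters, which is exactly why 'hålla' needs h?ll? instead.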





- Please be aware that some sites, pages & databases do not accept, or even work with, characters other than letters and numbers!

Don't forget WHAT language you're using,
and remember that sometimes it can prove quite efficient to try out words & phrases in another language (!),
including ones you actually have to look up first on your own.
Don't depend on one language only when searching!

This, among other things, makes it important to be creative and to be able to use various ways of searching with different techniques.

Make a habit of looking up words you don't understand!





Google Data Collection by Professor Douglas C. Schmidt,
 Vanderbilt University



