Latent Semantic Indexing (LSI):
LSI has quite recently (mid 2007) become part of Google’s algorithm for indexing and ranking web pages.
The basic concept is that, to evaluate the relevancy of a website for a specific user search phrase, Google no longer focuses only on this one keyword, i.e. whether it appears in the headline, meta tags and anchor texts and with a certain frequency in the text, but also on whether semantically related words are part of the text.
To judge this factor, Google has compiled an LSI database by comparing different documents that contain the same keywords and noting which other words occur in them. Documents that have not only the keyphrase but also many other significant words in common are considered semantically close and relevant. Pages that contain only the search phrase itself but none or very few of the other keywords that Google’s LSI database lists as related to this phrase are considered less relevant and therefore won’t rank as high.
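The comparison step described above can be illustrated with a minimal sketch. It uses plain co-occurrence counting over a tiny invented corpus, which is a simplification: it is not Google’s actual method, and it is not the matrix-factorisation form of LSI known from information retrieval. The names `DOCS`, `STOPWORDS`, `tokenize` and `related_terms` are hypothetical and chosen only for this illustration.

```python
import re
from collections import Counter

# Hypothetical toy corpus; these documents are invented for illustration.
DOCS = [
    "home theater setup with hdtv and dolby surround sound",
    "best hdtv resolution for a home theater cinema room",
    "home theater projector cinema screen and dolby surround",
    "garden furniture and outdoor lighting ideas",
]

# A tiny, assumed stop-word list so that filler words are not counted.
STOPWORDS = {"a", "and", "for", "with", "the", "best"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def related_terms(docs, phrase, top_n=5):
    """Return the words that most often accompany `phrase`.

    Only documents containing the phrase are considered. Each word is
    counted once per document, so a count is the number of such
    documents in which the word appears. Ties are broken alphabetically.
    """
    phrase_words = set(tokenize(phrase))
    counts = Counter()
    for doc in docs:
        if phrase.lower() not in doc.lower():
            continue
        words = set(tokenize(doc)) - phrase_words - STOPWORDS
        counts.update(words)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    for word, count in related_terms(DOCS, "home theater", top_n=4):
        print(word, count)
```

With this toy corpus, the words that accompany “home theater” most often are the ones a real database of this kind would presumably list as related; a larger corpus and better stop-word handling would be needed for anything meaningful.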
At the end of 2006 there was a readjustment of website rankings in Google. Although Google gave no clear statement as to the reasons, many attributed this to a heavier weighting of LSI compared to other ranking factors.
The consequence for a webmaster would be that optimising a page for a specific keyword requires not only placing this keyword in the various page tags, in the headlines and in the content with a certain frequency, but also making sure that semantically related phrases appear in the content.
Example: You try to optimise a page for the keyword “home theater”. Google’s LSI database might indicate that most documents containing “home theater” also contain the phrases “HDTV”, “TV”, “cinema”, “resolution”, “Dolby Surround” etc. If your page contains none or only very few of these related terms, it will not rank very high for “home theater”, no matter how well optimised it is for this term.
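A webmaster could check a page against such a list of related terms with a few lines of code. The following is only a sketch under stated assumptions: the term list is taken from the example above, the sample page text is invented, and the function name `related_term_coverage` is hypothetical. How Google would actually weigh such terms is not known.

```python
import re

# Related terms taken from the "home theater" example above.
RELATED_TERMS = ["HDTV", "TV", "cinema", "resolution", "Dolby Surround"]


def related_term_coverage(page_text, related_terms):
    """Return the related terms found in the page and their share.

    Matching is case-insensitive and uses word boundaries, so "TV"
    is not counted merely because "HDTV" appears in the text.
    """
    text = page_text.lower()
    found = [
        term
        for term in related_terms
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", text)
    ]
    share = len(found) / len(related_terms) if related_terms else 0.0
    return found, share


if __name__ == "__main__":
    # Invented sample page text, for illustration only.
    page = "Our home theater guide covers HDTV resolution and Dolby Surround setups."
    found, share = related_term_coverage(page, RELATED_TERMS)
    print(found, share)
```

A low share would suggest that the page mentions the keyword without the vocabulary that usually surrounds it, which is the situation the example warns about.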
[Latent comes from Latin latere = to lie hidden; Semantic comes from Greek semantikos = significant, from semainein = to show, signify, indicate by a sign, from sema = sign; Index comes from Latin index = forefinger, sign, pointer, list (e.g. of a book’s content); later use as a verb meaning “to compile an index”]