JCT Library Demos
Search Engine Scamming
Chana Lajcher - Reference Librarian
You are searching for a distributor of scientific equipment, an
educational game for your child, an article about a new technology,
or you want to see if your favorite search engine has finally indexed
your web page. So why does the search engine show you pornography
sites?
This is a disturbing phenomenon known as search engine scamming
(spamming) or hijacking. Most of the hijackers are pornography sites
or businesses hoping to steal their competitors' clientele.
These techniques are not difficult, but they are unfair and blatantly
immoral, and in some cases they are illegal and can be (and are)
prosecuted. Some business owners are unaware that they are on
questionable legal ground, since the field is so new that even the
legal profession is only beginning to deal with the subject.
Meta-tags:
Meta-tags are the special html codes that allow one to add extra
subject keywords to a web site without that text appearing on the
page for the reader. Good html practice encourages this. One can add
variant spellings and extra search terms, and the search engines will
weight the meta-tag terms in their indexing to help a searcher find
your page.
But this can be abused - it is very easy to salt the html with
terms such as the exact name of a competitor. For example, a site can
add the name of a popular manufacturer of a product in the hope that
users who search for that name will find its page and choose its
product instead. Take a look at the meta-tags of the extinct
Quayle 2000 Web site:
meta name="keywords" content="Qualye, Quayle, Dan Quayle, J. Danforth
Quayle, Quail, Quale, Quaile, President, Election, 2000, 2000 election,
Bush, Bush 2000, George Bush, President Bush, George W. Bush, George Bush
2000, George W Bush 2000, George W. Bush 2000, Elizabeth Dole, Dole, Dole
2000, Elizabeth Dole 2000, Forbes, Steve Forbes, Forbes 2000, Steve Forbes
2000, Vice President, Vice-President, Kasich, John Kasich, Kasich 2000, John
Kasich 2000, McCain, Mc Cain, McCain 2000, Mc Cain 2000, John McCain, John
Mc Cain, John McCain 2000, John Mc Cain 2000, Al Gore, Gore, Gore 2000, Al
Gore 2000, Bradley, Bill Bradley, Bradley 2000, Bill Bradley 2000, Bauer,
Gary Bauer, Bauer 2000, Gary Bauer 2000, Christian Coalition, Conservative,
Liberal, Christian, Potato, Potatoe, Dan Quail, Dan Quale, Dan Quaile,
Democrat, Republican, Politics, President, Dan Quayle for President, Quayle
for President, Dan Quail for President, Dan Quale for President, Dan Quaile
for President, Dan Quayle for President 2000, Quayle for President 2000, Dan
Quail for President 2000, Dan Quale for President 2000, Dan Quaile for
President 2000, Quayle, Quayle, Quayle, Quayle, Quayle, Quayle, Quayle,
Quayle, Quayle, Quayle, Quayle, Quayle, Quayle, Quayle, Quayle, Quayle,
Quayle, Quayle, Quayle, Quayle, Quayle, Jesse Jackson, Jackson, Hillary
Clinton, Clinton, Bill Clinton, President Clinton, President Bill Clinton,
Hillary, Rodham, Hillary Rodham Clinton, Hillary Rodham-Clinton, White
House, Oval Office, Congress, Senate, Representative, Press, Media, election
coverage, cnn, presidency, presidents, vice presidents, vice-presidents,
leadership, courage, integrity, honor, valor, hero, intelligence, intellect,
intellectual, honesty, truth, justice, pride, America, Family, Values,
Family values, Social values, Murphy Brown, Taxes, Economy, Kosovo, China,
Security, Defense, Foreign policy, Foreign relations, Government, Marriage,
Marriage penalty, Marriage-penalty, Death taxes, Death-taxes, Crime,
Education, Welfare, Technology, Internet, War, Capital Punishment, Capitol
punishment, Laws, Senator, Drugs, Military, Abortion, Pro-Life, Pro-Choice,
Leader, Motivation, Standing Firm, Middle-class values
This was the real site of the candidate (www.quayle.com). Some of the choices
are rather strange and will certainly confuse the users. The searcher
looking for the latest news from Kosovo will not want the search engine
to pull up US election material. Using the same technique, one can add
keywords for popular actors or musicians in the hope of catching buyers for
teenage-style clothing. Not good html practice, but I suppose good marketing.
When people talk about the commercialization of the web, this is one of the
examples.
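As a rough illustration of how repetition like the run of "Quayle" above could be spotted, here is a minimal Python sketch. It is not how any particular search engine works. The names MetaKeywordParser and repeated_keywords, the max_repeats threshold and the shortened sample tag are all assumptions made for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the comma-separated terms of every <meta name="keywords"> tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            # Normalize each term so "Quayle" and "quayle" count together.
            self.keywords.extend(
                term.strip().lower() for term in content.split(",") if term.strip()
            )


def repeated_keywords(html, max_repeats=3):
    """Return {term: count} for terms listed more than max_repeats times.

    max_repeats is an arbitrary threshold chosen for this sketch.
    """
    parser = MetaKeywordParser()
    parser.feed(html)
    parser.close()
    counts = Counter(parser.keywords)
    return {term: n for term, n in counts.items() if n > max_repeats}


# A shortened, made-up tag in the style of the one quoted above.
sample = (
    '<meta name="keywords" content="Quayle, Dan Quayle, Kosovo, '
    + "Quayle, " * 5
    + 'Potatoe">'
)
print(repeated_keywords(sample))  # {'quayle': 6}
```

A count like this only catches crude repetition. It says nothing about whether a term such as "Kosovo" belongs on the page at all.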
There have been a few court cases recently over exactly this issue. Using a
competitor's name in this way is considered trademark infringement, and large
corporations are going to be just as quick to sue as they are now over more
traditional misuse of their trademarks.
Dumping of text:
An older trick that still works is to place extra text in the same color
as the background, or in a very small font at the bottom of the page, in an
attempt to make the keywords appear many times (and therefore make the page
rise higher in a search). Many search engines will discard a page if the
same word appears too many times, so there are variations on this, such as
using the keywords as the names of graphic files. There is often a fine
line between spamming an engine and simply writing the page so that a search
engine shows your product to best advantage.
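The same-color trick can be checked for mechanically, at least in its simplest form. The Python sketch below assumes old-style markup in which the page color is set with a bgcolor attribute on the body tag and the text color with a color attribute on a font tag. It compares the two values as plain strings, so "white" and "#ffffff" would not match each other, and it ignores style sheets entirely. The class and function names and the sample page are made up for the illustration.

```python
from html.parser import HTMLParser


class HiddenTextFinder(HTMLParser):
    """Records text whose <font color> equals the <body bgcolor>."""

    def __init__(self):
        super().__init__()
        self.background = None
        self.font_colors = []  # one entry per currently open <font> tag
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "body":
            self.background = (attr.get("bgcolor") or "").lower() or None
        elif tag == "font":
            # A <font> without a color inherits the enclosing font color.
            inherited = self.font_colors[-1] if self.font_colors else None
            color = (attr.get("color") or "").lower() or inherited
            self.font_colors.append(color)

    def handle_endtag(self, tag):
        if tag == "font" and self.font_colors:
            self.font_colors.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.font_colors or self.background is None:
            return
        if self.font_colors[-1] == self.background:
            self.hidden.append(text)


def find_hidden_text(html):
    """Return the list of text runs drawn in the page's background color."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden


# Made-up sample page: the second <font> matches the white background.
sample_page = (
    '<body bgcolor="#FFFFFF">'
    '<font color="#000000">Lab scales for sale</font>'
    '<font color="#ffffff">scales scales scales</font>'
    "</body>"
)
print(find_hidden_text(sample_page))  # ['scales scales scales']
```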
There are pornographic sites that dump entire dictionaries into the html
source code so that just about any combination of search terms will bring
up that site. The dictionary loads slowly but in the meantime the site
pops up lots of windows and frames and re-directs so the user doesn't notice
that he is not seeing the original page.
Mis-spellings and stolen names:
People mis-spell and type badly - this is a fact of life that some sites
are using to their benefit. For example, the US White House
(www.whitehouse.gov) has shadow sites - www.whitehouse.net is a parody site,
while www.whitehouse.com is pornographic (no, I didn't provide links; you can
type the addresses yourself if you feel the need). Legally, it is considered
akin to getting telephone numbers similar to a competitor's 800 number in the
hope that some people will misdial and reach your line instead. In the near
future there will be more legal work on this issue, and it will most likely
(since there are more lawyers than html programmers...) be resolved in favor
of the original name owners. Slate magazine calls this "the oldest scam on
the Internet."
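A name owner who wants to watch for such shadow sites could compare candidate addresses against the real one. The Python sketch below is one simple way to do that, under stated assumptions: it treats everything after the last dot as the suffix (so it does not handle two-part suffixes such as .co.uk), it counts single-character edits with the standard Levenshtein distance, and the function names and the max_edits threshold are invented for this example. The address www.whitehous.gov is a made-up typo, not a site I know to exist.

```python
def split_domain(host):
    """Split 'www.whitehouse.gov' into ('whitehouse', 'gov').

    Assumes a single-label suffix and at least two labels in the host name.
    """
    labels = host.lower().rstrip(".").split(".")
    if labels[0] == "www":
        labels = labels[1:]
    return labels[-2], labels[-1]


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def is_lookalike(candidate, original, max_edits=1):
    """True if candidate differs from original only by suffix or by a small typo."""
    c_name, c_suffix = split_domain(candidate)
    o_name, o_suffix = split_domain(original)
    if (c_name, c_suffix) == (o_name, o_suffix):
        return False  # the very same site
    return edit_distance(c_name, o_name) <= max_edits


real = "www.whitehouse.gov"
print(is_lookalike("www.whitehouse.com", real))  # True: same name, other suffix
print(is_lookalike("www.whitehous.gov", real))   # True: one letter dropped
print(is_lookalike("www.whitehouse.gov", real))  # False: the original itself
```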
Misdirecting:
A new trick is to misdirect the browser or the search engine from a legitimate
site to another. There are a few methods. One is to copy the exact html
code of a legitimate page, host the copy on a second site, and submit it to
the search engines, where it will appear identical to the original text. When
the user clicks through to the copy, it redirects them to another site (using
java routines). This is sometimes referred to as stealing meta-tags.
Often, these same sites have a java routine that will not allow the user
to leave the site; instead it opens multiple windows and frames. This
is sometimes called a "mousetrap". It is irritating and seems like a useless
marketing tool until one realizes that banner ad space is sold by the number of
click-throughs, so if a site can (artificially) raise that number, it can
then claim a higher fee for the larger numbers.
The FTC has a case against a company for doing just this. In very colorful
language (for a legal document) they call it "nefarious". Note this is the
FTC and not the FCC - they see this as a form of consumer fraud since the
scammers are presenting themselves as a different site thereby stealing the
"customers".
www.ftc.gov
If you click through from a site to an interesting database but find yourself
reading about time-share condos instead, then that site may have fallen victim
to a flaw in the Windows NT security that allows someone to change the content
of an html page. Someone has replaced the original url for the database with
another url of their choosing. You might consider informing the site owners -
they may have no other way of knowing. If you are a site owner, you
should check through your pages regularly to make sure they haven't been
invaded. Most of the major html scams have been caught this way.
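For a site owner, the regular check can be partly automated by listing every link on a page and comparing it with the links that are supposed to be there. The following Python sketch shows the idea. The names LinkCollector and unexpected_links are invented for this example, the example.org and example.com addresses are placeholders, and a real check would also have to fetch each page from the server first.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def unexpected_links(html, expected):
    """Return the links found in html that are not in the expected set."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return [link for link in collector.links if link not in expected]


# Placeholder page: the second link has been swapped for a different url.
page = (
    '<a href="http://db.example.org/search">Database</a> '
    '<a href="http://condos.example.com/">Database</a>'
)
expected = {"http://db.example.org/search"}
print(unexpected_links(page, expected))  # ['http://condos.example.com/']
```

The list of expected links has to be kept somewhere the intruder cannot change it, otherwise the check proves nothing.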
The web is, by its nature, in a constant state of change. There is almost
a war between the search engines and the shady businesses hoping to spam
them. As the technology advances, people are exploring new ways of using it.
Some of them, unfortunately, are not very nice.