Web site's proprietor or some other third party informs the user of the site's URL. Some
Web sites, for example, send out mass email advertisements containing the site's URL,
the spamming process we have described above.
Second, the search engines that software companies use for harvesting are able to
search text only, not images. This is of critical importance, because CIPA, by its own
terms, covers only visual depictions. 20 U.S.C. § 9134(f)(1)(A)(i); 47 U.S.C.
§ 254(h)(5)(B)(i). Image recognition technology is immature, ineffective, and unlikely to
improve substantially in the near future. None of the filtering software companies
deposed in this case employs image recognition technology when harvesting or
categorizing URLs. Due to the reliance on automated text analysis and the absence of
image recognition technology, a Web page with sexually explicit images and no text
cannot be harvested using a search engine. This problem is complicated by the fact that
Web site publishers may use image files rather than text to represent words, i.e., they may
use a file that computers understand to be a picture, like a photograph of a printed word,
rather than regular text, making automated review of their textual content impossible. For
example, if the Playboy Web site displays its name using a logo rather than regular text, a
search engine would not see or recognize the Playboy name in that logo.
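
The limitation described above can be illustrated with a minimal sketch. It is not drawn from any filtering company's actual software; the keyword list, the sample pages, and the function names are hypothetical and chosen only for illustration. The sketch assumes a harvester that examines nothing but the text nodes of a page, so a name rendered as an image file is never seen.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects only the text nodes of a page; image content is never inspected."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags. Tags such as <img> contribute nothing here.
        self.chunks.append(data)


def visible_text(html):
    """Return the page's text nodes, lowercased, with whitespace collapsed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split()).lower()


def matches_keywords(html, keywords):
    """True if any keyword appears in the page's text (hypothetical harvesting rule)."""
    text = visible_text(html)
    return any(keyword.lower() in text for keyword in keywords)


# Hypothetical keyword list and sample pages, for illustration only.
KEYWORDS = ["playboy"]
text_page = "<html><body><h1>Playboy</h1></body></html>"
logo_page = '<html><body><img src="logo.gif"></body></html>'

print(matches_keywords(text_page, KEYWORDS))  # name appears as regular text
print(matches_keywords(logo_page, KEYWORDS))  # name appears only inside an image
```

Under these assumptions the first page is matched and the second is not, even though a human visitor would read the same name on both.
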
In addition to collecting URLs through search engines and Web directories
(particularly those specializing in sexually explicit sites or other categories relevant to
one of the filtering companies' category definitions), and by mining user logs and
collecting URLs submitted by users, the filtering companies expand their list of harvested