What is the BAJAI List, and why is bigger better?
The BAJAI List holds all the web addresses, file types and communication protocols that fit into the 32 content categories managed by BAJAI.
The BAJAI List and its categories are used to manage web access through a proxy server or client-side solution. Think of this list as an index
of the entire Internet. Essentially, the BAJAI List is what you license from BAJAI; the software is included with your list subscription. BAJAI
List updates, provided automatically by EyeUpdate, are free throughout the term of your contract.
There is only ONE List subscription available from BAJAI, and it categorizes the entire Internet. Some companies will make you pay additional
fees to expand on their basic list with premium lists. This implies that their basic list is incomplete, creating additional hidden costs once
you discover that you need better coverage.
Some History on List creation
Traditionally, lists have been created by one of the following two methods:
- A team of human classifiers that searches the World Wide Web (WWW) daily, classifying each new site they find and adding it to a classification category.
- A web spider (technology that travels the WWW) that is equipped with some form of text-finding criteria for classifying the sites it visits.
Problems created by the above methods:
- Human classifiers cannot keep up with the exponential growth of the Internet
- Human classifiers cannot help but be subject to personal beliefs and morals
- Human classifiers are human and have good and bad days
- Human classifiers need food (a pay-cheque), raising the cost of list maintenance
- Human classifiers can become desensitized to "objectionable" web content
- Text classification tends to over-block
- Text classifiers cannot differentiate medical terminology from porn content, e.g. Breast Cancer
- Keywords, generally used to classify content, can be found as substrings of other words, e.g. Essex County; SuperbowlXXX
- Text robots do not take into account ALL the information on a site
- Porn sites without text will still receive hits; porn sites without images are visited for the stories?
- Web cams and streaming video rarely offer textual cues for analysis
- Sites in most foreign languages are not managed with English-based text filtering
- People are aware that image sites and foreign language sites are left unchecked
- Hard-to-manage content types get added to "premium" lists, creating additional charges to cover the overhead needed to maintain them
BAJAI's Answers to these problems: OCULAR™ and IajaBot™ Technology
- OCULAR™ and IajaBot™ work 24 hours a day at exceptional speed
- OCULAR™ and IajaBot™ are not subject to human emotion
- OCULAR™ and IajaBot™ never have a bad day
- OCULAR™ and IajaBot™ only eat bandwidth, and a LOT OF IT!
- OCULAR™ and IajaBot™ can't get desensitized to content
- OCULAR™ and IajaBot™ don't over-block; classification requires ALL the content on a site
- OCULAR™ and IajaBot™ won't find enough supporting information or images on a medical site to classify it as pornography
- OCULAR™ and IajaBot™ can't tell you what porn is, but they KNOW IT when they SEE IT!
- OCULAR™ and IajaBot™ know a picture is worth 1,000 words
- OCULAR™ and IajaBot™ know that sites with images speak for themselves
- OCULAR™ and IajaBot™ watch the movies and know how to classify them
- OCULAR™ and IajaBot™ know that images give a clear picture of what is on a foreign language site
- OCULAR™ and IajaBot™ don't think any content is "special" enough to warrant additional fees
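
The keyword-substring problem listed among the text-classification problems above (Essex County, SuperbowlXXX) can be illustrated with a short sketch. This is only an illustration of why simple keyword filters over-block. It is not BAJAI's method, and the keyword list and function names are hypothetical, chosen for the example.

```python
import re

# Hypothetical keyword list, for illustration only.
BLOCKED_KEYWORDS = ["sex", "xxx"]


def naive_substring_match(text, keywords=BLOCKED_KEYWORDS):
    """Return the keywords found anywhere in the text, even inside other words."""
    lowered = text.lower()
    return [kw for kw in keywords if kw in lowered]


def whole_word_match(text, keywords=BLOCKED_KEYWORDS):
    """Return the keywords that appear only as standalone words."""
    lowered = text.lower()
    return [
        kw for kw in keywords
        if re.search(r"\b" + re.escape(kw) + r"\b", lowered)
    ]


if __name__ == "__main__":
    for sample in ["Essex County", "SuperbowlXXX", "Breast Cancer"]:
        print(sample, naive_substring_match(sample), whole_word_match(sample))
```

The naive matcher flags "Essex County" and "SuperbowlXXX" because the keywords occur inside longer words. Whole-word matching avoids those two cases, but it still cannot tell medical usage from adult content: with a keyword such as "breast", the phrase "Breast Cancer" would match either way.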