A trivial web proxy to zap advertisements

An overview

nobanner is a trivial web proxy written in Perl which can rewrite HTML on the fly.

This is a very simple program, and I was not too interested in performance, so it may very well not be suitable for any large-scale use. For a very nice proxy server which rewrites just URLs, see Junkbusters. There are undoubtedly other programs which do the same, or similar, rewriting.

On the other hand, because nobanner can rewrite the HTML code itself on the fly, it can be used to remove things such as banner advertisements from the web pages you retrieve before they get to your browser. This makes for a somewhat nicer presentation.

This program is very sketchy. It may very well break down in common circumstances. However, it is useful enough for me to use, and that's all I really care about. If others find it interesting to deal with, that is good too. Needless to say, there is no guarantee or warranty of any kind. If you find a problem, please fix the problem and send me an update. If that isn't practical, then describe how to reproduce the problem as well as you can and mail that description to me. At that point I'll address the issue as I have time. Note that it is very easy to write a rewriting rule which takes out more than was really intended. Beware.
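To illustrate how a rule can take out more than intended, here is a minimal sketch in plain Perl. It is not taken from nobanner itself; the sample HTML, the /ads/ path, and the function names are made up for the illustration. The only difference between the two substitutions is the greedy `.*` versus the non-greedy `.*?`.

```perl
use strict;
use warnings;

# Greedy: ".*" runs to the LAST "</a>" in the text, so every link
# after the advertisement disappears along with it.
sub strip_greedy {
    my ($html) = @_;
    $html =~ s{<a href="/ads/.*</a>}{}gs;
    return $html;
}

# Non-greedy: ".*?" stops at the FIRST "</a>", so only the
# advertisement link is removed.
sub strip_careful {
    my ($html) = @_;
    $html =~ s{<a href="/ads/.*?</a>}{}gs;
    return $html;
}

# Hypothetical page fragment: one ad link followed by one real link.
my $html = '<a href="/ads/1"><img src="/ads/1.gif"></a> <a href="/news">News</a>';

print "greedy:  [", strip_greedy($html),  "]\n";   # prints "greedy:  []"
print "careful: [", strip_careful($html), "]\n";   # keeps the News link
```

The greedy version silently deletes the News link as well, which is exactly the kind of mistake that is hard to notice in a browser.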

Installation

This proxy server must be run on a computer which has Perl available, can act as an Internet server, and understands the function fork. Linux systems and ISPs come to mind. I suppose there is a way to make this (or something similar) run under other operating systems, but I haven't even given that a try.
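The reason fork matters is that a server of this kind typically hands each incoming connection to a child process. The following is only a sketch of that general pattern using core Perl modules, not nobanner's actual code; `handle_client`, `serve`, and the canned reply are invented for the example.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical per-connection handler: read the request, send a fixed reply.
sub handle_client {
    my ($client) = @_;
    my $request_line = <$client> // '';
    # Consume the remaining request headers up to the blank line.
    while (defined(my $header = <$client>)) {
        last if $header =~ /^\r?\n$/;
    }
    print $client "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n",
                  "You asked for: $request_line";
    close $client;
}

# Accept connections and fork one child per connection.
# $max_connections is optional; leave it undefined to loop forever.
sub serve {
    my ($listener, $max_connections) = @_;
    local $SIG{CHLD} = 'IGNORE';    # let the system reap finished children
    my $served = 0;
    while (!defined $max_connections || $served < $max_connections) {
        my $client = $listener->accept or next;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {            # child: talk to this one client
            close $listener;
            handle_client($client);
            exit 0;
        }
        close $client;              # parent: go back to accepting
        $served++;
    }
}

# Example use (commented out so that loading this file does not block):
# my $listener = IO::Socket::INET->new(
#     LocalPort => 5244, Listen => 5, ReuseAddr => 1, Proto => 'tcp',
# ) or die "cannot listen: $@";
# serve($listener);
```

On a system without a working fork, this approach would need to be replaced by something else, such as handling one connection at a time.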

You can get the full archive which has all of the files listed below.

Get the source code for the package, and install it in any convenient directory. Get a rule set that you want to use. Mine is the one that I use. You may want to customize that set, or get one from somewhere else. The main idea is to customize the filter to what you want to see (and don't want to see), not what I care about. If you have an interesting rule set available, mail that (or a reference to the file if available) to me and I'll let others know of the work. The rules are basically Perl regular expressions that do search and replace. This allows a certain amount of flexibility.
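As a rough idea of what search-and-replace rules do to a page, here is a small Perl sketch. The format of the real rule file is not described on this page, so the in-memory list of pattern and replacement pairs, the `apply_rules` function, and the example.com URL below are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical rules: each entry is [ compiled pattern, replacement text ].
my @rules = (
    # Replace any <img> tag whose src contains "/ads/" with a comment.
    [ qr{<img\b[^>]*\bsrc="[^"]*/ads/[^"]*"[^>]*>}i, '<!-- ad removed -->' ],
    # Drop <blink> and </blink> tags but keep the text between them.
    [ qr{</?blink>}i, '' ],
);

# Apply every rule in order to a chunk of HTML and return the result.
sub apply_rules {
    my ($html, @rule_list) = @_;
    for my $rule (@rule_list) {
        my ($pattern, $replacement) = @$rule;
        $html =~ s/$pattern/$replacement/g;
    }
    return $html;
}

my $page = '<p>Hi</p><img src="http://example.com/ads/banner.gif" width="468">'
         . '<blink>Bye</blink>';
print apply_rules($page, @rules), "\n";
```

Keeping each pattern anchored to a single tag (note the `[^>]*` rather than `.*`) limits how much a rule can remove.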

Next, you need to actually start the program. You can do this by hand, but you will eventually want to put it into your system startup files (e.g. /etc/rc.local). The command line will look something like:

	nobanner --rules ./nobanner.rules --port 5244

Of course, you need to specify the appropriate directory for the rule file and some appropriate port number.

Last, you need to configure your browser to use the server and port number that the proxy is running on. Netscape has an option for network preferences which deals with web proxies. Other browsers can probably be configured as well.

If you want to see how the proxy server is working, you can look at any URL ending in /nobanner_status.html. This only works if you are currently using the proxy, and returns information for the proxy you are using.
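For the curious, the status page can also be requested without a browser. The sketch below assumes the proxy accepts an ordinary HTTP/1.0 proxy request, where the request line carries the full URL; the function names and the example.com host are placeholders, and none of this is taken from nobanner's source.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build the request a browser would send to a proxy. The host name is
# arbitrary here, since any URL ending in /nobanner_status.html will do.
sub build_status_request {
    my ($site) = @_;
    return "GET http://$site/nobanner_status.html HTTP/1.0\r\n"
         . "Host: $site\r\n\r\n";
}

# Connect to the proxy, send the request, and return the raw reply.
sub fetch_status {
    my ($proxy_host, $proxy_port, $site) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $proxy_host,
        PeerPort => $proxy_port,
        Proto    => 'tcp',
    ) or die "cannot reach proxy at ${proxy_host}:${proxy_port}: $@";
    print $sock build_status_request($site);
    local $/;                       # read the whole reply at once
    my $reply = <$sock>;
    close $sock;
    return $reply;
}

# Example use, assuming the proxy runs locally on the default port:
# print fetch_status('localhost', 5244, 'example.com');
```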

Switches

The nobanner program has a number of optional switches.
  • --port port# This is used to select the port number the server should listen on. By default it is 5244. You can set this to any numeric value, but I suggest something > 1024 and < 65535 (ports below 1024 usually require root privileges).
  • --log logfile This is the file that debugging output is logged to. The default is /dev/null.
  • --logdir directory This names a directory, which is created if needed. A number of files are kept in here to record the number of times each of the rules has fired. If this is left unspecified, then no tracking will be done. This is sort of funky.
  • --rules rulefile This is used to specify the rule file. By default it looks in ./nobanner.rules but you are probably better off specifying an absolute filename. The proxy can work with no rule set, but it seems sort of pointless.
  • --proxy server:port You can set this proxy server up as a client of another proxy server. This could be useful if you have a caching proxy server, or one (like Junkbusters) which blocks other traffic such as cookies or Java. The default is to not rely upon another proxy server.
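The switches and their defaults can be summarized in code. This sketch uses the core Getopt::Long module and is not necessarily how nobanner parses its own command line; `parse_switches` is an invented name, and only the switch names and default values come from the list above.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a list of command-line words into a hash of settings,
# starting from the documented defaults.
sub parse_switches {
    my (@args) = @_;
    my %opt = (
        port   => 5244,                 # default listening port
        log    => '/dev/null',          # debugging output is discarded
        logdir => undef,                # no per-rule tracking
        rules  => './nobanner.rules',   # default rule file
        proxy  => undef,                # no upstream proxy
    );
    GetOptionsFromArray(
        \@args,
        'port=i'   => \$opt{port},
        'log=s'    => \$opt{log},
        'logdir=s' => \$opt{logdir},
        'rules=s'  => \$opt{rules},
        'proxy=s'  => \$opt{proxy},
    ) or die "bad command line\n";
    return \%opt;
}

# Example: chain to another proxy (the address is a placeholder).
my $opt = parse_switches('--rules', '/etc/nobanner.rules',
                         '--proxy', 'localhost:8000');
print "port=$opt->{port} rules=$opt->{rules} proxy=$opt->{proxy}\n";
```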

Summary

That's all there is to it. I hope you find the program interesting or useful. However, if you intend to do anything substantial, I still suggest that you use another proxy server.

This program is free. I hope it removes some of the advertising which shows up on your web pages. If you want to pay for the program, or send a note of thanks, please let me know. I always appreciate feedback.



Last modified 31 May 2004
Dave Regan
http://mordred.ao.com/~regan/

Comments to: Dave Regan