A trivial web proxy to zap advertisements
An overview
nobanner is a trivial web proxy, written in Perl, which can rewrite HTML on the fly.
This is a very simple program, and I was not too interested
in performance, so it may well be unsuitable for
any large-scale use. For a very nice proxy server, which rewrites
just URLs, see Junkbusters.
There are undoubtedly other programs which do the same, or similar, rewriting.
On the other hand, because nobanner rewrites the HTML code itself on the
fly, it can remove things such as banner advertisements
from the web pages you retrieve before they get to your browser.
This makes for a somewhat nicer presentation.
This program is very sketchy. It may very well break down in common
circumstances. However, it is useful enough for me to use,
and that's all I really care about. If others find it interesting
to deal with, that is good too. Needless to say, there is no
guarantee or warranty of any kind. If you find a problem, please
fix it and send me an update. If that isn't practical, then
describe how to reproduce the problem as well as possible and
mail that description to me. I'll address the issue as
I have time.
Note that it is very easy to write a rewriting rule which takes
out more than was really intended. Beware.
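As an illustration of how a rule can take out too much, here is a small sketch comparing a greedy and a non-greedy pattern. The HTML snippet and the host ads.example.com are made up for this example, and the substitutions are ordinary Perl, not the actual nobanner rule syntax.

```perl
use strict;
use warnings;

# Made-up page fragment: one ad link followed by one ordinary link.
my $html = '<a href="http://ads.example.com/x">Buy!</a> <a href="/news.html">News</a>';

# Greedy: .* runs to the LAST </a>, so the News link is lost as well.
(my $greedy = $html) =~ s{<a href="http://ads\.example\.com/.*</a>}{}s;

# Non-greedy: .*? stops at the FIRST </a>, leaving the News link alone.
(my $careful = $html) =~ s{<a href="http://ads\.example\.com/.*?</a>}{}s;

print "greedy:  [$greedy]\n";    # prints "greedy:  []"
print "careful: [$careful]\n";   # prints "careful: [ <a href="/news.html">News</a>]"
```

The only difference between the two rules is the single ? after .* and yet the greedy version silently empties the whole fragment.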
Installation
This proxy server must be run on a computer which has Perl available,
can act as an Internet server, and understands the function fork.
Linux systems and ISPs come to mind. I suppose there is a way to make
this (or something similar) run under other operating systems, but I
haven't even given that a try.
You can get the full archive, which has all of the files listed below.
Get the source code for the package, and install it in any convenient directory.
Get a rule set that you want to use. Mine is the one that I use.
You may want to customize that set, or get one from somewhere else.
The main idea is to customize the filter to what you want
to see (and don't want to see), not what I care about.
If you have an interesting rule set available, mail
that (or a reference to the file if available) to me and I'll let others
know of the work.
The rules are basically Perl regular expressions that do
search and replace.
This allows a certain amount of flexibility.
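To give a feel for search-and-replace rewriting, here is a minimal sketch in Perl. It is not the nobanner source and does not use the real rule file format, which is not described here. The @rules list, the apply_rules function, and the ads.example.com host are all hypothetical names chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical rules: each is a [pattern, replacement] pair.
# The real nobanner rule file format may well differ.
my @rules = (
    # Drop any <img> tag whose source path contains "/ads/".
    [ qr{<img\b[^>]*\bsrc="[^"]*/ads/[^"]*"[^>]*>}i, '' ],
    # Drop links to a made-up ad server, including the link body.
    [ qr{<a\b[^>]*\bhref="http://ads\.example\.com/[^"]*"[^>]*>.*?</a>}is, '' ],
);

# Apply every rule in order and return the rewritten HTML.
# The replacement is treated as literal text in this sketch.
sub apply_rules {
    my ($html) = @_;
    for my $rule (@rules) {
        my ($pattern, $replacement) = @$rule;
        $html =~ s/$pattern/$replacement/g;
    }
    return $html;
}

my $page = '<p>News</p><img src="/ads/banner.gif" width="468"><img src="/pics/cat.gif">';
print apply_rules($page), "\n";   # prints <p>News</p><img src="/pics/cat.gif">
```

Keeping the rules as data rather than code is what makes it easy to swap in a different rule set without touching the program.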
Next, you need to actually start the program. You can do this by
hand, but you will eventually want to put it into your system
start up files (e.g. /etc/rc.local). The command line
will look something like:
nobanner --rules ./nobanner.rules --port 5244
Of course, you need to specify the appropriate directory for the rule
file and some appropriate port number.
Last, you need to configure your browser to use the server and
port number that the proxy is running on.
Netscape has an option for network preferences which deals with
web proxies. Other browsers can probably be configured as well.
If you want to see how the proxy server is working, you can
look at any URL ending in /nobanner_status.html.
This only works if you are currently using the proxy, and returns
information for the proxy you are using.
Switches
The nobanner program has a number of optional switches.
- --port port#
This selects the port number the server should
listen on. By default it is 5244. You can set this to
any numeric value, but I suggest something > 1024 and < 65535.
- --log logfile
This names the file to which debugging output is logged. The default is /dev/null.
- --logdir directory
This creates the named directory.
A number of files are kept there to record the number of
times each of the rules has fired.
If this is left unspecified, then no tracking will be done.
This is sort of funky.
- --rules rulefile
This is used to specify the rule file. By default it looks in
./nobanner.rules but you are probably better off specifying
an absolute filename. The proxy can work with no rule set, but
it seems sort of pointless.
- --proxy server:port
You can make this proxy server a client of another
proxy server. This could be useful if you have a caching
proxy server, or one (like Junkbusters) which blocks other
traffic such as cookies or Java. The default is to not rely
upon another proxy server.
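For anyone writing a wrapper or a similar tool, the switches above could be parsed roughly as follows with the standard Getopt::Long module. This is only a sketch under the defaults described above, not the actual nobanner code. The parse_switches function and the proxy_host and proxy_port keys are names invented for this example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse nobanner-style switches from a list of arguments.
# Defaults follow the descriptions above; the real program may differ.
sub parse_switches {
    my @args = @_;
    my %opt = (
        port   => 5244,
        log    => '/dev/null',
        logdir => undef,               # undef: no per-rule tracking
        rules  => './nobanner.rules',
        proxy  => undef,               # undef: no upstream proxy
    );
    GetOptionsFromArray(
        \@args,
        'port=i'   => \$opt{port},
        'log=s'    => \$opt{log},
        'logdir=s' => \$opt{logdir},
        'rules=s'  => \$opt{rules},
        'proxy=s'  => \$opt{proxy},
    ) or die "usage: nobanner [--port N] [--log FILE] [--logdir DIR] [--rules FILE] [--proxy HOST:PORT]\n";

    die "port must be between 1 and 65535\n"
        unless $opt{port} >= 1 && $opt{port} <= 65535;

    # Split "server:port" into its two parts when an upstream proxy is given.
    if (defined $opt{proxy}) {
        my ($host, $port) = $opt{proxy} =~ /^([^:]+):(\d+)$/
            or die "--proxy expects server:port\n";
        @opt{qw(proxy_host proxy_port)} = ($host, $port);
    }
    return \%opt;
}

my $opt = parse_switches(@ARGV);
printf "listening on port %d, rules from %s\n", $opt->{port}, $opt->{rules};
```

Run with no arguments, this prints the defaults (port 5244 and ./nobanner.rules).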
Summary
That's all there is to it.
I hope you find the program interesting or useful.
However, if you intend to do anything substantial, I still
suggest that you use another proxy server.
This program is free.
I hope it removes some of the advertising which shows up on
your web pages.
Note that if you want to pay for the program,
or send a note of thanks, please let me know.
I always appreciate feedback.