Oct 29, 2008

Open Source and Open Standards; Ctrl & Esc

Open Source gives the Ctrl; you can see what is going on through the source. You are free to modify and extend it. You can make it better to make your solution better. Given sufficient effort (and time), it is malleable.

Open Standards give the Esc; once due care is given to the use of standard APIs, the escape hatch is always open. If a better implementation comes along, you are free to go; you are free to swap out an implementation.

The combination of open source and open standards is a perfect match. While the proprietary extensions will still exist, the good extensions are eventually subsumed or become de facto standards in themselves.

Access to the source provides the ultimate freedom to protect one's investment and influence the future direction of a project. Patch the source and if you get it right, there is a good chance that the patch will filter through to the user community. It may even eventually make its way to the standard.

Note: With FOSS, there is also an intrinsic Esc, at least in the early stages. The barrier to entry is so low that the barrier to exit is no more than the cost of the experience. We are all entitled to change our opinions as we learn.

Oct 17, 2008

Testing: simulating a network failure

Figuring out how distributed software behaves in the event of a network failure or partition can be difficult. It requires testing that involves multiple machines, multiple networks and many hands!
Virtualisation helps, but for really simple testability, a single JVM environment is what you need. What follows, with some context, is a simple solution that may help.

Recently I was trying to track down an issue with Apache ActiveMQ network support. The test scenario required a bunch of VMware images, dual network cards and periodic manual network disabling.
In order to understand the scenario I tried to reduce it to something more manageable. The iptables firewall in Linux meant I did not have to yank out any network cables. With iptables, and a good tutorial, it is relatively easy to simulate a network failure or temporary network outage by instructing iptables to drop network packets that originate from, or are destined for, an individual port.
For my test, I had a simple network of two embedded brokers, a producer on one broker and a consumer on the other. Both the producer and consumer used the vm protocol, leaving the tcp connector free for the networking calls. The connector was using port 61616. To simulate a network failure by dropping all TCP packets to and from port 61616, the following iptables rules do the trick:
$ sudo iptables -I INPUT 1 -p tcp --sport 61616 -j DROP;sudo iptables -I INPUT 2 -p tcp --dport 61616 -j DROP
In order to enable communication again, the two rules added above need to be deleted (for simplicity I just delete the first rule twice):
$ sudo iptables -D INPUT 1;sudo iptables -D INPUT 1
This works fine because I have control over the Linux box and I don't typically run any iptables rules. But this will not always be the case, and it will not hold on other platforms or on shared Linux workstations. In addition, it requires some manual intervention, so it cannot be easily automated.

What I needed, I thought, was a simple Java socket proxy that could sit as an intermediary between the two ends of the network and which I could control through code. Something that will let traffic pass through until it is instructed not to do so. A quick Google search did not produce any obvious candidate for reuse, so I coded a simple solution that worked for me and built a test case around it. The resulting SocketProxy is used in BrokerQueueNetworkWithDisconnectTest. The usage pattern is based around replacing required tcp URIs with a proxy URL:

socketProxy = new SocketProxy(remoteURI);
DiscoveryNetworkConnector connector = new DiscoveryNetworkConnector(new URI("static:(" + socketProxy.getUrl() + ")"));
The proxy takes the target URI, sets up a listener and forwarder to the target and through getUrl() returns the proxy URL. To simulate a network failure, socketProxy.stop() is called during the test execution. socketProxy.resume() allows a network reconnect such that recovery can be validated. It made my life a little easier and meant I could produce a reliable and portable test case using a single JVM. I know I will use it again :-)

Note: There is also the option to pause/resume the proxy. This keeps the sockets open but does not allow any traffic to pass through. Pausing allows the simulation of a slow network which was handy for exercising the ActiveMQ inactivity monitor.
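
To make the idea concrete, here is a minimal sketch of such a proxy. It is not the SocketProxy class from the ActiveMQ test; the class name SimpleSocketProxy and the methods stop(), reopen(), pause() and unpause() are hypothetical names chosen for illustration, and the sketch assumes a plain tcp target that is reachable when a client connects. Only getUrl() mirrors a name used above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; not the SocketProxy class used in the ActiveMQ test.
class SimpleSocketProxy {
    private final URI target;
    private final List<Socket> sockets = new CopyOnWriteArrayList<>();
    private volatile ServerSocket listener;
    private volatile boolean paused;
    private int port; // 0 until the first bind picks a free port

    SimpleSocketProxy(URI target) throws IOException {
        this.target = target;
        reopen();
    }

    // URL that clients should use in place of the target URI.
    URI getUrl() {
        return URI.create("tcp://localhost:" + port);
    }

    // Start (or restart after stop()) listening; the port is kept across restarts.
    void reopen() throws IOException {
        ServerSocket server = new ServerSocket();
        server.setReuseAddress(true);
        server.bind(new InetSocketAddress(port));
        port = server.getLocalPort();
        listener = server;
        startDaemon(() -> acceptLoop(server));
    }

    // Simulate a network failure: refuse new connections and drop existing ones.
    void stop() throws IOException {
        listener.close();
        for (Socket s : sockets) closeQuietly(s);
        sockets.clear();
    }

    // Keep sockets open but hold back traffic until unpause() is called.
    void pause() { paused = true; }
    void unpause() { paused = false; }

    private void acceptLoop(ServerSocket server) {
        while (true) {
            Socket client;
            try {
                client = server.accept();
            } catch (IOException e) {
                return; // listener was closed by stop()
            }
            try {
                Socket remote = new Socket(target.getHost(), target.getPort());
                sockets.add(client);
                sockets.add(remote);
                startDaemon(() -> pump(client, remote)); // client -> target
                startDaemon(() -> pump(remote, client)); // target -> client
            } catch (IOException e) {
                closeQuietly(client); // target unreachable
            }
        }
    }

    // Copy bytes one way until either side closes; wait while paused.
    private void pump(Socket from, Socket to) {
        byte[] buf = new byte[4096];
        try {
            InputStream in = from.getInputStream();
            OutputStream out = to.getOutputStream();
            int n;
            while ((n = in.read(buf)) != -1) {
                while (paused) Thread.sleep(10);
                out.write(buf, 0, n);
                out.flush();
            }
        } catch (IOException | InterruptedException e) {
            // socket closed or thread interrupted; fall through to cleanup
        } finally {
            closeQuietly(from);
            closeQuietly(to);
        }
    }

    private static void startDaemon(Runnable r) {
        Thread t = new Thread(r);
        t.setDaemon(true);
        t.start();
    }

    private static void closeQuietly(Socket s) {
        try { s.close(); } catch (IOException ignored) { }
    }
}
```

In a test, the value returned by getUrl() would replace the real tcp URI, stop() would be called at the point where the outage should begin, and reopen() would be called when the network should come back. The sketch closes both directions as soon as either side reaches end of stream, which is a simplification that a fuller implementation might want to relax.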

Oct 15, 2008

Who, what, why?

Among other things, I design, write, test, extend, refactor and troubleshoot software. Over the past few years, my focus has moved to open source. I want to share some of what I discover as I explore the domain and work to harden and expand some existing implementations. May the sharing, learning and (r)evolution continue.