Nov 22, 2011

ActiveMQ - multiple kahaDB instances (mKahaDB) helping reduce journal disk usage

The default store implementation in ActiveMQ, KahaDB, uses a journal and an index. The journal uses a sequence of append-only files to store messages, acknowledgements and broker events. The index holds references to messages on a per-destination basis. Essentially, the index holds the runtime state of the broker, mostly in memory, whereas the journal maintains the persistent store of raw data and events. It is the journal of record in a sense.
Periodically, unreferenced journal files are removed through a garbage collection process, so disk usage is kept in check.
In the main, this scheme works well. However, when multiple destinations on a broker are used in very different ways, it can lead to excessive disk usage by the journal. What follows is some detail on a solution to that problem.

Mixed destination usage; frequent fast tasks vs infrequent slow tasks
Imagine a toy maker's order processing. There are two types of orders, custom and standard. A custom order takes a few days to fulfill; a standard order takes a matter of hours. You can easily imagine two order queues, standard and custom. Now imagine that we only process custom orders once a month but process standard orders all the time. So we expect a large backlog of custom orders that is slowly consumed at the start of each month and a steady load on the standard order queue.

What the broker sees
From a broker perspective, in the single shared journal, there will be a batch of journal files that are filled with 'custom order' messages. Subsequent journal files will hold mostly 'standard order' messages and acknowledgements, with the odd acknowledgement for a 'custom order' message. The sporadic distribution of acknowledgements for 'custom orders' across the journal files can be problematic because even when a journal file no longer contains any unacked 'standard order' messages, it must still be retained.

Some background on the need to retain journal files
Journal data files are append only. Both messages and acknowledgements are appended, nothing is deleted from a data file. Journal data files that are unreferenced are periodically removed (or archived). The idea is that the index (JMS destination state) can be recreated in full from the journal at any point in time. Any message without a corresponding acknowledgement is deemed valid.
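As a rough illustration of the remove-or-archive choice, here is a sketch of a kahaDB element that archives unreferenced journal files instead of deleting them. The archive path and the interval value are assumptions chosen for the example; cleanupInterval, archiveDataLogs and directoryArchive are kahaDB attributes:

<persistenceAdapter>
     <!-- look for unreferenced journal files every 30 seconds (value in milliseconds) -->
     <!-- and move them to the archive directory rather than deleting them -->
     <kahaDB directory="${activemq.base}/data/kahadb"
             cleanupInterval="30000"
             archiveDataLogs="true"
             directoryArchive="${activemq.base}/data/kahadb-archive" />
</persistenceAdapter>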


Referenced journal files

In the simplest case, a journal file is 'referenced' if it contains messages that have not been acknowledged by a consumer. The more subtle case reflects the persistence of acknowledgements (acks). A journal file is 'referenced' if it contains acks for messages in any 'referenced' journal file. This means that we cannot garbage collect a journal file that just contains acks until we can garbage collect all of the journal files that contain the corresponding messages. If we did, in the event of a failure that requires recovery of the index, we would miss some acks and replay messages as duplicates.

Problem
So back to the broker perspective of our toy maker's order processing. The first range of journal data files remains until the 'custom orders' queue is depleted. Custom order message acknowledgements get dotted across the journal files that result from the enqueue/dequeue of the 'standard orders' queue, and the end result is lots of referenced journal files and excessive disk usage.

Solution
Reducing the default journal file size can help in this case, but at the cost of more runtime file IO as messages are distributed across more files. In an ideal world, the 'custom order' queue could be partitioned into its own journal where linear appends of messages and acks would result in a minimal set of journal files in use. Correspondingly, the 'standard order' queue, with its short-lived messages, could share a journal with the remaining destinations.
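The first option above, a smaller journal file, is set with the journalMaxFileLength attribute of kahaDB. A sketch, where the 8mb value is only an illustrative assumption and not a recommendation:

<persistenceAdapter>
     <!-- smaller journal files can be reclaimed sooner, at the cost of more file IO -->
     <kahaDB directory="${activemq.base}/data/kahadb" journalMaxFileLength="8mb" />
</persistenceAdapter>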

With the Multiple KahaDB persistence adapter, destination partitioning across journals is possible. It provides a neat solution to the scenario described above.
Replace the default persistence adapter configuration:

<persistenceAdapter>
     <kahaDB directory="${activemq.base}/data/kahadb" />
</persistenceAdapter>

with:

<persistenceAdapter>
    <mKahaDB directory="${activemq.base}/data/kahadb">
      <filteredPersistenceAdapters>
       <filteredKahaDB queue="CustomOrders">
        <persistenceAdapter>
          <kahaDB />
        </persistenceAdapter>
       </filteredKahaDB>
       <filteredKahaDB>
        <persistenceAdapter>
          <kahaDB />
        </persistenceAdapter>
       </filteredKahaDB>
      </filteredPersistenceAdapters>
    </mKahaDB>
</persistenceAdapter>
  
The mKahaDB (m, short for multiple) adapter is a collection of filtered persistence adapters. The filtering reuses the destination policy matching feature to match destinations to persistence adapters. In the case of the above configuration, the 'custom orders' queue will use the first instance of kahaDB and all other destinations will map to the second instance. The second filter is empty, so the default 'match any' wildcard is in effect.
This configuration, splitting the destinations based on their usage pattern over time, allows the respective journal files to get reclaimed in a linear fashion as messages are consumed and processed, resulting in minimum disk usage.


Overhead

When transactions span persistence adapters, there is an additional overhead of local two phase commit to ensure both journals are atomically updated. Two phase commit requires that the outcome is persisted so there is an additional disk write required per transaction. This can be avoided by colocating destinations that share transactions in a single kahaDB instance. When transactions access a single persistence adapter or when there are no transactions, there is no additional overhead.
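One way to colocate destinations that share transactions is a wildcard filter. The sketch below assumes hypothetical queue names under an Orders. prefix (for example Orders.Incoming and Orders.Audit) that are used together in one transaction. Since the filter reuses destination policy matching, the '>' wildcard should route them all to the same kahaDB instance, so no two phase commit is needed between them:

<persistenceAdapter>
    <mKahaDB directory="${activemq.base}/data/kahadb">
      <filteredPersistenceAdapters>
       <!-- hypothetical queues: everything under Orders. shares one journal -->
       <filteredKahaDB queue="Orders.>">
        <persistenceAdapter>
          <kahaDB />
        </persistenceAdapter>
       </filteredKahaDB>
       <!-- empty filter: all remaining destinations -->
       <filteredKahaDB>
        <persistenceAdapter>
          <kahaDB />
        </persistenceAdapter>
       </filteredKahaDB>
      </filteredPersistenceAdapters>
    </mKahaDB>
</persistenceAdapter>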


Alternative Use Cases: Relaxed Durability Guarantee

Each nested kahaDB instance is fully configurable so one scenario where the use of different persistence adapters makes sense is where your durability guarantee is weaker for some destinations than others. JMS requires that a write be on disk before a send reply is generated by the broker. To this end, a disk sync is issued by default after every journal write. This default behavior is configurable by the kahaDB attribute enableJournalDiskSyncs. If some destinations don't need this guarantee, they can be assigned to a kahaDB instance that has this option disabled and have their writes return faster, leaving it to the file system to complete the write. Here is an example configuration:


<persistenceAdapter>
    <mKahaDB directory="${activemq.base}/data/kahadb">
      <filteredPersistenceAdapters>
       <filteredKahaDB queue="ImportantStuff">
        <persistenceAdapter>
          <kahaDB />
        </persistenceAdapter>
       </filteredKahaDB>
       <filteredKahaDB queue="NotSoImportantStuff">
        <persistenceAdapter>
          <kahaDB enableJournalDiskSyncs="false"/>
        </persistenceAdapter>
       </filteredKahaDB>
      </filteredPersistenceAdapters>
    </mKahaDB>
</persistenceAdapter>


Apr 22, 2011

Government agencies: cut future IT spend - share costs, invest in open source

Open letter to Minister of State for Public Service Reform, Mr. Brian Hayes TD

Hi Brian,
I would like to share a quick response to my reading of the Irish Times article: State to demand price cuts from suppliers to reduce €16bn bill.

In order to best serve the needs of the Irish people right now and into the future, you need to seriously consider open source IT solutions. Across all departments and across all of Europe, government IT departments should be collectively investing in free open source solutions that solve their common IT needs.

Investment in open source IT solutions seeds innovation and is a commitment to shared future value. Investment in proprietary IT solutions is an innovation tax and a commitment to repetition.
This is not some sort of Marxist rant; open source is the best way to innovate. Open source software is a key reason Amazon, Google, Twitter, Facebook etc. emerged; they stand on the shoulders of giants.

I imagine this would mean a small shift in how government IT is organised. You would need to extract real value from the smart people therein. Rather than outsourcing decisions to global consultancy companies, you allow a shared need to be met from within.
You enable innovation, by allowing smart individuals to take ownership of both the problem and the solution and most importantly, to share the fruit of their labour.

The bottom line is this: all of the government departments have IT needs in common; they are much more alike than they wish to admit. They also share these needs with other governments throughout Europe.
There is no reason to constantly reinvent the wheel. We just need to enable people to share and evolve the best designs. Open source provides the freedom and motivation to do just that.

Apr 10, 2011

Consider Unhosted and open source for eHealth and eGov #DERIopenDay

DERI Galway produced an insightful open day on their developments in the semantic web of linked data. While I listened, two thoughts kept recurring that I want to explore. Chances are I am preaching to the choir but shucks, just in case I am not...

For a web architecture of the future look at Unhosted
At the root of the problem of siloed data and fragmentation (the database hugging phenomenon) is the issue of ownership. Institutions have data that they don't really own because that data is of a personal nature. The collection of data is theirs, but not the individual components.
With Unhosted, the ownership problem is turned on its head. Users and aggregators of data only have a 'handle' (a URI) to personal data. A handle that is only useable with permission. Collections of data containing 'handles' can safely be shared. Granted, lots of issues need to be ironed out, but I think the architecture is on the right track and the concept is bang on.

Open source your research
Lots of what you do is plumbing. For new plumbing to be broadly adopted it needs to be better and it needs to be cheap. Publish and be damned. If the research is great the plumbing will proliferate at very little cost. If it does proliferate, you continue to research and innovate and profit above the new infrastructure, it is all good. If it does not proliferate..., well open source was not the problem!

Enterprise Ireland: open source can be a viable business model for shared infrastructure research. It is a world of constant iterative improvement. The profits are smaller but the rewards are greater because simply put, value shared is value multiplied.
In essence, open innovation puts the focus on execution rather than protection; it puts everyone on the front foot.