These are some of the synonyms for catalog in one online Thesaurus. Most of the things that we do, whether in business or private life, use one or the other of the items on this list. The question that we are considering here is what software tools are available to assist in this, one of the most fundamental of human activities. The advent of the Information Age, and the prolific growth in the use of Personal Computers, means that much of our information is now stored electronically, even when the electronic form is only used as a guide to locating a physical object. When we look at how software helps us with organisation tasks, we find that there is either nothing, or too much to be helpful, depending on the way we look at it.
On the one hand every piece of software organises its own information. Your digital camera will come with photo-album software that organises your images, your music player will come with software to organise the music you download, and so on. There are just about as many means of organising information as there are types of information. On the other hand if we look for general purpose catalog software we find very little to help. Every computer comes with an Explorer or Finder, and one or two companies sell slightly better versions, but these are very limited in their use. They only have one information type, the file, and one organisational structure, the tree-like hierarchy. We can mimic some of the basic functions of a catalog: we can use folders as categories, and place our files into categories. But we come to grief when we try to do one of the most basic of operations, which is to place an item into more than one category. We are completely lost if we want to keep more information than just the bare file, such as a description, a reminder that it needs to be dealt with or a link to another related file.
To be generally useful, Catalog software must satisfy a number of basic requirements.
Information Types - The catalog must be able to handle a wide variety of data types. These must include different types of file, for example audio and image, as well as straight information in all the usual data types, for example String, Boolean and Integer, and other common types such as Color, Date, Time and URL.
Catalog Items - Items that are to be placed into a Catalog must have a set of Attributes, that is pieces of information that are relevant to that item. For different catalogs there must be different sets of Attributes depending on the nature of the items being catalogued. For example a Catalog of Music will have a reference to the file that contains the audio, a title and details of the performers. A Catalog of Work for a freelance consultant will have details of the time spent, the customer, the rate charged and so on.
Categories - Items must be placed into Categories. Categories themselves may be split into sub-categories as required. It must be possible for an Item to belong to more than one Category, and for a Category to be a sub-category of more than one Category.
Cross-References - It must be possible for Items to contain as Attributes cross-references to other items in the same or different Catalogs. For example a Catalog of work contains details of the customer. Details of the customer will be held in another Catalog, of people, and all work items for that customer will reference the one item.
Groups - It must be possible for items to be grouped in permanent, semi-permanent or ad-hoc groups so that actions may be carried out on a number of items at the same time. This grouping must be independent of Categories, that is we must be able to group items together even if they do not naturally fall into a single category, without having to create an artificial category.
Information Access - It must be possible to access and retrieve the stored information quickly and easily. As well as direct access through a hierarchy of categories it must be possible to search for items, define indices and display or print the information in an appropriate format.
Information Activation - Where information represents an active object, then it must be possible to activate that object directly from the Catalog. For example in a Catalog of music, once the required item or items have been located and grouped into a playlist it must be possible to play them without recourse to a totally different piece of software with its own User Interface and foibles.
Extensibility - As users' information storage requirements grow, it must be possible to extend the Catalog software to cater for new types of information. This involves the definition of new Catalogs with their own Attribute sets, the extension of existing Catalogs with new Attributes, and the addition of extra code as required to handle new ways of displaying and activating information items.
Tailorability - It must be possible to tailor catalogs to suit the individual user. This involves the provision of user specific information, for example a company name or logo to appear on a print-out, and Catalog customisation, for example to remove irrelevant operations from those available. A menu item "Play Music" would make a lot of sense in a music catalog, it makes little sense in a record of work! On the other hand the ability to create a specific "Play Music" menu item is extremely useful in a music collection, and makes a lot more sense than a possible more general menu item "Activate Referenced File"!
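The structural requirements above (Attributes on Items, Items in more than one Category, Categories under more than one parent, and Groups that are independent of Categories) can be pictured with a small Java sketch. Every class and method name here is hypothetical and chosen only for illustration; none of it is taken from openJean's published API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: every name below is hypothetical, not openJean's API.
class CatalogSketch {

    /** An Item carries a set of named Attributes (String, Integer, URL, ...). */
    static class Item {
        final String name;
        final Map<String, Object> attributes = new LinkedHashMap<>();
        Item(String name) { this.name = name; }
    }

    /** A Category may sit under several parents, so the structure is a graph, not a tree. */
    static class Category {
        final String name;
        final Set<Category> subCategories = new LinkedHashSet<>();
        final Set<Item> items = new LinkedHashSet<>();
        Category(String name) { this.name = name; }

        /** Items here and in all sub-categories, each reported once. */
        Set<Item> allItems() {
            Set<Item> found = new LinkedHashSet<>();
            collect(this, new LinkedHashSet<>(), found);
            return found;
        }

        private static void collect(Category c, Set<Category> seen, Set<Item> found) {
            if (!seen.add(c)) return;            // already reached via another parent
            found.addAll(c.items);
            for (Category sub : c.subCategories) collect(sub, seen, found);
        }
    }

    /** A Group is a named set of Items with no tie to any Category. */
    static class Group {
        final String name;
        final Set<Item> members = new LinkedHashSet<>();
        Group(String name) { this.name = name; }
    }

    /** Builds a tiny music catalog (made-up sample data) and returns its root category. */
    static Category example() {
        Item song = new Item("Summertime");
        song.attributes.put("performer", "Ella Fitzgerald");
        Item tune = new Item("So What");

        Category root = new Category("Music");
        Category jazz = new Category("Jazz");
        Category vocal = new Category("Vocal");
        Category vocalJazz = new Category("Vocal Jazz");
        root.subCategories.add(jazz);
        root.subCategories.add(vocal);
        jazz.subCategories.add(vocalJazz);    // one sub-category ...
        vocal.subCategories.add(vocalJazz);   // ... under two parents
        vocalJazz.items.add(song);
        jazz.items.add(song);                 // one item in two categories
        jazz.items.add(tune);
        return root;
    }

    public static void main(String[] args) {
        Category root = example();
        Group playlist = new Group("Tonight"); // ad-hoc group, no category needed
        playlist.members.addAll(root.allItems());
        for (Item item : playlist.members) System.out.println(item.name);
    }
}
```

The point of the sketch is that a folder tree cannot express either of the two marked lines, whereas a graph of categories with set-valued membership handles both without duplicating the item.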
Background to openJean
In the 1980s the author was working on "Project Management Environments". These were large software frameworks into which software tools could be "plugged" so that some form of control could be exercised over the multitude of information items with complex inter-relationships that constitute a large IT project. The final functionality and usability of a product depended on so many things, some physical such as the quality of raw materials, some intellectual such as decisions at meetings, and keeping track of all the relationships was an almost impossible task. It remains so today, as so many headlines about delays and cost overruns tell us. It was the author's contention that a catalog was the absolutely essential basis for any such system. Without the ability to categorize and group related information items, the complexity very soon became unmanageable.
We are now well into the 2000s and facing exactly the same problem. Now however, with the growth of personal computers and the Internet, it is everyone who faces the problem, not just project managers. We are, we are told, in the Information Age, and facing a tidal wave of information delivered conveniently to that plastic box sitting on the desk. Our banking, shopping and entertainment are done on the Internet; if we run a small business then we have to deal with suppliers with their electronic systems and customers who expect the efficiency savings that only an automated operation can bring. Software suppliers are keen to sell software that claims to bring all these benefits, but the classic trade-off is only too apparent in the new business of computer software: the really useful stuff is very expensive, while the cheap stuff is difficult to use and all too often makes things worse by just adding more different information items to the already overflowing heap. To make things worse, even the good ones insist on organising their own information in their own way; the best you can expect in the way of integration is the ability to import foreign data into an application. Even sets of software from the same manufacturer often fail to properly integrate the views of data held by the component parts.
Three requirements for software arising by coincidence at the same time were the trigger for putting some of the earlier ideas about general purpose catalog software into practice.
- The need for freelance consultants and small businesses that operate on a time basis to record the details of their customers, the work contracted and done and the hours estimated and spent for planning and for invoicing.
- The need to keep track of bookings for small hotels, guest-houses and self-catering establishments, again for planning and invoicing.
- The need to classify and provide quick convenient access to items in a large collection of music being transferred from CD to hard disc.
However, looking at the three requirements, we find many similarities. There are "entities", things that have to be stored: a piece of work, a holiday booking or a piece of music. Each entity has a number of items of information associated with it: details of a customer or a performing artist, dates, times, amounts, lengths etc. Each entity can be classified, in the first two cases by dates and/or times, in the last by musical genres. The entities have relationships with other entities; one obvious simplification is to use a separate address book, so that in both of the first two cases the work or the booking is related to a specific person represented by an entry in an address book. When there is more than one work item or booking for the same person there is no need to repeat the information. In the same way the artists can be represented in a separate catalog which is referenced by the music catalog. Both the new requirements, for an address book and a database of artists, again have the same structure of entities and their associated items of information being placed into categories.
OpenJean came from the need to satisfy all these requirements, superficially very different but in fact having many common features. The purpose of openJean is to organize information on a computer, and the benefits of its use are just the normal benefits that you get from such organization: fast, efficient access to information so that it can be used. OpenJean will do what it can with the information when it is found. Normally this will be to display it on a computer screen, but if, for example, it is music, openJean can play it. That is not, however, the main task of openJean, which is to get you to the information fast. OpenJean will not, for example, provide a word processor or a web browser; if the information in your catalog is a document or a URL, openJean will leave the task of editing or browsing to the software you already have. OpenJean does not impose a structure on your information; it allows you to superimpose new structures without affecting the information in any way, just to let you access it more efficiently. Not surprisingly, it was not long before new applications for openJean began to appear and the shape of openJean, as a framework into which various applications or catalogs can be slotted, developed.
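The address-book simplification described here amounts to storing a reference to a person instead of a copy of the person's details. A minimal Java sketch of that idea follows; the class names, attribute keys and sample data are all assumptions made for illustration and do not describe how openJean actually stores cross-references.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not openJean's API or storage format.
class CrossReferenceSketch {

    /** A catalog maps item ids to that item's attributes. */
    static class Catalog {
        final String name;
        final Map<String, Map<String, Object>> items = new HashMap<>();
        Catalog(String name) { this.name = name; }

        /** Creates an empty item and returns its attribute map for filling in. */
        Map<String, Object> add(String id) {
            Map<String, Object> attributes = new HashMap<>();
            items.put(id, attributes);
            return attributes;
        }
    }

    /** A cross-reference names the target catalog and item rather than copying it. */
    static class Reference {
        final Catalog catalog;
        final String itemId;
        Reference(Catalog catalog, String itemId) {
            this.catalog = catalog;
            this.itemId = itemId;
        }
        Map<String, Object> resolve() { return catalog.items.get(itemId); }
    }

    /** Follows a work item's "customer" reference and reads the phone number. */
    static String customerPhone(Map<String, Object> workItem) {
        Reference customer = (Reference) workItem.get("customer");
        return (String) customer.resolve().get("phone");
    }

    public static void main(String[] args) {
        Catalog people = new Catalog("People");
        people.add("p1").put("phone", "555-0100");

        Catalog work = new Catalog("Work");
        Map<String, Object> design = work.add("w1");
        Map<String, Object> review = work.add("w2");
        design.put("customer", new Reference(people, "p1"));
        review.put("customer", new Reference(people, "p1"));

        // Update the one address-book entry; both work items see the change.
        people.items.get("p1").put("phone", "555-0199");
        System.out.println(customerPhone(design));
        System.out.println(customerPhone(review));
    }
}
```

Because both work items hold a reference to the same entry, a correction made once in the address book is visible everywhere it is used.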
One other factor that helped determine the way that openJean developed was the idea that it should be customisable and extensible. There is a catch in software development that is similar to the one in many other fields. If a software product is general purpose then it has the great advantage of being usable in a variety of different situations, but the disadvantage that it does not cater for the specifics of any one situation. On the other hand software written very specifically for one situation may perform very well in that situation but is all but useless otherwise. Buying specific software for every situation can get very expensive, and confusing for the people who have to learn many different user interfaces; buying the general purpose software is cheaper, but the difficulties of adapting it to the different situations often negate the benefits from its use.
The traditional approach to make general purpose software more useful in specific situations is to allow a degree of customisation, which may range from a simple choice of colour schemes to a full blown programming language to allow the addition of new modules. This approach however introduces another, or another instance of the same, catch. If the customisation is to be useful then it is going to be more complicated to carry out and beyond the reach of a non-specialist. Many software packages advertise that they can be customised by "scripts" of some sort; it is not until you come to try it out that you realise that the scripting languages are every bit as complex as any programming language.
The decision was made early in the design of openJean that trying to pretend that a complex process was actually simple was not the best way to proceed. Instead openJean has made extensive use of standards so that complex customisation tasks can be done by any programmer who has worked with modern programming languages and data representation formats.
OpenJean is written in Java®, a programming language widely used for Internet-based applications, and all openJean data is stored and exported using XML (eXtensible Markup Language), a standard format for data interchange. To customize openJean you can write Java modules: just put them into openJean's library folder and openJean will find them, then tell openJean at what point they are to be called. To view information in a specific format, write an XSL (eXtensible Stylesheet Language) transformation from openJean's export format into HTML, PDF or whatever other format is required, and put it into the openJean application's transformation folder. OpenJean will then give you a menu item that exports a snapshot of openJean and applies the transformation.
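To give a feel for the export-and-transform step, here is a minimal sketch that applies an XSL stylesheet to an XML snapshot using the standard javax.xml.transform classes shipped with Java. The XML element and attribute names are invented for the example, since openJean's real export format is not shown here, and the class name is likewise hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative only: the XML vocabulary below is made up, not openJean's export format.
class ExportTransformSketch {

    /** A stand-in for an exported catalog snapshot. */
    static final String EXPORT_XML =
          "<catalog name='Music'>"
        + "<item title='First Track'/>"
        + "<item title='Second Track'/>"
        + "</catalog>";

    /** A stylesheet that turns the snapshot into an HTML list of titles. */
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/catalog'>"
        + "<ul><xsl:for-each select='item'>"
        + "<li><xsl:value-of select='@title'/></li>"
        + "</xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(EXPORT_XML, STYLESHEET));
    }
}
```

In practice the stylesheet would be read from a file in the transformation folder rather than held in a string; a StreamSource can be built from a File just as easily.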
For example in Freelancer, the normal mechanisms for creating categories, groups and items are not very useful because the creation process is more specific: work items go into days, which are contained in weeks, which are in turn contained in months and years. Specific Java methods have been written which call back into the published openJean API so that there is a specific menu item to create a month, and when a month is created all the weeks and days within it are also created, with weeks possibly belonging to more than one month. The information displays needed by Freelancer are invoices and work plans. The section of the object base for a week or month can be selected and exported to a temporary XML file (or a permanent one if required for audit purposes), which is transformed into HTML and displayed in the system browser.
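The month-creation step can be sketched with the standard java.time classes. This is not Freelancer's code and does not use the openJean API; the class names are hypothetical, and the choice of Monday as the first day of the week is an assumption. It shows only how a week that straddles a month boundary can be created once and then shared by both months.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.temporal.TemporalAdjusters;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical names; weeks are assumed to start on Monday.
class MonthBuilderSketch {

    /** A week category, identified by its Monday, that may belong to two months. */
    static class Week {
        final LocalDate monday;
        final Set<YearMonth> months = new LinkedHashSet<>();
        final List<LocalDate> days = new ArrayList<>();
        Week(LocalDate monday) { this.monday = monday; }
    }

    /** All weeks created so far, keyed by Monday so a straddling week is reused. */
    final Map<LocalDate, Week> weeks = new LinkedHashMap<>();

    /** Creates every day of the month, placing each in its (possibly shared) week. */
    List<Week> createMonth(YearMonth month) {
        Set<Week> touched = new LinkedHashSet<>();
        for (int d = 1; d <= month.lengthOfMonth(); d++) {
            LocalDate day = month.atDay(d);
            LocalDate monday = day.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            Week week = weeks.computeIfAbsent(monday, Week::new);
            week.months.add(month);   // the week becomes a sub-category of this month
            week.days.add(day);
            touched.add(week);
        }
        return new ArrayList<>(touched);
    }

    public static void main(String[] args) {
        MonthBuilderSketch builder = new MonthBuilderSketch();
        builder.createMonth(YearMonth.of(2024, 1));
        builder.createMonth(YearMonth.of(2024, 2));
        for (Week week : builder.weeks.values()) {
            System.out.println(week.monday + " belongs to " + week.months);
        }
    }
}
```

Keying weeks by their first day is what lets the second call find the existing boundary week instead of creating a duplicate, which mirrors the requirement that a Category can be a sub-category of more than one Category.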