Copac Beta Interface

We’ve just released the beta test version of a new Copac interface and I thought I’d write a few notes about it and how we’ve created it.

Some of the more significant changes to the search result page (or “brief display” as we call it) are:

  • There are now links to the library holdings information pages directly from the brief display. You no longer have to go via the “full record” page to get to the holdings information.
  • You can see a more complete view of a record by clicking on the magnifying glass icon at the end of the title. This enables you to quickly view a more detailed record without having to leave the brief display.
  • You can quickly edit your query terms using the search forms at the top of the page.
  • To further refine your search you can add keywords to the query by typing them into the “Search within results” box.
  • You can change the number of records displayed in the result page.

The pages have been designed using Responsive Web Design techniques, which is jargon meaning that the HTML5 and CSS have been written in such a way that the web page rearranges itself depending on the size of your screen. The new interface should work whether you are using a desktop with a cinema display, a tablet computer or a mobile phone. Users of each of those three display types will see a different arrangement of screen elements, and some elements may be missing altogether on the smaller displays. If you use a tablet computer or smartphone, then please give the beta a try on it and let us know what you think.

The CGI script that creates the web pages is a C++ application which outputs some fairly simple, custom XML. The XML is fed through an XSLT stylesheet to produce the HTML (and also the various record export formats). Opinion on the web seems divided on whether or not this is a good idea; the most valid complaint seems to be that it is slow. It seems fast enough to us, and the beta way of doing things is actually an improvement: there is now just one XSLT stylesheet used in creating the display, whereas our old way of doing things used multiple XSLT stylesheets run multiple times for each web page. All of which probably just goes to show that the most significant eater of time is the searching of the database rather than the creation of the HTML.

Database update

We’ve had a recurrence of the problem I reported a month ago and so last night we installed an update to the database software we use. I’m told the update contains fixes relevant to the problems we have been experiencing, so here’s hoping it brings some increased reliability with it.

Please accept our apologies if you experienced some disruption last night while I was updating the software.

Yesterday’s loss of service

I thought I’d write a note about why we lost the Copac service for a couple of hours yesterday.

The short of it is that our database software hung when it tried to read a corrupted file in which it keeps track of sessions. The result was that everyone’s search process hung, and so frustrated users kept re-trying their searches, which created more hung sessions until the system was full of hung processes with no CPU or memory left. Once we had deleted the corrupted file, everything was okay.

The long version goes something like this… From what I remember, things started going pear-shaped a little before noon when the machine running the service started becoming unresponsive. A quick look at the output of top showed we had far more search sessions running than normal and that the system was almost out of swap space.

It wasn’t clear why this was happening and because the system was running out of swap it was very difficult to diagnose the problem. It was difficult to run programs from the command line as, more often than not, they immediately died with the message “out of memory.” I did manage to shut down the web server in an effort to lighten the load and stop more search sessions being created. It was proving almost impossible to kill off the existing search sessions. In Unix a “kill -9” on a process should immediately stop the process and release its memory back to the system. But yesterday a “kill -9” was having no effect on some processes and those that we did manage to kill were being listed as “defunct” and still seemed to be holding onto memory. In the end we just thought it would be best to reboot the system and hope that it would solve whatever the problem was.
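As general Unix background rather than a diagnosis of what happened here: a killed process stays in the process table as a zombie (which ps and top list as “defunct”) until its parent collects its exit status with wait(), and a process blocked inside an uninterruptible system call will not act on a “kill -9” until that call returns. The small Python sketch below illustrates only the first point. The sleeping child is a hypothetical stand-in for a search session and has nothing to do with our database software.

```python
import signal
import subprocess
import sys


def kill_and_reap(seconds=60):
    """Start a sleeping child, send it SIGKILL, then reap it with wait()."""
    # Hypothetical stand-in for a search session: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )
    # Equivalent of running "kill -9 <pid>" from the shell.
    child.send_signal(signal.SIGKILL)
    # Until the parent calls wait(), the dead child remains in the
    # process table as a zombie, shown as "defunct" by ps and top.
    return child.wait()


if __name__ == "__main__":
    # On POSIX systems a negative return code means the child was
    # terminated by that signal number.
    print(kill_and_reap())
```

If the parent process is itself stuck and never calls wait(), the “defunct” entries simply pile up, although each one should normally have given its memory back already.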

It took ages for the system to shut itself down – presumably because the shutdown procedures weren’t working with no memory to work in. Anyway, it did finally reboot and within minutes of the system coming up it became overloaded with search sessions and ran out of memory again.

We immediately shut down the web server again. However, search sessions were still being created by people using Z39.50 and so we had to edit the system configuration files to stop inetd spawning more Z39.50 search sessions. Editing inetd.conf didn’t prove to be the trivial task it should have been, but we did get it done eventually. We then tried killing off the 500 or so search sessions that were hogging the system, and that proved difficult too. Many of the processes refused to die. So, after we had sat staring at the screen for about 15 minutes, unable to run programs because there was no memory and wondering what on earth to do next, the system recovered by itself. The killed off processes did finally die, memory was released and we could do stuff again!
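For anyone who hasn’t had to do this: stopping inetd from launching a service normally means commenting out that service’s line in inetd.conf and then telling inetd to re-read its configuration (a traditional inetd does this when sent SIGHUP). Here is a minimal Python sketch of the commenting-out step. The service name “z3950”, the sample entries and the server path are all made up for illustration and are not our real configuration.

```python
def disable_inetd_service(conf_text, service):
    """Return conf_text with every active entry for `service` commented out."""
    out = []
    for line in conf_text.splitlines():
        fields = line.split()
        # The first field of an inetd.conf entry is the service name.
        # Comment lines never match, because their first field starts with "#".
        if fields and fields[0] == service:
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Made-up sample entries; "z3950" and the server path are hypothetical.
SAMPLE = (
    "ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd\n"
    "z3950 stream tcp nowait nobody /hypothetical/path/zserver zserver\n"
)

if __name__ == "__main__":
    print(disable_inetd_service(SAMPLE, "z3950"), end="")
```

Working on the text in memory like this keeps the edit small and repeatable, which matters when the machine is so short of memory that starting an editor is a gamble.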

A bit of investigation showed that the search processes weren’t getting very far into their initialisation procedure before hanging or going into an infinite loop. I used the Solaris truss program to see what files the search process was reading and what system calls it was making. Truss showed that the process was going off into cloud cuckoo land just after reading a file the database software uses to track sessions. So I deleted that file and everything started working again! The file got re-created next time a search process ran — presumably the file had become corrupted.

Educating our systems

JIBS workshop 13/11/08

I attended the JIBS workshop in London on ‘How to compete with Google: simple resource discovery systems for librarians’ with two agendas: as a Copac team member, interested to see what libraries are doing that could be relevant to Copac; and as someone who has recently completed some research on federated search engines and is anxious to keep up-to-date with developments.

The day consisted of seven presentations, and concluded with the panel taking discussion questions. Four of the presentations focussed on specific implementations: of Primo at UEA; of Encore at the University of Glasgow; of ELIN at the universities of Portsmouth and Bath; and of Aquabrowser at the University of Edinburgh. Some interesting themes ran through all of these presentations. One was that of increased web 2.0 functionality – library users expect the same level of functionality from library resource discovery systems as they find elsewhere on the internet. With this in mind, libraries have been choosing systems that allow personalisation in various forms. Some systems allow users to save results and favourite resources, and to choose whether to make these public or keep them private.

Another popular feature is tag clouds. These give users a visual method of exploring subjects, and expanding or refining their search. Some systems (such as Encore) allow the adding of ‘community’ tags. This allows users to tag resources as they please, and not rely on cataloguer-added tags. While this expands the resource-discovery possibilities and adds some good web 2.0 user interaction, concerns have been raised about the quality of the tags. Glasgow are putting a system in place to filter the most common swearwords, and hopefully ward off deliberate vandalism, but there is also a worry that user-added tags might not achieve the critical mass needed to become a significant asset in resource discovery. As we at Copac are looking into the possibility of adding tags to Copac records, we will be interested in seeing how this resolves.

The addition of book covers and tables-of-contents to records seems to be a desirable feature for many libraries – and it is nice that Copac is ahead of the pack in this regard! Informal comments throughout the day showed that people are very enthusiastic about the recent developments at Copac, and enjoy the new look.

It was also very interesting to see that some libraries are introducing (limited) FRBRisation for the handling and display of results. UEA, for instance, are grouping multiple editions of the same work together on their Primo interface. This means that a search for ‘Middlemarch’ returns 31 results, the first of which contains 19 versions of the same item. These include 18 different editions of Middlemarch in book form, and one video. While the system is not yet perfect (‘Middlemarch: a study of provincial life’ is not yet recognised as the same work), it is very encouraging to see FRBRised results working in practical situations. Introducing RDA and the principles of FRBR and FRAD at Copac is going to be an interesting challenge, as we will be receiving records produced to both RDA and AACR2 standards for a while. Copac, with its de-duplication system, already performs some aspects of FRBR, as the same work at multiple libraries is grouped as one record.
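To give a flavour of why this kind of grouping is harder than it looks, here is a toy Python sketch that groups records on a normalised title and author key. It is an assumption-laden illustration only: it is not how Primo groups editions, nor how the Copac de-duplication works, and the sample records are invented.

```python
import re
from collections import defaultdict


def work_key(title, author):
    """Build a crude match key: lower-case, punctuation removed, spaces collapsed."""
    def norm(text):
        text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
        return " ".join(text.split())
    return (norm(title), norm(author))


def group_by_work(records):
    """Group (title, author, format) tuples that share the same work key."""
    groups = defaultdict(list)
    for title, author, fmt in records:
        groups[work_key(title, author)].append((title, author, fmt))
    return dict(groups)


# Invented sample records, for illustration only.
RECORDS = [
    ("Middlemarch", "Eliot, George", "book"),
    ("Middlemarch.", "ELIOT, George", "book"),
    ("Middlemarch", "Eliot, George", "video"),
    ("Middlemarch: a study of provincial life", "Eliot, George", "book"),
]

if __name__ == "__main__":
    for key, members in group_by_work(RECORDS).items():
        print(key, len(members))
```

With a key this simple, the record carrying the subtitle ends up in a group of its own, which is the same kind of miss as the ‘Middlemarch: a study of provincial life’ example above. A real system needs something rather smarter than exact matching on tidied-up strings.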

There were also two presentations dealing with information-seeking behaviour, by Maggie Fieldhouse from UCL and Mark Hepworth from Loughborough. Mark highlighted the need – echoed in later presentations – for users to be given the choice about how much control they have over their search. This was part of ‘training the system’ rather than ‘training the user’. Copac tries to be an ‘educated system’: we provide a variety of search options (from simple to very advanced) through a variety of different interfaces (including browser plug-ins and a Facebook widget), and we hope that this contributes to our users’ search successes. As part of this, we are going to be undertaking some usability studies, which we hope will make Copac even more well-trained.

A very enjoyable and informative day which has given me plenty to think about – and nice new library catalogues to play with!

All the presentations from the JIBS event are available for download:
http://www.jibs.ac.uk/events/workshops/simplerds/