Issues searching other library catalogues

Some of you may have noticed that there is now a facility on the Copac search forms to search your local library catalogue as well as Copac. You’ll only see this option if you have logged into Copac and are from a supported library.

Both the local library catalogues and Copac are searched using the Z39.50 search protocol. Due to differences in local configurations, the queries we send to Copac and to the various library catalogues have to be constructed very differently.

When we built the Copac Z39.50 server, we tried to make it flexible in the type of query it would accept within the limitations imposed upon us by the database software we use. Our database software was made for keyword searching of full text resources. As such it is good at adjacency searches, but you can’t tell it you want to search for a word at the start of a field.

Systems built around relational databases tend to be the complete opposite in functionality. They often aren’t good at keyword searching, but find it very easy to find words at the start of a field.

As a result, we make our default search a keyword search, while some other systems default to searching for query terms at the start of a field. Hence if we send exactly the same search to Copac and a library catalogue, we can get very different results from the two systems. To get a consistent result we have to tweak the query sent to the library so that it performs a search as close as possible to the one performed by Copac. Working out how to tweak (or transform or mangle) the queries is a black art and we are still experimenting.
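To give a flavour of what such a tweak can look like, here is a minimal sketch that spells out the keyword behaviour explicitly instead of relying on each server's default. It is an illustration only, not the code Copac runs. It assumes the query is written in PQF (the prefix query notation used by many Z39.50 toolkits) with the Bib-1 attribute set, where use attribute 4 means title, position attribute 3 means "any position in field" and structure attribute 2 means "word". The function name `to_pqf` is made up for the example, and terms are assumed not to contain double quotes.

```python
def to_pqf(terms, use=4, position=3, structure=2):
    """Build a PQF query that ANDs the terms with explicit Bib-1 attributes.

    Defaults (assumed for illustration): use=4 (title),
    position=3 (any position in field), structure=2 (word).
    """
    if not terms:
        raise ValueError("at least one search term is required")
    attrs = f"@attr 1={use} @attr 3={position} @attr 4={structure}"
    clauses = [f'{attrs} "{term}"' for term in terms]
    # PQF is a prefix notation: "@and A B", nested leftwards for more terms.
    query = clauses[0]
    for clause in clauses[1:]:
        query = f"@and {query} {clause}"
    return query


print(to_pqf(["pride", "prejudice"]))
```

Whether a given library's server honours these attributes is exactly the part that needs experimenting with; some servers ignore or reject attribute combinations they do not support.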

Stop word lists are also an issue. Some library systems like to fail your search if you search for a stop word. Better systems just ignore stop words in queries and perform the search using the remaining terms. The effect is that searching for “Pride and prejudice” fails on some systems because “and” is stop worded. To get around this we have to remove stop words from queries. But we first need to know what the stop words are.
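The removal step itself is simple once a list is known. The sketch below assumes a made-up stop word list; the real list differs from system to system and, as noted above, has to be discovered first. The names `STOP_WORDS` and `strip_stop_words` are hypothetical.

```python
# Hypothetical stop word list; each library system has its own.
STOP_WORDS = {"a", "an", "and", "of", "the"}


def strip_stop_words(query, stop_words=STOP_WORDS):
    """Return the query with stop words removed, keeping the original word order."""
    kept = [word for word in query.split() if word.lower() not in stop_words]
    return " ".join(kept)


print(strip_stop_words("Pride and prejudice"))  # Pride prejudice
```

A query made up entirely of stop words comes back empty, so a caller would need to decide whether to send the original query anyway or skip that library.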

The result is that the search of other library systems is not yet as good as it could be, though it will get better over time as we discover what works best with the various library systems that are out there.

Copac Beta can search your library too

One of the new features we are trialling in the new Copac Beta is the searching of your local institution’s library catalogue alongside Copac. To do this we need to know which Institution you are from and whether or not your Institutional library catalogue can be searched with the Z39.50 protocol.

To identify where you are from, we are using information given to us during the login process. When you log in, your Institution gives us various pieces of information about you, including something called a scoped affiliation. For someone logging in from, say, the University of Manchester, the scoped affiliation might be something like “”

Once we know where you are from, we search a database of Institutional Z39.50 servers to see if your Institution’s library is searchable. If it is we can present the extra options on the search forms, and indeed, fire off any queries to your library catalogue.
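The lookup can be sketched roughly as follows. This is an illustration rather than our actual code: it assumes the scoped affiliation has the usual "affiliation@scope" shape, where the scope is the Institution's domain, and it uses an invented in-memory table in place of the real database. The domain example.ac.uk, the server details and the function names are all placeholders.

```python
# Hypothetical table standing in for the database of Institutional Z39.50 servers.
Z3950_SERVERS = {
    "example.ac.uk": {"host": "z3950.example.ac.uk", "port": 210, "database": "catalogue"},
}


def scope_of(scoped_affiliation):
    """Return the part after the last '@', lower-cased, or None if it is missing."""
    _affiliation, separator, scope = scoped_affiliation.rpartition("@")
    if not separator or not scope:
        return None
    return scope.lower()


def find_server(scoped_affiliation, servers=Z3950_SERVERS):
    """Return the Z39.50 server details for the user's Institution, if we know one."""
    scope = scope_of(scoped_affiliation)
    return servers.get(scope) if scope else None


server = find_server("member@example.ac.uk")
if server:
    print("Show the local catalogue option; target is", server["host"])
```

If `find_server` returns nothing, the search forms simply stay as they are.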

Our database of Z39.50 servers is created from records harvested from the IESR. So, if you’d like your Institution’s catalogue available through Copac, make sure it is included in the IESR by talking to the nice people there.

Many thanks to everyone who tried the Beta interface early on and discovered that this feature mostly wasn’t working. You enabled us to identify some bugs and get the service working.

Getting to know the Copac libraries 3: Exeter, ‘ayns, and hamzas

If you’re not familiar with the holdings of the University of Exeter, you may be slightly confused by the title of this post. Exeter holds, in its Special Collections, Middle East Collections, and Arab World Documentation Centre, a significant collection of resources on the Arabian peninsula and Middle East, including over 15,000 books in Arabic.

Books written in non-Roman scripts have always been a slightly tricky issue for the cataloguer: is it transliterated correctly? Does there need to be a colloquial translation? What about classification, and subject indexing? Does my OPAC support searching in different character sets? Will my OPAC return results in Arabic if the search is performed in English? Will searching in Arabic (which Copac allows) return transliterated results?

This is where we come (if you hadn’t guessed it) to the ‘ayn and the hamza. The ‘ayn is a letter in the Arabic alphabet, while the hamza represents a glottal stop, and they are both often (incorrectly) transliterated as apostrophes.

This makes the cataloguer’s job even more tricky. Add to this the fact that we deal with the records of over 50 libraries – records which have been created over a large number of years, during which cataloguing practices have changed – and you can see that we have a bit of a situation.

Apostrophes are, as a general rule, non-filing characters in catalogue records. But what do you do when an apostrophe is not an apostrophe? This problem with ‘ayns and hamzas (which can occur at the beginning, middle or end of words) was making it very difficult to find Arabic records on Copac: whether you included the correct character; an apostrophe; or nothing at all, you were unlikely to get the results you wanted.
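One common way to tackle this kind of mismatch is to reduce both the indexed text and the query to the same search key, so that all three spellings meet in the middle. The sketch below shows that general idea only; it is an assumption for illustration, not a description of the fix that was actually applied to Copac. The list of characters treated as equivalent (the half-ring letters often used for ‘ayn and hamza, plus various apostrophes and quotes) is likewise a guess, and the name `search_key` is hypothetical.

```python
# Characters assumed, for this example, to stand in for an 'ayn or a hamza:
# U+02BF and U+02BE (half rings), U+02BB and U+02BC (modifier apostrophes),
# U+2018 and U+2019 (curly quotes), the plain apostrophe and the backtick.
AYN_HAMZA_MARKS = "\u02bf\u02be\u02bb\u02bc\u2018\u2019'`"
_STRIP_MARKS = str.maketrans("", "", AYN_HAMZA_MARKS)


def search_key(text):
    """Drop the marks and fold case so variant transliterations compare equal."""
    return text.translate(_STRIP_MARKS).casefold()


# The half ring, the apostrophe and nothing at all give the same key.
variants = ["Qur\u02bean", "Qur'an", "Quran"]
print({search_key(v) for v in variants})
```

The cost of this approach is that genuine apostrophes disappear from the key too, which is in keeping with treating them as non-filing characters.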

Paul Auchterlonie, Librarian for Middle East Studies at Exeter, took the opportunity of being interviewed by me about Exeter’s experiences of being a member of Copac to raise this issue. He not only raised it, he entirely convinced me (who had never heard of either an ‘ayn or a hamza before in my life) of its importance. Then the Copac staff fixed it. Simple, no?

Well, not that simple. The fixing did take Shirley and Ashley some time and effort. Then the data had to be reloaded. And all is not entirely well yet: records in Farsi and Hebrew which have similar problems still need to be reloaded. But the moral of the tale: have a problem with Copac? Let us know! We like fixing things 🙂

DISCLAIMER: While Copac staff do like fixing things, there are issues which we can do nothing about (in the short term, at least – we’re looking at long-term solutions for many issues). This makes us sad. If we can’t fix your issue immediately, please be assured that it’s not because we don’t want to!

Search results as an Atom feed?

Here are a few questions for you. Would it be useful to be able to get your Copac search results as an Atom feed? If so, would it help in aggregating Copac searches with results from other services? Would it make writing widgets for, say, iGoogle or Netvibes, easier? Would you like Copac URLs to be RESTful? (I hope so, as they will be before long.)

Yesterday I was thinking about the different search result formats we provide and I was wondering if Atom might be useful. Then a conversation I had this morning with some colleagues made me think an Atom format could be very useful in the areas outlined above. However, I don’t have experience of implementing widgets or working with feeds, so I thought I’d ask here. Any thoughts, anyone?
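To make the question a bit more concrete, here is a rough sketch of how a set of search results might be turned into an Atom feed. Nothing here reflects a real or planned Copac format: the record fields (`id`, `title`, `url`), the feed URL, the author name and the function name are all invented for the example, and every entry is simply stamped with the time the feed was generated.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def results_to_atom(query, records, feed_url, author="Example catalogue"):
    """Serialise a list of result records as an Atom feed (returned as a string)."""
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    def add(parent, tag, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    add(feed, "title", f"Search results for: {query}")
    add(feed, "id", feed_url)
    add(feed, "updated", now)
    add(feed, "link", rel="self", href=feed_url)
    add(add(feed, "author"), "name", author)

    for record in records:
        entry = add(feed, "entry")
        add(entry, "title", record["title"])
        add(entry, "id", record["id"])
        add(entry, "updated", now)  # simplification: no per-record timestamp
        add(entry, "link", href=record["url"])
    return ET.tostring(feed, encoding="unicode")


records = [{"id": "urn:example:1", "title": "Pride and prejudice",
            "url": "http://example.org/record/1"}]
print(results_to_atom("pride prejudice", records, "http://example.org/search?q=pride"))
```

A feed reader or widget would only need the feed URL, which is where RESTful search URLs would come in handy.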