Notes on (Re)Modelling the Library Domain (JISC Workshop)

A couple of weeks ago, I attended JISC’s Modelling the Library Domain Workshop. I was asked to facilitate some sessions at the workshop, which was an interesting but slightly (let’s say) ‘hectic’ experience. Despite this, I found the day very positive. We were dealing with potentially contentious issues, but I noted real consensus around some key points. The ‘death of the OPAC’ was declared and no blood was shed as a result. Instead I largely heard murmured assent. As a community, we might finally have reached a critical juncture, and there were certainly lessons to be learned about the future of services such as Copac, which, as a web search service, would count in the Library Domain Model as a national JISC service ‘Channel.’

In the morning, we were asked to interrogate what has been characterised as the three ‘realms’ of the Library Domain: Corporation, Channels, and Clients. (For more explanation of this model, see the TILE project report on the Library Domain Model). My groups were responsible for picking apart the ‘Channel’ realm definition:

The Channel: a means of delivering knowledge assets to Clients, not necessarily restricted to the holdings or the client base of any particular Corporation. Channels within this model range from local OPACs to national JISC services and ‘webscale’ services such as Amazon and Google Scholar. Operators of channel services will typically require corporate processes (e.g. a library managing its collection, an online book store managing its stock). However, there may be an increasing tendency towards separation, with channels relying on the corporate services of others and vice versa (e.g. a library exposing its records to channels such as Google or Liblime, a bookshop outsourcing some of its channel services to the Amazon marketplace).

In subsequent discussion, we came up with the following key points:

  • This definition of ‘channel’ was too library-centric. We need to work on ‘decentring’ our perspective in this regard.
  • We will see an increasing uncoupling of channels from content. We won’t be pointing users to content/data; rather, data/content will be pushed to users via a plethora of alternative channels.
  • Users will increasingly expect this type of content delivery. Some of these channels we can predict (VLEs, Google, etc.) and others we cannot. We need to learn to live with that uncertainty (for now, at least).
  • There will be an increasing number of ‘mashed’ channels – a recombining of data from different channels into new bespoke/2.0 interfaces.
  • The lines between the realms are already blurring, with users becoming corporations and channels, and so on.
  • We need a more fundamental rethinking of the OPAC as the primary delivery channel for library data. It is simply one channel, serving specific use-cases and business processes within the library domain.
  • Control. This was a big one. In this environment libraries increasingly devolve control of the channels through which their ‘clients’ access the data. What are the risks and opportunities to be explored around this decreasing level of control? What related business cases already exist, and what new business models need to evolve?
  • How are our current ‘traditional’ channels actually being used? How many times are librarians re-inventing the wheel when it comes to creating channels such as e-resource or subject specialist resource pages? We need to understand this on a broad scale.
  • Do we understand how the channels libraries currently control and create might add value, in expected and unexpected ways? There was a general sense that we know very little in this regard.

There’s a lot more to say about the day’s proceedings, but the above points give a pretty good glimpse into the general tenor of the day. I’m now interested to see what use JISC intends to make of these outputs. The ‘what next?’ question now hangs rather heavily.

Educating our systems

JIBS workshop 13/11/08

I attended the JIBS workshop in London on ‘How to compete with Google: simple resource discovery systems for librarians’ with two agendas: as a Copac team member, interested to see what libraries are doing that could be relevant to Copac; and as someone who has recently completed research on federated search engines and is anxious to keep up to date with developments.

The day consisted of seven presentations, and concluded with the panel taking discussion questions. Four of the presentations focussed on specific implementations: of Primo at UEA; of Encore at the University of Glasgow; of ELIN at the universities of Portsmouth and Bath; and of Aquabrowser at the University of Edinburgh. Some interesting themes ran through all of these presentations. One was that of increased web 2.0 functionality – library users expect the same level of functionality from library resource discovery systems as they find elsewhere on the internet. With this in mind, libraries have been choosing systems that allow personalisation in various forms. Some systems allow users to save results and favourite resources, and to choose whether to make these public or keep them private.

Another popular feature is tag clouds. These give users a visual method of exploring subjects, and of expanding or refining their search. Some systems (such as Encore) allow the adding of ‘community’ tags, so users can tag resources as they please rather than relying on cataloguer-added tags. While community tagging expands the resource-discovery possibilities and adds some good web 2.0 user interaction, concerns have been raised about the quality of the tags. Glasgow are putting a system in place to filter out the most common swearwords and, they hope, ward off deliberate vandalism, but there is a worry that user-added tags might not achieve the critical mass needed to become a significant asset in resource discovery. As we at Copac are looking into the possibility of adding tags to Copac records, we will be interested in seeing how this plays out.
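To make the filtering idea concrete, here is a minimal sketch of a community-tag filter of the kind described above. The blocklist entries and the normalisation rules are my own assumptions for illustration, not details of Glasgow’s actual system:

    // A minimal sketch of a community-tag filter. The blocklist entries
    // and normalisation rules are illustrative assumptions only.
    var blockedWords = ['swearword1', 'swearword2']; // placeholder entries

    function isAcceptableTag(tag) {
      // Normalise: lower-case, strip punctuation, trim whitespace.
      var normalised = tag.toLowerCase().replace(/[^a-z0-9 ]/g, '').trim();
      if (normalised.length === 0) {
        return false; // reject empty or punctuation-only tags
      }
      for (var i = 0; i < blockedWords.length; i++) {
        if (normalised.indexOf(blockedWords[i]) !== -1) {
          return false; // reject tags containing a blocked word
        }
      }
      return true;
    }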

The addition of book covers and tables of contents to records seems to be a desirable feature for many libraries – and it is nice that Copac is ahead of the pack in this regard! Informal comments throughout the day showed that people are very enthusiastic about the recent developments at Copac, and enjoy the new look.

It was also very interesting to see that some libraries are introducing (limited) FRBRisation for the handling and display of results. UEA, for instance, are grouping multiple editions of the same work together in their Primo interface. This means that a search for ‘Middlemarch’ returns 31 results, the first of which contains 19 versions of the same item: 18 different editions of Middlemarch in book form, and one video. While the system is not yet perfect (‘Middlemarch: a study of provincial life’ is not yet recognised as the same work), it is very encouraging to see FRBRised results working in practical situations. Introducing RDA and the principles of FRBR and FRAD at Copac is going to be an interesting challenge, as we will be receiving records produced to both RDA and AACR2 standards for a while. Copac, with its de-duplication system, already performs some aspects of FRBRisation, grouping the same work held at multiple libraries into a single record.
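For illustration, here is a minimal sketch of the sort of work-level grouping this involves, assuming simple record objects with title and author fields. Copac’s actual de-duplication matching is considerably more sophisticated than this:

    // Sketch: group catalogue records by a normalised 'work key' so that
    // different editions of the same work fall together. The fields and
    // normalisation rules are illustrative assumptions.
    function workKey(record) {
      // Drop a leading subtitle and punctuation so that 'Middlemarch' and
      // 'Middlemarch: a study of provincial life' share the same key.
      var title = record.title.toLowerCase().split(':')[0];
      title = title.replace(/[^a-z0-9 ]/g, '').trim();
      var author = (record.author || '').toLowerCase().replace(/[^a-z0-9 ]/g, '').trim();
      return author + '|' + title;
    }

    function groupByWork(records) {
      var groups = {};
      for (var i = 0; i < records.length; i++) {
        var key = workKey(records[i]);
        if (!groups[key]) {
          groups[key] = [];
        }
        groups[key].push(records[i]); // each group holds the editions of one work
      }
      return groups;
    }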

There were also two presentations dealing with information-seeking behaviour, by Maggie Fieldhouse from UCL and Mark Hepworth from Loughborough. Mark highlighted the need – echoed in later presentations – for users to be given the choice about how much control they had over their search. This was part of ‘training the system’ rather than ‘training the user’. Copac tries to be an ‘educated system’: we provide a variety of search options (from simple to very advanced) through a variety of different interfaces (including browser plug-ins and a Facebook widget), and we hope that this contributes to our users’ search successes. As part of this, we are going to be undertaking some usability studies, which we hope will make Copac even better trained.

A very enjoyable and informative day which has given me plenty to think about – and nice new library catalogues to play with!

All the presentations from the JIBS event are available for download:
http://www.jibs.ac.uk/events/workshops/simplerds/

To Google or not to Google [with update]

As Ashley has just posted, we’ve reinstated the links to Google Books that appear in the right-hand column of relevant records. Back in March we were pleased to be among the throng incorporating the new Google Books API. If Google’s mission is to ‘organize the world’s information and make it universally accessible and useful,’ who are we to argue? What self-respecting library service wouldn’t want to be a part of a project that promotes the Public Good?

Then something unusual happened — we got complaints. Not a great many, but still a vociferous few who questioned why Copac would give Google ‘personal data’ about them as users. Several of us in the team went back and forth over whether this was actually the case. My own opinion was that a) this was not ‘personal’ data, but usage data, and therefore not a threat to any individual’s privacy, and b) even if we were giving Google a little bit of something about our users and how they behaved, what does it matter if the trade-off is an improved system? Nonetheless, we went ahead and added a small intermediary script so that Google only spoke to the Copac server. No dice.

I was not all that surprised that our attempted workaround wasn’t effective (it would have been nice to have heard something back officially from Google on this front, but we’ll live). I am still wondering if it matters, though. Does it make sense that we ‘pay’ Google for this API by giving them this information about Copac users — their IP addresses and the ISBNs of books they look at? (Is this, in fact, what we’re doing? Paying them?) Isn’t all this just part of the collective move toward the greater Public Good that the entire Google Book Search project seems to be about?

Ultimately, right now, yes. This is the trade-off we’re willing to make. So we’ve reinstated the links, but have also, for now, added an option under Preferences to allow users to de-googlise their searches. Turning off the feature for good would be reactionary to say the least (and perhaps, more to the point, in the political landscape in which Copac operates, *seen* as reactionary). Right now, if you’re in the ‘Resource Discovery’ business, a good relationship with the most ubiquitous and powerful search engine in the world is of no small importance.

Indeed, behind the scenes, our colleagues at RLUK have been working with Google on our behalf to sign an agreement which will mean that Google can spider Copac records. The National Archives has recently done this, and from what I hear anecdotally from people there, it’s already having a dramatic impact on their stats — thousands of users are discovering TNA records through Google searches, and so discovering a resource they might not have known about before. We are hoping that users will have a similar experience with Copac, especially those looking for unique and rare items held in UK libraries that might not surface through any other means. We are eager to see what sort of impact a Google gateway to Copac will have, and we know it can only enhance the exposure of the collections. We’re also exploring this option for The Archives Hub.

Of course, this also means that Google gets to index more information about Copac web searches. David Smith’s article last week, “Google, 10 years on. Big Friendly Giant or Greedy Goliath?”, highlights some of the broader concerns about this. To what extent should we be concerned that a corporation is hoovering up information about our usage behaviour? I am always suspicious of overblown language surrounding technology, and Smith’s article does invoke quite a number of metaphors connoting a dark and grasping Google that we’d better start keeping an eye on: “Google’s tentacles are everywhere.”

But invocations of the ‘Death Star’ notwithstanding (!), I think we’re all learning to be a bit more cautious in our approach to Google. It may not be the Dark Lord, but it’s no ‘Big Friendly Giant’ either. For now, we’re pleased to be able to plug in Google’s free API (thank you, Google) and that Copac will soon be searchable via the engine. But nothing is entirely free, or done for entirely altruistic purposes — this is business, after all. We just have to keep that in mind and talk constructively and openly about what we’re willing to pay.

[Updated to add: Likely much too late in the game, but I’ve just spent an hour or so listening to The Library 2.0 Gang’s podcast with Frances Haugen, product manager for the Google Book Search API. Tim Spalding (LibraryThing) and Oren Beit-Arie (Ex Libris) were among those to pose some of the tougher questions surrounding the API, specifically the fact that it only works client-side and forces the user into the Google environment. According to Frances, future developments will include a server-side API, and an ultimate goal would be to move to a place where the API can be used to mash up data in new interface contexts. We’ll certainly be watching this space :-)]

Google Book Search

We have, once again, re-enabled links to Google Book Search. If you haven’t already seen these links, they appear in the sidebar of the Full Record display, underneath the menu and cover image. The link text will read as either “Google Full View”, “Google Preview” or “Google Book Search”, depending on the amount and type of information held by Google.

Javascript embedded within the Full Record page connects to Google Book Search to determine if Google hold information on the work. This enables us to show links to Google only when there is something useful to see when you follow the link. The downside is that Google will log the IP address of your computer, any cookies they have previously set on your browser, and the ISBN of the work you are viewing, even if you don’t follow the link.
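In outline, the lookup works along the lines of the sketch below, using the Google Book Search Dynamic Links API. This is a simplified illustration rather than our production code, and the helper function and ISBN are examples:

    // Sketch: ask Google Book Search what it holds for an ISBN. Loading
    // the script tag below is itself the request that lets Google log your
    // IP address, cookies and the ISBN, whether or not you click through.
    function showGoogleLinks(booksInfo) {
      for (var bibkey in booksInfo) {
        var info = booksInfo[bibkey];
        // info.preview is 'full', 'partial' or 'noview'.
        if (info.preview === 'full') {
          addSidebarLink('Google Full View', info.preview_url);
        } else if (info.preview === 'partial') {
          addSidebarLink('Google Preview', info.preview_url);
        } else if (info.info_url) {
          addSidebarLink('Google Book Search', info.info_url);
        }
        // addSidebarLink is a hypothetical helper that inserts the link
        // into the sidebar under the cover image.
      }
    }

    // The Full Record page embeds a script tag along these lines (example ISBN):
    // <script src="http://books.google.com/books?bibkeys=ISBN:0141439521&jscmd=viewapi&callback=showGoogleLinks"></script>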

Some of our users expressed concerns about being forced to link to Google, and so we changed the way in which the connection to Google was performed. We had a small script on our server act as an intermediary between your computer and Google. That way your computer was only talking to our server, and all the connections to Google Book Search originated from our server. This worked for a short time until our script was blocked by Google — the message sent back to our script was that “your query looks similar to automated requests from a computer virus or spyware application.” Which I can understand. We did try contacting people at Google to see if there was any way we could keep using our script. All we’ve had from Google is a deathly silence.
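The intermediary pattern itself is straightforward. The sketch below illustrates it in Node.js purely for consistency with the other examples here; it is not our actual script, and the path and parameter names are assumptions:

    // Sketch of the intermediary idea: the browser asks our server, our
    // server asks Google, so Google only ever sees requests from us.
    var http = require('http');
    var url = require('url');

    http.createServer(function (request, response) {
      // e.g. /googlecheck?isbn=0141439521 (illustrative parameter name)
      var isbn = url.parse(request.url, true).query.isbn;
      http.get({
        host: 'books.google.com',
        path: '/books?bibkeys=ISBN:' + isbn + '&jscmd=viewapi&callback=cb'
      }, function (googleResponse) {
        var body = '';
        googleResponse.on('data', function (chunk) { body += chunk; });
        googleResponse.on('end', function () {
          // Relay Google's answer to the browser. Because every request
          // now originates from one IP address, Google treated the
          // traffic as automated and blocked it.
          response.writeHead(200, { 'Content-Type': 'text/javascript' });
          response.end(body);
        });
      });
    }).listen(8080);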

So we’ve re-instated the links to Google Book Search and we now have a Preferences page which enables you to turn the links off if you don’t like Google being able to track what you do on Copac. You will need cookies enabled on your browser for the Preference settings to work. The link to the Preferences page appears in the sidebar menu on the search forms and Full and Brief record displays.