Announcing the new Copac interface and design…

A tremendous amount has been going on behind the scenes of Copac for quite some time now. Like everyone across the sector, we’re working at what feels like full tilt: tackling multiple projects, and figuring out as a team how to juggle and prioritise it all. We’re undertaking quite a few JISC innovation projects, including work on a shared service prototype for a recommender API based on aggregated circulation data. A considerable amount of effort is being invested in the Copac Collections Management project. We’ve been collaborating with our colleagues across the office on Linked Data research and development, and working closely with the Discovery initiative. And our developers (namely Ashley Sanders) have just about cracked the new database design and algorithms that will address some of the major duplication issues we currently face as a national aggregator of bibliographic records.

In order to understand and meet the needs of our current user-base (800,000 search sessions per month, and counting) we’ve also been conducting market research in the form of surveys, focus groups and interviews with our users and stakeholders. We’ve amassed a lot of knowledge about how Copac is used, its benefit to academics and librarians, the features most valued in the interface, and what we could be doing better (deduplication! Ebook records and access!). We still have a way to go to meet all these needs, and as a service with a ‘perpetual beta ethos,’ committed to innovation, we know we’ll never be ‘done’ with this work.

But the launch of the new interface and design today is a very significant milestone, and one we want to mark. These changes are the product of a great deal of work committed to the principles of market research and user-centred design. Thanks to the efforts of Mimas web developers Leigh Morris and Shiraz Anwar, the new application interface positively reflects the real-world user journeys of Copac users, and has been rigorously tested to ensure it’s in line with those needs. The new graphic design has been developed to communicate the value proposition of Copac as a JISC service representing Research Libraries, and also as a tool for Research Libraries. Mimas’ new graphic designer has done an excellent job of transforming a site that was out of date (‘lacked depth’ and ‘cold’ are, I believe, the words used) into something more engaging, reflecting the breadth and richness of the libraries that make up Copac. Certainly, beyond providing an excellent resource discovery experience for end users (and this is why the simplicity and ease of use of the search and personalisation tools are our primary focus), it is important for us to communicate on behalf of JISC that Copac is a community-driven initiative, made possible by its contributors and representative bodies like RLUK. We hope that the new elements of the website represent this community feel, giving Copac a bit more of an engaging voice than perhaps we’ve previously had.

A big vote of thanks to my fantastic Copac and Mimas colleagues, and particularly those who have worked quite a few late nights and weekends lately: Shirley Cousins, Ashley Sanders, Leigh Morris, Lisa Jeskins, and Beth Ruddock. Thanks to Shiraz Anwar for his work earlier in this project in ensuring every detail of the interface design reflected user needs. Thanks also to Janine Rigby and Lisa Charnock from the Mimas Marketing team for the market research work, and working with us to identify the value proposition and identity of Copac, and to Ben Perry for translating that so swiftly into a design we all instantly agreed on.

Getting Excited about Collection Management

The Copac Collections Management Tools Project is a collaboration between Mimas, RLUK, and the White Rose Consortium.

A number of partners have been working with us here at Mimas on a JISC-funded Collection Management project, which is part of the broader Resource Discovery Taskforce activity.

Since we have all been working on this slightly under the radar, and recognising the need to share more about this project and what’s going on, we’re planning a series of blog posts to update the community on the progress and lessons learned through the partnership. The following update is from Julia Chruszcz, who is project managing this piece of work:

Just two months into the JISC funded Copac Collection Management Project the progress has been significant. At a meeting of the project partners on the 6th May each of the representatives from the White Rose Consortium (WRC) universities (Leeds, York and Sheffield) articulated the potential significance of this tool on their decision making processes around monograph retention and disposal and collection development. This included notions of collaborative collection development and how such a Collection Management Tool could facilitate regional and national approaches, each influencing local decisions for libraries.

The WRC has undertaken the early testing of the web-based tool, an approach the project has adopted to inform development and iteratively assess the tool. The idea is to build up, over the life of the project, a full specification of what will be required to take such a tool forward and introduce it into library workflows. The next stage, between now and the beginning of July, will be to further develop the batch and web technical interfaces based upon the WRC feedback and for this development to undergo further critical testing. The project is due to provide an interim report at the end of June, with a full report to the JISC at the end of July.

The enthusiasm from all the project partners, JISC, Mimas, RLUK and WRC, stems from the realisation that we have the potential to produce a tool that will make a real difference to helping libraries make informed decisions particularly at a time of financial constraint, and assist in furthering the possibility of a national monographs collection, protecting access for researchers at the same time as facilitating local decisions that will save money and resource longer term. And all this by intelligent re-use and application of an existing extensive database, a resource invested in by RLUK and the JISC over many years, the Copac database.

If this is something you are interested in, we’d really like to hear your viewpoint and perspective.

Surfacing the Academic Long Tail — Announcing new work with activity data

We’re pleased to announce that JISC has funded us to work on the SALT (Surfacing the Academic Long Tail) Project, which we’re undertaking with the University of Manchester, John Rylands University Library.

Over the next six months the SALT project will be building a recommender prototype for Copac and the JRUL OPAC interface, which will be tested by the communities of users of those services. Following on from the invaluable work undertaken at the University of Huddersfield, we’ll be working with ten years+ of aggregated and anonymised circulation data amassed by JRUL. Our approach will be to develop an API onto that data, which in turn we’ll use to develop the recommender functionality in both services. Obviously, we’re indebted to the previous knowledge acquired by a similar project at the University of Huddersfield, and the SALT project will work closely with colleagues at Huddersfield (Dave Pattern and Graham Stone) to see what happens when we apply this concept in the research library and national library service contexts.

Our overall aim is that by working collaboratively with other institutions and Research Libraries UK, the SALT project will advance our knowledge and understanding of how best to support research in the 21st century. Libraries are a rich source of valuable information, but sometimes the sheer volume of materials they hold can be overwhelming even to the most experienced researcher, and we know that researchers’ expectations of how to discover content are shifting in an increasingly personalised digital world. We know that library users (particularly those researching niche or specialist subjects) are often seeking content based on a recommendation from a contemporary, a peer, colleagues or academic tutors. The SALT Project aims to give libraries the ability to provide users with that information. Similar to Amazon’s ‘customers who bought this item also bought…’, the recommendations in this system will appear on a local library catalogue and on Copac, and will be based on circulation data gathered over the past 10 years at The University of Manchester’s internationally renowned research library.
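To make the ‘borrowers of this item also borrowed…’ idea concrete, here is a minimal sketch of one way such recommendations could be derived from anonymised loan records. It is not the SALT implementation: the function names, the loan data and the min_count threshold are all assumptions made for illustration, and a real service would need to think much harder about weighting and privacy.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(loans):
    """Count how often two items were borrowed by the same (anonymised) borrower.

    loans: iterable of (borrower_id, item_id) pairs. Both identifiers are
    hypothetical; any opaque tokens would do.
    """
    items_by_borrower = defaultdict(set)
    for borrower, item in loans:
        items_by_borrower[borrower].add(item)

    co = defaultdict(Counter)
    for items in items_by_borrower.values():
        # Every pair of items sharing a borrower counts once per borrower.
        for a, b in combinations(sorted(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, item, limit=5, min_count=2):
    """Return up to `limit` items most often co-borrowed with `item`.

    min_count is an assumed threshold: pairs seen fewer times are dropped,
    which reduces noise and avoids exposing a single borrower's loans.
    """
    ranked = sorted(co.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, count in ranked if count >= min_count][:limit]


# Made-up sample data, purely for illustration.
loans = [
    ("u1", "book-a"), ("u1", "book-b"), ("u1", "book-c"),
    ("u2", "book-a"), ("u2", "book-b"),
    ("u3", "book-a"), ("u3", "book-c"),
    ("u4", "book-a"), ("u4", "book-b"),
]
co = build_cooccurrence(loans)
print(recommend(co, "book-a"))
```

Simple co-occurrence counting like this tends to favour already popular items, so surfacing the long tail would probably need some form of normalisation on top.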

How effective will this model prove to be for users, particularly humanities researchers?

Here’s what we want to find out:

  • Will researchers in the field of humanities benefit from receiving book recommendations, and if so, in what ways?
  • Will the users go beyond the reading list and be exposed to rare and niche collections — will new paths of discovery be opened up?
  • Will collections in the library, previously undervalued and underused, find a new appreciative audience? Will the Long Tail be exposed and exploited for research?
  • Will researchers see new links in their studies, possibly in other disciplines?

We also want to consider if there are other  potential beneficiaries.  By highlighting rarer collections, valuing niche items and bringing to the surface less popular but nevertheless worthy materials, libraries will have the leverage they need to ensure the preservation of these rich materials. Can such data or services assist in decision-making around collections management? We will be consulting with Leeds University Library and the White Rose Consortium, as well as UKRR in this area.

Finally, as part of our sustainability planning, we want to look at how scalable this approach might be for developing a shared aggregation service of circulation data for UK University Libraries. We’re working with potential data contributors such as Cambridge University Library, University of Sussex Library, and the M25 consortium, as well as RLUK, to trial and provide feedback on the project outputs, with specific attention to the sustainability of an API service as a national shared service for HE/FE that supports academic excellence and drives institutional efficiencies.

Notes on (Re)Modelling the Library Domain (JISC Workshop).

A couple of weeks ago, I attended JISC’s Modelling the Library Domain Workshop. I was asked to facilitate some sessions at the workshop, which was an interesting but slightly (let’s say) ‘hectic’ experience. Despite this, I found the day very positive. We were dealing with potentially contentious issues, but I noted real consensus around some key points. The ‘death of the OPAC’ was declared and no blood was shed as a result. Instead I largely heard murmured assent. As a community, we might have finally faced a critical juncture, and there were certainly lessons to be learned in terms of considering the future of services such as Copac, which, as a web search service, would count in the Library Domain Model as a national JISC service ‘Channel.’

In the morning, we were asked to interrogate what has been characterised as the three ‘realms’ of the Library Domain: Corporation, Channels, and Clients. (For more explanation of this model, see the TILE project report on the Library Domain Model). My groups were responsible for picking apart the ‘Channel’ realm definition:

The Channel: a means of delivering knowledge assets to Clients, not necessarily restricted to the holdings or the client base of any particular Corporation, Channels within this model range from local OPACs to national JISC services and ‘webscale’ services such as Amazon and Google Scholar. Operators of channel services will typically require corporate processes (e.g. a library managing its collection, an online book store managing its stock). However, there may be an increasing tendency towards separation, channels relying on the corporate services of others and vice versa (e.g. a library exposing its records to channels such as Google or Liblime, a bookshop outsourcing some of its channel services to the Amazon marketplace).

In subsequent discussion, we came up with the following key points:

  • This definition of ‘channel’ was too library-centric. We need to work on ‘decentring’ our perspective in this regard.
  • We will see an increasing uncoupling of channels from content. We won’t be pointing users to content/data but rather data/content will be pushed to users via a plethora of alternative channels.
  • Users will increasingly expect this type of content delivery. Some of these channels we can predict (VLEs, Google, etc) and others we cannot. We need to learn to live with that uncertainty (for now, at least).
  • There will be an increasing number of ‘mashed’ channels – a recombining of data from different channels into new bespoke/2.0 interfaces.
  • The lines between the realms are already blurring, with users becoming corporations and channels….etc., etc.
  • We need more fundamental rethinking of the OPAC as the primary delivery channel for library data. It is simply one channel, serving specific use-cases and business process within the library domain.
  • Control. This was a big one. In this environment libraries increasingly devolve control of the channels through which their ‘clients’ access the data. What are the risks and opportunities to be explored around this decreasing level of control? What related business cases already exist, and what new business models need to evolve?
  • How are our current ‘traditional’ channels actually being used? How many times are librarians re-inventing the wheel when it comes to creating the channels of e-resource or subject specialist resource pages? We need to understand this at a broad scale.
  • Do we understand the ways in which the channels libraries currently control and create might add value in expected and unexpected ways? There was a general sense that we know very little in this regard.

There’s a lot more to say about the day’s proceedings, but the above points give a pretty good glimpse into the general tenor of the day. I’m now interested to see what use JISC intends to make of these outputs. The ‘what next?’ question now hangs rather heavily.

It’s Official — Copac’s Re-engineering

We’ve been hinting a while now about significant changes being imminent for Copac, and I am now pleased to announce that we’ve had official word that we have secured JISC funding to overhaul the Copac service over the next year.

The major aim for this work is to improve the Copac user experience.  In the short term this will mean improving the quality of the search results.  More broadly, this will mean providing more options for personalising and reusing Copac records.

We’re going to be undertaking the work in two phases. We’re calling Phase 1 the ‘iCue Project’ (which stands for ‘Improving the Copac User Experience’). This work will be focused on investigating and proposing pragmatic solutions that improve the Copac infrastructure and end-user experience, and we’re going to be partnering with Mark Van Harmelen of Personal Learning Environments Ltd (PLE) in this work (Mark is also involved in the JISC TILE project, so we believe there’s a lot of fruitful overlap there, especially around leveraging the potential of circulation data a la Huddersfield). The second phase is really about doing the work: re-engineering Copac in line with the specifications defined in the iCue Project.

We see this work tackling three key areas for Copac:

(i) Interface revision: We’ll be redesigning Copac’s user interface, focusing on areas of usability and navigability of search results. We are aware that the sheer size of our database and our current system means that searches can return large, unstructured result sets that do not facilitate users finding what they need.  Addressing this is a major priority.  We’ll be building on the CERLIM usability report we recently commissioned (more on that in another post) and also drawing on the expertise of OPAC 2.0 specialists such as Dave Pattern.  We’ll also be working consistently with users (librarian users and researcher users) to monitor and assess how we’re doing.

(ii) Database Restructuring: A more usable user interface is going to critically rely on a suitable restructuring of Copac’s database. Particularly, we are centrally interested in FRBR (Functional Requirements for Bibliographic Records) as a starting point for a new database structure. We anticipate that whatever we learn as we undertake this piece of work will be of interest to the broader community, and plan to disseminate this knowledge, and update the community via this blog.

(iii)  De-duplication: The restructuring implies further de-duplication of Copac’s contents, and so we’re also developing a de-duplication algorithm.  Ideally we would like to see the FRBR levels of work, expression, manifestation and (deduplicated) item being supported, or a pragmatic version of the same.
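As a rough illustration of what de-duplicating against FRBR-style levels can involve, the sketch below groups contributed records under a normalised match key, so that records describing the same edition collapse into one result with several holding libraries. This is only an assumed, simplified approach (exact matching on title, author and year); it is not the algorithm being developed for Copac, and the field names and library names are invented for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalise(text):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Assumed match key: roughly one key per edition (FRBR 'manifestation')."""
    return (normalise(record["title"]), normalise(record["author"]), record["year"])


def group_duplicates(records):
    """Map each match key to the list of libraries holding that edition."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["library"])
    return dict(groups)


# Illustrative records with hypothetical contributing libraries.
records = [
    {"title": "Middlemarch", "author": "Eliot, George", "year": 1871, "library": "Library A"},
    {"title": "Middlemarch.", "author": "ELIOT, George", "year": 1871, "library": "Library B"},
    {"title": "Middlemarch", "author": "Eliot, George", "year": 1994, "library": "Library C"},
]
groups = group_duplicates(records)
for key, libraries in groups.items():
    print(key, libraries)
```

Dropping the year from the key would cluster the two editions together, which is closer to the FRBR ‘work’ level. Real catalogue data is far messier than this, so exact keys would miss many duplicates that a production algorithm has to catch.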

The end user benefits:
1. Searches are faster and more effective (Copac database is more responsive and robust; users are presented with a more dramatically de-duplicated results view)
2. Search-related tasks are easier to perform (i.e. the flexibility of this system will support the narrowing/broadening of searches, faceted searching, personalising/sharing content)
3. Access to more collections (Copac database is able to hold more content and continue to grow)

So there we have it.  It’s going to be quite a year for the Copac team.  If you have any questions, comments or suggestions you’d like us to take on board, do leave a comment here or email us.  (Not that this will be the only time we ask!) We can also be chatted to via twitter @Copac.

Catalogues as Communities? (Some thoughts on Libraries of the Future)

At last week’s Libraries of the Future debate, Ken Chad challenged the presenters (and the audience) over the failure of libraries to aggregate and share their data. I am very familiar with this battle-cry from Ken. In the year+ that I’ve been managing Copac, he’s (good-naturedly) put me on the spot several times on this very issue. Why isn’t Copac (or the UK HE/FE library community) learning from Amazon, and responding to users’ new expectations for personalisation and adaptive systems?

Of course, this is a critically important question, and one that is at the heart of the JISC TILE project, which Ken co-directs (I actually sit on the Reference Group). Ken’s  related argument is that the public sector business model (or lack thereof) is perhaps fatally flawed, and that we are probably doomed in this regard; private sector is winning already on the personalisation front, so instead of pouring public money into resource discovery ‘services’ we should instead, perhaps, let the market decide.  I am not going to address the issue of business models here – although this is a weighty issue requiring debate – but I want to come back to this issue of personalisation, 2.0, and the OPAC as a potential ‘architecture for participation.’

I fundamentally agree with the TILE project premise (borrowed from Lorcan Dempsey) that the library domain needs to be redefined as a set of processes required for people to interact with ‘stuff’.  We need to ask ourselves if the OPAC itself is a relic, an outmoded understanding of ‘public access’ or (social) interaction with digital content. As we do this, we’re creating heady visions where catalogue items or works can be enhanced with user-generated content, becoming ‘social objects’ that bring knowledge communities together.  ‘Access’ becomes less important than facilitating ‘use’ (or reuse) and the Discovery to Delivery paradigm is turned on its head.

It’s the ‘context’ of the OPAC as a site for participation that I am interested in questioning.  Can we simply ‘borrow’ from the successful models of Amazon or LibraryThing? Is the OPAC the ‘place’ or context that can best facilitate participative communities?

This might depend on how we’re defining participation, and as Owen Stephens has suggested (via Twitter chats) what the value of that participation is to the user.  In terms of Copac’s ‘My References’ live beta, we’ve implemented ‘tagging with a twist,’ where tagging is based on user search terms and saved under ‘Search History’.  The value here is fairly self-evident – this is a way for users to organise their own ‘stuff’. The tagging facility, too, can be used to self-organise, and as Tim Spalding suggested way back in 2007, this is also why tagging works for LibraryThing (and why it doesn’t work for Amazon). Tagging works well when people tag “their” stuff, but it fails when they’re asked to do it to “someone else’s” stuff. You can’t get your customers to organize your products, unless you give them a very good incentive.

But does this count as ‘community’ participation? Right now we don’t provide the option for tags to be shared, though this is being seriously considered along the lines of a recommender function (‘users who saved this item also saved…’), which seems to be a logical next step, and potentially complementary to Dave’s recommender work. However, I’m much less convinced about whether HE/FE library users would want to explicitly share items through identity profiles, as at LibraryThing. Would the LibraryThing community model translate to the models that university and college libraries might need in order to support the semantically dense and complex communities for learning, teaching and research?

One of the challenges for a participatory OPAC 2.0 (or any cross-domain information discovery tool) will be the tackling of user context, and specifically the semantic context(s) in which that user is operating. Semantic harvesting and text mining projects such as the Intute Repository Search have pinpointed the challenge of ‘ontological drift’ between disciplines and levels (terms and concepts having shifted meanings across disciplinary boundaries). As we move into this new terrain of Library 2.0 this drift will likely become all the more evident. Is the OPAC context too broad to facilitate the type of semantic precision needed to enable meaningful contribution and community-building?

Perhaps attention data, that ‘user DNA,’ will provide us with new ways to tackle the challenge.  There is risk involved, but some potential ‘quick wins’ that are of clear benefit.  Dave’s blog posts over the last week suggest that the value here might be in discovering people ‘like me’ who share the same research interests and keep borrowing books like the ones I borrow (although, if I am an academic researcher, that person might also be ‘The Competition’ — so there are degrees of risk to account for here — and this is just the tip of the ice-berg in terms of considering the cultural politics of academia and education).  Certainly the immediate value or ‘impact of serendipity’ is that it gives users new routes into content, new paths of discovery based on patterns of usage.

But what many of us find so compelling about the circulation data work is that it surfaces latent networks not just of books, but of people.  These are potential knowledge communities or what Wenger might call Communities of Practice (CoP).  Whether the OPAC can help nurture and strengthen those CoPs is another matter. Crowds, even wise ones, are not necessarily Communities of Practice.

Reimagining the library means reimagining (or discarding) the concept of the catalogue. This might also mean rethinking the OPAC as a context for community interaction.


[Related ‘watch this space’ footnote: We’ve already garnered some great feedback on the ‘My References’ beta we currently have up — over 80 user-surveys completed (and a good proportion of those from non-librarian users).  This feedback has been invaluable.  Of course, before we embark on too many more 2.0 developments, Copac needs to be fit-for-purpose.  In the next year we are re-engineering Copac, moving to new hardware, restructuring the database,  improving the speed and search precision, and developing additional (much-needed) de-duplication algorithms.  We’re also going to be undertaking a complete  overhaul of the interface (and I’m pleased to say that Dave Pattern is going to be assisting us in this aspect). In addition, as Mimas is collaborating on the TILE project through Copac, we’re going to look at how we can exploit what Dave’s done with the Huddersfield circulation data (and hopefully help bring other libraries on board).]

Amazon Profits from Copac Usability Testing

Well Amazon profits a bit.  Last week I busily printed off a spate of £35 Amazon certificates; these are being used as incentives for those willing to spend a few hours with the Copac interface and the CERLIM team (the Centre for Research in Library and Information Management, conveniently located just down the road at MMU).  As we start to plan some (not inconsiderable) overhauls to the service, the time seemed very much right for undertaking some serious usability testing too. As we begin to develop new features for users, including some personalisation tools, it’s more than necessary to take a reality check on how much this feature creep is going to affect users.  So, over the next few weeks we have tasked the research specialists at CERLIM to help us better understand how our users navigate (or don’t) the current interface,  and also provide us with concrete ideas on how our new interface should look as we redevelop over the next year.    They’re going to be using a mixture of search tasks, interviews and structured focus groups, and have managed to engage a good sample of ‘typical’ users (i.e. researchers and postgrads from a range of disciplines).  We know the findings are going to be invaluable (even while we brace ourselves, just a tad;-)).

Perspectives on Goldmining.

Last Friday, Shirley and I headed down to London for the TiLE workshop: ‘“Sitting on a gold mine”: Improving Provision and Services for Learners by Aggregating and Using “Learner Behaviour Data”.’ The aim of the workshop was to take a ‘blue skies’ (but also practical) view of how usage data can be aggregated to improve resource discovery services on a local and national (and potentially global) level. Chris Keene from the University of Sussex library has written a really useful and comprehensive post about the proceedings (I had no idea he was feverishly live blogging across the table from me, but thanks, Chris!).

I was invited to present a ‘Sector Perspective’ on the issue, and specifically the ‘Pain Points’ identified around ‘Creating Context’ and ‘Enabling Contribution.’ The TiLE project suggests a lofty vision where, with a sufficient amount of context data about a user (derived from goldmines such as attention data pools and profile data stored within VLEs, library service databases, institutional profiles; you know, simple enough;-) services could become much more Amazon-like. OPACs could suggest to users, ‘First Year History Students who used this textbook, also highly rated this textbook…’ and such. The OPAC is thus transformed from a relic of the past to a dynamic online space enabling robust ‘architectures of participation.’

This view is very appealing, and certainly at Copac we’re doing our part to really interrogate how we can support *effective* adaptive personalisation. Nonetheless, as a former researcher and teacher, I’ve always had my doubts as to whether the Library catalogue per se, is the right ‘place’ for this type of activity.

We might be able to ‘enable contribution’ technically, but will it make a difference? An area that perhaps most urgently needs attention is research on the social component and drivers for contributing user-generated content.  As the TiLE project has identified, the ‘goldmine’ here to galvanise such usage is ‘context’ or usage data. But is it enough, especially in the context of specialised research?

As an example of the potential ‘cultural issues’ that might emerge, the TiLE project suggests the case of the questionably nefarious tag ‘wkd bk m8’ which is submitted as a tag for a record. They ask, “Is this a low-quality contribution, or does it signal something useful to other users, particularly to users who are similar to the contributor?”

I’d tend to agree with the latter, but would also say that this is just the tip of the iceberg when it comes to rhetorical context. For example, consider the user-generated content that might arise around contentious works around the ‘State of Israel.’ The fact that Wikipedia has multiple differing and ‘sparring’ entries around this is a good indicator of the complexity that emerges. I would say that this is incredibly rich complexity, but on a practical level potentially very difficult for users to negotiate. Which UGC-derived ‘context’ is relevant for differing users? Will our user model be granular or precise enough to adjust accordingly?

One of the challenges of accommodating a system-wide model is the tackling of semantic context. Right now, for instance, Mimas and EDINA have been tasked to come up with a demonstrator for a tag recommender that could be implemented across JISC services. This seems like a relatively simple proposition, but as soon as we start thinking about semantic context, we are immediately confronted with the question of which concept models or ontologies do we draw from?

Semantic harvesting and text mining projects such as the Intute Repository Search have pinpointed the challenge of ‘ontological drift’ between disciplines and levels. As we move into this new terrain of Library 2.0 this drift will likely become all the more evident.

Is the OPAC too generic to facilitate the type of semantic precision to enable meaningful contribution? I have a hunch it is, as did other participants when we broke out into discussion sessions.

But perhaps the goldmine of context data, that ‘user DNA,’ will provide us with new ways to tackle the challenge, and there was also a general sense that we needed to forge forward on this issue: try things out and experiment with attention data. A service that aggregates both user-generated and attention/context data would be of tremendous benefit, and Copac (and other like services) can potentially move to a model where adaptive personalisation is supported. Indeed, Copac as a system-wide service has great potential as an aggregator in this regard.

There is risk involved around these issues, but there are some potential ‘quick wins’ that are of clear immediate benefit. Another speaker on Friday was Dave Pattern, who within a few minutes of ‘beaming to us live via video from Huddersfield’ had released the University of Huddersfield’s book usage data (check it out).

This is one goldmine we’re only too happy to dig into, and we’re looking forward to collaborating with Dave in the next year to find ways to exploit and further his work in a National context. We want to implement recommender functions in Copac, but also (more importantly) to work at Mimas on developing a system for storing and sharing usage data from multiple UK libraries (any early volunteers?!). The idea is that this data can also be reused to improve services on a local level. We’re just at the proposal stage in this whole process, but we feel very motivated, and the energy of the TiLE project workshop has only motivated us more.

Of Circulation Data and Goldmines…

If you’d told me a bit more than a year ago that I’d be getting all excited about the radical potential of library circulation data, well…

This afternoon we had an interesting chat with Dave Pattern from the University of Huddersfield (he of Opac 2.0 and ‘users who borrowed this also borrowed…’ fame).  We’re hoping to collaborate with Dave to see how his important work can be taken forward on a national level.  Dave is about to release the Huddersfield circulation data (anonymised and aggregated) to the community and he’s hoping it will trigger some debate and ideas for developments.   This certainly is a real opportunity for people in our field.  On our end, we’d like to figure out how we could develop a similar feature for Copac, but also look at how to bring more libraries into the mix — contributing more data so those ‘recommendations’ are more effective.

Dave and I both sit on the TILE reference group, and there has been some important work going on in that project about the potential ‘goldmine’ of attention data we’re all sitting on at institutions and data centres.  TILE recommendations suggest the development of an attention-data store service.  Frankly, the sheer scale of this type of all-encompassing undertaking gives me headspin, but a service for the storage and open sharing of circulation data less so.  In fact, JISC has also recently tasked Mimas and EDINA to propose work around ‘Personalised Search and Recommendation Engines,’ so there’s real scope to think carefully about what such a service might look like.

Goldmine indeed — I’m speaking (from my ‘sector perspective’) at the TILE meeting next week.  The focus of the meeting is to look at how we can improve services for learners by aggregating and using learning behaviour data.  For our part, I am keen to see where this work with circulation and attention data can take us, and I’m looking forward to putting some thoughts together on this score for the meeting.

Spooky Personalisation (should we be afraid?)

Last Thursday members of the D2D team met up with people from the DPIE 1 and DPIE 2 projects, as well as Paul Walk (representing the IE demonstrator project). The aim was to talk ‘personalisation’ developments for the JISC IE. It’s impossible to cover the entire scope of discussion here (we were at it most of the day). As you might predict, it was a day of heated but engaging debate around a topic that is technically and socially complex. As we think about the strategic future of services and cross-service development, there are serious question marks over which direction we’re headed in terms of personalisation (and, of course, whether it’s even possible to talk of ‘a’ direction).

The key practical aim of the meeting was to share the personalisation aspects of D2D project work, and also to discuss the recommendations of the two DPIE reports. The D2D work includes some development of personalisation components for Copac, components we are referring to cautiously as ‘lightweight’ for now. One way in which we plan to ‘personalise’ the service for users is by offering a ‘My Local Library’ cross-search, achieved (we hope — we’re very much in early phases here) via a combination of authentication and IP recognition used to identify users’ geographical location, and then a cross-search of local institutional library holdings data via Z39.50 targets.
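To make the IP-recognition half of that idea a little more concrete, here is a minimal sketch in Python. It is not how Copac does (or will do) this: the network ranges are reserved documentation addresses, the Z39.50 host names are invented, and `LOCAL_TARGETS` and `local_targets_for` are hypothetical names used only for illustration. It simply shows how a user's address might be mapped to a list of local library targets to cross-search.

```python
import ipaddress

# Hypothetical mapping from institutional IP ranges to the Z39.50 targets
# of nearby libraries. The ranges are reserved documentation networks and
# the host names are made up for this example.
LOCAL_TARGETS = {
    ipaddress.ip_network("192.0.2.0/24"): [
        "z3950.library-one.example:210",
    ],
    ipaddress.ip_network("198.51.100.0/24"): [
        "z3950.library-two.example:210",
        "z3950.library-three.example:210",
    ],
}


def local_targets_for(ip_string):
    """Return the local Z39.50 targets for a user's IP address.

    An empty list means the address is not recognised, in which case a
    service would presumably fall back to authentication or to the
    ordinary search.
    """
    address = ipaddress.ip_address(ip_string)
    for network, targets in LOCAL_TARGETS.items():
        if address in network:
            return targets
    return []


print(local_targets_for("198.51.100.42"))
print(local_targets_for("203.0.113.5"))
```

The actual cross-search of those targets is left out on purpose; that part depends on whichever Z39.50 client the service uses.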

In addition, by the middle of next year, Copac users will be able to save marked lists of records and export them into other 2.0 environments via an Atom feed (I’ll let Ashley write the more technical post on that development). Further down the line (i.e. beyond the next six months) we are interested in providing tools for users to annotate, bookmark and tag these records, but we also want to make sure that any such developments are not made in isolation or in an overly ‘Copac’-centric way — there’s a lot to explore here, obviously.

In and of themselves, these developments are not especially complex — the latter is an example of personalisation via ‘customisation’ (to use JISC definitions) where users explicitly customise content according to their own preferences. What I am especially interested in, however, is how saved lists (‘My Bibliography?’) could be used to potentially support adaptive personalisation (this is what Max Hammond, co-author of the DPIE 2 report wryly referred to at the meeting as ‘spooky’).

Dave Pattern’s experiment with using circulation data to ‘recommend’ items to University of Huddersfield library users is well known, and I hope the first step towards some potentially very interesting UK developments. On this end, we’re interested in knowing if there is anything similar to be gleaned from saved personal lists — ‘users with this item in their saved lists also have…’ (or something along those lines). This terrain is very much untested, and one of the critical issues, of course, is uptake. Amazon’s recommender function is effective due to the sheer number of users (effective *some* of the time, that is — we all have ‘off-base’ Amazon recommendations stories to tell, I admit). And this is just one small example of how adaptive personalisation of a service like Copac (or other JISC IE services) might work — there are also opportunities around capturing attention data, for instance.
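For the curious, the ‘users with this item in their saved lists also have…’ idea can be sketched as a simple co-occurrence count. This is only an illustration under stated assumptions: the record identifiers and saved lists are invented, `build_cooccurrence` and `also_saved` are hypothetical names, and this is not a description of the Huddersfield approach or of anything Copac has built.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(saved_lists):
    """Count, for every pair of records, how many saved lists hold both."""
    co = defaultdict(Counter)
    for items in saved_lists:
        # Use a set so a record saved twice in one list is counted once.
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co


def also_saved(co, record_id, min_lists=2, limit=5):
    """Records that share at least `min_lists` saved lists with `record_id`.

    Results are ordered by shared-list count (highest first), with ties
    broken alphabetically so the output is stable.
    """
    ranked = sorted(co.get(record_id, {}).items(),
                    key=lambda pair: (-pair[1], pair[0]))
    return [rid for rid, count in ranked if count >= min_lists][:limit]


# Invented saved lists, one per (anonymous) user.
saved_lists = [
    ["rec-A", "rec-B", "rec-C"],
    ["rec-A", "rec-B"],
    ["rec-B", "rec-D"],
    ["rec-A", "rec-C", "rec-D"],
]

co = build_cooccurrence(saved_lists)
print(also_saved(co, "rec-A"))  # ['rec-B', 'rec-C']
print(also_saved(co, "rec-D"))  # [] (too few lists share anything with rec-D)
```

The `min_lists` threshold is where the uptake problem shows itself: with only a handful of saved lists, most records never clear it and there is nothing to recommend.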

The DPIE 2 report urges extreme caution in this regard. It raises some very pointed questions about how JISC and its services should approach adaptive personalisation. Too often, the authors warn, ‘personalisation’ is established as a specific goal, with the assumption that ‘personalisation’ is intrinsically valuable. In this context, change is technology-driven rather than user-driven, which is fine for experimental and demonstrator work, but high-risk for established services, where there is a strong likelihood of failure. They question how helpful the definitions of personalisation put forth by JISC are in carrying forward a development agenda (Customisation; Adaptive Personalisation based on Data held elsewhere (APOD); Adaptive Personalisation based on User Activity (APUA)). This definition “provides a mix of concepts from data capture to functionality, rather than setting out the logical link between a source of data about a user, a model of that user, and an approach to providing the user with a personal service” (17). Also missing is a robust benefits mapping process — “there is little analysis of the benefits of personalisation, beyond an assertion that it improves the user experience” (20). The report concludes:

Complex developments of “personalisation services” and similar should not be a current
priority for JISC. It seems unlikely that an external personalisation service will be able to
generate a user model which is detailed enough to be of genuine use in personalising
content; user preferences are probably not broadly applicable beyond the specific resource
for which they are set, and user behaviour is difficult to understand without deep
understanding of the resource being used. Attempting to develop user models which are
sufficiently generic to be of use to several services, but sufficiently detailed to facilitate useful
functionality is likely to be a challenging undertaking, with a high risk of failure. (26)

These are somewhat sobering thoughts, especially in a climate of personalisation and 2.0 fervour, but overall the report is useful in considering how to tread the next steps in development activity. Key for us is this issue of the user model: can we (Copac? SUNCAT? JISC?) develop one that is likely to be of use to several services? My hunch right now is ‘no.’ We know very little that is concrete about researchers’ behaviour and how they might benefit from such tools (interestingly, both DPIE reports focused on benefits for undergraduate students, when most of the services in question are largely used by researchers). About humanities researchers, we know even less (much of the interesting work around online ‘Collaboratories’ centres on the STEM disciplines). Apparently JISC is about to commission some investigative work with researcher-users, and here at Mimas a team is about to undertake some focus group work with humanities researchers to determine how personalisation tools for services like Copac, Intute and the Archives Hub could (or could not) deliver specific benefits to their work. I’m sure this research will prove very useful.

We’re urged to ‘proceed with caution,’ but we proceed nonetheless. At Copac we’re taking a long hard look at what a personalised service might look like, and accepting that some risk-taking is likely in the future. I’m very interested to know others’ opinions on a possible recommender function for Copac — at what level could such a tool prove useful, and when might it possibly be obstructive? Personally, I have used the ‘People who have bought also bought’ feature in Amazon quite extensively as a useful search tool. I am less likely to take up the direct recommendations that Amazon pitches at me through my ‘personalised’ Amazon home page, however. (This comes, in part, from making purchases for a six-year-old boy. If only I could toggle between ‘mummy’ and ‘professional’ profiles… now there’s a radical thought.)