Sunday, August 23, 2015

Things I learnt at ALA Annual Conference 2015 - or, data is rising

I had the privilege to attend ALA annual conference 2015 in San Francisco this summer. This was my 2nd visit to this conference (see my post in 2011) and as usual I had lots of fun.

Presenting at the "Library Guides in an Era of Discovery Layers" Session

My ex-colleague and I were kindly invited to present our work on a bento-style search we implemented for LibGuides.

For technical details, please refer to our joint paper Implementing a Bento-Style Search in LibGuides v2 in the July issue of the Code4Lib Journal.

See the Storify of the event at

Data is rising 

Before I attended ALA 2015, I was of course aware that research data management was an increasingly important service that academic librarians are, or should be, supporting.

To be perfectly frank though, it was a hazy kind of "aware".

I knew that grant-giving organizations like the NIH and other funders were increasingly requiring researchers to submit data sharing plans, so that was an area where academic librarians could provide support, particularly if open access takes hold, since open access would make many traditional tasks obsolete.

I also knew there was all this talk about supporting Digital Humanities and GIS (geographic information system) services, such that my former institution began appointing Digital Humanities and GIS librarians just before I left.

Perhaps closer to my wheelhouse, given my interest in library discovery, there was talk about linked data and BIBFRAME, which aren't research data management per se.

All three are emerging areas that I knew, or strongly suspected, would be important, though I was unsure about the timing or even their exact nature (see below).

Add the libraries' duty of stewardship towards the "Evolving Scholarly Record" (what counts as the scholarly record is now much expanded beyond just the final published article, and libraries need to collect and preserve all of it), and you can see why "data" is a word librarians are saying a lot more.

Still, attending ALA Annual 2015 made me wonder whether a tipping point has finally been reached and whether I should start looking at this area more deeply.

Is Linked data finally on the horizon?

During Marshall Breeding's session "The Future of Library Resource Discovery: Creating New Worlds for Users (and Librarians)", he asked this question.

Breeding's observation was indeed apt, though one's choice of sessions obviously has an impact; this blogger, for example, wonders if the apparent overdose of linked data is simply due to her own interests.

Still, this year there seemed to be quite a lot of talk about linked data and BIBFRAME. Perhaps a tipping point has been reached?

I think part of it is due to the fact that ILS/LMS/LSP vendors have begun to support linked data.
This breaks the chicken-and-egg problem: there are no tools for linked data because nobody seems interested in using it, and nobody is interested in using it because there are no tools.

The biggest announcement concerned Intota v2, ProQuest's cloud-based library services platform.

"Intota v2 will also deliver a next generation version of ProQuest's renowned Knowledgebase. Powered by a linked data metadata engine, Intota will allow libraries to participate in the revolutionary move from MARC records to linked data that can be discovered on the web, increasing the visibility of the library." - Press release

I was actually in attendance during the session but left before the demo (I'm kicking myself for that). The tweet below is interesting as well.

Of course, we can also expect Summon to start taking advantage of linked data to enhance discovery via Intota.

Besides ProQuest, SirsiDynix announced plans to "produce BIBFRAME product in Q4 2015", while Innovative had pledged support to the Libhub Initiative a few months earlier.

OCLC, of course, has long been an early pioneer of linked data.
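To make the MARC-to-linked-data shift a little more concrete, here is a minimal sketch (not any vendor's actual output) of what a bibliographic description looks like as linked data, using schema.org-style JSON-LD. The URIs are made up for illustration; BIBFRAME proper uses its own, much richer vocabulary.

```python
import json

# A hypothetical, much-simplified linked-data description of a book.
# The key shift from MARC: things are identified by web-resolvable URIs,
# and the author links to an external authority rather than a text string.
record = {
    "@context": "http://schema.org",
    "@type": "Book",
    "@id": "http://example.org/work/moby-dick",  # made-up URI for the work
    "name": "Moby Dick",
    "author": {
        "@type": "Person",
        # In practice this would be a VIAF or id.loc.gov authority URI.
        "@id": "http://example.org/authority/melville-herman",
        "name": "Melville, Herman",
    },
    "datePublished": "1851",
}

print(json.dumps(record, indent=2))
```

Because the description is just a graph of URIs, a web crawler can follow the links, which is exactly the "discovered on the web" visibility the press release promises.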

"Nobody comes to librarians for literature review?"

As part of my attempt to balance sessions in areas I was really interested in (and hence likely to be well versed in most of what was shown) against sessions in areas I was totally unfamiliar with (and hence likely to have most things go over my head), I decided to go to some GIS sessions.

I accompanied my ex-colleague and co-presenter to a couple of sessions on GIS (Geographic Information Systems), an area he is passionate about and is currently tasked with starting up for his library.

I attended various sessions, including a round table that focused more on what libraries were doing than on technical matters. It was clear from the start that some academic libraries in the US were far more advanced than others, such as Princeton, where I believe a librarian stated that libraries have been managing data for over 50 years and it's not a new thing to them.

Much nodding of heads occurred when someone warned about jumping on the bandwagon simply because their University Librarian thought it was a shiny new thing.

Many talked about staffing models and how to fit liaison librarians versus specialist roles into these new areas, a perennial issue whenever a new area emerges (for many academic libraries, it was promoting open access the last time around).

One librarian stated that helping faculty handle research data is important because "nobody comes to us anymore for literature searches".

Of course, this immediately drew a response from, I believe, a social science (or was it medical?) librarian, who said faculty do come to them for both literature reviews and data sets! :)

Why searching for data is the next challenge

Ex Libris has been sharing the following diagram at various conferences recently, listing five things users expect to be able to do.

Of the five tasks above, I would say the greatest challenge right now is to "obtain data for a research project", which is a different class of problem from the other four tasks, which broadly speaking involve finding text-based material.

I would think this is because, over the years, improvements in search technology (from the "physical only" days, through the early days of online searching, to Google Scholar and web scale discovery), coupled with easily over a century of effort and thinking on how to organize and handle text, have made searching for text, in particular scholarly text (peer-reviewed articles especially), if not a completely solved problem, then at least one that isn't so daunting that most academics would recoil in terror and ask for help.

Yet the difficulty of searching for data sets and statistics today is, I would say, about the same as the difficulty of searching for articles in the 1980s and 1990s. While the latter has improved by leaps and bounds, the former hasn't moved much.

Lack of competition from Google? 

Having worked at a business/management-oriented university for five months, I am starting to appreciate how much more difficult it is to get data sets in, say, finance, and many librarians, myself included, get a sinking feeling when asked to find them.

Firstly, the interfaces for getting data out of these databases are horrendous. Even the better ones are roughly at the level of the worst article-searching interfaces.

This, I suspect, is partly because without Google to put pressure on these databases, there is no incentive to improve. Competition from Google, I believe, has driven the likes of EBSCO, ProQuest etc. to converge on pretty much the same usable design, or at least a Google-like design that takes little adjusting to.

Today, the UI you see in Summon, Web of Science, Scopus, the EBSCO platforms etc. is pretty much the same, and you can practically use any of them without prior familiarity. (See my post on how library databases have evolved in functionality and interface to fit into the Google world.)

Google's relentless drive to improve user experience has pushed libraries to try to keep up. You could say the EBSCOs of the world were practically forced to improve or die of irrelevance as students flocked to Google.

Of the databases that libraries subscribe to, the worst ones typically belong either to the smallest outfits or to ones that primarily serve other, non-library sectors.

So the likes of Bloomberg, Capital IQ, T1 and even many law databases such as LexisNexis have comparatively harder-to-use designs.

They can get away with this because of the lack of competition from Google, and also because these are primarily work tools; professionals are proud of the hard-earned Bloomberg skills, say, that give them a competitive advantage.

When it comes to non-financial data, things become even more challenging, since there aren't many well-known repositories of data (at least, known to a typical librarian not immersed in data librarianship) to turn to. Google is of limited help here, surfacing the usual well-known open data sources such as the World Bank and the UN.

How researchers search for public data to use

A recent Nature survey asked researchers how they find data to use.

The article noted that no method predominated, with checking references in articles being as common a method as searching databases. Arguably this points to the fact that

a) databases of data are not well known, and
b) databases of data are hard to use (due to lack of comprehensiveness of data or poor interfaces).

Of course, this survey question asks about "public data" to reuse.

Researchers often approach me about using data (for content analysis) from databases we license, such as newspaper and article databases. This seems to be yet another area academic libraries can work on; leading libraries like NCSU Libraries have taken on the task of negotiating access to data from the likes of Adam Matthew and Gale.

Confusion over what libraries can or should do with data

Like any new area academic libraries are trying to get involved in (thanks to reports like the NMC Horizon Report: Library Edition listing this area as an increased focus), there is a lot of confusion over the skill sets, roles and responsibilities needed.

What a "data librarian" should do is not a simple question, as this can span many areas.

In Hiring and Being Hired. Or, what to know about the everything data librarian, a librarian talked about how his responsibilities ballooned and how "everything data librarians don't actually exist".

He points out that many job ads for data librarians actually comprise five separate roles:
  •  Instruction and Liaison Librarian
  •  Data Reference and Outreach Librarian
  •  Campus Data Services Librarian (the job most associated with scholarly communication)
  •  Data Viz Librarian (Learning Technologist)
  •  Quantitative Data Librarian (Methods Prof)

I can smell the beginnings of what the Library Loon dubs "new-hire messianism", where a new hire is expected to possess an impossible number of skill sets, work in indifferent or even hostile environments, and almost singlehandedly push for change with limited or no resources or authority.

Obviously no one staff member should be "responsible for data". I've been reading about the concept of "tiers of data reference" and thinking about how to improve in this area.


Like most academic librarians, I am watching developments closely, and trying to learn more about the areas. Some sites

Thursday, July 16, 2015

5 things Google Scholar does better than your library discovery service

I have had experience implementing Summon in my previous institution and currently have some experience with EDS and Primo (Primo Central).

The main thing that struck me is that while they have their differences (e.g. the default Primo interface is extremely customizable, though it requires a lot of work to get into shape; Summon is pretty much excellent UI-wise out of the box but less customizable; EDS is basically Summon but with tons of features already included in the UI), they pretty much share the same strengths and weaknesses vis-a-vis Google Scholar.

So far, my experience with faculty at my new institution mirrors that at my former one: more and more of them are shifting towards Google Scholar, and even plain Google.

Though web scale discovery is our libraries' closest attempt at mimicking Google technology, it is still different, and it is in those differences that Google Scholar shines.

Why is Google Scholar a darling of faculty?

To anticipate the whole argument, Google Scholar serves one particular use case very well - the need to locate recent articles and to provide a comprehensive search.

Library discovery services, meanwhile, are hampered not just by technological issues but also by the need to balance support for various use cases, including known-item searching for book titles, journal titles and database titles.

It is no surprise a jack of all trades tool comes out behind.

Here are some things Google Scholar does better.

1. Google Scholar updates much quicker

One piece of feedback I tend to get is from faculty asking why their paper (often hot off the press) isn't appearing in the discovery service.

In the early days of library discovery service, often the journal title simply wasn't covered in the index, so that was that.

These days more often than not the journal title would be listed as covered in the index particularly if it was a well known mainstream journal. So why wasn't the particular article in the discovery service?

Unfortunately, I would typically discover that the issue lies with the recency of the article: it was so new that it hadn't yet appeared in the discovery service index.

Yet I would notice, time and time again, that whenever an article appeared on, say, Springer, it would appear in Google Scholar within a day or two, while it could take over a month to appear in our discovery service index.

Google Scholar simply updates very quickly using its crawlers, compared to library discovery services, which may use other, slower methods to update.

I have also found that library discovery services often fail to index "early access/edition" versions, while Google Scholar, whose harvesters seem happy to grab anything on an allowed publisher domain, has fewer issues.

The discovery service providers might argue that Google Scholar employs almost zero human oversight and quality control, and that as such it provides less accurate results.

This may be so, but it's unclear whether the trade-off is worth it in today's fast-paced world, where anxious faculty just want to see the article with their name on it appear.

2. Covers scholarly material not on the usual "scholarly" sources

Besides speed of updates, Google Scholar shines in identifying and linking to scholarly material even when it is not found on the usual publisher domains.

Take the experience, back in 2014, of a library director who was trying to access a hot new paper on "Evaluating big deal journal bundles".

The library director was smart enough to know it wouldn't appear in the discovery service and so placed an ILL request for the article; it turns out she could have just used Google Scholar to find a free PDF the author had linked off his homepage.

Here we see the great ability of Google Scholar's harvester to spot "scholarly" papers (famously with some false positives), even when they reside on non-traditional sites. For instance, it can link to PDFs that authors have posted on their personal homepages (which may or may not be on university domains).

This is something none of our library discovery services even attempts to do. In general, our discovery services build their indexes at a higher level of aggregation, typically at the journal or database level, so there is no way they would spot an individual paper sitting on some unusual domain.

3. Greater and more reliable coverage of Open Access and free sources

It's ironic that I find discovery services generally have much poorer coverage of open access than Google Scholar.

Most discovery services have indexed the DOAJ (Directory of Open Access Journals), but many libraries have such a bad linking experience with it (linking may not be at article level and/or leads to broken links) that they just turn it off. (Discovery indexes that cover OAIster might have better luck?)

How about institutional repositories, something created and managed by libraries? On most discovery services, you can typically add only the contents of your own institutional repository, plus a very limited selection of other institutions' repositories (always on the same discovery service).

Usually you can add only the repositories of libraries that have volunteered to open them to other customers of the same discovery service, and this is a very short list (probably a dozen or so).

The list gets even shorter when you realise that some of these repositories are not wholly full text, and the discovery service makes it difficult to surface only the full text items when you activate them, so you are eventually forced to turn them off.

I am not well versed enough in institutional repositories and OAI-PMH to understand why it is so difficult to figure out which items in them are full text, but all I can say is that Google Scholar's harvesters have no such issues identifying free full text and making it available. I would add that some of it is not quite legal (e.g. look at the PDFs from ResearchGate etc. surfacing in Google Scholar).

Reasons #2 and #3 above are the main reasons why Google Scholar is by far the most efficient way to find free full text, and why apps like the Google Scholar Chrome button and Lazy Scholar are so useful.

4. Better Relevancy due to technology and the need to just support article searching

Going through the few head-to-head comparisons between Google Scholar and discovery services in the literature (refer to the excellent Discovery Tools: A Bibliography), it's hard to say which is superior in terms of relevancy, though Google Scholar does come out on top a few times.

My own personal experience is that Google Scholar does indeed have some "secret sauce" that makes it rank better. There are many reasons to suspect it is superior: it can personalize, it uses many more signals (particularly the network of links and link text), and there is the sheer technical know-how that made Google the world's premier search company.

A somewhat less often expressed reason why Google Scholar does so well is that, unlike library discovery services, it is designed for one primary use case: allowing users to find journal literature.

A library discovery service, on the other hand, has five possible use cases according to Ex Libris.

I would argue library discovery services are handicapped because they need to handle at the very least "Access to known book or journal" + "Find materials for a course assignment" + "Locate latest articles in the field".

Trying to balance all these cases simultaneously (which includes ranking totally different material types such as books, articles, DVDs, microforms etc.) results in relevancy ranking that can be mediocre compared to one optimised just for finding relevant journal articles, a.k.a. Google Scholar.

During the early days of library web scale discovery, libraries and discovery service vendors learnt a costly lesson: despite the name "discovery", a large proportion of searches (around 50% in most studies I've seen) were for known items, including known book titles, journal titles and database titles.

Not catering to such users would typically lead to great unhappiness, so you started seeing discovery service vendors tuning their relevancy to support known-item searching and adding features like featured boxes and recommenders to help with this.

All this means that library web scale discovery services will always be at a disadvantage compared to Google Scholar, which focuses on one main goal, the discovery of articles, since nobody goes to Google Scholar to look for known book titles, journal titles or database titles.

People do go to Google Scholar for known article title searches, but "ranking" such queries is easy, given how unique and long article titles tend to be. In any case, doing well on known article searches is less a matter of ranking and more a matter of ensuring the needed article is in the index, and as we have seen above, Google Scholar is superior in coverage thanks to broader sources and faster updates.

5. Nice consistent features

Google Scholar has a small but nice set of features. It has a "related articles" function you won't find in most web scale discovery services unless you subscribe to bX Recommender.

Many users like the "Cited by" function. Your library discovery service doesn't come with that natively, though mutual customers of Scopus or Web of Science can get citation networks from those two databases.

Because Google Scholar builds its own citation network, it can not only rank better but also provide the very popular Google Scholar Citations service. Preliminary results from this survey seem to indicate that Google Scholar Citations profiles are more popular than those on ResearchGate etc.

But more important than all this is the fact that it is worthwhile to invest in mastering Google Scholar. All major academic libraries support Google Scholar via library links/OpenURL, so you can carry this skill with you no matter which institution you are at.

On the other hand, if you invest in learning the library discovery service interface at your current institution, there's no guarantee you will have access to the same system at your next institution given that there are four major discovery services on the market (not counting libraries that use discovery service apis to create their own interfaces).


Does this mean library web scale discovery services are useless? Not really.

I would argue that web scale discovery tools are designed to be versatile.

While they may come up second best in the following cases:

  • In-depth literature review (both Google Scholar and Subject indexes are superior to web scale discovery in different ways)
  • Known item search for books/journal titles/database titles (Catalogues and A-Z journals and database lists are superior)

There are no other tools that are "pretty good" at all these tasks, hence discovery services' popularity with undergraduates who want an all-in-one tool.

Can we solve this issue of being jack of all trades but master of none?

One interesting idea I have heard and read about at various conferences, including EBSCO's webinars, is a popup that appears after the user enters keywords and clicks search, asking whether they are trying to find a known item, do a subject search, or something else; based on the answer, the search would execute differently.

Somehow though I suspect it might get annoying fast.

Sunday, May 31, 2015

Rethinking Citation linkers & A-Z lists (I)

I am right now involved in helping my current institution shift towards a new Library Service Platform and discovery service (Alma and Primo) and this has given me an opportunity once again to rethink traditional library tools like citation linkers, A-Z journal and databases lists.

It's pretty obvious such tools need a refresh as they were created

  • before Google/Google Scholar and web scale discovery.
  • in an era where electronic was not yet hugely dominant.

In this post, I will discuss citation linkers and how some vendors and libraries have attempted to update them for the current environment of discovery, followed by a further post on ideas for updating the A-Z database and journal lists.

Citation linkers - an outdated tool?

The idea of the citation linker (sometimes known as a citation finder or article finder) was meant to be straightforward: you entered a reference, and the library would hopefully link you to the full text of the article via the library's OpenURL resolver.

Most link resolvers, such as Ex Libris's SFX, Innovative's WebBridge, EBSCO's LinkSource etc., offer a variant of such a tool.
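To see why these forms demand so many fields, it helps to look at what the resolver actually consumes. Below is a rough sketch of building an OpenURL 1.0 (Z39.88-2004) query string in Python; the resolver base URL and the volume/page values are hypothetical, and a real resolver would accept many more fields.

```python
from urllib.parse import urlencode

# Hypothetical resolver base URL; every library has its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# Key-encoded-value fields from the OpenURL 1.0 journal article format.
citation = {
    "url_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    "rft.genre": "article",
    "rft.atitle": "Implementing a Bento-Style Search in LibGuides v2",
    "rft.jtitle": "Code4Lib Journal",
    "rft.volume": "29",  # illustrative values
    "rft.spage": "1",
    "rft.date": "2015",
}

link = RESOLVER_BASE + "?" + urlencode(citation)
print(link)
```

Every one of those `rft.*` fields maps to a box on the citation linker form, which is exactly why the form is so tedious to fill in by hand.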

Below we see some typical citation linkers across different vendors.

Typical citation linker from Proquest's 360 link

Typical SFX citation linker

Typical EBSCO LinkSource Citation finder

Typical Alma Uresolver Citation linker

I first encountered this tool myself fairly late, in 2012, when implementing the suite of then SerialsSolutions (now ProQuest) services, including Summon and 360 Link, at my former institution.

Initially, I was totally confused by the fact that simply entering the article title alone would not work! You had to painstakingly enter various pieces of information which even then would often fail to work, depending on the accuracy of the citation fields you entered.

My confusion is understandable because I came upon this tool after the rise of web scale discovery where entering an article title was usually sufficient to get to the full text.

Even after I grasped how it worked, I realized how unlikely it was that a user would be willing to use it, much less use it successfully, since it was much easier to just enter the article title into Google Scholar or a library discovery service.

Sure, as I discussed in Different ways of finding a known article - Which is best? way back in 2012, searching by article title via a discovery index has drawbacks (e.g. it can't find non-indexed items), but it is far easier and more convenient for the user, and if there is anything I've learnt in my years working in libraries, it's that convenience tends to trump everything else.

Can we improve on it? Autocomplete to the rescue

How would I create a citation linker 2.0?

An obvious improvement would be to work on the UX.

One study on the usability of the SFX citation linker noted that while users who tried finding articles via the journal A-Z list had issues, it was even worse with the citation linker.

They suggested improving the usability of the tool by removing unnecessary fields such as author and article title fields which were usually not used for openurl resolution.

Georgia Tech Library seems to have followed this recommendation: unlike the default SFX citation linker, they hid the various author fields (first name, last name, initial) etc.

A more interesting proposal to improve the tool was made by Peter Murray way back in 2006, entitled A Known Citation Discovery Tool in a Library2.0 World.

"The page also has an HTML form with fields for citation elements. As the user keys information into the form fields, AJAX calls update the results area of the web page with relevant hits. For instance, if a user types the first few letters of the author’s last name, the results area of the web page shows articles by that author in the journal. (We could also help the user with form-field completion based on name authority records and other author tables so that even as the user types the first few letters of the last name he or she could then pick the full name out of a list.) With luck, the user might find the desired article without any additional data entry!"

Essentially, he is suggesting that each field in the citation linker have AJAX-powered autocomplete to help the user, plus a "results area" displaying likely articles the user is searching for. He goes on to suggest similar ideas for other fields, such as volume and issue.

"Another path into the citation results via the link resolver: if a user types the volume into the form field, the AJAX calls cause links to appear to issues of that volume in addition to updating the results to a reverse chronological listing of articles. If a user then types the issue into the HTML form field or clicks the issue link, the results area displays articles from that issue in page number order. Selecting the link of an article would show the list of sources where the article can be found (as our OpenURL resolvers do now), and off the user goes."

At the time of the proposal, such a feature was not possible, because it would have required a large article index to draw results from. Today, of course, we have web scale discovery systems.
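Murray's autocomplete idea can be sketched in a few lines. This toy version filters a tiny in-memory list of invented records; a real implementation would make AJAX calls to a web scale discovery API instead.

```python
# Stand-in for a discovery index; the records are invented for illustration.
ARTICLES = [
    {"author": "Smith, J.", "title": "Open access trends", "volume": "12"},
    {"author": "Smithers, K.", "title": "Linked data in libraries", "volume": "12"},
    {"author": "Tan, A.", "title": "Discovery layers compared", "volume": "13"},
]

def autocomplete(field, prefix):
    """Return candidate articles whose `field` starts with `prefix`,
    as the results area would update while the user types."""
    prefix = prefix.lower()
    return [a for a in ARTICLES if a[field].lower().startswith(prefix)]

# Typing the first few letters of an author's last name narrows the results.
print(autocomplete("author", "smi"))
# Typing a volume number lists that volume's articles.
print(autocomplete("volume", "13"))
```

With luck, as Murray says, the user finds the desired article before filling in any further fields.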

Auto parsing of citations 

One of the weaknesses of citation linkers is that they require the user to parse the citation and enter each piece of information, one by one, into the various fields. Not all users are capable of that, or even patient enough to do it.

Why not simply allow users to cut and paste the citation and let the software figure it all out?

Brown University's FreeCite tool allows you to toss in a citation, and it will try to parse out each citation field. I believe there are a few other similar tools out there. The logical idea, of course, is to use the parsed output to fill in the citation linker fields.

This is exactly what UIUC Journal and Article Linker tries to do.
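For a sense of why parsing is hard, here is a deliberately crude sketch that handles exactly one APA-like citation shape with a single regular expression. Real tools like FreeCite use trained models rather than anything this brittle, and the sample citation is invented.

```python
import re

# Matches only one shape:
# "Author, A. (Year). Title. Journal, Volume(Issue), Pages."
# Any deviation (a period in the title, a missing issue) breaks it,
# which is precisely why real parsers don't work this way.
PATTERN = re.compile(
    r"(?P<author>[^(]+)\((?P<year>\d{4})\)\.\s*"
    r"(?P<title>[^.]+)\.\s*"
    r"(?P<journal>[^,]+),\s*(?P<volume>\d+)\((?P<issue>\d+)\),\s*(?P<pages>[\d-]+)"
)

def parse_citation(text):
    """Return a dict of citation fields, or None if the shape doesn't match."""
    m = PATTERN.search(text)
    return m.groupdict() if m else None

fields = parse_citation(
    "Doe, J. (2014). A study of things. Journal of Stuff, 12(3), 45-67."
)
print(fields)
```

The parsed dict could then pre-fill the citation linker form, which is the UIUC approach in miniature.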

An interesting variant of this is done by EBSCO.

EBSCO has an app called EBSCO Citation Resolver, available via its new Orbit platform, an Online Catalog of EBSCO Discovery Service™ Apps.

This uses the above-mentioned FreeCite from Brown University to parse references, but instead of passing the data to a traditional citation linker to get to the full text via OpenURL, as UIUC does, it passes the data to EDS itself.

As you can see above, the parsed information is sent to EDS for advanced searching using field searching.

We will get back to this example later.

Finding full text by text and voice recognition

And why restrict oneself to cutting and pasting citations? What about other input methods? There used to be an iOS app, I believe from Thomson Reuters' Web of Science, that allowed you to take a photo of a reference and, by the magic of OCR and text recognition combined with a citation parser, be linked to the full text.

Unfortunately, I've lost track of that app, but I recall it didn't work very well: it was limited to linking to article entries in Web of Science, and the text recognition combined with the citation parser wasn't that good.

Still as technology advances I think the idea has legs. I have no doubt if Google desires, they can easily set this up to work with Google Scholar.

Now imagine combining this with voice commands such as Google Now, "Ok Google, find me such and such article by so and so in journal of abc".

Output accuracy should improve too.

Making it easy to input the citation is just one part of the equation, making sure full text can be reached is the other.

Coming back to the EBSCO Citation Resolver, an interesting point to note is that after parsing the reference, instead of passing it to a citation linker such as EBSCO's own LinkSource citation finder (see below), it dumps the information into the discovery service, EDS.

The parsed citation is not passed to LinkSource's article finder

Why would one send the information to the discovery service and not the citation linker tool?

Part of the reason is that linking via OpenURL is often hit and miss in terms of linking to full text.

Some studies put full text linking success at around 80%, due to well-known OpenURL issues that initiatives like IOTA and KBART are trying to solve.

Summon and EDS provide more stable forms of linking (often called direct linking, which can work up to 95% of the time), which can be used on top of OpenURL whenever possible. (Note: 360 Link v2.0 provides the same type of direct linking as Summon.)

Add the fact that automatic citation parsers are going to be somewhat inaccurate at text recognition, and it might be easier to employ a strategy of extracting just the author and article title to use with the discovery service, rather than trying to identify every citation field (e.g. volume, issue, pages) to feed the full OpenURL resolver; the latter method is very error-prone, requiring a large number of fields to be recognised correctly to work well.

That said, as more citation styles require DOIs to be included, the work of parsing citations becomes easier, as the DOI alone is often sufficient to get to the full text. I also suspect the increased use of citations created by reference managers (e.g. Mendeley, Zotero) and the growing support for Citation Style Language (CSL) across various styles may eventually make things more consistent and easier for citation parsers.
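To illustrate, pulling a DOI out of a pasted citation is a simple heuristic match on the "10.registrant/suffix" pattern. A rough sketch (real DOI suffixes can contain almost anything, so treat the regex as illustrative):

```javascript
// A rough sketch of extracting a DOI from a pasted citation string.
// DOIs follow the "10.<registrant>/<suffix>" shape; this is a heuristic.
function extractDoi(text) {
  const m = text.match(/10\.\d{4,9}\/[^\s"<>]+/);
  if (!m) return null;
  // Trim punctuation that usually belongs to the sentence, not the DOI.
  return m[0].replace(/[.,;)\]]+$/, "");
}

function doiToUrl(doi) {
  return "https://doi.org/" + doi;
}
```

Once a DOI is found, resolving it is a single redirect, with none of the multi-field fragility of a full OpenURL request.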

I can go further and imagine a hybrid output system combining Google Scholar (for free PDFs), web-scale discovery direct linking, and OpenURL linking to give the best chance of reaching the full text.

You can see this hybrid, multi-pronged approach somewhat in play in the Lazy Scholar extension (available for Chrome and Firefox), which checks Google Scholar for free full text and also offers OpenURL resolution.

This could work the same way link resolver menus do now, displaying the various options, or some intelligent system in the background could decide whether to use the discovery service or Google Scholar to find the full text (how likely was the first result in Summon, say based on a title-only phrase search, to be the right hit?) or to fall back on traditional OpenURL resolution.


All in all though, I don't see much of a future for a stand-alone citation linker sitting on your website.

Few people have the patience to use it.

Ideally a web-scale discovery service (basically the big four: Summon, EDS, Primo and WorldCat) should be built to handle cases where users copy and paste a whole citation. (I understand Primo has enhancements that handle this.)

As it is, I notice the rise of such user behaviour in search logs of discovery services under my care. It's a small but significant amount, something noted in other studies that analyse discovery search logs.

Can Summon handle cutting and pasting full references?

Discovery services should definitely be trained to identify such cases and automatically call the citation linker function.

Perhaps the system would then try to

a) Recognise the likely type of material sought (book, book chapter, article etc)
b) Depending on material type, focus on identifying with high likelihood the title, doi, author etc.
c) Use either the discovery index, doi resolution or traditional openurl methods depending on a) and b)

I expect that usually the system would try a phrase search for the article title, perhaps further narrowed by author, against the article index (the top match is usually highly likely to be the right one); sometimes it would resolve the DOI, and at other times it would try the traditional citation finder method.

With tons of statistics on success rates, it might be possible to get a reasonably accurate system.

Depending on how confident you are in the model you are using, it could show all the options (similar to how link resolver menus work now; Umlaut in particular is worth looking at), or it could just show the highest-probability match.
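The triage in steps a) to c) above could be sketched roughly as follows; the field names and strategy labels are purely illustrative:

```javascript
// Hypothetical triage of a parsed citation into a lookup strategy,
// following steps a)-c) above. Field and strategy names are illustrative.
function chooseStrategy(parsed) {
  // A DOI alone is usually enough to reach full text.
  if (parsed.doi) return "doi-resolution";
  // Title plus author supports a phrase search of the discovery index.
  if (parsed.title && parsed.author) return "discovery-phrase-search";
  // A full set of citation fields supports a traditional OpenURL request,
  // though this is the most error-prone path.
  if (parsed.issn && parsed.volume && parsed.spage) return "openurl";
  // Not enough recognised fields: fall back to asking the user.
  return "manual-lookup";
}
```

With usage statistics on each branch's success rate, the ordering of these checks could itself be tuned over time.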

Next up, do we really need A-Z database and A-Z Journal lists?

Friday, April 17, 2015

Making electronic resources accessible from my home or office - some improvements

I've recently been involved in analysing the LibQUAL+ survey at my new institution, and one of the things recommended nowadays when doing LibQUAL analysis is to plot the performance of various items against how important those items are to users.

Above we see sample data from Library Assessment and LibQUAL+®: Data Analysis

We proxy the importance of a factor by the desired mean score on the vertical axis, and how well a factor is performing by the adequacy mean score on the horizontal axis; so the higher the dot, the more important the item.

In the sample data above, IC 1 or "Making electronic resources accessible from my home or office" is the 2nd to 3rd most important factor, and I suspect this is typical for most libraries.

Also do note that the analysis above is for *all* users. Undergraduates traditionally have a high desire for space; if we included only faculty, electronic access would probably rank even higher.
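As a side note, the two plotted scores can be computed from raw LibQUAL responses. The sketch below assumes each response records the minimum, desired and perceived service levels for an item on the usual 1-9 scale; the field names are illustrative:

```javascript
// A sketch of computing the two plotted means for one LibQUAL item from
// raw responses, each holding minimum, desired and perceived service-level
// ratings on the 1-9 scale. Field names are illustrative.
function libqualItemMeans(responses) {
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const desiredMean = mean(responses.map((r) => r.desired));
  // Adequacy gap = perceived minus minimum; positive means the service
  // at least meets users' minimum expectations.
  const adequacyMean =
    mean(responses.map((r) => r.perceived)) -
    mean(responses.map((r) => r.minimum));
  return { desiredMean, adequacyMean };
}
```

Plotting `desiredMean` (vertical) against `adequacyMean` (horizontal) for every item gives the kind of chart shown above.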

LibQUAL questions can be hard to pin down in meaning, though in this case I suspect it is accessibility from home that is the issue. Currently, most forward-looking academic libraries try to make access as seamless as possible via IP authentication on campus, so users don't need proxy methods while on campus. (Expecting users to start from the library homepage to access resources is a futile goal.)

Off-campus access is trickier, since not all users will be informed enough, or bothered enough, to use the VPN even if that is an option.

Meeting Researchers Where They Start: Streamlining Access to Scholarly Resources 

Is seamless access to library resources, particularly off campus, really that difficult? Roger C. Schonfeld, in the recent Meeting Researchers Where They Start: Streamlining Access to Scholarly Resources, believes so. He wrote, "Instead of the rich and seamless digital library for scholarship that they need, researchers today encounter archipelagos of content bridged by infrastructure that is insufficient and often outdated."

He makes the following points
  • The library is not the starting point   
  • The campus is not the work location
  • The proxy is not the answer
  • The index is not current  (discovery services often have lag time compared to Google/Scholar)
  • The PC is not the device (despite the mobile push in the last 5 years, publisher interfaces are still not 100% polished) 
  • User accounts are not well implemented
Most of these points are not really new to many of us in academic libraries, though the piece is still worth a read as a roundup of the issues researchers face.

Still, the list above misses one very important issue: the classic "appropriate copy problem" that the OpenURL resolver was invented to fix. The key problem is that OpenURL still isn't universally implemented; while Google Scholar supports it, Google itself doesn't, and it is extremely easy to end up on an article abstract page without any opportunity to use OpenURL to get to the appropriate copy. More on that later.

BTW Bibliographic Wilderness responds to Roger Schonfeld from the library side of things, pointing out among other things the appropriate copy issue and difficulties of getting vendors to improve their UX (aka we can't cancel stuff based on UX!).

Shibboleth and vendor login pages

So what should the ideal view be when a user lands on an article abstract page and needs to authenticate because he is off campus and/or without the proxy?

One way is Shibboleth, but that is not something I have experience with, and it seems to be poorly supported and to have poor usability. Without Shibboleth, is there a way for vendors to make sign-ins easier when users are off campus and land directly on the article page without the proxy?

The way JSTOR has done it (for last 1-2 years?) has always impressed me. 

JSTOR intelligently looks at your IP and suggests institutions to log in from. As far as I know, you don't have to have Athens/Shibboleth or do anything special for this to work.
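I don't know how JSTOR implements this, but presumably it amounts to matching the visitor's IP against institutions' registered IP ranges. A sketch under that assumption (the ranges shown are made up):

```javascript
// A sketch of JSTOR-style institution suggestion: match the visitor's
// IPv4 address against institutions' registered ranges. Ranges are made up.
function ipToInt(ip) {
  return ip.split(".").reduce((n, octet) => n * 256 + Number(octet), 0);
}

function suggestInstitutions(ip, ranges) {
  const n = ipToInt(ip);
  return ranges
    .filter((r) => n >= ipToInt(r.start) && n <= ipToInt(r.end))
    .map((r) => r.name);
}

// Example:
// suggestInstitutions("10.0.4.2",
//   [{ name: "Example University", start: "10.0.0.0", end: "10.0.255.255" }])
```

Since vendors already hold institutional IP ranges for on-campus authentication, offering a suggestion like this costs them very little.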

Recently Stephen Francoeur brought to my attention the following announcement from Proquest

Essentially, the ProQuest login screen has been redesigned to make it simple for users to enter their institution, after which the system will attempt to authenticate them using the usual method.

"Today we are debuting a simplified login experience for institutions that use a remote login method such as Proxy, Shibboleth, OpenAthens, or barcode to authenticate users into ProQuest."

"To reduce this confusion, we've redesigned the login page (as shown below) to make it easier for remote users to authenticate into ProQuest by adding the "Connect through your library or institution" form above the ProQuest account form. Further, remote users can select their institution on the login page, instead of having to click through to another page as they had to do previously. After users select their institution, they will be re-directed to the remote authentication method their institution set up with us."

Though it doesn't seem to suggest institutions, it's still fairly easy to use: just type in your institution and you will be asked to log in (via EZproxy in my case).

EBSCO is another that seems to make it possible to select your institution and log in for full text, but like the ProQuest one above, I could never get it to work at either my old institution or my new one. This could be some configuration setting that is needed.

It's really amazing how few publishers follow the lead of JSTOR and ProQuest. If the Elseviers and Sages of the world followed a similar format, I am sure there would be much less friction in accessing paywalled articles. Let's hope ProQuest's move leads others to converge on a similar login page, the way many article databases now look pretty much alike.

Appropriate copy problem revisited

Say most publishers start to wise up to UX matters and implement a login page like JSTOR's, so our users can select an institution and quickly get access. Will that solve every problem? Arguably no.

At my old institution, we had great success promoting the proxy bookmarklet, LibX, etc. to overcome proxy issues (partly because ALL access was through the proxy, whether on campus or off, so the proxy bookmarklet was essential all the time as long as you did not start from the library homepage).
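For readers unfamiliar with it, the core of a proxy bookmarklet is just rewriting the current page's URL through the library's EZproxy login endpoint. A sketch, with a hypothetical proxy base URL:

```javascript
// The core of a proxy bookmarklet: rewrite the current page URL through
// the library's EZproxy login endpoint. The proxy base is hypothetical.
const PROXY_PREFIX = "https://proxy.example.edu/login?url=";

function proxify(url) {
  // Don't double-proxy a URL that already goes through the proxy.
  if (url.startsWith(PROXY_PREFIX)) return url;
  return PROXY_PREFIX + encodeURIComponent(url);
}

// As a bookmarklet, this logic becomes a one-liner saved as a bookmark URL:
// javascript:location.href='https://proxy.example.edu/login?url='+encodeURIComponent(location.href);
```

The user clicks the bookmarklet on any publisher page and is bounced through the proxy login, landing back on the same page authenticated.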

But even if a user was smart enough to add the proxy string, that still led to a common problem.

Often, even after proxying, full access would still not be granted. The reason of course is that we may not have access to the full text on that particular page, but may have access somewhere else on another platform.

A classic example would be APA journals, where access would be available only via PsycARTICLES (which can be on the Ovid or EBSCO platform). Google results tend to favour publisher rather than aggregator sites, so one would often end up on a page where one has access only via another site.

The more an academic library relies on aggregators like EBSCO or ProQuest, as opposed to publishers, to supply full text, the more often the appropriate copy issue arises.

As mentioned before, this issue can be solved if the user starts off searching at a source that supports OpenURL, such as Google Scholar (via the Library Links programme) or even a reference manager like Mendeley. But with multiple avenues of "discovery", you can't always guarantee this.

In fact, I am noticing a rise in the number of people who tell me they don't even use Google Scholar but plain Google to find known-item articles. Interestingly enough, the recent ACRL 2015 proceedings paper Measuring Our Relevancy: Comparing Results in a Web-Scale Discovery Tool, Google & Google Scholar finds that Google is even better than Google Scholar for known-item searching. Google scored 91% relevancy on known-item queries, while Google Scholar and Summon both scored 74%!

If so, we will have an ever-increasing number of users landing on article abstract pages without the opportunity to use link resolvers to find the appropriate copy.

Another example: I find many interesting articles, including paywalled articles, via Twitter. From the point of view of someone sharing, what is the right way to link an article so that others, who will have different options for access, can get to it?

There doesn't seem to be an obvious way (link via DOI? link to a Google Scholar search page?), and even if there were, it would be troublesome for the sharer, so most of the time we end up with a link to the publisher version of the article, which others may not be able to access.

 Lazy Scholar and the new Google Scholar Chrome extension

So what should we do if we end up on an article page and want to check access via our institution?

I've written about LibX before, but my current favourite Chrome extension is Lazy Scholar, which I reviewed here.

It exploits Google Scholar's excellent ability to find free full text and also scrapes the link displayed by Google Scholar for the Library Links programme.

With more and more providers cooperating with Google Scholar (see the latest announcement that ProQuest will let Google Scholar index its full text), Google Scholar is by far the largest store of scholarly articles and every scholar's first stop to check for the existence of an article.

Lazy Scholar automatically attempts to figure out if you are on an article page, and will search Google Scholar for the article and scrape what is available. In this case there is no free full text, so it says so. But you can click on "Ezproxy" to proxy the page, or click on "Instit", which triggers the link resolver link found in Google Scholar (if any).

The author has added many other functions to try to make the extension useful; I encourage you to try it yourself.

Interestingly, in the last few days Google itself has had a similar idea to help known-item searches by exploiting the power of Google Scholar, creating the Google Scholar Button extension.

It is very similar to Lazy Scholar but in the famous Google style a lot simpler.

On any page with an article, you can click on the button and it will attempt to figure out which article you are looking at, search for the title in Google Scholar, and display the first result. This brings in all the usual goodies you can find in Google Scholar.

If the title detection isn't working, or if you want to check for other articles, say in the references, you can highlight the title and click on the button.
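I don't know exactly how either extension detects the article, but a plausible approach is reading the Highwire-style meta tags (citation_title, citation_doi, etc.) that publishers embed in article pages for Google Scholar indexing. A sketch that scans raw HTML with a regex (a real extension would query the DOM instead, and this regex assumes the name attribute comes before content):

```javascript
// A sketch of detecting the article on the current page via the
// Highwire-style meta tags (citation_title, citation_doi, ...) that many
// publishers embed for Google Scholar indexing. Scans raw HTML with a
// regex; a real extension would query the DOM instead.
function detectCitationMeta(html, name) {
  const re = new RegExp(
    "<meta[^>]+name=[\"']" + name + "[\"'][^>]+content=[\"']([^\"']*)[\"']",
    "i"
  );
  const m = html.match(re);
  return m ? m[1] : null;
}
```

Given a detected title, the extension only needs to run a title search in Google Scholar and show the top hit, which matches the behaviour described above.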

It will be interesting to watch the future of both extensions; see here for a comparison of the features of Lazy Scholar vs the Google Scholar Button.


"Making electronic resources accessible from my home or office" isn't as easy as it seems. An approach that combines

  • improved usability of publisher login pages
  • plugins that support link resolvers and can find free full text via Google Scholar
is probably the way to go for now, though even that doesn't address issues like seamless support for mobile.

Wednesday, February 18, 2015

[Personal] Moving on from my first library job

After over 7 years of working at NUS, I am finally moving on.

I am very grateful for the opportunities given to me here, and I have changed and grown far beyond what I expected. This has been by far the most eventful period in my life; I've changed so much I wager the newbie librarian who first stepped into NUS in 2007 would hardly recognise the person I am in 2015.

A much younger me, a few weeks after joining NUS in 2007

This is also the longest I have held any job so far, and I admit I have grown extremely comfortable with the culture, people and systems here. But I've always been driven by curiosity and the need to try new things, and this has led to many new initiatives and even transfers to different departments to study the systems and procedures. But leaving NUS Libraries? That would be a far bigger change.

Another early photo of me in NUS

So when the offer came from the Singapore Management University as Manager, Library Analytics, I was at a crossroads. I have always been impressed by the energy, passion and knowledge of the librarians at the Singapore Management University, and I could see myself making an impact working with such great colleagues. And yet there is of course the doubt that comes with any major change. Would I fit in? Could I adjust to a very different working style?

In many ways the safer choice would have been to remain at NUS, where I have been doing well and would probably continue to do well for the foreseeable future. But there are times in life when you have to take risks to grow, and I decided to push out of my comfort zone and accept the offer. Let's see what else I can learn elsewhere.

My last day

So on 23rd Jan 2015, I came into NUS Libraries for my last official working day (I was on leave for the next 4 weeks).

It was already an emotional week after the announcement was made public. It was heartwarming to hear kind words from colleagues, staff and students (some of whom came as a surprise to me) who had heard and reached out to say they appreciated what I had done, and it made me feel that I had at least made a difference in my time here.

On my last working day, in an odd kind of symmetry, in the morning I gave a final training/briefing session to the new librarians who were just starting their careers at NUS. (In another nice bit of symmetry, I found out later that one of the new librarians was a former student who thanked me for some of the work we did 4 years back.) The rest of the day was spent saying goodbye to the many colleagues who dropped by to chat, and a final exit interview.

How did I feel on that last day? I felt a little like JD from Scrubs (a character I identify with a lot).

"Endings are never easy; I always build them up so much in my head they can't possibly live up to my expectations, and I just end up disappointed. I'm not even sure why it matters to me so much how things end here.

I guess it's because we all want to believe that what we do is very important, that people hang onto our every word, that they care what we think. The truth is: you should consider yourself lucky if you even occasionally get to make someone, anyone, feel a little better. After that it's all about the people that you let into your life.

And as my mind drifted to faces I've seen here before, I was taken to memories of family, of coworkers, of lost loves, even of those who left us. And as I rounded that corner, they all came at me in a wave of shared experience.

And even though it felt warm and safe, I knew it had to end. It's never good to live in the past too long. As for the future, thanks to Dan, it didn't seem so scary anymore. It could be whatever I wanted it to be."
So to the future I go! Wish me luck.

Thursday, January 15, 2015

A Bento style search in LibGuides v2

LibGuides V2 Search Display

Like many libraries right now, my institution is working towards upgrading to SpringShare LibGuides V2.

Update 15/1/2015: we just went live!

Like many libraries, we took the opportunity to revamp many aspects of our guides in the move to LibGuides v2. One of the areas we spent the most effort on was the front page.

What follows is a joint post by my colleague Feng Yikang and myself.

Designing the LibGuides home page

One of the main questions that vexed us was: what should the LibGuides homepage do? What purpose should it serve that distinguishes it from the library home page?

Most academic libraries use the LibGuides home as a landing page to list all their libraries' subject guides and topic guides. That is certainly one option.

We went the other way and designed it as an alternative home for research oriented users. 

Like many academic libraries, our library homepage was designed before the current slew of expertise-based services like scholarly communication/open access support, GIS services, etc. was commonplace, and there was little "real estate" to link to such services on the main library homepage.

It was also arguably difficult to justify adding a link to a niche service like scholarly communication on the main library homepage, because the vast majority of users (undergraduates and graduate students) would never need to use it.

Our LibGuides homepage could be the logical place to add such links.

We also noticed that the chat queries received on the LibGuides homepage were a mixture of the following:

a) Users wanting to figure out how to place a hold and how to locate the book they wanted, as well as problems with passwords (the "Find Book" and "Password" boxes handle this)

b) Graduate students and above trying to figure out what library support services were available. Often it was a case of a newly arrived but experienced graduate student or faculty member trying to see if the specialised support service they enjoyed at their prior institution was available here (the "Research support" box handles this)

c) Users looking for specialised librarians in their area to support them (the pull-down menus on the right support this)

As such we designed the LibGuides homepage to encourage browsing of such services.

The drop-down menus on the right help users quickly locate the appropriate subject librarians, while the other categories are carefully curated based on a combination of usage, chat queries asked and just plain old intuition.

The design is of course currently tentative, and will evolve with time. 

What about search? Bento style ?

Of course browsing is all very well, but we know a lot of people will just search. Handling LibGuides search is an issue that has always caused me some amount of headache.

In LibGuides v1, you could customize your search. So you could add a keyword search to any of the following

1) Libguides search
2) Libanswers (FAQ) search
3) Classic catalogue search
4) Web Scale discovery search - eg Summon

or pretty much any search you wanted as long as you could craft the right url.

Below is how our LibGuides v1 search looked. I had hooked it up to multiple options, including our web-scale discovery search (dubbed "FindMore@NUSL") as well as a Google site search of our web pages ("All Library Pages"). There were the normal options of searching within all guides as well.

The problem was which default search was the right one?

Like most libraries, I initially set the default to search "All Guides". It was a search in LibGuides, so you expected LibGuides results, right? What could be simpler?

The problem is that I did an analysis of the keywords people searched in LibGuides v1 a few years ago, and though I don't have the analysis now, I remember roughly 90% of searches resulted in no hits.

Why? Because users were searching very specific searches like singapore chinese temple architecture.

Most of such searches had zero results. 

As a side note, the same keyword search in LibGuides v2 does yield results; it seems likely that in LibGuides v2 there isn't a strict boolean AND going on. I am not sure this is a good idea, since the user may not be aware it is happening and may be disappointed by totally irrelevant guides appearing.

Such searches would yield reasonable hits in a web-scale discovery system like Summon, but not in a LibGuides search.

The other issue I found with the default LibGuides search was that people were searching for specific books by title, ISBN, ISSN, etc. (e.g. "1984"), journal titles ("Nature"), or databases ("Scopus") in the LibGuides search. In other words, they were treating the LibGuides search like a catalogue search.

For the more obscure databases or journal titles, and most certainly for books not listed in LibGuides, this often led to no results.

Even if a guide happened to list the item the person was looking for, say a database listed in the guide, you still ran into a problem.

Below shows a search for "Realis" (a local real estate database). 

It isn't very obvious that the subpage - Databases in the Real Estate Guide has a link to Realis.

Many users might even click on the topmost "Real Estate" link and hunt for the Realis link on the home page of the real estate guide, not realising it is on the Databases page.

All these searches were very common in my logs, which explains why a default LibGuides-only search was not always the best.

So what was the solution? 

The idea we had, which seems obvious in hindsight, was suggested by my colleague Yikang. I was telling him how many research-intensive universities like Duke, Columbia and Princeton are currently sporting bento-style search results.

Such bento-style search results display multiple boxes of different types of content. By entering keywords, users are presented with a holistic spread of results including resources, services, library guides, FAQs and more.

So one could display results not only from the LibGuides but also from Libanswers, Catalogue, article index of discovery services and more, fitting every need.

Yikang realised that LibGuides v2 allows one to customise the search results template, and this made it possible to pull in bento-style results.

Below show some results, when someone searches for "Systematic Review".

In our current simple implementation of bento, we have the simple "Our Guides" box and the "FAQs" box.

In addition, we have three more boxes, "Our Suggestions", "Library Catalogue" and "Articles", which come directly from our Summon API.

They draw respectively from Summon best bets, a search filtered to "Library Catalog", and a search filtered to article-like content types.

How do they work together?

The leftmost and most prominent box feeds you LibGuides as expected. This can display LibGuides that are not just discipline- or subject-specific but can also cover services like EndNote, bibliometrics, GIS, etc.

Below shows someone searching for bibliometrics and a relevant guide surfaces plus other relevant material.

But what if the keyword does not allow any good LibGuides matches? Hence the existence of other boxes.

The "Library Catalogue" box helps resolve searches where people are looking for specific book titles, database titles and most other known items. Below we see an example of searching by database (Business Source Premier) and searching by ISBN.

Below shows a search by ISBN

The "Articles" box will at least show some results if the user searches for something highly specific not covered by any guide, pulling in a few relevant articles or books.

"Our Recommendations" could be customized based on what users are searching. It could be used to cover cases not covered by the other boxes. 

The "FAQs" box comes from LibAnswers, and it helps resolve common policy and procedural questions.

Broad Implementation

For those interested in the details of the setup, here is the report from my colleague Yikang.

For this setup to work, I needed a proxy page, which will act as "middle-man" between our LibGuides page and the Summon server. 

Once a search query is entered into the LibGuides search page (A), a JavaScript routine sitting in the LibGuides search page will pass the query to this proxy server page (B), which will in turn pass the query to the Summon server (C).

The Summon server will then return the results in JSON format to the proxy server page (D), which will pass the results to the LibGuides search page (E) to be interpreted and displayed (F). 
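Steps (E) and (F) boil down to reducing Summon's JSON payload to the few fields each bento box displays. Here is a sketch; the field names (documents, Title, link) follow my understanding of the Summon Search API's JSON shape, so check them against your own API responses:

```javascript
// A sketch of steps (E)-(F): reduce the Summon JSON payload to the few
// fields a bento box displays. The field names (documents, Title, link)
// are assumptions based on the Summon Search API's JSON shape.
function formatBentoBox(summonJson, limit) {
  return (summonJson.documents || []).slice(0, limit || 5).map(function (doc) {
    return {
      // Summon returns most display fields as arrays of strings.
      title: Array.isArray(doc.Title) ? doc.Title[0] : doc.Title,
      link: doc.link,
    };
  });
}
```

Each box ("Library Catalogue", "Articles") runs this over its own filtered API response and renders the resulting title/link pairs.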

The really difficult part was setting up the proxy server page, as coding it from scratch would have been time-consuming because it was foreign territory to me. Fortunately, Virginia Tech University Library shared some sample PHP files which I could refer to. I used the files, together with some HTTP_Request PHP files downloaded from PEAR.

They worked! The potentially time-consuming part of the job was done.

Next, I had to do some JavaScript programming to design the bento search results layout. Aside from the layout, the JavaScript interprets the various search API results for each compartment (LibGuides, Summon, LibFAQ) and incorporates them into the search results page in a presentable form. JavaScript was also used to add the "see more" buttons; each "see more" button opens a popup window showing the full results of its respective compartment.


Our bento-style search results page is, in the scheme of things, not exactly a new idea, though it's the first example I know of that hooks into a LibGuides search, as opposed to the main library homepage.

By pilot testing the search display on LibGuides search only, we can carry out a small scale experiment, rather than rolling it out to all users on the main Library portal search bar.

Arguably, the bento style search for LibGuides is a much safer bet, because the use cases are more constrained.

Consider 3 Alternatives for the display of search results from LibGuides

1) Pointing to Libguides only results
2) Pointing to Web Scale discovery blended search ie Summon default search
3) Pointing to Bento style with boxes for Libguides and Discovery results.

If the user searching in LibGuides is really looking for a suitable LibGuide, say the Economics or EndNote guide, chances are there are only a few relevant hits in LibGuides anyway, so for such users a box with the top 10 guide hits should be more than enough. Both alternatives 1 and 3 will be successful, as they show a ranked list of LibGuides, while alternative 2 will likely fail.

Alternative 2, pointing solely to Summon, may succeed, since LibGuides are indexed in Summon, but the blended list and uncertain relevancy ranking mean the user has to plough through many results to find the guide he needs.

For other use cases, where searches are best satisfied by non-guide results, the bento-style boxes (alternative 3) provide a far better option than a straight LibGuides-only search (alternative 1), which would yield no results.

Also, as blended lists in Summon can have problems with known-item searches, we handle this case by creating two separate boxes of content, "Library Catalogue" and "Articles" (though both are drawn via the Summon API using different filters).

So what do you think?  Is this a general improvement from the default search?

Wednesday, December 31, 2014

A personal review of 2014

"Happy families are all alike; every unhappy family is unhappy in its own way."

Somehow though, I doubt successful libraries are all alike except in the most general of ways.

Still, these are some of the changes and trends in librarianship in 2014 that resonated with me or occupied me. A lot of this is probably highly specific to my current institution and environment, so your mileage may vary.

1. Open Access finally takes off in academic libraries

I know many open access advocates and librarians are thinking, this isn't really new. But from my highly localized context, this was the year, my current institution formally created a "Scholarly Communication team" and created outreach teams. I had my first experience presenting on open access to faculty.

In the local (Singapore) context, this was the year of the passing of the A*STAR open access mandate for A*STAR-funded research, as well as the instruction from the National Research Foundation that research institutions should have open access policies to tap funding; it meant Singapore is finally starting to get serious about this. Of course, this is just the beginning.

I also began to see some interest in altmetrics eg Plumx, though it may be still early days.

2. Shifts in mobile 

Yet another year, and yet another new iPhone. What was different this year was the upgrade in screen sizes with the iPhone 6 and iPhone 6 Plus. Measuring 5.5 inches, the iPhone 6 Plus is Apple's first venture into "phablet" territory. Even the iPhone 6 got a size upgrade, from 4 inches to 4.7 inches.

I myself upgraded this year from an iPhone 4S with a tiny 3.5-inch screen to perhaps the current best Android smartphone/phablet, a Samsung Galaxy Note 4, which has a large 5.7-inch screen. Note that, like most new owners of devices, I am obviously biased.

My Samsung Note 4

I have pretty long fingers, so I adapted to the large screen fast. The current trend is for Android flagships to become bigger, including the Nexus 6 (huge 6-inch screen), so it's clear to me that this will eventually lead to large screens becoming the norm on mobile.

My own experience mirrors that of most people, who find that after a few days they just can't go back to smaller screens. Also, as many have reported, my usage of tablets has fallen; in my case the Nexus 7 (2013) doesn't seem so useful anymore, though admittedly the new Android Lollipop 5.0 on the Nexus 7 did give it a new lease of life.

Looking at my own institutional statistics (sessions on the portal, LibGuides, Summon, etc.) for the last 6 months, it's somewhat surprising to me that besides iPhones, the next most popular phone is the large Samsung Galaxy Note 3.

This could be something unique to my situation, or it could simply be that people only bother to use our websites on large screens (Summon 2 and LibGuides 2 have built-in responsive web design, though the library portal is still a mobile page redirect).
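To illustrate the difference mentioned above: a mobile page redirect typically inspects the browser's User-Agent and bounces phone users to a separate mobile site, while a responsive design serves everyone the same page and adapts the layout with CSS. A minimal sketch of the redirect check in Python; the token list and `mobile_url` are illustrative assumptions, not our actual portal code:

```python
# Sketch of a server-side "mobile page redirect" check. The tokens and
# the mobile URL are illustrative placeholders only.
MOBILE_TOKENS = ("Mobile", "Android", "iPhone", "iPad")

def mobile_redirect_target(user_agent, mobile_url="/mobile/"):
    """Return the mobile site URL for phone/tablet user agents, else None.

    A responsive design (like Summon 2's or LibGuides 2's) skips this
    step entirely and adapts the single page with CSS media queries.
    """
    if any(token in user_agent for token in MOBILE_TOKENS):
        return mobile_url
    return None
```

One practical drawback of the redirect approach is visible right in the sketch: the token list has to be kept up to date as new devices appear, which is part of why vendors have moved to responsive design.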

Still, I think the move to bigger screen sizes and phablets probably means more online reading, and services like BrowZine and EBSCOhost's new Flipster, a digital magazine service, are going to benefit.

BrowZine in particular just launched support for Android smartphones at the end of the year, just in time for our new subscription to BrowZine and my new phablet!

BrowZine on Samsung Note 4

I used to be quite "hot" on mobile developments in libraries (see my blog posts in 2010/2011), but somehow, after an initial flurry of interest, it pretty much died out, at least in academic libraries.

Most of us have mobile library pages or a native app of sorts, typically from offerings like Boopsie or LibraryThing Mobile. More recently, responsive web design has become popular, with library vendors, from databases to discovery services (e.g. Summon) and LibGuides, moving towards it. Library websites are also slowly moving in that direction.

All this is nice, but still pretty boring really. Still, this year saw some interesting developments.

Other mobile developments to watch

  • Adding NFC to the iPhone 6, Apple Pay, etc. may make such technologies mainstream

3. Upgrades, lots of upgrades

This was the year my institution chose to upgrade to the following (some upgrades were already available in August 2013, but we wisely chose to wait until 2014)

Some upgrades were bigger than others (e.g. the 360Link v2 upgrade was relatively minor), but all brought changes to functionality and UI that users would notice straight away.

This year, we finally managed to achieve the popular bento-style search that I have blogged about so much, though we are currently only linking it to the LibGuides v2 search.

From the redesigned LibGuides v2 screen (see below), users can search and will see a modified LibGuides v2 search page showing results from LibGuides, LibAnswers and Summon!

This helps solve one major issue I noticed with LibGuides searches: users tend to search for article titles, books, or very specific items that are best answered by a results page from your discovery service. They also sometimes search for policies, opening hours, etc.

All this was achieved by a talented new colleague of mine, and he will be writing a guest post in the new year to explain how it was done.
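For readers curious how a bento search like this hangs together conceptually, here is a minimal Python sketch: each source (LibGuides, LibAnswers, Summon) is queried in parallel and each "box" is trimmed to a handful of hits. The fetcher functions, box order and box sizes below are hypothetical placeholders, not our production code, which is JavaScript running against the vendors' search APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# Display order of the bento boxes on the results page (illustrative).
BOX_ORDER = ["LibGuides", "LibAnswers", "Summon"]

def bento_search(query, fetchers, per_box=3):
    """Query every source in parallel and trim each box to a few hits.

    `fetchers` maps a source name to a function that takes the query
    string and returns a list of result dicts (title, url, ...).
    """
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = {name: pool.submit(fetch, query)
                   for name, fetch in fetchers.items()}
        results = {name: future.result() for name, future in futures.items()}
    # Keep a stable box order and cap each box at `per_box` results.
    return [(name, results[name][:per_box])
            for name in BOX_ORDER if name in results]
```

Firing the searches concurrently matters here: the page is only as slow as the slowest source rather than the sum of all three, and an empty box (say, no LibAnswers hits) simply renders with no results.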


I had a great 2014, though my blogging rate suffered; hopefully this will change in 2015. Thank you, all loyal blog readers, for subscribing to my blog all these years, and I wish you all an amazing New Year to come.


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.