The initial motivation
- Sample citations made by your researchers to other items
- Record what was cited: typically the age of the item, the item type, the impact factor of the journal title, etc.
- Check if the cited item is in your collection
The impact of free
But what does "in your collection" mean? Of course this includes the things you physically hold, subscribed journal articles, etc.
But it occurred to me that these days our users can often also obtain what they want by searching for free copies, and as the open access movement starts to take hold, this is becoming more and more effective.
In fact, I did it myself all the time when looking for papers, so I needed to take this into account.
In short, whatever could not be obtained through our collection and was not free would arguably be the potential demand for DDS/ILL.
(In theory there are other ways, legal and illegal, to gain access, such as writing to the author, access via coauthors or secondary affiliations, or, for the trendy ones, #canihazpdf requests on Twitter.)
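The definition above can be written down as a simple filter. This is only a minimal sketch: the `CitedItem` record, its two flags, and the four sample items are hypothetical and purely illustrative, not data from any actual citation analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedItem:
    title: str
    in_collection: bool  # assumed flag: held physically or via subscription
    free_online: bool    # assumed flag: a free full-text copy was found


def potential_ill_demand(items):
    """Return the cited items that are neither in the collection nor free."""
    return [i for i in items if not i.in_collection and not i.free_online]


# Hypothetical sample of four cited items (illustrative values only).
sample = [
    CitedItem("A", in_collection=True, free_online=False),
    CitedItem("B", in_collection=False, free_online=True),
    CitedItem("C", in_collection=True, free_online=True),
    CitedItem("D", in_collection=False, free_online=False),
]

unmet = potential_ill_demand(sample)
share = len(unmet) / len(sample)
print([i.title for i in unmet], f"{share:.0%}")  # ['D'] 25%
```

Only item D counts toward potential DDS/ILL demand here, because every other item can be reached through at least one of the two routes.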
How do you define free?
What would an average researcher do to check for any free full text?
There could in theory be very big differences between preprints and the final published versions, and if you only had the postprint version you should cite it differently from the final published version.
According to Morris & Thorn (2009), researchers surveyed claimed that when they don't have access to the published version, 14.50% would rarely and 52.70% would never access the self-archived version.
This implies that researchers usually don't try to access self-archived versions that aren't the final published version.
Yet in the Ithaka S+R US Faculty Survey 2012, over 80% say they will search for a freely available version online, more than the share using ILL/DDS. Are these 80% of faculty only looking for freely available final published versions? That seems unlikely to me.
Ithaka S+R US Faculty Survey 2012
Let's flip it around for the sake of argument: how do things look if we assume users access free items (whether preprint, postprint, or final version) as a priority and consult the library collection only when forced to?
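To make the flip concrete, here is a small sketch that credits each cited item to the first source able to supply it, under the two orderings. The five items and their availability flags are made up for illustration; only the logic matters.

```python
from collections import Counter


def first_source(item, order):
    """Return the first source in `order` that can supply the item, else 'ILL/DDS'."""
    for source in order:
        if item[source]:
            return source
    return "ILL/DDS"


# Hypothetical cited items: is each one in the library collection, free online, or both?
items = [
    {"library": True, "free": False},
    {"library": True, "free": True},
    {"library": True, "free": True},
    {"library": False, "free": True},
    {"library": False, "free": False},
]

library_first = Counter(first_source(i, ["library", "free"]) for i in items)
free_first = Counter(first_source(i, ["free", "library"]) for i in items)

print(dict(library_first))  # {'library': 3, 'free': 1, 'ILL/DDS': 1}
print(dict(free_first))     # {'library': 1, 'free': 3, 'ILL/DDS': 1}
```

In this toy sample the potential ILL/DDS demand is the same either way, but the share of citations the library collection gets credit for drops from three items to one once free copies are tried first.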
Why the amount of cited material that is free is a harbinger of change for academic libraries
Combining citation analysis with open access studies
| Study | Free full text found | Sample | Searched in | Coverage of articles checked | Comment |
| --- | --- | --- | --- | --- | --- |
| Bjork et al. (2010) | 20.4% | Drawn from Scopus | | 2008 articles, searched in Oct 2009 | |
| Gargouri et al. (2012) | 23.8% | Drawn from Web of Science | "software robot then trawled the web" | 1998-2006 articles, searched in 2009; 2005-2010 articles, searched in 2011 | |
| Archambault et al. (2013) | 44% (for 2011 articles) | Drawn from Scopus | Google and Google Scholar | 2004-2011 articles, searched in April 2013 | In a "ground truth" hand-checked sample of 500 articles published in 2008, 48% were freely available as of Dec 2012 |
| Martín-Martín et al. (2014) | 40% | 64 queries in Google Scholar, 1,000 results collected | Google Scholar | 1950-2013 articles, searched in May 2014 and June 2014 | |
| Khabsa & Giles (2014) | 24% | Randomly sampled 100 documents per field from MAS to check for free full text, then multiplied that by the estimated size of each field, determined by the capture-recapture method | Google Scholar | All? Searched in ?? | |
| Pitol & De Groote (2014) | 58% | Drawn randomly from Web of Science; for Institution C, 50 articles not already in the IR were drawn and checked in Google Scholar | Google Scholar | 2006-2011 | The abstract reports 70% free full text, but that figure is for Institutions A, B and C; for A and B, the random sample drawn from WoS had to include copies already in the IR as well |
| Jamali & Nabavi (2015) | 61% | 3 queries in Google Scholar for each Scopus third-level subcategory; top 10 results checked for free full text | Google Scholar | 2004–2014 articles, searched in April 2014 | |
Martín-Martín et al. (2014) and Archambault et al. (2013) in particular strike me as very rigorous studies, and both show that around 40% or more of full text is freely available.