Statistical data are produced by statistical systems. This is true of all regular statistical variables required by modern organizations – including libraries. Data that are needed for normal operations must be produced by routinized systems.
Library statistics is a small and specialized subject. It is mainly important for people who work in and with libraries. Today, the most urgent task for library statisticians is to improve and renew our statistical systems. Developing – or “inventing” – new variables and indicators must be part of that renewal. But field testing is essential.
In the STM disciplines – science, technology and medicine – it is axiomatic that inventions will not make an impact unless they are tested and found useful under ordinary working conditions. The same applies to statistical innovation. Proposing new indicators is easy. Testing them with real data is laborious. Getting them accepted by organized statistical systems is tough, back-breaking work. Organizations resist change.
Our current statistical systems are still heavily dependent on industrial ways of thinking and working. They were developed in the middle and late 20th century, long before the web made a deep impact on society. The systems reflect what I would call an industrial model of statistics. They aim at standardized mass collection of a limited number of statistical variables.
Typically, these statistics are based on annual reporting of variables that are easy to gather from existing databases. Publication is also annual. The producers of statistics do not make the full data sets available for researchers to play with. They select the tables and the cross-tabulations we are allowed to study. The regular print publications are characterized by lots of numbers and very superficial analyses. They do not, I would say, train their readers to appreciate statistics.
This is not the fault of the producers, however. It results from limited demand. Most librarians avoid statistics if they can. They do not want to be educated. Thus, there is no pressure on statistical agencies to improve their products. Repetition and complacency prevail.
The web is changing everything, of course: libraries, librarians, users and statistics. In a knowledge economy, statistical data must be treated as an economic rather than as an administrative resource. Statistical production should be judged by the same standards as other forms of knowledge production. Libraries are knowledge factories rather than media collections. Statistical systems should produce the data librarians need to sustain and develop their services – as well as the data politicians need to make decisions and evaluate policies.
This implies, I think, that most data should be collected at short intervals – hours, days, weeks, months – rather than years. They should, in most cases, be quality checked and published as soon as they have been collected. These are hardly sensitive data, so the full data sets should be made available in digital form for study by researchers and other interested parties.
Library agencies should also do their level best to train librarians in practical, operational numeracy. In library schools, most statistical courses are research- rather than practice-oriented. Students learn about normal distributions, t-tests and sampling errors, which most of them will never use. But they are seldom prepared for the statistical problems they will meet in real life: how to interpret official statistics, how to compare data from different libraries, how to construct and evaluate operational indicators, how to argue with numbers.
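To make "operational numeracy" a little more concrete, here is a minimal sketch in Python of the kind of comparison I have in mind: turning raw counts into per capita indicators so that a small and a large library can be set side by side. The two libraries and all their figures are invented for illustration, and the names `LibraryYear`, `per_capita` and `rank_by_loans_per_capita` are my own hypothetical helpers, not part of any official statistical system.

```python
from dataclasses import dataclass


@dataclass
class LibraryYear:
    """One library's raw counts for one reporting period."""
    name: str
    population: int  # residents in the service area
    loans: int       # loans during the period
    visits: int      # physical visits during the period


def per_capita(count: int, population: int) -> float:
    """Divide a raw count by the population served."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population


def rank_by_loans_per_capita(libraries):
    """Return the libraries sorted from highest to lowest loans per capita."""
    return sorted(
        libraries,
        key=lambda lib: per_capita(lib.loans, lib.population),
        reverse=True,
    )


# Invented figures, for illustration only.
libraries = [
    LibraryYear("Library A", population=12_000, loans=54_000, visits=60_000),
    LibraryYear("Library B", population=85_000, loans=297_500, visits=340_000),
]

for lib in rank_by_loans_per_capita(libraries):
    loans_pc = per_capita(lib.loans, lib.population)
    visits_pc = per_capita(lib.visits, lib.population)
    print(f"{lib.name}: {loans_pc:.2f} loans and {visits_pc:.2f} visits per capita")
```

In raw numbers Library B looks far busier. Per capita, the smaller Library A comes out ahead on both indicators. That is the sort of argument with numbers that rarely gets taught.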
Without a real demand for interesting data, library statistics will remain at the margin of our professional discussions. I am not saying that statistics is a VIS – a Very Important Subject. Statistics is more like cartography. It provides a framework for rational decisions. Without statistical data, we act blindly. Maps are useful for everybody. With statistics, we may still disagree about our route – but not about the nature of the landscape.
Library statistics cannot be changed unless libraries change. Libraries cannot be changed unless communities, users and educational institutions change. But this is hardly a problem. Our environment is changing before our eyes.
As I see it, statistical producers have three choices.
- They can run with the early adopters – the people and organizations at the forefront of change.
- They can walk with the careful and moderate reformers in the middle.
- They can make a stand with the stubborn defenders of tradition at the back.
I write for the first group, but will be happy to discuss these questions with everybody.
As a small experiment, I will publish the paper “Indicators without customers”, written for our satellite conference in Turku, as a series of blog posts.