
Friday, June 14, 2024

The Data Detective: Ten Easy Rules to Make Sense of Statistics by Tim Harford: A review


The US edition was published in 2021 -- so I am a little late to the game. But this is a great read. It does not delve deeply into math, but could be a companion to discussion on web evaluation ("fake news") as well as basic math literacy.

Here are the ten easy rules ... but I strongly recommend reading the book:

  1. Search your feelings
  2. Ponder your personal experience
  3. Avoid premature enumeration
  4. Step back and enjoy the view
  5. Get the backstory
  6. Ask who is missing
  7. Demand transparency when the computer says no
  8. Don't take statistical bedrock for granted
  9. Remember that misinformation can be beautiful, too
  10. Keep an open mind

And ... the golden rule: Be curious

And while he has a web site [https://timharford.com/], in my initial wandering around it does not say much about this work on statistics [although you can readily purchase the book there] ... he has moved on to other topics.


Wednesday, January 19, 2011

Links - January

Walt Crawford asks about data on libraries (as institutions) using social networking.

The inimitable Jessamyn West calls one of her posts Blogging Alone – Social Isolation and New Technology from Pew. It is thoughtful and related to the question above (at least a little). She has also posted about a term new to me (but which makes sense): search neutrality.

Aaron Tay wonders about the effect on libraries of Delicious closing down (or not). [Note to self: Get work-related issue back on the discussion table.]

There is a thoughtful piece in American Libraries Online about outsourcing, from a consultant who helps libraries get through the process of becoming efficient without outsourcing.

I don't usually get to teach in a formal setting, but there are occasions in my new job where I will. I pay attention to what Iris Jastram says about what she figures out about teaching and learning. As an academic librarian, she teaches in a very different setting: part of a structured, formal, semester-long course. When I teach, it is a 90-minute web course, or maybe a half- or full-day, skills-focused course. I found a great deal to glean from her post on specialization.

Iris also wrote a paean to the "reference interview" which took the conceptual issue further and applied its principles to broader issues in her work community.

I have not read much about the "generational divide" recently, but Librarian Kate gave her reaction to an article on KPBS which came out of the recent ALA Midwinter meeting in San Diego. (Original post here.) As a boomer living with a NextGen librarian, I am not sure I agree with any of the generalizations, but the view is important.

And on a totally unrelated topic: fonts. Salon recently had an article on fonts. Wired also had an article on fonts. Both are drawn from the original Princeton study (which, attention Dorothea Salo, seems to be an OA article!).

Monday, September 14, 2009

Next Transition -- Deja Vu

Today begins my next transition at work.

I started working at the State Library of Louisiana (SLOL) on December 1. I was hired to be a Library Consultant, and to be the State Data Coordinator (SDC). As the SDC, I got thrown right in to gathering the data for FY 2008 (in Louisiana, that is the same as the calendar year). It was fun! I had the opportunity to deal with every public library in Louisiana. The state data report has now been published.

A little more than a week ago, I was asked to take on some more responsibilities. The SLOL, like many state agencies in Louisiana [and across the country], had some budget challenges in the "new" fiscal year (for state government that is July 1 - June 30), and some positions were not funded and others were lost.

One of those was the head of the Reference Department. I was asked to take on those responsibilities, in addition to my current ones of SDC and doing special projects like the Library Support Staff Certification Program (LSSCP) for which SLOL is one of the pilot sites.

So, I am back to middle management. Since July 2008, I have not been a supervisor. A part of me really likes that. In a way I am back to the same kind of position that I was in beginning in October 1982, when I became Librarian III at the then Tucson Public Library. I continued as a "middle manager" through my first stint at Bridgeport Public Library (April 1983 - April 1985). After that I was "the boss" until July 2008. (That is a quarter century for those counting.)

I have not worked a public service desk since I left Wilton in December 1994. I have a lot to learn -- again. I have some new staff to work with. And, we have a chance to fill a vacancy soon, so I get to start with someone new, too.

Life is full of changes. You never know what is next. [I also believe in "Never say 'Never!'"]

Wednesday, July 08, 2009

Links and comments

ALA is coming up at the end of the week. Time to clean off the desk -- actual and virtual!

What counts as "broadband?" Jeff Scott has a good analysis with comments about this. His observation that defining it as one-half of a T-1 line is short-sighted is insightful. Louisiana was one of the first states to have Internet available to the public in every "main" library in the state. Many are finding that the bandwidth currently provided is not adequate. Many libraries here, and elsewhere I assume, were hoping that the stimulus funding would help them improve network speed. This seems not to be the case.

Annette Day and Hilary Davis have an interesting article about the process of journal selection (and de-selection) for academic libraries in a group blog which I have only recently found: In the Library with a Lead Pipe.

Meredith Farkas is back and continues to be insightful and probably ahead of her time. She talks in a recent post about relying on free/web-based services to deliver critical functions. Her post makes me also think about how our discussion about things like Library 2.0/Web 2.0 is being stored for the future. We can read about the late 19th-century discussions on the pros and cons of public libraries collecting and circulating fiction because those discussions took place in print. Will library scholars of the end of this century be able to do the same for our discussions? (I'm also going to point to a related/similar discussion embedded in Walt Crawford's August (?) Cites and Insights. I am sure, since I was reading it in the doctor's office yesterday, that his "Writing about Reading" has influenced my thoughts on Meredith's post.)

I am watching the Google OS situation somewhat closely, in part because we are considering purchasing a "netbook" for our regular travel.

Finally, is my new work love: statistics. In a short (3 minute) TED talk, mathematician Arthur Benjamin talks about the need to redefine the teaching of mathematics. Currently, calculus is the "holy grail" or highest level. He notes that most of us do not use calculus in our daily lives. (Engineers are exempt from this characterization.) However, if everyone had a better understanding of statistics, a lot of us would do better in life.

The link is above....I will try to embed it here:


Wednesday, June 03, 2009

Silk Purses and Sow's Ears? Assessing the Quality of Public Library Statistics and Making the Most of Them: PLA Spring Symposium 2009 - Morning I

The program began with an introduction by Joe Matthews. He went over the handouts (in paper) and reviewed the agenda. He reminded us that the only dumb question is the unasked one.

Is it possible to develop a "library goodness scale?"

What is good, what is a great library? This is an interesting challenge to define.

In a library organization, management's responsibilities are:

  • defining goals;
  • obtaining the needed resources;
  • identifying programs and services to reach the goals;
  • and using the resources wisely.

There are benefits and challenges: lots of performance measures -- most libraries have too many which are never used. (You have the authority to stop collecting data if it is not being used.)

A very important concept is "You get what you measure." He cited an example of police performance measurement. As a result of the measure used (minor quality-of-life issues), the community had many cops reporting pot holes – including the same pot holes day after day. The measure, reports filed, was incredibly high. The solving of crimes was not. As managers we need to refine the performance measurement system to reflect what we want.

Benefits and challenges: the role of evaluation is not to prove but to improve; it provides feedback on actual performance; it develops a culture of assessment. When data is disconfirming, the report is often ignored rather than the issue raised being addressed.

Efficiency & Effectiveness

Efficiency is the internal perspective: are we doing things right? Effectiveness is the external perspective: are we doing the right things? It is an important distinction.

The Library-centered view: how much, how many, how economical, how prompt?

Types of measures: leading vs. lagging. Circulation is lagging: it tells you what you did last month; it is historic data.

Leading is something that lets you forecast demand: pre-registration figures. In Joe’s opinion there is no relationship between inputs and outputs in libraries!

Leading indicators at reference: very few libraries use the reference data they have to change the staffing pattern at the reference desk. There is no leading data for reference queries; it may be the number of Google searches that month. He quoted OCLC Perceptions data: the library is used as the first source for reference 3% of the time. You can forecast from past data trending. Libraries should change their staffing patterns, should get rid of reference questions....

A leading indicator could be a "high holds list" for items on order; another could be the school district calendar for staffing the reference desk.

Question on interpreting data when users asked what they want. Triangulation, partly asking what they want, customer satisfaction data, focus groups.

Measures need to be: SMART: Specific (accurate), Measurable, Action oriented, Relevant (clear), Timely

It is also important to review the data, and how it is collected and reported. In one library, the gate count suddenly doubled. When a manager went to check, the manager discovered that there was a new staff member reporting it – the gate counted both those entering and exiting, and the former staff member correctly reported ½ of the number as the attendance. The new staff member did not.

Why do we use the data? There are several reasons: to help understand demand; to demonstrate accountability; to help focus; to improve services; to move from opinions to use of data, more responsive to customer needs; communicate value.

When we collect data we make some assumptions. For instance, comparability: why does a 3-week book count the same as a 2-day DVD? [Joe also argued for not including renewals as part of circulation.] And accuracy: how do we count reference? He argued against tick marks and for using gate count as an indicator, and also argued for sampling. Current counts demonstrate busy-ness; we need to demonstrate value. Blow up the reference desks ... get rid of them.

Performance reporting is often a bunch of numbers with no historical context beyond the last 2-3 years of data.

The problem is failure to keep pace with ever-rising expectations.

Larry Nash White presented next on the Library Quality Assessment Environment

He noted that he was raised by grandfather who was an efficiency expert.

The performance person in the library actually knows more about what is going on in the library. Statistics and metrics are like tight-fitting clothes: they are suggestive, but not completely revealing.

History helps tell us where we have been. Most of what we measure we stole from somewhere else.

We have measured parts; how do we measure the whole? In 1934 Rider developed a way to maximize efficiency using costs. "If we don't assess things and do it correctly, then others from outside of the library will come and do it for us." (Rider 1934) About 100 library systems around the country are run by an outsourced firm (LSSI and others).

Google in 9 hours answers as many reference questions as all libraries in the US in 2006.

1939 saw the first customer service survey. The 50s and 60s saw the quantitative crunch. The smile ratio as a measure? Especially when there are more smiles on the other side of the counter.

What is happening today? What are the influencing factors?

How many have enough resources (money, time, staff)? No one. [Great story about Santiago, Chile library. Single building of 275,000 square feet, 75 staff, 75,000 items to serve a city of over 5 million from one building.]

Increasing stakeholder involvement is important. When you want to keep your stakeholders out, that is a bad sign. They bring in own perceptions, biases, etc. which you must work with.

Technology is neutral; it is intent which gives it value. How we use it to deliver service makes it good or bad. How effective is our technology service? Do total cost of ownership studies. Be anti-tick marks: use technology to count wherever possible. Use the automation system to count computer use, reference questions, and directional questions. An ILS is really good at counting. It can do it location by location and hour by hour.

We are always borrowing from someone else. Libraries are using what business world gave up years ago. And they are tools that were often designed for something else.

Time is affecting what we do.

More quantitative data is wanted by stakeholders; more qualitative data is wanted by the profession. This is a tension/division.

A wider scope is needed to assess and improve the process. Dynamic alignment: he held up a knotted string (not a macramé) as an analogy for our performance assessment environment (not much give). Do you have the right things in place? Are you counting the right things and giving the right answers? Pulled in the right way, it became a single string. When we align our assessment, we need to realign continually because of changes in the environment.

Future predictions

  • More assessment.
  • More quantitative data to support quality outcomes
  • More intangible assessment. (Many things we do are intangible, and are important.) What would it look like if we started reporting on the air?
  • More assessment of organizational knowledge
  • More assessment of staff knowledge (human capital): are we effectively assessing the use of that resource?
  • Increased alignment of assessment process.
  • [Intellectual capital. Human capital -- what people know. Structural value -- what is left when people go home. Value of the relationships: stakeholders, vendors, partners.] Report the value created. Wherever we spend money we need to report the value of what we do.

Ray Lyons then talked about Input-Process-Output Outcomes Models

IMLS has now embraced the United Way's language. But there are also program evaluation sets, and the Government Performance and Results Act of 1993.

He showed several graphics including "Program Evaluation Feedback Loop." It is considered to be a rational process. It is also very static, and it ignores political issues.

If you remember why you are doing this, you can often come up with your own answers to your questions.

Evaluation questions include "merit." Orr's model does not include stakeholders very well; they are listed as "Demand." How can you produce demand?

Performance Assessment is often blind to unintended consequences. Does not ask: what are the real needs of the community?

Input statistics should be used only in connection with outputs; they represent only the potential for services. Output statistics measure the current level of performance.

Goals are often related to the statistics. Aren't you going to reach a point where you can no longer improve?

Interpreting output statistics: interpretation is relative to goals and is left up to the library. There are no standards for evaluating the quality or the value of the items. We also don't look at the relationships between the data elements. (Or don't trust the judgments we make.)

PLA Spring Symposium Notes

So, I was looking in blogger, and found that I had not posted these.

They need some editing, the first will come up today. I am working on the others.

Saturday, April 04, 2009

Silk Purses and Sow's Ears? Saturday Morning

If you accept the metrics you have always used, have the same audience, etc. you are setting yourself up to fail. Always look for a different way to tie the knot.

LJ Index: Ray Lyons

Contingent valuation definition: value that someone is willing to trade for something else. What else will equal the item in question.

Looking at measures in general...research project -- let's just look at one.

  • Library ratings are contests
  • Comparison to library peers
  • Performance is not gauged according to objective standards
Rules are chosen by the person running the contest. You must have rules to have any kind of evaluation. HAPLR was the first, the pioneer. We have to compare libraries to peers because we do not have standards.

They are based on standard library statistics. They do not measure: quality, excellence, goodness, greatness, value.

They do not account for mission, service responses, community demographics, or other factors.

The selection of statistics and weightings is arbitrary. The ratings assume higher statistics are always better, and adopt a one-size-fits-all approach (all libraries are rated using a similar formula).

Simplistic measures are created primarily for library advocacy. They are subject to misinterpretation by library community, the press, and the public.

Current rating systems: BIX (German Library Association), HAPLR, LJ Index

It is a totally arbitrary method. The more different methods, the more different views of the world.

It uses library expenditure levels to form peer comparison groups. If you chose population instead, a similar distribution would exist.

It measures service output only. Libraries must "qualify" to be rated: population over 1,000; expenditures of more than $1K; meet the IMLS definition; and report all of those to IMLS.

Reference questions are statistically significantly different in correlation to other items. Look at outlying values most of which occur in the smallest libraries.

Indicators chosen: circulation per capita; visits per capita; program attendance; public Internet computer uses. If libraries do not report data, it cannot be retrospectively added. This is a contest, not a pure scientific event.

There are anomalies in the data; it reflects the "untidiness" of the IMLS data. They chose to do per capita statistics. This can be an unfair advantage or disadvantage depending on whether the official population accurately represents the service population.

Libraries are rated by how far above or below the average they are. Calculate the mean and the standard deviation; the standard deviation shows how spread out the data is.

Create a standard score: subtract the mean from a library's figure (such as visits) and divide by the standard deviation. Your score is influenced by the others in your group, so this is not a real scientific evaluation process, and it does not measure quality.
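The standard-score arithmetic described here can be sketched in a few lines. This is a minimal illustration only; the visits-per-capita figures below are made up, not LJ Index data:

```python
def standard_score(value, values):
    """Standard (z) score: how many standard deviations a value sits from the group mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std_dev = variance ** 0.5
    return (value - mean) / std_dev

# Hypothetical visits-per-capita figures for a peer group of five libraries.
peer_visits = [4.2, 5.1, 3.8, 6.0, 4.9]
print(round(standard_score(6.0, peer_visits), 2))
```

Note how the score for the 6.0 library depends entirely on who else is in the peer group; change the group and the score changes, which is exactly the objection raised above.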

What is the point, data is old ... advocacy is the reason to do it. We are in a profession where technology is driving change. Perhaps we really need to change.

What can you squeeze out of the data we have? Is this what we should do?

(Handout: an adjustment to get rid of negatives, then get rid of the decimal point.) The number looks very precise, but it is not.

Advocacy -- showcases more libraries, encourages conducting and publicizing local evaluation

Encourages submission of data, and emphasizes the limitations of the indicators.

The model is inherently biased. It measures service delivery. If other statistics were chosen, other libraries could move to the top. Comparison between groups is inherently impossible.

It encourages assessment and the collecting of data not previously collected. How many benefits can you list? This is a contest and not a rigorous evaluation. Five stars went to an arbitrary number of libraries, partly determined by space in the journal.

Customer Satisfaction -- Joe Matthews

Customer satisfaction is performance minus expectations. It is a lagging and backward looking factor. Not an output or outcome, it is an artifact.

Survey: Can create own survey, borrow from others, use a commercial service (Counting Opinions), need to decide when to use.

You need to go beyond "how are we doing" and ask about particular services; asking respondents how they are doing and open-ended questions elicit a high response rate. Most surveys focus on perceptions and rarely ask about expectations. (PAPE - Priority And Performance Evaluation)

Service quality: SERVPERF - Service Performance; SERVQUAL - Service Quality (see handout); LibQUAL+ for academic libraries.

LibQUAL+ is web based, costs $5K per cycle, and public libraries who have tried it have generally been disappointed.

Telephone surveys are being abandoned as more and more people are dropping land lines (partly to avoid surveys).

Cannot do inferential analysis if response rate is less than 75%

A single-question customer satisfaction survey for loyal customers: "How likely is it that you would recommend X to a friend or colleague?" on a 10-point scale. Net Promoter Scores (NPS) (handout)
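The arithmetic behind NPS is simple: the percentage of promoters (ratings of 9-10) minus the percentage of detractors (0-6), with 7-8 counted as passives. A minimal sketch, using invented responses rather than any real survey data:

```python
def net_promoter_score(ratings):
    """NPS on a 0-10 scale: percent promoters (9-10) minus percent detractors (0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical answers to "How likely are you to recommend the library?"
responses = [10, 9, 8, 7, 6, 10, 9, 3, 8, 10]
print(net_promoter_score(responses))
```

The score can range from -100 (all detractors) to +100 (all promoters), which is why a single number summarizes loyalty so compactly.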

The library fails 40% of the time (although for a range of reasons). One of the worst things is to tell people that there is a long wait for best sellers.

Look at wayfinding in the library. Hours, cleanliness as well as comfort and security are very important. One library put flowers in both the men's and women's restrooms. Review when you say no.

Take a walk in your customer's shoes, but remember that you are also wearing rose colored glasses.

Hotel staff are trained in making eye contact and greeting.

Friday, April 03, 2009

Silk Purses and Sow's Ears? Assessing the Quality of Public Library Statistics and Making the Most of Them: PLA Spring Symposium 2009 – Afternoon

Conundrum: is a library excellent because it is busy or is the library busy because it is excellent?

Measuring excellence is difficult.

Output statistics are not useless, but they cannot tell us about quality, excellence, or if there is a match between community need and services.

Interesting examples of the use of data from the afternoon. One branch was looking at cutting hours. Circulation per hour was the same in the first hour and the last hour, but all of the last hour's circulation came from one person, whereas there were many people in the first. Therefore closing the last hour inconvenienced only one person. Another library rearranged the collection so it was easier to find things; while circulation went up, reference questions went down (and patrons were more satisfied).

Cost-benefit analysis has several advantages: it quantifies the monetary value of library services. How do you choose the economic value? Technical approaches: consumer surplus valuation; contingent valuation (how much would you pay for ...), or willingness to pay.

Select specific services delivered to specific audiences.

One advantage is that it is well known in the business community. It can be used over time for specific services or products. The cost can be high since it involves surveying the community. It is also possible that you may choose an area to study while missing other areas which the community values more than you think.

Larry White

The estimated cost of performance assessment in Florida in 2000 was $16 million, but the state only gave $32 million in state aid; therefore 1/4 to 1/2 of state aid was wasted.

Outcomes based performance measurement takes a long time to generate results...years.

Return on investment. Everyone hopes a high return on investment.

ROI is used by business. Happy stories and smiley kids did not work, so he created an ROI figure: 6 to 1 in the first year. He took a buck from the commissioner and promised a return. He showed the ROI (with the statistics behind it), and then told a story.
It was a combination of cost avoidance and return on the revenue side. He used data from the genealogy/history room sign-in list, took the local visitors and multiplied by local tourism figures, then estimated overnight stays for distant customers. This showed the library accounted for $500,000 in tourism revenue. Then cost avoidance: average cost per book times number of circulations, because people then did not have to purchase those books.
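The style of ROI calculation described above can be sketched as follows. Apart from the $500,000 tourism figure cited in the talk, every number here is invented purely for illustration, and the real calculation involved more components:

```python
# Hypothetical annual library budget.
library_budget = 500_000

# Cost avoidance: value of circulated items patrons did not have to buy.
avg_book_cost = 25          # assumed average purchase price
annual_circulation = 80_000 # assumed circulation count
cost_avoidance = avg_book_cost * annual_circulation

# Revenue side: tourism spending attributed to genealogy/history room visitors
# (the $500,000 figure is the one cited in the talk).
tourism_revenue = 500_000

# ROI expressed as dollars of benefit per dollar of budget.
roi = (cost_avoidance + tourism_revenue) / library_budget
print(f"{roi:.0f} to 1")
```

The point is less the exact figures than the form of the argument: benefits (avoided costs plus attributed revenue) divided by what the funder spends.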

Data mining is becoming an important role for libraries in the community. We need to use a combination of numbers and words in a creative fashion.

Can use it to justify what you want to do and to save what you want to do. It is scalable.

Lack of consensus in value and use. It is usually used defensively (preserving library funding) and reactively. We wait for disasters to tell the story.

Joe Matthews

Summer Reading Program

In Portland, OR, 75,000 kids are in the program; they raise $60K and fly a family of 4 to Disneyland. Only 30% complete the program. The real outcome is improvement in reading level, but measuring it is costly. It can be done through a third-party agency to preserve the confidentiality of the kids.

He encouraged us to start thinking about the outcome side of things: do they spend more time reading, do they do better in school? One place does a survey of caregivers about their perception of outcomes.

Statistics

Context, trends, history.

Age of collection as a stat.

Unintended consequences of performance measures
  • assessment process consequences (survey fatigue)
  • tactical process consequences
  • strategic process consequences
Assessment process consequences: changes in organizational culture; changes in operational processes; changes in organizational procedures/policies; technology's impact. We can assess more often, and faster, too! How far do we assess? What about the user of your web page in China who wants the Mandarin version of the page?

Tactical consequences: operational (how you work); systems (can create an 'us v. them'; can look at forest and forget the trees); financial (ten to one return speech -- loaned out to economic development dept.); service ("new Coke"); organizational impacts (can bring good things).

Strategic consequences occur over a long period of time. One operation supported unethical behavior to meet the need to constantly increase circulation. There is a problem when assessment drives the mission rather than the mission driving the assessment.

Final of the day: Management Frameworks (Joe Matthews)

Three Rs: Resources, Reach, and Results.

Resources: how much money do we need to achieve
Reach: who do we want and where
Results: What do we want and why

Choose only two or three measures. It is important to think about customer segments (other than demographics); they come with different needs and different expectations.

Performance Prism (see handout) used in England and New Zealand and Australia

Balanced Scorecard

Financial perspective: saving time, reducing costs, generating new revenues, share of the pie (see handout).

Building a Library Balanced Scorecard: Mission and Vision Statements; Strategy -- how to reach the mission; select performance measures; set targets or goals; projects or initiatives

Strategic plans often do not use the word strategy. Most use one of two approaches: conservative and reachable, or the scientific wild-ass guess approach; but you may want to have a BHAG (big, hairy, audacious goal).

The scorecard is usually created by a small group with the results shared with the stakeholders.

Silk Purses and Sow's Ears? Assessing the Quality of Public Library Statistics and Making the Most of Them: PLA Spring Symposium 2009 - Morning II

Larry White

We need to tell our stories better.

Every one of the 2,000 FedEx outlets reports several thousand data elements every day. The data is collected electronically, compiled company-wide, and delivered to upper management with comparatives, with a response lag of less than 12 hours. FedEx took assessment and made it a value-added service.

Walmart has a new data server farm with multi-petabyte storage. All data is kept and stored for 2 years. When something is sold, a replacement item is already leaving the warehouse to replace it on the shelf.

More metrics need to be automated, and more frequently performed. If we don't figure out how to do it for ourselves, someone is going to come in and do it for us.

Ray Lyons

Challenges of Comparative Statistics

Choosing a comparable library - there is no answer.

We need to be as diligent as we can to get a satisfactory answer even if it is not as satisfactory as we want it to be.

It is also about accountability.

See book on municipal benchmarks in footnotes. (Organization is in favor of standards existing to see if you are doing ok.) If money is not attached to standards, there may not be a reason to adhere to standard.

Types of benchmarking: data comparison using peer organizations; analyzing practices for comparable purposes; and seeking the most effective, best practices.

Benchmarking steps: define the measures to use (and who will be involved); identify appropriate partners; agree on ground rules (ethical issues); identify objectives; document profile and context; collect data; analyze data (including normalizing, e.g. per capita); report results; and then use them for change.
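The normalizing step (e.g. per capita) is what makes libraries of different sizes comparable at all. A minimal sketch; the measure names and counts below are hypothetical, not from any real library:

```python
def per_capita(measures, population):
    """Convert raw annual counts into per-capita figures for peer comparison."""
    return {name: round(count / population, 2) for name, count in measures.items()}

# Hypothetical raw annual counts for one library serving 40,000 people.
raw = {"circulation": 220_000, "visits": 150_000, "program_attendance": 12_000}
print(per_capita(raw, 40_000))
```

The same function applied to each benchmarking partner puts everyone on a common scale before any comparison is made.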

It is important to think about community need and culture. It is important to choose libraries which have the same service responses.

You need to count what you want/need, but you also need to report using the standard definitions so that the data can be compared. Another person noted that staff feel that what is counted is valued.

Peer measures: average individual income; college education; # women not working outside the home; school age children; geographic area

Study of what output measures that Ohio library directors used: 3 regularly: material expenditures, operating expenditures, circulation. [These are the easiest to find, and easy to define.]

Now moving to use of internet and job application sessions, use of computers.

Recommendation: at a minimum identify peer libraries:
  • service area population
  • key demographic characteristics
  • library budget
  • service responses

Some libraries are in "fertile settings" which can explain statistical performance.

Joe Matthews
Activity Based Costing

Handout based process:

Figure costs, such as salaries. The handout includes the cost of the library, which as a municipal library does not include utilities.