The Future of the Future of Digital Stewardship
In December, my colleagues from NDSA and I had the pleasure of attending CNI to present the 2014 National Agenda for Digital Stewardship and to lead a discussion of priorities for 2015. We were gratified to have the company of a packed room of engaged attendees, who participated in a thoughtful and lively discussion.
For those who were unable to attend CNI, the presentation is embedded below.
(Additionally, the Agenda will be discussed this Spring at NERCOMP, in a session I am leading especially for Higher Education IT leaders; at IASSIST in a poster session, represented by Coordination Committee member Jonathan Crabtree; and at IDCC, in a poster session represented by Coordination Committee member Helen Tibbo.)
Discussions of the Agenda at CNI were a first step in gathering input for the next version of the Agenda. In January, NDSA will start an intensive and systematic process of revising the Agenda for priorities in 2015 and beyond. We expect to circulate these revisions for peer and community review in April and to present a final or near-final version (depending on review comments) at the annual Digital Preservation conference in July.
Part of the discussions at our CNI session echoed selected themes in Cliff Lynch’s opening plenary “Perspective” talk, particularly his statements that:
We [as a stewardship community] don’t know how well we’re doing with our individual preservation efforts, in general. — We don’t have an inventory of the class of content that is out there, what is covered, and where the highest risks are.
There is a certain tendency to “go after the easy stuff”, rather than what’s at risk – our strategy needs to become much more systematic.
In our discussion session these questions were amended and echoed in different forms:
What are we doing in the stewardship community, and especially what are we doing well?
What makes for collaborative success, and how do we replicate that?
I was gratified that Cliff’s questions resonated well with the summary we’d articulated in the current edition of the National Agenda. The research section, in particular, lays out key questions about information value, risk assessment, and success evaluation, and outlines the types of approaches that are most likely to lead to the development of a systematic, general evidence base for the stewardship community. Moreover, the Agenda calls attention to many examples of things we are doing well.
That said, a question that was posed at our session, and that I heard echoed repeatedly at side conversations during CNI, was “Where are we (as a group, community, project, etc.) getting stuck in the weeds?”
This question is phrased in a way that attracts negative answers — a potentially positive and constructive rephrasing is: What levels of analysis are most useful for the different classes of problems we face in the stewardship community?
As an information scientist and a formally (and mathematically) trained social scientist, I tend to spend a fair amount of time thinking about and building models of individual, group, and institutional behaviors, tracing the empirical implications of these models, and designing experiments (or seeking natural experiments) that have the potential to provide evidence to distinguish among competing plausible models. In this general process, and in approaching interventions, institutions, and policies generally, I’ve found the following levels of abstraction perennially useful:
The first level of analysis concerns local engineering problems, in which one’s decisions neither affect the larger ecosystem nor provoke strategic reactions by other actors. For example, the digital preservation problem of selecting algorithms for fixity, replication, and verification to yield cost-effective protection against hardware and media failures is best treated at this level in most cases. For this class of problem, the tools of decision theory (of which “cost-benefit” analysis is a subset), economic comparative statics, statistical forecasting, Monte Carlo simulation, and causal inference are helpful.
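As a rough illustration of a first-level analysis, the sketch below uses Monte Carlo simulation to pick the smallest number of replicas that meets a loss-risk target. Everything in it is an assumption made for illustration: the function names, the 5% annual failure probability, the ten-year horizon, the 1% risk target, and the simplification that copies fail independently and are fully repaired at a yearly fixity audit. It is not drawn from the Agenda or from any NDSA tool.

```python
import random


def simulate_loss_probability(replicas, annual_failure_prob, years, trials, seed=0):
    """Estimate the chance that all replicas fail within the same year at least once."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    losses = 0
    for _ in range(trials):
        for _ in range(years):
            # Assumption: copies fail independently, and any failed copy is
            # restored from a surviving copy at the end-of-year fixity audit.
            if all(rng.random() < annual_failure_prob for _ in range(replicas)):
                losses += 1  # every copy failed in one year: content is lost
                break
    return losses / trials


def smallest_acceptable_replication(max_replicas, annual_failure_prob, years,
                                    risk_target, trials=20000):
    """Return (replicas, estimated_loss_probability) for the cheapest acceptable option."""
    for replicas in range(1, max_replicas + 1):
        p_loss = simulate_loss_probability(replicas, annual_failure_prob, years, trials)
        if p_loss <= risk_target:
            return replicas, p_loss
    return None  # no option within max_replicas meets the target


if __name__ == "__main__":
    # Hypothetical inputs: 5% annual failure per copy, 10 years, 1% acceptable risk.
    print(smallest_acceptable_replication(6, 0.05, 10, 0.01))
```

Because nothing in this model reacts to the decision, a simulation of this kind (or the equivalent closed-form calculation) is sufficient. That stops being true once other actors adapt to the choice.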
The second level concerns tactical problems, in which other actors react and adapt to your decisions (e.g., to compete, or to avoid compliance), but the ecosystem (market, game structure, rules, norms) remains essentially fixed. For example, the problem that a single institution faces in setting (internal/external) prices (fees/fines) or usage and service policies is a tactical one. For tactical problems, applying the tools described above on their own is likely to yield misleading results, and some form of modeling is essential: models from game theory, microeconomics, behavioral economics, mechanism design, and sociology are often most appropriate. Causal inference remains useful, but must be combined with modeling.
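To make the contrast with the first level concrete, here is a minimal sketch of a two-stage leader-follower game solved by backward induction. The payoff numbers, action names, and function names are all hypothetical and chosen only to show how a choice that looks best when user behavior is held fixed can be the wrong one once users are allowed to react.

```python
# Hypothetical payoffs: (institution_action, user_action) -> (institution, user)
PAYOFFS = {
    ("low_fee", "use"): (6, 4),
    ("low_fee", "avoid"): (0, 1),
    ("high_fee", "use"): (10, 0),
    ("high_fee", "avoid"): (0, 1),
}
INSTITUTION_ACTIONS = ("low_fee", "high_fee")
USER_ACTIONS = ("use", "avoid")


def naive_choice(assumed_user_action):
    """First-level reasoning: pick the best fee while holding user behavior fixed."""
    return max(INSTITUTION_ACTIONS,
               key=lambda a: PAYOFFS[(a, assumed_user_action)][0])


def user_best_response(institution_action):
    """The users' payoff-maximizing reaction to a given fee."""
    return max(USER_ACTIONS,
               key=lambda u: PAYOFFS[(institution_action, u)][1])


def anticipating_choice():
    """Backward induction: pick the fee that is best after users have reacted."""
    return max(INSTITUTION_ACTIONS,
               key=lambda a: PAYOFFS[(a, user_best_response(a))][0])


if __name__ == "__main__":
    print("naive:", naive_choice("use"))
    print("anticipating:", anticipating_choice())
```

With these made-up payoffs the naive analysis favors the high fee, while the analysis that models the users' reaction favors the low fee, because users route around the service when the fee is high.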
The Agenda itself is not aimed at these two levels of analysis; however, many of the NDSA working groups’ projects and interests are at the first, local-engineering level: NDSA publications such as content case studies, the digital preservation in a box toolkit, and the levels of preservation taxonomy may provide guidance for first-level decisions. Many of the other working group outputs, such as the storage, web, and staffing surveys, do not describe tactical models, but they do provide baseline information and peer comparisons to inform decisions at the tactical level.
The third level is systems design (in this case, the design of legal-socio-technical systems). In these types of problems, the larger environment (market, game structure, rules, norms) can be changed in meaningful ways. Systems analysis involves understanding an entire system of simultaneous decisions and their interactions and/or designing an optimal (or at least improved) system. Examples of systems analysis are common in theory: any significant government regulation/legislation should be based on systems analysis. For institutional-scale systems analysis, a number of conceptual tools are useful, particularly market design and market failure; constitutional design; and the co-design of institutions and norms to manage “commons”.
Working at this level of analysis is difficult to do well: One must avoid the twin sins of getting lost in the weeds (too low a level of analysis for the problem) and having one’s head in the clouds (thinking at such a level of generality that analysis cannot be practically applied, or worse, is vacuous). Both the Agenda and Cliff’s landscape talk are aimed at this level of analysis and manage to avoid both sins to a reasonable degree.
Academics often do not go beyond this level of designing systems that would be optimal (or at least good) and stable if actually implemented. However, it’s exceedingly rare that a single actor (or unified group of actors) has the opportunity to design entire systems at institutional scale; notable examples are the authoring of constitutions, and (perhaps) the use of intellectual property law to create new markets.
Instead, policy makers, funders, and other actors with substantial influence at the institutional level are faced with a fourth level of analysis, represented by the question “Where do I put attention, pressure, or resources to create sustainable positive change?” System design alone doesn’t answer this: Design is essential for identifying where one wants to go, but policy analysis and political analysis are required to understand what actions to take in order to get (closer to) there.
This last question, of the “where do we push now” variety, is what I’ve come to expect (naturally) from my boss, and from other leaders in the field. When pressed, I’ve thus far managed to come up with (after due deliberation) some recommendations (or, at least, hypotheses) for action, but this generally seems like the hardest level of solution to get right, or even to assess as good or bad. I think the difficulty comes from having to hold a coherent high-level vision (from the systems design level) while simultaneously getting back “down into the details” (though not “in the weeds”) to understand the current arrangements and limitations of power, resources, capacity, mechanism, attention, knowledge, and stakeholders.
Although we started by aiming more at systems design than policy intervention, some recommendations of the policy intervention sort are nonetheless to be found in the current Agenda. I expect that this year’s revisions, and the planned phases of external review and input, will add more breadth to these recommendations, but that it will require years more of reflection, iteration, and refinement to identify specific policy recommendations across the entire breadth of issues covered by the Agenda at the systems level.
 For an accessible broad overview of decision theory, game theory, and related approaches see M. Peterson, An Introduction to Decision Theory. For a classic introduction to policy applications see Stokey & Zeckhauser 1978, A Primer for Policy Analysis.
 There are many good textbooks on statistical inference, ranging from the very basic, accessible and sensible Problem Solving by Chatfield (1995) to the sophisticated and modern Bayesian Data Analysis 3rd edition, by Gelman et al. (2013). There are relatively few good textbooks on causal inference: Judea Pearl’s (2009) Causality: Models, Reasoning and Inference 2nd edition is as definitive as a textbook can be, but challenging; Counterfactuals and Causal Inference: Methods and Principles for Social Research, Morgan & Winship’s textbook, is more accessible.
 Market failure is a broad topic, and most articles and even books address only some of the potential conditions for functioning markets. A good, accessible overview is Stiglitz’s Regulation and Failure, but it doesn’t cover all areas. For information stewardship policy the economics of non-consumptive goods is particularly relevant (see Foray, The Economics of Knowledge (2006)), and increasing returns and path dependence are particularly important in social/information network economies (see Arthur 1994, Increasing Returns and Path Dependence in the Economy).
 See Lijphart, Arend. “Constitutional design for divided societies.” Journal of Democracy 15.2 (2004): 96-109; and Shugart, Matthew Soberg, and John M. Carey. Presidents and Assemblies: Constitutional Design and Electoral Dynamics. Cambridge University Press, 1992.
 The late Lin Ostrom’s work was fundamental in this area. See, for example, Ostrom, Elinor. Understanding Institutional Diversity. Princeton University Press, 2009.