Output and Income Indicators

Image by Michael Reichelt. Downloaded from pixabay.com

In my previous Fan post (“Indicators of Government Expenditures”) I noted that, when using output indicators such as GDP, we should keep in mind that: a) there are important limitations to this indicator, and b) when used, there are different indicators that may be more or less appropriate for different purposes. I elaborate on both points here.

On the first point, an assessment was done in 2008 by a commission led by three economists, two of whom are Nobel laureates, at the request of the Government of France, and later summarized in a book. I draw from it here, although additional details are available online.1

The commission was led by Joseph Stiglitz, Amartya Sen (both Nobel laureates), and Jean-Paul Fitoussi. Other economists were also part of the commission. The commission was divided into three working groups:

  • One to focus on standard issues of national accounting, such as measuring government output and treatment of household production;
  • A second group focused on the relationship between output measures and efforts to measure well-being or quality of life;
  • A third group looked at attempts to capture sustainability in measures of output.

On the “classical GDP Issues,” GDP mainly measures market production, and one reason why money measures have come to play an important role in our evaluation of economic performance is that money valuations facilitate aggregation. However:

  • Prices do not exist for some types of output (e.g. government services provided free of charge or household services such as child care);
  • Market prices may not reflect consumers’ appreciation of goods and services if there is imperfect information (e.g. financial products, telecommunications bundles);
  • Market prices may not fully reflect societal evaluation due to externalities (e.g. environmental costs);
  • Collecting accurate data may be challenging when there are sales or differences in prices among alternative selling mechanisms (e.g. online vs store prices);
  • Accounting for quality of products and changes in quality is challenging and may not always be reflected in prices;
  • Underestimating quality improvement means overestimating inflation, which, in turn, means underestimating real income.
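The last point in the list above is easier to see with a little arithmetic. The sketch below uses invented numbers (5% nominal income growth, a 3% measured price increase, and an assumed 1% quality improvement that the price index fails to capture); none of these figures come from the commission's report.

```python
def real_growth(nominal_growth, inflation):
    """Real growth implied by nominal growth and inflation (both as fractions)."""
    return (1 + nominal_growth) / (1 + inflation) - 1


# Hypothetical figures, chosen only for illustration.
nominal_income_growth = 0.05     # income rises 5% in current prices
measured_inflation = 0.03        # the price index rises 3%
unmeasured_quality_gain = 0.01   # assumed share of the price rise that pays for better quality

# Removing the quality gain from the measured price change gives a lower "true" inflation.
true_inflation = (1 + measured_inflation) / (1 + unmeasured_quality_gain) - 1

measured_real = real_growth(nominal_income_growth, measured_inflation)
adjusted_real = real_growth(nominal_income_growth, true_inflation)

print(f"measured inflation: {measured_inflation:.2%}, quality-adjusted: {true_inflation:.2%}")
print(f"measured real income growth: {measured_real:.2%}")
print(f"quality-adjusted real income growth: {adjusted_real:.2%}")
```

Under these assumptions, the quality-adjusted real growth rate comes out higher than the measured one, which is the direction of bias the list describes.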

These are not minor inconveniences but real issues, and the extent to which they distort GDP measures is not clear. The authors discuss at some length the issues with measuring services, for example. Services account for up to two thirds of output, and measuring their quality is challenging. Government provision of services is often measured through inputs, which leaves aside the possibility of capturing changes in productivity. Attempts to measure government services using outputs face known challenges, such as accounting for quality. Which services should count as final and which as intermediate (or “defensive”) is also difficult to define: how, for example, should government spending on prisons or private spending on commuting be treated?

The authors suggest five ways of dealing with some of the deficiencies of GDP as an indicator of living standards:

  1. Emphasize well established indicators other than GDP
    • Gross, rather than Net, has the issue of not accounting for the amount of output that is needed to maintain capital goods (depreciation). When technology is changing rapidly, this could be substantial and the difference between Gross and Net can be considerable. Then – consider “Net” (although depreciation is hard to estimate);
    • Product, rather than Income, has the issue of not being as good for accounting household consumption and, therefore, associated well-being. The difference is the purchasing power sent to and received from abroad (net income from abroad). Also, changes in the relative prices of exports and imports will affect national income even if domestic product stays the same. Consider “Income;”
  2. Consider wealth jointly with consumption to capture consumption possibilities over time;
  3. Bring out the household perspective
    • Adjusted disposable income accounts for government taxes and monetary transfers but not for transfers in kind;
  4. Add information on the distribution of income, consumption and wealth:
    • Median is better than average, but depends on survey data and these have known challenges:
      • Unit of measurement? Consumption unit?
      • Measuring property income?
      • International comparability
      • Whose bundle of consumption?
      • Shifts from the provision of services within households or between families to provision by markets create distortions
    • Also, we should be looking at the distribution of full income, not just market income, including values such as household production and leisure
  5. Widen the scope of what is being measured (may require imputation):
    • Recommendation is to keep a satellite account because: a) imputed values are not as reliable as observed values; b) non-observed values could end up being a very large share of total output. E.g.:
      • Household work, by the authors’ estimates, could be 30% of currently measured GDP;
      • Leisure could be 80%;
    • They still recommend it be done for a) completeness; b) the invariance principle – under which the value of a good or service should not depend on the institutional arrangement under which it is provided (e.g. free by state or charged by private sector).
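To illustrate why the boundary between household and market production matters for growth figures, here is a minimal sketch with an invented economy. Household production starts at 30% of measured output (echoing the estimate above), and the only thing that happens is that some household services move to the market. All numbers and variable names are my own assumptions.

```python
def growth(before, after):
    """Growth rate between two levels, as a fraction."""
    return after / before - 1


# Hypothetical economy, in arbitrary units.
market_before = 100.0      # output captured by GDP
household_before = 30.0    # household production, not captured by GDP

# Assume 5 units of household services (e.g. child care) move to the market
# and nothing else changes.
shifted = 5.0
market_after = market_before + shifted
household_after = household_before - shifted

measured_growth = growth(market_before, market_after)
full_growth = growth(market_before + household_before,
                     market_after + household_after)

print(f"growth as measured by GDP: {measured_growth:.1%}")
print(f"growth including household production: {full_growth:.1%}")
```

Measured output grows even though, in this stylized case, total production has not changed at all. A satellite account would make that visible without mixing imputed and observed values in the headline figure.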

The other two areas taken on by the commission working groups are more intuitive to me, even if not easy to address, so I only briefly summarize the conclusions of the corresponding working groups:

  1. On the relationship between output measures and efforts to measure well-being or quality of life, the argument is that these latter concepts cannot be reduced to resources. Efforts to measure well-being and quality of life have either attempted to measure subjective perceptions, tried to assess capabilities that would enable and support human functioning (health, education, security…) or tried to identify how individuals themselves weigh the non-monetary aspects of their well-being. All these attempts face challenges, including how to incorporate inequalities, how to assess the linkages between the various dimensions of well-being or quality of life, and how to aggregate them;
  2. On attempts to capture sustainability in measures of output, there is a large and varied literature that the commission divided into four groups: attempts to establish large dashboards with sets of indicators addressing different aspects of sustainability; attempts to develop composite indices; attempts to develop adjusted GDP indicators; and indicators focusing on overconsumption or overinvestment.

What do I draw from the above? A few initial thoughts:

  • When using an indicator of output growth for a selected country or group of countries, I have typically used the World Bank, World Development Indicators (WDI), Gross Domestic Product (GDP) series in Local Currency Units (LCUs). I have used LCUs rather than alternative monetary units when looking at growth, to avoid the influence of short-term fluctuations in exchange rates. Attempts to correct for this influence, such as the World Bank’s Atlas measure (more on this below) or the use of Purchasing Power Parity (PPP) measures, seem unnecessary, given their imperfections and that we are only interested in growth and not in comparing the absolute value of output among countries. This series can be used to break down domestic output into its expenditure components (G+C+I+Ex-Im+changes in inventories), as well as by sector of the economy (agriculture, industry and services).2 It is available for a period of over 60 years for most countries. Based on the input above:
    • The use of output rather than income indicators when looking at growth seems reasonable to me and perhaps more relevant: it better reflects the production capacity of a country (rather than its standard of living) and, for most countries, output and income do not tend to diverge much over time (although this may not always be the case and it would be interesting to look at the data).
    • The fact that GDP indicators do not capture household production means that growth is likely overestimated during periods where agricultural production for own consumption is reduced and production for the market is increased. GDP growth is also likely overestimated during periods of increased entry of women in the labor market, if this also means decreased services within the household. I would need to further research the WB WDI methodology to see the extent to which the WB tries to address this issue in their measurements;
    • The extent to which the informal economy is captured also requires a further look into the WB WDI indicator methodology. If it does not capture the informal economy well, growth would also be overestimated during periods of formalization.
  • I have used the World Bank, World Development Indicators (WDI), Gross National Income (GNI) series in Purchasing Power Parity (PPP) when comparing countries. I have preferred to use the concept of income (what belongs to the residents of a country) rather than product (what is produced within the boundaries of a country) when comparing countries because it is a better indicator of the resources available to the local population. For cross-country comparisons, PPP measures (even if imperfect) allow some correction for price and exchange rate distortions regarding how much residents of two compared countries can actually purchase with their income. This series is available for fewer years and countries. Based on the input above:
    • Periods of rapid technological transformation – such as the one we are in now – are likely generating considerable distortion in our relative measurements of income by country, given the challenges in addressing quality of products and services. To the extent that we are able to use net indicators (as opposed to gross), accounting for depreciation in such periods is also a more serious challenge and a source of distortion.
    • Does our association of value with market prices mean that our association of income per capita with productivity is somewhat distorted? I explain: think of luxury goods, where price is not necessarily associated with quality but where status of a brand plays an important role in product prices. Countries with heavy presence of luxury industries will have their per capita incomes associated with this higher price that is fabricated by the status of their products rather than by the quality of their products. How we understand the productivity of their population would need to be interpreted in this context (Italy, I am thinking of you).
    • Do the decaying European houses (that we think of as so charming) mean that European household income tends to be overestimated by the use of gross measurements?
    • On the other hand, does the fact that we do not capture the value of leisure underestimate European household income relative to countries like the US?
  • The World Bank uses GNI per capita in US dollars, converted from local currency through the Atlas method, to classify countries in income groups (low income, lower middle income, upper middle income and high income). The Atlas method is based on three-year moving averages of exchange rates. They use the Atlas method rather than PPP, arguing that “issues concerning methodology, geographic coverage, timeliness, quality and extrapolation techniques have precluded the use of PPP conversion factors for this purpose” (World Bank, undated). This also seems to be the indicator the WB uses for establishing the annual threshold for countries to qualify for International Development Association (IDA) loans. The US Millennium Challenge Corporation (MCC) uses the WB country income groups to select countries that qualify for its assistance (low income and lower middle income). Based on the input above:
    • If we underestimate income in low-income economies, given that they often also have larger portions of their economies not captured by GNI measurements (greater presence of subsistence agriculture, household production and services, informality), what does this mean for our categorization of countries in income groups? How distorted are these classifications? Should we be interpreting them as rather “market income” groups? If so, to what extent are our foreign assistance programs directed at increasing “market income,” rather than income as a whole? To what extent are our foreign assistance impact evaluations distorted by not recognizing this distinction?
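To show why averaging exchange rates over three years matters for a classification based on US dollar income, here is a deliberately simplified sketch. It only averages the exchange rates of the current and two preceding years. As far as I understand, the published Atlas method also adjusts the earlier years' rates for relative inflation, and that step is left out here. The exchange rates, the income figure and the function name are all invented for illustration.

```python
def three_year_average_rate(rates, year):
    """Simple average of the LCU-per-USD rate over `year` and the two preceding years.

    Simplification: no inflation adjustment is applied to the earlier years.
    """
    window = [rates[y] for y in (year - 2, year - 1, year)]
    return sum(window) / len(window)


# Hypothetical data: local currency units per US dollar, with a sharp depreciation in 2022.
rates = {2020: 10.0, 2021: 10.0, 2022: 13.0}
gni_per_capita_lcu = 26000.0  # hypothetical GNI per capita in 2022, in LCU

spot_usd = gni_per_capita_lcu / rates[2022]
smoothed_usd = gni_per_capita_lcu / three_year_average_rate(rates, 2022)

print(f"GNI per capita at the 2022 exchange rate: {spot_usd:,.0f} USD")
print(f"GNI per capita at the three-year average rate: {smoothed_usd:,.0f} USD")
```

With a single-year rate, a sudden depreciation would move the country's dollar income (and possibly its income group) much more than with the smoothed rate, even though nothing changed in local currency terms.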

Notes

  1. There used to be a site with technical papers at the URL: www.stiglitz-sen-fitoussi.fr . This seems to no longer be available but I found a link to the content here: https://web.archive.org/web/20150622185128/http://www.stiglitz-sen-fitoussi.fr/en/index.htm
  2. The WB World Development Indicators reports total value added at basic or producer prices and GDP at purchaser prices. That is why their measurements differ. Purchaser prices include taxes and exclude subsidies. For more information, see here: https://datahelpdesk.worldbank.org/knowledgebase/articles/114948-what-is-the-difference-between-total-value-added-a

References

Stiglitz, Joseph E.; Sen, Amartya; and Jean-Paul Fitoussi. 2010. Mismeasuring Our Lives: Why GDP Doesn’t Add Up. The Report by the Commission on the Measurement of Economic Performance and Social Progress. The New Press.

World Bank. Undated. Why use GNI per capita to classify economies into income groupings? Available: https://datahelpdesk.worldbank.org/knowledgebase/articles/378831-why-use-gni-per-capita-to-classify-economies-into. Accessed: June 08, 2024.


Indicators of Government Expenditures

Image by Abraham Bosse. Downloaded from picryl.com

The International Monetary Fund (IMF) has a couple of public dashboards showing government expenditures as a percentage of Gross Domestic Product (GDP), by country. See here and here. There is nothing wrong in doing this if we keep in mind that we are using GDP as a denominator just as a tool to give us a reference of the relative size of government expenditures in different countries. But, based on this kind of data, it is common to hear things like “government expenditures were 61% of the entire French economy or 45% of the US economy in 2020,” as if these numbers were breaking down the total of the economy (100%) in its government and non-government portions. This would be incorrect and, unfortunately, it ends up supporting all sorts of confused discussions about the role of government in the economy.

The comparison between government expenditures and GDP is one of apples and oranges and only makes sense if we understand, again, that GDP is being used as a denominator only as a convenient tool to facilitate country comparisons. Government expenditures, as reflected in databases like that of the IMF, are measures of total expenditures, either by central and local governments or just by central governments (depending on the country), over a one-year period. GDP does not measure total expenditures, but rather “value added” by the economy over a one-year period. The difference is that value added subtracts, from a sector’s total expenditures, its purchases of the intermediate goods and services it uses to provide its own goods and services. Value added is used when measuring output by sector, so that sectors can be summed without double counting. The result is a general measure of output, such as GDP.
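The distinction between total output and value added can be sketched with a toy supply chain. The three stages and all figures below are invented; the only point is to show how summing gross output double counts intermediate purchases while summing value added does not.

```python
# Hypothetical three-stage supply chain; all figures are invented.
# Each stage sells its whole output to the next one; the baker sells to final consumers.
stages = [
    {"name": "farmer", "output": 100.0, "intermediate_purchases": 0.0},
    {"name": "miller", "output": 150.0, "intermediate_purchases": 100.0},
    {"name": "baker", "output": 220.0, "intermediate_purchases": 150.0},
]

# Gross (total) output counts the wheat three times and the flour twice.
total_output = sum(s["output"] for s in stages)

# Value added counts only what each stage contributes on top of what it bought.
value_added = sum(s["output"] - s["intermediate_purchases"] for s in stages)

final_sales = stages[-1]["output"]

print(f"total output: {total_output:.0f}")
print(f"value added:  {value_added:.0f}")
print(f"final sales:  {final_sales:.0f}")
print(f"ratio of total output to value added: {total_output / value_added:.2f}")
```

In this toy case value added equals final sales, while total output is roughly twice as large. Dividing a measure of total spending by a measure of value added is therefore not a share of anything.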

To illustrate, see the table below (Figure 1). The second column shows government expenditures as a share of GDP in 2020 for selected countries, as measured in total expenditures and reported by the IMF. The third column shows government consumption as a share of GDP, as measured in value added and reported by the World Bank World Development Indicators. The actual share of GDP that corresponds to the government would need to add government investment (fixed capital formation) to government consumption. These data were not readily available for most countries in the WB WDI dataset, and disentangling government and private fixed capital formation does not seem to be simple. So I added total fixed capital formation (public and private) to government consumption, for the sake of comparison with the IMF numbers (fourth column). The actual weight of the government in GDP should be somewhere between columns three and four.

Figure 1. Government Relative to GDP, Selected Countries, 2020

| Country | Government Expenditures as % of GDP (IMF)¹ | Government Consumption (value added) as % of GDP (WB)² | Government Consumption + Total (public and private) Fixed Capital Formation (value added) as % of GDP (WB)² |
| --- | --- | --- | --- |
| France | 61.35 | 24.84 | 48.12 |
| Germany | 50.46 | 22.02 | 43.57 |
| Brazil | 49.92 | 20.14 | 36.70 |
| United Kingdom | 49.87 | 22.60 | 40.07 |
| United States | 44.82 | 15.09 | 36.94 |

Sources: 1. IMF DATAMAPPER. Fiscal Monitor, October 2023, https://www.imf.org/external/datamapper/G_X_G01_GDP_PT@FM/ADVEC/FM_EMG/FM_LIDC. 2. World Bank World Development Indicators. Accessed April 2024, https://databank.worldbank.org/source/world-development-indicators.

Note: government expenditures in 2020 were generally higher than usual, as countries tried to minimize the economic effects of the COVID-19 pandemic.

I am sure there are better data out there somewhere but, after spending some time trying to unbury the IMF metadata (which should be easier to find), my patience was running low. For the US, see data from the Bureau of Economic Analysis, which defines the value added by Government as being “the sum of compensation paid to general government employees plus consumption of government owned fixed capital (CFC), which is commonly known as depreciation” (BEA, 2008, p.29). My point still holds.

Another way of looking at the actual weight of government expenditures in the economy would be to compare them, not with GDP, but with total output in an economy over a one-year period, that is, without discounting intermediate products and services. Country national accounts typically do show this indicator, and it tends to be roughly twice as large as total value added in any one year. The ratio of total output to value added is available in Table 2.6 of the United Nations (UN) National Accounts Statistics. Figure 2 below applies that ratio to the IMF indicator of government expenditures as a share of GDP to obtain a rough estimate of the share of government expenditures over total output in the last column of the table. Note that the resulting estimates are within the range of columns 3 and 4 of Figure 1.

Figure 2. Government Relative to Total Output, Selected Countries, 2020

| Country | Government Expenditures as % of GDP (IMF)¹ (a) | Ratio of Total Output to Value Added (UN)² (b) | Rough Estimate of Government Expenditures as % of Total Output (a/b) |
| --- | --- | --- | --- |
| France | 61.35 | 1.95 | 31.42 |
| Germany | 50.46 | 2.03 | 24.83 |
| Brazil | 49.92 | 2.07 | 24.14 |
| United Kingdom | 49.87 | 1.89 | 26.40 |
| United States | 44.82 | 1.77 | 25.39 |

Sources: 1. IMF DATAMAPPER. Fiscal Monitor, October 2023, https://www.imf.org/external/datamapper/G_X_G01_GDP_PT@FM/ADVEC/FM_EMG/FM_LIDC; 2. UN National Accounts Statistics. Main Aggregates and Detailed Tables. Table 2.6, Accessed April 2024, https://unstats.un.org/unsd/nationalaccount/madt.asp?SB=1&#SBG
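For readers who want to redo the arithmetic behind the last column of Figure 2, the sketch below divides column (a) by column (b) using the values shown in the table. The results differ from the published column by a few hundredths of a percentage point, presumably because the ratios in column (b) are shown rounded to two decimals; that explanation is my assumption.

```python
# Columns (a) and (b) of Figure 2, as shown in the table.
figure_2 = {
    "France": (61.35, 1.95),
    "Germany": (50.46, 2.03),
    "Brazil": (49.92, 2.07),
    "United Kingdom": (49.87, 1.89),
    "United States": (44.82, 1.77),
}

# Last column of Figure 2 as published, kept here only for comparison.
published = {
    "France": 31.42,
    "Germany": 24.83,
    "Brazil": 24.14,
    "United Kingdom": 26.40,
    "United States": 25.39,
}


def share_of_total_output(expenditure_pct_gdp, output_to_value_added):
    """Rescale a share of GDP (value added) into a share of total output."""
    return expenditure_pct_gdp / output_to_value_added


estimates = {
    country: share_of_total_output(a, b) for country, (a, b) in figure_2.items()
}

for country, estimate in estimates.items():
    gap = estimate - published[country]
    print(f"{country}: {estimate:.2f}% of total output (gap vs. table: {gap:+.2f})")
```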

Again, I am sure there are better data out there, but the fact that I had to spend considerable time deciphering the data above and still don’t have non-misleading comparable cross-country data for the actual size of government expenditures relative to total output is of relevance itself for my purposes on this blog.

Other than the issue of comparing apples and oranges, there are additional considerations we need to make when assessing statements like the one quoted above (“government expenditures were 61% of the entire French economy or 45% of the US economy in 2020”). One is about what we are supposed to infer from looking at government expenditures. If a measure is provided as a reference for the extent to which governments participate in the economy, using expenditures ignores the entire side of government regulation, which, in market economies, is likely at least as important as government expenditures to understand the influence of the government in the functioning of an economy. Looking beyond total expenditures and into their breakdown by levels of government, by consumption and investment, and other disaggregated data would likely also contribute to a much richer and more productive discussion, not to mention the large literature on taxation, as well as financial indicators of debt and debt sustainability. These are all subjects that the IMF delves into professionally and about which it publicly releases a lot of information, even if it is not always easy to decipher. I can’t help wondering, however, whether sites like those of the IMF dashboards linked above are actually doing more harm than good by stressing one small and misleading indicator of government participation in the economy.

Another consideration in interpreting data such as that shown in the IMF dashboards is about GDP and what it represents. Although we often think of it as an indicator of the size of the economy: a) there are important limitations to this indicator, and b) when used, there are different indicators that may be more or less appropriate for different purposes. I will look at these issues in a future post.

References

BEA (Bureau of Economic Analysis). 2008. A Primer on BEA’s Government Accounts, by Bruce E. Baker and Pamela A. Kelly. Available: https://apps.bea.gov/scb/pdf/2008/03%20March/0308_primer.pdf?_gl=1*1anuf1l*_ga*NjM4MDQ4ODA2LjE3MTI3Nzc2ODE.*_ga_J4698JNNFT*MTcxMzExMzg4NC44LjAuMTcxMzExMzg4NC42MC4wLjA. Accessed: April 14, 2024.

BEA (Bureau of Economic Analysis). 2010. Frequently Asked Questions: BEA seems to have several different measures of government spending. What are they for and what do they measure? Available: https://www.bea.gov/help/faq/552 Accessed: April 12, 2024

International Monetary Fund (IMF). 2023. IMF DATAMAPPER. Fiscal Monitor, October. Available: https://www.imf.org/external/datamapper/G_X_G01_GDP_PT@FM/ADVEC/FM_EMG/FM_LIDC; Accessed: April 14, 2024.

United Nations (UN). 2024. UN National Accounts Statistics. Main Aggregates and Detailed Tables. Table 2.6, Available: https://unstats.un.org/unsd/nationalaccount/madt.asp?SB=1&#SBG; Accessed: April 14, 2024.

World Bank. 2024. World Development Indicators. Available:  https://databank.worldbank.org/source/world-development-indicators; Accessed: April 14, 2024 


On Data and Evidence in the Social Sciences

Post category: Fan
Photo by Timothy Grindall. Downloaded from pexels.com

On a recent trip to the local public library I happened to find a copy of a book of collected works by Bertrand Russell that I used to own but that, like many other books, had been a victim of my international moves. I always admired Bertrand Russell’s clear, simple and straightforward way of discussing not so simple topics without distorting them (at least not in ways that were obvious to me). The library was selling the book as part of a used book sale and I bought it, together with a copy of Bertrand Russell’s “The Scientific Outlook.”

In reading this latter book, I found myself sucked into a web of interrelated methodological discussions; some old ones (e.g. how scientific the social sciences are or can be) and some newer ones (e.g. whether the huge amount and speed of data availability, and easy access to it, brought by information technology, has challenged traditional scientific methodology and put correlation – with no theory – front and center on the research agenda). I remember delving into economic methodological discussions some thirty years ago as an economics student but have distanced myself from economic theory since.  After getting lost in the rabbit hole, I decided I did not have enough time to dig deep enough into these discussions, but thought I would register what I found, perhaps for continuing/revisiting at a later date.

So here goes.

Both Mlodinow (2009) and Russell (1962) place the origins of the scientific revolution in the late sixteenth and early seventeenth centuries, pretty much on the shoulders of Galileo Galilei (1564-1642), his contemporaries and those coming soon after him (e.g. Isaac Newton – 1643-1727). They also both characterize the scientific revolution as being centered on induction and experimentation, as opposed to deduction, as a source of knowledge. Both deduction (theory) and induction (evidence) have a role in science, and Russell describes the scientific method as including three stages:

  • Observing significant facts

  • Arriving at a hypothesis, which, if true, would account for these facts

  • Deducing from this hypothesis consequences which can be tested by observation (some, quoting Karl Popper, would say “refuted” by evidence)

This characterization of the scientific method (and its variations) seems to have been criticized over time as not adequately portraying how science evolves. The idea that science progresses by refuting hypotheses empirically, for example, seems to have been criticized repeatedly. A recent opinion article in Scientific American (Singham 2020) claims that it must be abandoned for good for at least two reasons: first, because empirical experiments are themselves framed by many theories, making their results more reflective of comparisons between theories than between theory and evidence; second, because this is not really how science has advanced historically. Rather, the author claims, “It is the single-minded focus on finding what works that gives science its strength, not any philosophy.” Similar arguments have been made by various philosophers of science, including Thomas Kuhn (Wikipedia 2021).

The use of empirical evidence may vary from one branch of science or research program to another. I particularly looked for discussions among economists, because that is an area I have more of a background in and because of its relevance to international development. In a well-known paper, Larry Summers (1991) argues that elaborate statistical tests aimed at estimating model parameters have done little to advance economic thinking, that most papers remembered as having advanced economic theory have little empirical content at all, and that successful empirical research in economics has relied mostly on attempts to gauge strength of association and on persuasiveness. He criticizes models that have been overspecified to enable testing, arguing that the results tend to be of little worth, and, comparing economics to the natural sciences, he states that “The image of an economic theorist nervously awaiting the result of a decisive econometric test does not ring true.”

In general, the Popperian criterion of falsifiability through testing seems to be simultaneously nominally accepted and yet in practice not met in economics, with theory moving forward anyway, based on the use of empirical evidence to support argumentation. Hausman (2018) summarizes the challenges of applying the Popperian criterion to economics (presumably applicable to the social sciences more generally) and how several authors have abandoned the criterion completely to argue that economics (and, again, presumably the social sciences more generally) advances by using a more comprehensive blend of theory and empirical evidence. Durlauf (2012) states that “while some empirical economics involves the full delineation of an economic environment, so that empirical analysis is conducted through the prism of a fully specified general equilibrium model, other forms of empirical work use economic theory in order to guide, as opposed to determine, statistical model specification. Further, a distinct body of empirical economics is explicitly atheoretical, employing so-called natural experiments to evaluate economic propositions and to measure objects such as incentives.”

A more recent discussion on the use of evidence to advance our knowledge of society gained traction with the rapid growth of “big data.” Some claimed that data in large volume would make the scientific method obsolete and that correlation would suffice to advance our knowledge, even if these claims may come from outside the academic community. For example, an article in WIRED magazine, written by its Editor in Chief, claimed that “Petabytes allow us to say: ‘correlation is enough.’ We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot” (Anderson 2008). These kinds of claims have been countered by others (e.g. Mazzocchi 2015) and it is hard for me to imagine how computerized analysis of data would not be imbued with human theorizing, no matter how much one tries to step aside and let “data speak.” In addition, for my purposes, much of the data in international development is not “big data.” Even if it were, it is not clear to me how we would separate the many variables that in international development tend to move together (in the same or opposite direction) with just correlation as a criterion (and no theory).

There is a large literature to review on this topic and I have not even looked at the research based on randomized controlled trials that gave Esther Duflo, Abhijit Banerjee and Michael Kremer the 2019 Nobel prize in economics, and what that line of research means for the discussion above. But I am thinking (for now) that there may not be a clear rule in the use of evidence and theory for discussing international development knowledge, and I am satisfied (for now) with looking for the reasonable use of theory, evidence, skepticism and caution in thinking of development policy and practice. I am sure I will come back to this discussion at a later date.

References:

Anderson, Chris. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. WIRED. June. Available: https://www.wired.com/2008/06/pb-theory/. Accessed: October 30, 2021.

Boland, Lawrence. 2006. Seven decades of economic methodology: a Popperian perspective. In: Karl Popper: a Centenary Assessment: Science, I. Jarvie, K. Milford and D. Miller (Eds), 2006, 219–27. Available: http://www.sfu.ca/~boland/wien02.pdf. Accessed: October 30, 2021

Durlauf, Steven. 2012. Complexity, Economics, and Public Policy. Politics, Philosophy & Economics 11(1) 45–75. Sage. Available: http://home.uchicago.edu/sdurlauf/includes/pdf/Durlauf%20-%20Complexity%20Economics%20and%20Public%20Policy.pdf. Accessed: October 30, 2021.

Hausman, Daniel. 2018. Philosophy of Economics. Stanford Encyclopedia of Philosophy. Available: https://plato.stanford.edu/entries/economics/#RhetEcon. Accessed: October 30, 2021

Mazzocchi, Fulvio. 2015. Could Big Data be the end of theory in science? A few remarks on the epistemology of data-driven science. EMBO reports. EMBO Press. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4766450/pdf/EMBR-16-1250.pdf. Accessed: October 30, 2021. 

Mlodinow, Leonard. 2009. The Drunkard’s Walk. How Randomness Rules Our Lives. New York: Vintage Books. A Division of Random House.

Russell, Bertrand. 1962 (first copyrighted in 1931). The Scientific Outlook. The Norton Library. W.W. Norton & Company.

Singham, Mano. 2020. The Idea That a Scientific Theory Can Be ‘Falsified’ Is a Myth. It’s time we abandoned the notion. In Scientific American, September 2020. Available: https://www.scientificamerican.com/article/the-idea-that-a-scientific-theory-can-be-falsified-is-a-myth/. Accessed: October 30, 2021

Summers, Larry. 1991. The Scientific Illusion in Empirical Macroeconomics. The Scandinavian Journal of Economics. Vol. 93, No. 2, Proceedings of a Conference on New Approaches to Empirical Macroeconomics (Jun., 1991), pp. 129-148 (20 pages). Wiley. Available: http://faculty.econ.ucdavis.edu/faculty/kdsalyer/LECTURES/Ecn200e/summers_illusion.pdf. Accessed: October 30, 2021.

Wikipedia contributors. 2021. Scientific Method. Available: https://en.wikipedia.org/wiki/Scientific_method. Accessed: October 30, 2021


Challenges in Exploratory Data Analysis

  • Post category:Fan
Image by bluebudgie. Downloaded from pixabay.com

You are given a dataset and asked: “What do the data tell us? Do not assume we know anything about the subject; just tell us what the data say.” This task is often referred to as “exploratory data analysis,” and it is harder than it might seem. I see two main challenges.

The first is the request to “not assume we know anything about the subject.” It is easy to violate this request without realizing it. For example, say you have a dataset with twenty variables. During exploratory analysis it is perfectly fine to look not just at individual variables but also at how variables fluctuate relative to each other, that is, at correlation. Now, how easy is it to look at correlations within the dataset with no prior inclination to think that some of the twenty variables are more likely to be correlated than others? We can fight the urge to pay more attention to those by always including all twenty variables in any consideration of correlation, but this requires discipline. One could even argue that we should, indeed, spend more time exploring correlations that we have reason to believe reflect a causal connection, and that focusing equally on other correlations is a waste of time and possibly misleading. In any case, how to explore data given the mental models we all bring to them is an issue that has to be dealt with. I will likely return to this in a future post.
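One way to practice that discipline is to compute every pairwise correlation mechanically, so that no pair is left out because we did not expect it to matter. The following is only a minimal sketch: the variable names and numbers are made up for illustration, and it uses three variables rather than twenty to keep it short.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def all_pairwise_correlations(data):
    """Return {(name_a, name_b): r} for every pair of variables, none skipped."""
    return {
        (a, b): pearson(data[a], data[b])
        for a, b in combinations(sorted(data), 2)
    }


# Hypothetical toy dataset (made-up numbers, three variables instead of twenty).
data = {
    "age": [23, 35, 47, 51, 62],
    "income": [30, 42, 58, 61, 75],
    "shoe_size": [42, 38, 44, 40, 41],
}

correlations = all_pairwise_correlations(data)
# List every pair, strongest absolute correlation first.
for pair, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(pair, round(r, 3))
```

Listing all pairs does not remove our prior expectations, but it does make it harder to quietly ignore the pairs we were not curious about.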

The second challenge I see in exploratory data analysis is identifying, and keeping in mind at all times, the sources of uncertainty in our data. There are several. They range from what we do not know about how the variables were chosen and how the data were collected, cleaned, stored and checked, to whether we are, consciously or not, asking questions not about the dataset itself but about the underlying generating process, that is, about a population of which the dataset can be considered a sample.

I find this last point is often overlooked. In some cases, we know that we are looking at a sample and asking questions about a population. For example, survey data are often clearly drawn from a broader population in which we are interested. This is the classic use of inferential statistics that we all learn about in college – although, even in this case, we often see analyses focusing on point estimates rather than the more appropriate confidence intervals. But there are cases where we lose track of the sources of uncertainty in our data (or in our analysis) and must maintain discipline to correctly assess what our analysis is actually telling us.
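To illustrate the difference between a point estimate and a confidence interval, here is a small sketch. The survey responses are made up, and the interval uses a normal approximation, which is an assumption made for simplicity; with a sample this small, a t-distribution would usually be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(sample, confidence=0.95):
    """Return (point_estimate, lower, upper) using a normal approximation."""
    n = len(sample)
    estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate, estimate - z * standard_error, estimate + z * standard_error


# Hypothetical survey responses (made-up numbers).
sample = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.6, 4.9, 5.2]
estimate, lower, upper = mean_with_ci(sample)
print(f"point estimate: {estimate:.2f}, 95% interval: [{lower:.2f}, {upper:.2f}]")
```

Reporting only the first number hides how much the estimate could move if a different sample had been drawn from the same population.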

For example, say we have data for five characteristics (five variables) for every inhabitant of a community. We are only interested in that community, so we understand we have “population” data (not a sample). In looking at correlation between our five variables, we decide to look at linear correlation among them through a linear regression. Our statistical software spits out a summary of results from our linear regression that includes coefficients and p-values for those coefficients. But p-values assume a distribution for the observed coefficients. If there is a distribution, there is a source of uncertainty (a random variable). Where did that uncertainty come from? Aren’t we looking at population data and, therefore, what we see is all there is to know?

My answer is that the uncertainty stems from assuming there is a linear relationship among the variables when what we observe does not perfectly fit that linear relationship. There is, therefore, an “error” term associated with each observation: the difference between the observed value and the value predicted by the fitted line. The whole linear regression exercise is asking questions about an assumed underlying generating process, not about the observed data themselves. We started making assumptions about the data and asking questions about an underlying process, very possibly without noticing.
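A small sketch may make this concrete. The six-person “community” and its numbers are hypothetical, and the code fits a single predictor by ordinary least squares rather than reproducing any particular software's output. Even though the data cover everyone, the residuals are not zero, and it is their spread that the standard error of the slope (and hence any p-value) is built on.

```python
from math import sqrt


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return fit details."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The "error" term: observed value minus the value on the fitted line.
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    residual_variance = sum(e ** 2 for e in residuals) / (n - 2)
    slope_se = sqrt(residual_variance / sxx)
    return slope, intercept, residuals, slope_se


# Hypothetical "population": every inhabitant of a six-person community.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept, residuals, slope_se = simple_ols(x, y)
print("slope:", round(slope, 3), "standard error:", round(slope_se, 3))
print("residuals:", [round(e, 3) for e in residuals])
```

The standard error only has meaning if we treat the residuals as draws from some random process, which is exactly the step beyond the observed data described above.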

So here are my tentative initial guidelines for doing exploratory data analysis:

  1. Start by understanding the data: the publishing source; when and where the data were collected and who collected them; what universe they are supposed to represent and whether they were intended as a sample of a larger population; definitions, that is, whether the variables are well defined; and what errors may have been introduced during transmission, cleaning, storing or other manipulation.
  2. Go on to univariate analyses and then cover correlations, being mindful of any assumptions we may be making. If we feel we absolutely need to make these assumptions, we should be explicit about them and keep them in mind when drawing conclusions.
  3. Keep in mind at all times whether our questions focus on the data at hand or on an underlying generating process, i.e., whether we are “going beyond the data.” Again, be explicit if doing so.
  4. Be aware that exploratory analysis is supposed to focus on extracting inspiration from our data. It is not sufficient to draw conclusions. These require a separate step: testing hypotheses with a second set of data that can be assumed to come from the same generating process (from the same population). We do not test hypotheses during exploratory data analysis, nor do we discuss causality and modelling, other than possibly as suggestions for the next step of hypothesis testing.
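One possible way to keep exploration and testing separate, sketched below, is to set aside part of the data before looking at it. This is an assumption on my part about how to obtain the “second set of data”: a random hold-out only qualifies if the records can reasonably be treated as coming from the same generating process. The function name, the split fraction and the record identifiers are all hypothetical.

```python
import random


def split_for_exploration(rows, holdout_fraction=0.5, seed=0):
    """Shuffle a copy of rows and split it into (explore, confirm) lists."""
    shuffled = list(rows)
    # A fixed seed makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # hypothetical record identifiers
explore, confirm = split_for_exploration(rows)
print(len(explore), len(confirm))
```

The `explore` portion is used for guidelines 1 to 3, while `confirm` stays untouched until there are hypotheses to test.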

Defining Data

  • Post category:Fan
Image by Alex Uriarte

A few weeks ago I watched a few of Crash Course’s Data Literacy e-learning videos on YouTube (Arizona State University and Crash Course 2020). Its first episode defines “data” as “specific information we collect to make decisions.” This is a different definition from others I have heard, and it has some interesting aspects. Under this definition:

  • Data would be a subset of information. That is, all data would be information but not all information data.
  • It uses collection and decision making to define what information is data and what is not.

Other definitions are very different.

A common distinction between data and information is that found in the so-called DIKW pyramid or similar representations. DIKW stands for Data – Information – Knowledge – Wisdom, and usually suggests a hierarchy where data is the broader concept that is then filtered as information, which is in turn filtered as knowledge and finally as wisdom. This seems to be commonly used in the knowledge management community and is often attributed to an article by Russell Lincoln Ackoff in the Journal of Applied Systems Analysis in 1989 (e.g., see Bernstein 2009).

Under this representation, data are often interpreted as facts, noise or signals. There are many criticisms of this representation, from whether “filtering” is actually a good way of thinking about the connections between these concepts, to proposed changes to the pyramid, to which is actually the broader concept, data or information (for just a few examples from a relatively large literature, see Weinberger 2010; Tuomi 1999; and Dammann 2018).

Yet a third way of thinking about data is the definition contained in US law. US federal statutes define data as “recorded information, regardless of form or the media on which the data is recorded” (44 U.S. Code § 3502). The definition is less innocuous than it may seem at first. Recording information is in good part what distinguishes our handling of information from that of cultures that rely (or relied) on word-of-mouth transmission, with the potential loss of content associated with it. Think of the telephone game that kids play: one child whispers a sentence in another’s ear, who then whispers it to the next, and so on until the last child says out loud what he or she understood, often to find that the sentence arrived at the end of the chain completely altered or distorted. Under this definition, however, information is the broader concept.

The table below summarizes the three different definitions of “data.”

|  | ASU and Crash Course 2020 | U.S. Federal Statutes | DIKW pyramid |
|---|---|---|---|
| Definition or understanding | Specific information we collect to make decisions | Recorded information, regardless of form or the media on which the data is recorded | Facts, noise, signals |
| Highlight | Data has a purpose: decision-making | Data must be recorded | Data as facts, no specific purpose or characteristic |
| Data relative to information | Information > Data | Information > Data | Data > Information |

I do not find the last row – showing the relation between data and information – particularly useful in understanding data: it is a result of how we define not just data but also information, and may be more useful for discussions focused on knowledge. I include it in the table only for the sake of comparison and may explore it in other posts. I do find the “highlight” of each definition useful in thinking about data and about how to manage and use them:

  • Data should reflect facts. Whether they do or not depends on how they were collected and managed. This is important to keep in mind in discussions about data collection, data curation and trusted repositories.
  • Data should be recorded. This reinforces the importance of data curation, and particularly of metadata, in enabling us to understand what “facts” the data actually capture.
  • Data may be used for decision making. Hence, it is important to keep in mind the many considerations around data bias, completeness, presentation and interpretation.

In this blog, I will use the highlight of each of the three definitions to discuss data.

References

44 U.S. Code § 3502. Legal Information Institute. Cornell Law School. Available: https://www.law.cornell.edu/uscode/text/44/3502#:~:text=(A)%20means%20the%20obtaining%2C,or%20format%2C%20calling%20for%20either%E2%80%94. Accessed: November 14, 2020.

Arizona State University and Crash Course. 2020. Study Hall: Data Literacy. Available: https://www.youtube.com/watch?v=0H8awA3GBPg&list=PLNrrxHpJhC8m_ifiOWl1hquDmdgvcviOt&index=14. Accessed: November 27, 2020.

Bernstein, J. H. 2009. The Data-Information-Knowledge-Wisdom Hierarchy and its Antithesis. In: Proceedings from North American Symposium on Knowledge Organization. Vol. 2. Available: https://journals.lib.washington.edu/index.php/nasko/article/viewFile/12806/11288. Accessed: November 27, 2020.

Dammann, Olaf. 2018. Data, Information, Evidence, and Knowledge: A Proposal for Health Informatics and Data Science. In: Online Journal of Public Health Informatics 10(3):e224. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6435353/pdf/ojphi-10-e224.pdf. Accessed: November 27, 2020.

Tuomi, Ilkka. 1999. Data is more than knowledge: Implications of the reversed knowledge hierarchy for knowledge management and organizational memory. In: Journal of Management Information Systems 16(3):103-117. Available: https://www.researchgate.net/publication/328803142_Data_is_more_than_knowledge_Implications_of_the_reversed_knowledge_hierarchy_for_knowledge_management_and_organizational_memory. Accessed: November 27, 2020.

Weinberger, David. 2010. The Problem with the Data-Information-Knowledge-Wisdom Hierarchy. In: Harvard Business Review. Available: https://hbr.org/2010/02/data-is-to-info-as-info-is-not. Accessed: November 27, 2020.

