Notes on The Book of Why

Image by Nik. Downloaded from unsplash.com

I read (most of) The Book of Why by Judea Pearl. I read the first four chapters carefully, but the rest of the book less so, after I thought I understood its main arguments and conclusions. I may come back to it, particularly if I think I missed key aspects, but I will summarize here my current understanding and impressions.

For most of my reading I was somewhat put off by the constant claim of how revolutionary and new the content of the book was, when much of it seemed to simply reflect standard scientific methodology. At first I thought that the claim to a revolutionary new science (“Causal Inference”) perhaps reflected Pearl’s own trajectory, stemming from a computer science background and dialoguing routinely with statisticians, and that perhaps he did not have a strong science background. But that is not the case: I understand he has actually made contributions to physics.

Where I finally saw a rupture with traditional academic practice was when I got to chapter 4 and his discussion of the back-door criterion for selecting variables to control for when assessing models applied to observational data. Social scientists (to exemplify with the academic practice I am most familiar with), when trying to measure the potential impact of a variable X on Y, will typically “control for” any other variable that may also be impacting Y, under the argument that this will isolate the effect of X and, therefore, avoid capturing a “spurious” correlation between X and Y, given that we are dealing with observational data and not a controlled experiment (at least in the typical case). Here is where Pearl’s contribution of insisting that modelers make explicit the causal diagram (the theoretical framework) they have in mind, and not control for variables that actually capture an indirect effect of X, became embarrassingly clear to me. I say embarrassingly both because I am probably guilty of controlling for variables I shouldn’t have in the past and because I realize that what he is saying should be somewhat obvious…and yet that has not always been the case.

If I can summarize his main arguments, they seem to be that:

First, P(Y|X) ≠ P(Y|do(X)). In other words, if P(Y|do(X)) reflects the isolated causal effect of X on Y that we would observe in a controlled experiment (like a randomized controlled trial), the P(Y|X) that we capture in observational data is something completely different, for at least two reasons: a) it captures the effects of other confounding variables; b) it actually tells us, in itself, nothing about causality. It is the old “correlation does not mean causation” argument. Up to this point, his argument seems to me nothing new. He claims, however, that the language is important to allow us to talk about causality: that the “do(X)” needs to be introduced into our lexicon. Fine.

Second, causal diagrams are key for our proper modeling of causal relationships to be tested by observed data. Consider the model below (Figure 1, not in the book; I made this up). Assume the average success of, say, high school basketball players in dunking the ball is modeled as a function of the average jumping height of each player, and that this jumping height, in turn, is a function of the average player height and, say, the frequency and intensity of jumping practice on the team. A typical econometric test for the effect of jumping practice on the average dunking success of the players might control for player height to isolate the effect of jumping practice. Pearl’s argument would be that there is no need to control for player height, and the way to see this is that not controlling simulates the situation we would have in an RCT, where other factors are assumed away because, on average, they should be the same across the treatment and control populations, assuming these are large enough. That said, in this case my understanding is that, if we do control for player height, no harm would be done and the result should be the same, assuming the model is correct.

Now assume the true model is the one in Figure 2, perhaps because height affects players’ expectations from benefiting from practice. Now we do need to control for player height to establish the effect of practice alone on average dunking success. Otherwise, we may be capturing in part the effect of player height, when only looking at the two variables of jumping practice and dunking success.

Let’s look at yet one other model (Figure 3), where jumping practice and average jumping height signal the likelihood of any high school basketball player also competing in the high jump event on the track and field team. Pearl would warn us against controlling for observed participation in the high jump event, because this would detract from the measured effect of jumping practice (“explain-away effect”).

More generally, Pearl proposes the following rules when deciding what to control or not for, to be able to mimic the effect of RCTs in observational data (Pearl and Mackenzie (2018), pgs 157-158):

a) In a chain junction, A→B→C, controlling for B prevents information about A from getting to C or vice versa;

b) Likewise, in a fork or confounding junction A←B→C, controlling for B prevents information about A from getting to C or vice versa;

c) Finally, in a collider, A→B←C, exactly the opposite rule holds. The variables A and C start out independent, so that information about A tells you nothing about C. But if you control for B, then information starts flowing through the “pipe,” due to the explain-away effect.

[…]

d) Controlling for descendants (or proxies) of a variable is like “partially” controlling for the variable itself.

The idea is that we would not want to control for mediators (B in item “a” above), colliders (B in item “c” above), or proxies (B in item “d” above) but we do want to control for confounders (B in item “b” above), all represented in Figure 4 below.

Pearl actually discusses controlling more in terms of paths rather than variables, and calls “back-door adjustment” ensuring that any path connecting the variables X and Y that is not the causal path we wish to test for is blocked (by appropriate controls), and that there are no blockers in the path that we do want to test for.
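
These junction rules can be checked numerically. Below is a small simulation (not from the book; the variable names and linear model are my own invention) showing that, in a fork, controlling for the confounder removes a spurious association, while in a collider, controlling for the common effect creates one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def coef(y, x, *controls):
    """OLS coefficient on x, optionally controlling for other variables."""
    X = np.column_stack([np.ones(n), x, *controls])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Fork (confounder): A <- B -> C. A has no causal effect on C.
B = rng.normal(size=n)
A = B + rng.normal(size=n)
C = B + rng.normal(size=n)
print(coef(C, A))      # ~0.5: spurious association via B
print(coef(C, A, B))   # ~0.0: controlling for the confounder removes it

# Collider: A -> B <- C. A and C start out independent.
A2, C2 = rng.normal(size=n), rng.normal(size=n)
B2 = A2 + C2 + rng.normal(size=n)
print(coef(C2, A2))        # ~0.0: no association
print(coef(C2, A2, B2))    # ~-0.5: controlling for the collider creates one
```

The last line is the explain-away effect in action: once we condition on B, learning about A changes what we infer about C, even though the two are causally unrelated.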

As Pearl went on discussing RCTs, instrumental variables, and observational studies controlling for confounding variables, I grew increasingly intrigued by what I might be missing that was so revolutionary:

  • Perhaps it is the emphasis on the diagrams, which I do find useful, although I still have trouble thinking of them as revolutionary, but rather as a way to bring clarity to our models. I would have benefited from doing this, for example, during my graduate studies where, yes, my tendency was to control for everything under the sun without clearly realizing the implications.
  • Perhaps it is the do-calculus, and I may need to find more examples where it is used to see how this generates responses that otherwise we would not have.

What I was not able to find, however – and it may be just me missing it – was some discussion of how we can use observed data not just to reject assumed causal relationships, but to help us better define our causal models. Most of the book “thinks” in a very traditional scientific way: from model to data. There is some discussion towards the end of how data mining can help direct our focus to certain correlations (and, thus, potential causal connections to be investigated). We also know there are things we can do to at least help inform how robust our models are to the assumptions we make, such as sensitivity analysis, which he touches on very briefly in chapter 5. It also seems to me that Bayesian networks, described and discussed in the book, should be useful in feeding into the reverse discussion, from evidence to models. However, Pearl seems to go out of his way to constantly make the point that the data themselves say nothing about causality, without discussing where, then, our causal reasoning comes from. It seems we are simply wired for it or, as he discusses, it comes from our imagination. Perhaps it was just my misplaced expectation that this book would explore this further.

In the future I hope to further explore some of the aspects of this book and Pearl’s thinking that I have not adequately covered here: particularly Bayesian Networks, the causal ladder and the role of imagination.

Sources:

Pearl, Judea and Dana Mackenzie. 2018. The Book of Why: The New Science of Cause and Effect. New York: Basic Books.


Customizing LLMs – Part 2: an Experiment

Image by Samuel Ijimakin. Downloaded from Pixabay

[For the data and Python code used for this blog post, please visit my GitHub repository https://github.com/EngelbergHuller/Customizing-LLMs-Experiment/tree/main]

As mentioned in Part 1 of this two-part series (the first part is on my FAN page and titled Customizing LLMs – Part 1: the Concept), I wanted to better understand our capacity to customize LLMs and decided to explore a bit the concept and then experiment a bit myself with building a RAG system on my laptop. I am not a developer, so I “vibe coded” with ChatGPT 4o to set up the RAG system, and then compared the results with those I obtained by simply uploading a set of documents to ChatGPT. Because my RAG system connected with the OpenAI LLM, the thought was that it would help me understand those components that are particular to RAG and not related to the LLM it is based on. I describe this experiment here.

A Bit More Background and Outline

I have the ChatGPT personal plus plan where I pay $20 a month. This plan allows me to upload a limited number of documents to ChatGPT and ask it to analyze them. If I understand correctly, the limits are up to about 50 MB with no file being larger than about 20 MB. If more needs to be analyzed concurrently, ChatGPT will offer to analyze them in batches and then compare the batches for a broader summary. I understand that when ChatGPT analyzes documents uploaded to its projects, it does so following a RAG type pipeline, but using its specific tools.

Below I:

  1. Describe the specifications of the RAG system on my laptop relative to those that ChatGPT tells me it uses when analyzing documents uploaded to it
  2. Describe the documents I used as input to both the laptop RAG system and ChatGPT
  3. Describe the questions (prompts) I used and the relative responses I received from each of the two systems
  4. Try to understand how the different responses resulted from different system specifications

Laptop RAG and ChatGPT Specifications

I used PyCharm as an IDE (Integrated Development Environment), created an account with OpenAI to access their LLM, and then asked ChatGPT 4o to walk me through the process of creating a RAG system on my laptop. The system on my laptop uses:

  • A customized script for chunking the PDFs:
    • It defined a maximum number of words per chunk (800), not tokens
    • It defined an overlap of 100 words between adjacent chunks (for continuity)
  • The OpenAI embedding model “text-embedding-3-small”
  • The FAISS library of algorithms for indexing and searching vectors by similarity
  • The OpenAI LLM ChatGPT 4o
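
As an illustration, a word-based chunker matching those settings (800-word maximum, 100-word overlap) might look like the sketch below. This is a reconstruction for illustration, not the actual script:

```python
def chunk_words(text, max_words=800, overlap=100):
    """Split text into overlapping word-based chunks.

    max_words and overlap mirror the settings described above;
    the overlap preserves continuity across chunk boundaries.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the final chunk already covers the rest of the text
    return chunks

# A 2,000-word document yields three chunks, each sharing 100 words
# with its neighbor.
doc = " ".join(str(i) for i in range(2000))
print(len(chunk_words(doc)))  # 3
```

A token-based chunker (as ChatGPT reports using) would follow the same logic but count tokens via a tokenizer instead of splitting on whitespace.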

In comparison, when documents are uploaded to ChatGPT 4o, it tells me it uses:

  • A chunking pipeline with:
    • Chunk sizes of ~500–700 tokens (~400–600 words)
    • Overlap of ~50–100 tokens
    • Splitting by semantic logic, often prioritizing punctuation
  • An embedding model “similar in capability to”:
    • text-embedding-3-large or
    • text-embedding-ada-002 (depending on optimization and routing)
  • “an internal vector index that behaves similarly to FAISS, but is not FAISS itself” 
  • Also OpenAI LLM ChatGPT 4o

Input

I used as input 20 PDF documents containing old newspaper editorials that I downloaded from a ProQuest database (accessed through my local public library) and that resulted from a search for the keywords “foreign aid,” “foreign assistance,” or USAID. I won’t discuss the more general document search because, for the purposes of this experiment with RAG, the 20-document set is the universe of interest. It is relevant, however, that these are scanned images of old newspaper editorials because, as I discuss further below, one of the issues I encountered seemed to stem from the quality of the Optical Character Recognition (OCR) software that I was able to use.

Q&A

Having uploaded the 20 pdfs to both ChatGPT’s interface and to the RAG system on my laptop, I asked the same three questions to both:

  • In one short paragraph, tell me what these documents are about
  • Are all articles critical of foreign aid or do any of them praise or defend it?
  • Please provide a table with these 20 articles categorized by stance, date, and key quotes

One advantage of the ChatGPT tool that became clear upfront was that it followed its responses with meaningful questions or suggestions, in a way that my RAG system did not. The last bullet above I added at the suggestion of ChatGPT.

I also modified the laptop RAG a couple of times as I noticed some of the limitations in the results I was obtaining:

First, as I collected the responses from my custom RAG system, a sentence in one of its responses made clear that it was not accessing the entire content of the PDF documents but seemed to be relying mostly on the title and perhaps some other metadata. It became clear that, because the PDFs were mostly scans of newspapers, the RAG needed to use Optical Character Recognition (OCR) and was not doing so, instead relying on whatever it could read as text. I had to install two new pieces of software, add them to my Windows environment, and ensure PyCharm was accessing them: Tesseract and Poppler, two open-source tools that work together to enable OCR.

After rebuilding the RAG system with OCR, the results were better, but I noticed the responses did not seem to be making use of all 20 documents. It turns out that the FAISS tool was retrieving information from chunks using an 8-nearest-neighbor criterion and ignoring other information. So I expanded that to 20. ChatGPT warned me that by doing so I could decrease the relevance of the response; yet comprehensive coverage was what I was looking for.
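
The effect of that k parameter can be illustrated with a plain-numpy stand-in for the FAISS nearest-neighbor search (the actual system uses FAISS; the embedding dimensions and chunk counts here are made up):

```python
import numpy as np

def retrieve(query_vec, chunk_vecs, k=8):
    """Indices of the k chunks most similar to the query (cosine similarity).

    A plain-numpy stand-in for a FAISS similarity search: normalize,
    take dot products, and return the top-k indices.
    """
    q = query_vec / np.linalg.norm(query_vec)
    M = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    return np.argsort(-(M @ q))[:k]

# 40 hypothetical chunk embeddings of dimension 16
rng = np.random.default_rng(42)
chunks = rng.normal(size=(40, 16))
query = rng.normal(size=16)

top8 = retrieve(query, chunks, k=8)    # narrow context: higher relevance
top20 = retrieve(query, chunks, k=20)  # wider context: better coverage
print(len(top8), len(top20))           # 8 20
```

With 20 documents chunked into many pieces, k=8 can easily leave whole documents out of the context passed to the LLM; raising k trades some retrieval precision for coverage, which is exactly the trade-off ChatGPT warned about.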

Below I copy the responses obtained by the two approaches (I reformatted the responses considerably for presentation but did not modify the content).

Prompt 1. In one short paragraph, tell me what these documents are about

Prompt 2. Are all articles critical of foreign aid or do any of them praise or defend it?

Prompt 3. Please provide a table with these 20 articles categorized by stance, date, and key quotes

ChatGPT 4o

Custom RAG – OCR K=20

Comments on the responses to questions

On the responses to question 1. Unlike ChatGPT, the laptop RAG did not seem able to identify the Washington Post as one of the newspapers from which editorials were sourced. Three of the 20 articles were from the Washington Post; the other 17 were from the New York Times. Since the RAG system reports that it “falls back on OCR” when the text content of a chunk is low, I thought that perhaps it no longer included content from that chunk that was in text format. But I asked the program, and that seems not to be the case: it refers to “references from the context” as references extracted in text format, and stated that it used both content extracted with OCR and content in text format when responding to questions.

The laptop RAG also did not identify foreign aid and foreign assistance as the central theme of the 20 documents as clearly as ChatGPT did. After further examining the documents, I found that two of the 20 are full pages from the Washington Post containing several editorials each, with only one editorial per page referring to foreign aid. It seems the laptop RAG system took the other editorials in those two documents into account much more than ChatGPT did when providing an overall summary.

On the responses to question 2. The ChatGPT answers provide a sentence summarizing the articles with a common stance (e.g. critical, favorable…) and then provide examples. The laptop RAG system stays focused on the individual chunks, providing only short summary sentences at the beginning and end. Also, the ChatGPT answers refer to specific documents when providing examples (numbers in brackets), while the laptop RAG system refers to chunks.

On the responses to question 3. The ChatGPT answers responded to the request to categorize the articles by “stance” by defining four buckets in which it divided all 20 articles (supportive, critical, mixed/reformist, and neutral/analytical). The laptop RAG system responded with individualized “stances” for each article.

The laptop RAG system did not interpret each document as being one “article” as mentioned in the prompt. I believe this was likely an issue with my prompt. As mentioned, 2 of the 20 documents had more than one editorial in them. But we may also need to think about how to best customize the retrieval of information. The FAISS library works to retrieve information by clustering. When it was set to retrieve the 8 nearest-neighbor chunks, it produced a list of 12 articles in response to question 3. When I expanded this to the 20 nearest-neighbor chunks, it retrieved 19.

Where the responses became most troubling to me was when I noticed that neither ChatGPT nor the laptop RAG correctly identified the titles of all the articles. The laptop RAG actually performed better here than ChatGPT, getting 13 titles correct, while ChatGPT only got 9. As previously mentioned, 2 of the 20 documents contained more than one editorial, which would have contributed to the difficulty of selecting one title for the document, and this seems to be reflected in some of the titles offered by the laptop RAG system. But the titles offered by ChatGPT were particularly bewildering, and I could not figure out where they came from.

ChatGPT also seemed to have sometimes identified the dates incorrectly, while the laptop RAG sometimes provided NA when it was not able to identify the date. For example, ChatGPT listed five documents as being from 1972 when only two of the documents are from that year.

So what do I draw from the experiment?

First, I would not currently feel comfortable relying on either ChatGPT or this initial laptop RAG system to provide me with good information about scanned pages of newspaper articles. That said, given the responses to question 3, I am left with the impression that what may have seemed to be better responses by ChatGPT to questions 1 and 2 may actually reflect a greater inclination of ChatGPT to “fill in gaps” with made-up information compared to the laptop RAG system.

In addition, given that there seem to be several ways to improve on this laptop RAG (based on my discussion about it with ChatGPT), the laptop RAG may actually be a more promising avenue to obtain more reliable information from such types of documents. 

Second, based on information provided by ChatGPT itself, the two systems differed in the chunking approach taken (words vs. tokens), the embedding models used (even though both were OpenAI embedding models), and the indexing method used for search and retrieval. Because the laptop RAG system is transparent and customizable in each of these elements in ways that directly using ChatGPT does not seem to be, it should allow room for improvement.

Third, one of the main reasons to look into a customized RAG system (for my purposes) is the existence of limitations in directly using an LLM like ChatGPT to analyze large amounts of scanned documents. However, in further exploring a customized RAG system for this purpose it seems like some effort should go into:

  1. Further scrutinizing the types of documents that will be well interpreted by the RAG system and those that may not and finding ways to, perhaps, exclude documents that would not be well read. For example, how can we better deal with documents that scan entire newspaper pages where only one of the articles on the page is of interest?
  2. Further looking into how well OCR is working and how well the RAG system is capturing information from OCR in tandem with information in text format
A final note on prompt engineering: I seem not to have paid sufficient attention to it in Part 1 of this two-post series, nor in this exercise. Given the limitations of both ChatGPT and the laptop RAG system, it is possible that I would have obtained better results just by more clearly specifying to the systems the output I was looking for.
 
Oh, and on the OpenAI cost of the laptop RAG system exercise: $0.15
 

Sources

OpenAI. ChatGPT 4o, accessed October 2025

ProQuest access to The New York Times and Washington Post historical editions. Accessed through Fairfax County Public Libraries, October 2025 


Customizing LLMs – Part 1: the Concept

Image by Samuel Ijimakin. Downloaded from Pixabay

I wanted to better understand our capacity to customize Large Language Models (LLMs). By “our” I mean us, users. I will register my current understanding in two parts:

  • Part 1 (this post) describes my understanding of how LLMs work and dives a bit into Retrieval Augmented Generation (RAG).
  • Part 2 (see post on my Engleberg Huller page) describes my experiment with building a RAG system on my laptop computer (and, yes, I am not a developer, just a user, so check it out).

How LLMs work

LLMs are neural network AI models that are developed to process, understand and generate human language based on a large number of parameters. They have two main components:

  1. The first – and the fundamental breakthrough in the development of LLMs – is a transformer. A transformer is a form of neural network architecture that creates vectors of parameters representing not just a token (a unit of analysis in neural networks, like a word or part of a word, or a pixel if we were thinking of images) but also the context in which that token lies (for example, the position of a word in a phrase, the syntax of the phrase, and the word’s semantics). This consideration of context is referred to as “self-attention.” The final vectors generated by transformers are referred to as “final embeddings” (although it seems like sometimes the term “embedding” is used just for the vectors representing the tokens and not their context)
  2. The second component is a task specific model that takes the final embeddings and generates outputs, such as the predicted next word in a sentence.
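
The self-attention idea at the heart of the transformer can be sketched in a few lines of numpy. This is a single attention head with made-up random weight matrices, ignoring multi-head structure, positional encodings, and everything else a real transformer layer includes:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """One head of scaled dot-product self-attention.

    X: (n_tokens, d) token embeddings. Returns embeddings of the same
    shape, where each row now also reflects the other tokens (context).
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # each token scores every other token
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)        # softmax: rows are attention weights
    return w @ V                              # mix token values by attention

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                   # 5 tokens, 8 dimensions each
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                              # (5, 8)
```

The output has the same shape as the input, but each token's vector has been updated with a weighted mix of the other tokens — which is exactly the "context" ingredient described above.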

Each of the two components above is developed through training of optimization models: the first component requires training a model to find the best vector of parameters (final embedding) that represents a token and its context, and the second component requires training a model to find the best output given a collection or sequence of embeddings.

An important detail, however, is that these components and trainings are not developed in sequence but, rather, at the same time, using feedback loops (backpropagation). Below I reproduce a diagram that ChatGPT made for me (with some alterations; it seems ChatGPT is not yet very good at making these diagrams).

Some aspects that may influence the power of a transformer include the number of dimensions (parameters – usually in the billions) considered in embeddings and the number of tokens that a transformer can consider simultaneously (the size of the context window).

Customizing LLMs

I understand there are two main ways of customizing LLMs: Fine Tuning and Retrieval Augmented Generation (RAG). Before exploring those, however, a few words on “prompt engineering.”

Prompt engineering is also seen as a way to customize the results obtained from using an LLM. You can find lots of recommendations online and do full courses on it. I do not want to diminish the attention this seems to get, but it essentially means learning how to interact with LLMs to obtain the best possible answers to your questions. From my experience, the most useful asset in doing so is your own knowledge about the subject you are exploring. This allows you to pursue your questions in detail, inducing the LLM to refine its answers. I often do more traditional Google research on a subject before interacting with ChatGPT, so I better know what to ask and how to phrase my questions.

Fine-Tuning

Fine-tuning is an approach to customizing an LLM that consists of altering the collection of parameter values (weights) that the LLM uses in responding to a question (prompt), to improve its performance for specific purposes. Traditionally, it is considered expensive because LLMs can have billions of parameters, and “retraining” them would typically be out of reach for all but the companies that own the LLMs (or their transformers). But there are approaches that avoid having to “retrain” an LLM.

A common approach is Parameter-Efficient Fine-Tuning (PEFT). This approach makes use of adapters: small neural network modules that are trained for a specific purpose and then typically added to an LLM without needing to touch the remaining parameter values. One approach in particular, known as Low-Rank Adaptation (LoRA), seems to have substantially reduced the cost of fine-tuning in projects where it was used, by relying on low-rank matrices (with a limited number of parameters).
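
The LoRA idea can be shown with a toy parameter count. The layer width and rank below are hypothetical, and the matrices are placeholders rather than trained weights; the point is only how few parameters the adapter adds:

```python
import numpy as np

d, r = 4096, 8                  # hypothetical layer width and LoRA rank
W = np.zeros((d, d))            # stand-in for the frozen pretrained weight matrix
A = np.random.default_rng(0).normal(size=(d, r)) * 0.01
B = np.zeros((r, d))            # B starts at zero, so the adapter is initially a no-op

def adapted(x):
    # forward pass: frozen weights plus the low-rank update (x @ A) @ B;
    # only A and B would be trained during fine-tuning
    return x @ W + (x @ A) @ B

full_params = d * d             # parameters in the full weight matrix
lora_params = 2 * d * r         # parameters actually trained by LoRA
print(lora_params / full_params)   # 0.00390625: well under half a percent
```

The d-by-d update matrix is factored as A (d-by-r) times B (r-by-d), so the number of trainable parameters drops from d² to 2dr, which is where the cost savings come from.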

One limitation of this approach seems to be that, once “fine-tuned,” the resulting set of parameter values is applied to every interaction with that LLM. This may generate better results for the specific case the LLM was fine-tuned for, but worse results for other use cases. The circumstances in which it is worthwhile for an organization to invest in fine-tuning an LLM would then need to be considered carefully.

Retrieval Augmented Generation (RAG)

RAG consists of submitting to an LLM a specific set of information that you ask the LLM to consider when responding to your prompts. To do so, you need to build a pipeline that ingests the additional information, breaks it down into small parts (chunks), and transforms those into embeddings (vectors) in a way that aligns with how your LLM uses embeddings. The system will then use the embedding of your prompt to search for similar embeddings of the (chunked) additional information, and feed all of that to the LLM to generate a response. The diagram below illustrates how RAG works, and I discuss the steps in red further below.

1. Chunking – this step breaks the added information down into segments called chunks, so that the number of tokens per segment fits within the limits of the embedding model’s and the LLM’s context windows. The chunks still need to retain semantic meaning, however, and there is discussion out there around the optimal size of chunks, which may depend on each situation. A general ballpark common in discussions seems to be the range of 128–256 tokens, which seems quite specific to me (there must be a back story to this range; I just don’t know what it is). Chunks typically end up being sentences or small paragraphs. Examples of tools that can assist in doing this are LangChain and LlamaIndex.

2. Embedding – the chunks are then turned into vectors intended to capture both the content and context of the chunks. These embeddings are typically stored in what is called a vector database, which can then be searched. The result is that chunks become “findable” based on their meaning and context. There are various embedding tools out there from OpenAI, Google, and others. However, some of these models may be better aligned than others with the specific LLM that a RAG project intends to use. So, for example, an OpenAI embedding model would be appropriate for use with OpenAI transformers, or Sentence-BERT with Hugging Face transformers. The vector database, on the other hand, may not need such alignment, and there are many popular options being used in RAG projects (e.g. Pinecone, Weaviate, Milvus). The user questions also need to be embedded to feed into the LLM, but they are typically not stored in the database because they are typically not for repeated use; LLMs do this embedding every time they are prompted. The existence of embedding models actually seems to be one of the reasons why prompts can be better or worse “engineered” to extract the desired information from LLMs.

3 & 4. Retrieval and Generation – these steps consist of feeding the LLM the embeddings of both the new material we wish it to consider and the questions we are prompting it to answer – and then obtaining the answer.
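
The steps above can be tied together in a toy end-to-end sketch. Here a bag-of-words vector stands in for a real embedding model, and the final prompt is just printed rather than sent to an LLM; the documents and question are made up for illustration:

```python
import numpy as np

# Toy "corpus": in a real pipeline these would be chunks of ingested documents.
docs = [
    "foreign aid budgets were cut sharply",
    "the editorial defends foreign assistance programs",
    "basketball practice improves jumping height",
]

vocab = sorted({w for d in docs for w in d.split()})

def embed(text):
    # Stand-in embedding: normalized word counts over the corpus vocabulary.
    # A real system would call an embedding model here.
    v = np.array([text.split().count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

index = np.array([embed(d) for d in docs])   # the "vector database"

def rag_prompt(question, k=2):
    sims = index @ embed(question)           # retrieval: similarity search
    top = np.argsort(-sims)[:k]              # keep the k most similar chunks
    context = "\n".join(docs[i] for i in top)
    # generation: this combined prompt would be sent to the LLM
    return f"Context:\n{context}\n\nQuestion: {question}"

print(rag_prompt("which editorials discuss foreign aid"))
```

Even this toy version shows the core behavior: the basketball chunk is never retrieved for a foreign-aid question, so the LLM only ever sees the relevant context.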

I reached the understanding above, in part, by dialoguing with ChatGPT 5, which in some situations seems to generate not very good responses, perhaps worse than those of ChatGPT 4o (see sources below). But, as I mentioned when discussing prompt engineering, I do have the habit of cross referencing with other information I find on the internet, including feeding it back to ChatGPT, so I am hopeful the understanding above is relatively accurate. 

I then proceeded to try to build a RAG system on my laptop and compare it with simply uploading a set of documents to ChatGPT (I did use the 4o version in this case). Because my RAG system connected with the OpenAI LLM, I figured it would highlight the RAG components of the system built on my laptop and help me better understand those components. For the results of this exercise, please see the Engelberg Huller post Customizing LLMs – Part 2: an Experiment.

Sources

3Blue1Brown. 2025 (last updated Sept 26). Neural Networks. Course (9 videos), YouTube. Available: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi. Accessed: October 06, 2025

Belcic I and Cole Stryker. Undated. What is LLM Customization? IBM Think. Available: https://www.ibm.com/think/topics/llm-customization. Accessed: October 06, 2025

Google. Gemini backed AI Overview on Google search

OpenAI. ChatGPT 5, accessed October 2025


A Brief Incursion into Epistemology (to be continued)

Image by sunjong77. Downloaded from pixabay.com

I walk my dog in the mornings while typically listening to a news podcast. Sometimes I get tired of the news and listen to music or search for some other type of podcast to accompany my walk. Recently I listened to a few episodes of one called “European Intellectual History since Nietzsche,” which consists essentially of recordings of a class given at Yale by an Associate Professor of History called Marci Shore. I enjoyed the first few classes but soon some things didn’t sound quite right to me – admittedly, based on my very limited knowledge – and these issues were enough to make me stop listening. No demerit to the professor, this may only reflect my own limitations and I won’t get into what those issues were because what matters is that it got me wanting to learn more about the epistemology of different philosophers.

Before jumping into epistemology, here is a brief summary of what I got out of Prof Shore’s first two episodes with a broad overview of the Enlightenment and Romanticism.

I next spent some time trying to pin down the epistemological view of different philosophers making use of a few introductory sources: the Oxford Companion to Philosophy (Honderich 1995), which I found in a used book store, Wikipedia and ChatGPT. Yes, I went there. However, I had very little confidence in what I was getting from those sources.

So I finally shifted my efforts and went to my go-to philosopher – Bertrand Russell – for, rather than a history of epistemological thought, at least his own views. 

I should first clarify that epistemology can be defined in different ways, but it essentially refers to the branch of philosophy that addresses how we know. The Oxford Companion to Philosophy defines it differently in different entries written by different contributors. The entry for history of epistemology, written by Prof. D.W. Hamlyn of Birkbeck College, London, defines it as “the branch of philosophy concerned with the nature of knowledge, its possibility, scope, and general basis” (Honderich 1995, p. 242). The entry for problems of epistemology, written by Prof. Jonathan Dancy of Keele University, defines it as the “study of our right to the beliefs we have” (Honderich 1995, p. 245). I take these definitions, in combination, to be sufficient to convey what the focus of this post (and my interest) is.

Bertrand Russell has a little book called “The Problems of Philosophy,” published in 1912, which largely focuses on epistemology. The book is not exclusively about epistemology, however, and wanders into ontological questions about the nature of reality (e.g. addressing in several parts the issue of “idealism”). I try to focus on the epistemology parts, but I do understand how the two issues are intertwined.

Here is my understanding of Russell’s views, based on the book.

What we perceive through our senses is only indirectly a physical object. We perceive what he calls “sense-data,” which are signals of the actual physical object. “Sensation” is the awareness of things through sense-data. The collection of physical objects is “matter.”

Russell distinguishes knowing truths (e.g. savoir in French, saber in Spanish and Portuguese) from knowing things (e.g. connaître in French, conocer in Spanish, conhecer in Portuguese). He then turns to focus on the knowledge of things.

We can “know” things directly or indirectly. The former he will call “knowledge by acquaintance,” the latter “knowledge by description.”

Knowledge by acquaintance can happen in several ways, such as through sense-data, memory, or introspection (self-consciousness, knowledge of self).

Figure 2 below summarizes his thought so far.

Figure 2. Knowledge of Things

The fact that we are able to generate inferences from what we know about things means that we are drawing on some general principles to do so. Examples are:

  • The principles of induction: the more often two things have been observed together, the more we expect them to occur together in the future;
  • The principles of logic. E.g.: if it is known that a) if this is true, then that is true; and b) this is true; then c) that is true

Principles of inference are examples of what he calls “a priori” knowledge. Other examples are mathematics and knowledge of ethical value (or the intrinsic desirability of things). Russell argues that these principles (or a priori knowledge) cannot be proved by experience. This is an old debate between empiricists and rationalists, and key in defining an epistemological view. The debate often uses the term “innate” knowledge rather than a priori. Russell prefers “a priori” to “innate” because, although a priori knowledge cannot be proved by experience, he considers it to be elicited and caused by experience. In the debate between empiricists (e.g. Locke, Berkeley, Hume) and rationalists (e.g. Descartes, Leibniz), Russell considers that the rationalists were correct in that a priori knowledge cannot be derived from experience. But he thinks the rationalists were incorrect in believing that they could deduce what they know from a priori knowledge alone since, as mentioned above, a priori knowledge is itself elicited and caused by experience.

How Russell argues that the general principles cannot result from experience alone is central to his understanding of how we know. His main argument seems to be that any generalization from experience (induction) presupposes some general principle. Therefore, no general principle can be proven by experience. He gives the example of a chicken who expects food every time it sees the person who feeds it. Every day the expectation is confirmed…until the day the person breaks the chicken’s neck. There is no logical reason that simple repetition should guarantee continuity, no matter how much we expect it, unless we associate with that expectation some general principle (e.g. a logical principle). A note: Russell does not mention causality in his argumentation, but it is my understanding that all causal argumentation presupposes logical principles, so Russell’s argument is consistent with someone bringing causality into the discussion to justify expectations based on experience. The point made in this paragraph seems simple enough, but it is key to establishing an epistemological view and, as mentioned in the previous paragraph, it addresses a long-standing epistemological debate. On this point, Russell makes a lot of sense to me.

A consequence for scientific thought:

“The general principles of science […] are as completely dependent on the inductive principle as are the beliefs of daily life […]. Thus all knowledge which, on the basis of experience tells us something about what is not experienced, is based upon a belief which experience can neither confirm nor refute […].” (Russell 1912, p. 40)

Deduction then also plays a part in the building of our knowledge, because we can often know general principles without inferring them from their instances (e.g. 2+2=4). Deduction goes from the general to the general or to the particular; induction goes from the particular to the particular or to the general.

Russell then asks how a priori knowledge is possible. Here the discussion seems to veer again quite a bit into ontological questions, since it becomes not just about how to “know” a priori principles but also about the nature of a priori principles.

He first explains Kant’s view, which states that a priori knowledge is generated from the interaction of ourselves and physical objects (the “things in themselves”), what he calls “phenomenon.” We cannot know a thing in itself, only to the extent that it conforms with our own nature. If I understood Russell’s explanation, according to Kant a priori knowledge would be a product of our interaction with physical objects. This view would not quite fit empirical views, because we are not just observers, but our own nature is part of what generates our knowledge, just as much as the things in themselves and the perception we have of them. Kant’s view on this makes a lot of sense to me.

Russell, however, proceeds to make a point that I find less immediately obvious and that I am still not sold on. He argues that we should not think of our part in this phenomenon as reflecting the nature of our minds, but rather that a priori knowledge must have a nature that is neither material nor an idea. He gives as an example the law of contradiction (nothing can both be and not be), which he argues is not just a statement about our beliefs (our minds) but about the things themselves. I do not see how this necessarily follows. Why would the law of contradiction not be something that we take as given because of the structure of our minds? How do we know that, in fact, this law must apply to things, if we cannot even perceive things directly? I do not follow Russell’s argument here. At the same time, this is an ontological question and, therefore, not of particular interest to me. Whether the law of contradiction is imposed on things by our minds or is something that exists beyond matter and ideas, it does not seem to have immediate consequences for how we know, in any practical way. On this matter, for now, I will stick to Kant’s view, which is more intuitive to me.

Russell then goes on to discuss the nature of a priori knowledge as being neither material nor a product of our minds. To do so, he reaches out to Plato. To avoid thinking of a priori knowledge as an “idea,” he suggests using the term “universal,” and states that the essence of universals is that they do not arise from a given sensation. He goes on to discuss the nature of universals, which I will skip here, both because he lost me and because it seems like too deep a dive into ontological questions for this post.

Russell’s next step is to suggest that our knowledge of universals can also be acquired by acquaintance or by description, just like our knowledge of particulars. Through acquaintance, we come to know many different types of universals, such as sensible qualities (e.g. “whiteness”) and relations (e.g. before and after, above and below, greater and smaller than); he states that all a priori knowledge deals with relations of universals (Russell 1912, p. 63).

Figure 3 below modifies Figure 2 to include universals in the picture, the nature of which will remain a mystery to me for now. I must say, however, that there is more to be discussed regarding the knowledge of truths. Russell’s book contains a few more chapters on this, including on intuitive knowledge, truth and falsehood, probable opinion and the value and limits of philosophy. I stopped before these, however, since the discussion of the nature of universals already stumped me and is, in any case, as far as I am willing and have time to go at the moment.

Figure 3. Knowledge of Truths*

*There is more to discuss regarding the knowledge of truths based on Russell’s book. To be continued.

 

Sources

Russell, Bertrand. 1912. The Problems of Philosophy. Printed version of work in the public domain.

Honderich, Ted (Editor). 1995. The Oxford Companion to Philosophy. Oxford University Press

Shore, Marci. 2024. European Intellectual History since Nietzsche. Podcast with recording of classes offered at Yale in 2023. Available on Spotify. Accessed: January 2025


Output and Income Indicators

Image by Michael Reichelt. Downloaded from pixabay.com

In my previous Fan post (“Indicators of Government Expenditures”) I noted that, when using output indicators such as GDP, we should keep in mind that: a) there are important limitations to this indicator, and b) when used, there are different indicators that may be more or less appropriate for different purposes. I develop a bit on those two points here.

On the first point, an assessment was done in 2008 by a commission led by three economists, two of them Nobel laureates, at the request of the Government of France, and later summarized in a book. I draw from it here, although additional details are available online.1

The commission was led by Joseph Stiglitz, Amartya Sen (both Nobel laureates), and Jean-Paul Fitoussi. Other economists were also part of the commission. The commission was divided into three working groups:

  • One to focus on standard issues of national accounting, such as measuring government output and treatment of household production;
  • A second group focused on the relationship between output measures and efforts to measure well-being or quality of life;
  • A third group looked at attempts to capture sustainability in measures of output.

On the classical GDP issues: GDP mainly measures market production, and one reason money measures have come to play an important role in our evaluation of economic performance is that money valuations facilitate aggregation. However:

  • Prices do not exist for some types of output (e.g. government services provided free of charge or household services such as child care);
  • Market prices may not reflect consumers’ appreciation of goods and services if there is imperfect information (e.g. financial products, telecommunications bundles);
  • Market prices may not fully reflect societal evaluation due to externalities (e.g. environmental costs);
  • Collecting accurate data may be challenging when there are sales or differences in prices among alternative selling mechanisms (e.g. online vs store prices);
  • Accounting for quality of products and changes in quality is challenging and may not always be reflected in prices;
  • Underestimating quality improvement means overestimating inflation, which, in turn, means underestimating real income.
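
The last bullet can be illustrated with a small numerical sketch (all figures made up): if measured inflation overstates true inflation because quality gains are missed, measured real income growth understates true real growth.

```python
# Made-up figures: nominal income grows 5%; true quality-adjusted
# inflation is 2%, but measured inflation (quality gains missed) is 3%.
nominal_growth = 0.05
true_inflation = 0.02
measured_inflation = 0.03  # overstated by one percentage point

# Real growth deflates nominal growth by inflation.
true_real_growth = (1 + nominal_growth) / (1 + true_inflation) - 1
measured_real_growth = (1 + nominal_growth) / (1 + measured_inflation) - 1

print(f"{true_real_growth:.4f}")      # 0.0294
print(f"{measured_real_growth:.4f}")  # 0.0194
```

Overstating inflation by one percentage point understates real income growth by roughly the same amount.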

These are not minor inconveniences but real issues, and the extent to which GDP measures are distorted by them is not clear. The authors discuss at some length the issues with measuring services, for example. Services account for up to two thirds of output, and measuring their quality is challenging. Measuring government provision of services, for example, is often done through inputs, which leaves aside the possibility of capturing changes in productivity. Attempts to measure government services using outputs face known challenges, such as accounting for quality. Which services should be considered final and which intermediate (or “defensive”) is also difficult to define. E.g.: government spending on prisons? Private costs of commuting?

The authors suggest five ways of dealing with some of the deficiencies of GDP as an indicator of living standards:

  1. Emphasize well-established indicators other than GDP
    • Gross, rather than Net, has the issue of not accounting for the amount of output needed to maintain capital goods (depreciation). When technology is changing rapidly, depreciation can be substantial and the difference between Gross and Net considerable. So: consider “Net” (although depreciation is hard to estimate);
    • Product, rather than Income, has the issue of not being as good for capturing household consumption and, therefore, associated well-being. The difference is the purchasing power sent to and received from abroad (net income from abroad). Also, changes in the relative prices of exports and imports will affect national income even if domestic product stays the same. So: consider “Income”;
  2. Consider wealth jointly with consumption to capture consumption possibilities over time;
  3. Bring out the household perspective
    • Adjusted disposable income accounts for government taxes and monetary transfers but not for transfers in kind;
  4. Add information on the distribution of income, consumption and wealth:
    • Median is better than average, but it depends on survey data, which have known challenges:
      • Unit of measurement? Consumption unit?
      • Measuring property income?
      • International comparability
      • Whose bundle of consumption?
      • Shifts in the provision of services from within households or between families to markets create distortions
    • Also, we should be looking at the distribution of full income, not just market income, including values such as household production and leisure
  5. Widen the scope of what is being measured (may require imputation):
    • Recommendation is to keep a satellite account because: a) imputed values are not as reliable as observed values; b) non-observed values could end up being a very large share of total output. E.g.:
      • Household work, under the authors’ estimates, could be 30% of currently measured GDP;
      • Leisure could be 80%;
    • They still recommend it be done for: a) completeness; b) the invariance principle, under which the value of a good or service should not depend on the institutional arrangement under which it is provided (e.g. free by the state or charged by the private sector).

The other two areas taken on by the commission working groups are more intuitive to me, even if not easy to address, so I only briefly summarize the conclusions of the corresponding working groups:

  1. On the relationship between output measures and efforts to measure well-being or quality of life, the argument is that these latter concepts cannot be reduced to resources. Efforts to measure well-being and quality of life have either attempted to measure subjective perceptions, tried to assess capabilities that would enable and support human functioning (health, education, security…), or tried to identify how individuals themselves weigh the non-monetary aspects of their well-being. All these attempts face challenges, including how to incorporate inequalities, how to assess the linkages between the various dimensions of well-being or quality of life, and how to aggregate them;
  2. On attempts to capture sustainability in measures of output, there is a large and varied literature that the commission divided into four groups: attempts to establish large dashboards with sets of indicators addressing different aspects of sustainability; attempts to develop composite indices; attempts to develop adjusted GDP indicators; and indicators focusing on overconsumption or overinvestment.

What do I draw from the above? A few initial thoughts:

  • When using an indicator of output growth for a selected country or group of countries, I have typically used the World Bank, World Development Indicators (WDI), Gross Domestic Product (GDP) series in Local Currency Units (LCUs). I have used LCUs when looking at growth, instead of alternative monetary units, to avoid the influence of short-term fluctuations of exchange rates. Attempts to correct for this influence, such as the World Bank’s Atlas measure (more on this below) or the use of Purchasing Power Parity (PPP) measures, seem unnecessary, given their imperfections and that we are only interested in growth and not in comparing the absolute value of output among countries. This series can be used to break down domestic output into its expenditure components (G+C+I+Ex-Im+changes in inventories), as well as by sector of the economy (agriculture, industry and services)2. It is available for a period of over 60 years for most countries. Based on the input above:
    • The use of output rather than income indicators when looking at growth seems reasonable to me and perhaps more relevant: it better reflects the production capacity of a country (rather than its standard of living) and, for most countries, output and income do not tend to diverge much over time (although this may not always be the case, and it would be interesting to look at the data).
    • The fact that GDP indicators do not capture household production means that growth is likely overestimated during periods where agricultural production for own consumption is reduced and production for the market is increased. GDP growth is also likely overestimated during periods of increased entry of women into the labor market, if this also means decreased services within the household. I would need to further research the WB WDI methodology to see the extent to which the WB tries to address this issue in their measurements;
    • The extent to which the informal economy is captured also requires a further look into the WB WDI indicator methodology. If it does not capture the informal economy well, growth would also be overestimated during periods of formalization.
  • I have used the World Bank, World Development Indicators (WDI), Gross National Income (GNI) series in Purchasing Power Parity (PPP) when comparing countries. I have preferred the concept of income (what belongs to the residents of a country) over product (what is produced within the boundaries of a country) when comparing countries because it is a better indicator of the resources available to the local population. For cross-country comparisons, PPP measures (even if imperfect) allow some correction for price and exchange rate distortions regarding how much residents of two compared countries can actually purchase with their income. This series is available for fewer years and countries. Based on the input above:
    • Periods of rapid technological transformation – such as the one we are in now – are likely generating considerable distortion in our relative measurements of income by country, given the challenges in addressing quality of products and services. To the extent that we are able to use net indicators (as opposed to gross), accounting for depreciation in such periods is also a more serious challenge and a source of distortion.
    • Does our association of value with market prices mean that our association of income per capita with productivity is somewhat distorted? I explain: think of luxury goods, where price is not necessarily associated with quality but where status of a brand plays an important role in product prices. Countries with heavy presence of luxury industries will have their per capita incomes associated with this higher price that is fabricated by the status of their products rather than by the quality of their products. How we understand the productivity of their population would need to be interpreted in this context (Italy, I am thinking of you).
    • Do the decaying European houses (that we think of as so charming) mean that European household income tends to be overestimated by the use of gross measurements?
    • On the other hand, does the fact that we do not capture the value of leisure underestimate European household income relative to countries like the US?
  • The World Bank uses GNI per capita in US dollars, converted from local currency through the Atlas method, to classify countries into income groups (low income, lower middle income, upper middle income and high income). The Atlas method is based on three-year moving averages of exchange rates. The WB uses the Atlas method rather than PPP, arguing that “issues concerning methodology, geographic coverage, timeliness, quality and extrapolation techniques have precluded the use of PPP conversion factors for this purpose” (World Bank, undated). This also seems to be the indicator the WB uses for establishing the annual threshold for countries to qualify for International Development Association (IDA) loans. The US Millennium Challenge Corporation (MCC) uses the WB country income groups to select countries that qualify for its assistance (low income and lower middle income). Based on the input above:
    • If we underestimate income in low-income economies, given that they often also have larger portions of their economies not captured by GNI measurements (greater presence of subsistence agriculture, household production and services, informality), what does this mean for our categorization of countries in income groups? How distorted are these classifications? Should we be interpreting them as rather “market income” groups? If so, to what extent are our foreign assistance programs directed at increasing “market income,” rather than income as a whole? To what extent are our foreign assistance impact evaluations distorted by not recognizing this distinction?
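
As a rough sketch of the smoothing idea behind the Atlas method (all numbers made up; the actual method also adjusts earlier years’ exchange rates for the difference between domestic and international inflation, which this toy version omits):

```python
def atlas_conversion_factor(rates):
    """Toy version: simple three-year average of exchange rates
    (LCU per USD). The real Atlas method also inflation-adjusts
    the two earlier years before averaging."""
    assert len(rates) == 3  # current year and two preceding years
    return sum(rates) / 3

gni_lcu = 5_000_000_000   # hypothetical GNI in local currency units
rates = [6.0, 5.5, 5.0]   # hypothetical LCU per USD, most recent first

# Averaging dampens the effect of the most recent depreciation.
gni_usd = gni_lcu / atlas_conversion_factor(rates)
print(f"{gni_usd:,.0f}")  # 909,090,909
```

Using only the latest rate (6.0) would have given a noticeably lower dollar GNI; the three-year average smooths that swing.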

Notes

  1. There used to be a site with technical papers at the URL: www.stiglitz-sen-fitoussi.fr . This seems to no longer be available but I found a link to the content here: https://web.archive.org/web/20150622185128/http://www.stiglitz-sen-fitoussi.fr/en/index.htm
  2. The WB World Development Indicators reports total value added at basic or producer prices and GDP at purchaser prices. That is why their measurements differ. Purchaser prices include taxes and exclude subsidies. For more information, see here: https://datahelpdesk.worldbank.org/knowledgebase/articles/114948-what-is-the-difference-between-total-value-added-a

References

Stiglitz, Joseph E.; Sen, Amartya; and Fitoussi, Jean-Paul. 2010. Mismeasuring Our Lives: Why GDP Doesn’t Add Up. The Report by the Commission on the Measurement of Economic Performance and Social Progress. The New Press.

World Bank. Undated. Why use GNI per capita to classify economies into income groupings?. Available: https://datahelpdesk.worldbank.org/knowledgebase/articles/378831-why-use-gni-per-capita-to-classify-economies-into. Accessed: June 08, 2024.


Indicators of Government Expenditures

Image by Abraham Bosse. Downloaded from picryl.com

The International Monetary Fund (IMF) has a couple of public dashboards showing government expenditures as a percentage of Gross Domestic Product (GDP), by country. See here and here. There is nothing wrong with doing this if we keep in mind that we are using GDP as a denominator just as a tool to give us a reference for the relative size of government expenditures in different countries. But, based on this kind of data, it is common to hear things like “government expenditures were 61% of the entire French economy or 45% of the US economy in 2020,” as if these numbers were breaking the total of the economy (100%) down into its government and non-government portions. This would be incorrect and, unfortunately, it ends up supporting all sorts of confused discussions about the role of government in the economy.

The comparison between government expenditures and GDP is one of apples and oranges and only makes sense if we understand, again, that GDP is being used as a denominator only as a convenient tool to facilitate country comparisons. Government expenditures, as reflected in databases like that of the IMF, are measures of total expenditures, either by central and local governments or just by central governments (depending on the country), over a one-year period. GDP does not measure total expenditures, but rather “value added” by the economy over a one-year period. The difference is that value added discounts from expenditures the purchases of intermediate goods and services used to produce the goods and services of the sector in question. Value added is used when measuring output by sector, to allow summing across sectors without double counting. The result is a general measure of output, such as GDP.
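
A toy numerical sketch of the expenditures-versus-value-added distinction (all numbers made up):

```python
# A government buys 40 in intermediate goods and services from firms and
# pays 60 in employee compensation to deliver a public service.
intermediate_purchases = 40
compensation = 60

government_expenditures = intermediate_purchases + compensation   # 100
government_value_added = government_expenditures - intermediate_purchases  # 60

# Only value added enters GDP: the 40 of intermediates is already counted
# as the value added of the firms that supplied them.
print(government_expenditures, government_value_added)  # 100 60
```

Comparing the 100 of expenditures to a GDP built from value added (60 here, plus the suppliers’ value added) mixes two different accounting concepts.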

To illustrate, see the table below (Figure 1). The second column shows government expenditures as a share of GDP in 2020 for selected countries, as reported by the IMF. The third column shows government consumption as a share of GDP, as measured in value added and reported by the World Bank World Development Indicators. The actual share of GDP that corresponds to the government would require adding government investment (fixed capital formation) to government consumption. These data were not readily available for most countries in the WB WDI dataset, and it seems that disentangling government and private fixed capital formation is not very simple. So I added total fixed capital formation (public and private) to government consumption, for the sake of comparison with the IMF numbers (fourth column). The actual weight of the government in GDP should be somewhere between columns three and four.

Figure 1. Government Relative to GDP, Selected Countries, 2020

Country | Government Expenditures as % of GDP (IMF)1 | Government Consumption (value added) as % of GDP (WB)2 | Government Consumption + Total (public and private) Fixed Capital Formation (value added) as % of GDP (WB)2
France | 61.35 | 24.84 | 48.12
Germany | 50.46 | 22.02 | 43.57
Brazil | 49.92 | 20.14 | 36.70
United Kingdom | 49.87 | 22.60 | 40.07
United States | 44.82 | 15.09 | 36.94

Sources: 1. IMF DATAMAPPER. Fiscal Monitor, October 2023, https://www.imf.org/external/datamapper/G_X_G01_GDP_PT@FM/ADVEC/FM_EMG/FM_LIDC. 2. World Bank World Development Indicators. Accessed April 2024, https://databank.worldbank.org/source/world-development-indicators.

Note: government expenditures in 2020 were generally higher than usual, as countries tried to minimize the economic effects of the COVID-19 pandemic.

I am sure there are better data out there somewhere but, after spending some time trying to unbury the IMF metadata (it should be more easily findable), my patience was running low. For the US, see data from the Bureau of Economic Analysis, which defines the value added by government as “the sum of compensation paid to general government employees plus consumption of government owned fixed capital (CFC), which is commonly known as depreciation” (BEA, 2008, p. 29). My point still holds.

Another way of looking at the actual weight of government expenditures in the economy would be to compare them, not with GDP, but with total output in an economy over a one-year period, that is, without discounting intermediate products and services. Country national accounts typically do show this indicator, and it tends to be roughly twice as large as the total value added in any one year. The ratio of total output to value added is available in Table 2.6 of the United Nations (UN) National Accounts Statistics. Figure 2 below applies that ratio to the IMF indicator of government expenditures as a share of GDP to obtain a rough estimate of the share of government expenditures over total output in the last column of the table. Note that the resulting estimates are within the range of columns 3 and 4 of Figure 1.

Figure 2. Government Relative to Total Output, Selected Countries, 2020

Country | Government Expenditures as % of GDP (IMF)1 (a) | Ratio of Total Output to Value Added (UN)2 (b) | Rough Estimate of Government Expenditures as % of Total Output (a/b)
France | 61.35 | 1.95 | 31.42
Germany | 50.46 | 2.03 | 24.83
Brazil | 49.92 | 2.07 | 24.14
United Kingdom | 49.87 | 1.89 | 26.40
United States | 44.82 | 1.77 | 25.39

Sources: 1. IMF DATAMAPPER. Fiscal Monitor, October 2023, https://www.imf.org/external/datamapper/G_X_G01_GDP_PT@FM/ADVEC/FM_EMG/FM_LIDC; 2. UN National Accounts Statistics. Main Aggregates and Detailed Tables. Table 2.6, Accessed April 2024, https://unstats.un.org/unsd/nationalaccount/madt.asp?SB=1&#SBG
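
The last column of Figure 2 can be recomputed from the two reported columns; small differences from the published figures reflect rounding of the reported ratios:

```python
# (expenditures as % of GDP, ratio of total output to value added),
# as reported in Figure 2.
data = {
    "France":         (61.35, 1.95),
    "Germany":        (50.46, 2.03),
    "Brazil":         (49.92, 2.07),
    "United Kingdom": (49.87, 1.89),
    "United States":  (44.82, 1.77),
}
for country, (exp_pct_gdp, output_to_va) in data.items():
    # Rough estimate of expenditures as % of total output = a / b
    print(f"{country}: {exp_pct_gdp / output_to_va:.2f}")
```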

Again, I am sure there are better data out there, but the fact that I had to spend considerable time deciphering the data above and still don’t have non-misleading, comparable cross-country data for the actual size of government expenditures relative to total output is itself relevant for my purposes on this blog.

Other than the issue of comparing apples and oranges, there are additional considerations to make when assessing statements like the ones I made above (“government expenditures were 61% of the entire French economy or 45% of the US economy in 2020”). One is about what we are supposed to infer from looking at government expenditures. If the measure is provided as a reference for the extent to which governments participate in the economy, using expenditures ignores the entire side of government regulation, which, in market economies, is likely at least as important as government expenditures for understanding the influence of the government on the functioning of an economy. Looking beyond total expenditures and into their breakdown by level of government, by consumption and investment, and other disaggregated data would likely also contribute to a much richer and more productive discussion, not to mention the large literature on taxation, as well as financial indicators of debt and debt sustainability. These are all subjects that the IMF delves into professionally and about which it publicly releases a lot of information, even if not always easy to decipher. I can’t help wondering, however, whether sites like the IMF dashboards linked above are actually doing more harm than good by stressing one small and misleading indicator of government participation in the economy.

Another consideration in interpreting data such as that shown in the IMF dashboards is about GDP and what it represents. Although we often think of it as an indicator of the size of the economy: a) there are important limitations to this indicator, and b) when used, there are different indicators that may be more or less appropriate for different purposes. I will look at these issues in a future post.

References

BEA (Bureau of Economic Analysis). 2008. A Primer on BEA’s Government Accounts, by Bruce E. Baker and Pamela A. Kelly. Available: https://apps.bea.gov/scb/pdf/2008/03%20March/0308_primer.pdf?_gl=1*1anuf1l*_ga*NjM4MDQ4ODA2LjE3MTI3Nzc2ODE.*_ga_J4698JNNFT*MTcxMzExMzg4NC44LjAuMTcxMzExMzg4NC42MC4wLjA. Accessed: April 14, 2024.

BEA (Bureau of Economic Analysis). 2010. Frequently Asked Questions: BEA seems to have several different measures of government spending. What are they for and what do they measure? Available: https://www.bea.gov/help/faq/552 Accessed: April 12, 2024

International Monetary Fund (IMF). 2023. IMF DATAMAPPER. Fiscal Monitor, October. Available: https://www.imf.org/external/datamapper/G_X_G01_GDP_PT@FM/ADVEC/FM_EMG/FM_LIDC; Accessed: April 14, 2024.

United Nations (UN). 2024. UN National Accounts Statistics. Main Aggregates and Detailed Tables. Table 2.6, Available: https://unstats.un.org/unsd/nationalaccount/madt.asp?SB=1&#SBG; Accessed: April 14, 2024.

World Bank. 2024. World Development Indicators. Available:  https://databank.worldbank.org/source/world-development-indicators; Accessed: April 14, 2024 


Mental Models and Academic Models

Image by Jobin Scaria. Downloaded from pixabay.com

Every year, the World Bank publishes a World Development Report, an analysis of a selected aspect of Economic Development and its status in the world at the time. In 2015, the selected theme was “Mind, Society, and Behavior.” In this report, the WB argues that there have been advances in our understanding of how people make decisions, and that this better understanding can be used to increase the effectiveness of development interventions.

They highlight three principles of human decision making:

  1. Many of our decisions are made quickly, using an automatic and effortless system of thinking that contrasts with the slower, more deliberative and thoughtful process we often identify with rational decision making. This argument builds on the work of psychologists such as Daniel Kahneman and Amos Tversky, and I have discussed it in other posts on this site as well.

  2. Our individual decision making is not really just individual; it is influenced by the society around us: social preferences, norms, identities. We cannot assume that the factors (preferences) taken into consideration in individual decision making are not shaped by the communities in which we are embedded.

  3. The social influences we receive are embedded in “mental models:” worldviews, stereotypes, and simplifying concepts and categories that we use for decision making.

The consequence of the three principles is that our decision making is influenced by “culture,” deeply rooted beliefs and practices that we often take for granted and may not even recognize. These beliefs and practices may favor or be detrimental to the achievement of desired development goals by any community. When they are detrimental, breaking the cultural patterns may require addressing social practices and institutions before individual incentives and decision-making can change.

 

The authors argue that “recognizing that individuals think automatically, think socially, and think with mental models expands the set of assumptions policy makers can use to analyze a given policy problem and suggests three main ways for improving the intervention cycle and development effectiveness:” (p. 192)

  • “First, concentrating more on the definition and diagnosis of problems, and expending more cognitive and financial investments at that stage, can lead to better-designed interventions. […]

  • Second, an experimental approach that incorporates testing during the implementation phase and tolerates failure can help identify cost-effective interventions […]

  • Third, since development practitioners themselves face cognitive constraints, abide by social norms, and use mental models in their work, development organizations may need to change their incentive structures, budget processes, and institutional culture to promote better diagnosis and experimentation so that evidence can feed back into midcourse adaptations and future intervention designs.” (p.192-193). 


The World Bank’s recognition of the role that culture plays in development, through the functioning of mental models, came on the tails of increased attention paid to the behavioral sciences. The report often cites, for example, the work of Nobel laureates Esther Duflo and Abhijit Banerjee, who (among other things) call attention to evidence from randomized controlled trials that how foreign aid is designed and delivered often matters for its effectiveness. One of the members of the Advisory Panel to the World Bank report was Cass Sunstein, a legal scholar who, among other things, co-wrote a book called “Nudge” arguing that policy design and delivery can affect the choices people make. As I write this post, he also happens to be the husband of USAID administrator Samantha Power. When she took office in 2021, she seemed to bring the belief that the Agency could use more insights from behavioral science in its own design and delivery of foreign assistance activities, and even brought her husband to speak to USAID staff.

 

Of particular interest to me is the role played by mental models as devices that seem inherent to our nature, to how our brains work, and necessary for our daily functioning and (often unconscious) decision making, but that simultaneously can be detrimental to our goals and hard to break from.

How far can a parallel be drawn with academic models?

 

Academic models would seem, at first, to be quite the opposite of mental models. They belong to the “thinking slow” realm: we are conscious of their assumptions and of the connections between those assumptions and their implications, and we are able to modify them, consciously, as needed to better explain what we observe in reality. They would seem to have in common with mental models only the fact that they are, umm… “models,” simplifications of reality that allow us to deal with its complexities in a productive way. However, academic models too have a way of inserting themselves into our unconscious and biasing our thinking over time, to the point that we are no longer able to recognize the effect.

 

Hoping for some more insight on how academic models allow us to better understand reality, I found a 2008 paper by Mary S. Morgan and Tarja Knuuttila titled “Models and Modelling in Economics,” which I understand was later (in 2012) published in the “Handbook of Philosophy of Economics” edited by Uskali Mäki and published by Elsevier. I should state upfront that I do not know the extent to which the draft I found was edited before publication. A quick internet search shows that Oxford published its own Handbook of Philosophy of Economics in 2009, as did Routledge in 2021. Philosophy of economics is the kind of subject that interests, annoys, and troubles me all at the same time. I do have a genuine interest in how we claim to know things, but my interest in economics always came from the practical standpoint of wanting to improve the conditions in which the populations I came from lived. So having to spend too much time on these issues to be able to digest economic theory always struck me as simultaneously necessary but too time consuming, perhaps beyond my capacity to fully grasp, and potentially a waste of my time. In the end, my failure to overcome my methodological or philosophical discomfort with economic theory became a source of personal internal conflict and, hence, the troubling nature that these discussions have for me.

 

But Morgan and Knuuttila’s paper did seem promising to shed some light on these matters, so I dived into it and I will summarize my understanding and takeaways here.

They distinguish between two major views of models in economics: as “idealized entities” or as “purpose-built constructions.”

 

As idealizations, models can be viewed as generalizing, abstracting, simplifying, and/or isolating, for reasons such as facilitating deductive reasoning or ensuring mathematical tractability. This can be done by identifying aspects of reality that are considered absent or negligible, or that can be ignored because they remain unchanged over the time, place, or scope of analysis. Often the idea is that, once a model has served its analytical purpose, it can be “de-idealized,” or made more concrete, by adding back specificity. The practice of “idealizing” and “de-idealizing” is not simple or inconsequential, however. Deductions from an idealized model may not hold once more specificity is added back. In fact, the risk of distortion in each direction of the process is considerable.

 

Two discussions have traditionally surrounded this view of models. The first concerns the traditional position that data without models reflecting theory can suggest spurious relationships, and that, therefore, data should follow models, which should in turn follow theory. A more recent discussion is whether it is best to build models from the general to the specific or the other way around. Theory and data jointly feed model building under either approach.

The second view of models, as purpose-built constructions, sees them as “fictional entities” or “autonomous objects” that are not constructed in relation to an observed reality, nor necessarily related to theory, or perhaps related to just one particular aspect of observed reality or theory. They are simply tools for thought, perhaps creating a parallel, stylized reality that allows for an understanding of the connection between cause and effect, or serving “a variety of purposes.”

 

My experience with economic theory suggests a greater prevalence of the second view relative to the first, at least in the academic circles I was a part of. That is why I also came to see as a reasonable justification for economic theory that it can show that some results are possible, and it can show a set of conditions under which those results can be observed (sufficient conditions). This is often useful to demonstrate that what we observe does not necessarily mean that “a” or “b” is true, but could also mean that “c” is true. It helps dispel many myths that we create in our daily lives by not understanding the many circumstances that can lead to what we observe. However, showing all possible sufficient conditions for an observation to be true (i.e., the collection of scenarios that, seen as a whole, constitute the necessary conditions for that observation) is much harder. A set of identified sufficient conditions may turn out to be only one of many sets of conditions under which the same result is observed. That is where the usefulness of these theoretical models ends. In other words, these constructed models tell us a lot about what reality is not necessarily, but very little about what reality is.
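The asymmetry between sufficient and necessary conditions can be sketched with a toy model. The “market” below and its three mechanisms are entirely invented for illustration (they come from me, not from Morgan and Knuuttila); the point is only that many distinct scenarios can produce the same observation, so the observation alone cannot identify its cause:

```python
from itertools import product

# Toy model: a price increase (the observation) can be produced by any of
# three hypothetical mechanisms. Each mechanism alone is sufficient.
def price_rises(demand_up, supply_down, new_tax):
    return demand_up or supply_down or new_tax

# Enumerate all 8 possible scenarios and keep those consistent with
# observing a price rise.
consistent = [
    scenario
    for scenario in product([False, True], repeat=3)
    if price_rises(*scenario)
]

# Observing the price rise rules out only 1 of 8 candidate worlds:
# each cause is sufficient, but none is necessary.
print(len(consistent))  # prints 7
```

In this sketch, a theorist who exhibits one sufficient scenario (say, a demand increase) has demonstrated that the observation is possible, but has narrowed down very little about which world we are actually in.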

 

Morgan and Knuuttila then seem to suggest that, under either view of models, a more useful way to look at them could be one focused on their function: “instead of trying to define models in terms of what they are, a focus could be directed on what they are used to do” (p. 28). They go on to argue that this would shift the focus from the models themselves to the process of modeling.

 

The parallel I was seeking with mental models is not in this paper, after all. The authors only very tangentially allude to it early on, when they state that “economics shares an hermeneutic character with other social sciences […] individuals’ knowledge of economics feeds back into their economic behavior, and that of economic scientists feeds in turn into economic policy advice, giving economics a reflexive character quite unlike the natural sciences.” In my experience, no matter how rigorous academic economists may believe themselves to be in their views and uses of models, the moment they are asked to give their opinion about reality they will refer to those models (irrespective of their constructed or simplified nature) and draw conclusions about reality that the models themselves do not entitle them to. Academic models become academics’ mental models, and the myriad assumptions, conditions, and circumstances under which they may inform reality get lost in the process.

 

I also found lacking in Morgan and Knuuttila’s paper a discussion of whether it is actually possible to “test” models with data. Here too they allude to it only tangentially, noting that econometric models do not just test mathematical models built on theory (the selection of variables and causal relationships) but often simultaneously test the data, assuming probability distributions, functional forms, and the nature of observed errors and stochastic behavior (p. 15).

 

The bottom line for me is that, just as I found the World Bank’s effort to assess the role of mental models in development practice refreshing, I suspect I would find equally refreshing a critical look at the impact that academic modeling has had on our economic understanding as applied to our daily practice, the good and the bad of it. The discussion of the “robustness” of academic models, touched on in passing in Morgan and Knuuttila’s paper, makes some headway in this direction, as it recognizes the need to question whether models hold up under changes in assumptions, place, time, and circumstances. A step further would be to ask ourselves the extent to which we depart from academic rigor when we translate academic models into our daily view of the world and start confusing our models with reality. To be continued.

 

References

 

Morgan, Mary S. and Tarja Knuuttila. 2008. Models and Modelling in Economics. Forthcoming in U. Mäki (ed.) Handbook of the Philosophy of Economics [one volume in Handbook of the Philosophy of Science, general editors: Dov Gabbay, Paul Thagard and John Woods]. Available: https://sites.pitt.edu/~jdnorton/teaching/Phil_Sci_Core/HPS_2501_2020/more_pdf/Knuuttila_Morgan_Models_2009.pdf. Accessed: April 07, 2024.

 

World Bank. 2015. World Development Report 2015: Mind, Society, and Behavior. Available: https://www.worldbank.org/en/publication/wdr2015. Accessed: April 07, 2024.

 

Management as a Balancing Act: A Personal Account

Image by Miguel Á. Padriñán. Downloaded from pexels.com

As I explain in the “Home” and “About” pages of this site, how we know has been a lifelong interest of mine and is a theme throughout this site. It is, therefore, with some discomfort that I write this personal account. Any personal account is anecdotal evidence and of a particular kind: it rings true to ourselves because we lived it, but it is also subject to our own unrecognized biases. So the value of a personal account in a learning blog is hard for me to gauge. In any case, as I write this post, I have been for eight years leading a team providing services to USAID. I feel it is almost an obligation to myself to want to try to draw lessons from this experience. So here goes.

In my role leading and managing a team, I see my performance as having greatly benefited from luck of two kinds: a) I benefited from a few personal traits that I developed over time through no merit of my own; and b) I benefited from the environment I was placed into when I started leading this team. The personal traits that I think helped me are some degree of humility and empathy, born out of not-so-memorable events that resulted in conflicting feelings of superiority (ugly, yes, I know) and failure, confidence and insecurity, and that I think translated into a relatable and approachable style of leadership (I will not delve any further here). The environment that I was placed in, and that also helped me, was a receptive group of young, kind, and competent staff that was delivering day after day on its own and had built a culture of collaboration and collegiality. For whatever reason, I was embraced. Independent of my own feelings of luck, my personal psychological history, or additional details of my team, the important part is that, early on, my role on this team was steered by supportive personal relationships rather than by any particular management capabilities I brought to the team.

Over time, however, my team grew both in size and in the scope and complexity of the work we were asked to take on. It gradually became clear to me – and I believe to others on the team – that supportive and collegial relationships would not be enough to sustain successful team performance. We needed to improve in ways that none of us were very familiar or comfortable with.

The first direction we sought was towards better defined and established processes, management tools and standards of operation and behavior across the team. The Project Management Institute’s Project Management Body of Knowledge (PMBOK), 6th edition (2017, the only one I have) has a table distinguishing between leadership and management (p. 64) that I reproduce below:

Source: Project Management Institute (PMI). 2017. PMBOK Guide. P. 64

To me the central point is the fourth row: leadership focuses on people, management focuses on systems and structure (I would say processes and tools). We needed to continue nurturing a leadership structure and culture that we thought had been successful until then, while improving on processes and workflows, and providing the team with tools (software, templates, established standards and procedures) that would enable us to gain in effectiveness and efficiency. This all may sound like jargon, but it is really what we thought we needed to do.

I believe we have advanced considerably in this direction, although there is still much to be done and it is an ongoing effort, so I will not provide details in this post (perhaps in a future one). But I would like to highlight that moving towards better established processes, standards, and tools does not replace the role of leadership. I have found that professional managers often dive into the PMI management jargon and guidance while forgetting that the PMBOK itself distinguishes between management and leadership, and that attention to both is necessary for the good functioning of a team. Our own efforts and time must be geared towards both management and leadership. This is the first balancing act.

A second and more recent direction we’ve been pursuing is in assessing what kinds of top-down authority we need to allow ourselves to exercise and enforce, unapologetically. As mentioned, our team relied largely on a supportive and collegial culture to function. That is a good thing. But, as such, establishing authorities of a more hierarchical nature is not always easy: it comes with a risk of creating unhealthy power relations. A common illustration that I have found in several places on the internet, mostly blogs, is the one below.

Source: [I can’t remember which blog I first pulled this from, but it appears in several. If anyone knows the original source, I will provide credit or pull it down, if need be.]

In several blogs, I have seen this picture accompanied by text that goes something like: “when the top guy looks down, they only see s***; when the bottom guy looks up, they only see a**h*****.” The top guy is portrayed as a manager or a CEO, and the layers typically reflect layers of management. This is a common view of a top-down management structure. We wanted to avoid the negative relations often associated with such structures. However, we did find that some degree of top-down enforcement is needed to maintain minimum standards across a team, and minimum levels of accountability and fairness.

As with the effort to establish better management processes and tools, this effort to better establish authorities and accountability is also ongoing, and here too I will not get into detail in this post. But this is a second balancing act: establishing clear expectations, responsibility, accountability and a structure to enforce such accountability, without losing the supportive and collegial culture built collectively over time.

So, for whatever it’s worth, there is my personal account. I’m sure I will come back to this in the future, with the critical eyes of our ever transforming selves, as it should be, neither kind, nor mean, but hopefully as honest as self-assessments can possibly be. I also hope to, in the future, further develop how we’ve been rolling out our efforts to better balance leadership and management, bottom-up and top-down structures and processes, the extent to which our efforts succeeded and any insights from the experience.

References

Project Management Institute (PMI). 2017. A Guide to the Project Management Body of Knowledge (PMBOK Guide). Sixth Edition.


All Things Shining Part II

He deals the cards as a meditation

And those he plays never suspect

He doesn’t play for the money he wins

He don’t play for respect

 

He deals the cards to find the answer

The sacred geometry of chance

The hidden law of a probable outcome

The numbers lead a dance

 

– Shape of My Heart, Sting

So, even after skipping to the final chapter and registering my initial thoughts on this site (see below my post “Initial Thoughts on ‘All Things Shining’”), I went back to the other chapters and finished reading the book (yep, I actually did). Doing so gave me a lot more context on where the authors are coming from and simultaneously served as an organized introduction to some philosophers and literature pieces I knew little or nothing about.

Grossly oversimplifying, their main point, as I understand it, is that there are opportunities to experience the sacred in a Godless world. There is no need to believe in a God (or Gods) to do so, only the predisposition to perceive and experience “moods” in the world around us, to be part of those moods that come and go like a wave (“whoosh”), and to nurture our capacity to do so the way an artisan nurtures a craft. They contrast their view with that of other philosophers and writers, briefly describing a history of western thought that evolved from a polytheistic experience of “moods,” such as those in Homer’s work, to a more monotheistic worldview in classical Greece, and then through two paradigm shifts (Jesus Christ and Christianity, and the Enlightenment and René Descartes) that gradually brought us to a nihilist reality centered on the self-sufficient individual (I find my own pretense of summarizing much of a book in one paragraph astonishing, but…there it is!). For my own benefit, I attempted to organize my understanding of their portrayal of this history of philosophy in Figure 1 below (including a few of my unresolved questions).

Figure 1 – Notes that likely only I am able to follow


There seems to be: a) an essential assumption in the authors’ reasoning, and b) an essential observation.

The assumption is that meaning is to be found outside of ourselves; we cannot successfully impose it. They make this argument in the second chapter of the book, when discussing David Foster Wallace’s nihilism and his proposition that we can impose meaning on our reality. They state this possibility is “the most demanding and the most impoverished all at once” (p. 47):

  • Most demanding because:
    • It raises the stakes for happiness and demands a kind of bliss that supersedes any kind of earthly condition (pp. 47-48). They later equate this with the Buddhist concept of Nirvana (see pp. 163-164).
    • It demands that bliss be constant and achieved at all times (p. 48).
  • Impoverished because there is no place for gratitude (p. 48).

They reject this option as a source of meaning and re-emphasize the rejection on several occasions throughout the book. For example, on p. 142: “The history of the last 150 years suggests that we are not the proper source for meaning in the world.”

I, quite frankly, don’t see how they deduce any of the points above. The fact that they seem convinced that they have made their case, and discuss the rest of the book as if meaning must be found outside the self, unfortunately weakens a book that I otherwise find very interesting, illuminating even. To top it off, as mentioned in my previous post below, they take the cheap shot of using the example of David Foster Wallace’s suicide to emphasize that this is the only possible ending. Thinking back, if all this had sunk in while I was reading the book, I probably would have stopped right there. But I am glad that I continued reading and can enjoy the rest of the book by treating the above as an assumption made by the authors, not something they really set out to demonstrate. I can live with that and address the assumption elsewhere.

The observation I can easily agree with: we cannot control everything that happens outside of ourselves. We do not fully control our world and the events that surround us. This seems to me to be patently and observably true. It is even arguable that we do not control everything about ourselves (such as quick, intuitive thinking and involuntary biological processes).

Given the “assumption” and the “observation” (by the way, these designations are mine, not the authors’), they follow a path of wanting to be in sync with, immersed in, and aware of this uncontrolled and uncontrollable environment and the external “moods” it generates, with the idea that, in doing so, we create an opportunity to experience meaning, something we can call sacred. I see the appeal of this way of thinking, and I see parallels with Eastern philosophy that, for some reason, the authors seem to want to negate.

Now here is where my thoughts go next: what if we look at the whooshing, the moods, the physis we are exposed to as being generated by random events? We can even choose to look at our own choices and the choices of those around us as generated by random events, or as individual outcomes of random events (leaving aside the fact that, in that case, we probably wouldn’t discuss them as “choices”). If we think about the examples the authors give in these terms, how would our outlook on meaning differ?

To use one of the authors’ examples, say we are at a sporting event and collectively experience an athlete “in the zone.” We all realize what is happening and collectively feel we have been part of something special. For this special moment to occur, we need to have gone to the sporting event, and we need to have been open to experiencing the athlete’s performance as something special. In addition, those around us need to have done so as well, and at least one athlete must have been “in the zone” that day. We can treat the aspects of this experience that we do not control as random events: the choices and behaviors of others attending the game, and the likelihood that at least one athlete will be “in the zone.” In that case, our special experience, our experience of the “sacred,” would be one possible outcome of a joint probability distribution over the random events we are exposed to. Why is there a stronger argument to feel gratitude in this case than in any other outcome of the joint distribution?
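The joint-distribution framing can be made concrete with a small simulation. All of the probabilities below are invented for illustration (they are mine, not the book’s), and the events are assumed independent; the point is only that the “sacred” moment is one low-probability outcome among many:

```python
import random

random.seed(0)

# Hypothetical probabilities, chosen purely for illustration.
P_WE_ATTEND = 0.3      # we chose to go to the game
P_CROWD_OPEN = 0.6     # those around us are receptive to the moment
P_IN_THE_ZONE = 0.05   # at least one athlete is "in the zone" that day

def shared_sacred_moment():
    """One draw from the joint distribution of three independent events."""
    return (random.random() < P_WE_ATTEND
            and random.random() < P_CROWD_OPEN
            and random.random() < P_IN_THE_ZONE)

trials = 100_000
hits = sum(shared_sacred_moment() for _ in range(trials))

# For independent events, the joint probability is just the product.
expected = P_WE_ATTEND * P_CROWD_OPEN * P_IN_THE_ZONE  # 0.009
print(f"simulated: {hits / trials:.4f}, analytic: {expected:.4f}")
```

Under these made-up numbers, the collective moment occurs in roughly 1% of the possible worlds; every other draw from the same joint distribution is an equally legitimate outcome of the same random process.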

In the end, it seems to me that what really matters is what the authors assumed away in the first two chapters of the book: whether meaning can be attributed from within, whether it needs to be defined in relation to something outside ourselves, and whether meaning should be attributed at all, at least if we look at the world as one that can be described by joint distributions of random events. It is as if the authors’ effort to propose an alternative to today’s nihilism relies on first simply assuming away nihilism as a viable outlook on life. It seems to me the central question is to what extent we think of what we experience in the world as random or deterministic.

The discussion above takes me to a memory of my father. My father was an engineer. He was very intelligent and liked math. He was also very Catholic. I have a recollection of a period when he was logging the results of the national lottery where he lived into a table. The idea was that, perhaps, he could find a pattern in the lottery numbers. A pattern that would maybe signal some higher order, something he could be in sync with, tuned into. I am sure he would have liked to win the lottery, and he would have interpreted it as a gift from God. Many Catholics view the world as a place where everything happens for a reason. But I honestly think that the discovery of a pattern itself would have been far more rewarding to him than any payout from winning the lottery. It would have proven the existence of a higher order and at the same time provided some personal intellectual gratification. It is possible to think of my father’s attempt, and it seems to me of the entire book discussed above, as ultimately stemming from a kind of wishful thinking. One that is not unappealing to me, yet difficult for me to accept. One that is difficult for me to accept, yet not unappealing to me.

I don’t think my father knew much statistics, or perhaps he didn’t care much for it. And, spoiler alert, to my knowledge he never found a pattern in the lottery numbers, or at least not one that proved successful in winning the national lottery. Interestingly enough, many years later he did win a car in a lottery run by the club he belonged to: it was close to Christmas time, and he quite desperately needed a new car. He interpreted it as a gift, perhaps even as a reward for his faith, and I am sure he was thankful. Relevant to the discussion above: he did not pick the numbers.
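A short simulation shows how readily “patterns” arise in purely random draws, which is the statistical trap my father’s table-keeping invited. The lottery format below is invented for illustration and does not correspond to any real lottery:

```python
import random

random.seed(42)

# A toy lottery: 6 numbers drawn from 1-60 each week.
def draw():
    return set(random.sample(range(1, 61), 6))

draws = [draw() for _ in range(520)]  # ten years of weekly draws

# Candidate "pattern": at least two numbers shared between consecutive draws.
repeats = sum(
    len(draws[i] & draws[i + 1]) >= 2
    for i in range(len(draws) - 1)
)

# Even in purely random data, such coincidences show up again and again,
# which is why an apparent pattern is not, by itself, evidence of any
# higher order.
print(f"{repeats} of {len(draws) - 1} consecutive pairs share 2+ numbers")
```

With independent 6-of-60 draws, two consecutive draws share at least two numbers roughly 10% of the time, so a decade of results contains dozens of these "echoes" by chance alone; a pattern-seeker logging results into a table would have no shortage of tantalizing coincidences.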

Reference:

Dreyfus, Hubert and Sean Dorrance Kelly. 2011. All Things Shining. Reading the Western Classics to Find Meaning in a Secular Age. New York: Free Press.


Initial Thoughts on “All Things Shining”

I started reading “All Things Shining: Reading the Western Classics to Find Meaning in a Secular Age,” by philosophy professors Hubert Dreyfus (Berkeley) and Sean Dorrance Kelly (Harvard). At the end of the second chapter, after using authors David Foster Wallace and Elizabeth Gilbert to illustrate contrasting views on how meaning is generated or attributed in our daily lives, the chapter closes with the following paragraph:

“The question that remains is whether Gilbert and Wallace between them have completely covered the terrain. In Wallace’s Nietzschean view, we are the sole active agents in the universe, responsible for generating out of nothing whatever notion of the sacred and divine there can ever be. Gilbert, by contrast, takes a kind of mature Lutheran view. On her account we are purely passive recipients of God’s divine will, nothing but receptacles for the grace he may choose to offer. Is there anything in between? We think there is, and we will try to develop it in the final chapter of the book.” (p. 57)

Needless to say, I skipped to the final chapter.

In that chapter, they suggest that there are present, today, in our culture, opportunities for a kind of experience in which we neither need to impose meaning on our world nor passively await meaning to descend upon our lives.

They start by suggesting that there are often collective experiences of marvel, bliss, exultation. Experiences like those where spectators attending a live sporting event witness an athlete “in the zone” and are collectively taken by the experience. Or when, also collectively, we rejoice in a skilled orator’s speech. They equate these experiences with moments of realization of Homer’s notion of physis: how the world is insofar as it presents itself to us. They also suggest the exultation of these collective experiences comes and goes like a wave. They use the term “whoosh” to describe it.

They then recognize the dangers of “whooshing,” like when the collective experience is dominated by some type of mass mentality and pack behavior, and where rational self-control is obliterated.

But they claim there is another type of experience available to us, one that also offers an opportunity for experiencing meaning in our physical connection to the world and that, in addition, can be used to discern between “good” and “bad” whooshing. They equate that experience with the Aristotelian term: poiesis. Poiesis captures a craftsman-like practice of developing an intimate understanding of, and relationship with, some aspect of our world. A relationship characterized by a “feedback loop between craftsman and craft” (p. 211) and through which meaning also arises.

But poiesis too has its limitations: in our world, it is under attack by technology that reduces our need to deeply understand our world in order to reach our goals. An example they give is GPS, with which we can move from one place to another simply by following orders, without ever building significant knowledge of our surroundings.

The authors then argue that it is up to us to discover, in what we already care about, the opportunities for poiesis, whether in drinking a cup of coffee in the morning, enjoying a walk, or the company of a friend. In other words, we should discover it in our relationship to the world and then nurture it, transforming routine into ritual. They argue that this is the realm of the sacred that currently exists in our world: a rich polytheistic world where the sacred manifests itself through physis and poiesis, and where technology has its place without completely erasing the opportunities for the sacred.

I am very much enjoying the book and intend to read the chapters in between (really…at some point), but here are a few thoughts based on the three chapters I have read so far:

  • In trying to describe how we can discover what we care about and build rituals around routines to nourish our experience of being in this world through poiesis, they use the example of having coffee each morning. They suggest asking what we like about this routine, whether it is the warmth of the coffee, the striking black color, or the aroma. That is, they appeal to the senses. This appeal to the senses is very similar to how I have learned to practice mindfulness, to be present in this world, based on my understanding of the practice and the teachings of Thich Nhat Hanh. Other parallels to Eastern ways of thinking could be drawn when discussing Wallace’s “This is Water” commencement speech in chapter two. Yet the authors do not seem interested in exploring these potential parallels, at least not in the parts I have read.
  • In discussing Wallace’s nihilism, they propose that, to the extent that he can see any space for the sacred, his attempt to reach it is by imposing meaning on experience, creating this meaning out of nothing, and doing so constantly; there are no constraints on the meaning Wallace can impose on his experience. They suggest this self-imposed task is not humanly possible. In fact, they state that “in such a world, as Melville understood, grim perseverance is possible for a while; but in the end suicide is the only choice” (p. 50). I a) do not see how or why “suicide is the only choice,” and b) find this statement in bad taste, to say the least, since Wallace did commit suicide. Wallace seems an unfair choice of example, unnecessarily making use of someone’s actual suicide to advance their argument.
  • The authors describe Wallace and Gilbert as having a similar view of the purpose of writing: translating life, conveying what it means to be human, becoming less alone. If interpreting our world and our condition is itself an intrinsic and inseparable part of the human condition, one could argue that storytelling does not just interpret what it means to be human but actually contributes to creating that condition. Storytelling would then be a way of creating, rather than just interpreting, our world. I have for a while found this way of thinking appealing, and it differs from what the authors claim to be the views of Wallace and Gilbert. It is perhaps more in tune with Mario Vargas Llosa’s story in his book “El Hablador.” Perhaps something I will explore further in the future.

Reference:

Dreyfus, Hubert, and Sean Dorrance Kelly. 2011. All Things Shining: Reading the Western Classics to Find Meaning in a Secular Age. New York: Free Press.
