The literary canon consists of texts from the past that are considered to be of high literary quality. For contemporary fiction, the attribution of literary quality is based not only on the text itself, but also on the prestige of the text’s genre and the reputation of the author. What about the past? Most of the authors who have survived into the current literary canon are male and white. This suggests that the literary canon is selective rather than inclusive. It excludes not only texts, but also their readers. What is more, books that once swept many readers away are often discarded when those readers are gone. We will never know what our parents, grandparents, and earlier generations read for their own pleasure and personal benefit. In our project we take two steps to change this. First, we will ask people to submit photographs of the bookshelves or bookcases of their departed loved ones. Using state-of-the-art Scene Text Recognition (STR), we will read the text on the books in these photographs, establish each book's author and title with the help of library catalogues, and link the book descriptions to Delpher and DBNL. We will identify titles at risk of being lost forever, procure a copy, and have it digitized and added to Delpher and DBNL. Second, we will create a large digital corpus that includes all novels from the last 200 years. We will apply computational tools to find out how texts included in the literary canon differ (or not) from those that are excluded. We expect to identify forgotten classics and will arrange to have these republished so that they can reach a new reading audience. Together, these steps will help to answer the question of how objective the literary canon is: is it a reflection of timeless value judgments, or does it reflect and perpetuate the cultural biases of the time in which the books were first published?


