"Bay-area man proposes inventing a field of study that already exists." —Mark Riedl
This is an interdisciplinary book. It is likely to alternately amuse, astound, enlighten, offend, enrich, even enrage, and disappoint, because it deals with topics on which some specialists feel confident that they have cornered the market: philosophers, biologists, library scientists, lawyers, information theorists, etc. It is written from the standpoint of someone enamored with, and concerned about, machine learning (ML) and artificial intelligence (AI) systems and their applications, with the wonder of a theoretical physicist 'discovering' the Humanities late in his career.
The topic (or process) in question goes by many names: classification, taxonomy, judgment, choice, decision theory, category theory, boundary theory, and various other "theories." These names are themselves categories, yet this book is about how similar they all become when one attempts to automate their creation, and people are certainly seeking to do so.
As fields of study become increasingly specialized, people with similar ideas are often unaware of one another's work, resulting in isolation, 'reinventing the wheel,' and so on. This highlights the need for interdisciplinary surveys such as this book. At the same time, the power of ML/AI systems to 'move in' on a variety of fields in a content-agnostic way and subsequently "beat the experts" (the topic of the chapter "Space Invaders") has produced a generation of machine learning developers endowed with seemingly Promethean abilities but lacking both historical perspective and an appreciation for the relevance of acquiring domain-specific expertise. (See the chapter "Gimme Some Data" for more on this.)
This combination of isolation-via-specialization and lack of historical perspective is illustrated by an amusing exchange on Twitter from June 2019, in which technologist Tristan Harris replied to a remark by Aviv Ovadya:
…to which philosopher Shannon Vallor chimed in:
…and IT law specialist Chuck Cosson suggested an even longer perspective:
And, not to put too fine a point on it in terms of Twitter posts, the following came up while writing this section:
Thus, there is potential for real harm when one operates from the "Bay Area Man" mentality, given that the principal systems affecting the freedoms of other humans involve automated classification of one form or another.
For our study of the activity of "classification" (which we use as a grab-bag term for all the words listed above), we will attempt to connect with a broad, historical perspective, reaching all the way back to Aristotle, and survey how humans have tackled this activity through history, up to and especially including modern 'data-driven' approaches for machines, and what these entail.
Why classification? This is fleshed out in the following chapters, "One of These Things is Not Like the Others," "The Run Down," and "Classification as Power." Briefly, it is because human individuals classify all the time, and human society runs on classification. When I was 'growing up' professionally as a physicist doing classical field theory, I thought the only important problems involved "regression" (i.e., the approximation of continuous functions), and I wasn't even sure classification deserved to count as 'doing science'! This book is in part my wonderful journey toward the realization that classification is at the heart of what it means to be human.