One barrier to greater dialogue and understanding within Jurisprudence is the inability to appreciate the variety of forms and purposes among the different theories of or about (the nature of) law. Legal theorists are frequently to blame for the lack of clear discussion of methodological issues, as they are often not as clear as they might be regarding the nature of the claims they are making (e.g., descriptive versus prescriptive, conceptual versus empirical) or regarding the larger project of which their theories are a part. This entry attempts to offer a rough overview of the types and purposes of legal theory. The focus is primarily on theories about the nature of law. Other types of theories – e.g., regarding rights or the best approach to legal or judicial reasoning – may warrant a different analysis.
In general, one might divide theories about social practices and institutions roughly into four broad categories: (1) descriptive theories – theories that purport to state what is the case, offering an overview of current practices or understandings; (2) analytical or conceptual theories – theories that make claims about the intrinsic or necessary nature of some practice or institution; (3) theories which contain elements both of description and prescription; and (4) purely prescriptive, normative or critical theories – theories that argue for how practices or understandings should be reformed. (As will be discussed, the second category, analytical/conceptual theories, can also be seen as a subset of the first category, descriptive theories.) The categories will be discussed, in turn, in the coming sections (1.1, 1.2, 1.3, 1.4). Later sections will offer brief discussions of the related questions of whether methodological questions are specific to legal theory (2.0), the ontology of law (3.0), and the purposes of legal theory (4.0).
1.1 PURELY DESCRIPTIVE THEORIES
Legal theorists often refer to their theories as “descriptive” – but “descriptive” comes in many variations, some of which, like analytical and conceptual theories, are sufficiently distinctive that they will be discussed separately in the next section, 1.2. There are also interesting types of theories that seem to be neither descriptive nor prescriptive, but to be, in some ways, in-between. Those types of theories are discussed in section 1.3.
In general terms, a theory is “descriptive” if it purports to describe what is the case, rather than to make judgments about the (moral or other) value of the current situation, or to offer arguments for how things should be done differently. Many types of theories that are about law, but are not “jurisprudential theories” (narrowly understood), are clearly descriptive: e.g., sociological, anthropological, and psychological theories about the way people behave in legal roles or in response to legal regulation; and historical accounts of why particular legal systems developed the way they did.
When one is offering a theory meant to range over a large number of instances of some institution or practice (across jurisdictions, or over time), there is always the problem of how to combine the data. For example, how does one have a “theory of law,” when legal systems (however understood) clearly differ from country to country and in any given country over time? An attempt to offer a purely descriptive theory of a vast social practice like law seems at risk of becoming little “more than a conjunction of lexicography with local history” (Finnis 1980, 4).
One needs some means of organizing the data that is the subject of one’s descriptive theory, and there are debates within the literature regarding how such selection can or should be done. For example, John Finnis accused Hans Kelsen of having erred in his construction of a theory of law, in that Kelsen purportedly tried to find a “lowest common denominator” – that which was common to all legal systems – rather than doing as Finnis argued should be done in such cases: finding what was characteristic of law in its fullest or most mature instantiation, even if some, or even many, legal systems did not have all of these characteristics. (Finnis 1980, 9-11)
Also, though a theory may not be “prescriptive” in the sense of suggesting reforms of current practices, to make some point about nature or purpose, the theory will likely emphasize some aspects of the practice and/or downplay others. For this purpose, many writers (e.g., Waluchow 1994, 15-29; Raz 1994, 219-221) differentiate forms of non-moral evaluation needed to construct descriptive, conceptual or analytical theories from any sort of moral prescription.
H. L. A. Hart famously argued for organizing a theory of law by viewing the practice at least in part from the perspective of a participant who “accepts” the legal system as giving him or her reasons for action. (Hart 1994, 79-91) Though this “hermeneutic” approach to theory construction is not universally accepted, even Ronald Dworkin and John Finnis, who criticize Hartian legal positivism in particular and the project of purely descriptive legal theory in general, accept the idea of building a theory around an insider’s perspective (though their ideas about how to build on an insider’s perspective differ in important ways from Hart’s views). (Dworkin 1986; Finnis 1980, 3-18; see generally Bix 1999).
1.2 ANALYTICAL OR CONCEPTUAL THEORIES
Many jurisprudential theories purport to offer true claims about law generally. Such theories are usually making analytical or conceptual claims about law, as opposed to making a claim that applies only to a particular legal system at a particular moment in time. (Also, there are some theorists for whom it is not clear whether their theories are best understood as conceptual or as “merely” descriptive; this seems, for example, to be a matter of ongoing debate regarding the best characterization of John Austin’s work (Cotterrell 2003, 81-83).)
“Analytical” or “conceptual” theories usually purport to be “descriptive”, in the sense that they purport to describe the way things are rather than to criticize or to prescribe. However, such theories are usually not “merely descriptive”, in the sense that the theorist is doing more than merely reporting data or observations.
Conceptual analysis usually involves a philosophically ambitious claim that the theory has captured what is “essential” to some concept or practice: the characteristics “necessary” for a practice or institution to warrant the label in question. While such claims about “nature” and “essence” were traditionally associated with Platonic metaphysics, there are less metaphysically ambitious modern versions of such claims (e.g., Bix 2003b). For example, Raz (1996) defends an understanding of legal theory as conceptual analysis, and in doing so argues that such theories try to explain “our concept of law”, not some universal or timeless (Platonist) concept of law. For Raz, legal theory is an attempt to get a clearer insight into an idea that is central to a community’s self-understanding.
Hans Kelsen’s “pure theory of law” (reine Rechtslehre) (e.g., Kelsen 1967; 1992) can also be seen as a special form of analysis, one grounded on a neo-Kantian methodology. At its essence, Kelsen’s theory is an effort to determine what follows from the fact that people sometimes treat the actions and words of other people (legal officials) as valid norms.
There have been a number of challenges to the value or tenability of purely descriptive or descriptive-conceptual theories in jurisprudence. Some of the challenges have come from various versions of natural law theory (e.g., Finnis (2000)), an approach that asserts that moral objectives and moral ideals are inherent to the nature of law, and therefore central to its understanding.
John Finnis (2003) has also offered a separate challenge: that if the descriptive/conceptual theory of legal positivism is understood as determining the nature of law, where this “law” is understood as separate from the normative question of how judges should decide cases (see, e.g., Raz 1998, 4-6) or how citizens should act in the face of government decrees, then this is an uninteresting and unworthy inquiry.
A different sort of challenge has come from those who doubt conceptual analysis either generally, or at least in areas where the concepts have normative overtones. The general challenge comes from naturalism (not to be confused with natural law theory; naturalism argues for a more empirical or scientific approach to topics, like epistemology, formerly approached in an a priori or conceptual way), and has been advocated in legal philosophy primarily by Brian Leiter (2003). Leiter, building on the well-known philosophical work of W. V. O. Quine (1951), claims that there are no “conceptual” truths to discover, and that theorists about law (or judicial reasoning or other legal phenomena) should confine themselves to empirical investigations of actual practices.
The “normative” challenge to descriptive conceptual analysis asserts that conceptual analysis of a concept like “law” (or “democracy” or “justice”) is inevitably contestable and evaluative. (e.g., Dworkin 2004; Perry 1998). Stephen Perry argues that one must inevitably choose among alternative tenable theories about law, and that this selection must be made on political or moral grounds. Ronald Dworkin’s interpretive theory of law portrays theories about (the nature of) law as theories that attempt to show the value of legality as part of a larger web of political and moral values.
1.3 BETWEEN DESCRIPTION AND PRESCRIPTION
Despite the commonly accepted distinction between description and prescription, there are a number of approaches and types of theory that seem to lie uneasily somewhere on the spectrum between “purely descriptive” and “purely prescriptive”:
(1) “Semantic theories.” Ronald Dworkin famously characterized H. L. A. Hart’s theory of legal positivism as being best understood as a semantic theory – a definition of the meaning of the word “law.” (Dworkin 1986; 2004) However, this characterization was rejected by Hart himself, as well as by most commentators. (Hart 1994, 244-248; Endicott 2001; for a defence of Dworkin’s claim, see Stavropoulos 2001). At a minimum it is worth noting that no theorist proffering a theory of law has characterized his or her own theory as being (“merely”) a definition of the word “law.”
(2) Variations on descriptive theory. Even descriptive theories themselves often seem to deviate from pure description. As was summarized earlier, the construction of a theory about some social institution or practice requires some amount of selection or simplification: (a) to prevent the theory becoming simply a messy restatement of complex reality; (b) to allow the theory to extract some basic insight about the institution or practice; and/or (c) to allow the theory to focus on the “fullest” or “highest” instantiation of the institution or practice, rather than what appears to be common to all instantiations. These forms of selection within descriptive theory come under various titles: “principles of theory construction” (e.g. Waluchow 1994, 19-21), emphasizing important features, and Weber’s “ideal types” (cf. Finnis 1980, 9-11).
(3) Rational Reconstruction. Within doctrinal legal scholarship, it is conventional (in many countries) for commentators to try to restate court decisions, or whole areas of law, in a way largely consistent with the outcomes of the cases, but restating the justifications offered to make them more persuasive. (In common law countries, such restatements of areas of law that had been primarily developed by judges – like Contract Law, Tort Law (e.g., Owen 1995), and Property Law – sometimes go under the title, “Philosophical Foundations of the Common Law.”)
(4) Ronald Dworkin’s Interpretive Approach. Dworkin’s influential legal theory (e.g., Dworkin 1986) is grounded on the “constructive interpretation” of official actions. (“Constructive interpretation” is the principle Dworkin would have applied not only for resolving legal disputes, but also for constructing theories about the nature of law.) For law, Dworkin would have judges decide cases by first finding the theory of past official actions (legislation, constitutional provisions, and judicial decisions) related to the dispute that would simultaneously adequately fit those past actions while making that area of law the best it can be (morally or politically). Dworkin’s interpretive approach, if followed in practice, would likely approximate “rational reconstruction”.
1.4 PRESCRIPTIVE THEORIES
Along with descriptive theories, and the variations of descriptive theories, described above, theories can of course be prescriptive: not focusing on describing current practices, but focusing instead on urging a new or reformed practice. The role of such theories in modern legal theory will be discussed more fully within the overview of purposes of legal theory (4.0).
2.0 ARE THE QUESTIONS SPECIFIC TO LEGAL THEORY?
Many legal theorists seem insufficiently attentive to the work already done elsewhere in social theory regarding the problems of theory-construction. For the most part, theories about law will raise the same questions as theories in other social sciences. For example, one way to distinguish theories is to take into account different schools of social theorizing: e.g., whether the focus is on individuals or on structures, and whether the basic account of social action is behavioristic or hermeneutic (also sometimes called “interpretive” or “Verstehen”). (Bix 2003a, 7-8; Lucy 1999, 17-32). This distinction has its greatest force in theories of social action within legal systems – e.g., theories of judicial behaviour – but it also has implications for more abstract theories of law. For example, H. L. A. Hart’s criticism of John Austin’s command theory of law, and Hart’s subsequent development of his own version of legal positivism, is centrally understood in terms of a hermeneutic rejection of a more empirical or more “scientific” approach. (Hart 1994, 18-123; Bix 1999) Already mentioned is the critique by some commentators of conceptual theories of law on the basis that such theories are generally unsupportable in law.
This is not to claim that there is nothing distinctive about law or legal theory. For example, theories of law may be distinctive in that law seems to function both as a kind of social institution and as central to our practical reasoning (e.g., Finnis 2000, 1602-03); and this “double life” may be a key to understanding the difficulty in constructing a theory about the nature of law.
3.0 ONTOLOGY (BASIC BUILDING BLOCKS)
One type (or topic) of legal theory involves a metaphysical (ontological) explanation of law and legal concepts. The Scandinavian legal realists, in particular, focused on this question, though aspects of such questions can be found in a wide range of other theorists. The Scandinavian realists, building on views that paralleled (though did not equate with) logical positivism, were sceptical of entities that could not be understood in terms of observable, empirical data. (e.g., Hägerström 1953, Ross 1957, Lundstedt 1956, Olivecrona 1971)
If one rejects the metaphysical/ontological challenge of the Scandinavian legal realists, one is still faced with questions regarding the metaphysical status of concepts and claims used in legal practice. This inquiry is sometimes presented from a different perspective: in terms of the nature of legal truth, or even general questions of legal reasoning. For legal truth: when one says that a certain law-related claim is correct (“X has a right to possession of A” or “there is a valid contract between R and S”), in light of what is that claim true or false? None of the obvious alternatives seems attractive. On one extreme, already discussed, is the metaphysically sceptical position that demands that legal concepts be reduced to observable, empirical terms. A different sort of sceptical view would argue for reducing legal concepts to descriptions and predictions of official actions. This perspective, sometimes described as a “predictive theory of law”, has some initial attractions, but also well-known weaknesses. (e.g., Hart 1994, 88-91)
At the other extreme would be a kind of Platonism in which legal concepts are thought to correspond with metaphysical entities (one modern theorist whose work sometimes comes close to this sort of Platonism is Michael Moore (e.g., 2000)). Related efforts try to ground the objectivity underlying legal concepts through a kind of “natural kinds” analysis, derived from theories of meaning and reference. (Stavropoulos 1996)
Various attempts have been offered to create a middle position – one purportedly more in line with common understandings of terms, without requiring commitment to ambitious or unusual metaphysical claims. Among the more prominent such theories are institutional fact theories (e.g., MacCormick & Weinberger 1986) and Dworkin’s interpretive theory of law (Dworkin 1986). Alternatively, some theorists have simply urged that one can avoid possible ontological issues by equating the meaning of legal terms and concepts with the rules for their use – a response to metaphysical questions similar to that urged by Ludwig Wittgenstein. (Hart 1954; Bix 1995)
4.0 PURPOSES OF LEGAL THEORY
As mentioned earlier, the topic of the purposes of legal theory is intimately tied up with the topic of the nature(s) of legal theory. A wide variety of purposes are served by theories about law. Some theories, especially those purporting to be analytical or descriptive, can be justified on the narrow basis of seeking truth and knowledge. Here such knowledge can be seen as the straightforward collection of facts that would result from a simple description, or the deeper sort of insight or understanding that might result from a quasi-descriptive model. (However, such knowledge, once gained, might also then play a role in an evaluative or prescriptive theory – a point important to the writings of a number of theorists, including both Jeremy Bentham (1996) and H. L. A. Hart (1958).)
In discussing types of descriptive theories, including analytical and conceptual theories (1.1, 1.2, 1.3), it was noted that many commentators argue that it is necessary, or at least valuable, for the construction of a theory of law to involve some amount of evaluation and selection. This view entails certain ideas about theories of law: that a certain lack of fit with the data is an acceptable cost for the insight a good theory might offer. (A comparable point is often made generally about modelling of behaviour, both in the physical sciences and the social sciences.)
A related point: jurisprudential theories are often offered as “explaining” a social practice or institution. “Explanation” is a central, but frequently poorly articulated, notion in discussing the point of descriptive theories. The reason one might put up with some simplifications, or even distortions, of the empirical reality in a (descriptive) theory is the benefit a good theory can offer by way of “insight” – showing something central to the nature of a social practice, or at least something interesting all instances of some category of practices or institutions seem to share.
Relevant to the earlier discussion of “rational reconstruction” (1.3), it should be noted that it often plays a significant role in the teaching of law and the training of legal advocates. However, the “rational reconstruction” used in teaching legal advocacy might differ in small but significant ways from the one offered for purely scholarly purposes. An advocate must have an eye not only to the best reconstruction of a muddled doctrinal area, but also the reconstruction that would seem best to the judges this advocate would face. Thus, if the best reconstruction of the prior cases would justify some right, but the current members of the country’s highest court are unlikely to recognize the right, the advocate might be better served by a reconstruction that excludes that right (at least until the membership on the highest court changes).
Critical theories of law aim more towards reform of current laws and practices than towards (mere) increased knowledge of or insight into those laws and practices. Many of the influential American legal theories of the 20th and 21st centuries – from American legal realism to law and economics, and including critical legal studies, critical race theory, and feminist legal theory – are best understood as being essentially criticisms of the current approach to legal regulation and/or judicial decision-making, combined with suggestions for how the system could be improved. (The reform- or justice-centred nature of critical legal studies, critical race theory and feminist legal theory is relatively self-evident. American legal realism and, especially, law and economics, may be harder cases, as both include claims that seem to be descriptive or analytical – that seem to be claims about the basic nature of rules, decision-making or law generally. However, both schools of thought are grounded on a view that law is instrumental, and the question quickly becomes, for theorists in both schools, either which ends law should pursue or how best to achieve the ends already chosen.)
When the first edition of Neil MacCormick’s H.L.A. HART appeared in 1981, it was notable for being the first book-length treatment of Hart’s legal philosophy, and it consisted largely of a sympathetic analysis of Hart’s most famous and enduring work, THE CONCEPT OF LAW (1961). As such, MacCormick’s book filled an important gap in the then existing literature, which lacked an accessible, introductory text on the main themes in Hart’s jurisprudence.
Nearly thirty years later, and sixteen years after Hart’s death, the critical literature on his thought has grown to enormous proportions, including several monographs, countless journal and law review articles and, most recently, a comprehensive biography, Nicola Lacey’s widely acclaimed A LIFE OF H.L.A. HART: THE NIGHTMARE AND THE NOBLE DREAM (2004). While the intellectual landscape has thus changed considerably, the need for a reliable introductory text on Hart’s contributions to jurisprudence is no less pressing. In this regard, the second edition of MacCormick’s book, updated to account for the major developments since its original publication, such as the posthumously published Postscript to THE CONCEPT OF LAW, continues to fill an important niche in Hartian scholarship.
MacCormick, who attended Hart’s lectures at Oxford as a graduate student in the early 1960s, does not undertake a complete survey of Hart’s scholarship, but instead aims more modestly at a “friendly” introductory account that provides a “sympathetic reconstruction of Hart’s main ideas” (p.13). At the same time, the analysis is not without critical bite, since MacCormick repeatedly claims throughout the book that “Hartian doctrine . . . points in the right direction but does not take us far enough” (p.159). He thus purports at various points (about which more below) to extend Hartian insights in a more rigorous and thoroughgoing manner than Hart himself did. The end result, as MacCormick admits, is a substantially amended conception of law. Indeed, Hart’s reaction to the first edition was to insist that “he considered himself a more hardened positivist than MacCormick had depicted,” which “made him out to be more of a natural lawyer than he wanted to be” (p.15). The present edition would no doubt elicit the same reaction.
As in the first edition, the heart of the book remains MacCormick’s reconstruction of Hart’s general theory of the structure of modern municipal legal systems as “a union of primary and secondary rules,” which is set forth primarily in Chapters 9 through 11. In an “inversion of Hart’s order of proceeding” in THE CONCEPT OF LAW, this discussion is preceded by a series of chapters examining “the building blocks of Hart’s theory of legal order” (pp.61, 117), namely his conception of social rules, his views on positive and critical moral theory, and the notions of obligation, duty and wrongdoing, power-conferring rules, and rights. It is followed by chapters devoted to an exposition of Hart’s theory of crime and punishment, the relation of law and morality, and an epilogue addressing the methodological concerns discussed in the Postscript.
Before turning to these issues, MacCormick begins, usefully I think, with a brief biographical sketch that sets the stage by reviewing Hart’s early career as a practicing lawyer, his credentials as a proponent of egalitarian social democracy, which crucially informed his reformist impulses, and the philosophical milieu in postwar Oxford, which was “the intellectual context to which his analytical jurisprudence belongs” (p.23). Perhaps surprisingly, in discussing the distinction between “law” and “politics” that figures so prominently in American constitutional discourse, MacCormick candidly observes that Hart’s views on the relation of law and political morality did not escape the parochial concerns of his time and place. “Though he claimed it applied to legal systems quite generally,” MacCormick writes, “Hart’s theory of law bears some of the marks of the . . . unspoken assumptions of the English lawyer” and is therefore “clearly recognizable as the work of an English lawyer of the twentieth century” (pp.8, 10).
While this might have been nothing more than a modest acknowledgment that no scholar can really lay claim to an Archimedean vantage point from which to evaluate social phenomena, it raises some thorny issues for Hartian jurisprudence. Stephen Perry (1996) has usefully distinguished between methodological and substantive versions of positivism, which are logically distinct. On the methodological side, as MacCormick suggests, Hart’s theory makes a claim to universality, in the sense that all genuine legal phenomena are assumed to possess a certain set of shared features or characteristics, regardless of time and place. In this view, the task of the legal theorist is to provide a morally neutral “descriptive account of what societies living under law all have in common” (p.210).
To be sure, such a description should take into account the participant’s perspective, the so-called “internal point of view,” which may or may not include normative considerations, but any such connections between law and morality are a strictly contingent matter. Hart is not much concerned with the reasons why officials accept properly pedigreed social rules. In particular, while moral considerations might be a criterion of validity in some legal systems (e.g., the Bill of Rights to the U.S. Constitution), this is by no means a conceptual necessity. Hartian reportage thus aspires to be a scientific, explanatory-descriptive enterprise that may be conducted independently of any context-dependent features of a particular legal system, such as the moral and political values which it subserves.
But serious reservations have been raised about the plausibility of this entire approach to jurisprudence. In the first place, as Perry points out, existing legal systems are artifacts of human cultural construction, rather than a “natural kind,” such as the elements of the periodic table, each instantiation of which may be said to share some common essence or function. For this reason, a general theory of law as such may not be a scientifically fruitful explanatory category, perhaps no more enlightening, as Brian Bix (1999) puts it, than “a theory of all objects that begin with the letter ‘N’.” Whatever jurisprudes are up to, it does not seem to be science in any ordinary sense of the term. (But see Brian Leiter (2003) for a vigorous dissenting view).
Moreover, whatever might plausibly be said about all legal systems, past and present and across cultures, is likely to be so abstract as to be without much practical significance. As noted above, Hart readily concedes that legal phenomena must be understood and described from a hermeneutic point of view, with the important proviso that the theorist need not personally embrace this perspective. Be that as it may, this is always the perspective of those persons who are actual participants in a particular legal culture. In understanding legal phenomena, we always start from what Perry calls a “local methodological stance,” since “it is by no means evident how we would go about formulating pre-theoretical propositions about ‘all’ legal systems.” With such an understanding firmly in hand, we might well be in a position to formulate a more general concept of law in a comparative fashion, but since legal institutions and practices are intelligible only in view of some inherently contestable function, value, or purpose, such as the promotion of justice and the common good, such an inquiry seems to be an unavoidably normative enterprise.
It is not clear to me precisely where MacCormick stands on this issue. On the one hand, he seems to insist, with Hart, that the validity and content of law are strictly matters of social fact. This, in turn, leaves open the possibility that “faithful reports can be given about the character and content of some body of law even by somebody who has no commitment to the particular values in which these laws are grounded” (p.204). From this it also follows that “the bare existence of a legal system as a system of rules carries no guarantee concerning the substantive justice or moral satisfactoriness of the content of these rules” (p.208).
On the other hand, Hart defended the separability thesis, at least in part, in frankly normative terms, namely as a warning against “the risk of moral complacency about the uses of the concept of law.” Hart’s positivism thus remains “grounded in practical, indeed moral, concerns, not purely epistemic ones,” and MacCormick concedes that it is not clearly “tenable to base one’s methodology on a claim about the moral basis for insisting on detached juristic inquiry” (p.209).
MacCormick then takes this line of argument a step further. Following John Finnis, he says that “it can be argued most persuasively that law must in principle be oriented toward the common good of the community whose law it is and seek to realize justice among its members” (pp.208-209) (emphasis added). Understood in this way, the law as it is actually implemented in any given jurisdiction can and often does fall short of the distinctive values it ought to realize, but if an aspiration toward justice and the common good is an inherent feature of the concept of law, then the status of purely descriptive conceptual analysis seems uncertain. Indeed, MacCormick (2007) claims elsewhere that where a given law or set of laws “cannot be accounted for under any possible conception of justice that could reasonably be adopted or advocated by a reasonable person willing to subject his or her beliefs to discursive scrutiny, then what is thus done by way of rules and practices of governance would not properly count as law.” In such “extreme cases,” he says, the ostensible laws “carry no element of genuine obligation with them, though they may be backed by coercion enough” (p.209).
This is a congenial conclusion, perhaps, but it is certainly not Hart’s view of the matter. As is widely known, Hart’s project in THE CONCEPT OF LAW was to rescue positivism from the reductive Austinian formulation according to which law was essentially an order backed by an effective threat of coercion. The problem with this picture, Hart pointed out, is that one’s being obliged to obey a command under the threat of sanctions in the event of noncompliance in no sense gives rise to a genuine obligation. In order to distinguish habitual obedience, motivated perhaps by fear of punishment, from actual rule-following behavior, Hart insists that those to whom the rules apply must adopt a certain “reflective critical attitude,” namely they must willingly accept the rules as a shared or common “standard of behavior.”
In a pre-legal society, the “primary rules” of behavior are obligatory essentially because they are accepted as legitimate by the members of a group as forming part of its conventional morality. By contrast, in a modern legal system, Hart makes no appeal to the content of the primary rules. Instead, such rules are obligatory by virtue of their origin, because they are properly enacted according to a valid secondary rule, which is itself accepted as a social rule of the group.
Moreover, it is sufficient if the public officials who are responsible for the formulation and implementation of the primary rules – legislators, judges, and lawyers – willingly accept the secondary rules in a normative sense. For the system to function, the mass of the citizenry need only generally obey, although Hart allows that in a healthy society, most citizens will also adopt the internal point of view. In the last analysis, however, being under a legal obligation is, after all, to be subject to sanctions for disobedience, even for the members of a systematically oppressed minority. As Hart soberly observes, “coercive power, thus established on its basis of authority . . . may be used to subdue and maintain, in a position of permanent inferiority, a subject group . . . For those thus oppressed there may be nothing in the system to command their loyalty but only things to fear. They are its victims, not its beneficiaries.” Although MacCormick bristles at the suggestion (pp.194-195), it is difficult to escape the conclusion that Hart’s theory of law collapses into the very Austinian model he intended to reject.
These are difficult questions and much more would have to be said to make these brief comments convincing. But my suggestion is that, if Hart never really escaped the perspective of the typical socially liberal, upper middle class English lawyer of his day, as MacCormick says, then perhaps the substantive values he ascribes to the concept of law were not and could not have been as descriptively neutral as he supposed.
Stephen Perry’s commentary raised a number of quite interesting issues about the right way to understand Raz’s account of the authority of law and its compatibility with the kind of “descriptive” jurisprudence to which Raz and Hart (and me, in a different way) are committed. When Professor Perry’s paper is publicly available, I will try to address some of those issues, but for now I wanted to at least record one point on which I clearly misconstrued Perry in Chapter 6 of my book (“Beyond the Hart/Dworkin Debate: The Methodology Problem in Jurisprudence”). In an earlier paper, Perry had written: “If [the service conception of authority] is right, then the anarchist thesis that the state could never have the moral authority it claims is wrong.” I critiqued this as follows (p. 172 of my book):
Raz’s account of authority is perfectly compatible with ‘the anarchist thesis that the state [more precisely, the laws of the state] could never have the moral authority it claims,’ because Raz’s thesis is only that all laws (sincerely) claim moral authority, not that they actually have it. The anarchist thesis, in Razian terms, is simply the claim that law always fails to satisfy the Normal Justification Thesis. Nothing in Raz’s theory of authority or of law precludes it.
What Perry meant, however, is that the anarchist thesis that authority is impossible is wrong if Raz’s service conception of authority is plausible, and that’s right: the service conception of authority explains how someone can have a justified claim of authority over another (rational, autonomous etc.) person, which the (Wolffian) anarchist denies is possible. It is true that the service conception of authority is compatible with the anarchist claim that no state ever has authority, but that was not, in fact, the anarchist thesis at issue for Perry.
H. L. A. Hart is heir and torch-bearer of a great tradition in the philosophy of law which is realist and unromantic in outlook. It regards the existence and content of the law as a matter of social fact whose connection with moral or any other values is contingent and precarious. His analysis of the concept of law is part of the enterprise of demythologizing the law, of instilling rational critical attitudes to it. Right from his inaugural lecture in Oxford he was anxious to dispel the philosophical mist which he found in both legal culture and legal theory. In recent years he has shown time and again how much the rejection of the moralizing myths which accumulated around the law is central to his whole outlook. His essays on ‘Bentham and the Demystification of the Law’ and on ‘The Nightmare and the Noble Dream’ showed him to be consciously sharing the Benthamite sense of the excessive veneration in which the law is held in common-law countries, and its deleterious moral consequences. His fear that in recent years legal theory has lurched back in that direction, and his view that a major part of its role is to lay the conceptual foundation for a cool and potentially critical assessment of the law are evident.
This attitude strikes at the age-old question of the relation between morality and law. In particular it concerns the question whether it is ever the case that a rule is a rule of law because it is morally binding, and whether a rule can ever fail to be legally binding on the ground that it is morally unacceptable. As so often in philosophy, a large part of the answer to this question consists in rejecting it as simplistic and misleading, and substituting more complex questions concerning the relation between moral worth and legal validity. Let us, however, keep the simplistic question in mind; it helps to launch us on our inquiry.
Three theses with clear implications concerning the relation between law and morality have been defended in recent years. They can be briefly, if somewhat roughly, stated as follows:
The sources thesis: All law is source-based.
The incorporation thesis: All law is either source-based or entailed by source-based law.
The coherence thesis: The law consists of source-based law together with the morally soundest justification of source-based law.
A law is source-based if its existence and content can be identified by reference to social facts alone, without resort to any evaluative argument. All three theses give source-based law a special role in the identification of law. But whereas the parsimonious sources thesis holds that there is nothing more to law than source-based law, the other two allow that the law can be enriched by non-source-based laws in different ways. Indeed, the coherence thesis insists that every legal system necessarily includes such laws.
The main purpose of this essay is to defend the sources thesis against some common misunderstandings and to provide one reason for preferring it to the other two. The argument turns on the nature of authority, which is the subject of the first section. In the second section some of the implications of this analysis are shown to be relevant to our understanding of the law. Their relation with the three theses is then examined. The connection between law and authority is used to criticize Dworkin’s support of the coherence thesis, as well as the incorporation thesis advocated by Hart and others. The rejection of these views leads to the endorsement of the sources thesis. The essay concludes with some observations concerning the relations between legal theory, law, and morality. Throughout, the argument is exploratory rather than conclusive.
I. AUTHORITY AND JUSTIFICATION
Authority in general can be divided into legitimate and de facto authority. The latter either claims to be legitimate or is believed to be so, and is effective in imposing its will on many over whom it claims authority, perhaps because its claim to legitimacy is recognized by many of its subjects. But it does not necessarily possess legitimacy. Legitimate authority is either practical or theoretical (or both). The directives of a person or institution with practical authority are reasons for action for their subjects, whereas the advice of a theoretical authority is a reason for belief for those regarding whom that person or institution has authority. Though the views here expressed apply to theoretical authorities as well, unless otherwise indicated I shall use ‘authority’ to refer to legitimate practical authority. Since our interest is in the law we will be primarily concerned with political authorities. But I shall make no attempt to characterize the special features of those, as opposed to practical authorities in general or legal authorities in particular.
The distinction between reasons for action and reasons for belief may be sufficient to distinguish between practical and theoretical authorities, but it is inadequate to distinguish between authorities and other people. Anyone’s sincere assertion can be a reason for belief, and anyone’s request can be a reason for action. What distinguishes authoritative directives is their special peremptory status. One is tempted to say that they are marked by their authoritativeness. This peremptory character has often led people to say that in accepting the authority of another one is surrendering one’s judgment to him, that the acceptance of authority is the denial of one’s moral autonomy, and so on. Some have seen in these alleged features of authority a good deal of what often justifies submitting to authority. Many more derived from such reflections proof that acceptance of authority is wrong, or even inconsistent with one’s status as a moral agent. Elsewhere I have developed a conception of authority which accounts for its peremptory force while explaining the conditions under which it may be right to accept authority. Let me briefly repeat the main tenets of this conception of authority. Its details and the arguments in its support cannot be explored here.
Consider the case of two people who refer a dispute to an arbitrator. He has authority to settle the dispute, for they agreed to abide by his decision. Two features stand out. First, the arbitrator’s decision is for the disputants a reason for action. They ought to do as he says because he says so. But this reason is related to the other reasons which apply to the case. It is not just another reason to be added to the others, a reason to stand alongside the others when one reckons which way is better supported by reason. The arbitrator’s decision is meant to be based on the other reasons, to sum them up and to reflect their outcome. He has reason to act so that his decision will reflect the reasons which apply to the litigants. I shall call reasons of the kind which apply to the arbitrator dependent reasons. I shall also refer to his decision as a dependent reason for the litigants. Notice that in this second sense a dependent reason is not one which does in fact reflect the balance of reasons on which it is based. It is one which is meant to, i.e. which should, do so.
This leads directly to the second distinguishing feature of the example. The arbitrator’s decision is also meant to replace the reasons on which it depends. In agreeing to obey his decision, the disputants agreed to follow his judgment of the balance of reasons rather than their own. Henceforth his decision will settle for them what to do. Lawyers say that the original reasons merge into the decision of the arbitrator or the judgment of a court, which, if binding, becomes res judicata. This means that the original cause of action can no longer be relied upon for any purpose. I shall call a reason which displaces others a preemptive reason.
It is not that the arbitrator’s word is an absolute reason which has to be obeyed come what may. It can be challenged and justifiably disobeyed in certain circumstances. If, for example, the arbitrator was bribed, was drunk while considering the case, or if new evidence of great importance unexpectedly turns up, each party may ignore the decision. The point is that reasons that could have been relied upon to justify action before his decision cannot be relied upon once the decision is given. Note that there is no reason for anyone to restrain their thoughts or their reflections on the reasons which apply to the case, nor are they necessarily debarred from criticizing the arbitrator for having ignored certain reasons or for having been mistaken about their significance. It is merely action for some of these reasons which is excluded.
The two features, dependence and preemptiveness, are intimately connected. Because the arbitrator is meant to decide on the basis of certain reasons, the disputants are excluded from later relying on them. They handed over to him the task of evaluating those reasons. If they do not then reject those reasons as possible bases for their own action, they defeat the very point and purpose of the arbitration. The only proper way to acknowledge the arbitrator’s authority is to take it to be a reason for action which replaces the reasons on the basis of which he was meant to decide.
The crucial question is whether the arbitrator’s is a typical authority, or whether the two features picked out above are peculiar to it, and perhaps a few others, but are not characteristic of authorities in general. It might be thought, for example, that the arbitrator is typical of adjudicative authorities, and that what might be called legislative authorities differ from them in precisely these respects. Adjudicative authorities, one might say, are precisely those in which the role of the authority is to judge what are the reasons which apply to its subjects and decide accordingly, i.e. their decisions are merely meant to declare what ought to be done in any case. A legislative authority, on the other hand, is one whose job is to create new reasons for its subjects, i.e. reasons which are new not merely in the sense of replacing other reasons on which they depend but in not purporting to replace any reasons at all. If we understand ‘legislative’ and ‘adjudicative’ broadly, so the objection continues, all practical authorities belong to at least one of these kinds. It will be conceded, of course, that legislative authorities act for reasons. But theirs are reasons which apply to them and which do not depend on, i.e. are not meant to reflect, reasons which apply to their subjects.
The apparent attractiveness of the above distinction is, however, misguided. Consider an Act of Parliament imposing on parents a duty to maintain their young children. Parents have such a duty independently of this Act, and only because they have it is the Act justified. Further argument is required to show that the same features are present in all practical authorities.
Instead, let me summarize my conception of authority in three theses:
The dependence thesis:
All authoritative directives should be based, among other factors, on reasons which apply to the subjects of those directives and which bear on the circumstances covered by the directives. Such reasons I shall call dependent reasons.
The normal justification thesis:
The normal and primary way to establish that a person should be acknowledged to have authority over another person involves showing that the alleged subject is likely better to comply with reasons which apply to him (other than the alleged authoritative directives) if he accepts the directives of the alleged authority as authoritatively binding, and tries to follow them, than if he tries to follow the reasons which apply to him directly.
The preemption thesis:
The fact that an authority requires performance of an action is a reason for its performance which is not to be added to all other relevant reasons when assessing what to do, but should replace some of them.
The first and the last theses generalize the features we noted in the arbitration example. The normal justification thesis replaces the agreement between the litigants which was the basis of the arbitrator’s authority. Agreement or consent to accept authority is binding, for the most part, only if conditions rather like those of the normal justification thesis obtain.
The first two theses articulate what I shall call the service conception of authority. They regard authorities as mediating between people and the right reasons which apply to them, so that the authority judges and pronounces what they ought to do according to right reason. The people on their part take their cue from the authority whose pronouncements replace for them the force of the dependent reasons. This last implication of the service conception is made explicit in the preemption thesis. The mediating role of authority cannot be carried out if its subjects do not guide their actions by its instructions instead of by the reasons on which they are supposed to depend. No blind obedience to authority is here implied. Acceptance of authority has to be justified, and this normally means meeting the conditions set in the justification thesis. This brings into play the dependent reasons, for only if the authority’s compliance with them is likely to be better than that of its subjects is its claim to legitimacy justified. At the level of general justification the preempted reasons have an important role to play. But once that level has been passed and we are concerned with particular action, dependent reasons are replaced by authoritative directives. To count both as independent reasons is to be guilty of double counting.
This is the insight which the surrender of judgment metaphor seeks to capture. It does not express the immense power of authorities. Rather it reflects their limited role. They are not there to introduce new and independent considerations (though when they make a mistake and issue the wrong decrees they do precisely that). They are meant to reflect dependent reasons in situations where they are better placed to do so. They mediate between ultimate reasons and the people to whom they apply.
II. AUTHORITY AND THE LAW
I will assume that necessarily law, every legal system which is in force anywhere, has de facto authority. That entails that the law either claims that it possesses legitimate authority or is held to possess it, or both. I shall argue that, though a legal system may not have legitimate authority, or though its legitimate authority may not be as extensive as it claims, every legal system claims that it possesses legitimate authority. If the claim to authority is part of the nature of law, then whatever else the law is it must be capable of possessing authority. A legal system may lack legitimate authority. If it lacks the moral attributes required to endow it with legitimate authority then it has none. But it must possess all the other features of authority, or else it would be odd to say that it claims authority. To claim authority it must be capable of having it, it must be a system of a kind which is capable in principle of possessing the requisite moral properties of authority. These considerations, I shall argue, create a weighty argument in favour of the sources thesis. Let us review them step by step.
The claims the law makes for itself are evident from the language it adopts and from the opinions expressed by its spokesmen, i.e. by the institutions of the law. The law’s claim to authority is manifested by the fact that legal institutions are officially designated as ‘authorities’, by the fact that they regard themselves as having the right to impose obligations on their subjects, by their claims that their subjects owe them allegiance, and that their subjects ought to obey the law as it requires to be obeyed (i.e. in all cases except those in which some legal doctrine justifies breach of duty).
Even a bad law, is the inevitable official doctrine, should be obeyed for as long as it is in force, while lawful action is taken to try and bring about its amendment or repeal. One caveat needs be entered here. In various legal systems certain modes of conduct are technically unlawful without being so in substance. It is left to the prosecutorial authorities to refrain from prosecuting for such conduct, or to the courts to give absolute discharge. Where legally recognized policies direct such authorities to avoid prosecution or conviction, the conduct should not be regarded as unlawful except in a technical sense, which is immaterial to our considerations.
Does the fact that the law claims authority help us understand its nature in any way, beyond the sheer fact that the law makes this claim? If of necessity all legal systems have legitimate authority, then we can conclude that they have the features which constitute the service conception of authority. But it is all too plain that in many cases the law’s claim to legitimate authority cannot be supported. There are legal systems whose authority cannot be justified by the normal justification thesis or in any other way. Can it not be argued that, since the law may lack authority, a conception of authority cannot contribute to our understanding of what it is, except by showing what it claims to be? This conclusion is at the very least premature. It could be that, in order to be able to claim authority, the law must at the very least come close to the target, i.e. that it must have some of the characteristics of authority. It can fail to have authority. But it can fail in certain ways only. If this is so, there are features of authority that it must have. If so, we can learn from the doctrine of authority something about the nature of law.
Note that nothing in this suggestion assumes that all the necessary features of the law are necessary features of every practical authority. The law may well have others. Indeed, I am already assuming that the law does have others, since it is not necessary that every person who has legitimate authority claims to have it, as the law necessarily does. All that we are trying to establish is whether some necessary characteristics of law are necessary characteristics of authority, which the law must have if it is to be capable of claiming authority.
I suggested above that only those who can have authority can sincerely claim to have it, and that therefore the law must be capable of having authority. This claim is so vague that, even if correct, it cannot be more than a gesture towards an argument. What might that be? Consider the fact that the law is a normative system. If it were not, it would be incapable of having practical authority. If the law were a set of propositions about the behaviour of volcanoes, for example, then it would not only lack authority over action, it would be incapable of having such authority. The statement that a normative system is authoritatively binding on us may be false, but at least it makes sense, whereas the claim that a set of propositions about volcanoes authoritatively determines what we ought to do does not even make sense.
But cannot one claim that a person X has authority which it would make no sense to attribute to X? The claim makes sense because we understand what is claimed, even while we know that it is not merely false but is necessarily, or conceptually, false. For example, what cannot communicate with people cannot have authority over them. Trees cannot have authority over people. But someone whose awareness of what trees are is incomplete, a young child, for example, can claim that they do have authority. He is simply wrong. Similarly, even if he is aware of the nature of trees, he may make an insincere claim to that effect. Perhaps he is trying to deceive a newly arrived Martian sociologist. Notice, however, that one cannot sincerely claim that someone who is conceptually incapable of having authority has authority if one understands the nature of one’s claim and of the person of whom it is made. If I say that trees have authority over people, you will know that either my grasp of the concepts of authority or of trees is deficient or that I am trying to deceive (or, of course, that I am not really stating that trees have authority but merely pretending to do so, or that I am play-acting, etc.).
That is enough to show that since the law claims to have authority it is capable of having it. Since the claim is made by legal officials wherever a legal system is in force, the possibility that it is normally insincere or based on a conceptual mistake is ruled out. It may, of course, be sometimes insincere or based on conceptual mistakes. But at the very least in the normal case the fact that the law claims authority for itself shows that it is capable of having authority.
Why cannot legal officials and institutions be conceptually confused? One answer is that while they can be occasionally they cannot be systematically confused. For given the centrality of legal institutions in our structures of authority, their claims and conceptions are formed by and contribute to our concept of authority. It is what it is in part as a result of the claims and conceptions of legal institutions. This answer applies where the legal institutions themselves employ the concept of authority. But there may be law in societies which do not have our concept of authority. We say of their legal institutions that they claim authority because they claim to impose duties, confer rights, etc. Not having the concept they cannot be confused about it, though we can be confused in attributing the claim of authority to them.
The argument of the last four paragraphs has established, first, that one can fail to have authority because one is incapable of possessing authority (though even those capable of having authority may fail to have it), second, that since the law claims authority it is capable of having authority. There are two kinds of reason for not having authority. One is that the moral or normative conditions for one’s directives being authoritative are absent. Typically, this will be either because the normal justification, explained above, is unavailable or because, though available, it is insufficient to outweigh the conflicting reasons which obtain in this particular case. The second kind of reason for not having authority is that one lacks some of the other, non-moral or non-normative, prerequisites of authority, for example, that one cannot communicate with others.
It is natural to hold that the non-moral, non-normative conditions for having authority are also the conditions of the ability to have authority. A person’s authority may be denied on the ground that he is morally incompetent or wicked. But such facts do not show that he is incapable of having authority in the way that trees are incapable of having authority. Nazi rules may not be authoritatively binding, but they are the sort of thing that can be authoritatively binding, whereas statements about volcanoes cannot. Most arguments about the authority of governments and other institutions revolve around their moral claim to the obedience of their subjects. The existence of the non-moral qualifications is taken for granted. The argument does not start except regarding persons and institutions who meet those other conditions. That is why they are thought of as the conditions which establish capacity to possess authority.
If this view is correct then, since the law necessarily claims authority, and therefore typically has the capacity to be authoritative, it follows that it typically has all the non-moral, or non-normative, attributes of authority. The remainder of my argument, however, does not depend on this strong conclusion. We will concentrate on two features which must be possessed by anything capable of being authoritatively binding. These two features will then be used to support the sources thesis.
It is convenient to concentrate attention on instructions or directives. The terms are used in a wide sense which can cover propositions, norms, rules, standards, principles, doctrines, and the like. In that sense the law is a system of directives, and it is authoritative if and only if its directives are authoritatively binding. Likewise, whoever issues the directives has authority if and only if his directives are authoritatively binding because he makes them, that is (1) they are authoritative, and (2) part of the reason is that he made them.
The two features are as follows. First, a directive can be authoritatively binding only if it is, or is at least presented as, someone’s view of how its subjects ought to behave. Second, it must be possible to identify the directive as being issued by the alleged authority without relying on reasons or considerations on which the directive purports to adjudicate.
The first feature reflects the mediating role of authority. It is there to act on reasons which apply to us anyway, because we will more closely conform to those reasons if we do our best to follow the directives of the authority than if we try to act on those reasons directly. Hence, though the alleged authoritative instruction may be wrongly conceived and misguided, it must represent the judgment of the alleged authority on the reasons which apply to its subjects, or at least it must be presented as the authority’s judgment. Otherwise it cannot be an authoritative instruction. It fails not because it is a bad instruction, but because it is not an instruction of the right kind. It may be an instruction given for some other occasion, or in jest, or an order or threat of a gangster who cares for and considers only his own good. Strictly speaking, to be capable of being authoritative a directive or a rule has actually to express its author’s view on what its subjects should do. But given that this element is one where pretence and deceit are so easy, there is little surprise that appearances are all one can go by here, and the concept of de facto authority, as well as all others which presuppose capacity to have authority, are based on them. If the rule is presented as expressing a judgment on what its subjects should do, it is capable of being authoritative.
The second feature too is closely tied to the mediating role of authority. Suppose that an arbitrator, asked to decide what is fair in a situation, has given a correct decision. That is, suppose there is only one fair outcome, and it was picked out by the arbitrator. Suppose that the parties to the dispute are told only that about his decision, i.e. that he gave the only correct decision. They will feel that they know little more of what the decision is than they did before. They were given a uniquely identifying description of the decision and yet it is an entirely unhelpful description. If they could agree on what was fair they would not have needed the arbitrator in the first place. A decision is serviceable only if it can be identified by means other than the considerations the weight and outcome of which it was meant to settle.
This applies to all decisions, as much to those that a person takes for himself as to those taken for him by others. If I decide what would be the best life insurance to buy, it is no good trying to remind me of my decision by saying that I decided to buy the policy which it is best to buy. It means that I have to decide again in order to know what I decided before, so the earlier decision might just as well never have happened. The same applies to the subjects of any authority. They can benefit by its decisions only if they can establish their existence and content in ways which do not depend on raising the very same issues which the authority is there to settle.
Can it not be objected that my argument presupposes that people know the normal justification thesis, and the others which go with it? To be sure such an assumption would not be justified. Nor is it made. All I am assuming is that the service conception of authority is sound, i.e. that it correctly represents our concept of authority. It is not assumed that people believe that it does.
It is worth noting that a set of conditions rather like the pair I have argued for can be derived from a much weaker assumption than that of the service conception of authority explained above. I will call this the alternative argument. Its premise is nothing more than the claim that it is part of
our notion of legitimate authority that authorities should act for reasons, and that their legitimacy depends on a degree of success in doing so. Even those who reject the service conception of authority will accept conditions similar to the two I have argued for if they accept that legitimacy depends on (a degree of) success in acting for reasons. It is obvious that this weak assumption is enough to hold that only what is presented as someone’s view can be an authoritative directive.
Instead of the second condition, that directives be capable of independent identification (i.e. independent of the reasons they should be based upon), two weaker conditions can be established. I will assume that authorities make a difference, i.e. the fact that an authority issued a directive changes the subjects’ reasons. It follows that the existence of reasons for an authority to issue a directive does not by itself, without the directive having actually been issued, lead to this change in the reasons which face the subjects. Therefore, the existence of reasons which establish that a certain directive, if issued, would be the right one to have issued cannot show that such a directive exists and is binding. Its existence and content, in other words, cannot depend exclusively on the reasons for it. The existence and content of every directive depend on the existence of some condition which is itself independent of the reasons for that directive. Moreover, that further condition cannot simply be that that or some other authority issued another directive. Often the existence of one law is a reason for passing another.
But we have just established that the existence of a law cannot depend simply on the existence of reasons for it, on reasons showing that it would be good if people behaved in the way it prescribes, or that it would be good if the law required them to do so. Therefore, the existence of one directive, though it may show that another is desirable or right, cannot by itself establish its existence.
III. THE COHERENCE THESIS
The previous section argued that, even though the law may lack legitimate authority, one can learn quite a lot about it from the fact that it claims legitimate authority. It must be capable of being authoritative. In particular it must be, or be presented as, someone’s view on what the subjects ought to do, and it must be identifiable by means which are independent of the considerations the authority should decide upon.
It is interesting to note that legal sources meet both conditions. To anticipate and simplify, the three common sources of law, legislation, judicial decisions, and custom, are capable of being sources of authoritative directives. They meet the non-moral conditions implied in the service conception of authority. Legislation can be arbitrary, and it can fail to comply with the
dependence thesis in many ways. But it expresses, or is at least presented as expressing, the legislator’s judgment of what the subjects are to do in the situations to which the legislation applies. Therefore, it can be the product of the legislator’s judgment on the reasons which apply to his subjects. The same is true of judicial decisions. Judges may be bribed. They may act arbitrarily. But a judicial decision expresses a judgment on the legal consequences of the behaviour of the litigants. It is presented as a judgment on the way the parties, and others in the same circumstances, ought to behave.
Similarly with custom. It is not normally generated by people intending to make law. But it can hardly avoid reflecting the judgment of the bulk of the population on how people in the relevant circumstances should act. Source-based law can conform to the dependence thesis. It therefore conforms to the first of our conditions which are entailed by the fact that the law claims authority.
Legal sources also conform with the second of our two conditions, since they are capable of being identified in ways which do not rely on the considerations they are meant to decide upon. An income-tax statute is meant to decide what is the fair contribution to public funds to be borne out of income. To establish the content of the statute, all one need do is to establish that the enactment took place, and what it says. To do this one needs little more than knowledge of English (including technical legal English), and of the events which took place in Parliament on a few occasions. One need not come to any view on the fair contribution to public funds.
As was noted above, all three rivals, the coherence, the incorporation, and the sources theses, are united in attributing a special significance to source-based law. The preceding simplified account illustrates the way central features of the law can mesh in with and acquire a special significance from the service conception of authority and the two necessary features of law which it entails. It does not follow that these are the reasons normally given for the centrality of source-based law. The coherence thesis represents an account which is at the very least indifferent to the considerations outlined above. I have identified it as the view that the law consists of source-based law together with the morally best justification of the source-based law. This may look an unholy mixture of disparate elements. But it need not be. In the hands of its best advocate, R. M. Dworkin, it embodies a powerful and intriguing conception of the law.
Dworkin’s conception of the law, expressed in various articles over many years, is not easy to ascertain. Some points of detail which are nevertheless essential to its interpretation remain elusive. Many readers of his celebrated ‘Hard Cases’ (1975) took it to express a view of law which can be summarized in the following way:
“To establish the content of the law of a certain country one first finds out what are the legal sources valid in that country and then one considers one master question: Assuming that all the laws ever made by these sources which are still in force, were made by one person, on one occasion, in conformity with a complete and consistent political morality (i.e. that part of a moral theory which deals with the actions of political institutions), what is that morality?”
The answer to the master question and all that it entails, in combination with other true premises is, according to this reading of Dworkin, the law. The master question may fail to produce an answer for two opposite reasons, and Dworkin complicates his account to deal with both. First, there may be conflicts within a legal system which stop it from conforming with any consistent political morality. To meet this point Dworkin allows the answer to be a political morality with which all but a small number of conflicting laws conform. Second, there may be more than one political morality meeting the condition of the master question (especially once the allowance made by the first complication is taken into account). In that case Dworkin instructs that the law is that political morality which is, morally, the better theory. That is, the one which approximates more closely to ideal, correct or true morality.
In his ‘Reply to Seven Critics’ (1977) Dworkin returns to the question of the nature of law. He gives what he calls too crude an answer, which can be encapsulated in a different master question:
“To establish the content of the law of a certain country one first finds out what are the legal sources valid in that country and then one considers one master question: What is the least change one has to allow in the correct, sound political morality in order to generate a possibly less than perfect moral theory which explains much of the legal history of that country on the assumption that it is the product of one political morality?”
That (possibly less than perfect) political morality is the law. Both master questions depend on an interaction of two dimensions. One is conformity with ideal morality, the other ability to explain the legal history of the country. The new master question differs from its predecessor in two important respects. First, its fit condition concerns all the legal history of the country. Acts of Parliament enacted in the thirteenth century and repealed fifty years later are also in the picture. They also count when measuring the degree to which a political morality fits the facts. The earlier test referred only to law still in force. Only fitting in with them counted. Second, the new master question gives less weight to the condition of fit. It is no longer the case that the law consists of the political morality which fits the facts best, with ideal morality coming in just as a tie-breaker. Fit (a certain unspecified level of it) now provides only a sort of flexible threshold test. Among the (presumably
numerous) political moralities which pass it, the one which is closest to correct morality is the law.
I hesitate to attribute either view to Dworkin. The articles are not clear enough on some of the pertinent points, and his thought may have developed in a somewhat new direction since these articles were written. Luckily, the precise formulation of the master question does not matter to our purpose.
Enough of Dworkin’s thought is clear to show that its moving ideas are two. First, that judges’ decisions, all their decisions, are based on considerations of political morality. This is readily admitted regarding cases in which source-based laws are indeterminate or where they conflict. Dworkin insists that the same is true of ordinary cases involving, say, simple statutory interpretation or indeed the decision to apply statute at all. This does not mean that every time judges apply statutes they consider and re-endorse their faith in representative democracy, or in some other doctrine of political morality from which it follows that they ought to apply these statutes. It merely means that they present themselves as believing that there is such a doctrine. Their decisions are moral decisions in expressing a moral position. A conscientious judge actually believes in the existence of a valid doctrine, a political morality, which supports his action.
If I interpret Dworkin’s first leading idea correctly and it is as stated above, then I fully share it. I am not so confident about his second leading idea. It is that judges owe a duty, which he sometimes calls a duty of professional responsibility, which requires them to respect and extend the political morality of their country. Roughly speaking, Dworkin thinks that morality (i.e. correct or ideal morality) requires judges to apply the source-based legal rules of their country, and, where these conflict or are indeterminate, to decide cases by those standards of political morality which inform the source-based law, those which make sense if it is an expression of a coherent moral outlook.
Notice how far-reaching this second idea is. Many believe that the law of their country, though not perfect, ought to be respected. It provides reasonable constitutional means for its own development. Where reform is called for, it should be accomplished by legal means. While the law is in force it should be respected. For most, this belief depends to a large degree on the content of the law. They will deny that the laws of Nazi Germany deserved to be respected. Dworkin’s obligation of professional responsibility is different. It applies to every legal system simply because it is a legal system, regardless of its content. Furthermore, it is an obligation to obey not merely the letter of the law but its spirit as well. Judges are called upon to decide cases where source-based law is indeterminate, or includes unresolved conflicts, in accordance with the prevailing spirit behind the bulk
of the law. That would require a South African judge to use his power to extend apartheid.
Problems such as these led to the weakening of the element of fit in the second formulation of the master question. But then they also weaken the duty of professional responsibility. There is an attractive simplicity in holding that morality requires any person who joins an institution to respect both its letter and its spirit. If this simple doctrine does not apply to judges in this form, if their respect for the institution, the law, is weakened from its pure form in the first master question to that of the second, then one loses the theoretical motivation for such a duty, at least if it means more than saying that one ought to respect the legal institutions of a particular country because their structure and actions merit such respect, or to the extent that they do.
These are some of the doubts that Dworkin’s second leading idea raises. My formulations of the two leading ideas (and of the doubts concerning the second) are mere sketches. They are meant to outline an approach to law which gives source-based law a special role in the account of law on grounds other than those explained in the previous section. It is easy to see that Dworkin’s conception of law contradicts the two necessary features of law argued for above. First, according to him there can be laws which do not express anyone’s judgment on what their subjects ought to do, nor are they presented as expressing such a judgment. The law includes the best justification of source-based law, to use again the brief description given in the coherence thesis of which Dworkin’s master questions are different interpretations. The best justification, or some aspects of it, may never have been thought of, let alone endorsed by anyone. Dworkin draws our attention to this fact by saying that it requires a Hercules to work out what the law is. Nor does Dworkin’s best justification of the law consist of the implied consequences of the political morality which actually motivated the activities of legal institutions. He is aware of the fact that many different and incompatible moral conceptions influenced different governments and their officials over the centuries. His best justification may well be one which was never endorsed, not even in its fundamental precepts, by anyone in government. Much of the law of any country may, according to Dworkin, be unknown. Yet it is already legally binding, waiting there to be discovered. Hence it neither is nor is presented as being anyone’s judgment on what the law’s subjects ought to do.
Second, the identification of much of the law depends, according to Dworkin’s analysis, on considerations which are the very same considerations which the law is there to settle. This aspect of his theory is enhanced by his second master question, but it makes a modest appearance in the first as
well. Establishing what the law is involves judgment on what it ought to be. Imagine a tax problem on which source-based law is indeterminate. Some people say that in such a case there is no law on the issue. The court ought to ask what the law ought to be and to decide accordingly. If it is a higher court whose decision is a binding precedent, it will have thereby made a new law. Dworkin, on the other hand, says that there is already law on the matter. It consists in the best justification of the source-based law. So in order to decide what the tax liability is in law, the court has to go into the issue of what a fair tax law would be and what is the least change in it which will make source-based law conform to it. This violates the second feature of the law argued for above.
It is important to realize that the disagreement I am pursuing is not about how judges should decide cases. In commenting on Dworkin’s second leading idea I expressed doubts regarding his view on that. But they are entirely irrelevant here. So let me assume that Dworkin’s duty of professional responsibility is valid and his advice to judges on how to decide cases is sound. We still have a disagreement regarding what judges do when they follow his advice. We assume that they follow right morality, but do they also follow the law or do they make law? My disagreement with Dworkin here is that, in saying that they follow pre-existing law, he makes the identification of a tax law, for example, depend on settling what a morally just tax law would be, i.e. on the very considerations which a tax law is supposed to have authoritatively settled.
For similar reasons Dworkin’s theory violates the conditions of the alternative argument, the argument based on nothing more than the very weak assumption that authorities ought to act for reasons and that the validity of authoritative directives depends on some degree of success in doing so. This assumption leads to the same first condition, i.e. that the law must be presented as the law-maker’s view on right reasons. As we have just seen, Dworkin’s argument violates this condition. He also violates the other condition established by the alternative argument, that the validity of a law cannot derive entirely from its desirability in light of the existence of other laws. Dworkin’s theory claims that at least some of the rules which are desirable or right in view of the existence of source-based law are already legally binding.
Dworkin’s theory, one must conclude, is inconsistent with the authoritative nature of law. That is, it does not allow for the fact that the law necessarily claims authority and that it therefore must be capable of possessing legitimate authority. To do so it must occupy, as all authority does, a mediating role between the precepts of morality and their application by people in their behaviour. It is this mediating role of authority which is denied to the law by Dworkin’s conception of it.
IV. THE INCORPORATION THESIS
The problem we detected with the coherence thesis was that, though it assigns source-based law a special role in its account of law, it fails to see the special connection between source-based law and the law’s claim to authority, and is ultimately inconsistent with the latter. It severs the essential link between law and the views on right action presented to their subjects by those who claim the right to rule them. In these respects, the incorporation thesis seems to have the advantage. It regards as law source-based law and those standards recognized as binding by source-based law. The approval of those who claim a right to rule is a prerequisite for a rule being a rule of law. Thus the law’s claim to authority appears to be consistent with the incorporation thesis.
I should hasten to add that many of the supporters of the incorporation thesis do not resort to the above argument in its defence. Nor do they interpret the centrality of source-based law to their conception of law in that way. They regard it as supported by and necessary for some version of a thesis about the separability of law and morals. Jules Coleman, for example, is anxious to deny that there is a necessary connection between law and morality. He mistakenly identifies this thesis with another: “The separability thesis is the claim that there exists at least one conceivable rule of recognition and therefore one possible legal system that does not specify truth as a moral principle as a truth condition for any proposition of law”. If this were a correct rendering of the separability thesis stated by Coleman in the first quotation above, the incorporation thesis entails separability. But Coleman’s rendering of his own separability thesis is mistaken. A necessary connection between law and morality does not require that truth as a moral principle be a condition of legal validity. All it requires is that the social features which identify something as a legal system entail that it possess moral value. For example, assume that the maintenance of orderly social relations is itself morally valuable. Assume further that a legal system can be the law in force in a society only if it succeeds in maintaining orderly social relations. A necessary connection between law and morality would then have been established, without the legal validity of any rule being made, by the rule of recognition, to depend on the truth of any moral proposition.
Supporters of the incorporation thesis may admit that, while it is not sufficient to establish the separability thesis, at least it is necessary for it, and is therefore supported by it. The separability thesis is, however, implausible. Of course the remarks about orderly social relations do not disprove it. They are much too vague to do that. But it is very likely that there is some necessary connection between law and morality,
that every legal system in force has some moral merit or does some moral good even if it is also the cause of a great deal of moral evil. It is relevant to remember that all major traditions in western political thought, including both the Aristotelian and the Hobbesian traditions, believed in such a connection. If the incorporation thesis seems much more secure than the separability thesis, it is because it seems to be required by the fact that all law comes under the guise of authority, together with the considerations on the nature of authority advanced in the previous sections. The law is the product of human activity because if it were not it could not be an outcome of a judgment based on dependent reasons, that is, it could not provide reasons set by authority.
There may, of course, be other cogent reasons for favouring the incorporation thesis. They will not be explored here. Instead I will argue that the thesis ought to be rejected, and that the support it seems to derive from the argument about the nature of authority is illusory. In fact the incorporation thesis is incompatible with the authoritative nature of law. To explain the point let us turn for a moment to look at theoretical authority.
Suppose that a brilliant mathematician, Andrew, proves that the Goldbach hypothesis, that every even integer greater than two is the sum of two prime numbers, is true if and only if the solution to a certain equation is positive. Neither he nor anyone else knows the solution of the equation. Fifty years later that equation is solved by another mathematician and the truth of the Goldbach hypothesis is established. Clearly we would not say that Andrew proved the hypothesis, even though he made the first major breakthrough and even though the truth of the hypothesis is a logical consequence of his discovery. Or suppose that Betty is an astrophysicist who demonstrates that the big bang theory of the origin of the universe is true if and only if certain equations have a certain resolution. Again, their resolution is not known at the time, and is discovered only later. It seems as clear of Betty as it was of Andrew that she cannot be credited with proving (or disproving) the big bang theory even though the truth (or falsity) of the theory is entailed by her discovery. Now imagine that Alice tells you of Andrew’s discovery, or that Bernard tells you of Betty’s. Alice and Bernard are experts in their respective fields. They give you authoritative advice. But Alice does not advise you to accept the Goldbach hypothesis. She merely advises belief in it if the relevant equation has a positive solution. The fact that the truth of the hypothesis is entailed by her advice is neither here nor there. The same applies to Bernard’s advice based on Betty’s work.
All this is commonplace. Nor is it difficult to understand why one cannot be said to have advised acceptance of a particular proposition simply on the ground that it is entailed by another proposition acceptance of which
one did advise. People do not believe in all that is entailed by their beliefs. Beliefs play a certain role in our lives in supporting other beliefs, in providing premises for our practical deliberations. They colour our emotional and imaginative life. More generally, they are fixed points determining our sense of orientation in the world. Many of the propositions entailed by our beliefs do not play this role in our lives. Therefore they do not count amongst our beliefs. One mark of this is the fact that had people been aware of some of the consequences of their beliefs, rather than embrace them they might have preferred to abandon the beliefs which entail them (or even provisionally to stick by them and refuse their consequences, i.e. embrace inconsistencies until they found a satisfactory way out). This consideration explains why we cannot attribute to people belief in all the logical consequences of their beliefs. It also explains why a person cannot be said to have advised belief in a proposition he does not himself believe in. (Though it is possible to advise others to take the risk and act as if certain propositions are true even if one does not believe in them and equally possible to advise believing in a proposition if it is true.)
Advice shares the mediating role of authoritative directives. It too is an expression of a judgment on the reasons which apply to the addressee of the advice. Because the advice has this mediating role it can include only matters on which the adviser has a view, or presents himself as having one (to cover cases of insincere advice). Since a person does not believe in all the consequences of his beliefs he does not, barring special circumstances, advise others to believe in them either.
The analogy with authority is clear and hardly needs further elaboration. The mediating role of authority implies that the content of an authoritative directive is confined to what the authority which lends the directive its binding force can be said to have held or to have presented itself as holding. It does not extend to what it would have directed, given a chance to do so, nor to all that is entailed by what it has directed. It will by now be clear why the incorporation thesis must be rejected if the law does necessarily claim authority. The main thrust of the incorporation thesis is that all that is derivable from the law (with the help of other true premises) is law. It makes the law include standards which are inconsistent with its mediating role, for they were never endorsed by the law-making institutions on whose authority they are supposed to rest. The mistake of the incorporation thesis is to identify being entailed by the source-based law with being endorsed by the sources of law.
Law is a complex social institution, and some of its complexities help mask the incorporation thesis’s mistake. When thinking of a piece of advice or of an authoritative directive we tend to think of them as having one
author. In the law, as in other hierarchical institutions, matters are complicated in two respects. First, authoritative directives are typically issued by institutions following an elaborate process of drafting and evaluation. Second, they are often amended, modified, and their content amplified and changed by a succession of subsequent legislative, administrative, and judicial actions. A convention of reference sometimes exists which allows one to refer to a statute, or to the original judicial decision, when citing a legal rule, even though they are no more than the starting-point in the development of the rule, which is in a very real sense the product of the activities of several bodies over a period of time.
These complications mean, of course, that the rule as it is now may include aspects which cannot be attributed to its original creator. They are part of the rule because they are attributable to the author of a later intervention. For example, typically successive judicial interpretations change or add to the meaning of statutes. Likewise, though we attribute beliefs and intentions to institutions and corporations on the basis of the beliefs and intentions of their officials, the attributing functions may sometimes sanction holding a corporate body to have had a belief or an intention which none of its officials had. This is not the place to inquire into the rules of attributions invoked when we talk of the intentions or beliefs of states, governments, corporations, trade unions, universities, etc. All that is required for our present purposes is that attribution is made in a restrictive way which does not allow one to attribute to such a body all the logical consequences of its beliefs and intentions. Restrictions to all the foreseen or foreseeable consequences are the ones most common in the law. This is enough to show that the incorporation thesis receives no sustenance from the institutional complexities of the law, since it insists that the law includes all the logical consequences of source-based law.
In disputing the incorporation thesis I am not denying two other points which are asserted by D. Lyons in the most thoroughgoing defence of this position. First, I agree with him that judges who work out what is required by, for example, the due process provision of the American constitution are engaged in interpreting the constitution. Lyons is mistaken, however, in thinking that it follows from that that they are merely applying the law as it is (at least if they succeed in discovering the right answer). Judicial interpretation can be as creative as a Glenn Gould interpretation of a Beethoven piano sonata. It is a mistake to confuse interpretation with paraphrase or with any other mere rendering of what the interpreted object is in any case. Second, Lyons is quite right to think that there is more to the law than is explicitly stated in the authoritative texts. Authorities can and do direct and guide by implication. It does not follow, however, that they imply all that is entailed
by what they say, let alone all that is entailed by it with the addition of true premises. The limits of the justifiable imputation of directives are no wider, I have argued above, than the limits of the imputation of belief.
V. THE SOURCES THESIS
The last section established that not all the moral consequences of a legal rule are part of the law. But it leaves open the possibility that some are: that some moral consequences of a legal rule can be attributed to the author of that legal rule as representing its intention or meaning and thus being part of the law. I will not present a refutation of this possibility. The purpose of the present section is more modest. It argues that the authoritative nature of law gives a reason to prefer the sources thesis. It leaves open the possibility that additional considerations lead to a complex view of the law lying between the incorporation and the sources theses.
Let us distinguish between what source-based law states explicitly and what it establishes by implication. If a statute in country A says that income earned abroad by a citizen of A is liable to income tax in A, then it only implicitly establishes that I am liable to such tax. For my liability is not stated by the statute but is inferred from it (and some other premisses).
Similarly, if earnings abroad are taxed at a different rate from earnings at home, the fact that the proceeds of export sales are subject to the home rate is implied rather than stated. It is inferred from this statute and other legal rules on the location of various transactions. By the same reasoning it is also established that not all the factual consequences of a rule of law are part of the law.
The two examples differ in that the statement that I am liable to tax at a certain rate is an applied legal statement depending for its truth on both law and fact. The statement that export earnings are taxed at a certain rate is a pure legal statement, depending for its truth on law only (i.e. on acts of legislation and other law-making facts). The sources thesis as stated at the beginning of this chapter can bear a narrow or a wide interpretation. The narrow thesis concerns the truth conditions of pure legal statements only.
Pure legal statements are those which state the content of the law, i.e. of legal rules, principles, doctrines, etc. The wide thesis concerns the truth conditions of all legal statements, including applied ones. It claims that the truth or falsity of legal statements depends on social facts which can be established without resort to moral argument.
The fact that the law claims authority supports the narrow sources thesis because it leads to a conception of law as playing a mediating role between ultimate reasons and people’s decisions and actions. To play this role the law must be, or at least be presented as being, an expression of the judgment of some people or of some institutions on the merits of the actions it requires. Hence, the identification of a rule as a rule of law consists in attributing it to the relevant person or institution as representing their decisions and expressing their judgments. Such attribution need not be on the ground that this is what the person or institution explicitly said. It may be based on an implication. But the attribution must establish that the view expressed in the alleged statement is the view of the relevant legal institution. Such attributions can only be based on factual considerations. Moral argument can establish what legal institutions should have said or should have held but not what they did say or hold.
We have already traced one source of resistance to this conclusion to the assumption that if attribution is on factual rather than moral grounds then it must be a non-controversial, easily established matter which requires at most the application of a procedure of reasoning having the character of an algorithm to some non-controversial simple facts. The assumption that only moral questions can resist easy agreement or solution by algorithmic procedures has nothing to recommend it, and I in no way share it. The case for saying that attribution of belief and intention to their author is based on factual criteria only does not rest on the false claim that such attributions are straightforward and non-controversial. A second source of resistance, also noted above, derives from overlooking the greater complexity involved in attributing views or intentions to complex institutions whose activities spread over long stretches of time, and the tendency to think that nothing more is involved in these cases than is involved in attributing beliefs or intentions to individuals.
But there is a third difficulty with the view I am advocating which must be addressed now. One may ask: if an authority explicitly prohibited e.g. unfair discrimination, is not the fact that certain cases display unfair discrimination evidence enough for attributing their prohibition to the authority? Two considerations are usually brought to support the view that these reasons are sufficient to determine the content of the law on such matters. I shall try to rebut this view by showing that these supporting considerations are mistaken. First is the claim that the only alternative view holds that the law is determined only regarding cases which the law-maker actually contemplated and had in mind when making the law. This, let it be conceded right away, is not merely false but very likely an incoherent view. Second (and it does not matter that this point may be incompatible with the first), it is sometimes said that the only alternative view assumes that the law-makers intend their particular view of what is unfair discrimination to become law even if they are wrong.
Suppose that the fathers of the constitution outlawed cruel punishment. Suppose further that it is beyond doubt that they thought that flogging is not cruel, and finally, that in fact (or in morals) it is cruel. Are we to assume that the law-maker’s intention was to exclude flogging from the scope of the constitutional prohibition of cruel punishments? Would not the correct view be that in making cruelty a bench-mark of legality the law-makers intended their own judgment to be subject to that criterion, so that, though believing flogging not to be cruel, they expressed the view that if it is cruel it is unlawful?
Both points have a deja vu aspect. They depend on the unimaginative assumption that either the law is determined by the thoughts actually entertained by the law-maker when making the law or it must include all the implications of those thoughts. Since it must be granted, and I do grant, that it is not the first, the second is supposed to be the case. This was the structure of Lyons’s argument regarding the explicit content thesis. As he saw it, either the law is confined to its explicit content or it contains all its implications. Since Hart rejects the second alternative he was saddled by Lyons with the first. Since Lyons sees, as everyone must, that the first is wrong, he embraces the second. The two considerations explained above are the psychological variants of Lyons’s linguistic dichotomy. They contrast not actual language with its implications but actual thoughts with their implications.
The answer to both arguments is the same: the dichotomy is a false one. There are other possibilities. Sometimes we know of a person that, for example, if only he realized that certain forms of psychological abuse are cruel, he would not be so indifferent to them. At others we know that if he were convinced that they are cruel he would find some other way to justify them. He would come to believe that cruelty is sometimes justifiable. In attributing such views to people, one does not endorse either of the two unacceptable views mentioned above. Naturally it is often impossible to impute any such view to a person. The question whether he would have maintained his intention to prohibit cruel punishment had he known that capital punishment is cruel (assuming for a moment that it is) may admit of no answer.
Furthermore, and this is often overlooked, the sources thesis by itself does not dictate any one rule of interpretation. It is compatible with several. It is compatible, for example, with saying that, if it is known that the law-makers prohibited cruel punishment only because they regarded flogging as not cruel, then that law does not prohibit flogging. It is also compatible with the rule that the law is confined in such cases to the intention expressed by the law-maker. This is to prohibit cruel punishment. Since, by this rule of interpretation, no more specific intention is attributable to the law-maker, the law gives discretion to the courts to forbid punishments they consider cruel (this reflects the lack of specificity in the law) and instructs them to forbid those which are cruel. Which of these, or of a number of alternative interpretations, is the right one varies from one legal system to another. It is a matter of their own rules of interpretation. One possibility is that they have none on this issue, that the question is unsettled in some legal systems. The only point which is essential to the sources thesis is that the character of the rules of interpretation prevailing in any legal system, i.e. the character of the rules for imputing intentions and directives to the legal authorities, is a matter of fact and not a moral issue. It is a matter of fact because it has to sustain conclusions of the kind: “That is in fact the view held by these institutions on the moral issues in question.”
Two further points have to be made to avoid misunderstanding. First, none of the above bears on what judges should do, how they should decide cases. The issue addressed is that of the nature and limits of law. If the argument here advanced is sound, it follows that the function of courts to apply and enforce the law coexists with others. One is authoritatively to settle disputes, whether or not their solution is determined by law. Another additional function the courts have is to supervise the working of the law and revise it interstitially when the need arises. In some legal systems they are assigned additional roles which may be of great importance. For example, the courts may be made custodians of freedom of expression, a supervisory body in charge both of laying down standards for the protection of free expression and adjudicating in disputes arising out of their application.
Second, it may be objected that relying on the mediating role of authority becomes an empty phrase when it comes to legal rules which have evolved through the activities of many hands over a long time. The fact that we implicitly or explicitly endorse rules of attribution which sanction talk of the intention of the law where that intention was never had by any one person does not support the argument from the mediating role of the law. It merely shows it to be a formalistic, hollow shell. This objection, like some of the earlier ones, seems to betray impatience with the complexities, and shortcomings, of the world. Every attribution of an intention to the law is based on an attribution of a real intention to a real person in authority or exerting influence over authority. That intention may well relate to a small aspect or modification of the rule. If the intention of the law regarding the rule as a whole differs from that of any single individual, this is because it is a function of the intentions of many. Sometimes, but by no means always, this leads to reprehensible results. Be that as it may, the view propounded here will in such circumstances highlight the indirect and complex way in which the law has played its mediating role.
All the arguments so far concern the narrow sources thesis only. Nothing was said about its application to applied legal statements. I tend to feel that it applies to them as well, since they are legal statements whose truth value depends on contingent facts as well as on law. If one assumes that contingent facts cannot be moral facts, then the sources thesis applies here as well. That is, what is required is the assumption that what makes it contingently true that a person acted fairly on a particular occasion is not the standard of fairness, which is not contingent, but the ‘brute fact’ that he performed a certain action describable in value-neutral ways. If such an assumption is sustainable in all cases, then the sources thesis holds regarding applied legal statements as well.
The considerations adumbrated above dispel some of the misunderstandings which surround the sources thesis. First, it does not commit one to the view that all law is explicit law. Much that is not explicitly stated in legal sources is nevertheless legally binding. Second, the sources thesis does not rest on an assumption that law cannot be controversial. Nor does it entail that conclusion. Its claim that the existence and content of the law is a matter of social fact which can be established without resort to moral argument does not presuppose nor does it entail the false proposition that all factual matters are non-controversial, nor the equally false view that all moral propositions are controversial. The sources thesis is based on the mediating role of the law. It is true that the law fails in that role if it is not, in general, easier to establish and less controversial than the underlying considerations it reflects. But this generalization is exaggerated and distorted when it turns into the universal, conceptual dogmas of the explicit content or the non-controversiality theses.
The sources thesis leads to the conclusion that courts often exercise discretion and participate in the law-making process. They do so when their decisions are binding on future courts (even where the decisions can be modified or reversed under restrictive conditions) and where their decisions do not merely reflect previous authoritative rulings. Saying this does not mean, however, that courts in exercising their discretion either do or should act on the basis of their personal views on how the world should be ideally run. That would be sheer folly. Naturally judges act on their personal views, otherwise they would be insincere. (Though the fact that these are their views is not their reason for relying on them. Their reasons are that those propositions are true or sound, for whatever reason they find them to be so.) But judges are not allowed to forget that they are not dictators who can fashion the world to their own blueprint of the ideal society. They must bear in mind that their decisions will take effect in society as it is, and the moral and economic reasons they resort to should establish which is the best or the just decision given things as they are rather than as they would be in an ideal world.
Finally, the sources thesis does not presuppose a non-naturalist ethical position. Even if a certain social fact entails certain moral consequences it can still be a source of law. It is a source of law as the social fact it is, and not as a source of moral rights and obligations. It is a source of law under its naturalistic rather than under its moral description.
VI. THE ROLE OF VALUES IN LEGAL THEORY
According to R. M. Dworkin, legal positivists endorse the model of rules because of a political theory about the function of law which they think is to ‘provide a settled public and dependable set of standards for private and official conduct, standards whose force cannot be called into question by some individual official’s conception of policy or morality’. The argument of this article shows that something like Dworkin’s description applies to my argument. But notice that Dworkin’s remark suggests that legal positivists endorse the non-controversiality and the explicit contents theses, which I do not share. Besides, it is misleading to regard the thesis and argument explained here as moral ones. The argument is indeed evaluative, but in the sense that any good theory of society is based on evaluative considerations in that its success is in highlighting important social structures and processes, and every judgment of importance is evaluative.
Given the centrality of that feature, it is justified to interpret the action of law-makers who are in a hurry to get back home, who vote without paying attention to what they are voting for, in the way described. Two features stand out. First, while this is an evaluative judgment, it is not a judgment of the moral merit of anything. Second, its application depends on the fact that the perception of importance of the feature focused upon is shared in our society, that it is shared, among others, by the law-makers themselves.
The concept of law is part of our culture and of our cultural traditions. It plays a role in the way in which ordinary people as well as the legal profession understand their own and other people’s actions. It is part of the way they ‘conceptualize’ social reality. But the culture and tradition of which the concept is a part provide it with neither sharply defined contours nor a clearly identifiable focus. Various, sometimes conflicting, ideas are displayed in them. It falls to legal theory to pick on those which are central and significant to the way the concept plays its role in people’s understanding of society, to elaborate and explain them.
Legal theory contributes in this respect to an improved understanding of society. But it would be wrong to conclude, as D. Lyons has done, that one judges the success of an analysis of the concept of law by its theoretical sociological fruitfulness. To do so is to miss the point that, unlike concepts like ‘mass’ or ‘election’, ‘the law’ is a concept used by people to understand themselves. We are not free to pick on any fruitful concepts. It is a major task of legal theory to advance our understanding of society by helping us understand how people understand themselves.
To do so it does engage in evaluative judgment, for such judgment is inescapable in trying to sort out what is central and significant in the common understanding of the concept of law. It was my claim in this chapter that one such feature is the law’s claim to authority and the mediating role it carries with it. The significance of this feature is both in its distinctive character as a method of social organization and in its distinctive moral aspect, which brings special considerations to bear on the determination of a correct moral attitude to authoritative institutions. This is a point missed both by those who regard the law as a gunman situation writ large and by those who, in pointing to the close connection between law and morality, assume a linkage inconsistent with it.
Let me exemplify the difference between my conception of the role of evaluation in explaining the nature of law and that of Dworkin by considering one central objection to the sources thesis. Some people object not to the attribution of intention to legislators or interpreters of the law in itself, but to the presupposition of the sources thesis that whenever one is faced with valid legislation one can also find an intention behind it. Is it always the case? Do we not know that sometimes members of Parliament vote knowing nothing and intending only to get home as early as possible? An adequate answer to this and related questions has to await a comprehensive treatment of interpretation and the role of intention within its context. A brief indication of the direction in which an answer is to be sought will have to do.
Let us start by considering the view which denies the importance of the law-maker’s intention to our understanding of the law. To the question: ‘why should one assign any importance to a particular text as legally binding?’ that view will reply: ‘because it was endorsed by the proper constitutional procedure.’ To the question ‘how should the text be interpreted other than by reference to the intentions of its author or of those whose action maintains its force as law?’ the answer would refer to existing conventions of interpretation which need not refer to anyone’s intention. There is nothing wrong with these replies. They merely raise further questions. Why does the endorsement of a certain text in accord with those procedures endow it with a special status? Is it some form of magic or fetishism?
That procedure is a way of endowing a text with legal force because it is a procedure designed to allow those in authority to express a view on how people should behave, in a way which will make it binding. That it is such a procedure, and not just any arbitrarily chosen ritual, is part of what makes it into a legal procedure. The law-making procedure includes conventions of interpretation. A change in the conventions of interpretation of a legal system changes its law. Consider the simple example of a change from an understanding of “person” to include only people to a reading of it which covers foetuses as well. Law-makers need not intend anything other than that the bill should become law with the meaning given it by the conventions of interpretation of their country. To deny them that intention is to deny that they know what they are doing when they make law.
How is this sketchy reply to the objection to be defended? It turns on evaluative conceptions about what is significant and important about central social institutions, i.e., legal institutions. But in claiming that these features are important one is not commending them as good. Their importance can be agreed upon by anarchists who reject any possibility of legitimacy for such institutions. All that is claimed is the centrality to our social experience of institutions which express what they claim to be the collective and binding judgement of their society as to how people should behave.
When judges decide cases, and sometimes make law thereby, what guides their decisions? In many cases there are clear rules for the judges to follow, but what happens when the rules run out? This is a question that has given legal writers a lot to think about, and one of the major jurisprudential debates of the late twentieth century, the Hart-Dworkin debate, concerned the extent to which a legal system can be regarded as a system of rules and how far account must be taken of other non-rule contributions.
Professor HLA Hart in his book The Concept of Law sought to distinguish obedience to a rule from habitual conduct “as a rule” by saying that a rule is essentially normative rather than predictive. The word “rule”, he said, has at least two meanings: it may be used in a descriptive or predictive sense: “as a rule I go to the cinema on Saturdays”, or in a normative sense: “there is a rule against walking on the grass”. He was concerned only with normative rules, though he recognised that some people might obey normative rules as a matter of habit, without consciously thinking of the rule at all.
The idea of obligation depends on the idea of a rule, Hart said. A victim may “be obliged” to hand over his wallet to a mugger if he fears some unpleasant consequence that is likely to occur if he fails to comply, and the effect of which is not trivial in comparison with the effect of complying. We say he is obliged to comply; such a statement refers to his own beliefs and motives and implies that he actually does comply. If he decides to resist the robbery in spite of the robber’s threat (and perhaps suffers the consequence), we certainly do not say that he was obliged to obey the robber’s demand. On the other hand, to say that a person “has an obligation” not to drive faster than 30 mph in a built-up area says nothing about the likelihood or the seriousness of the consequences that might follow non-compliance, nor about whether he actually complies with the obligation or not.
The identification of a normative rule and a corresponding duty or obligation therefore depends on two things, one external and one internal. First, there must be a general habit of conformity with the rule, and this is a matter of observation. This is not to say that there must be total conformity, but no percentage figure can be set down: Hart himself likened it to asking how many hairs a man can have and still be bald. On the other hand, if the “rule” is widely disregarded, it may not really be a rule at all.
Conformity alone is not enough, however: there is general conformity with the eating of turkey on Christmas Day, but this is a descriptive rather than a normative rule. Hart therefore demands a second element, a “critical reflective attitude”, in that members of the society share in criticising those who deviate from the rule (which may involve self-criticism in some cases) and perhaps in making some demand for conformity. Typically, the critical reflective attitude is shown in the way parents encourage their children to conform with the rule, using normative vocabulary involving words such as “must”, “ought” or “should”. For example, the practice that men remove their hats in church is something they “ought to do”, and so can fairly be described as a rule.
The use of language is ambiguous, however, and the position of the detached observer must be considered. A meat-eater may say to a vegetarian “you ought not to eat that sausage” without internally accepting that meat-eating is wrong, but by reference to the rules by which he understands the vegetarian to live. Conversely, the vegetarian may say to his friend “you ought not to eat meat”, putting the proposition forward as a moral philosophy but recognising realistically that it is not a rule. Only when the speaker recognises the existence of a social practice and adopts a critical reflective attitude towards it can it be said that he internally accepts it as a rule.
The acceptance of a rule and the critical reflective attitude towards it are not synonymous with approval of the rule. A person may commonly disapprove of a particular rule (whether legal, as in the requirement to pay a particular tax, or social, as regards appropriate dress for a particular occasion), and even feel morally able to disobey it, while accepting that it is a rule and that those who break it (including perhaps himself) are open to criticism on that account. Those who internally accept the rule are those who use it as a standard by which to measure their own or others’ behaviour, without necessarily believing that it is a good rule.
Rules and principles
According to Hart, law is essentially a system of rules, identified and prioritised by a “rule of recognition”. When the rules run out, he said, the judge has discretion to decide the case. The most cogent criticisms of this view came from Ronald Dworkin, who said that law contains not only rules but also principles: in “hard cases” where the rules do not cover a particular situation, or give an unacceptable answer, the judge must be guided by principles. Such principles are not external to the legal system and used just for guidance, as Hart would claim: rather, they are an integral part of the system. A judge need not follow principles rigidly – if he did they would be rules – but he must take them into account when exercising his discretion. A judge who departs from principles too often will find many of his decisions reversed on appeal, and to that extent would evidently be making wrong decisions. Certainly the idea of a persuasive precedent (which is clearly not a legal rule) is hard to explain in positivist terms, and gives powerful support to Dworkin’s criticism. The principles that guide the judges’ decisions, said Dworkin, are themselves part of the law. Lawyers and even students think it meaningful to say “The House of Lords reached the wrong decision in such-and-such a case”, even though the decision did not directly break any pre-existing rule.
Every legal problem, according to Dworkin, has just one right answer, and the judge’s task is to discover it. The right answer is the one that is “best” both in terms of its fit with the corpus of decided cases and in terms of its content. For example, suppose a chess player annoys his opponent by smiling at him: the rules of chess prohibit causing unreasonable annoyance to an opponent but say nothing expressly about smiling, and the referee has to make a decision as to whether it is allowed or not. The “right” answer depends on whether chess is more a form of psychological warfare or a purely analytic exercise, and the referee will decide which explanation is more consistent with the history and practice of the game.
The judges, said Dworkin, are engaged in an exercise similar to the writing of a chain novel: a TV soap opera is probably an even better analogy. Each writer has some freedom in respect of his episodes, but must ensure that the characters act in a way consistent with their past behaviour. Subject to producing such a fit, the author may develop the story as he wishes, but some developments may be better from a literary or dramatic point of view than others, and his job is to choose the best. An episode in which all the leading characters were simultaneously killed in a railway accident might well “fit”, but would probably not be regarded as the best way of developing the story for the future! In the same way, each interpretation of the law adds something to the legal story, but the law has an integrity of its own and the judge must ensure that his interpretation forms part of a coherent theory justifying the legal system as a whole.
Dworkin therefore argued that in determining the law a judge is constrained to act in accordance with legal principle, and is not (as Hart suggested) free to use his discretion in any way he chooses. He uses an analogy of a sergeant told to select his five most experienced soldiers, and another of a boxing referee told to award the fight to the more aggressive boxer, and says that although the sergeant and the referee are called upon to exercise their judgment the criteria on which they are to do so are clear, and they certainly do not have the total discretion envisaged by Hart’s theory of law. Judges should not have such wide discretion either: if a judge is free to choose without restraint whether to benefit P at D’s expense or vice versa, the law is reduced to a kind of lottery. What we surely expect a judge to do is to enforce the pre-existing rights of one against the other, and where those rights are not clearly spelled out by the rules of law, we expect him at least to apply certain established legal principles.
- Lawson v Serco  UKHL 3
- In three conjoined cases, the House of Lords considered the territorial scope of s.94(1) of the Employment Rights Act 1996, which gives employees the right not to be unfairly dismissed. The question in each case, said Lord Hoffmann, was whether s.94(1) applied to cases in which the employment had some British and some foreign elements. This was a matter of statutory construction, and should be decided according to established principles, giving effect to what Parliament might reasonably be supposed to have intended and attributing to Parliament a rational scheme. But this involved the application of principles, not the invention of supplementary rules. On the other hand, the fact that the House was dealing in principles and not rules did not mean that the decision was an exercise of discretion: the section either applied to each of the employment relationships in question or it did not, and that was a question of law, albeit one involving judgment in the application of the law to the facts.
We consider now several examples of principles in operation.
Persuasive precedents and obiter dicta are evidently not rules – if they were they would be binding rather than persuasive – but are certainly taken seriously by subsequent judges.
- Doughty v Turner Manufacturing  1 All ER 98, CA
- A man P was badly burned when a workmate carelessly knocked a cement block into a bath of molten metal; there was no splash, but a minute or so later there was a violent and wholly unexpected chemical reaction. In Re Polemis  3 KB 560 the Court of Appeal had held that a person who performed a negligent act was liable for all its direct consequences, but in The Wagon Mound  1 All ER 404 the Privy Council had disapproved this rule and had said liability existed only where the kind of damage was reasonably foreseeable. Following the latter decision, the Court found DD were not liable for P’s injuries: whether or not The Wagon Mound is binding on this court, said Harman LJ, we ought to treat it as the law.
- Anderson v Rhodes  2 All ER 850, Cairns J
- Vegetable merchants PP, acting on a verbal recommendation from other merchants DD, supplied potatoes on credit to a firm which subsequently went bankrupt. PP sued for negligent misrepresentation and the judge applied the principles set out by the House of Lords in Hedley Byrne v Heller  2 All ER 575. Academic lawyers, he said, might argue that those principles were obiter dicta and that he was bound by the contrary decision of the Court of Appeal in Candler v Crane Christmas  1 All ER 426. But when five members of the House of Lords have all said after close examination of the authorities that a certain type of tort exists, a judge of first instance should proceed on the basis that it does exist, without pausing to embark on an investigation whether what was said was necessary to the ultimate decision.
- Caparo v Dickman  1 All ER 568, HL
- PP bought shares in F plc with a view to taking it over, and bought more after seeing F’s auditors’ report. The shares then fell in value, and PP sued the auditors for their negligence in preparing their report. Giving judgment for DD, the House of Lords approved a dictum of Brennan J in the High Court of Australia that the law should preferably develop novel categories of negligence incrementally and by analogy with established categories, rather than by a massive extension of a prima facie duty of care restrained only by indefinable “considerations which ought to negative or limit the scope of the duty or the class of person to whom it is owed”.
- Wright v Lodge & Shepherd  RTR 123, CA
- Mrs S was the driver of a Mini which broke down on an unlit dual carriageway in fog. A lorry driven by L, travelling at 60 mph, struck the Mini from behind and then swerved across the central reservation to strike two more cars, injuring W and killing K. L admitted liability but claimed a contribution from S. Although the presence of S’s car was clearly a factual cause of these injuries, the trial judge applied a dictum of Cairns LJ in Rouse v Squires  2 All ER 903 to the effect that the reckless driving of a third party might amount to nova causa interveniens, and found S not liable. [The Court of Appeal subsequently agreed.]
When the rules run out, the judges seek to justify their decisions by analogy with past cases (the “goodness of fit” test) or by explicit appeal to general principles (the “best development” test), and do not simply rely on unfettered discretion.
- McLoughlin v O’Brian  2 All ER 298, HL
- A mother P saw members of her family in hospital about an hour after a fatal road accident – one daughter was dead and her husband and two other children were seriously injured – and herself suffered psychological injury for which she sued the other driver. Lord Wilberforce reviewed the history of “psychic injury” cases and their step-by-step development of the law, and proposed an extension from a victim at the scene to one who (like P) came upon the “immediate aftermath” of the accident, subject to other criteria which P satisfied. Lord Scarman took a different approach, and said there was a general legal principle that tortfeasors were liable for the foreseeable consequences of their acts: it was foreseeable that P would suffer psychological injury in these circumstances, and that was enough.
- Gillick v West Norfolk & Wisbech HA  3 All ER 402, HL
- A mother of five daughters sought a declaration that a doctor would be acting unlawfully if he gave contraceptive treatment to any of her daughters without the mother’s consent. The House of Lords considered various principles such as the rights and responsibilities of parents, the human rights of children as individuals, the need to respect doctors’ professional judgments, the importance of medical confidentiality, the welfare of the child as a cornerstone of family law, and the importance of not encouraging under-age sex, before deciding by a majority of 4-1 that a child under 16 who has sufficient intelligence to understand fully the implications of the proposed treatment (a “Gillick competent” child) can give her own consent to medical treatment.
- Munroe v London Fire Authority  2 All ER 865, CA
- Firefighters employed by DD were called to PP’s premises, where a number of small fires had been started by third parties. When the firefighters arrived the fires had apparently been extinguished, and after looking around they decided there was no more danger and left. One of the fires was not in fact extinguished and flared up again, causing damage to PP’s premises, and PP sued. In a preliminary hearing, Rougier J considered whether the fire brigade owed PP a duty of care. There were no decided cases directly on the point, so his judgment referred to 24 cases about duties of care owed or not owed by police officers, prison officers, hospitals, local councils, river authorities and so on. The closest analogy, he felt, was with the police, and he applied the reasoning in Hill v Chief Constable of West Yorkshire  2 All ER 238 to hold that the brigade owed no duty of care to PP. [The decision was affirmed on appeal: Stuart-Smith LJ said the fire brigade was not under any common law duty to respond to a call for help, and was not liable if it failed to do so promptly or effectively or at all.]
Where the rules seem to lead to an unacceptable result, the judges may appeal to principles as justification for setting the rules aside. This goes beyond Hart’s “exercise of discretion”: if the rules are clear there is no scope for any exercise of discretion at all. Rather, the judges are weighing a powerful principle “that the rules should be applied” against another principle of fairness or justice or morality and coming down in favour of the latter.
- Riggs v Palmer (1889) 115 NY 506, Supreme Court (New York)
- A court had to decide whether a man who had murdered his grandfather could take his inheritance under the victim’s will. The rules of succession were apparently clear, but the judges cited repeatedly the principle that “no one should profit from his own wrong” and held that this overrode the provisions of the will, balancing this against the principle calling for a literal interpretation of the statute. The written rules, they said, were subject to certain overriding principles, and the legislature “could not possibly” have intended the murderer to inherit.
- Central London Property v High Trees House  KB 130, Denning J
- In 1937 the owners PP leased a block of flats in London to DD at an agreed rent. When war broke out, many flats were left empty as people moved out to escape the bombing, and PP agreed to reduce the rent by half. DD paid the reduced rent until the end of the war, and PP then claimed for the “arrears”. The rules were clear – at common law, a promise made without consideration is not legally binding – but Denning J said that where a party to a contract makes a promise to the other, which he knows will be acted on, that he will not enforce his strict legal rights, the equitable principle of promissory estoppel makes that promise binding on him until such time as he gives reasonable notice of his intention to resume those rights.
- R v R (rape – marital exemption)  4 All ER 481, HL
- Removing the “marital rape” exemption to affirm the conviction of a husband who had raped his own wife, the House of Lords clearly took into account a twentieth-century principle (not expressly stated anywhere as a rule of law) concerning a woman’s autonomy over her own body. This principle (and others) they then weighed against the very strong principle of stare decisis in relation to more than two hundred years’ recognition of the exemption and decided that the greater weight was in favour of the former.
- Re Pinochet  4 All ER 897, HL;  1 All ER 577, HL
- The House of Lords, reversing the Divisional Court, ruled 3-2 that A’s position as a former head of state did not confer immunity from extradition proceedings based on allegations by a third country of torture and other violations of human rights. Amnesty International was allowed to present an amicus curiae argument at this hearing, and it subsequently emerged that Lord Hoffmann, one of the judges in the majority, had for some years been a non-executive director of Amnesty International Charity. There is normally no appeal from decisions of the House of Lords, but A sought to set aside the Lords’ judgment, and a panel of five different Law Lords allowed his application. Lord Browne-Wilkinson said the fundamental principle is that a man must not be a judge in his own cause: Lord Hoffmann was so closely and actively associated with one of the parties that he was disqualified from hearing the case regardless of whether or not there was any actual appearance or risk of bias. This did not mean that judges could never sit in cases involving charities they supported, but where the judge was a director or senior officer of a charity that was party to a case, disqualification was automatic, subject to the possibility that the parties (having been fully informed) might waive any objection.
There are many rules both in statute and in common law that require a judge to determine (for example) whether or not certain conduct is reasonable, and this is often not expressly covered by rules. Here and in many other situations the judge must therefore “exercise his discretion”, but he is expected to do so in accordance with certain well-established principles and may be reversed on appeal if he does not.
- R v Mason  3 All ER 481, CA
- Section 78 of the Police and Criminal Evidence Act 1984 gives a trial judge discretion to exclude prosecution evidence if in the judge’s opinion the admission of the evidence would have such an adverse effect on the fairness of the trial that he ought to exclude it. Where the police obtained a confession by telling the accused man and his solicitor that his fingerprints had been found at the scene of a burglary, the judge exercised his discretion to admit the evidence. The Court of Appeal quashed the conviction, saying the judge had not properly taken into account the deception practised on the solicitor and had thus exercised his discretion wrongly.
- Re S (Custody)  2 FLR 388, CA
- M walked out leaving F with a girl G, aged 2. F was subsequently awarded custody, but M kept G after an access visit and subsequently obtained from a judge a custody order in her favour. Allowing F’s appeal and remitting the case to the family proceedings court for a new hearing, the Court of Appeal said G’s welfare was the first and paramount consideration: there is no legal presumption in favour of one parent over another, even though in practice a small child is usually better off with its mother, and the judge was wrong to prefer his discretion over the magistrates’.
- Barrett v Ministry of Defence  3 All ER 87, CA
- A sailor S became so drunk one night that he passed out and, having then been inadequately treated, choked to death on his own vomit. His widow P sued the Navy for their negligence. The trial judge thought it fair and reasonable to impose on DD a duty of care to prevent S becoming drunk, but the Court of Appeal disagreed. I can see no reason, said Beldam LJ, why it should not be fair, just and reasonable for the law to leave a responsible adult to assume responsibility for his own actions in consuming alcoholic drink. No one is better placed to judge the amount that he can safely consume or to exercise control in his own interest as well as the interest of others.
Even in sentencing, where judges have considerable discretion within the statutory limits, the Court of Appeal lays down guidelines for each offence. They will not alter a sentence that differs only slightly from the guideline, but the judge’s discretion is certainly not unlimited, and a sentence that is wildly out of line (even though within the rules) will be changed and the judge possibly criticised.
Principles are not simply rules of a different kind: they differ from rules in two fundamental ways.
First, the validity of rules is determined by the rule of recognition, which relates to their form and pedigree. A rule is valid if it satisfies certain criteria, normally quite independent of the rule’s content. But legal principles cannot be validated by pedigree: they are “valid” because they are felt to be appropriate by society and, in particular, by the judges. The rule of recognition cannot bring them in except by saying that they are those principles which society regards as legally binding, and that is a circular definition which involves an examination of the content of the principle in order to assess its validity.
Second, rules of law cannot conflict: any rule of recognition must necessarily include some test for determining which (if either) of two apparently conflicting rules is valid. In English law, for example, a later statute supersedes an earlier, a statute takes precedence over a common law rule, and directly applicable European law takes precedence even over Acts of Parliament. But there is no limit on conflicting principles. Suppose, for example, that we regard the principle that “no one shall profit from his own wrong” as a rule of law. We see immediately that there are exceptions to this “rule” – the doctrine of adverse possession, for example – and we seek to explain them. In terms of rules this is not easy, but in terms of principles it is simply a matter of setting that principle against another (and, in this case, weightier) that “stability and certainty of land tenure is to be promoted”.
Principles and policies
Dworkin’s general view of law is a liberal democratic one. He demands that the government treat people as deserving equal concern and respect, and impose no constraint depriving any citizen of a sense of equal worth. Any class or caste system that counts some members inherently less worthy than others, he says, is a system yielding no communal responsibilities and hence no moral duty to obey the law. Individual rights are fundamental, and the law should not be used to enforce private morality. Rather, the law acts as a constraint on the government and requires it to justify every use of coercive force against the individual.
Dworkin also demanded that judges respect the idea of the democratic mandate so far as policy-making is concerned, and leave matters of policy wherever possible to the elected legislature. The judges’ role is to apply legal rules and legal principles – the latter already exist and have merely to be discovered – rather than to make legislation based on policy and affecting retrospectively the rights of the parties in the instant case. He saw it as a matter for regret that judges do sometimes claim to be applying policy considerations when in fact they are looking for legal principles, and a matter for condemnation that they sometimes apply policy in fact as well as in name. He thus drew a distinction between principles and policies, though conceding that most principles could be framed as policies and most policies as principles by anyone so inclined. A policy, he said, is a standard setting out a goal to be achieved, usually in terms of the economic, social or political well-being of the community. A principle, on the other hand, sets individual rights above communal well-being and imposes a standard of justice or fairness or some other moral dimension. Matters of policy should be left to the elected legislators; judges should concern themselves only with legal principles, distinguishable from merely moral principles by the fact that lawyers and others regard them as being part of the legal system.
In ordinary debate, a person who asserts that a certain decision is “a matter of principle” is generally understood to mean that the consequences are immaterial to the decision, and Dworkin’s definition is similar. Principles, he says, are propositions that describe rights; policies are propositions that describe goals. Individual rights trump utilitarian goals, though in a time of major emergency (such as all-out war) it may be permissible to give goals preference over rights in order to regain or preserve a state of affairs in which principled decisions are once again possible. In general, however, judges should confine themselves to principles and rights, and leave matters of policy and goals to the elected legislature.
On the whole, the judges tend to agree with this view at least in what they say. Questions of social policy are better left to Parliament, they say, and it is not for judges to interfere in such matters.
- Fisher v Bell  3 All ER 731, DC
- A shopkeeper displayed in his window a flick knife with a price ticket, and was prosecuted for offering for sale an offensive weapon contrary to the Restriction of Offensive Weapons Act 1959. The Divisional Court said the phrase “offer for sale” was to be taken literally, in accordance with its meaning in contract law, and that D’s display of the weapon was no more than an invitation to treat. This interpretation made the relevant section of the statute almost wholly useless, but was consistent with the strict approach commonly adopted towards penal provisions. This is classically a decision based on principle, perhaps even wrongly so in the circumstances, and clearly takes no account of policy considerations.
- Bradbury v Enfield LBC  3 All ER 434, CA
- The local authority DD anticipated by some months Ministerial approval of a reorganisation plan for eight schools. Goff J conceded that DD were in breach of the statutory rules providing that they “shall not do anything” prior to such approval, but refused to grant relief because of the major inconvenience that would be caused. However, the Court of Appeal granted the injunction sought: Lord Denning MR agreed there might well be considerable upset for a number of people, but thought it more important to uphold the rule of law.
- Morgans v Launchbury  2 All ER 606, HL
- The victims of a road accident sued the owner of a car, claiming she was vicariously liable for the negligence of the driver, a friend of her husband driving the husband to a restaurant after an evening at the pub. They claimed it was a “family car” and that the husband should be regarded as joint owner with the wife. The House of Lords found for the owner and refused to bring into the law the novel “family car” concept. Lord Pearson said such an innovation, whether or not it was desirable, was not suitable to be introduced by judicial decision. It raised important questions of policy, which needed to be considered by Parliament, using the resources at their command for making wide enquiries and gathering evidence and opinion. Lord Wilberforce said that to declare from that date a new and more extensive principle of liability would affect many people’s assumed legal rights. Any such new direction must be set by Parliament for the future, not by the courts retrospectively.
- Blathwayt v Cawley  3 All ER 625, HL
- A man T made a will which left substantial property on trust to C for life and then to C’s sons, on condition that they were not and did not become Roman Catholic. T died and the property passed to C, who subsequently joined the Roman Catholic church. In a judgment based mainly on technical matters, the House of Lords said obiter that the public policy against religious discrimination must give way to the principle of testamentary freedom.
- McLoughlin v O’Brian  2 All ER 298, HL
- P’s family were injured in a road accident, and she suffered psychiatric illness after seeing them in hospital. She sought damages from the other driver, although she had not been at the scene of the accident, and the House of Lords agreed she should succeed. Lord Scarman said that by concentrating on principle instead of policy the judges could keep the common law flexible and consistent, and keep the legal system clear of policy problems which neither it nor the trial process were equipped to solve. If this led to socially unacceptable results, Parliament could legislate to draw a line or map out a new path. Why should the courts not draw the line, as the Court of Appeal had tried to do in this case? Because the policy issue – where to draw the line – was not justiciable: the problem was one of social, economic and financial policy.
- Cambridge Water v Eastern Counties Leather  1 All ER 53, CA & HL
- A tanning firm on a small industrial estate used organochlorides which were stored in drums on their land. Over time there were small spillages on the land, and chemicals percolated down (over a period of several years) and entered the water supply. When new EC quality standards for drinking water were introduced, the level of chemicals in the water abstracted from that area was unacceptable, and the water company sued for damages. Giving judgment for PP in the Court of Appeal, Mann LJ said that where (as here) the law binding on the court was clear, the court’s decision could not be affected by policy considerations. Whether that old law was still appropriate for modern conditions was for others [i.e. for Parliament] to decide. The House of Lords subsequently reversed the decision on the basis that the existing law was not as clear as it had appeared, but agreed on the policy question. The protection and preservation of the environment is now perceived as being of crucial importance to the future of mankind, said Lord Goff, and public bodies are taking significant steps to make the polluter pay for damage to the environment, but it does not follow that a common law principle should be developed or rendered more strict to provide for such liability. On the contrary, it may well be undesirable that the courts should do this.
- Hunt v Severs  2 All ER 385, HL
- P had been injured in a road accident through D’s negligence, but D visited her regularly while she was in hospital, subsequently married her, and continued to provide part of her care. The trial judge awarded damages which included an amount for the care being provided by D, and the Court of Appeal upheld this award, but the House of Lords reversed this decision. The fact that the damages would actually be paid by D’s insurers made no difference, said Lord Bridge; to accept this as a relevant factor would represent a novel and radical departure in the law of a kind which only the legislature might properly effect. At common law, the fact that a defendant was contractually indemnified by a third party could have no relevance whatever to the measure of liability.
- R v Clegg  1 All ER 334, HL
- A soldier who used excessive force and killed an escaping joyrider appealed against his conviction for murder, but the House of Lords said no alternative verdict of manslaughter was possible in these circumstances. Lord Lloyd said he was not averse to judges’ developing law, or even making new law, when they could see their way clearly, even where questions of social policy were involved. A good recent example would be the decision that a man could be guilty of raping his wife. But in the present case their Lordships should abstain from lawmaking: the point in issue was closely related to the wider issue of whether the mandatory life sentence for murder should be retained, and that wider issue could only be decided by Parliament. [The conviction was later quashed on different grounds.]
- C (a minor) v DPP  2 All ER 43, HL
- A 12-year-old boy A was charged with interfering with a motor cycle under s.9(1) of the Criminal Attempts Act 1981. The magistrates convicted him, and the Divisional Court not only upheld the conviction but declared obsolete the common law presumption of doli incapax (that a child between 10 and 14 is incapable of committing a crime unless shown to have known that his act was seriously wrong): it had no utility today, said Mann LJ, and ought to go. The House of Lords disagreed. The presumption had been discussed in many official reports, said Lord Lowry, and a draft Bill produced by the Law Commission in 1985 had proposed its abolition, but a white paper in 1990 had indicated that the government had no intention of changing the law in this respect. The imperfections attributed to that doctrine could not provide a justification for saying that it was no longer part of English law, and to sweep it away under the doubtful auspices of judicial legislation was impracticable.
- Al-Masari v Home Secretary (1996) unreported
- A Saudi dissident A applied for political asylum in the UK. The government of Saudi Arabia objected to his writings, which advocated Muslim fundamentalism and were highly critical of that government, and threatened economic measures against the UK if A was allowed to continue. The UK government acknowledged implicitly that A’s life would be in danger if he were returned to Saudi Arabia, but refused political asylum and ordered his deportation to a third country willing to receive him. The Home Office minister argued that a balance had to be struck between A’s right of free speech and thousands of British jobs. The Chief Immigration Adjudicator (Judge Pearl) allowed A’s appeal, saying DD had not established that Dominica (to which A was to be deported) was a safe third country. The Home Office subsequently granted “exceptional leave to remain” in the UK for four years.
- R v Davis  UKHL 36
- A man D was charged with murder. The main witnesses against him were frightened to testify in open court, so the judge ordered that their identities should be kept secret (from the defendant as well as from the public). Allowing D’s appeal against his conviction, the House of Lords said that a defendant’s right to confront his accusers is fundamental to the adversarial system of justice, and any departure from this common law rule would be a matter for Parliament.
The judges’ words are sometimes belied by their actions, however, and there are many examples of cases in which judicial decisions have clearly been based on social policy goals rather than on individual rights.
- Smith v Hughes  2 All ER 859, DC
- A number of prostitutes DD were charged with soliciting “in a street or public place” contrary to s.1(1) of the Street Offences Act 1959. One had been on a balcony above the street, and others had been sitting behind open or closed windows at first-floor level. Upholding their convictions, Lord Parker CJ said this was the mischief the Act was intended to prevent – everybody knows this was an Act intended to enable people to walk along the streets without being molested or solicited by prostitutes – and if DD could be clearly seen from the street that was sufficient. The mischief rule is essentially a rule of policy, but its use can probably be justified if the court is genuinely seeking to determine the intention of Parliament.
- Shaw v DPP  2 All ER 446, HL
- D published a booklet containing the names and addresses of prostitutes, their photographs, and details of the services they provided, and was charged with conspiracy to corrupt public morals, a supposed common law offence never previously charged. The House of Lords (Lord Reid dissenting) upheld his conviction. Viscount Simonds said that in the sphere of criminal law he had no doubt the courts retained a residual power to enforce the supreme and fundamental purpose of the law, namely, to conserve not only the safety and order but also the moral welfare of the state. It was their duty to guard against attacks which might be the more insidious because they were novel and unexpected. Here was a policy decision pure and simple.
- Chadwick v British Railways  2 All ER 945, CA
- A rescuer who voluntarily spent some six hours helping to bring the dead and injured out of the wreckage at a particularly harrowing train crash suffered foreseeable psychiatric injury, and was awarded damages against British Railways, whose negligence had led to the crash. Although he had come onto the scene voluntarily, the courts as a matter of policy do not want to discourage rescuers.
- Nettleship v Weston  3 All ER 581, CA
- A learner driver D went out for her first lesson, supervised by a friend P. D crashed the car into a lamppost, and P was injured. P’s claim for damages was upheld by the Court of Appeal, subject to a deduction for contributory negligence. Even learner drivers, said the Court, are to be judged against the standard of the reasonably competent driver. The fact that a particular driver is inexperienced and incompetent does not excuse his falling short of this standard. Lord Denning MR justified the decision by reference to policy: the injured person can recover damages from an insurer only if the driver is liable in law. So the judges must see to it that the driver is liable unless he can prove care and skill of a high standard. In this branch of the law, he went on, we are moving away from the concept “No liability without fault” to another, “On whom should the risk fall?”. Morally the learner driver is not at fault, but legally she is liable because she is insured and the risk should therefore fall on her.
- DPP v Majewski  2 All ER 142, HL
- Following a fight in a pub, D was charged with assaulting a constable and causing actual bodily harm; his defence was that he had taken a mixture of drink and drugs and had no intention to commit the acts in question. The Criminal Justice Act 1967 s.8 says that in determining whether a defendant intended certain consequences the court must have regard to all the evidence, the burden of proof being on the prosecution, and must not infer such an intention merely because the consequences were likely, but the House of Lords said that rule was irrelevant to cases such as this. One of the prime purposes of the criminal law, said Lord Simon, is the protection from unprovoked violence of people who are pursuing their lawful lives; to allow intoxication as a defence would leave the citizen unprotected from such violence where the perpetrator had taken drink or drugs and did not know what he was doing. The decision is not hard to understand, and is probably right, but it is based on policy considerations going to goals, and ignores the principle that criminal statutes are to be construed narrowly.
- Ashton v Turner  3 All ER 870, Ewbank J
- An escaping burglar was injured through the negligence of his getaway driver, but his claim for damages got short shrift from the courts. As a matter of policy, said the judge, the law will in some circumstances refuse to recognise the existence of a duty of care owed by one participant in crime to another.
- R v O’Grady  3 All ER 420, CA
- Following a drunken brawl, D was charged with murder; his defence was that (in his drunken state) he had overestimated a threat to himself and responded in self-defence. Dismissing his appeal against his conviction for manslaughter, the Court of Appeal said the principle that a person should not be penalised for an honest mistake gave way to the policy consideration that society should be protected from those who do dangerous things as a result of intoxication.
- Hill v Chief Constable of West Yorkshire  2 All ER 238, HL
- In spite of a massive police search, the “Yorkshire Ripper” remained free for several years and murdered a dozen young women. The mother of his last victim sued the police for negligence in failing to catch him, alleging inefficiency and errors in their handling of the investigation. The House of Lords said she could not succeed: the police owed no duty of care towards Jacqueline Hill to protect her from the Ripper. Glidewell LJ in the Court of Appeal and Lord Keith in the House of Lords suggested that there were public policy reasons for not allowing claims like these. If such claims were allowed, they said, the police would be inhibited in the exercise of their professional judgment, and a vast amount of police time and money would be diverted from the fight against crime to the defending of civil cases.
- Alcock v Chief Constable of South Yorkshire  4 All ER 907, HL
- Claims were brought by relatives and friends of some of the people killed in a crush at the Hillsborough football stadium in Sheffield after the police had negligently allowed a crowd to build up too rapidly in a particular part of the stand. The House of Lords reviewed the law on the scope of liability for psychiatric injury, and confirmed the continued existence of tests other than mere foreseeability. Rescuers should continue to qualify on policy grounds, said the House, even though they were not in a close relationship with the victim. But otherwise, several of their Lordships referred to the “floodgates” argument, and the fear of opening up unlimited liability, while Lord Oliver openly used the word “policy” in explaining his decision.
- R v Powell  4 All ER 545, HL
- DD went with another man to buy drugs from a dealer; the dealer was shot as he came to the door and (as it could not be shown who fired the shot) both were convicted as accomplices to murder. Affirming DD’s conviction, Lord Hutton said it was enough to found a conviction for murder that a secondary party to a joint enterprise realised the principal might kill with the intention of causing death or serious injury. It might be anomalous and illogical that a secondary party could be convicted on lesser mens rea than was required of the principal, but the rules of the common law were based in part on practical concerns and (in relation to crimes committed during a joint enterprise) the need to give effective protection to the public against criminals operating in gangs.
Almost everyone has a concept of justice – even a small child understands that certain things are “not fair” – but like an elephant it is easier to recognise than to define. A number of distinguished lawyers have made the attempt, however, and we need to consider and compare their various theories. So what is justice?
JUSTICE AS EQUALITY
The Greek philosopher Aristotle lived in the 4th century BCE, and argued in his Nicomachean Ethics that justice (in a politico-legal sense) has two branches: distributive justice and corrective justice. Distributive justice includes the distribution of honours etc among citizens by the state, and the distribution of private property through contracts; corrective justice is concerned with the rectification of unfair distribution. There are always two parties involved, he said, and justice is the mean between the unfairness which favours A and the unfairness which favours B.
Distributive justice, said Aristotle, plainly takes into consideration the merits of the parties – it is unjust that equal parties should have unequal shares, or that unequal parties should have equal shares. Distributive justice is a question of proportion rather than of equality: “all men agree” that what is just in distribution must be according to merit in some sense, though they do not all specify the same sort of merit.
Corrective justice is concerned with restoring a balance which has been disturbed, whether by a voluntary or an involuntary act. Here we are concerned with equality: it makes no difference whether a good man has defrauded a bad man or a bad man a good one, nor whether it is a good or a bad man that has committed adultery; the law looks only to the character of the injury and treats the parties as equal. The judge tries to equalise them once again by imposing a penalty, taking away from the wrongdoer’s gain and (where possible) compensating the victim.
That is not to say that corrective justice requires exact reciprocity – the Lex Talionis of “eye for eye, tooth for tooth” – for in many cases reciprocity and rectificatory justice are not in accord. If an official has unjustly inflicted a wound, for example, he should not be wounded in return, but if some one has wounded an official, he ought not to be wounded only but punished in addition. Moreover, there is a great difference between a voluntary and an involuntary act – justice may require that these be treated differently even where they lead to the same injury.
The Belgian jurist C H Perelman set out in his book De La Justice in 1945 six possible meanings of the word “justice”. He explained these meanings in terms of a distribution of benefits, not dissimilar from Aristotle’s ideas of distributive justice, but his arguments could fairly easily be adapted to deal with burdens.
- To each according to his works
- Justice may be done if people’s rewards are determined according to their contributions, either on a limited scale (e.g. within the factory, those who produce more are paid more) or on a larger social scale (e.g. the doctor is paid more than the dustman, because his greater skills make him a more valuable member of society). This notion of justice is at the heart of a free-enterprise culture, though very few societies, if any, function entirely on this principle.
- To each according to his needs
- Some people would say justice is served if people receive what they need. A parent supporting several children and an aging grandparent receives more in benefits and tax credits than a single person with no such responsibilities, because her needs are greater. A couple with small children gets higher priority on the housing list; and a disabled person is eligible for special benefits and tax allowances because he has special needs. The welfare state is based on this idea of justice.
- To each according to his merits
- Justice may mean that each person gets what he or she deserves: the good (according to the appropriate ethical or other criteria) are rewarded and the bad are penalised. Those who commit serious crimes are sent to prison because this is what they deserve, and (in some people’s belief) the ultimate destination of the soul after death depends on the person’s merits during life on earth.
- To each according to his rank
- Such a concept of justice may sound outdated, and the special privileges of peers have largely vanished from modern society. But teachers often have recreational and other facilities denied to students, older children within a family often have more privileges (and more pocket money) than younger ones, and many large organisations determine salary by reference to a fixed scale (perhaps with annual increments) irrespective of individual productivity.
- To each according to his legal entitlement
- Although this may not be sufficient to ensure justice, it is almost certainly necessary. If the law provides that those charged with a grave offence should receive legal aid, justice surely demands that they receive it. Similarly, if the widow of an intestate is legally entitled to the first £125 000 of his estate, it would not be just if she were deprived of it, even if she were an unloving wife and a millionairess in her own right.
- To each equally
- On its face this might seem the most basic form of justice, even if rarely applied in practice. If several children are given a bag of sweets and told to share them fairly, they will probably interpret that as meaning that each should receive the same amount. But the poll tax was based on this principle of equality, and was widely regarded as an unjust tax because it took no account of other notions of justice such as the ability to pay.
Perelman thus recognised that justice might be defined in various ways, but suggested that supporters of any of his six views would agree on something. Once the type of justice had been defined, he said, and each individual placed in a certain “essential category” according to his works, needs, merits, rank or legal entitlement, everyone would then agree that justice requires all the individuals in a given category to be treated the same. What would be unjust would be if two people, alike in every respect defined by the chosen criteria, were to be treated differently.
In The Concept of Law, H L A Hart linked the idea of formal justice with that of morality. He agreed that like cases should be treated alike – this point seems to be common to most theories of justice – but argued that this raised important questions as to what makes cases alike or different. Why is it just to treat blonde murderers the same as brunettes, but unjust to treat sane murderers the same as insane? If it is just to treat women the same as men in relation to their employment opportunities, why should they be treated differently in relation to maternity leave? And if two people commit similar crimes in similar circumstances, is it just that one should be given a heavier sentence “to set an example to others”?
- R v Reeves  Crim LR 67, CA
- Two men were convicted of receiving stolen property together. D1 had chosen to be tried summarily and had been fined £25, but D2, who had elected trial by jury, was sent to prison for nine months. On appeal, the Court of Appeal said D1’s sentence was ludicrously lenient, particularly as he had a previous conviction for this offence, but ordered D2’s immediate release (after he had served some three months) because of the strong sense of grievance he would feel at the unfairness of the outcome.
- Ghaidan v Mendoza  UKHL 30
- The tenant of a privately rented flat died, and his gay partner (who had lived with him) sought to take over as a statutory tenant, claiming he had been living “as the wife or husband of” the deceased tenant. (In Fitzpatrick v Sterling Housing the House of Lords had rejected such a claim, but had granted the partner an assured tenancy as “a member of the tenant’s family” under a different section of the relevant Act.) Lord Nicholls said Art.14 of the European Convention requires like cases to be treated alike, and unlike cases not to be treated alike. The circumstances which justify two cases being regarded as unlike are infinite, but there are certain grounds of factual difference which are not in themselves acceptable as a basis for different legal treatment. Differences of race or sex or religion are obvious examples, and sexual orientation is another. The majority of the House agreed, and granted the survivor the statutory tenancy he sought.
- R (Carson) v Secretary of State  UKHL 37
- A British pensioner now living in South Africa claimed she should receive the same cost-of-living increases as pensioners still living in the UK. Dismissing her claim, Lord Hoffmann said the principle that everyone is entitled to equal treatment by the state, that like cases should be treated alike and different cases should be treated differently, will be found in most human rights instruments and written constitutions. The claimant was being treated differently from a pensioner who lived in the UK, but that was not discrimination. Discrimination means a failure to treat like cases alike: there was no discrimination when (as here) the cases were relevantly different.
JUSTICE IS DIVINE
The religious philosopher Thomas Aquinas said in the thirteenth century that a just law was one which served the common good, distributed burdens fairly, promoted religion, and was within the law-maker’s authority. That authority is limited by Divine law, and a human law that goes against God’s law is unjust and should not be obeyed. The difficulties of applying such ideas in a largely secular society are obvious, but they are important in their recognition that civil disobedience may sometimes be appropriate.
Aquinas argued inter alia that “almsgiving by the rich from their superfluity, to relieve those in extreme need, is an act of justice”. Note that he does not say that this is a good thing, or that the rich person should be praised: he says it is an act of justice. Almsgiving in these circumstances is what justice requires, he implies, and failure to give alms would be unjust.
The New Catechism of the Roman Catholic Church in England and Wales takes a reformative rather than a conservative view of justice. “Justice is a disposition of the will which inclines us to give to every person what is his or her due with a view to the common good of the whole society. We exercise the virtue of justice … by seeking to change an unjust society … in which some section of the community is systematically exploited in the interests of another wealthy and powerful section.”
The political philosophy of utilitarianism was developed by Jeremy Bentham and modified by John Stuart Mill. A law (or an action) is just, said Bentham, if its overall effect is to increase the sum of human happiness, and unjust if it decreases happiness. It is important to note, though, that this notional calculation must take into account not only the number of people who are made more or less happy, but the depth of their happiness or unhappiness. An action which makes a few people very happy and a lot of people slightly unhappy (or vice versa) may be just or unjust according to the numerical values assigned to their feelings.
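The dependence on assigned numerical values can be illustrated with a small calculation. The sketch below is illustrative only: the figures are invented for the example, the function names are hypothetical, and Bentham proposed no such numerical scale. It simply shows that the same pattern (a few people made very happy, many made slightly unhappy) can come out as just or unjust depending on the values chosen.

```python
# Illustrative sketch only: all figures are invented assumptions,
# not values taken from Bentham or Mill.

def net_happiness(changes):
    """Sum of (number of people x change in happiness per person)."""
    return sum(people * change for people, change in changes)

def is_just(changes):
    """On this simple reading, an action is just if net happiness increases."""
    return net_happiness(changes) > 0

# Five people gain 30 units each; a hundred people lose 1 unit each.
case_a = [(5, 30), (100, -1)]   # 150 - 100 = +50

# Same five winners, but each of the hundred now loses 2 units.
case_b = [(5, 30), (100, -2)]   # 150 - 200 = -50

print(net_happiness(case_a), is_just(case_a))
print(net_happiness(case_b), is_just(case_b))
```

The only difference between the two cases is the depth of unhappiness assigned to the majority, which is enough to reverse the verdict.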
Utilitarianism still plays a major part in the democratic decision-making process, and in its favour are two telling arguments. First, it is a secular theory requiring no reference to any Divine law or other abstract religious principles defensible only by faith. And second, the idea of maximising the total happiness of the community is often applied – though perhaps not mentioned by name – in everyday life, both on a national political level and in ordinary dealings among friends. There is no doubt that it works well for much of the time, but in marginal cases the theory breaks down and produces results far removed from those which most people would consider right.
In particular, utilitarianism is concerned only with the total or average happiness of the community, and has no concern for its distribution. [Different writers take different views on the question whether utilitarianism is concerned with the total happiness of the community or the average happiness of its members. In most cases they increase or decrease together, but in a small community with limited resources the birth of a child – a “happy event” – might increase total happiness while diminishing the average because of the greater number required to share the common food store.] The strict utilitarian would see nothing wrong in slavery, for example, if the happiness of many could be increased thereby sufficiently to outweigh the misery caused to a few.
In reply, the utilitarian commonly argues that the principle of diminishing marginal utility actually favours redistribution moving towards greater equality. £100 given to a poor person increases his happiness by much more than it decreases the happiness of the rich person from whom it is taken. On the other hand, maximum happiness depends on high wealth and high productivity, and these are likely to be promoted by a system which gives incentives for hard work and enterprise, so that the extreme of total equality is avoided. This argument fails, however, if we try to apply utility to a crime such as rape. Rape clearly causes pain (at least in the utilitarian sense) to the victim and presumably gives pleasure to the rapist, and in virtually all cases the pain caused (to other women put in fear, as well as to the individual victim) outweighs the pleasure gained. But in cases of gang rape, where several men are deriving pleasure from the same assault, and in other cases where the pain and pleasure caused are equal, the strict logic of utility leads to the conclusion that the more the rapists’ enjoyment the less “unjust” it is, and it is possible that they might enjoy it so much as to outweigh the victim’s pain. But this defies common sense: the obvious injustice of rape should not depend on the statistics of a particular case.
A second criticism of utilitarianism is that it defines as right or just that which brings about the desirable consequence of increased happiness: in other words, it claims that the end (in this case the result, rather than the purpose) may justify the means. Mill was particularly firm on this point, insisting that the principle of utility is concerned only with the effects of an act and not with the intentions behind it, and the consequence principle is now regarded as a cornerstone of utilitarianism. But this position is one that many people find unacceptable: torturing a terrorist, for example, would be just if it succeeded in extracting evidence preventing further deaths, but unjust if it failed (because there would be no general increase in happiness to offset the unhappiness caused to the terrorist). The impossibility of determining in advance the justness of any proposed act therefore makes utilitarianism of little value as a guide to just behaviour.
JUSTICE AS FAIRNESS
The American jurist John Rawls, who died in 2002, published A Theory of Justice in 1971. He defined justice as that which prevailed in a just society, and a just society as one to which a group of rational but mutually disinterested (i.e. selfish but not envious) people would unanimously choose to belong if such a choice were available. In making that hypothetical choice, however, the individual would operate behind a “veil of ignorance”, knowing nothing about his own position in the society. (We might liken it to a person devising a set of rules for a ship’s crew, knowing that his role in that crew would be decided by pure chance after the rules had been drawn up.) He would be ignorant even of his own age, sex, character, physical and mental abilities, tastes and preferences, beyond those common to all human beings.
Rawls then predicted that any such society would exhibit two essential features. No one, he claimed, would agree to a system involving lasting personal sacrifice for the greater good of others – each would give at least some thought to his own well-being. But people would adopt a “maximin” approach, seeking to optimise the fate of those worst off in case they themselves should come into that class. They would therefore try to ensure that every person had certain basic liberties, such as freedom of person, freedom of speech and thought, freedom to participate in government, and freedom to possess property, to the greatest extent compatible with the enjoyment of the same basic liberties by others. And second, they would not accept significant social or economic inequalities, or differences of treatment, except insofar as these were for the benefit of the least well off members of the society. Thus (said Rawls) people would agree that doctors should be paid higher than average incomes, because this would encourage able people to qualify as doctors and so benefit everyone in the long run.
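The “maximin” approach can be sketched as a simple decision rule: compare candidate societies by the position of their worst-off member and choose the one in which that position is best. The example below is a minimal sketch under stated assumptions: the society names and welfare figures are invented, and Rawls himself offers no numerical scale of this kind. It contrasts the maximin choice with a choice based on total welfare.

```python
# Illustrative sketch only: society names and welfare figures are
# invented assumptions, not drawn from A Theory of Justice.

def maximin_choice(societies):
    """Return the society whose worst-off position is the best available."""
    return max(societies, key=lambda name: min(societies[name]))

def highest_total_choice(societies):
    """Return the society with the greatest total welfare."""
    return max(societies, key=lambda name: sum(societies[name]))

# Each list gives the welfare of the possible positions in that society.
societies = {
    "strict equality": [10, 10, 10, 10],
    "inequality benefiting the worst off": [12, 15, 20, 30],
    "high total, neglected minority": [2, 30, 40, 50],
}

print(maximin_choice(societies))
print(highest_total_choice(societies))
```

On these invented figures the maximin rule prefers the unequal society in which even the worst-off position is better than under strict equality, while a rule based on total welfare would prefer the society that neglects its worst-off members.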
JUSTICE AS ENTITLEMENT
Another American, Robert Nozick, put forward in Anarchy, State, and Utopia (1974) a very different approach. His idea of justice was based on rights, and he defined a just society as one in which individual rights were accorded the respect due to them. Each individual, said Nozick, has certain natural rights to the enjoyment of life, health, liberty and possessions without interference from others, and to compensation from anyone who trespasses upon them. Those rights are inalienable: no other person and no state authority can justly diminish them, for however good a cause, without the individual’s consent. Inequalities between human beings are a fact of life, and justice does not require that they be corrected.
In the early days, said Nozick, each individual was responsible for protecting his or her own rights, but the weakness of an individual alone and the danger that he might assess wrongly the extent to which his rights had been violated could lead to anarchy. Human beings therefore came gradually to accept the role of the state as a protector of these rights, but that is the limit of the state’s legitimate role. It can secure compensation for any member whose natural rights are infringed, and it can properly prohibit potentially dangerous conduct (such as the driving of motor cars by epileptics) as long as the individuals thus restricted are compensated for their loss of liberty. But the right to life (say) does not extend to the right to be fed or housed by others: such a claim would impinge upon their rights to enjoy their own property.
Property, said Nozick, can be justly acquired in three ways: if it was previously unowned by anyone and was acquired by the individual’s effort or skill; or if it is validly transferred by its previous owner (e.g. by way of gift or sale); or if it was transferred by court order to rectify a previous unjust acquisition (e.g. as compensation for a crime). But there should be no question of redistribution for social purposes. No one suggests that kidneys should be compulsorily redistributed, even though the possession of two functioning kidneys is just as much a matter of “unfair” chance as the possession of inherited wealth. Rawls’ ideas of distributive justice, said Nozick, involved unwarranted interference with the inherent rights of individual members of society.
IS THE LAW JUST?
No one, surely, would disagree with the proposition that one of the aims of any legal system should be the promotion of justice. It should not necessarily be the sole aim – other “goods” to be promoted might include mercy, liberty, public order and the avoidance of unreasonable public expenditure – but it is an important one. The buildings in the Strand commonly called “the Law Courts” are actually the “Royal Courts of Justice”, and High Court judges and magistrates have been known as “Justices” for many years.
However, different judges see their duty to promote justice in different ways. In his autobiography The Family Story, Lord Denning wrote that “My root belief is that the proper role of the judge is to do justice between the parties before him. If there is any rule of law which impairs the doing of justice, then it is the province of the judge to do all he legitimately can to avoid the rule, even to change it, so as to do justice in the instant case before him.”
Sir Robert Megarry VC in Tito v Waddell (No.2)  3 All ER 129 took a different view: “The question is not whether the plaintiffs ought to succeed as a matter of fairness or ethics or morality. I have no jurisdiction to make an award to the plaintiffs just because I reach the conclusion … that they have had a raw deal. This is a Court of Law and Equity (using “equity” in its technical sense), administering justice according to law and equity, and my duty is to examine the plaintiffs’ claim on that footing.”
Where possible, most judges take a position somewhere between these two extremes, bringing justice (however they understand the word) into the administration of the law whenever they have the opportunity.
The extent to which the English legal system is formally just and/or leads to substantive justice depends on one’s definition of justice. Most people (with the possible exception of Marxists and those who endorse “critical legal studies”) would agree that most of the system is just and leads to just results most of the time. But a number of illustrative cases and other points are worth considering.
The rules of procedure in English law generally show an intention to secure formal justice, and often succeed.
Everyone has access to the law, and everyone (even the government) is subject to the law. That seems to show justice as fairness, but if the Government does not like the result it can usually reverse it by legislation.
- Burmah Oil v Lord Advocate  2 All ER 348, HL
- Various oil installations belonging to PP were blown up by British troops during World War II to prevent their falling into Japanese hands. In interlocutory proceedings the House of Lords said that although the troops’ actions were lawful, PP had prima facie a common law right to compensation. Following that decision, Parliament enacted the War Damage Act 1965, which declared that no compensation was payable in such cases and was expressly made retroactive so as to prevent any final judgment in PP’s favour.
- Congreve v Home Office  1 All ER 697, CA
- The Government announced a substantial increase in the cost of television licences, to come into effect at a specified future date. P and others sought to avoid paying the higher price by buying new licences (at the old rate) before their existing licences had expired. The Home Secretary then purported to exercise his discretionary powers to cancel the new licences, but the Court of Appeal said to do so would be unlawful. Parliament had given him that power in order that he could prevent the improper use of broadcasting equipment, not so that he could penalise those who were acting quite lawfully.
- Council of Civil Service Unions v Minister for the Civil Service  3 All ER 935, HL
- The Government announced changes in the conditions of service of workers at GCHQ, made under prerogative powers and affecting chiefly the workers’ rights to belong to a trade union and take part in its activities. The House of Lords said the Prime Minister could be made to answer to the Court for the way in which the royal prerogative was exercised, although the instant case involved issues of national security, which are not justiciable.
- M v Home Office  3 All ER 537, HL
- An alien M sought judicial review of a decision to deport him. In interlocutory proceedings Garland J understood counsel for D to give an undertaking that M would not be removed pending final judgment, but on learning later that night that M had in fact been deported, the judge ordered the Home Office to bring him back. The Home Office failed to do so. The House of Lords held the Home Secretary in contempt of court but declined to impose any punishment because there had apparently been a genuine misunderstanding by the Home Office legal advisors.
Everyone is entitled to put his or her case in court, either personally or through a legal representative, and various legal help schemes are supposed to ensure that everyone can get legal advice and (where necessary) legal representation at trial. But legal aid does not cover all types of case – it does not even cover most civil cases since April 2000 – and many people with even modest incomes are not eligible. Conditional fee agreements are available to plaintiffs with a good case, but defendants and those whose chances of success are poor may find it hard to get a lawyer willing to take the case without payment. As Darling J is supposed to have said, “the law, like the tavern, is open to all.”
Judges, magistrates and jurors must not be biased (or even appear to be biased), but prospective jurors cannot be questioned about their views and the right of peremptory challenge was abolished some twenty years ago.
- Dimes v Grand Junction Canal (1852) 10 ER 301, HL
- The canal company RR brought a case in equity against a landowner A; the Vice-Chancellor granted RR’s request and Lord Cottenham LC upheld the decision on appeal. A then discovered that Lord Cottenham held a substantial block of shares in the canal company and applied to have the Chancellor’s decision set aside. The House of Lords said that although there was no suggestion that the Lord Chancellor had in fact been influenced by his interest in the company, no case should be decided by a judge with a financial interest in the outcome. The Chancellor’s orders were therefore set aside as such, but those of the Vice-Chancellor (to the same effect) were confirmed.
- R v Bingham JJ ex p Jowitt (1974) Times 3/7/74, HC QBD
- A motorist D was charged with exceeding the speed limit, and the only evidence was given by D and a police officer, who contradicted one another. Finding D guilty, the chairman said “My principle in such cases has always been to believe the evidence of the police officer.” The High Court quashed the conviction; this remark would cause any reasonable person to suspect that the chairman of magistrates was biased and that D had not had a fair trial.
- Bradford v McLeod  Crim LR 690, HCJ (Scotland)
- A miner convicted of a breach of the peace on the picket line appealed against his conviction. The sheriff had previously been heard to say on a social occasion that striking miners should not be given legal aid, but declined to disqualify himself and insisted he could try the case fairly. The High Court of Justiciary said this was enough to create a reasonable suspicion of bias, whether there had in fact been any bias or not: the sheriff should have disqualified himself.
- Re Pinochet  4 All ER 897, HL;  1 All ER 577, HL
- The House of Lords, reversing the High Court, ruled 3-2 that A’s position as a former head of state did not confer immunity from extradition proceedings based on allegations by a third country of torture and other violations of human rights. Amnesty International was allowed to present an amicus curiae argument at this hearing, and it subsequently emerged that Lord Hoffmann, one of the judges in the majority, had for some years been a non-executive director of Amnesty International Charity. A sought to set aside the Lords’ judgment, and a panel of five different Law Lords allowed his application. Lord Browne-Wilkinson said the fundamental principle is that a man must not be a judge in his own cause: Lord Hoffmann was so closely and actively associated with one of the parties that he was disqualified from hearing the case regardless of whether or not there was any actual appearance or risk of bias. This did not mean that judges could never sit in cases involving charities they supported, but where the judge was a director or senior officer of a charity that was party to a case, disqualification is automatic subject to the possibility that the parties (having been fully informed) might waive any objection.
The rules of evidence ensure that only relevant evidence fairly obtained is given in court. Evidence obtained by oppression or in circumstances making it unreliable can be excluded, though not every illegal act by the police necessarily has this result. But the rules of evidence can operate against substantive justice when they exclude evidence (perhaps evidence improperly obtained) that might have led the court to the right decision.
- Sparks v R  1 All ER 727, PC (Bermuda)
- A white man A was charged with indecently assaulting a three-year-old girl. The girl had told her mother that it was a coloured boy who had assaulted her, but the girl did not give evidence and the mother was not allowed to say what her daughter had said because of the rule against hearsay evidence. A was convicted and appealed. The Privy Council allowed his appeal on other grounds, but Lord Morris said the cause of justice was best served by strict adherence to the recognised rule. [Section 114(1)(d) of the Criminal Justice Act 2003 now makes hearsay admissible where the judge is satisfied that it would be in the interests of justice for it to be so.]
- Jeffrey v Black  1 All ER 555, DC
- A student D was arrested for stealing a sandwich from a pub; the police searched his flat, where they found a quantity of drugs. At D’s trial for possession of drugs the justices found the evidence inadmissible and dismissed the charge, but the Divisional Court remitted the case for rehearing by a new bench. Although the police had no right to search D’s home without his consent (because they had no reasonable grounds to suspect large-scale sandwich theft!) that was not in itself a reason to exclude the evidence.
- R v Watts  3 All ER 101, CA
- A man of low intelligence was charged with indecently assaulting a woman in an underpass; he claimed to have been at home at the time and said a supposed “confession” had been fabricated by the police. At his trial, the judge allowed evidence to be given of two previous convictions for sexual assaults on children, and the jury convicted. Allowing his appeal, the Court of Appeal said the prejudicial effect of these previous convictions outweighed any possible value they might have as indicators of his truthfulness, and that the judge should not have allowed them to be mentioned. [Section 101 of the Criminal Justice Act 2003 now allows evidence of the defendant’s previous convictions to be given in some circumstances, but not where it would have such an adverse effect on the fairness of the proceedings that the court ought not to admit it.]
- R v Mason  3 All ER 481, CA
- A man D confessed to burglary after the police had told him and his solicitor that his fingerprints had been found at the scene of the crime. This was not in fact true, but the judge admitted the confession to evidence and D was convicted. The Court of Appeal quashed the conviction: the deception of the solicitor was a serious matter, they said, making it impossible for him to give D the best advice, and the judge should have exercised his discretion to exclude the evidence.
- R v Miller (1992) Times 24/12/92, CA
- A man D was charged with murder, and the evidence against him included his confession. When interviewed by the police he had denied his involvement more than three hundred times, but in the face of “questioning” that took the form of officers’ repeatedly shouting at him what they wanted him to say, he eventually gave way and admitted that he might have been there but could not remember. The Court of Appeal quashed his conviction on the grounds that the confession had been obtained by undoubted oppression: short of physical violence it was hard to imagine a more hostile and intimidating approach. There was evidence that D had a mental age of only 11, but the tenor and length of the interviews was such that they would have been oppressive even with a person of normal mental capacity.
- R v B (A-G’s Ref. No.3 of 1999) (2000) Times 15/12/00, HL
- A man B was charged with rape after the police found a match between DNA taken from the scene and a sample of B’s DNA, which they had taken on an earlier occasion but had not subsequently destroyed as the law then required. The judge refused to allow any DNA evidence to be given and the trial collapsed; the Court of Appeal said the judge had been right, but the House of Lords adopted a different reading of PACE and said the judge could have exercised his discretion to allow it. The fairness of a trial has to take account of fairness to the victim and to the public at large, as well as to the defendant.
Trial by jury (in criminal cases) allows the jury to do justice as it sees it, irrespective of the substantive rules of law, but the jury is not required to explain its decision and may in fact decide on the basis of the lawyer’s appearance or manner or some other irrelevant factor.
- R v Ponting  Crim LR 318, McCowan J
- D was a civil servant working in the Ministry of Defence, and saw documents indicating that the Government had lied in their account of the sinking of the General Belgrano during the Falklands War. He gave copies of these documents to an Opposition MP so that the matter could be raised in Parliament, and was charged with an offence against the Official Secrets Act. In spite of the judge’s clear direction that D’s conduct did amount to an offence, the jury acquitted him and he walked free.
- R v Wilson (1996) unreported
- Four women were acquitted by a jury at Liverpool Crown Court after causing £1½m worth of damage to a Hawk fighter jet. Their defence was that the jet was to be sold to Indonesia, where it would be used against the people fighting for independence in East Timor: their action was thus the prevention of the greater crime of genocide.
- R v Blythe (1998) unreported
- A man D was charged with cultivating cannabis with intent to supply it to his terminally ill wife W, who suffered severe pain from multiple sclerosis. Judge Hale told the jury at Warrington that the defence of duress of circumstances was not available in such a case, even though D feared W might commit suicide, but the jury disregarded this instruction and found D not guilty. D was convicted of simple possession and fined £100.
The House of Lords may use its powers under the Practice Statement 1966 to depart from an earlier decision which seems to lead to an injustice.
- R v G & R  4 All ER 765, HL
- Boys aged 11 and 12 were convicted of arson, the judge having ruled (following R v Caldwell  1 All ER 961, HL) that recklessness was based on a risk obvious to a reasonable person rather than to a typical 12-year-old. The House of Lords departed from its decision in Caldwell and said the test is a subjective one: to make no allowance for a defendant’s youth or mental incapacity would violate the principle that no one should be convicted of a serious offence unless he acted with guilty intent. It is clearly blameworthy to take an obvious and significant risk of causing injury to another, said Lord Bingham, but it is not clearly blameworthy to do something involving a risk of injury to another if (for reasons other than self-induced intoxication) one genuinely does not perceive the risk.
- A v Hoare  UKHL 6
- In 1989 the defendant D was convicted of rape and sent to prison. In 2004, having been released on licence, D won £7 million on the National Lottery, and later that year the claimant A began an action seeking damages for the rape. Her claim was struck out: the cause of action had arisen at the time of the rape, said the judge, and since the claim had not been brought until some sixteen years later it was barred by s.2 of the Limitation Act 1980. The House of Lords had decided in Stubbings v Webb  1 All ER 322 that the shorter three-year time limit in s.11, which the judge has the power to disapply under s.33, applied only to claims in negligence and similar breach of duty and not to claims for assault. The fact that D had not hitherto been worth suing was irrelevant. After considering the legislative history and the evident injustice arising from their earlier decision, the House of Lords exercised its power under the 1966 Practice Statement to depart from its own previous decision and remitted the case to the judge so that he could decide whether or not to exercise his discretion to allow Mrs A’s claim to proceed.
Many of the substantive rules of English law try to achieve substantive justice (according to at least one definition of that term), and many (but not all) succeed.
In criminal law, the mandatory life sentence for murder is a matter of treating like cases alike, except that not all murders are alike; is the Parole Board’s power to order the release on licence of those who have served the “minimum term” sufficient to ensure justice? The Criminal Justice Act 2003 sets predetermined minimum terms for various classes of murder, but allows the judge to set a different minimum so long as he gives reasons for doing so.
The partial defences such as diminished responsibility and provocation are intended to do justice, so that those who kill while not fully responsible for their actions are not convicted of murder, but the defences of necessity (now duress of circumstances) and duress are not available in cases of murder or attempted murder. The Law Commission have commented on the way that the defence of diminished responsibility is sometimes allowed in cases not strictly meeting the statutory definition, but have said that while the mandatory life sentence for murder remains, they “are not persuaded that the acknowledged infelicities of the current formulation presently cause injustice in practice”.
- R v Dudley & Stephens (1884) LR 14 QBD 273, CCR
- Three sailors and a cabin boy were shipwrecked and were adrift in an open boat 1600 miles from land. After they had been eight days without food, and six without water, Dudley and Stephens decided that their only chance of survival was to kill the cabin boy and eat him, and this they did. Four days later they were picked up by a passing ship, and on returning to England were convicted of murder. The Court for Crown Cases Reserved upheld their conviction, but their sentence of death was later commuted to six months’ imprisonment.
- R v Price (1971) unreported
- A boy of six had the mental capacity of a baby and a short life expectancy. His father D placed the boy in a river and watched him float away; the boy drowned. D pled guilty to manslaughter on the basis of diminished responsibility, and was put on probation for a year on condition that he underwent “such treatment as a doctor may prescribe for the next few weeks or so”.
- R v Whitfield (1976) 63 Cr App R 39, CA
- Following a long series of family quarrels, including a threat to take away the baby, D killed his wife and her sister. He was charged with murder and claimed provocation, but the judge withdrew this question from the jury. The Court of Appeal quashed D’s conviction for murder and substituted one for manslaughter: it is clear that any conduct (including mere words) can in principle amount to provocation.
- R v English (1981) unreported
- A woman who killed was allowed to bring evidence to show that pre-menstrual tension had impaired her responsibility sufficiently for this defence to be admitted.
- R v Howe  1 All ER 771, HL
- D took part with others in two separate murders, and on a third occasion the intended victim escaped. D’s claim to have acted under duress was left to the jury on two of the three counts, but D was convicted on all three, and appealed. The House of Lords said no participant (whether principal or accessory) can claim duress in defence to a murder charge. The law should deny a man the right to take an innocent life even at the price of his own (per Lord Griffiths), but should rather set a standard of heroism and self-sacrifice which ordinary men and women should be expected to observe (per Lord Hailsham).
- R v Ahluwalia  4 All ER 889, CA
- A woman D had entered into an “arranged marriage” and had been very badly treated by her husband. He had been violent and abusive towards her; he had threatened to kill her and had once tried to run her down; and he had taunted her about his affair with another woman. One evening D poured petrol over his bed as he slept and set light to it. The Court of Appeal quashed D’s original conviction for murder, and at the retrial D’s plea of diminished responsibility resulting from the newly-acknowledged “battered woman syndrome” was accepted.
- R v Morhall  3 All ER 659, HL
- A habitual glue-sniffer D killed another man V who nagged him about his habit. D was charged with murder and claimed he had been provoked. Allowing his appeal and substituting a conviction for manslaughter, the House of Lords said there is no rule to prevent a defendant’s relying on a self-induced condition such as drug addiction (or even previous criminal convictions) as characteristics of the ordinary person where these are relevant to the provocation.
The proportionality of sentencing raises other questions as to the extent to which justice is served by the criminal law. Mandatory minimum sentences present particular problems, because there may be cases in which the mandatory sentence would actually be unjust. In an article in The Times in March 1997, Sir Stephen Tumim (formerly HM Inspector of Prisons) quoted Judge F-K Fährig (Presiding Judge at the Berlin District Court) as saying that the suffering imposed on the defendant should not outweigh the suffering of the victim.
- R v Cannings (2002) unreported
- A woman D accused of killing her two baby sons was convicted of murder, having denied the killing and having thus excluded the possibility of a verdict of infanticide. Sentencing her to imprisonment for life, Hallett J said this was “a classic example of the kind of injustice that can result from mandatory sentencing”. [The conviction itself was later quashed on appeal.]
In the law of tort, a duty of care is imposed only if it is fair, just and reasonable to do so. If (in the judge’s opinion) it would be unjust to impose such a duty, he says there is no duty.
But the standard of care demanded may be higher than the defendant could reasonably achieve.
- Nettleship v Weston  3 All ER 581, CA
- A learner driver D went out for her first lesson, supervised by a friend P. D crashed the car into a lamppost, and P was injured. P’s claim for damages was upheld by the Court of Appeal, subject to a deduction for contributory negligence. Even learner drivers, said the Court, are to be judged against the standard of the reasonably competent driver. The fact that a particular driver is inexperienced and incompetent does not excuse his falling short of this standard. [This decision was made primarily to ensure that the driver’s insurance company (which had plenty of money) would compensate the injured victim (who had comparatively little). Was this justice?]
- Snelling v Whitehead (1975) Times 31/7/75, HL
- A 7-year-old boy P riding his bicycle on a minor road towards a crossroads was seriously injured in a collision with a car driven by D along the major road. Reluctantly rejecting P’s claim for compensation, the House of Lords said there was no proof that the driver had been negligent, and in the absence of such proof the claim must fail. Lord Wilberforce suggested this was a case where no-fault compensation would be appropriate, but that was a matter for Parliament.
- Barrett v Ministry of Defence  3 All ER 87, CA
- A sailor S became so drunk one night that he passed out and, having then been inadequately treated, choked to death on his own vomit. His widow P sued the Navy for their negligence. The Court of Appeal reversed the trial judge’s finding that DD had a duty of care to prevent S becoming drunk, applying the test of whether it was just and reasonable to impose a duty of care. I can see no reason, said Beldam LJ, why it should not be fair, just and reasonable for the law to leave a responsible adult to assume responsibility for his own actions in consuming alcoholic drink. No one is better placed to judge the amount that he can safely consume or to exercise control in his own interest as well as the interest of others.
In contract law, the doctrine of frustration deals with unforeseen contingencies that occur after agreement, preventing the completion of the contract. In such circumstances, the contract is terminated and the court has power to order repayment where appropriate.
- Herne Bay Steamboat v Hutton  2 KB 683, CA
- An agreement was made that PP’s ship would be at DD’s disposal on a certain date “for the purpose of viewing the naval review and for a day’s cruise around the fleet”. On the day in question, the fleet was assembled but the royal review was cancelled because of the King’s illness. The Court of Appeal said the contract had not been frustrated, because the review was not the sole foundation of the agreement and the cruise around the fleet could still have taken place.
- Krell v Henry  3 KB 740, CA
- D agreed to hire briefly from P a flat in Pall Mall, intending to use it with friends to watch the coronation procession as it passed. The coronation was postponed at short notice owing to the King’s illness, but P sought to recover the agreed hiring fee. The Court of Appeal turned down the claim and said the contract had been frustrated; although the purpose of the hire had not been stipulated in the contract, the circumstances were such that both parties clearly knew it, and the sole foundation of the contract had been destroyed.
Equity was developed to enable the courts to do justice where the common law prevented it, by recognising new rights such as the rights of a beneficiary under a trust, and new remedies such as the injunction and the order for specific performance.
- Central London Property v High Trees House  1 All ER 256, Denning J
- A landlord sought to renege on an undertaking to accept a reduced rent during the Second World War. Denning J said that when a party to a contract makes a promise to the other, which he knows will be acted on, that he will not enforce his strict legal rights, the equitable principle of promissory estoppel makes that promise binding on him.
- Re Posner  1 All ER 1123, Karminski J
- A man T left his property in his will to “my wife Rose Posner”. In fact she was not legally his wife, having been already married when she met and “married” T, but the court said the will could be rectified and the words “my wife” ignored so that she could inherit.
- Beswick v Beswick  2 All ER 1197, HL
- The elderly owner of a small business agreed to transfer the business to his nephew, in return for which the nephew promised to pay an annuity to the man’s widow after his death. The man died and the nephew refused to pay. The widow could not sue in her own right, because she was not privy to the contract, so she sued as the executrix of her husband’s estate. Damages would not have been a satisfactory remedy, because the loss to the estate was negligible, so she was granted an order directing the nephew to perform his part of the contract.
- Eves v Eves  3 All ER 768, CA
- An unmarried couple set up home together and had two children. A house was bought in the man D’s name because P was under 21; he said (probably falsely) that he would have put it in their joint names had she been of age. The house was in a run-down condition, and P did a lot of heavy building work. After three years, D left to live with another woman. P sought and was granted a declaration that the house was held on a constructive trust for both of them; the work she had done, coupled with D’s representations, entitled her to a quarter-share in its value.
- Miller v Jackson  3 All ER 338, CA
- Cricket had been played on a village cricket ground since 1905. In 1970 a number of new houses were built close to the cricket ground, and P bought one of them in 1972. On a number of occasions, cricket balls were hit into P’s garden, and P sued the cricket club in negligence and nuisance, claiming actual damage to property and fear of personal injury. The Court of Appeal by a majority (Lord Denning MR dissenting) said P should succeed; it was no defence that she had come to the nuisance rather than vice versa. But no injunction should issue (i) because DD’s activities were socially beneficial and (ii) because P had come to the nuisance with her eyes open. Damages of £400 were awarded instead to cover both past and future inconvenience.
- Bloomsbury & Rowling v News Group & others  EWHC 1205 (Ch)
- Following a pre-publication threat to disclose an important plot development, Sir Andrew Morritt VC granted an injunction against “the [unknown] person or persons who have offered the publishers of The Sun … a copy of “Harry Potter and the Order of the Phoenix” … and the person or persons who has or have physical possession of a copy …”, restraining them from disclosing without permission any information derived from the book. There was no precedent for an injunction that did not name any of the parties at whom it was directed, but making such an injunction caused no injustice, and not making it could cause considerable injustice to the claimants.
One particular area in which the law tries to achieve substantive justice is in the reduction of discrimination. Like cases should be treated alike, said Aristotle, Perelman, Hart and others, and different cases should often be treated differently. There should be no unwarranted discrimination on the basis of irrelevant factors, but differences between individuals should be properly taken into account when they are relevant.
Various Acts of Parliament therefore make it unlawful (usually in a civil rather than a criminal sense) to discriminate on the grounds of a person’s gender, race, religion or sexual orientation in matters relating to employment, education or the provision of services, or on grounds of age in matters relating to employment, unless a particular gender, race &c is a “genuine occupational requirement”. Other Acts require employers and providers of education and other services to make reasonable adjustments to their premises and working practices to meet the needs of disabled people.
- Blathwayt v Cawley  3 All ER 625, HL
- A man T made a will which left substantial property on trust to C for life and then to C’s sons, on condition that they were not and did not become Roman Catholic. T died and the property passed to C, who subsequently joined the Roman Catholic church. In a judgment based mainly on technical matters, the House of Lords said obiter that the public policy against religious discrimination must give way to the principle of testamentary freedom.
- Singh v Rowntree Mackintosh  ICR 554, EAT
- A Sikh S was refused employment in a chocolate factory because of his unwillingness to remove his beard. The Industrial Tribunal found for RM, saying the public interest that hygiene rules should be enforced outweighed the public interest in preventing discrimination. The Appeal Tribunal agreed, and said the standard required for justification fell somewhere between necessity and mere convenience.
- Ahmad v United Kingdom (1981) 4 EHRR 126, EComHR
- A Muslim teacher A applied for time off on Friday afternoons and on certain other occasions to take part in religious observances, but was refused. He complained this was a violation of his freedom of religion, but the Commission disagreed: the evidence was that teachers of other religions would have been similarly treated, and the refusal was reasonable.
- Gill v El Vino  1 All ER 398, CA
- A wine bar EV operated a rule that women were served drinks only when sitting at tables, not at the bar. Allowing Ms G’s appeal from the County Court judge, the Court of Appeal said this was discriminatory. The different treatment was more than trivial: women were deprived of the conversation and social flexibility of the bar area.
- James v Eastleigh BC  2 All ER 607, HL
- A local authority EBC granted free admission to the public swimming baths to men and women of pensionable age. Since the statutory retirement age was 60 for women and 65 for men, the House of Lords held this was discriminatory against men. The test, said Lord Goff, is whether the complainant would have received the same treatment but for his or her sex; the discriminator’s motive is irrelevant.
- B v B (Custody &c)  1 FLR 402, Judge Callman
- A mother M left home to live with another woman, taking her two-year-old son B with her but leaving the two older children with their father F and his new (female) partner. The judge awarded F custody of the older children (who were happy with him) but gave B to M. What is so important, said the judge, is to distinguish between militant lesbians who try to convert others to their way of life, and lesbians in private. In this case there was no evidence to support the suggestion that B’s own sexual identity would be influenced by M’s lesbianism, which she did not parade openly, and the possible social stigma was outweighed by the fact that M had cared for him ever since he was born.
- Fitzpatrick v Sterling Housing Association  4 All ER 705, HL
- Reversing the Court of Appeal, the House of Lords held that a gay man was entitled to take over the tenancy formerly held by his long-term male partner, now deceased, under the Housing Act 1988. Lord Slynn said the legislation could not be interpreted to allow P’s claim on the basis that he had been living “as the husband or wife” of the deceased – if Parliament had intended such a relationship to include same-sex partners it would surely have said so – but P could claim as “a member of the family” living with the deceased at the time of his death. The word “family” is used in many senses, he said, some wider than others, and if P could show (as on the facts he could) the mutual inter-dependence, sharing of lives, caring and love, commitment and support that are rebuttably presumed to exist between married couples, that would be enough to establish a family relationship.
- Meikle v Nottinghamshire CC  EWCA Civ 859
- A long-serving home economics teacher began to suffer from deteriorating eyesight. She asked the school to make allowance for this, for example by supplying her with a large-print version of the daily bulletin, by reducing the distance she had to move between classrooms, and by increasing her non-contact time so that she could do her marking and lesson preparation in daylight, but they did not do so. The Court of Appeal upheld her claim that this amounted to unlawful discrimination because of her disability: there were reasonable adjustments the school could have made, and they had not done so.
There will always be some miscarriages of justice – it is statistically unavoidable – in which the legal system reaches the wrong result. In criminal law, innocent people are sometimes convicted, and guilty people are often acquitted. There is an appeal system so that some of the wrongful convictions can be put right, but there are serious doubts about its effectiveness. For example, Timothy Evans was hanged in 1950 after being convicted of murder; the real murderer was identified shortly afterwards, and Evans was pardoned (posthumously) in 1966. Mahmood Hussein Mattan and Derek Bentley were hanged for separate murders in 1952; their convictions were eventually quashed by the Court of Appeal in 1998. In other cases, wrongful convictions have not been corrected until the defendants have spent long periods in prison: again in 1998 the Court of Appeal quashed Patrick Nicholls’ 1975 conviction for murder and thereby terminated a sentence of life imprisonment after hearing new pathological evidence that the victim had died from natural causes.
The Criminal Appeal Act 1995 tried to rectify some of the shortcomings of the system: under s.2(1) of this Act the Court of Appeal should allow an appeal if it thinks the conviction is “unsafe”, regardless of technicalities. In the past, however, the Court of Appeal have been very reluctant to interfere with the jury’s verdict: the jury have seen the witnesses’ behaviour in the witness box, and twelve jurors are as good as three judges in deciding who is telling the truth.
The Criminal Justice Act 2003 introduced several changes designed to assist substantive justice by reducing the number of wrongful acquittals. Part 10 of the Act abolished the “double jeopardy” rule so as to allow an acquitted person to be retried if new and compelling evidence comes to light, and Part 11 modified the rules of evidence to widen the circumstances in which “bad character” and hearsay evidence can be admitted. These changes will almost inevitably reduce the number of wrongful acquittals but increase the number of wrongful convictions.
- R v McIlkenny  2 All ER 417, CA
- DD were convicted of terrorist offences in 1975 and were sentenced to life imprisonment, but in 1991 it emerged (largely through investigative journalism) that the scientific evidence against them had been flawed and that their so-called “confessions” had been forged by police officers. Quashing their convictions, the Court of Appeal said no system is better than its human input. Like any other system of justice, the adversarial system may be abused. The evidence adduced may be inadequate. Expert evidence may not have been properly researched or there may have been a deliberate attempt to undermine the system by giving false evidence. If there is a conflict of evidence there is no way of ensuring the jury will always get it right … No human system can expect to be perfect.
Feminist political philosophy is an area of philosophy focused on understanding and critiquing the way political philosophy is usually construed, often without any attention to feminist concerns, and on articulating how political theory might be reconstructed in a way that advances feminist concerns. Feminist political philosophy is a branch of both feminist philosophy and political philosophy. As a branch of feminist philosophy, it serves as a form of critique or a hermeneutics of suspicion (Ricœur 1970). That is, it serves as a way of opening up or looking at the political world as it is usually understood and uncovering ways in which women and their current and historical concerns are poorly depicted, represented, and addressed. As a branch of political philosophy, feminist political philosophy serves as a field for developing new ideals, practices, and justifications for how political institutions and practices should be organized and reconstructed. While feminist philosophy has been instrumental in critiquing and reconstructing many branches of philosophy, from aesthetics to philosophy of science, feminist political philosophy may be the paradigmatic branch of feminist philosophy because it best exemplifies the point of feminist theory, which is, to borrow a phrase from Marx, not only to understand the world but to change it (Marx and Engels 1998). And, though other fields have effects that may change the world, feminist political philosophy focuses most directly on understanding ways in which collective life can be improved. This project involves understanding the ways in which power emerges and is used or misused in public life (see the entry on feminist perspectives on power). As with other kinds of feminist theory, common themes have emerged for discussion and critique, but there has been little in the way of consensus among feminist theorists on what is the best way to understand them.
This introductory article lays out the various schools of thought and areas of concern that have occupied this vibrant field of philosophy for the past thirty years.
Current feminist political philosophy is indebted to the work of earlier generations of feminist scholarship and activism, including the first wave of feminism in the English-speaking world, which took place from the 1840s to the 1920s and focused on improving the political, educational, and economic system primarily for middle-class women. Its greatest achievements were to develop a language of equal rights for women and to garner women the right to vote. It is also indebted to the second wave of feminism, which, beginning in the 1960s, drew on the language of the civil rights movements (e.g., the language of liberation) and on a new feminist consciousness that emerged through women’s solidarity movements and new forms of reflection that uncovered sexist attitudes and impediments throughout the whole of society. As the entry on approaches to feminism notes, by 1970 feminism had expanded from activism to scholarship with the publication of Shulamith Firestone’s The Dialectic of Sex (Firestone 1971); Kate Millett’s Sexual Politics (Millett 1970); and Robin Morgan’s Sisterhood is Powerful (Morgan 1970).
As a branch of political philosophy, feminist political philosophy has often mirrored the various divisions at work in political philosophy more broadly. Prior to the fall of the Berlin Wall and the end of the Cold War, political philosophy was usually divided into categories such as liberal, conservative, socialist, and Marxist. Except for conservatism, for each category there were often feminists working and critiquing alongside it. Hence, as Alison Jaggar’s classic text, Feminist Politics and Human Nature, spelled out, each ideological approach drew feminist scholars who would both take their cue from and borrow the language of a particular ideology (Jaggar 1983). Jaggar’s text grouped feminist political philosophy into four camps: liberal feminism, socialist feminism, Marxist feminism, and radical feminism. The first three groups followed the lines of Cold War global political divisions: American liberalism, European socialism, and a revolutionary communism (though few in the west would embrace Soviet-style communism). Radical feminism was the most indigenous of the feminist philosophies, developing its own political vocabulary with its roots in the deep criticisms of patriarchy that feminist consciousness had produced in its first and second waves. Otherwise, feminist political philosophy largely followed the lines of traditional political philosophy. But this has never been an uncritical following. As a field bent on changing the world, even liberal feminist theorists tended to criticize liberalism more than to embrace it, and to embrace socialism and other more radical points of view more than to reject them. Still, on the whole, these theorists generally operated within the language and framework of their chosen approach to political philosophy.
Political philosophy began to change enormously in the 1980s, just before the end of the Cold War, with a new invocation of an old Hegelian category: civil society, an arena of political life intermediate between the state and the household. This was the arena of associations, churches, labor unions, book clubs, choral societies and manifold other nongovernmental yet still public organizations. In the 1980s political theorists began to turn their focus from the state to this intermediate realm, which suddenly took center stage in Eastern Europe in organizations that challenged the power of the state and ultimately led to the downfall of communist regimes.
After the end of the Cold War, political philosophy along with political life radically realigned. New attention focused on civil society and the public sphere, especially with the timely translation of Jürgen Habermas’s early work, the Structural Transformation of the Public Sphere (Habermas 1989). Volumes soon appeared on civil society and the public sphere, focusing on the ways in which people organized themselves and developed public power rather than on the ways that the state garnered and exerted its power. In fact, there arose a sense that the public sphere ultimately might exert more power than the state, at least in the fundamental way in which public will is formed and serves to legitimate—or not—state power. In the latter respect, John Rawls’s work was influential by developing a theory of justice that tied the legitimacy of institutions to the normative judgments that a reflective and deliberative people might make (Rawls 1971). By the early 1990s, Marxists seemed to have disappeared or at least become very circumspect (though the downfall of communist regimes needn’t have had any effect on Marxist analysis proper, which never subscribed to Leninist or Maoist thought). Socialists also retreated or transformed themselves into “radical democrats.”
Now the old categories of liberal, socialist, and Marxist feminisms were much less relevant. Along with political philosophy more broadly, more feminist political philosophers began to turn to the meaning and interpretation of civil society, the public sphere, and democracy itself.
Of Jaggar’s categories, liberal feminism and radical feminism remain strong currents in feminist political thought. Care ethics, which was originally developed as an alternative to mainstream ethical theory, has been harnessed to counter liberal political theory (Gilligan 1982; Held 1995). More recently, two other approaches have emerged: radical democratic or “agonistic” feminism (Mouffe 1992, 1993, 1999, 2000) and deliberative or communicative democratic theory (Benhabib 1992, 1996; Benhabib and Cornell 1987; Fraser 1989; Young 1990, 1997, 2000). An even newer approach that might be termed “performative” is also developing (Zerilli 2005; Cornell 1998). This section gives a brief overview of these various approaches and attendant issues.
The first of these, liberal feminism, can be traced back to feminist efforts and theorizing around political and economic equality for women. This approach got a boost with the publication of John Rawls’s A Theory of Justice (Rawls 1971) and subsequently his Political Liberalism (Rawls 1993). Susan Moller Okin (Okin 1989, 1979; Okin et al. 1999) and Eva Kittay (Kittay 1999) have used Rawls’s work productively to extend his theory to attend to women’s concerns. From a more critical perspective, several feminist theorists have argued that some of the central categories of liberalism occlude women’s lived concerns; for example, the central liberal private/public distinction sequesters the private sphere, and any harm that may occur there to women, away from political scrutiny (Pateman 1983). Perhaps more than any other approach, liberal feminist theory parallels developments in liberal feminist activism. While feminist activists have waged legal and political battles to criminalize, as just one example, violence against women (which previously, in marital relations, hadn’t been considered a crime), feminist political philosophers who have engaged the liberal lexicon have shown how the distinction between private and public realms has served to uphold male domination of women by rendering power relations within the household as “natural” and immune from political regulation. Such political philosophy uncovers how seemingly innocuous and “commonsensical” categories have covert power agendas. For example, old conceptions of the sanctity of the private space of the household and the role of women primarily as child-bearers and caregivers served to protect male domination of women in the household from public scrutiny. Feminist critiques of the public/private split supported legal advances that finally led in the 1980s to the criminalization in the United States of spousal rape (Hagan and Sussman 1988).
A second approach, radical feminism, remains committed to getting at the root of male domination by understanding the source of power differentials, which some radical feminists, including Catharine MacKinnon, trace back to male sexuality and the notion that heterosexual intercourse enacts male domination over women. “Women and men are divided by gender, made into the sexes as we know them, by the requirements of its dominant form, heterosexuality, which institutionalizes male sexual dominance and female sexual submission. If this is true, sexuality is the linchpin of gender inequality” (MacKinnon 1989, 113). Radical feminists tend to see power as running one-way, from those with power over those who are being oppressed. As Amy Allen puts it, “Unlike liberal feminists, who view power as a positive social resource that ought to be fairly distributed, and feminist phenomenologists, who understand domination in terms of a tension between transcendence and immanence, radical feminists tend to understand power in terms of dyadic relations of dominance/subordination, often understood on analogy with the relationship between master and slave.” (See the section on radical feminist approaches in the entry on feminist perspectives on power.) Unlike the more reformist politics of liberal feminism, radical feminists have largely sought to reject the prevailing order altogether, sometimes advocating separatism (Daly 1985, 1990).
A third important approach in feminist political philosophy draws on what is called care or maternal ethics. (See the discussion in the entry on feminist ethics.) Drawing on feminist research in moral psychology (Gilligan 1982; Held 1995), this field explores the ways in which the virtues that society and mothering cultivate in women can provide an alternative to the traditional emphases in moral and political philosophy on universality, reason, and justice. Some maternalists have sought to take the virtues that had long been relegated to the private realm, such as paying particular attention to those who are vulnerable or taking into consideration circumstances and not just abstract principles, and use them as well in the public realm. This approach has led to intense debates between liberals who advocated universal ideals of justice and maternalists who advocated attention to the particular, to relationships, to care. By the 1990s, though, many maternalists had revised their views. Rather than seeing care and justice as mutually exclusive alternatives, they began to recognize that attention to care should be accompanied by attention to fairness (justice) in order to attend to the plight of those with whom we have no immediate relation (Koggel 1998).
The maternalist approach raises the question of whether and, if so, how women have distinct virtues. Feminists as a whole have long distanced themselves from ideas that women have any particular essence, choosing instead to see femininity and its accompanying virtues as social constructs, dispositions that result from culture and conditioning, certainly not biological givens. So for maternalists to champion the virtues that have inculcated femininity seems also to champion a patriarchal system that relegates one gender to the role of caretaker. The maternalist tactic has largely been to flip the hierarchy, to claim that the work of the household is more meaningful and sustaining than the work of the polis. But critics, such as Drucilla Cornell, Mary Dietz, and Chantal Mouffe, argue that such a revaluation keeps intact the dichotomy between the private and the public and the old association of women’s work with childcare. (Butler and Scott 1992; Phillips and NetLibrary Inc 1998, pp. 386-389)
Such concerns are part of a larger set of concerns and criticisms that have run through feminist theorizing since the 1970s, with non-white, non-middle-class, and non-American women starting to question the very category of “woman” and the notion that this title could be a boundary-spanning category that could unite women of various walks of life. (See the entries on identity politics and feminist perspectives on sex and gender.) Criticisms of a unitary identity of “woman” have been motivated by worries that much feminist theory has originated from the standpoint of a particular class of women who mistake their own particular standpoint for a universal one. In her 1981 book, Ain’t I a Woman?: Black Women and Feminism, bell hooks notes that the feminist movement pretended to speak for all women but was made up of primarily white, middle-class women who, because of their narrow perspective, did not represent the needs of poor women and women of color and ended up reinforcing class stereotypes (hooks 1981). What is so damning about this kind of critique is that it mirrors the one that feminists have leveled against mainstream political theorists who have taken the particular category of men to be a universal category of mankind, a schema that does not in fact include women under the category of mankind but marks them as other.
Hence, one of the most vexing issues facing feminist theory in general and feminist political philosophy in particular is the matter of identity (see the entry on identity politics). Identity politics in general is a controversial political practice of mobilizing for change on the basis of a political identity (women, black, chicana, etc.). The philosophical debate is whether such identities are based on some real difference or history of oppression, and also whether people should embrace identities that have historically been used to oppress them. Identity politics in feminist practice is fraught along at least two axes: whether there is any real essence or identity of woman in general and, even if so, whether the category of woman could be used to represent all women. Women at the intersection of various identities (e.g., black women) have raised questions about which identity is foremost or whether either identity is apt. Such questions play out with the question of political representation: what aspects of identity are politically salient and truly representative, whether race, class, or gender (Phillips 1995; Young 1997, 2000). The ontological question of women’s identity gets played out on the political stage when it comes to matters of political representation, group rights, and affirmative action. The 2008 U.S. Democratic Party primary battle between Senators Barack Obama and Hillary Clinton turned this philosophical question into a very real and heated one for black women throughout the United States. Was a black woman who supported Clinton a traitor to her race, or a black woman who supported Obama a traitor to her sex? Or did it make any sense to talk about identity in a way that would lead to charges of treason? Of the approaches discussed above, radical and maternal feminism seem particularly wedded to feminist identity politics.
Fourth is feminist democratic theory, perhaps best known through the works of Seyla Benhabib (Benhabib 1992, 1996), greatly inspired by the work of the German critical theorist, Jürgen Habermas. Like other feminist democratic theorists, Benhabib’s work engages democratic theorists quite broadly, not just feminist theorists. This passage of hers helps to clarify what she takes to be the best aim of a political philosophy: a state of affairs to which all affected would assent. As she writes,
Only those norms (i.e., general rules of action and institutional arrangements) can be said to be valid (i.e., morally binding), which would be agreed to by all those affected by their consequences, if such agreement were reached as a consequence of a process of deliberation that had the following features: 1) participation in such deliberation is governed by norms of equality and symmetry; all have the same chances to initiate speech acts, to question, to interrogate, and to open debate; 2) all have the right to question the assigned topics of conversation; and 3) all have the right to initiate reflexive arguments about the very rules of the discourse procedure and the way in which they are applied or carried out. (Benhabib 1996, 70)
Democratic theorists such as Benhabib and Habermas contend that certain conditions need to be in place in order for members of a political community to arrive at democratic outcomes; namely, the proceedings need to be deliberative. Some take deliberation to be a matter of reasoned argumentation; others see it as less about reason or argumentation and more about an open process of working through choices (McAfee 2004).
Deliberative theory is not the only prominent form of feminist democratic theory. Iris Young’s pioneering book, Justice and the Politics of Difference, and several of her subsequent works have been very influential and have led to a good deal of hesitance in feminist theoretical communities about the claims of deliberative theory. Where Benhabib is confident that conditions can be such that all who are affected can have a voice in deliberations, Young points out that those who have been historically silenced have a difficult time having their views heard or heeded. Young is skeptical of the claims of mainstream democratic theory that democratic deliberative processes could lead to outcomes that would be acceptable to all (Young 1990, 1997). Young, along with Nancy Fraser (Fraser 1989) and others, worried that in the process of trying to reach consensus, the untrained voices of women and others who have been marginalized would be left out of the final tally. Young’s criticisms were very persuasive, leading a generation of feminist political philosophers to be wary of deliberative democratic theory. Instead of deliberative democracy, in the mid-1990s Young proposed a theory of communicative democracy, hoping to make way for a deliberative conception that was open to means of expression beyond the rational expression of mainstream deliberative democratic theory. Young worried that deliberation as defined by Habermas is too reason-based and leaves out forms of communication that women and people of color tend to use, such as greeting, rhetoric, and storytelling. Young argued that these alternative modes of communication, which other marginalized people also tended to use, could provide the basis of a more democratic, communicative theory.
In her last major book, Inclusion and Democracy (Young 2000), Young had clearly moved to embrace deliberative theory itself, seeing the ways in which it could be constructed to give voice to those who had been otherwise marginalized. More recent feminist democratic theory has engaged deliberative theory more positively. (See McAfee and Snyder 2007.)
A fifth approach is agonistic feminism, which draws from a certain reading of the work of Hannah Arendt and from Antonio Gramsci, among others. Leading theorists of this approach include Chantal Mouffe (Critchley and Mouffe 1996; Laclau and Mouffe 1985, 2001; Mouffe 1979, 1992, 1993, 1999, 2000, 2005), Bonnie Honig (Honig 1993, 1995, 2001; Honig and Mapel 2002), and Ewa Ziarek (Ziarek 2001). Where liberal feminists inspired by John Rawls and democratic feminists inspired by Jürgen Habermas and/or John Dewey hold out the hope that democratic deliberations might lead to democratic agreements, agonistic feminists maintain that any kind of agreement is inherently undemocratic.
Agonistic feminist political philosophy comes out of poststructural continental feminist and philosophical traditions. It takes from Marxism, especially western Marxism, the hope for a more radically egalitarian society. It takes from contemporary continental philosophy notions of subjectivity and solidarity as malleable and constructed. Along with much of postmodern thought, it repudiates any notion of pre-existing moral or political truths or foundations. Its central claim is that feminist struggle, like other struggles for social justice, is engaged in politics as battle or war. Agonistic views see the nature of politics as inherently conflictual, with battles over power and hegemony being the central tasks of democratic struggle. Advocates of agonistic politics worry that the kind of consensus sought by democratic theorists (discussed above) will lead to some kind of oppression or injustice by silencing new struggles. As Chantal Mouffe puts it, “We have to accept that every consensus exists as a temporary result of a provisional hegemony, as a stabilization of power, and that it always entails some form of exclusion” (Mouffe 2000, 104).
A sixth approach to feminist political philosophy is emerging, what could be called performative feminist political philosophy. Performative feminist politics doesn’t worry about whether it is possible to come up with a single definition of “woman” or any other political identity; it sees identity as something that is performatively created. “How we assume these identities,” Drucilla Cornell writes, “is never something ‘out there’ that effectively determines who we can be as men and women—gay, lesbian, straight, queer, transsexual, transgender, or otherwise” (Cornell 2003, 144). It is something that is shaped as we live and externalize identities. From a performative feminist perspective, feminism is a project of anticipating and creating better political futures in the absence of foundations. As Linda Zerilli writes, “politics is about making claims and judgments—and having the courage to do so—in the absence of the objective criteria or rules that could provide certain knowledge and the guarantee that speaking in women’s name will be accepted or taken up by others” (Zerilli 2005, 179). Drawing on the works of Arendt, Butler, and Joan Copjec, Zerilli calls for a “freedom-centered feminism” that “would strive to bring about transformation in normative conceptions of gender without returning to the classical notion of freedom as sovereignty” that feminists have long criticized but found difficult to resist (ibid.).
In its feminist incarnations, this view also takes its cue from Judith Butler’s performative account of gender as well as Hannah Arendt’s concern with the anticipatory nature of rights, as well as other thinkers’ ideas, to describe an anticipatory ideal of politics. Linda Zerilli describes this kind of feminist politics as “the contingently based public practice of soliciting the agreement of others to what each of us claims to be universal” (Zerilli 2005, 173). From a performative perspective, normative political claims appeal to other people, not to supposed truths or foundations.
This view recuperates many of the ideals of the Enlightenment—such as freedom, autonomy, and justice—but in a way that drops the Enlightenment’s metaphysical assumptions about reason, progress, and human nature. Instead of seeing these ideals as grounded in some metaphysical facts, this new view sees them as ideals that people hold and try to instantiate through practice and imagination. Where many ancient and modern ideals of politics were based on suppositions about the nature of reality or of human beings, contemporary political philosophies generally operate without supposing that there are any universal or eternal truths. Some might see this situation as ripe for nihilism, arbitrariness, or the exercise of brute power. The performative alternative is to imagine and try to create a better world by anticipating, claiming, and appealing to others that it should be so. Even if there is no metaphysical truth that human beings have dignity and infinite worth, people can act as if it were true in order to create a world in which it is seen to be so.
What Zerilli does with the concept of freedom, Drucilla Cornell does with the idea of autonomy. Her work in ethics and political philosophy is also in this vein, arguing for seeing old Enlightenment notions such as autonomy, dignity, and personhood, in a new performative capacity, as ideals that people aspire to rather than as moral facts waiting to be discovered, applied, or realized.
Performative feminist political philosophy shares liberal feminism’s appreciation for Enlightenment ideals but in a way that is skeptical about foundations, just as agonistic feminism repudiates foundations. It has less in common with radical and maternalist feminisms for these very reasons.
In sum, feminist political philosophy is a still evolving field of thought that has much to offer mainstream political philosophy. In the past two decades it has come to exert a stronger influence over mainstream political theorizing, raising objections that mainstream philosophers have had to address, though not always very convincingly. And in its newest developments it promises to go even further.
Karl Popper is generally regarded as one of the greatest philosophers of science of the 20th century. He was also a social and political philosopher of considerable stature, a self-professed ‘critical-rationalist’, a dedicated opponent of all forms of scepticism, conventionalism, and relativism in science and in human affairs generally, a committed advocate and staunch defender of the ‘Open Society’, and an implacable critic of totalitarianism in all of its forms. One of the many remarkable features of Popper’s thought is the scope of his intellectual influence.
Karl Raimund Popper was born on 28 July 1902 in Vienna, which at that time could make some claim to be the cultural epicentre of the western world. His parents, who were of Jewish origin, brought him up in an atmosphere which he was later to describe as ‘decidedly bookish’. His father was a lawyer by profession, but he also took a keen interest in the classics and in philosophy, and communicated to his son an interest in social and political issues which he was never to lose. His mother inculcated in him such a passion for music that for a time he seriously contemplated taking it up as a career, and indeed he initially chose the history of music as a second subject for his Ph.D. examination. Subsequently, his love for music became one of the inspirational forces in the development of his thought, and manifested itself in his highly original interpretation of the relationship between dogmatic and critical thinking, in his account of the distinction between objectivity and subjectivity, and, most importantly, in the growth of his hostility towards all forms of historicism, including historicist ideas about the nature of the ‘progressive’ in music. The young Karl attended the local Realgymnasium, where he was unhappy with the standards of the teaching, and, after an illness which kept him at home for a number of months, he left to attend the University of Vienna in 1918. However, he did not formally enrol at the University by taking the matriculation examination for another four years. 1919 was in many respects the most important formative year of his intellectual life. In that year he became heavily involved in left-wing politics, joined the Association of Socialist School Students, and became for a time a Marxist. However, he was quickly disillusioned with the doctrinaire character of the latter, and soon abandoned it entirely.
He also discovered the psychoanalytic theories of Freud and Adler (under whose aegis he engaged briefly in social work with deprived children), and listened entranced to a lecture which Einstein gave in Vienna on relativity theory. The dominance of the critical spirit in Einstein, and its total absence in Marx, Freud and Adler, struck Popper as being of fundamental importance: the latter, he came to think, couched their theories in terms which made them amenable only to confirmation, while Einstein’s theory, crucially, had testable implications which, if false, would have falsified the theory itself.
Popper obtained a primary school teaching diploma in 1925, took a Ph.D. in philosophy in 1928, and qualified to teach mathematics and physics in secondary school in 1929. The dominant philosophical group in Vienna at the time was the Wiener Kreis, the circle of ‘scientifically-minded’ intellectuals focused around Moritz Schlick, who had been appointed Professor of the philosophy of the inductive sciences at Vienna University in 1922. This included Rudolf Carnap, Otto Neurath, Viktor Kraft, Hans Hahn and Herbert Feigl. The principal objective of the members of the Circle was to unify the sciences, which carried with it, in their view, the need to eliminate metaphysics once and for all by showing that metaphysical propositions are meaningless—a project which Schlick in particular saw as deriving from the account of the proposition given in Wittgenstein’s Tractatus. Although he was friendly with some of the Circle’s members and shared their esteem for science, Popper’s hostility towards Wittgenstein alienated Schlick, and he was never invited to become a member of the group. For his part, Popper became increasingly critical of the main tenets of logical positivism, especially of what he considered to be its misplaced focus on the theory of meaning in philosophy and upon verification in scientific methodology, and reveled in the title ‘the official opposition’ which was bestowed upon him by Neurath. He articulated his own view of science, and his criticisms of the positivists, in his first work, published under the title Logik der Forschung in 1934. The book—which he was later to claim rang the death knell for positivism—attracted more attention than Popper had anticipated, and he was invited to lecture in England in 1935. 
He spent the next few years working productively on science and philosophy, but storm clouds were gathering—the growth of Nazism in Germany and Austria compelled him, like many other intellectuals who shared his Jewish origins, to leave his native country.
In 1937 Popper took up a position teaching philosophy at the University of Canterbury in New Zealand, where he was to remain for the duration of the Second World War. The annexation of Austria in 1938 became the catalyst which prompted him to refocus his writings on social and political philosophy. In 1946 he moved to England to teach at the London School of Economics, and became professor of logic and scientific method at the University of London in 1949. From this point on Popper’s reputation and stature as a philosopher of science and social thinker grew enormously, and he continued to write prolifically—a number of his works, particularly The Logic of Scientific Discovery (1959), are now universally recognised as classics in the field. He was knighted in 1965, and retired from the University of London in 1969, though he remained active as a writer, broadcaster and lecturer until his death in 1994. (For more detail on Popper’s life, cf. his Unended Quest).
A number of biographical features may be identified as having a particular influence upon Popper’s thought. In the first place, his teenage flirtation with Marxism left him thoroughly familiar with the Marxist view of economics, class-war, and history. Secondly, he was appalled by the failure of the democratic parties to stem the rising tide of fascism in his native Austria in the 1920s and 1930s, and the effective welcome extended to it by the Marxists. The latter acted on the ideological grounds that it constituted what they believed to be a necessary dialectical step towards the implosion of capitalism and the ultimate revolutionary victory of communism. This was one factor which led to the much feared Anschluss, the annexation of Austria by the German Reich, the anticipation of which forced Popper into permanent exile from his native country. The Poverty of Historicism (1944) and The Open Society and Its Enemies (1945), his most impassioned and brilliant social works, are as a consequence a powerful defence of democratic liberalism as a social and political philosophy, and a devastating critique of the principal philosophical presuppositions underpinning all forms of totalitarianism. Thirdly, as we have seen, Popper was profoundly impressed by the differences between the allegedly ‘scientific’ theories of Freud and Adler and the revolution effected by Einstein’s theory of relativity in physics in the first two decades of the twentieth century. The main difference between them, as Popper saw it, was that while Einstein’s theory was highly ‘risky’, in the sense that it was possible to deduce consequences from it which were, in the light of the then dominant Newtonian physics, highly improbable (e.g., that light is deflected towards solid bodies, as confirmed by Eddington’s experiments in 1919), and which would, if they turned out to be false, falsify the whole theory, nothing could, even in principle, falsify psychoanalytic theories.
These latter, Popper came to feel, have more in common with primitive myths than with genuine science. That is to say, he saw that what is apparently the chief source of strength of psychoanalysis, and the principal basis on which its claim to scientific status is grounded, viz. its capability to accommodate, and explain, every possible form of human behaviour, is in fact a critical weakness, for it entails that it is not, and could not be, genuinely predictive. Psychoanalytic theories by their nature are insufficiently precise to have negative implications, and so are immunised from experiential falsification.
The Marxist account of history too, Popper held, is not scientific, although it differs in certain crucial respects from psychoanalysis. For Marxism, Popper believed, had been initially scientific, in that Marx had postulated a theory which was genuinely predictive. However, when these predictions were not in fact borne out, the theory was saved from falsification by the addition of ad hoc hypotheses which made it compatible with the facts. By this means, Popper asserted, a theory which was initially genuinely scientific degenerated into pseudo-scientific dogma.
These factors combined to make Popper take falsifiability as his criterion for demarcating science from non-science: if a theory is incompatible with possible empirical observations it is scientific; conversely, a theory which is compatible with all such observations, either because, as in the case of Marxism, it has been modified solely to accommodate such observations, or because, as in the case of psychoanalytic theories, it is consistent with all possible observations, is unscientific. For Popper, however, to assert that a theory is unscientific, is not necessarily to hold that it is unenlightening, still less that it is meaningless, for it sometimes happens that a theory which is unscientific (because it is unfalsifiable) at a given time may become falsifiable, and thus scientific, with the development of technology, or with the further articulation and refinement of the theory. Further, even purely mythogenic explanations have performed a valuable function in the past in expediting our understanding of the nature of reality.
As Popper represents it, the central problem in the philosophy of science is that of demarcation, i.e., of distinguishing between science and what he terms ‘non-science’, under which heading he ranks, amongst others, logic, metaphysics, psychoanalysis, and Adler’s individual psychology. Popper is unusual amongst contemporary philosophers in that he accepts the validity of the Humean critique of induction, and indeed, goes beyond it in arguing that induction is never actually used by the scientist. However, he does not concede that this entails the scepticism which is associated with Hume, and argues that the Baconian/Newtonian insistence on the primacy of ‘pure’ observation, as the initial step in the formation of theories, is completely misguided: all observation is selective and theory-laden—there are no pure or theory-free observations. In this way he destabilises the traditional view that science can be distinguished from non-science on the basis of its inductive methodology; in contradistinction to this, Popper holds that there is no unique methodology specific to science. Science, like virtually every other human, and indeed organic, activity, Popper believes, consists largely of problem-solving.
Popper, then, repudiates induction, and rejects the view that it is the characteristic method of scientific investigation and inference, and substitutes falsifiability in its place. It is easy, he argues, to obtain evidence in favour of virtually any theory, and he consequently holds that such ‘corroboration’, as he terms it, should count scientifically only if it is the positive result of a genuinely ‘risky’ prediction, which might conceivably have been false. For Popper, a theory is scientific only if it is refutable by a conceivable event. Every genuine test of a scientific theory, then, is logically an attempt to refute or to falsify it, and one genuine counter-instance falsifies the whole theory. In a critical sense, Popper’s theory of demarcation is based upon his perception of the logical asymmetry which holds between verification and falsification: it is logically impossible to conclusively verify a universal proposition by reference to experience (as Hume saw clearly), but a single counter-instance conclusively falsifies the corresponding universal law. In a word, an exception, far from ‘proving’ a rule, conclusively refutes it.
Every genuine scientific theory, then, in Popper’s view, is prohibitive, in the sense that it forbids, by implication, particular events or occurrences. As such it can be tested and falsified, but never logically verified. Thus Popper stresses that it should not be inferred from the fact that a theory has withstood the most rigorous testing, for however long a period of time, that it has been verified; rather we should recognise that such a theory has received a high measure of corroboration, and may be provisionally retained as the best available theory until it is finally falsified (if indeed it is ever falsified), and/or is superseded by a better theory.
Popper has always drawn a clear distinction between the logic of falsifiability and its applied methodology. The logic of his theory is utterly simple: if a single ferrous metal is unaffected by a magnetic field it cannot be the case that all ferrous metals are affected by magnetic fields. Logically speaking, a scientific law is conclusively falsifiable although it is not conclusively verifiable. Methodologically, however, the situation is much more complex: no observation is free from the possibility of error—consequently we may question whether our experimental result was what it appeared to be.
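The purely logical half of this distinction can be shown in a short sketch. The sample names and reports below are hypothetical, invented only for illustration, and the code models the logic of falsifiability alone: it deliberately ignores the methodological point that any single report may itself be mistaken.

```python
# A minimal sketch of the logical asymmetry between verification and
# falsification. The reports are hypothetical, not real experimental data.

# Each report: (name of a ferrous sample, was it affected by a magnetic field?)
reports = [
    ("iron bar", True),
    ("steel nail", True),
    ("cast-iron pan", True),
]


def appraise(observation_reports):
    """Appraise the law 'all ferrous metals are affected by magnetic fields'.

    One contrary report falsifies the law. Favourable reports, however
    many, only corroborate it, because the law also covers unobserved cases.
    """
    if any(not affected for _, affected in observation_reports):
        return "falsified"
    return "corroborated, not verified"


before = appraise(reports)

# A single counter-instance is logically enough to refute the universal law.
after = appraise(reports + [("anomalous sample", False)])

print(before)
print(after)
```

No number of favourable reports ever moves the result beyond "corroborated", whereas one unfavourable report changes it at once.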
Thus, while advocating falsifiability as the criterion of demarcation for science, Popper explicitly allows for the fact that in practice a single conflicting or counter-instance is never sufficient methodologically to falsify a theory, and that scientific theories are often retained even though much of the available evidence conflicts with them, or is anomalous with respect to them. Scientific theories may, and do, arise genetically in many different ways, and the manner in which a particular scientist comes to formulate a particular theory may be of biographical interest, but it is of no consequence as far as the philosophy of science is concerned. Popper stresses in particular that there is no unique way, no single method such as induction, which functions as the route to scientific theory, a view which Einstein personally endorsed with his affirmation that ‘There is no logical path leading to [the highly universal laws of science]. They can only be reached by intuition, based upon something like an intellectual love of the objects of experience’. Science, in Popper’s view, starts with problems rather than with observations—it is, indeed, precisely in the context of grappling with a problem that the scientist makes observations in the first instance: his observations are selectively designed to test the extent to which a given theory functions as a satisfactory solution to a given problem.
On this criterion of demarcation physics, chemistry, and (non-introspective) psychology, amongst others, are sciences, psychoanalysis is a pre-science (i.e., it undoubtedly contains useful and informative truths, but until such time as psychoanalytical theories can be formulated in such a manner as to be falsifiable, they will not attain the status of scientific theories), and astrology and phrenology are pseudo-sciences. Formally, then, Popper’s theory of demarcation may be articulated as follows: where a ‘basic statement’ is to be understood as a particular observation-report, then we may say that a theory is scientific if and only if it divides the class of basic statements into the following two non-empty sub-classes: (a) the class of all those basic statements with which it is inconsistent, or which it prohibits—this is the class of its potential falsifiers (i.e., those statements which, if true, falsify the whole theory), and (b) the class of those basic statements with which it is consistent, or which it permits (i.e., those statements which, if true, corroborate it, or bear it out).
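The formal criterion can be sketched as a partition test. As an assumption of this sketch, a theory is modelled by a function `permits` that says whether a basic statement is consistent with it; the basic statements and both example theories are hypothetical.

```python
# A minimal sketch of the formal demarcation criterion, under the assumption
# that a theory can be modelled by a `permits(statement)` function.

# Hypothetical basic statements: (sample, affected by a magnetic field?)
BASIC_STATEMENTS = [
    ("sample 1", True),
    ("sample 1", False),
    ("sample 2", True),
    ("sample 2", False),
]


def all_ferrous_metals_affected(statement):
    """Prohibits any report of an unaffected ferrous sample."""
    _, affected = statement
    return affected


def whatever_happens_was_fated(statement):
    """A hypothetical theory that is consistent with every report."""
    return True


def partition(permits, statements):
    """Split basic statements into potential falsifiers and permitted ones."""
    falsifiers = [s for s in statements if not permits(s)]
    permitted = [s for s in statements if permits(s)]
    return falsifiers, permitted


def is_scientific(permits, statements):
    """Scientific if and only if both sub-classes are non-empty."""
    falsifiers, permitted = partition(permits, statements)
    return bool(falsifiers) and bool(permitted)


print(is_scientific(all_ferrous_metals_affected, BASIC_STATEMENTS))
print(is_scientific(whatever_happens_was_fated, BASIC_STATEMENTS))
```

The second theory fails the test because its class of potential falsifiers is empty: it prohibits nothing.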
For Popper accordingly, the growth of human knowledge proceeds from our problems and from our attempts to solve them. These attempts involve the formulation of theories which, if they are to explain anomalies which exist with respect to earlier theories, must go beyond existing knowledge and therefore require a leap of the imagination. For this reason, Popper places special emphasis on the role played by the independent creative imagination in the formulation of theory. The centrality and priority of problems in Popper’s account of science is paramount, and it is this which leads him to characterise scientists as ‘problem-solvers’. Further, since the scientist begins with problems rather than with observations or ‘bare facts’, Popper argues that the only logical technique which is an integral part of scientific method is that of the deductive testing of theories which are not themselves the product of any logical operation. In this deductive procedure conclusions are inferred from a tentative hypothesis. These conclusions are then compared with one another and with other relevant statements to determine whether they falsify or corroborate the hypothesis. Such conclusions are not directly compared with the facts, Popper stresses, simply because there are no ‘pure’ facts available; all observation-statements are theory-laden, and are as much a function of purely subjective factors (interests, expectations, wishes, etc.) as they are a function of what is objectively real.
How then does the deductive procedure work? Popper specifies four steps:
(a) The first is formal, a testing of the internal consistency of the theoretical system to see if it involves any contradictions.
(b) The second step is semi-formal, the axiomatising of the theory to distinguish between its empirical and its logical elements. In performing this step the scientist makes the logical form of the theory explicit. Failure to do this can lead to category-mistakes—the scientist ends up asking the wrong questions, and searches for empirical data where none are available. Most scientific theories contain analytic (i.e., a priori) and synthetic elements, and it is necessary to axiomatise them in order to distinguish the two clearly.
(c) The third step is the comparing of the new theory with existing ones to determine whether it constitutes an advance upon them. If it does not constitute such an advance, it will not be adopted. If, on the other hand, its explanatory success matches that of the existing theories, and additionally, it explains some hitherto anomalous phenomenon, or solves some hitherto unsolved problems, it will be deemed to constitute an advance upon the existing theories, and will be adopted. Thus science involves theoretical progress. However, Popper stresses that we ascertain whether one theory is better than another by deductively testing both theories, rather than by induction. For this reason, he argues that a theory is deemed to be better than another if (while unfalsified) it has greater empirical content, and therefore greater predictive power than its rival. The classic illustration of this in physics was the replacement of Newton’s theory of universal gravitation by Einstein’s theory of relativity. This elucidates the nature of science as Popper sees it: at any given time there will be a number of conflicting theories or conjectures, some of which will explain more than others. Those which explain more will consequently be provisionally adopted. In short, for Popper any theory X is better than a ‘rival’ theory Y if X has greater empirical content, and hence greater predictive power, than Y.
(d) The fourth and final step is the testing of a theory by the empirical application of the conclusions derived from it. If such conclusions are shown to be true, the theory is corroborated (but never verified). If the conclusion is shown to be false, then this is taken as a signal that the theory cannot be completely correct (logically the theory is falsified), and the scientist begins his quest for a better theory. He does not, however, abandon the present theory until such time as he has a better one to substitute for it. More precisely, the method of theory-testing is as follows: certain singular propositions are deduced from the new theory—these are predictions, and of special interest are those predictions which are ‘risky’ (in the sense of being intuitively implausible or of being startlingly novel) and experimentally testable. From amongst the latter the scientist next selects those which are not derivable from the current or existing theory—of particular importance are those which contradict the current or existing theory. He then seeks a decision as regards these and other derived statements by comparing them with the results of practical applications and experimentation. If the new predictions are borne out, then the new theory is corroborated (and the old one falsified), and is adopted as a working hypothesis. If the predictions are not borne out, then they falsify the theory from which they are derived. Thus Popper retains an element of empiricism: for him scientific method does involve making an appeal to experience. But unlike traditional empiricists, Popper holds that experience cannot determine theory (i.e., we do not argue or infer from observation to theory), it rather delimits it: it shows which theories are false, not which theories are true. Moreover, Popper also rejects the empiricist doctrine that empirical observations are, or can be, infallible, in view of the fact that they are themselves theory-laden.
The general picture of Popper’s philosophy of science, then, is this: Hume’s philosophy demonstrates that there is a contradiction implicit in traditional empiricism, which holds both that all knowledge is derived from experience and that universal propositions (including scientific laws) are verifiable by reference to experience. The contradiction, which Hume himself saw clearly, derives from the attempt to show that, notwithstanding the open-ended nature of experience, scientific laws may be construed as empirical generalisations which are in some way finally confirmable by a ‘positive’ experience. Popper eliminates the contradiction by rejecting the first of these principles and removing the demand for empirical verification in favour of empirical falsification in the second. Scientific theories, for him, are not inductively inferred from experience, nor is scientific experimentation carried out with a view to verifying or finally establishing the truth of theories; rather, all knowledge is provisional, conjectural, hypothetical: we can never finally prove our scientific theories, we can merely (provisionally) confirm or (conclusively) refute them; hence at any given time we have to choose between the potentially infinite number of theories which will explain the set of phenomena under investigation. Faced with this choice, we can only eliminate those theories which are demonstrably false, and rationally choose between the remaining, unfalsified theories. Hence Popper’s emphasis on the importance of the critical spirit to science; for him critical thinking is the very essence of rationality. For it is only by critical thought that we can eliminate false theories, and determine which of the remaining theories is the best available one, in the sense of possessing the highest level of explanatory force and predictive power. It is precisely this kind of critical thinking which is conspicuous by its absence in contemporary Marxism and in psychoanalysis.
In the view of many social scientists, the more probable a theory is, the better it is, and if we have to choose between two theories which are equally strong in terms of their explanatory power, and differ only in that one is probable and the other is improbable, then we should choose the former. Popper rejects this. Science, or to be precise, the working scientist, is interested, in Popper’s view, in theories with a high informative content, because such theories possess a high predictive power and are consequently highly testable. But if this is true, Popper argues, then, paradoxical as it may sound, the more improbable a theory is the better it is scientifically, because the probability and informative content of a theory vary inversely—the higher the informative content of a theory the lower will be its probability, for the more information a statement contains, the greater will be the number of ways in which it may turn out to be false. Thus the statements which are of special interest to the scientist are those with a high informative content and (consequentially) a low probability, which nevertheless come close to the truth. Informative content, which is in inverse proportion to probability, is in direct proportion to testability. Consequently the severity of the test to which a theory can be subjected, and by means of which it is falsified or corroborated, is all-important.
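The inverse relation between informative content and probability can be made concrete in a toy model. The sketch below assumes three hypothetical atomic statements, treats every combination of their truth-values as an equally weighted possible world, and measures content as logical improbability (one minus probability); these modelling choices are assumptions made for illustration.

```python
# A minimal sketch: the more a statement says, the less probable it is.
# Worlds, atoms, and the uniform weighting are assumptions of this sketch.
from fractions import Fraction
from itertools import product

ATOMS = ("a", "b", "c")
WORLDS = [dict(zip(ATOMS, values))
          for values in product([True, False], repeat=len(ATOMS))]


def probability(statement):
    """Share of equally weighted worlds in which the statement is true."""
    return Fraction(sum(1 for w in WORLDS if statement(w)), len(WORLDS))


def content(statement):
    """Content measured as logical improbability (an assumed measure)."""
    return 1 - probability(statement)


def weak(world):
    """A cautious statement: only 'a' is asserted."""
    return world["a"]


def strong(world):
    """A bold statement: 'a', 'b' and 'c' are all asserted."""
    return world["a"] and world["b"] and world["c"]


for name, statement in (("weak", weak), ("strong", strong)):
    print(name, probability(statement), content(statement))
```

The bolder statement is false in seven of the eight worlds and the cautious one in only four, which is the sense in which higher content brings both lower probability and more opportunities for refutation.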
For Popper, all scientific criticism must be piecemeal, i.e., he holds that it is not possible to question every aspect of a theory at once. More precisely, while attempting to resolve a particular problem a scientist of necessity accepts all kinds of things as unproblematic. These things constitute what Popper terms the ‘background knowledge’. However, he stresses that the background knowledge is not knowledge in the sense of being conclusively established; it may be challenged at any time, especially if it is suspected that its uncritical acceptance may be responsible for difficulties which are subsequently encountered. Nevertheless, it is clearly not possible to question both the theory and the background knowledge at the same time (e.g., in conducting an experiment the scientist of necessity assumes that the apparatus used is in working order).
How then can one be certain that one is questioning the right thing? The Popperian answer is that we cannot have absolute certainty here, but repeated tests usually show where the trouble lies. Even observation statements, Popper maintains, are fallible, and science in his view is not a quest for certain knowledge, but an evolutionary process in which hypotheses or conjectures are imaginatively proposed and tested in order to explain facts or to solve problems. Popper emphasises both the importance of questioning the background knowledge when the need arises, and the significance of the fact that observation-statements are theory-laden, and hence fallible. For while falsifiability is simple as a logical principle, in practice it is exceedingly complicated—no single observation can ever be taken to falsify a theory, for there is always the possibility (a) that the observation itself is mistaken, or (b) that the assumed background knowledge is faulty or defective.
Popper was initially uneasy with the concept of truth, and in his earliest writings he avoided asserting that a theory which is corroborated is true—for clearly if every theory is an open-ended hypothesis, as he maintains, then ipso facto it has to be at least potentially false. For this reason Popper restricted himself to the contention that a theory which is falsified is false and is known to be such, and that a theory which replaces a falsified theory (because it has a higher empirical content than the latter, and explains what has falsified it) is a ‘better theory’ than its predecessor. However, he came to accept Tarski’s reformulation of the correspondence theory of truth, and in Conjectures and Refutations (1963) he integrated the concepts of truth and content to frame the metalogical concept of ‘truthlikeness’ or ‘verisimilitude’. A ‘good’ scientific theory, Popper thus argued, has a higher level of verisimilitude than its rivals, and he explicated this concept by reference to the logical consequences of theories. A theory’s content is the totality of its logical consequences, which can be divided into two classes: there is the ‘truth-content’ of a theory, which is the class of true propositions which may be derived from it, on the one hand, and the ‘falsity-content’ of a theory, on the other hand, which is the class of the theory’s false consequences (this latter class may of course be empty, and in the case of a theory which is true is necessarily empty).
Popper offered two methods of comparing theories in terms of verisimilitude, the qualitative and quantitative definitions. On the qualitative account, Popper asserted:
Assuming that the truth-content and the falsity-content of two theories t1 and t2 are comparable, we can say that t2 is more closely similar to the truth, or corresponds better to the facts, than t1, if and only if either:
(a) the truth-content but not the falsity-content of t2 exceeds that of t1, or
(b) the falsity-content of t1, but not its truth-content, exceeds that of t2. (Conjectures and Refutations, 233).
Here, verisimilitude is defined in terms of subclass relationships: t2 has a higher level of verisimilitude than t1 if and only if their truth- and falsity-contents are comparable through subclass relationships, and either (a) t2’s truth-content includes t1’s and t2’s falsity-content, if it exists, is included in, or is the same as, t1’s, or (b) t2’s truth-content includes or is the same as t1’s and t2’s falsity-content, if it exists, is included in t1’s.
On the quantitative account, verisimilitude is defined by assigning quantities to contents, where the index of the content of a given theory is its logical improbability (given again that content and probability vary inversely). Formally, then, Popper defines the quantitative verisimilitude which a statement ‘a’ possesses by means of a formula:
Vs(a) = CtT(a) − CtF(a),
where Vs(a) represents the verisimilitude of a, CtT(a) is a measure of the truth-content of a, and CtF(a) is a measure of its falsity-content.
The utilisation of either method of computing verisimilitude shows, Popper held, that even if a theory t2 with a higher content than a rival theory t1 is subsequently falsified, it can still legitimately be regarded as a better theory than t1, and ‘better’ is here now understood to mean t2 is closer to the truth than t1. Thus scientific progress involves, on this view, the abandonment of partially true, but falsified, theories, for theories with a higher level of verisimilitude, i.e., which approach more closely to the truth. In this way, verisimilitude allowed Popper to mitigate what many saw as the pessimism of an anti-inductivist philosophy of science which held that most, if not all, scientific theories are false, and that a true theory, even if discovered, could not be known to be such. With the introduction of the new concept, Popper was able to represent this as an essentially optimistic position in terms of which we can legitimately be said to have reason to believe that science makes progress towards the truth through the falsification and corroboration of theories. Scientific progress, in other words, could now be represented as progress towards the truth, and experimental corroboration could be seen as an indicator of verisimilitude.
However, in the 1970s a series of papers published by researchers such as Miller, Tichý, and Grünbaum revealed fundamental defects in Popper’s formal definitions of verisimilitude. This work was significant because verisimilitude matters in Popper’s system largely through its application to theories which are known to be false. In this connection, Popper had written:
Ultimately, the idea of verisimilitude is most important in cases where we know that we have to work with theories which are at best approximations—that is to say, theories of which we know that they cannot be true. (This is often the case in the social sciences). In these cases we can still speak of better or worse approximations to the truth (and we therefore do not need to interpret these cases in an instrumentalist sense). (Conjectures and Refutations, 235).
For these reasons, the deficiencies discovered by the critics in Popper’s formal definitions were seen by many as devastating, precisely because the most significant of these related to the levels of verisimilitude of false theories. In 1974, Miller and Tichý, working independently of each other, demonstrated that the conditions specified by Popper in his accounts of both qualitative and quantitative verisimilitude for comparing the truth- and falsity-contents of theories can be satisfied only when the theories are true. In the crucially important case of false theories, however, Popper’s definitions are formally defective. For while Popper had believed that verisimilitude intersected positively with his account of corroboration, in the sense that he viewed an improbable theory which had withstood critical testing as one the truth-content of which is great relative to rival theories, while its falsity-content (if it exists) would be relatively low, Miller and Tichý proved, on the contrary, that in the case of a false theory t2 which has excess content over a rival false theory t1, both the truth-content and the falsity-content of t2 will exceed that of t1. With respect to theories which are false, therefore, Popper’s conditions for comparing levels of verisimilitude, whether in quantitative or qualitative terms, can never be met.
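The result for the qualitative definition can be checked by brute force in a small finite model. The sketch below is an illustration under stated assumptions, not a reconstruction of Miller’s or Tichý’s proofs: it assumes four possible worlds, one of which (`ACTUAL`) is the case, treats a theory as the set of worlds in which it holds, and treats the theory’s consequences as all propositions that include that set. It does not cover the quantitative definition.

```python
# A minimal sketch: Popper's qualitative comparison in a four-world toy model.
# The model (WORLDS, ACTUAL, theories as sets of worlds) is an assumption.
from itertools import combinations

WORLDS = frozenset(range(4))   # four possible worlds
ACTUAL = 0                     # the world that is in fact the case

# A proposition (and a theory) is the set of worlds in which it holds.
PROPOSITIONS = [frozenset(c)
                for size in range(len(WORLDS) + 1)
                for c in combinations(sorted(WORLDS), size)]


def consequences(theory):
    """Every proposition that holds wherever the theory holds."""
    return {p for p in PROPOSITIONS if theory <= p}


def truth_content(theory):
    return {p for p in consequences(theory) if ACTUAL in p}


def falsity_content(theory):
    return {p for p in consequences(theory) if ACTUAL not in p}


def is_true(theory):
    return ACTUAL in theory


def closer_to_truth(t2, t1):
    """Popper's conditions (a) or (b), stated via subclass relations."""
    tc1, tc2 = truth_content(t1), truth_content(t2)
    fc1, fc2 = falsity_content(t1), falsity_content(t2)
    return tc1 <= tc2 and fc2 <= fc1 and (tc1 < tc2 or fc2 < fc1)


# Every ordered pair in which t2 counts as closer to the truth than t1.
winners = [(t2, t1) for t2 in PROPOSITIONS for t1 in PROPOSITIONS
           if closer_to_truth(t2, t1)]
false_winners = [(t2, t1) for t2, t1 in winners if not is_true(t2)]

# For false theories, excess content enlarges both contents at once.
both_contents_grow = all(
    truth_content(t1) < truth_content(t2)
    and falsity_content(t1) < falsity_content(t2)
    for t2 in PROPOSITIONS for t1 in PROPOSITIONS
    if t2 < t1 and not is_true(t1)
)

print(len(winners), len(false_winners), both_contents_grow)
```

In this model `false_winners` is empty: the definition ranks a theory above a rival only when the higher-ranked theory is true. `both_contents_grow` holds because a stronger false theory always counts itself among its own false consequences, so its falsity-content cannot be smaller than that of the weaker rival.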
Commentators on Popper, with few exceptions, had initially attached little importance to his theory of verisimilitude. However, after the failure of Popper’s definitions in 1974, some critics came to see it as central to his philosophy of science, and consequently held that the whole edifice of the latter had been subverted. Popper’s own response was two-fold. In the first place, while acknowledging the deficiencies in his own formal account (“my main mistake was my failure to see at once that … if the content of a false statement a exceeds that of a statement b, then the truth-content of a exceeds the truth-content of b, and the same holds of their falsity-contents”, Objective Knowledge, 371), Popper argued that “I do think that we should not conclude from the failure of my attempts to solve the problem [of defining verisimilitude] that the problem cannot be solved” (Objective Knowledge, 372), a point of view which was to precipitate more than two decades of important technical research in this field. At another, more fundamental level, he moved the task of formally defining the concept from centre-stage in his philosophy of science, by protesting that he had never intended to imply “that degrees of verisimilitude … can ever be numerically determined, except in certain limiting cases” (Objective Knowledge, 59), and arguing instead that the chief value of the concept is heuristic and intuitive, so that the absence of an adequate formal definition is not an insuperable impediment to its utilisation in the actual appraisal of theories relativised to problems in which we have an interest. The thrust of the latter strategy seems to many to genuinely reflect the significance of the concept of verisimilitude in Popper’s system, but it has not satisfied all of his critics.
Given Popper’s personal history and background, it is hardly surprising that he developed a deep and abiding interest in social and political philosophy. However, it is worth emphasising that his angle of approach to these fields is through a consideration of the nature of the social sciences which seek to describe and explicate them systematically, particularly history. It is in this context that he offers an account of the nature of scientific prediction, which in turn allows him a point of departure for his attack upon totalitarianism and all its intellectual supports, especially holism and historicism. In this context holism is to be understood as the view that human social groupings are greater than the sum of their members, that such groupings are ‘organic’ entities in their own right, that they act on their human members and shape their destinies, and that they are subject to their own independent laws of development. Historicism, which is closely associated with holism, is the belief that history develops inexorably and necessarily according to certain principles or rules towards a determinate end (as for example in the dialectic of Hegel, which was adopted and implemented by Marx). The link between holism and historicism is that the holist believes that individuals are essentially formed by the social groupings to which they belong, while the historicist—who is usually also a holist—holds that we can understand such a social grouping only in terms of the internal principles which determine its development.
These beliefs lead to what Popper calls ‘The Historicist Doctrine of the Social Sciences’, the views (a) that the principal task of the social sciences is to make predictions about the social and political development of man, and (b) that the task of politics, once the key predictions have been made, is, in Marx’s words, to lessen the ‘birth pangs’ of future social and political developments. Popper thinks that this view of the social sciences is both theoretically misconceived (in the sense of being based upon a view of natural science and its methodology which is totally wrong), and socially dangerous, as it leads inevitably to totalitarianism and authoritarianism—to centralised governmental control of the individual and the attempted imposition of large-scale social planning. Against this Popper strongly advances the view that any human social grouping is no more (or less) than the sum of its individual members, that what happens in history is the (largely unplanned and unforeseeable) result of the actions of such individuals, and that large scale social planning to an antecedently conceived blueprint is inherently misconceived—and inevitably disastrous—precisely because human actions have consequences which cannot be foreseen. Popper, then, is an historical indeterminist, insofar as he holds that history does not evolve in accordance with intrinsic laws or principles, that in the absence of such laws and principles unconditional prediction in the social sciences is an impossibility, and that there is no such thing as historical necessity.
The link between Popper’s theory of knowledge and his social philosophy is his fallibilism—just as we make theoretical progress in science by deliberately subjecting our theories to critical scrutiny, and abandoning those which have been falsified, so too, Popper holds, the critical spirit can and should be sustained at the social level. More specifically, the open society can be brought about only if it is possible for the individual citizen to evaluate critically the consequences of the implementation of government policies, which can then be abandoned or modified in the light of such critical scrutiny—in such a society, the rights of the individual to criticise administrative policies will be formally safeguarded and upheld, undesirable policies will be eliminated in a manner analogous to the elimination of falsified scientific theories, and differences between people on social policy will be resolved by critical discussion and argument rather than by force. The open society as thus conceived of by Popper may be defined as ‘an association of free individuals respecting each other’s rights within the framework of mutual protection supplied by the state, and achieving, through the making of responsible, rational decisions, a growing measure of humane and enlightened life’ (Levinson, R.B. In Defense of Plato, 17). As such, Popper holds, it is not a utopian ideal, but an empirically realised form of social organisation which, he argues, is in every respect superior to its (real or potential) totalitarian rivals. But he does not engage in a moral defence of the ideology of liberalism; rather his strategy is the much deeper one of showing that totalitarianism is typically based upon historicist and holist presuppositions, and of demonstrating that these presuppositions are fundamentally incoherent.
At a very general level, Popper argues that historicism and holism have their origins in what he terms ‘one of the oldest dreams of mankind—the dream of prophecy, the idea that we can know what the future has in store for us, and that we can profit from such knowledge by adjusting our policy to it.’ (Conjectures and Refutations, 338). This dream was given further impetus, he speculates, by the emergence of a genuine predictive capability regarding such events as solar and lunar eclipses at an early stage in human civilisation, which has of course become increasingly refined with the development of the natural sciences and their concomitant technologies. The kind of reasoning which has made, and continues to make, historicism plausible may, on this account, be reconstructed as follows: if the application of the laws of the natural sciences can lead to the successful prediction of such future events as eclipses, then surely it is reasonable to infer that knowledge of the laws of history as yielded by a social science or sciences (assuming that such laws exist) would lead to the successful prediction of such future social phenomena as revolutions? Why should it be possible to predict an eclipse, but not a revolution? Why can we not conceive of a social science which could and would function as the theoretical natural sciences function, and yield precise unconditional predictions in the appropriate sphere of application? These are amongst the questions which Popper seeks to answer, and in doing so, to show that they are based upon a series of misconceptions about the nature of science, and about the relationship between scientific laws and scientific prediction.
His first argument may be summarised as follows: in relation to the critically important concept of prediction, Popper makes a distinction between what he terms ‘conditional scientific predictions’, which have the form ‘If X takes place, then Y will take place’, and ‘unconditional scientific prophecies’, which have the form ‘Y will take place’. Contrary to popular belief, it is the former rather than the latter which are typical of the natural sciences, which means that typically prediction in natural science is conditional and limited in scope—it takes the form of hypothetical assertions stating that certain specified changes will come about if particular specified events antecedently take place. This is not to deny that ‘unconditional scientific prophecies’, such as the prediction of eclipses, for example, do take place in science, and that the theoretical natural sciences make them possible. However, Popper argues that (a) these unconditional prophecies are not characteristic of the natural sciences, and (b) that the mechanism whereby they occur, in the very limited way in which they do, is not understood by the historicist.
What is the mechanism which makes unconditional scientific prophecies possible? The answer is that such prophecies can sometimes be derived from a combination of conditional predictions (themselves derived from scientific laws) and existential statements specifying that the conditions in relation to the system being investigated are fulfilled. Schematically, this can be represented as follows:
[C.P. + E.S.] = U.P.
where C.P. = Conditional Prediction; E.S. = Existential Statement; U.P. = Unconditional Prophecy. The most common examples of unconditional scientific prophecies in science relate to the prediction of such phenomena as lunar and solar eclipses and comets.
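The schema amounts to detaching the consequent of a conditional once its antecedent is known to hold. A minimal sketch follows; the function name and the eclipse wording are hypothetical, chosen only to illustrate the form of the inference.

```python
# A minimal sketch of the schema [C.P. + E.S.] = U.P. as a simple inference.
# The eclipse wording is a hypothetical illustration, not astronomical data.

def unconditional_prophecy(conditional_prediction, existential_statement):
    """Detach the consequent of 'if X then Y' once X is asserted.

    Returns Y when the existential statement asserts X, otherwise None,
    since no unconditional prophecy follows without the fulfilled condition.
    """
    condition, outcome = conditional_prediction
    if existential_statement == condition:
        return outcome
    return None


# C.P.: 'If X takes place, then Y will take place.'
cp = ("the Moon passes between the Sun and the Earth on date D",
      "a solar eclipse will be seen on date D")

# E.S.: the condition is in fact fulfilled for the system investigated.
es = "the Moon passes between the Sun and the Earth on date D"

prophecy = unconditional_prophecy(cp, es)
no_prophecy = unconditional_prophecy(cp, "nothing is known about the Moon's path")

print(prophecy)
print(no_prophecy)
```

Without a statement that the condition is fulfilled, the conditional prediction alone yields nothing unconditional.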
Given, then, that this is the mechanism which generates unconditional scientific prophecies, Popper makes two related claims about historicism: (a) that the historicist does not in fact derive his unconditional scientific prophecies in this manner from conditional predictions, and (b) that the historicist cannot do so, because long-term unconditional scientific prophecies can be derived from conditional predictions only if they apply to systems which are well-isolated, stationary, and recurrent (like our solar system). Such systems are quite rare in nature, and human society is most emphatically not one of them.
This, then, Popper argues, is the reason why it is a fundamental mistake for the historicist to take the unconditional scientific prophecies of eclipses as being typical and characteristic of the predictions of natural science—in fact such predictions are possible only because our solar system is a stationary and repetitive system which is isolated from other such systems by immense expanses of empty space. The solar system aside, there are very few such systems around for scientific investigation—most of the others are confined to the field of biology, where unconditional prophecies about the life-cycles of organisms are made possible by the existence of precisely the same factors. Thus one of the fallacies committed by the historicist is to take the (relatively rare) instances of unconditional prophecies in the natural science as constituting the essence of what scientific prediction is, to fail to see that such prophecies apply only to systems which are isolated, stationary, and repetitive, and to seek to apply the method of scientific prophecy to human society and human history. The latter, of course, is not an isolated system (in fact it’s not a system at all), it is constantly changing, and it continually undergoes rapid, non-repetitive development. In the most fundamental sense possible, every event in human history is discrete, novel, quite unique, and ontologically distinct from every other historical event. For this reason, it is impossible in principle that unconditional scientific prophecies could be made in relation to human history—the idea that the successful unconditional prediction of eclipses provides us with reasonable grounds for the hope of successful unconditional prediction regarding the evolution of human history turns out to be based upon a gross misconception, and is quite false. 
As Popper himself concludes, “The fact that we predict eclipses does not, therefore, provide a valid reason for expecting that we can predict revolutions.” (Conjectures and Refutations, 340).
This argument is one of the strongest that has ever been brought against historicism, cutting, as it does, right to the heart of one of its main theoretical presuppositions. However, it is not Popper’s only argument against it. An additional mistake which he detects in historicism is the failure of the historicist to distinguish between scientific laws and trends, which is also frequently accompanied by a simple logical fallacy. The fallacy is that of inferring from the fact that our understanding of any (past) historical event—such as, for example, the French Revolution—is in direct proportion to our knowledge of the antecedent conditions which led to that event, that knowledge of all the antecedent conditions of some future event is possible, and that such knowledge would make that future event precisely predictable. For the truth is that the number of factors which predate and lead to the occurrence of any event, past, present, or future, is indefinitely large, and therefore knowledge of all of these factors is impossible, even in principle. What gives rise to the fallacy is the manner in which the historian (necessarily) selectively isolates a finite number of the antecedent conditions of some past event as being of particular importance, which are then somewhat misleadingly termed ‘the causes’ of that event, when in fact what this means is that they are the specific conditions which a particular historian or group of historians take to be more relevant than any other of the indefinitely large number of such conditions (for this reason, most historical debates range over the question as to whether the conditions thus specified are the right ones). While this kind of selectivity may be justifiable in relation to the treatment of any past event, it has no basis whatsoever in relation to the future—if we now select, as Marx did, the ‘relevant’ antecedent conditions for some future event, the likelihood is that we will select wrongly.
The historicist’s failure to distinguish between scientific laws and trends is equally destructive of his cause. This failure makes him think it possible to explain change by discovering trends running through past history, and to anticipate and predict future occurrences on the basis of such observations. Here Popper points out that there is a critical difference between a trend and a scientific law, the failure to observe which is fatal. For a scientific law is universal in form, while a trend can be expressed only as a singular existential statement. This logical difference is crucial because unconditional predictions, as we have already seen, can be based only upon conditional ones, which themselves must be derived from scientific laws. Neither conditional nor unconditional predictions can be based upon trends, because these may change or be reversed with a change in the conditions which gave rise to them in the first instance. As Popper puts it, there can be no doubt that “the habit of confusing trends with laws, together with the intuitive observation of trends such as technical progress, inspired the central doctrines of … historicism.” (The Poverty of Historicism, 116). Popper does not, of course, dispute the existence of trends, nor does he deny that the observation of trends can be of practical utility value—but the essential point is that a trend is something which itself ultimately stands in need of scientific explanation, and it cannot therefore function as the frame of reference in terms of which anything else can be scientifically explained or predicted.
A point which connects with this has to do with the role which the evolution of human knowledge has played in the historical development of human society. It is incontestable that, as Marx himself observed, there has been a causal link between the two, in the sense that advances in scientific and technological knowledge have given rise to widespread global changes in patterns of human social organisation and social interaction, which in turn have led to social structures (e.g. educational systems) which further growth in human knowledge. In short, the evolution of human history has been strongly influenced by the growth of human knowledge, and it is extremely likely that this will continue to be the case—all the empirical evidence suggests that the link between the two is progressively consolidating. However, this gives rise to further problems for the historicist. In the first place, the statement that ‘if there is such a thing as growing human knowledge, then we cannot anticipate today what we shall know only tomorrow’ is, Popper holds, intuitively highly plausible. Moreover, he argues, it is logically demonstrable by a consideration of the implications of the fact that no scientific predictor, human or otherwise, can possibly predict, by scientific methods, its own future results. From this it follows, he holds, that ‘no society can predict, scientifically, its own future states of knowledge’. (The Poverty of Historicism, vii). Thus, while the future evolution of human history is extremely likely to be influenced by new developments in human knowledge, as it always has in the past, we cannot now scientifically determine what such knowledge will be. 
From this it follows that if the future holds any new discoveries or any new developments in the growth of our knowledge (and given the fallible nature of the latter, it is inconceivable that it does not), then it is impossible for us to predict them now, and it is therefore impossible for us to predict the future development of human history now, given that the latter will, at least in part, be determined by the future growth of our knowledge. Thus once again historicism collapses—the dream of a theoretical, predictive science of history is unrealisable, because it is an impossible dream.
Popper’s arguments against holism, and in particular his arguments against the propriety of large-scale planning of social structures, are interconnected with his demonstration of the logical shortcomings of the presuppositions of historicism. Such planning (which actually took place, of course, in the USSR, in China, and in Cambodia, for example, under totalitarian regimes which accepted forms of historicism and holism), Popper points out, is necessarily structured in the light of the predictions which have been made about future history on the basis of the so-called ‘laws’ which historicists such as Marx and Mao claimed to have discovered in relation to human history. Accordingly, recognition that there are no such laws, and that unconditional predictions about future history are based, at best, upon nothing more substantial than the observation of contingent trends, shows that, from a purely theoretical as well as a practical point of view, large-scale social planning is indeed a recipe for disaster. In summary, unconditional large-scale planning for the future is theoretically as well as practically misguided, because, again, part of what we are planning for is our future knowledge, and our future knowledge is not something which we can in principle now possess—we cannot adequately plan for unexpected advances in our future knowledge, or for the effects which such advances will have upon society as a whole. The acceptance of historical indeterminism, then, as the only philosophy of history which is commensurate with a proper understanding of the nature of scientific knowledge, fatally undermines both historicism and holism.
Popper’s critique of both historicism and holism is balanced, on the positive side, by his affirmation of the ideals of individualism and market economics and his strong defence of the open society—the view, again, that a society is equivalent to the sum of its members, that the actions of the members of society serve to fashion and to shape it, and that the social consequences of intentional actions are very often, and very largely, unintentional. This part of his social philosophy was influenced by the economist Friedrich Hayek, who worked with him at the London School of Economics and who was a life-long friend. Popper advocated what he (rather unfortunately) terms ‘piecemeal social engineering’ as the central mechanism for social planning—for in utilising this mechanism intentional actions are directed to the achievement of one specific goal at a time, which makes it possible to monitor the situation to determine whether adverse unintended effects of intentional actions occur, in order to correct and readjust when this proves necessary. This, of course, parallels precisely the critical testing of theories in scientific investigation. This approach to social planning (which is explicitly based upon the premise that we do not, because we cannot, know what the future will be like) encourages attempts to put right what is problematic in society—generally-acknowledged social ills—rather than attempts to impose some preconceived idea of the ‘good’ upon society as a whole. For this reason, in a genuinely open society piecemeal social engineering goes hand-in-hand for Popper with negative utilitarianism (the attempt to minimise the amount of misery, rather than, as with positive utilitarianism, the attempt to maximise the amount of happiness). 
The state, he holds, should concern itself with the task of progressively formulating and implementing policies designed to deal with the social problems which actually confront it, with the goal of eliminating human misery and suffering to the highest possible degree. The positive task of increasing social and personal happiness, by contrast, can and should be left to individual citizens (who may, of course, act collectively to this end), who, unlike the state, have at least a chance of achieving this goal, but who in a free society are rarely in a position to systematically subvert the rights of others in the pursuit of idealised objectives. Thus in the final analysis for Popper the activity of problem-solving is as definitive of our humanity at the level of social and political organisation as it is at the level of science, and it is this key insight which unifies and integrates the broad spectrum of his thought.
While it cannot be said that Popper was a modest man, he took criticism of his theories very seriously, and spent much of his time in his later years endeavouring to show that such criticisms were either based upon misunderstandings, or that his theories could, without loss of integrity, be made compatible with new and important insights. The following is a summary of some of the main criticisms which he has had to address. (For Popper’s responses to critical commentary, see his ‘Replies to My Critics’, in P.A. Schilpp (ed.), The Philosophy of Karl Popper, Volume 2, and his Realism and the Aim of Science, edited by W.W. Bartley III.)
1. Popper professes to be anti-conventionalist, and his commitment to the correspondence theory of truth places him firmly within the realist’s camp. Yet, following Kant, he strongly repudiates the positivist/empiricist view that basic statements (i.e., present-tense observation statements about sense-data) are infallible, and argues convincingly that such basic statements are not mere ‘reports’ of passively registered sensations. Rather they are descriptions of what is observed as interpreted by the observer with reference to a determinate theoretical framework. This is why Popper repeatedly emphasises that basic statements are not infallible, and it indicates what he means when he says that they are ‘theory laden’—perception itself is an active process, in which the mind assimilates data by reference to an assumed theoretical backdrop. He accordingly asserts that basic statements themselves are open-ended hypotheses: they have a certain causal relationship with experience, but they are not determined by experience, and they cannot be verified or confirmed by experience. However, this poses a difficulty regarding the consistency of Popper’s theory: if a theory X is to be genuinely testable (and so scientific) it must be possible to determine whether or not the basic propositions which would, if true, falsify it, are actually true or false (i.e., whether its potential falsifiers are actual falsifiers). But how can this be known, if such basic statements cannot be verified by experience? Popper’s answer is that ‘basic statements are not justifiable by our immediate experiences, but are … accepted by an act, a free decision’. (Logic of Scientific Discovery, 109). 
However, and notwithstanding Popper’s claims to the contrary, this itself seems to be a refined form of conventionalism—it implies that it is almost entirely an arbitrary matter whether it is accepted that a potential falsifier is an actual one, and consequently that the falsification of a theory is itself the function of a ‘free’ and arbitrary act. It also seems very difficult to reconcile this with Popper’s view that science progressively moves closer to the truth, conceived of in terms of the correspondence theory, for this kind of conventionalism is inimical to this (classical) conception of truth.
2. As Lakatos has pointed out, Popper’s theory of demarcation hinges quite fundamentally on the assumption that there are such things as critical tests, which either falsify a theory, or give it a strong measure of corroboration. Popper himself is fond of citing, as an example of such a critical test, the resolution, by Adams and Leverrier, of the problem which the anomalous orbit of Uranus posed for nineteenth century astronomers. Both men independently came to the conclusion that, assuming Newtonian mechanics to be precisely correct, the observed divergence in the elliptical orbit of Uranus could be explained if the existence of an eighth, as yet unobserved outer planet was posited. Further, they were able, again within the framework of Newtonian mechanics, to calculate the precise position of the ‘new’ planet. Thus when subsequent research by Galle at the Berlin observatory revealed that such a planet (Neptune) did in fact exist, and was situated precisely where Adams and Leverrier had calculated, this was hailed by all and sundry as a magnificent triumph for Newtonian physics: in Popperian terms, Newton’s theory had been subjected to a critical test, and had passed with flying colours. Popper himself refers to this strong corroboration of Newtonian physics as ‘the most startling and convincing success of any human intellectual achievement’. Yet Lakatos flatly denies that there are critical tests, in the Popperian sense, in science, and argues the point convincingly by turning the above example of an alleged critical test on its head. What, he asks, would have happened if Galle had not found the planet Neptune? Would Newtonian physics have been abandoned, or would Newton’s theory have been falsified?
The answer is clearly not, for Galle’s failure could have been attributed to any number of causes other than the falsity of Newtonian physics (e.g., the interference of the earth’s atmosphere with the telescope, the existence of an asteroid belt which hides the new planet from the earth, etc). The point here is that the ‘falsification/corroboration’ disjunction offered by Popper is far too logically neat: non-corroboration is not necessarily falsification, and falsification of a high-level scientific theory is never brought about by an isolated observation or set of observations. Such theories are, it is now generally accepted, highly resistant to falsification. They are falsified, if at all, Lakatos argues, not by Popperian critical tests, but rather within the elaborate context of the research programmes associated with them gradually grinding to a halt, with the result that an ever-widening gap opens up between the facts to be explained, and the research programmes themselves. (Lakatos, I. The Methodology of Scientific Research Programmes, passim). Popper’s distinction between the logic of falsifiability and its applied methodology does not in the end do full justice to the fact that all high-level theories grow and live despite the existence of anomalies (i.e., events/phenomena which are incompatible with the theories). The existence of such anomalies is not usually taken by the working scientist as an indication that the theory in question is false; on the contrary, he will usually, and necessarily, assume that the auxiliary hypotheses which are associated with the theory can be modified to incorporate, and explain, existing anomalies.
3. Scientific laws are expressed by universal statements (i.e., they take the logical form ‘All As are X’, or some equivalent) which are therefore concealed conditionals—they have to be understood as hypothetical statements asserting what would be the case under certain ideal conditions. In themselves they are not existential in nature. Thus ‘All As are X’ means ‘If anything is an A, then it is X’. Since scientific laws are non-existential in nature, they logically cannot imply any basic statements, since the latter are explicitly existential. The question arises, then, as to how any basic statement can falsify a scientific law, given that basic statements are not deducible from scientific laws in themselves? Popper answers that scientific laws are always taken in conjunction with statements outlining the ‘initial conditions’ of the system under investigation; these latter, which are singular existential statements, do, when combined with the scientific law, yield hard and fast implications. Thus, the law ‘All As are X’, together with the initial condition statement ‘There is an A at Y’, yields the implication ‘The A at Y is X’, which, if false, falsifies the original law.
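The logical shape of this reply can be set out schematically. What follows is only a sketch in standard first-order notation: the predicate letters A and X come from the example above, the constant y is an assumed name for the particular A located at Y, and the formalisation is an illustrative reconstruction, not Popper’s own notation.

```latex
% Requires the amsmath package for align*.
\begin{align*}
&\text{(Law)}               && \forall x\,\bigl(A(x) \rightarrow X(x)\bigr) \\
&\text{(Initial condition)} && A(y) \\
&\text{(Prediction)}        && \text{hence } X(y) \\[4pt]
&\text{(Observation)}       && \neg X(y) \\
&\text{(Modus tollens)}     && \text{hence } \neg\Bigl[\forall x\,\bigl(A(x) \rightarrow X(x)\bigr) \wedge A(y)\Bigr]
\end{align*}
```

Read strictly, the failed prediction refutes only the conjunction of the law and the initial condition; it counts against the law itself only on the further assumption that the initial condition statement is accepted as true.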
This reply is adequate only if it is true, as Popper assumes, that singular existential statements will always do the work of bridging the gap between a universal theory and a prediction. Hilary Putnam in particular has argued that this assumption is false, in that in some cases at least the statements required to bridge this gap (which he calls ‘auxiliary hypotheses’) are general rather than particular, and consequently that when the prediction turns out to be false we have no way of knowing whether this is due to the falsity of the scientific law or the falsity of the auxiliary hypotheses. The working scientist, Putnam argues, always initially assumes that it is the latter, which shows not only that scientific laws are, contra Popper, highly resistant to falsification, but also why they are so highly resistant to falsification.
Popper’s final position is that he acknowledges that it is impossible to discriminate science from non-science on the basis of the falsifiability of the scientific statements alone; he recognizes that scientific theories are predictive, and consequently prohibitive, only when taken in conjunction with auxiliary hypotheses, and he also recognizes that readjustment or modification of the latter is an integral part of scientific practice. Hence his final concern is to outline conditions which indicate when such modification is genuinely scientific, and when it is merely ad hoc. This is itself clearly a major alteration in his position, and arguably represents a substantial retraction on his part: Marxism can no longer be dismissed as ‘unscientific’ simply because its advocates preserved the theory from falsification by modifying it (for in general terms, such a procedure, it now transpires, is perfectly respectable scientific practice). It is now condemned as unscientific by Popper because the only rationale for the modifications which were made to the original theory was to ensure that it evaded falsification, and so such modifications were ad hoc, rather than scientific. This contention—though not at all implausible—has, to hostile eyes, a somewhat contrived air about it, and is unlikely to worry the convinced Marxist. On the other hand, the shift in Popper’s own basic position is taken by some critics as an indicator that falsificationism, for all its apparent merits, fares no better in the final analysis than verificationism.
A good place to situate the start of theoretical debates about women, class and work is at the intersection of Marxism and feminism. Such debates were shaped not only by academic inquiries but also by questions about the relation between women’s oppression and liberation and the class politics of the left, trade union and feminist movements in the late 19th and 20th centuries, particularly in the U.S., Britain and Europe. It will also be necessary to consider various philosophical approaches to the concept of work, the way that women’s work and household activities are subsumed or not under this category, how the specific features of this work may or may not connect to different “ways of knowing” and different approaches to ethics, and the debate between essentialist and social constructionist approaches to differences between the sexes as a base for the sexual division of labor in most known human societies. The relation of women as a social group to the analysis of economic class has spurred political debates within both Marxist and feminist circles as to whether women’s movements challenging male domination can assume a common set of women’s interests across race, ethnicity, and class. If there are no such interests, on what can a viable women’s movement be based, and how can it evade promoting primarily the interests of white middle class and wealthy women? To the extent to which women do organize themselves as a political group cutting across traditional class lines, under what conditions are they a conservative influence as opposed to a progressive force for social change? If poor and working class women’s issues are different than middle and upper class women’s issues, how can middle class women’s movements be trusted to address them?
In addition to these questions, there is a set of issues related to cross-cultural comparative studies of women, work and relative power in different societies, as well as analyses of how women’s work is connected to processes of globalization.
Marxism as a philosophy of human nature stresses the centrality of work in the creation of human nature itself and human self-understanding. Both the changing historical relations between human work and nature, and the relations of humans to each other in the production and distribution of goods to meet material needs construct human nature differently in different historical periods: nomadic humans are different than agrarian or industrial humans. Marxism as a philosophy of history and social change highlights the social relations of work in different economic modes of production in its analysis of social inequalities and exploitation, including relations of domination such as racism and sexism (Marx 1844, 1950, 1906-9; Marx and Engels 1848, 1850; Engels 1942). Within capitalism, the system Marx and Engels analyzed most, the logic of profit drives the bourgeois class into developing the productive forces of land, labor and capital by expanding markets, turning land into a commodity and forcing the working classes from feudal and independent agrarian production into wage labor. Marx and Engels argue that turning all labor into a commodity to be bought and sold not only alienates workers by taking the power of production away from them, it also collectivizes workers into factories and mass assembly lines. This provides the opportunity for workers to unite against the capitalists and to demand the collectivization of property, i.e., socialism, or communism.
According to Engels’s famous analysis of women’s situation in the history of different economic modes of production in The Origin of the Family, Private Property and the State (1942), women are originally equal to, if not more powerful than, men in communal forms of production with matrilineal family organizations. Women lose power when private property comes into existence as a mode of production. Men’s control of private property, and the ability thereby to generate a surplus, changes the family form to a patriarchal one where women, and often slaves, become the property of the father and husband.
The rise of capitalism, in separating the family household from commodity production, further solidifies this control of men over women in the family when the latter become economic dependents of the former in the male breadwinner-female housewife nuclear family form. Importantly, capitalism also creates the possibility of women’s liberation from family-based patriarchy by creating possibilities for women to work in wage labor and become economically independent of husbands and fathers. Engels stresses, however, that because of the problem of unpaid housework, a private task allocated to women in the sexual division of labor of capitalism, full women’s liberation can only be achieved with the development of socialism and the socialization of housework and childrearing in social services provided by the state. For this reason, most contemporary Marxists have argued that women’s liberation requires feminists to join the working class struggle against capitalism (Cliff 1984).
Many Marxist-feminist thinkers, prominent among them sociologists and anthropologists, have done cross-cultural and historical studies of earlier forms of kinship and economy and the role of the sexual or gender division of labor in supporting or undermining women’s social power (cf. Reed 1973, Leacock 1972, Rosaldo and Lamphere 1974). They have also attempted to assess the world economic development of capitalism as a contradictory force for the liberation of women (Saffioti 1978) and to argue that universal women’s liberation requires attention to the worse off: poor women workers in poor post-colonial countries (Sen & Grown 1987). Other feminist anthropologists have argued that other variables in addition to women’s role in production are key to understanding women’s social status and power (Sanday 1981; Leghorn and Parker 1981). Yet other feminist economic historians have done historical studies of the ways that race, class and ethnicity have situated women differently in relation to production, for example in the history of the United States (Davis 1983; Amott and Matthaei 1991). Finally some Marxist-feminists have argued that women’s work in biological and social reproduction is a necessary element of all modes of production and one often ignored by Marxist economists (Benston 1969; Vogel 1995).
Those feminist analyses which have highlighted the role of women’s work in the social construction of gender and the perpetuation of male dominance have been termed liberal, radical, Marxist, and socialist feminism by such influential categorizers as Jaggar and Rothenberg [Struhl] (1978), Tong (2000), Barrett (1980), Jaggar (1983) and Walby (1990). However, the pigeonhole categories of liberal, radical, Marxist, or socialist feminism apply poorly both to first wave women’s movement feminist predecessors and to contemporary deconstructionist, post-structuralist and post-colonialist perspectives.
A number of first wave feminists write about work and class as key issues for women’s liberation, such as socialist-feminist Charlotte Perkins Gilman, heavily influenced by Darwinism and 19th century utopian modernism (Gilman 1898, 1910, 1979), anarchist Emma Goldman (1969), and existentialist, radical feminist and Marxist of sorts Simone de Beauvoir (1952). This is because the debates that arose around the place of the women’s movement in class politics were different in the early and mid-twentieth century than they were in the 1960s when many feminist theorists were trying to define themselves independently of the left anti-Vietnam war and civil rights movements of the time.
The debate about the economic and social function of housework and its relation to women’s oppression is an old one that has been a feature of both the first and second wave women’s movements in the US, Britain and Europe. In both eras, the underlying issue is how to handle the public/private split of capitalist societies in which women’s reproductive functions have either limited their work to the home or created a “second shift” problem of unpaid housework and childcare as well as waged work. In the first wave, located as it was in the Victorian period where the dominant ideology for middle and upper class women was purity, piety and domesticity (also called the “cult of true womanhood”), the debate centered on whether to keep housework in the private sphere yet make it more scientific and efficient (Beecher 1841; Richards 1915 ), or whether to “socialize” it by bringing it into the public sphere, as socialist Charlotte Perkins Gilman advocated (1898).
In the US, the “public housekeeping” aspect of the Progressive movement of the 1890s through early 1900s advocated that women bring the positive values associated with motherhood into the public sphere — by obtaining the vote, cleaning out corruption in politics, creating settlement houses to educate and support immigrants, and forming the women’s peace movement, etc. (cf. Jane Addams 1914). Disagreements about whether to downplay or valorize the distinctive function and skills in motherhood as work for which women are naturally superior, or to see motherhood as restricting women’s chances for economic independence and equality with men in the public sphere, were also evident in debates between Ellen Keys (1909, 1914) and Gilman. Keys represented the difference side, that women are superior humans because of mothering; while Gilman and Goldman took the equality side of the debate, that is, that, women are restricted, and made socially unequal to men, by unpaid housework and mothering.
In the second wave movement, theorists can be grouped by their theory of how housework oppresses women. Typically, liberal feminists critique housework because it is unpaid. This makes women dependent on men and devalued, since their work is outside the meaningful sphere of public economic production (Friedan 1963). Marxist feminist theorists see this as part of the problem, but some go further to maintain that housework is part of a household feudal mode of production of goods for use that persists under capitalism and gives men feudal powers over women’s work (Benston 1969, Fox 1980). Other Marxist feminists argue that women’s housework is part of the social reproduction of capitalism (Federici 1975, 2004; Malos 1975; Vogel 1995). That the necessary work of reproducing the working class is unpaid allows more profits to capitalists. It is the sexual division of labor in productive and reproductive work that makes woman unequal to men and allows capitalists to exploit women’s unpaid labor. Some even make this analysis the basis for a demand for wages for housework (Dalla Costa 1974; Federici 1975). More recently, Federici has done an analysis of the transition to capitalism in Europe. She argues that it was the emerging capitalist class need to control working class reproduction, to eliminate working class women’s control over biological reproduction, and to assure their unpaid reproductive work in the home by restricting abortions, that fueled the campaign against witches during this period (Federici 2004).
One of the philosophical problems raised by the housework debate is how to draw the line between work and play or leisure activity when the activity is not paid: is a mother playing with her baby working or engaged in play? If the former, then her hours in such activity may be compared with those of her husband or partner to see if there is an exploitation relation present, for example, if his total hours of productive and reproductive work for the family are less than hers (cf. Delphy 1984). But to the extent that childrearing counts as leisure activity, as play, as activity held to be intrinsically valuable (Ferguson 2004), no exploitation is involved. Perhaps childrearing and other caring activity is both work and play, but only that portion which is necessary for the psychological growth of the child and the worker(s) counts as work. If so, who determines when that line is crossed? Since non-market activity does not have a clear criterion to distinguish work from non-work, nor necessary from non-necessary social labor, an arbitrary element seems to creep in that makes standards of fairness difficult to apply to gendered household bargains between men and women dividing up waged and non-waged work (Barrett 1980).
One solution to this problem is simply to take all household activity that could also be done by waged labor (nannies, domestic servants, gardeners, chauffeurs, etc.) as work and to figure its comparable worth by the waged labor necessary to replace it (Folbre 1982, 1983). Another is to reject altogether the attempts to base women’s oppression on social relations of work, on the grounds that such theories are overly generalizing and ignore the discrete meanings that kinship activities have for women in different contexts (Nicholson 1991; Fraser and Nicholson 1991; Marchand 1995). Or, one can argue that although the line between work and leisure changes historically, those doing the activity should have the decisive say as to whether their activity counts as work, i.e., labor necessary to promote human welfare. The existence of second wave women’s movements’ critiques of the “second shift” of unpaid household activity indicates that a growing number of women see most of it as work, not play (cf. Hochschild 1989). Finally, one can argue that since the human care involved in taking care of children and elders creates a public good, it should clearly be characterized as work, and those who are caretakers, primarily women, should be fairly compensated for it by society or the state (Ferguson and Folbre 2000; Folbre 2000; Ferguson 2004).
Liberal, Marxist and radical feminists have all characterized women as doubly alienated in capitalism because of the public/private split that relegates their work as mothers and houseworkers to the home, and psychologically denies them full personhood, citizenship and human rights (Foreman 1974, Okin 1989, Pateman 1988, Goldman 1969). Noting that women workers earn on average only about 70% of the average salary of men in the contemporary U.S., feminists have claimed this is because women’s work, tied stereotypically to housework and hence thought unskilled, is undervalued, whether it is cleaning or rote service work, or nurturing work thought to be connected to natural maternal motivations and aptitudes. Hence some feminists have organized campaigns for “comparable worth” to raise women’s wages to the level of men’s wages for work involving comparable skills (Brenner 2000; cf. also articles in Hansen and Philipson eds. 1990).
Many radical feminists maintain that women’s work is part of a separate patriarchal mode of reproduction that underlies all economic systems of production and in which men exploit women’s reproductive labor (Delphy 1984; O’Brien 1981; Leghorn and Parker 1981; Rich 1980; Mies 1986). Smith (1974), O’Brien (1981), Hartsock (1983 a,b), Haraway (1985) and Harding (1986) pioneered in combining this radical feminist assumption with a perspectival Marxist theory of knowledge to argue that one’s relation to the work of production and reproduction gave each gender and each social class a different way of knowing the social totality. Women’s work, they argued, ties them to nature and human needs in a different way than men’s work does, which creates the possibility of a less alienated and more comprehensive understanding of the workings of the social totality. Collins argues further that the racial division of labor, institutional racism and different family structures put African American women in yet a different epistemic relation to society than white and other women (1990, 2000). Writing in a post-modernist re-articulation of this feminist standpoint theory, Donna Haraway argues that the breakdown of the nature/culture distinction because of scientific technology and its alteration of the human body makes us into “cyborgs.” Hence our perspectives are so intersectional that they cannot be unified simply by a common relation to work. What is required for a feminist politics is not a situated identity politics, whether of gender and/or race and/or class, but an affinity politics based on alliances and coalitions that combine epistemic perspectives (Haraway 1985).
Like these radical feminists, some socialist-feminists have tried to develop a “dual systems” theory (cf. Young 1981). This involves theorizing a separate system of work relations that organizes and directs human sexuality, nurturance, affection and biological reproduction. Rather than seeing this as an unchanging universal base for patriarchy, however, they have argued that this system, thought of as the “sex/gender system” (Rubin 1975; Hartmann 1978, 1981a,b), or as “sex/affective production” (Ferguson 1989, 1991; Ferguson and Folbre 1981) has different historical modes, just as Marx argued that economies do. Rubin argues that sex/gender systems have been based in different kinship arrangements, most of which have supported the exchange of women by men in marriage, and hence have supported male domination and compulsory heterosexuality. She is hopeful that since capitalism shifted the organization of the economy from kinship to commodity production, the power of fathers and husbands over daughters and wives, and the ability to enforce heterosexuality, will continue to decline, and women’s increasing ability to be economically independent will lead to women’s liberation and equality with men.
With a different historical twist, Hartmann argues that a historical bargain was cemented between capitalist and working class male patriarchs to shore up patriarchal privileges that were being weakened by the entrance of women into wage labor in the 19th century. The bargain was the creation of the “family wage,” which allowed men sufficient wages to support a non-wage-earning wife and children at home (1981a). While Ferguson and Folbre (1981) agree that there is no inevitable fit between capitalism and patriarchy, they argue that there are conflicts, and that the family wage bargain has broken down at present. Indeed, both Ferguson and Smart (1984) argue that welfare state capitalism and the persistent sexual division of wage labor, in which work coded as women’s is paid less than men’s and has less job security, are ways that a “public patriarchy” has replaced different systems of family patriarchy that were operating in early and pre-capitalist societies. Walby (1990) has a similar analysis, but to her the connection between forms of capitalism and forms of patriarchy is more functional and less accidental than it appears to Ferguson and Smart.
Walby argues that there are two different basic forms of patriarchy which emerge in response to the tensions between capitalist economies and patriarchal household economies: private and public patriarchy. Private patriarchy as a form is marked by excluding women from economic and political power, while public patriarchy works by segregating women. There is a semi-automatic re-adjustment of the dual systems when the older private father patriarchy based on the patriarchal family is broken down due to the pressures of early industrial capitalism. The family wage and women’s second class citizenship that marked that initial re-adjustment are then functionally replaced by a public form of patriarchy, the patriarchal welfare state, where women enter the wage labor force permanently but in segregated, less well-paid jobs. But Ferguson (1989, 1991), Smart (1984) and Folbre (1994) suggest that although the patriarchal control of fathers and husbands over wife and children as economic assets has been diminished in advanced capitalism, there is always a dialectical and contradictory tension between patriarchy and capitalism in which both advances and retreats for women’s equality as citizens and in work relations are constantly occurring in the new form of public patriarchy. Thus, the new “marriage” of patriarchal capitalism operates to relegate women to unpaid or lesser paid caring labor, whether in the household or in wage labor, thus keeping women by and large unequal to men. This is especially notable in the rise of poor single-mother-headed families. However, as it forces more and more women into wage labor, women are given opportunities for some independence from men and the possibility to challenge male dominance and sex segregation in all spheres of social life. Examples are the rise of the first and second wave women’s movements and consequent gains in civil rights for women.
The socialist-feminist idea that there are two interlocking systems that structure gender and the economy, and thus are jointly responsible for male domination, has been developed in a psychological direction by the psychoanalytic school of feminist theorists. Particularly relevant to the question of women and work are the theories of Mitchell (1972, 1974), Kuhn and Wolpe (1978), Chodorow (1978, 1979, 1982) and Ruddick (1989). Mothering, that is, taking care of babies and small children, as a type of work done overwhelmingly by women, socializes women and men to have different identities, personalities and skills. In her first work (1972), Mitchell argues that women’s different relations to productive work, reproduction, socialization of children and sexuality in patriarchy give them lesser economic and psychological power in relation to men. In a Freudian vein, Mitchell later argues (1974) that women learn that they are not full symbolic subjects because compulsory heterosexuality and the incest taboo bar them from meeting either the desire of their mother or any other woman. Chodorow, also reading Freud from a feminist perspective, suggests that women’s predominance in mothering work is the basis for the learned gender distinction between women and men. The sexual division of infant care gives boys, who must learn their masculine identity by separating from their mother and the feminine, a motive for deprecating, as well as dominating, women. Ruddick, from a more Aristotelian perspective, suggests that it is the skills and virtues required in the practice of mothering work which not only socially construct feminine gender differently from men’s, but could ground an alternative vision for peace and resolving human conflicts, if a peace movement were led by women.
Ferguson argues that the “sex/affective” work of mothering and wifely nurturing is exploitative of women: women give more nurturance and satisfaction (including sexual satisfaction) to men and children than they receive, and do much more of the work of providing these important human goods (cf. also Bartky 1990). The gendered division of labor has both economic and psychological consequences, since women’s caring labor makes women less capable of or motivated to separate from others, and hence less likely to protest such gender exploitation (Ferguson 1989, 1991). Folbre argues by contrast that women acquiesce to such inequalities only because their bargaining power is less than men’s, owing to the power relations involved in the gender division of labor and property (Folbre 1982). Ferguson argues that gendered exploitation in a system of meeting human needs suggests that women can be seen as a “sex class” (or gender class) which cuts across economic class lines (1979, 1989, 1991). This line of thought is also developed by Christine Delphy (1984), Monique Wittig (1980) and Luce Irigaray (1975).
On the other side of the debate, Brenner (2000) argues that women are not uniformly exploited by men across economic class lines: indeed, for working class women their unpaid work as housewives serves the working class as a whole, because the whole class benefits when its daily and future reproduction needs are met by women’s nurturing and childcare work. She argues further that middle and upper class women’s economic privileges will inevitably lead them to betray working class women in any cross-class alliance that is not explicitly anti-capitalist. Hochschild (2000) and hooks (2000) point out that career women tend to pay working class women to do the second shift work in the home so they can avoid that extra work, and they have an interest in keeping such wages, e.g., for house cleaning and nannies, as low as possible to keep the surplus for themselves. Kollias (1981) argues further that working class women are in a stronger political position to work effectively for women’s liberation than middle class women, while McKenny (1981) argues that professional women have to overcome myths of professionalism that keep them feeling superior to working class women and hence unable to learn from or work with them for social change.
Several authors have explored the ethical implications of the sexual division of labor in which it is primarily women who do caring labor. Nancy Fraser (1997) and Susan Moller Okin (1989) formulate ethical arguments to maintain that a just model of society would have to re-structure work relations so that the unpaid and underpaid caring labor now done primarily by women would be given a status equivalent to (other) wage labor by various means. In her council socialist vision, Ferguson (1989, 1991) argues that an ideal society would require both women and men to do the hitherto private unpaid work of caring or “sex/affective labor.” For example, such work would be shared by men, either in the family and/or provided by the state where appropriate (as for elders and children’s childcare), and compensated fairly by family allowances (for those, women or men, doing the major share of housework), and by higher pay for caring wage work (such as daycare workers, nurses, and teachers).
Carol Gilligan (1982) claims that women and girls tend to use a different form of ethical reasoning (she terms this the “ethics of care”) than men and boys, who use an ethics of justice. Some have argued that this different ethical approach is due to women’s caring sensibilities that have been developed by the sexual division of labor (Ruddick 1989). Interestingly, the debate between feminist theorists of justice, e.g., Fraser and Okin, and ethics of care feminists such as Gilligan and Ruddick, is less about substance than a meta-ethical dispute as to whether ethics should concern principles or judgments in particular cases. All of these theorists seem to have ideal visions of society which dovetail: all would support the elimination of the sexual division of labor so that both men and women could become equally sensitized to particular others through caring work.
Useful anthologies of the first stage of second wave socialist feminist writings which include discussions of women, class and work from psychological as well as sociological and economic perspectives are Eisenstein (1979), Hansen and Philipson (1990), Hennessy and Ingraham (1997), and Holmstrom (2002). Jaggar (1983) wrote perhaps the first philosophy text explaining the categories of liberal, radical, Marxist and socialist-feminist thought and defending a socialist-feminist theory of male domination based on the notion of women’s alienated labor. Others such as Jaggar and Rothenberg (1978), Tuana and Tong (1995) and Herrmann and Stewart (1993) include classic socialist feminist analyses in their collections, inviting comparisons of the authors to others grouped under the categories of liberal, radical, psychoanalytic, Marxist, postmodern, postcolonial and multicultural feminisms.
Various post-modern critiques of these earlier feminist schools of thought, such as post-colonialism as well as deconstruction and post-structuralism, challenge the over-generalizations and economic reductionism of many of those constructing feminist theories that fall under the early categories of liberal, radical, Marxist or socialist-feminism (cf. Nicholson 1991; Fraser and Nicholson 1991; hooks 1984, 2000; Anzaldua and Moraga, eds. 1981). Others argue that part of the problem is the master narratives of liberalism or Marxism, the first of which sees all domination relations as due to traditional hierarchies and as undermined by capitalism, thus ignoring the independent effectivity of racism (Josephs 1981); and the second of which ties all domination relations to the structure of contemporary capitalism and ignores the non-capitalist economic contexts in which many women work, even within so-called capitalist economies, such as housework and voluntary community work (Gibson-Graham 1996).
In spite of the “pomo” critiques, there are some powerful thinkers within this tendency who have not completely rejected a more general starting point of analysis based on women, class and work. For example, Spivak (1988), Mohanty (1997), Carby (1997), and Hennessy (1993, 2000) are creating and re-articulating forms of Marxist and socialist-feminism less susceptible to charges of over-generalization and reductionism, and more compatible with close contextual analysis of the power relations of gender and class as they relate to work. They can be grouped loosely with a tendency called materialist feminism that incorporates some of the methods of deconstruction and post-structuralism (Hennessy 1993; Landry and MacLean 1993; and the online paper by Ferguson in the Other Internet Resources section).
Many in the contemporary feminist theory debate are interested in developing concrete “intersectional” or “integrative feminist” analyses of particular issues which try to give equal weight to gender, race, class and sexuality in a global context without defining themselves by the categories of the earlier feminist debate, such as liberal, radical or materialist (cf. work by Davis 1983; Brewer 1995; Crenshaw 1997; Stanlie and James 1997; Anzaldua; hooks 1984, 2000). Nonetheless, a strong emphasis on issues of race and ethnicity can be found in their work on women, class and work. For example, Brewer shows that white and African-American working class women are divided by race in the workforce, and that even changes in the occupational structure historically tend to maintain this racial division of labor. bell hooks argues that women of color and some radical feminists were more sensitive to class and race issues than those, primarily white, feminists whom she labels “reformist feminists” (hooks 2000).
Presupposed in the general theoretical debates concerning the relations between gender, social and economic class, and work are usually definitions of each of these categories that some thinkers would argue are problematic. For example, Tokarczyk and Fay have an excellent anthology on working class women in the academy (1993) in which various contributors discuss the ambiguous positions in which they find themselves by coming from poor family backgrounds and becoming academics. One problem is whether they are still members of the working class in so doing, and if not, whether they are betraying their families of origin by a rise to middle class status. Another is whether they have the same status in the academy, as workers, thinkers and women, as those men or women whose families of origin were middle class or above. Rita Mae Brown wrote an early article on this, arguing that education and academic status did not automatically change a working class woman’s identity, which is based not just on one’s relation to production, but on one’s behavior, basic assumptions about life, and experiences in childhood (Brown 1974). Tokarczyk and Fay acknowledge that the definition of “class” is vague in the U.S. Rather than provide a standard philosophical definition in terms of necessary and sufficient conditions for membership in the working class, they provide a cluster of characteristics and examples of jobs, such as physically demanding, repetitive and dangerous jobs, jobs that lack autonomy and are generally paid badly. Examples of working class jobs they give are cleaning women, waitresses, lumberjacks, janitors and police officers. They then define their term “working class women academics” to include women whose parents had jobs such as these and who are in the first generation in their family to attend college (Tokarczyk and Fay: 5).
They challenge those that would argue that family origin can be overcome by the present position one has in the social division of labor: simply performing a professional job and earning a salary does not eradicate the class identity formed in one’s “family class” (cf. Ferguson 1979).
To theorize the problematic relation of women to social class, Ferguson (1979, 1989, 1991) argues that there are at least three different variables — an individual’s work, family of origin, and present household economic unit — which relate an individual to a specific socio-economic class. For example, a woman may work on two levels: as a day care worker (working class), but also as a member of a household where she does the housework and mothering/child care, while her husband is a wealthy contractor (petit bourgeois, small capitalist class). If in addition her family of origin is professional middle class (because, say, her parents were college educated academics), the woman may be seen and see herself as either working class or middle class, depending on whether she and others emphasize her present relations of wage work (her individual economic class, which in this case is working class), her household income (middle class) or her family of origin (middle class).
Sylvia Walby deals with this ambiguity of economic class as applying to women as unpaid houseworkers by claiming against Delphy (1984) that the relevant economic sex classes are those who are housewives vs. those who are husbands benefiting from such work, not those of all women and men, whether or not they do or receive housework services (Walby 1990). Ferguson, however, sides with Delphy in putting all women into “sex class”, since all women, being trained into the gender roles of patriarchal wife and motherhood, are potentially those whose unpaid housework can be so exploited. But a woman’s seeing herself as a member of a fourth class category, “sex class,” and hence, in a patriarchal capitalist system, seeing herself as exploited as a woman worker in her wage work and unpaid second shift housework, is not a given but an achieved social identity. Such an identity is usually formed through political organizing and coalitions with other women at her place of employment, in her home and her community. In this sense the concept of sex class is exactly analogous to the concept of a feminist epistemological standpoint: not a given identity or perspective, but one that is achievable under the right conditions.
Realizing the importance of this disjuncture between economic class and sex class for women, Maxine Molyneux (1984) argues in an often-cited article that there are no “women’s interests” in the abstract that can unify women in political struggle. Instead, she theorizes that women have both “practical gender interests” and “strategic gender interests.” Practical gender interests are those that women develop because of the sexual division of labor, which makes them responsible for the nurturant work of sustaining the physical and psychological well-being of children, partners and relatives through caring labor. Such practical gender interests, because they tie a woman’s conception of her own interests as a woman to those of her family, support women’s popular movements for food, water, child and health care, even defense against state violence, which ally them with the economic class interests of their family. Strategic gender interests, on the contrary, may ally women across otherwise divided economic class interests, since they are those, like rights against physical male violence and reproductive rights, which women have as a sex class to eliminate male domination.
Molyneux used her distinctions between practical and strategic gender interests to distinguish between the popular women’s movement in Nicaragua based on demands for economic justice for workers and farmers against the owning classes, demands such as education, health and maternity care, clean water, food and housing, and the feminist movement which emphasized the fight for legal abortion, fathers’ obligation to pay child support to single mothers, and rights against rape and domestic violence. She and others have used this distinction between practical and strategic gender interests to characterize the tension between popular women’s movements and feminist movements in Latin America (Molyneux 2001; Alvarez 1998; Foweraker 1998).
A similar distinction between different types of women’s interests was developed further as a critique of interest group paradigms of politics by Anna Jónasdóttir (1988, 1994). Jónasdóttir argues that women have a common formal interest in votes for women, women’s political caucuses, gender parity demands, and other mechanisms which allow women a way to develop a collective political voice, even though their content interests, that is, their specific needs and priorities, may vary by race and economic class, among others. Her distinctions, and those of Molyneux, have been changed slightly — practical vs. strategic gender needs, rather than interests — to compare and contrast different paradigms of economic development by World Bank feminist theorist Carolyn Moser (1993). Most recently the Jónasdóttir distinctions have been used by Mohanty (1997) to defend and maintain, in spite of postmodernists’ emphasis on intersectional differences, that commonalities in women’s gendered work can create a cross-class base for demanding a collective political voice for women: a transnational feminism which creates a demand for women’s political representation, developing the platform of women’s human rights as women and as workers. Nonetheless, the tension between women’s economic class-based interests or needs and their visionary/strategic gender interests or needs is still always present, and must therefore always be negotiated concretely by popular movements for social justice involving women’s issues.
Another approach to the problematic nature of socio-economic class as it relates to women is found in empirical studies which show how class distinctions are still important for women in their daily lives as a way to compare and contrast themselves with other women and men, even if they do not use the concepts of “working class,” “professional class” or “capitalist class”. Many have pointed out that the concept of class itself is mystified in the U.S. context, but that nonetheless class distinctions still operate because of different structural economic constraints, which act on some differently from others. The Ehrenreichs (1979), in a classic article, argue that this mystification is due to the emergence of a professional-managerial class that has some interests in common with the capitalist class and some with the working class. Whatever its causes, there are empirical studies which show that class distinctions still operate between women, albeit in an indirect way. Barbara Ehrenreich (2001), by adopting the material life conditions of a poor woman, did an empirical study of the lives of women working for minimum wages and found their issues to be quite different from those of middle and upper-class women, and ignored by them. Diane Reay (2004) does an empirical study of women from manual labor family backgrounds and their relation to the schooling of their children, and discovers that they use a discourse that acknowledges class differences of educational access and career possibilities, even though it does not specifically define these by class per se. Similarly, Julie Bettie (2000) does an impressive discourse analysis of the way that Latina high school students create their own class distinctions through concepts such as “chicas,” “cholas” and “trash” to refer to themselves and their peers.
These categories pick out girls as having middle class, working class or poor aspirations by performance indicators such as dress, speech, territorial hang-outs and school achievement, while never mentioning “class” by name. Women’s experiences of growing up working class are presented in the anthology edited by Tea (2003).
Theoretical and empirical debates about the relation of women to class and work, and the implications of these relations for theories of male domination and women’s oppression as well as for other systems of social domination, continue to be important sources of theories and investigations of gender identities, roles and powers in the field of women and gender studies, as well as in history, sociology, anthropology and economics. They also have important implications for epistemology, metaphysics and political theory in the discipline of philosophy, and consequently other disciplines in humanities and the social sciences.
Colonialism is not a modern phenomenon. World history is full of examples of one society gradually expanding by incorporating adjacent territory and settling its people on newly conquered territory. The ancient Greeks set up colonies as did the Romans, the Moors, and the Ottomans, to name just a few of the most notorious examples. Colonialism, then, is not restricted to a specific time or place. Nevertheless, in the sixteenth century, colonialism changed decisively because of technological developments in navigation that began to connect more remote parts of the world. Fast sailing ships made it possible to reach distant ports while sustaining closer ties between the center and colonies. Thus, the modern European colonial project emerged when it became possible to move large numbers of people across the ocean and to maintain political sovereignty in spite of geographical dispersion. This entry uses the term colonialism to describe the process of European settlement and political control over the rest of the world, including the Americas, Australia, and parts of Africa and Asia.
The difficulty of defining colonialism stems from the fact that the term is often used as a synonym for imperialism. Both colonialism and imperialism were forms of conquest that were expected to benefit Europe economically and strategically. The term colonialism is frequently used to describe the settlement of places such as North America, Australia, New Zealand, Algeria, and Brazil that were controlled by a large population of permanent European residents. The term imperialism often describes cases in which a foreign government administers a territory without significant settlement; typical examples include the scramble for Africa in the late nineteenth century and the American domination of the Philippines and Puerto Rico. The distinction between the two, however, is not entirely consistent in the literature. Some scholars distinguish between colonies for settlement and colonies for economic exploitation. Others use the term colonialism to describe dependencies that are directly governed by a foreign nation and contrast this with imperialism, which involves indirect forms of domination.
The confusion about the meaning of the term imperialism reflects the way that the concept has changed over time. Although the English word imperialism was not commonly used before the nineteenth century, Elizabethans already described the United Kingdom as “the British Empire.” As Britain began to acquire overseas dependencies, the concept of empire was employed more frequently. Thus, the traditional understanding of imperialism was a system of military domination and sovereignty over territories. The day-to-day work of government might be exercised indirectly through local assemblies or indigenous rulers who paid tribute, but sovereignty rested with the British. The shift away from this traditional understanding of empire was influenced by the Leninist analysis of imperialism as a system oriented towards economic exploitation. According to Lenin, imperialism was the necessary and inevitable result of the logic of accumulation in late capitalism. Thus, for Lenin and subsequent Marxists, imperialism described a historical stage of capitalism rather than a trans-historical practice of political and military domination. The lasting impact of the Marxist approach is apparent in contemporary debates about American imperialism, a term which usually means American economic hegemony, regardless of whether such power is exercised directly or indirectly (Young 2001).
Given the difficulty of consistently distinguishing between the two terms, this entry will use colonialism as a broad concept that refers to the project of European political domination from the sixteenth to the twentieth centuries that ended with the national liberation movements of the 1960s. Post-colonialism will be used to describe the political and theoretical struggles of societies that experienced the transition from political dependence to sovereignty. This entry will use imperialism as a broad term that refers to economic, military, and political domination that is achieved without significant permanent European settlement.
The Spanish conquest of the Americas sparked a theological, political, and ethical debate about the legitimacy of using military force in order to acquire control over foreign lands. This debate took place within the framework of a religious discourse that legitimized military conquest as a way to facilitate the conversion and salvation of indigenous peoples. The idea of a “civilizing mission” was by no means the invention of the British in the nineteenth century. The Spanish conquistadores and colonists explicitly justified their activities in the Americas in terms of a religious mission to bring Christianity to the native peoples. The Crusades provided the initial impetus for developing a legal doctrine that rationalized the conquest and possession of infidel lands. Whereas the Crusades were initially framed as defensive wars to reclaim Christian lands that had been conquered by non-Christians, the resulting theoretical innovations played an important role in subsequent attempts to justify the conquest of the Americas. The core claim was that the “Petrine mandate” to care for the souls of Christ’s human flock required Papal jurisdiction over temporal as well as spiritual matters, and this control extended to non-believers as well as believers.
Even the spread of Christianity, however, did not provide an unproblematic justification for the project of overseas conquest. The Spanish conquest of the Americas was taking place during a period of reform when humanist scholars within the Church were increasingly influenced by the natural law theories of theologians such as St. Thomas Aquinas. According to Pope Innocent IV, war could not be waged against infidels and they could not be deprived of their property simply because of their non-belief. Under the influence of Thomism, Innocent IV concluded that force was legitimate only in cases where infidels violated natural law. Thus nonbelievers had legitimate dominion over themselves and their property, but this dominion was abrogated if they proved incapable of governing themselves according to principles that every reasonable being would recognize. The Spanish quickly concluded that the habits of the native Americans, from nakedness to unwillingness to labor to alleged cannibalism, clearly demonstrated their inability to recognize natural law. From this, they legitimized the widespread enslavement of the Indians as the only way of teaching them civilization and introducing them to Christianity.
Many of the Spanish missionaries sent to the New World, however, immediately noticed that the brutal exploitation of slave labor was widespread while any serious commitment to religious instruction was absent. Members of the Dominican order in particular noted the hypocrisy of enslaving the Indians because of their alleged barbarity while practicing a form of conquest, warfare, and slavery that reduced the indigenous population of Hispaniola from 250,000 to 15,000 in two decades of Spanish rule. Given the genocidal result of Spanish “civilization,” they began to question vocally the idea of a civilizing mission. Bartolomé de Las Casas and Franciscus de Victoria were two of the most influential critics of Spanish colonial practice. Victoria gave a series of lectures on Indian rights that applied Thomistic humanism to the practice of Spanish rule. He argued that all human beings share the capacity for rationality and have natural rights that stem from this capacity. From this premise, he deduced that the Papal decision to grant Spain title to the Americas was illegitimate. Unlike the position of Pope Innocent IV, Victoria argued that neither the Pope nor the Spaniards could subjugate the Indians in order to punish violations of natural law, such as fornication or adultery. He noted that the Pope has no right to make war on Christians and take their property simply because they are “fornicators or thieves.” If this were the case, then no European king’s dominion would ever be safe. Furthermore, according to Victoria, the pope and Christian rulers acting on his mandate have even less right to enforce laws against unbelievers, because they are outside of the Christian community, which is the domain of Papal authority (Williams 1990).
Despite this strongly worded critique of the dominant modes of justifying Spanish conquest, Victoria concluded that the use of force in the New World was legitimate in cases when Indian communities violated the Law of Nations, a set of principles derivable from reason and therefore universally binding. At first it might sound contradictory that Victoria concluded that the Indians’ supposed violation of the law of nature did not justify conquest but their violation of the Law of Nations, itself derived from natural law, did. Victoria emphasized that the Law of Nations is binding because “there exists clearly enough a consensus of the greater part of the whole world” (391) and because the principles benefit “the common good of all.” This distinction seems to rely on the assumption that other principles usually associated with natural law (such as the prohibitions on adultery and idolatry) only affect those who consent to the practices, whereas violations of the Law of Nations (e.g. prohibitions on peaceful travel and trade) have consequences for those who do not consent. Ultimately, Victoria’s understanding of the Law of Nations led him to defend the practice of Spanish colonialism, even as he emphasized that the Spanish remedy of warfare should be limited to minimal measures required to attain the legitimate objectives of peaceful trade and missionary work. Within Victoria’s critique of the legality and morality of Spanish colonialism was a rationalization for conquest, albeit a restrictive one.
The legitimacy of colonialism was also a topic of debate among French, German, and British philosophers in the eighteenth and nineteenth centuries. Enlightenment thinkers such as Kant, Smith and Diderot were critical of the barbarity of colonialism and challenged the idea that Europeans had the obligation to “civilize” the rest of the world. At first it might seem relatively obvious that Enlightenment thinkers would develop a critique of colonialism. The system of colonial domination, which involved some combination of slavery, quasi-feudal forced labor, or expropriation of property, is antithetical to the basic Enlightenment principle that each individual is capable of reason and self-government. The rise of anti-colonial political theory, however, required more than a universalistic ethic that recognized the shared humanity of all people. As suggested above, the universalism and humanism of Thomism proved to be a relatively weak basis for criticizing colonialism. Given the tension between the abstract universalism of natural law and the actual cultural practices of indigenous peoples, it was easy to interpret native difference as evidence for the violation of natural law. This in turn became a justification for exploitation.
Diderot was one of the most forceful critics of European colonization. In his Histoire des deux Indes, he challenged the view that indigenous people benefit from European civilization and argued that the European colonists are the uncivilized ones. He claimed that culture (“national character”) helps to inculcate morality and reinforces norms of respect, but these norms tend to dissipate when the individual is far from his country of origin. Colonial empires, he believed, frequently become the sites of extreme brutality because when the colonists were far away from legal institutions and informal sanctions, the habits of restraint fell away, exposing natural man’s full instinct for violence (Muthu 2003).
In Book VIII of Histoire des deux Indes, Diderot also refutes the dominant justifications for European colonialism. Although he grants that it is legitimate to colonize an area that is not actually inhabited, he insists that foreign traders and explorers have no right of access to fully inhabited lands. This is important because the right to commerce (understood to encompass not only trade but also missionary work and exploration) was used as a justification for colonization by Spanish thinkers in the sixteenth and seventeenth centuries. Emblematic of this approach was Victoria’s conclusion that an indigenous people could not exclude peaceful traders and missionaries without violating the Law of Nations. If the native peoples resisted these incursions, the Spanish could legitimately wage war and conquer their territory. Diderot specifically challenged this view, noting that the European traders have proven themselves “dangerous as guests” (Muthu 2003: 75).
Before Enlightenment thinkers could articulate a compelling critique of colonialism, they had to recognize the importance of culture and the possibility of cultural pluralism. The claim that all individuals are equally worthy of dignity and respect was a necessary but not sufficient basis for anti-imperialist thought. They also had to recognize that the tendency to develop diverse institutions, narratives, and aesthetic practices was an essential human capacity. The French term moeurs, or what today would be called culture, captures the idea that the humanity of human beings is expressed in the distinctive practices that they adopt as solutions to the challenges of existence.
The work of Enlightenment anti-imperialists such as Diderot and Kant reflects their struggle with the tension between universalistic concepts such as human rights and the realities of cultural pluralism. The paradox of Enlightenment anti-imperialism is that human dignity is understood to be rooted in the universal human capacity for reason. Yet when people engage in cultural practices that are unfamiliar or disturbing to the European observer, they appear irrational and thus undeserving of recognition and respect. Diderot’s solution was to identify particularity as the universal human trait. In other words, he emphasized that human beings all share similar desires to create workable rules of conduct that allow particular ways of life to flourish without themselves creating harsh injustices and cruelties (Muthu 2003: 77). There are infinite varieties of solutions to the challenges posed by human existence. Societies all need to find a way to balance individual egoism and sociability and to overcome the adversities that stem from the physical environment. From this perspective, culture itself, rather than rationality, is the universal human capacity.
Unlike many other eighteenth- and nineteenth-century political philosophers, Diderot did not assume that non-Western societies were necessarily primitive (e.g. lacking political and social organization) nor did he assume that more complex forms of social organization were necessarily superior. One of the key issues that distinguished critics from proponents of colonialism and imperialism was their view of the relationship between culture, history and progress. Most of the influential philosophers writing in France and England in the eighteenth and nineteenth centuries had assimilated some version of the developmental approach to history that was associated with the Scottish Enlightenment. While the Scots quite consciously took their lead from Montesquieu, they went on to develop a unique and profoundly influential eighteenth-century historical narrative known as the four-stages thesis. In that story, all societies were imagined as naturally moving from hunting, to herding, to farming, to commerce, a developmental process that simultaneously tracked a cultural arc from “savagery,” through “barbarism,” to “civilization.” This meant that for the Scots, “civilization” was not just a marker of material improvement, but also a normative judgment about the moral progress of society. The Scottish Enlightenment thinkers were central to the creation of an historical imaginary that described a civilizing process, one marked most significantly by increasing refinement in modes of social interaction, which they saw as tied to the advent of commercial society. This, in turn, produced a historical narrative that celebrated the emergence of a shared Western civilization based on the emergence of wealth and commerce (Kohn and O’Neill 2006).
The language of civilization, savagery, and barbarism is pervasive in writers as diverse as Edmund Burke, Karl Marx, and John Stuart Mill. It would therefore be incorrect to conclude that a developmental theory of history is somehow particular to the liberal tradition; nevertheless, given that figures of the Scottish Enlightenment such as Ferguson and Smith were among its leading expositors, it is strongly associated with liberalism. Smith himself opposed imperialism for economic reasons. He felt that relations of dependence between metropole and periphery distorted self-regulating market mechanisms and worried that the cost of military domination would be burdensome for taxpayers (Pitts 2005). The idea that civilization is the culmination of a process of historical development, however, proved useful in justifying imperialism. According to Uday Mehta, liberal imperialism was the product of the interaction between universalism and developmental history (1999). A core doctrine of liberalism holds that all individuals share a capacity for reason and self-government. The theory of developmental history, however, modifies this universalism with the notion that these capacities only emerge at a certain stage of civilization. For example, according to John Stuart Mill (hereafter Mill), savages do not have the capacity for self-government because of their excessive love of freedom. Serfs, slaves, and peasants in barbarous societies, on the other hand, may be so schooled in obedience that their capacity for rationality is stifled. Only in commercial society are the material and cultural conditions ideal for individuals to realize and exercise their potential. The consequence of this logic is that civilized societies like Great Britain are acting in the interest of less-developed peoples by governing them.
Imperialism, from this perspective, is not primarily a form of political domination and economic exploitation but rather a paternalistic practice of government that exports “civilization” (e.g. modernization) in order to foster the improvement of native peoples. Despotic government (and Mill doesn’t hesitate to use this term) is a means to the end of improvement and ultimately self-government.
Of course, Mill, a life-long employee of the British East India Company, recognized that despotic government by a foreign people could lead to injustice and economic exploitation. These abuses, in turn, if unchecked, could undermine the legitimacy and efficacy of the imperial project. In Considerations on Representative Government (1861), Mill identified four reasons why foreign peoples were not suited to governing dependencies. First, metropolitan politicians were unlikely to have the knowledge of local conditions that was necessary for effectively solving problems of public policy. Second, given cultural, linguistic, and often religious differences, European colonists were unlikely to sympathize with the native peoples and more likely to act tyrannically. Third, even if the Englishmen abroad really tried to act fairly to native peoples, their natural tendency to sympathize with those similar to themselves (other foreign colonists or merchants) would likely lead to distorted judgment in cases of disputes. Finally, British colonists and merchants went abroad primarily to acquire wealth with no long-term investment and little effort, which meant that their economic activity was likely to exploit rather than develop the country. These arguments also echoed points made in Edmund Burke’s voluminous writings assailing the misgovernment in India, most notably Burke’s famous Speech on Fox’s East India Bill (1783).
For Mill, parliamentary oversight was no solution. First of all, it would politicize decisions, making imperial policy a result of the factional struggles of party politics rather than technocratic expertise. Furthermore, given that members of the House of Commons were accountable to their domestic electors, it would guarantee that imperial policy would be aimed exclusively at maximizing British self-interest rather than promoting good government and economic development in the dependencies. Mill’s solution to the problem of imperial misgovernment was to eschew parliamentary oversight in favor of a specialized administrative corps. Members of this specialized body would have the training to acquire relevant knowledge of local conditions. Paid by the government, they would not personally benefit from economic exploitation and could fairly arbitrate conflicts between colonists and indigenous people. Mill, however, was not able to explain how to ensure good government in a situation in which those wielding political power were not accountable to the population. In this sense, Mill’s writing is emblematic of the failure of liberal imperial thought.
Nineteenth century liberal thinkers held a range of views on the legitimacy of foreign domination and differed about what tactics should be used to achieve that goal. Alexis de Tocqueville, for example, made a case for colonialism that did not rely on the idea of a “civilizing mission.” Tocqueville recognized that colonialism probably did not bring good government to the native peoples, but this was irrelevant since his justification rested entirely on the benefit to France. Tocqueville insisted that French colonies in Algeria would increase France’s stature vis-à-vis rivals like England; they would provide an outlet for excess population that was a cause of disorder in France; and imperial endeavors would incite a feeling of patriotism that would counterbalance the modern centrifugal forces of materialism and class conflict.
Tocqueville was actively engaged in advancing the project of French colonization of Algeria. Tocqueville’s first analysis of French colonialism was published during his 1837 electoral campaign for a seat in the Chamber of Deputies. As a member of the Chamber of Deputies, Tocqueville argued in favor of expanding the French presence in Algeria. He traveled to Algeria in 1841, composing an “Essay on Algeria” that served as the basis for two parliamentary reports on the topic (Tocqueville 2001). Unlike the more naïve proponents of the “civilizing mission,” Tocqueville recognized that the brutal military occupation did little to introduce good government or advance civilization. In an apparent reversal of the four-stages theory of the Scottish Enlightenment, he acknowledged that “we are now fighting far more barbarously than the Arabs themselves” and “it is on their side that one meets with civilization” (Tocqueville 2001: 70). This realization, however, was not the basis of a critique of French brutality. Instead, Tocqueville defended controversial tactics such as destroying crops, confiscating land, and seizing unarmed civilians. His texts, however, provide little in the way of philosophical justification and dismiss the entire just war tradition with a curt statement that “I believe that the right of war authorizes us to ravage the country” (Tocqueville 2001: 70). In Tocqueville’s writing on Algeria, the French national interest is paramount and moral considerations are explicitly subordinate to political goals.
Tocqueville’s analysis of Algeria reflects little anxiety about its legitimacy and much concern about the pragmatics of effective colonial governance. The stability of the regime, he felt, depended on the ability of the colonial administration to provide good government to the French settlers. Tocqueville emphasized that the excessive centralization of decision-making in Paris combined with the arbitrary practices of the local military leadership meant that French colonists had no security of property, let alone the political and civil rights that they were accustomed to in France. Tocqueville was untroubled by the use of martial law against indigenous peoples, but felt that it was counterproductive when applied to the French. For Tocqueville, the success of the French endeavor in Algeria depended entirely on attracting large numbers of permanent French settlers. Given that it was proving impossible to win the allegiance of the indigenous people, France could not hold Algeria without creating a stable community of colonists. The natives were to be ruled through military domination and the French were to be enticed to settle through the promise of economic gain in an environment that reproduced, as much as possible, the cultural and political life of France. After a brief period of optimism about “amalgamation” of the races in his “Second Letter on Algeria” (Tocqueville 2001: 25), Tocqueville understood the colonial world in terms of the permanent opposition of settler and native, an opposition structured to ensure the economic benefit of the former.
In recent years, scholars have devoted less attention to the debates on colonialism within the Marxist tradition. This reflects the waning influence of Marxism in the academy and in political circles more generally. Marxism, however, has been extremely influential on both post-colonial theory and anti-colonial independence movements around the world. Marxists have drawn attention to the material basis of European political expansion and developed concepts that help explain the persistence of economic exploitation after the end of direct political rule.
Although Marx never developed a theory of colonialism, his analysis of capitalism emphasized its inherent tendency to expand in search of new markets. In his classic works such as The Communist Manifesto, Grundrisse, and Capital, Marx predicted that the bourgeoisie would continue to create a global market and undermine any local or national barriers to its own expansion. Expansion is a necessary product of the core dynamic of capitalism: overproduction. Competition among producers drives them to cut wages, which in turn leads to a crisis of under-consumption. The only way to prevent economic collapse is to find new markets to absorb excess consumer goods. From a Marxist perspective, some form of imperialism is inevitable. By exporting population to resource rich foreign territories, a nation creates a market for industrial goods and a reliable source of natural resources. Alternatively, weaker countries can face the choice of either voluntarily admitting foreign products that will undermine domestic industry or submitting to political domination, which will accomplish the same end.
In a series of newspaper articles published in the 1850s in the New York Daily Tribune, Marx specifically discussed the impact of British colonialism in India. His analysis was consistent with his general theory of political and economic change. He described India as an essentially feudal society experiencing the painful process of modernization. According to Marx, however, Indian “feudalism” was a distinctive form because he believed (incorrectly) that agricultural land in India was owned communally. Marx used the concept of “Oriental despotism” to describe a specific type of class domination that used the mechanism of the state and taxation in order to extract resources from the peasantry. Oriental despotism emerged in India because agricultural productivity depended on large-scale public works that could only be financed by the state, particularly irrigation. This meant that the state could not be easily replaced by a more decentralized system of authority. In Western Europe, feudal property could be transformed gradually into privately owned, alienable property in land. In India, communal land ownership made this impossible, thereby blocking the development of commercial agriculture and free markets. Since “oriental despotism” inhibited the indigenous development of economic modernization, British domination became the agent of economic modernization.
Marx’s analysis of colonialism as a progressive force bringing modernization to a backward feudal society sounds like a transparent rationalization for foreign domination. His endorsement of British domination, however, reflects the same ambivalence that he shows towards capitalism in Europe. In both cases, Marx recognized the immense suffering brought about during the transition from feudal to bourgeois society while insisting that the transition is both necessary and ultimately progressive. He argued that the penetration of foreign commerce is causing a social revolution in India. For Marx, this upheaval has both positive and negative ramifications. When peasants lose their traditional livelihoods, there is a great deal of human suffering, but he also pointed out that the traditional village communities were hardly idyllic; they were sites of caste oppression, slavery, misery, and cruelty. The first stage of this process is entirely negative, because it involves heavy burdens of taxation to support British rule and economic upheaval due to the glut of cheaply produced English cotton. Eventually, however, British merchants begin to realize that Indians cannot pay for imported cloth or administrators if they don’t efficiently produce goods to trade, which provides an incentive for British investment in production and infrastructure. Even though Marx believed that British rule was motivated by greed and exercised through cruelty, he felt it was still unwittingly the agent of progress. Thus, Marx’s discussion of British rule in India has three dimensions: an account of the progressive character of foreign rule, a critique of the human suffering involved, and a concluding argument that British rule must be temporary if the progressive potential it unleashed is to be realized.
Lenin developed his analysis of Western economic and political domination in his pamphlet Imperialism: The Highest Stage of Capitalism (1917) (see Other Internet Resources). Unlike Marx, Lenin took a more explicitly critical view of imperialism. He noted that imperialism was a technique which allowed European countries to put off the inevitable domestic revolutionary crisis by exporting their own economic burdens onto weaker states. Lenin argued that late-nineteenth century imperialism was driven by the economic logic of late-capitalism. The falling rate of profit caused an economic crisis which could only be resolved through territorial expansion. Capitalist conglomerates were compelled to expand beyond their national borders in pursuit of new markets and resources. In a sense, this analysis is fully consistent with Marx, who saw European colonialism as continuous with the process of internal expansion within states and across Europe. From this perspective, colonialism and imperialism resulted from the same logic that drove the economic development and modernization of peripheral areas in Europe. But there was one difference. Since late capitalism was organized around national monopolies, the competition for markets took the form of military competition between states over territories that could be dominated for their exclusive economic benefit.
Marxist theorists including Rosa Luxemburg, Karl Kautsky, and Nikolai Bukharin also explored the issue of imperialism. Kautsky’s position is especially important because his analysis introduced concepts that continue to play a prominent role in contemporary world-system theory and post-colonial studies. Kautsky challenged the assumption that imperialism would lead to the development of the areas subjected to economic exploitation. He suggested that imperialism was a relatively permanent relationship structuring the interactions between two types of countries (Young 2001). Although imperialism initially took the form of military competition between capitalist countries, it would result in collusion between capitalist interests to maintain a stable system of exploitation of the non-developed world. The most influential contemporary proponent of this view is Immanuel Wallerstein, who is known for world-system theory. According to this theory, the world-system involves a relatively stable set of relations between core and peripheral states that functions as an international division of labor structured to benefit the core states (Wallerstein 1974-1989).
From the perspective of world-system theory, the economic exploitation of the periphery does not necessarily require direct political or military domination. In a similar vein, contemporary literary theorists have drawn attention to practices of representation that reproduce a logic of subordination that endures even after former colonies gain independence. The field of postcolonial studies was established by Edward Said in his path-breaking book Orientalism. In Orientalism Said applied Michel Foucault’s technique of discourse analysis to the production of knowledge about the Middle East. The term orientalism described a structured set of concepts, assumptions, and discursive practices that were used to produce, interpret, and evaluate knowledge about non-European peoples. Said’s analysis made it possible for scholars to deconstruct literary and historical texts in order to understand how they reflected and reinforced the imperialist project. Unlike previous studies that focused on the economic or political logics of colonialism, Said drew attention to the relationship between knowledge and power. By foregrounding the cultural and epistemological work of imperialism, Said was able to undermine the ideological assumption of value-free knowledge and show that “knowing the Orient” was part of the project of dominating it. Thus, Orientalism can be seen as an attempt to extend the geographical and historical terrain of the poststructuralist critique of Western epistemology.
Said uses the term Orientalism in several different ways. First, Orientalism is a specific field of academic study about the Middle East and Asia, albeit one that Said conceives quite expansively as including history, sociology, literature, anthropology and especially philology. He also identifies it as a practice that helps define Europe by creating a stable depiction of its other, its constitutive outside. Orientalism is a way of characterizing Europe by drawing a contrasting image or idea, based on a series of binary oppositions (rational/irrational, mind/body, order/chaos) that manage and displace European anxieties. Finally, Said emphasizes that it is also a mode of exercising authority by organizing and classifying knowledge about the Orient. This discursive approach is distinct both from a vulgar materialist assumption that knowledge is simply a reflection of economic or political interests and from an idealist conviction that scholarship is disinterested and neutral. Following Foucault, Said’s concept of discourse identifies a way in which knowledge is not used instrumentally in service of power but rather is itself a form of power.
The second quasi-canonical contribution to the field of post-colonial theory is Gayatri Spivak’s “Can the Subaltern Speak?” Spivak works within Said’s problematic of representation but extends it to the contemporary academy. By posing the question “Can the subaltern speak?” she asks whether the scholarly interest in non-Western cultures may unwittingly reproduce a new kind of orientalism, whereby academic theorists mine non-Western sources in order to speak authoritatively in their place. Even though the goal is to challenge the existing Eurocentrism of the academy, post-colonial studies is particularly vulnerable to the risks associated with any claim to speak authoritatively on behalf of the subaltern. Thus the field of post-colonial studies is haunted by its own impossibility. It was born out of the recognition that representation is inevitably implicated in power and domination yet struggles to reconfigure representation as an act of resistance. In order to do so, it introduces new strategies of reading and interpretation while recognizing the limitations of this endeavor.
The core problematic of post-colonial theory is an examination of the relationship between power and knowledge in the non-Western world. Some scholars have approached this topic through historical research rather than literary or discursive analysis. The most influential movement is the Subaltern Studies group, which was originally made up of South Asian historians who explored the contribution of non-elites to Indian politics and culture. The term subaltern suggests an interest in social class but more generally it is also a methodological orientation that opens up the study of logics of subordination. Whereas Said raised the broad issue of Orientalism, the Subaltern Studies group dismantled particular hegemonic narratives of Indian colonial history. According to Spivak, the Subaltern Studies group developed two important challenges to the narrative of Indian colonial history as a change from semi-feudalism to capitalist domination. First, they showed that the moment of change must be pluralized as a story of multiple confrontations involving domination and resistance rather than a simple great modes-of-production narrative. Second, these epochal shifts are marked by a multidimensional change in sign-system from the religious to the militant, crime to insurgency, bondsman to worker (Guha and Spivak 1988: 3).
The work of the Subaltern Studies group is emblematic of the way that post-colonial theory often inhabits the terrain between post-structuralism and Marxism, two traditions that have many differences as well as some commonalities. Despite the fact that many practitioners of the field are sympathetic to both traditions, other scholars highlight the incompatibility of the two. For example, Aijaz Ahmad has criticized post-colonialist theory from a Marxist perspective, arguing that its infatuation with issues of representation and discourse makes it blind to the material basis and systematic structure of power relations. The use of concepts such as hybridity easily degenerates into a kind of eclecticism that gestures at radical resistance while denying the theoretical basis of any theory of revolutionary change. Ahmad also argued that the influence of Said’s Orientalism was due not to its originality but, on the contrary, to its conventionality. According to Ahmad, Orientalism benefited from its affinity with two problematic intellectual fashions: the reaction against Marxism that led to the vogue for post-structuralism and the “Third-worldism” that provided academics with a veneer of radicalism. Said, for his part, also developed a sustained critique of Marxism. In Orientalism, Said argued that Marx’s explicit defense of British colonialism was emblematic of his own implication in Orientalist discourse. Furthermore, for Said, Marx’s position was not merely a personal failure but instead reflected a more general problem with totalizing theory that he felt tended to marginalize any signs of difference that undermined Marx’s narrative of progress.
To conclude, it is worth noting that some scholars have begun to question the usefulness of the concept of post-colonial theory. Like the idea of the Scottish four stages theory, a theory with which it would appear to have little in common, the very concept of post-colonialism seems to rely on a progressive understanding of history (McClintock 1992). It suggests, perhaps unwittingly, that the core concepts of hybridity, alterity, particularity, and multiplicity may lead to a kind of methodological dogmatism or developmental logic. Moreover, the term “colonial” as a marker of this domain of inquiry is also problematic in so far as it suggests historically implausible commonalities across territories that experienced very different techniques of domination. Thus, the critical impulse behind post-colonial theory has turned on itself, drawing attention to the way that it may itself be marked by the utopian desire to transcend the trauma of colonialism (Gandhi 1998).