This is the readme for the Illinois Intentional Tort Qualitative Dataset. The dataset was developed by Joseph Blass (joeblass@u.northwestern.edu) as part of his thesis work. He would love your comments and feedback on it!

The dataset currently consists of 88 cases representing 112 claims in Tort, in Assault, Battery, Trespass, and Self-Defense. Each case is contained in its own file. The files are Knowledge Representation Files with extension .krf, designed for use in a system like NextKB. That said, you can open the files as text files or change the file extension to .txt without any trouble.

The dataset is encoded in CycL-style predicate logic representation, of the form ( . . . ). Cases are also scoped within microtheories, keeping metadata about a case separate from the predicate logic representations of that case. Cases are defined within the NextKB ontology, publicly available at https://www.qrg.northwestern.edu/nextkb/index.html .

The rest of the readme explains the different microtheories you can expect to encounter and the predicates used to describe the cases.

CASE SYMBOLS

Every case is given a unique symbol based on the case name. For example, the case Amos v. State has the unique symbol Amos_v_State. This symbol is used to attach information to the case and to create microtheories.

MICROTHEORIES

Microtheories are a way of scoping knowledge contextually. In this dataset, when you see (in-microtheory ), all subsequent facts are scoped in that microtheory until the next in-microtheory statement appears. Microtheories can inherit from one another: within a child microtheory, a reasoner can see all knowledge from the parent microtheories from which the child inherits. Inheritance is expressed using (genlMt CaseLawCorpusCase). All other information is kept in a microtheory specific to the case itself, (CaseLawCorpusMtFn ), or in child microtheories of that case microtheory.
For example, metadata for our case Amos v. State comes after the statement (in-microtheory (CaseLawCorpusMtFn Amos_v_State)).

The submicrotheories of each case are:
- (rawLanguageOutputMtFn ), which contains the raw language output of CNLU for that case;
- (cleanLanguageOutputMtFn ), which contains the language output with erroneous facts removed, without the duplicate unnesting of facts that is a raw output of CNLU, and with temporal ordering statements re-represented to indicate temporal ordering;
- (LegalCaseMtFn ), which contains the cleanest output of the case facts, with extra facts that are reintroduced at each statement (e.g., (isa Plaintiff)) removed;
- (LegalCaseConclusionMtFn ), which contains the predicate logic representations of the case conclusions; and
- variations on the LegalCaseConclusionMtFn (negated; reversed; both) for use in multiple choice tests.

PREDICATES

(caseType ) indicates that the case illustrates the given tort claim. Cases can have multiple such statements, e.g., many cases illustrate both assault and battery.

(caseValence ) indicates whether the case is a Positive or Negative example of the claim.

(caseName ) encodes the original case name of the case.

(caseCourt ) shows which court the case was decided in.

(caseYear ) keeps the case year.

(caseReporter ) contains the index for the particular case reporter where the case was originally reported. Some of these are in Westlaw- or LexisNexis-specific reporters, but most are in Illinois or Federal reporters.

(caseOriginalText ) contains the judicial opinion's original statement of the case facts. These are quotes taken directly from the judicial opinions and only minimally edited for coherence.

(caseConclusion ) contains a plain English description of the case's legal conclusion. Cases can have more than one conclusion.

(caseSimpleText ) contains the simplified English statement of the original case facts. Please refer to the 2022 JURIX paper for a description of how facts were simplified.
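
If you want to inspect the files without NextKB, the in-microtheory scoping described above can be approximated in a few lines of Python. The following is only a minimal sketch, not part of the dataset tooling. It assumes that a .krf file is a sequence of parenthesized expressions, that double-quoted strings may contain parentheses, and that a semicolon starts a comment running to the end of the line. The sample text is hypothetical: the case symbol Doe_v_Roe, the argument layouts, and the values are invented for illustration and are not taken from the dataset.

```python
import re
from collections import defaultdict

# Tokens: quoted strings, ';' comments (assumed syntax), parentheses, bare symbols.
TOKEN = re.compile(r'''"(?:\\.|[^"\\])*"|;[^\n]*|[()]|[^\s()";]+''')


def parse(text):
    """Parse text into nested lists. Atoms stay strings; quoted strings keep their quotes."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok.startswith(";"):
            continue  # skip comments
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def to_text(expr):
    """Render a parsed expression back into its parenthesized form."""
    if isinstance(expr, list):
        return "(" + " ".join(to_text(e) for e in expr) + ")"
    return expr


def facts_by_microtheory(text):
    """Group top-level facts under the most recent in-microtheory statement."""
    facts = defaultdict(list)
    current = None  # facts before any in-microtheory statement land under None
    for expr in parse(text):
        if isinstance(expr, list) and len(expr) == 2 and expr[0] == "in-microtheory":
            current = to_text(expr[1])
        else:
            facts[current].append(expr)
    return facts


# Hypothetical sample; argument order and values are assumptions for illustration.
SAMPLE = '''
;; Hypothetical content, not taken from the dataset.
(in-microtheory (CaseLawCorpusMtFn Doe_v_Roe))
(caseName Doe_v_Roe "Doe v. Roe")
(caseYear Doe_v_Roe 1900)
(in-microtheory (LegalCaseMtFn Doe_v_Roe))
(isa doe Plaintiff)
'''

if __name__ == "__main__":
    for mt, mt_facts in facts_by_microtheory(SAMPLE).items():
        print(mt)
        for fact in mt_facts:
            print("   ", to_text(fact))
```

To use this on a real file, read it as text and pass the contents to facts_by_microtheory. The sketch does not follow genlMt inheritance; it only records which microtheory each fact was written under.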