On Ground Facts

This post details a few related ideas:

The military recently launched a program to help automate the assessment of the accuracy, legitimacy, and validity of hypotheses and findings in the social sciences.
The military has also launched a ground truth program.
My previous work on Truth-Grounding and the Liar Paradox.
Structured News proposal to combat fake news and misinformation.
Recent exposure to the ontologies used by the W3C (World Wide Web Consortium), which define the Semantic Web and the World Wide Web.
How to verify statements and understand their value.

Can these ideas be combined into something practical, implementable, and useful?

Introduction

Automated theorem proving has enjoyed tremendous success and widespread use. Early systems relied on First-Order semantics and representations to verify the accuracy of statements.

Below, a smattering of widely-cited papers implementing such systems in PROLOG:

But these systems have remained largely in the background and have mostly been used to verify statements of pure mathematics and logic.

What about the following practical problems:

Fighting misinformation on social media platforms?
Combating scams and financial fraud through the content used to lure unsuspecting users rather than tracing financial transactions alone?
Settling debates or disputes of a legal or political nature in a fair way?
Assessing the accuracy of scientific and academic research (when determining budgeting and public policy)?

Adjacent Concepts

How can we take an arbitrary sentence P and determine whether it's true or not? How can we do this formally (in a way that can be automated and expressed mathematically)?

Soundness: starting from a set of statements known to be true (usually proven tautologies) called axioms, one may determine that any statement derived from them alone is also true (and a tautology for that matter).

i. Theorem Proving in LSPC

Statistical and probabilistic methods: given a background context (of data, events, or facts), we can reason about the probability that an expression or event is true or will occur.

i. We observed 100,000 throws of a 6-sided die and saw that each face landed about 1/6 of the time.

ii. Given that, there is a 1/6 chance that a 6-sided die will land on a 1 when thrown.
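
To make the dice example concrete, here is a minimal sketch that simulates the throws and estimates each face's relative frequency. The function name `estimate_face_frequencies`, the fixed seed, and the use of a simulated (fair by construction) die are my own assumptions for illustration; real observations would replace the simulation.

```python
import random
from collections import Counter


def estimate_face_frequencies(num_throws: int, seed: int = 0) -> dict[int, float]:
    """Throw a simulated 6-sided die and return the relative frequency of each face."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = Counter(rng.randint(1, 6) for _ in range(num_throws))
    return {face: counts[face] / num_throws for face in range(1, 7)}


frequencies = estimate_face_frequencies(100_000)
for face, freq in frequencies.items():
    print(f"face {face}: {freq:.4f} (1/6 is about {1 / 6:.4f})")
```

The observed frequencies play the role of the background context, and the 1/6 estimate for the next throw is the probabilistic conclusion drawn from it.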

Decidability: a finite method exists to determine the truth-value of an expression.

i. Is there an effective procedure to prove the consistency of ZF Set Theory with the Axiom of Choice?
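
For contrast with the hard question above, here is a sketch of a case where a finite method clearly does exist: propositional logic, where a truth table settles whether a formula is a tautology. Representing formulas as Python functions over an assignment, and the name `is_tautology`, are assumptions made purely for illustration.

```python
from itertools import product
from typing import Callable

# A formula is modeled as a function from a truth assignment to a truth-value.
Formula = Callable[[dict[str, bool]], bool]


def is_tautology(formula: Formula, variables: list[str]) -> bool:
    """Decide the question by checking every one of the 2**n truth assignments."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if not formula(assignment):
            return False  # found a falsifying assignment
    return True


excluded_middle = lambda v: v["P"] or not v["P"]  # P or not P
implication = lambda v: (not v["P"]) or v["Q"]    # P implies Q

print(is_tautology(excluded_middle, ["P"]))
print(is_tautology(implication, ["P", "Q"]))
```

The loop always terminates because there are only finitely many assignments to try, which is what makes this a decision procedure rather than an open-ended search.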

Conceptual Analysis: by clarifying the meanings of terms (concepts) we can evaluate the truth of an expression.

i. P := Bachelors have wives.

ii. Bachelors are unmarried men (this is the colloquial use, scientific meaning, and the definition in the standard dictionary).

iii. Therefore, P is false.

Coherence: given a theory we believe to be true, we can assess whether a second theory is false if it fails to cohere (produces a contradiction or falsehood when combined) with the first.

i. The ancient "four metaphysical humors" worldview does not cohere whatsoever with the worldview advanced by Quantum Mechanics (the most empirically verified scientific theory of all time).
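
A toy version of the coherence test, assuming both theories can be written as propositional sentences over shared atoms: two theories cohere if at least one truth assignment satisfies every sentence of both. The sentences below are abstract placeholders (P, Q), not encodings of any real theory, and `coheres` is a hypothetical helper name.

```python
from itertools import product


def coheres(theory_a, theory_b, variables):
    """Return True if some truth assignment satisfies every sentence of both theories."""
    combined = list(theory_a) + list(theory_b)
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(sentence(assignment) for sentence in combined):
            return True  # a joint model exists, so no contradiction arises
    return False


# Placeholder theories: the first says P and (P implies Q); the second says not Q.
trusted_theory = [lambda v: v["P"], lambda v: (not v["P"]) or v["Q"]]
rival_theory = [lambda v: not v["Q"]]

print(coheres(trusted_theory, rival_theory, ["P", "Q"]))
```

If we hold the first theory to be true, the failure to find a joint assignment is our reason for rejecting the second.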

The key step here is to get to a set of true sentences.

From a machine learning standpoint, the key step is to get to the correct training set.

How can these be accomplished for complex sets of data for which no such methods exist?

Discussion and Some Philosophy

Let's go back to logic. Not "logic" (what makes sense to me). Not "logic" (true or false, P implies Q, syllogistic square, etc.). Logic, the discipline, the activity.

The four conceptions of logic that I think are most compelling are the following:

1. Logic is the formal science of truth. (G. Frege)
2. Logic is the formal science of logical consequence. (JC Beall, David Ripley, G. Restall, etc.)
3. Logic is the study of proper thinking. (Aristotle)
4. Logic is the study of universal (truly unrestricted - they apply in any discipline, in any area of inquiry, at any time, at any place) methods of reasoning.

Of these, I think 1 and 2 are the most compelling conceptions.

I think that 2. is part of 1. but involves more. If 1. were simply 2. then logic would not tell us what was true (but only what we can do with statements once we know what they are). On that view, empirical science and mathematics plug in true statements that are then calculated using logic.

There's a problem with that view (which was popular among early Logical Positivists) which is that logics (involving non-logical concepts) are also employed in science and mathematics. The sequence of use is incorrect. That view also fails to consider the points described above.

So, I think the only fair characterization of logic is one in which logic:

Is not only the formal science of logical consequence.
But also determines how to judge truth-values for a sentence.

These questions have led to skeptical worries traditionally and it's worth addressing them here albeit briefly. These apply to several topics, including the one at hand. That's hardly surprising since truth is relevant to almost every human activity.

In Model Theory, we combine a set of sentences (of a formal language) with a structure (a domain of sets) that's taken to represent some physical, concrete, abstract, real, imaginary, or mathematical thing. But how do we know that the structures (the representations) are correct? Presumably, because the predictions we make using them match observable phenomena studied in science.

But then, how do we say that matching works? What is it, precisely, that this matching involves? Presumably, it's isomorphism, correspondence, or some other kind of representation relation.

But then, under what conditions do we say a relation of isomorphism holds? That is, under which conditions do we say that the assertion that two things are isomorphic is true?

This is basically philosophical skepticism reprised for representation. As it turns out, I'm not much of a philosophical skeptic but I'll talk about that elsewhere.

The upshot? Logic requires truth to evaluate the appropriateness of models prior to constructing some structure (a criticism Graham Priest raised against my Truth-Grounding and the Liar). If judgement just amounts to evaluating whether or not a sentence matches a model, we seem to have missed the most important part (whether the model is an appropriate representation in the first place).

My quick reply:

Traditionally, philosophers and logicians have retreated to a theory of types (common in programming) to try to mitigate the damage from equivalent skeptical problems.
I am an Alethic Pluralist and believe there are multiple kinds of truth at work. And, I think I can be an Alethic Pluralist without invoking an infinite hierarchy of truths (though there are probably an infinite number of slightly variant truth predicates) nor a theory of types (within one theory).
I think we can separate truth simpliciter from Truth-Grounding (as described in my paper) using notions like Crispin Wright's Entitlement of a Cognitive Project to move us from skepticism.
We have epistemic warrant enough to begin a scientific project (though not epistemic justification) and gain justification once our project aligns with phenomena ("you have to start somewhere").
As such, we can construct simple, initial, models (and be epistemically warranted in doing so) and then cast out those that fail to map to our observed results. Gradually, as we refine our models and they acquire predictive power we gain epistemic justification!
(This is the very argument I give to justify my proposal - it coheres maximally with all other competitor views since it is entailed by any theory that preserves the analyticity of the T-Schema. Furthermore, it aligns the most with the empirical observation that four kinds of approaches have been made w.r.t. the truth-value of the Liar Sentence itself.)
My perspective is generally non-foundationalist: we often confuse temporal order with some vague notion of dependence and priority. If we adopt a "pyramid" metaphor to frame our thinking we end up with the standard sketch: people build models by moving from a structure, to evaluating truth-values of expressions of a formal language, and then at the top of the "pyramid" comparing these expressions to some real-world phenomena.
As mentioned above, I reject the "pyramid" conception - models are constructed (through warrant not justification), then cast aside in light of "higher-level" empirical observations. It's not a "pyramid" at all it's a "milieu" - a flexible and free-floating backdrop not constrained by some simplistic geometric metaphor.
For example, we might use the "high-level" empirical observations to modify our most "fundamental" constraints. When we get a range of matches, we can feel justified (or at least more justified than we would be in the alternatives). (Interestingly, ruling out alternatives has been identified as central to debates about skepticism and epistemic reasoning as a whole.)
Going the other way, we might think of "higher-level" empirical findings as being the foundation of our pyramid. Here, we might think that our models are constructed from the empirical observations and then we test our assertions against them.
In science, we see scenarios that are suitable for both ways of framing them using "pyramids" - observations guide model generation and models guide observations (Higgs Boson) - which suggests that metaphor is inappropriate (the point of a "pyramid" metaphor is that it is one-directional along some dimension under consideration - epistemic priority, being more fundamental than, etc.).
Ultimately, it's that fact that causes me to disagree fundamentally with the metaphor of a "pyramid". (I explain this much more elegantly - and totally dispensing with "pyramid" talk - here - sidenote: I believe that the use of pernicious metaphors infects academic philosophy.)

The upshot: I think models can be constructed without some of the traditional baggage associated with philosophical skepticism - no "pyramids" of foundationalist justification, a preference for warrant over justification, checking our traditional assumptions with evidence to refine our models (just like we do in machine learning when we refine a function or algorithm), and with no theory of types (Russell) or infinite hierarchies (Tarski) required.
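
To illustrate the "construct simple models, then cast out those that fail" loop, here is a minimal sketch. The candidate models, the observations, and the tolerance are all invented toy values, and `CandidateModel` and `refine` are hypothetical names; the point is only the shape of the process, not any particular scientific method.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CandidateModel:
    name: str
    predict: Callable[[float], float]  # maps an input to a predicted observation


def refine(models, observations, tolerance):
    """Keep only the models whose predictions stay within `tolerance` of every observation."""
    survivors = []
    for model in models:
        if all(abs(model.predict(x) - y) <= tolerance for x, y in observations):
            survivors.append(model)
    return survivors


# Initial models are adopted on warrant alone: nothing justifies them yet.
candidates = [
    CandidateModel("constant", lambda x: 2.0),
    CandidateModel("linear", lambda x: 2.0 * x),
    CandidateModel("quadratic", lambda x: x * x),
]
# Invented (input, observed value) pairs standing in for empirical results.
observations = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

survivors = refine(candidates, observations, tolerance=0.5)
print([model.name for model in survivors])
```

Nothing in the loop privileges the models over the observations or the reverse: a surprising observation could just as well send us back to revise the candidate list.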

This approach frames my larger reply to criticism about the Truth-Grounding approach.

The above considerations address some of the core philosophical critiques that have traditionally been run against such kinds of systems (and, which now have practical commercial, business, or applied facets).

Description

So, the idea is that we can build models and have those models refined by comparing them against some other, non-alethic, mechanism in order to generate accurate, alethic, models. Whoa!

Example:

Ask two people A and B to each define a term.
A defines H := Husband(x)
B defines W := Wife(y)
Then A, B construct a model for H and W (a domain in which there are disjoint singleton sets defining the extension of H and W). Simple.
Here, there's no need to map H or W to some empirical observation at all. It's a stipulation and one concurrent with linguistic norms. (Conceptual Analysis.)
A, B agree that the domain is correct as well as the predicates.
A, B are then asked to define additional predicates (and so on) until the predicate constants so defined comprise a rich vocabulary.
Where predicates lead to conflict over definitions and meaning, A, B can perform an empirical survey (The cat is on the table vs. The cat is not on the table) as needed for complex predicates, definitions, or expressions.
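
The steps above can be sketched in code. This is a minimal illustration under my own assumptions: the domain members "a" and "b" are placeholder individuals, and `build_model`, `holds`, and `are_disjoint_singletons` are hypothetical helper names rather than part of any existing system.

```python
def build_model(domain, extension):
    """Check the stipulated extensions against the domain and return them as a model."""
    for predicate, members in extension.items():
        if not members <= domain:
            raise ValueError(f"{predicate} has members outside the domain")
    return {
        "domain": set(domain),
        "extension": {p: set(m) for p, m in extension.items()},
    }


def holds(model, predicate, individual):
    """A predicate holds of an individual iff the individual is in its extension."""
    return individual in model["extension"][predicate]


def are_disjoint_singletons(model, first, second):
    """Check that both extensions have exactly one member and share none."""
    a, b = model["extension"][first], model["extension"][second]
    return len(a) == 1 and len(b) == 1 and a.isdisjoint(b)


# A stipulates H, B stipulates W; both agree on the domain.
model = build_model({"a", "b"}, {"H": {"a"}, "W": {"b"}})

print(holds(model, "H", "a"))
print(are_disjoint_singletons(model, "H", "W"))
```

Adding further predicates amounts to adding entries to the extension mapping, with `build_model` rejecting any stipulation that steps outside the agreed domain.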

Most of these disputes will be disagreements about state representations (about the facts, as opposed to conceptual stipulation).

A, B can assign levels of confidence (if desired) to indicate how strongly they hold some definition to be accurate or true.

Now:

A, B get into a dispute (The cat was on the table at 2PM vs. The cat was not on the table at 2PM).
Suppose A, B begin to argue about what happened at a certain time (let's assume that a sequence of events has been defined).
We will be able to employ straightforward automated proving from this set of definitions (as if they were axioms - à la soundness and decidability) to verify which person is correct.
Statistical methods can be employed where evidence is missing, where definitions are incomplete, or where confidence is not high.
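
Here is a rough sketch of how the cat dispute might be settled from an agreed sequence of events. Everything concrete in it is an assumption made for illustration: the `Event` record, the two events and their times, and the choice to raise an error (rather than produce a probability) when the agreed events do not settle the question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    minute: int         # minutes after midnight
    cat_on_table: bool  # agreed state of the world once this event has happened


def state_at(events: list[Event], minute: int) -> Optional[bool]:
    """Return the agreed state at `minute`, or None if no agreed event covers it."""
    state = None
    for event in sorted(events, key=lambda e: e.minute):
        if event.minute > minute:
            break
        state = event.cat_on_table
    return state


def adjudicate(events: list[Event], minute: int, claims: dict[str, bool]) -> list[str]:
    """Return the participants whose claim matches the state derived from agreed events."""
    state = state_at(events, minute)
    if state is None:
        # The agreed facts are silent here; this is where statistical methods would step in.
        raise LookupError("no agreed event settles this claim")
    return [name for name, claim in claims.items() if claim == state]


# Hypothetical agreed timeline: cat jumps on at 1:30PM, jumps off at 2:15PM.
agreed_events = [Event(13 * 60 + 30, True), Event(14 * 60 + 15, False)]
claims = {"A": True, "B": False}  # A: on the table at 2PM; B: not on the table at 2PM

winners = adjudicate(agreed_events, 14 * 60, claims)
print(winners)
```

The verdict is only as good as the agreed timeline: the derivation treats the events as axioms and does nothing to check them.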

Key Takeaways:

The participants must agree to some stable set of definitions.
A formal language must exist to express all states defined or expressed above. It must be rich enough to meaningfully capture all relevant points of data and powerful enough to do so succinctly and efficiently. (The received wisdom is that second-order representations are required to express the axioms of basic Set Theory.)

That's just Set Theory and doesn't include higher areas of mathematics, complex human dynamics (which most people commonly hold cannot be expressed mathematically), etc.

Every term must be unambiguously defined - must have a distinct logical form or range of determinate logical forms.

Some approaches to these kinds of problems involve text translation and processing (to generate the models). What if one started directly with the formal language and models?

What about the submission of new facts - e.g. refining the models, definitions, or adding new ones? What process best adjudicates this process?

Some related ideas were previously sketched out here.

Practical Applications

Truth and Falsity Detection - is a single declarative statement true or false?
Theory and Model Verification - is a theory (in the formal Model-Theoretic sense - e.g. as a class of well-formed sentences expressed in a formal grammar) true?
Truth Filtering - the application of multiple statistical methods to provide various overlaying metrics to assess the credibility of a statement or theory. One of the key fears about a singular truth system is that one system would rule them all. Instead, multiple systems are used and overlaid to generate a range of evaluations (potentially one for each system - these can also be combined to give an aggregate evaluation).
Debate resolution - settle arguments on the basis of some agreed-upon base assumptions.
Legal dispute resolution - compare two sets of statements and determine which is more likely to be true.
Structured News - facts (but not opinions about them nor how they are expressed) would be verified across news platforms. This is currently present for stock and a lot of financial data through templating platforms like Automated Insights.
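
The Truth Filtering item above can be sketched as a set of independent evaluators whose scores are reported side by side and then combined. The three "systems" here are stand-ins that return fixed, made-up scores, the example statement is arbitrary, and averaging is just one assumed way to aggregate; `overlay` is a hypothetical name.

```python
from statistics import mean
from typing import Callable

# Each system maps a statement to a credibility score between 0 and 1.
Evaluator = Callable[[str], float]


def overlay(statement: str, systems: dict[str, Evaluator]) -> tuple[dict[str, float], float]:
    """Score a statement with every system and return the per-system scores plus their mean."""
    scores = {name: evaluate(statement) for name, evaluate in systems.items()}
    return scores, mean(scores.values())


# Placeholder systems with invented scores; real ones would inspect the statement.
systems: dict[str, Evaluator] = {
    "system_a": lambda statement: 0.9,
    "system_b": lambda statement: 0.6,
    "system_c": lambda statement: 0.75,
}

scores, aggregate = overlay("The cat was on the table at 2PM.", systems)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print(f"aggregate: {aggregate:.2f}")
```

Keeping the per-system scores visible alongside the aggregate is the design choice that matters: no single system gets to rule them all.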

post: 5/23/2020
update: 5/25/2020
update: 6/7/2020
