This post is partly a translation of, and partly a follow-up to, a post in Spanish comparing *naïve* and *axiomatic* set theory.

The point I made in the previous post is that

One leaves naïve set theory the moment that first-order logic (FOL) becomes explicit.

Or, from a different perspective, when you fully realize the possibility of different models of set theory.

In FOL, we can express properties by using every operation, relation and distinguished element (“constant”) that comes with the structure on which we are interpreting FOL formulas, as well as propositional connectives and equality. The main difference between FOL and usual mathematical language is that the *quantifiers* $\forall x$ and $\exists x$ can only be used with variables $x$ ranging over the universe of discourse. For instance, with the structure $\mathbf{N}\doteq\lb\N,+,\cdot,0,1\rb$ of *arithmetic*, we can use the operations to write down polynomials with natural coefficients,

\[p(x,y) \doteq x^2 + y^2,\]

\[q(z) \doteq z^2,\]

and write FOL formulas asking whether for every natural number $n$ there exist $x,y$ such that $n=p(x,y)$, or stating that there exist $x,y,z$ such that $q(z)=p(x,y)$.
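Written out in full, the two statements could look as follows. The labels $\sigma_1$ and $\sigma_2$ are introduced here only for reference, and the squares are expanded using the operation $\cdot$ of the structure:

\[\sigma_1 \doteq \forall n\,\exists x\,\exists y\,(n = x\cdot x + y\cdot y),\]

\[\sigma_2 \doteq \exists x\,\exists y\,\exists z\,(z\cdot z = x\cdot x + y\cdot y).\]

Every quantifier here ranges over $\N$ only. In $\mathbf{N}$ the sentence $\sigma_1$ is false (3 is not a sum of two squares), while $\sigma_2$ is true (take $x=3$, $y=4$, $z=5$).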

The principle of induction (which characterizes $\mathbf{N}$ up to isomorphism) can’t be expressed in FOL. This can be shown by using the Compactness Theorem, which states that if every finite subset of a set $\Phi$ of FOL sentences has a model, then the whole set $\Phi$ has a model. With this tool, one can construct *non-standard models of arithmetic*, that is, models $\mathbf{N^*}\not\iso\mathbf{N}$ satisfying every FOL sentence that holds in $\mathbf{N}$. This can be achieved surprisingly easily: let $\Th(\N)$ be the set of FOL sentences holding in $\mathbf{N}$, and take a new constant symbol $c$. Now take

\[\Phi \doteq \Th(\N) \cup \{ c > \overline{n} : n \in \N\},\]

where $\overline{n}$ is the term $1+\dots+1$ with $n$ ones. Every finite $\Phi_0\subseteq\Phi$ is easily seen to have a model (take $\mathbf{N}$ and interpret $c$ as a big enough natural number), so by Compactness there is a model $\mathbf{N^*}$ of $\Phi$. In this model, the (nonempty) set of numbers that are not of the form $1+\dots+1$ (strictly speaking, $\overline{n}^{\mathbf{N^*}}$) has no minimum.
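Both steps can be spelled out a little more. This is only a sketch, under the assumption that $s > t$ is read as an abbreviation of $\exists z\,(s = t + z + 1)$, since $>$ is not a symbol of the structure, and that $\overline{0}$ is the constant $0$. Given a finite $\Phi_0\subseteq\Phi$, only finitely many sentences of the form $c > \overline{n}$ occur in it, so one may put

\[m \doteq \max\{n : (c > \overline{n}) \in \Phi_0\}, \qquad c^{\mathbf{N}} \doteq m+1\]

(with $m \doteq 0$ if no such sentence occurs), and $\mathbf{N}$ with this interpretation of $c$ satisfies $\Phi_0$. For the lack of a minimum, note that the sentence

\[\forall x\,(x \neq 0 \rightarrow \exists y\,(x = y+1))\]

holds in $\mathbf{N}$, and hence in $\mathbf{N^*}$. If $a$ is a non-standard element, then $a\neq 0$, so $a = b+1$ for some $b$. This $b$ is again non-standard (otherwise $a$ itself would be of the form $\overline{n}^{\mathbf{N^*}}$), and $b < a$. So the set of standard elements contains $0$ and is closed under adding $1$, yet it is not the whole universe, which is exactly what the induction principle for arbitrary subsets forbids.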

ZFC is a set of first-order sentences written by using an additional relation symbol $\in$, and hence they are interpreted in structures of the form $\lb M, E\rb$, where $M$ is the universe of discourse and $E$ is a binary relation on $M$. In the case of *transitive models* of ZFC, the elements of the universe of discourse $M$ **are indeed sets**, and the relation $E$ is the restriction of the **actual** $\in$ relation to $M$.

One example where things are not naïve is the following. You can write in FOL the formula

\[ \forall x\subseteq y : \phi (x)\]

that reads, naïvely, “every subset of the set $y$ satisfies the property $\phi$”. But once you realize that “$\forall$” in FOL only speaks about elements of the universe of discourse, the formula must be read as

every subset of the set $y$ *that belongs to the model* satisfies the property $\phi$.
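In symbols, the bounded quantifier can be unfolded as follows. This is a sketch assuming the usual convention that $x\subseteq y$ abbreviates $\forall z\,(z\in x\rightarrow z\in y)$:

\[\forall x\,\bigl(\forall z\,(z\in x \rightarrow z\in y) \rightarrow \phi(x)\bigr).\]

Interpreted in a structure $\lb M,E\rb$ with $y$ assigned to an element $b\in M$ (the letters $a$, $b$, $d$ are chosen here just for illustration), this says: for every $a\in M$, if every $d\in M$ with $d\mathrel{E}a$ also satisfies $d\mathrel{E}b$, then $\phi$ holds of $a$ in $\lb M,E\rb$. Both quantifiers range over $M$, so a subset of $b$ that is not an element of $M$ is never tested.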

This whole post was inspired by a discussion following a question on math.stackexchange.com. There I asked whether one could construct alternative models of ZFC by using Compactness. More precisely, I asked whether, given a model of ZFC, one can construct a model with non-constructible sets in it; you need such a model, for example, to obtain one in which the continuum hypothesis fails.

I was motivated by the construction of $\mathbf{N^*}$, where the extraneous set of non-standard natural numbers fails to have a minimum. But, as the answer to my question by Asaf Karagila explains, you can’t do it that way. And the main reason is the one discussed in this post: although a new model obtained by using Compactness may have new subsets, its first-order properties will be preserved (as above). And actually, *“every set is constructible”* ($V=L$) is a FOL statement about a model, and as such it will hold in the new model whenever it holds in the old one. In other words, you need to tweak the model, preserving some FOL properties (i.e., the ZFC axioms) and changing others (CH, $V=L$, …). And it seems that the only tool available to perform this fine-tuning is forcing.
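The preservation step can be condensed into one line. The notation is assumed here for illustration: $\mathbf{M}$ is the given model, $\Th(\mathbf{M})$ is the set of sentences true in it, $\Sigma$ is whatever set of extra sentences (possibly with new constants) one adds before applying Compactness, and $\mathbf{M^*}$ is any model of $\Th(\mathbf{M})\cup\Sigma$:

\[\mathbf{M}\models V=L \;\Longrightarrow\; (V=L)\in\Th(\mathbf{M})\subseteq\Th(\mathbf{M})\cup\Sigma \;\Longrightarrow\; \mathbf{M^*}\models V=L.\]

The same chain works for any sentence in place of $V=L$, which is why this kind of construction cannot change the truth value of CH either.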