Generalizing
What if they were just numbers?
Utilizing our knowledge of a consistent number system, for the analogy $a : b = c : d$ we know the fact that

$$\frac{a}{b} = \frac{c}{d},$$

and to solve for $d$, we would compute $d = \frac{bc}{a}$.
In a more familiar expression, we know that

$$\frac{a}{b} = \frac{c}{d} \iff ad = bc,$$

and this fact is not trivial. Focusing on $ad = bc$:
this summarizes the implied ‘equality of relevance’ into a single equality, which does not directly compare each object but their cross products.
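As a quick sanity check, here is the numeric version of the cross-product trick (the concrete values 2, 6, 5 are just an illustration):

```python
# Analogy a : b = c : d with d unknown, e.g. 2 : 6 = 5 : d.
a, b, c = 2, 6, 5

# The cross products must be equal: a*d == b*c,
# so the multiplicative inverse of a gives d directly.
d = b * c / a

assert a * d == b * c   # 2 * 15 == 6 * 5
print(d)                # 15.0
```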

Can we generalize this to ARC tasks?

We have neither a ‘number system of ARC grids’ nor an operation defined on such a number system. Let’s generalize such an operator.

We can train an operator $\otimes$ such that for all pairs of input-output pairs $(x_i, y_i)$ and $(x_j, y_j)$,

$$x_i \otimes y_j = y_i \otimes x_j.$$

Such a trained operator will form a small consistent system of ARC grid pairs where this ‘analogical invariance’ holds.
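A hypothetical sketch of what training such an operator could look like, with heavy simplifying assumptions: grids are already encoded as fixed-size vectors, and the operator is a tiny MLP over their concatenation. The architecture and sizes here are illustrative guesses, not the source's design; the essential part is the invariance loss over all pairs of examples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumption: the operator u ⊗ v is a small MLP
# applied to the concatenation of the two encodings.
W1 = 0.1 * rng.normal(size=(16, 8))
W2 = 0.1 * rng.normal(size=(4, 16))

def op(u, v):
    return W2 @ np.tanh(W1 @ np.concatenate([u, v]))

def invariance_loss(pairs):
    """Sum of ||x_i ⊗ y_j - y_i ⊗ x_j||^2 over all example pairs (i, j)."""
    loss = 0.0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (xi, yi), (xj, yj) = pairs[i], pairs[j]
            diff = op(xi, yj) - op(yi, xj)
            loss += float(diff @ diff)
    return loss

# Toy "task": three input-output pairs of 4-dim encodings.
pairs = [(rng.normal(size=4), rng.normal(size=4)) for _ in range(3)]
print(invariance_loss(pairs))  # the quantity a trainer would minimize
```

Minimizing this loss over the operator's parameters is what would make the analogical invariance hold on the training pairs.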
Then, how do we use it to infer the output?
In number systems, the reason we were able to solve for $d$ was that we had the multiplicative inverse: $d = bc \cdot a^{-1}$.
But with our learned operator, we only have the ‘multiplication’ defined, and we cannot directly find the multiplicative inverse.
If we did not know the concept of division, we would have solved for $d$ by trying out values of $d$ which satisfy the equation $ad = bc$. That is basically searching the space of $d$ to find a solution that satisfies the expression. We can do something similar.
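Continuing the numeric analogy: even without division, the unknown can be recovered by search, for instance gradient descent on the squared residual of the cross-product equation (the values are again just an illustration):

```python
# Solve a*d = b*c for d without ever dividing:
# descend the squared residual (a*d - b*c)^2.
a, b, c = 2.0, 6.0, 5.0
d = 0.0          # initial guess
lr = 0.01
for _ in range(2000):
    residual = a * d - b * c
    d -= lr * (2 * a * residual)   # gradient of the squared residual

print(d)  # converges to 15, i.e. b*c/a, found by search alone
```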

To solve for the unknown test output $y$, we can first calculate $y_i \otimes x$ with the learned operator, where $x$ is the test input and $(x_i, y_i)$ is a training input-output pair.

We can search the space of $y$ to find one such that the analogical invariance $x_i \otimes y = y_i \otimes x$, whose right-hand side we have already calculated from the training pair, is satisfied.

The search algorithm here is gradient descent, and this process is the inference at test time.
To make this search via gradient descent feasible at test time, $y$ will be held close to the standard normal distribution by regularizing its divergence from that distribution, as in a VAE.
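A minimal sketch of this test-time search, under several simplifying assumptions: the ‘operator’ is a toy elementwise product on 4-dim encodings, the VAE-style divergence penalty is reduced to a squared-norm pull toward the standard normal prior, and all names and values are illustrative rather than the actual method.

```python
# Known training pair (x_i, y_i) and test input x, as toy 4-dim encodings.
x_i    = [1.0, -2.0,  0.5,  3.0]
y_i    = [2.0,  1.0, -1.0,  0.5]
x_test = [0.5,  1.5,  2.0, -1.0]

def op(u, v):
    # Toy stand-in for the learned operator: elementwise product.
    return [a * b for a, b in zip(u, v)]

target = op(y_i, x_test)      # the fully known side, y_i ⊗ x
y = [0.0, 0.0, 0.0, 0.0]      # candidate output, initialized at the prior mean
beta, lr = 1e-3, 0.05

for _ in range(5000):
    residual = [a - b for a, b in zip(op(x_i, y), target)]
    # Gradient of ||x_i ⊗ y - target||^2 + beta * ||y||^2 w.r.t. y.
    grad = [2 * xi * r + 2 * beta * yk
            for xi, r, yk in zip(x_i, residual, y)]
    y = [yk - lr * g for yk, g in zip(y, grad)]

# y now approximately satisfies the invariance x_i ⊗ y = y_i ⊗ x.
```

The `beta` term plays the role of the prior regularizer: it keeps the searched $y$ from drifting arbitrarily far while the residual term enforces the invariance.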
From $n$ training example pairs, we can train the neural operator on $n(n-1)/2$ pairs of examples. Furthermore, the learned neural operator only has to capture the analogical invariance, instead of learning to represent the input, the output, and the implied program all entangled together.
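For concreteness: $n$ training examples yield $n(n-1)/2$ unordered pairs, each supplying one invariance constraint, so the training signal grows quadratically in the number of examples. A small illustration:

```python
from itertools import combinations

n = 4                                    # number of training examples
examples = list(range(n))
constraint_pairs = list(combinations(examples, 2))

# Each unordered pair (i, j) yields one invariance constraint
# x_i ⊗ y_j = y_i ⊗ x_j, so n examples give n*(n-1)/2 constraints.
assert len(constraint_pairs) == n * (n - 1) // 2   # 6 for n = 4
```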