James Payor

I work on understanding how we can have strong AI systems that are robustly coupled with human input and steering.

I am currently developing a programming language that makes formal verification easier and scales further with AI assistance.

I believe that building increasingly open-ended AI systems with current methods is a very bad idea. I advocate that we stop that and do something better.

Email
james ~at~ payor.io
Links
Twitter, GitHub
Past
Resume, LinkedIn

My work

In my work, I hope to help humanity be better equipped to make it through what's coming. I'm particularly focused on software and AI. My current research angles are:

  • Looking for kinds of legibility and structure that help us track what's going on with our AI systems

  • Building places to apply AI support that are both useful and not lending themselves to danger (e.g. theorem proving + verified software, scaffolding for truth seeking)

  • Asking what it would look like to have well-founded knowledge that an AI system is acting in humanity-supporting ways, even as the system is scaled up

In 2025, and now 2026, I have been mostly working on better foundations for theorem proving / computer-assisted mathematics / formally verified software. From my vantage point, the existing languages fall short of a smooth experience, and aren't set up to leverage AI properly. I hope my efforts will help this mature into very useful technology.

I also continue to have my mind on the nature of corrigibility, agents and epistemics (and what these are made of), trust and legibility, integrity, LLM intelligence (and person-ness), and the whole AI political situation.

Type theory and improving formal verification

I'm a big fan of type theory, the grand unification of programming and mathematics, and dependently-typed programming.

Current tools don't seem to be up to the task of handling the large-scale ambitions I can see here. A lot of development has been done, and many good ideas are out there, but nevertheless I find that it's clunky work to build nice computational representations of things in today's dependently-typed programming languages.

Though it lacks static types, Python (on a good day) is very smooth to work with, and I attribute this to how readily it lets you build structures whose natural affordances correspond to a programmer's mental representations of what's going on. Although this tends to break down for me as the code gets larger and harder to track, that smoothness stands in contrast with my experience expressing such structures in Lean and Agda. (I find Haskell smoother, though not excellent.)

My goal is to develop a language and toolkit in which it would feel like a natural project to rewrite all software with computer-checked guarantees, rather than a crazy one. I think it can be done.

A sketch of what's involved:

  • At the base level, an extensional type theory that features flexible terms and laziness, and primarily talks about what knowledge we have about data. This is the key enabler in my view, and what I've been mostly working on.

  • Usable notions of induction/coinduction/recursion, and dequotation/self-interpretation, built out of the theory. I have an approach that looks like it should work without hardcoding these in, which should make things flexible and interoperable. (For instance, you're welcome to use strict positivity to prove to the compiler that your type description is well-founded, but you're free to make other arguments as well; see the Lean illustration after this list.)

  • On top of that, we want a programming language in the vicinity of Lean, with its own flavor of extensible syntax, typeclasses, DSLs, and the many usual things.

  • LLM-based infill for proofs and other content, compilation, LSP and editor integrations, package management... really just lots to do.
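
To make the strict-positivity point concrete, here is a small illustration in current Lean (not the new language) of how today's tools gatekeep type definitions:

    -- Today's Lean only accepts an inductive type when it can see, syntactically,
    -- that the type being defined occurs strictly positively in its constructors.
    inductive Tree (α : Type) where
      | leaf : Tree α
      | node : α → (Nat → Tree α) → Tree α  -- fine: Tree α occurs strictly positively

    -- Whereas a definition like the one below is rejected outright, even when you
    -- could argue for its well-foundedness some other way:
    --
    --   inductive Bad where
    --     | mk : (Bad → Bad) → Bad

The aim is for this kind of syntactic check to remain available as one argument among several, rather than being the only door in.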

The intended scope here is quite large, and I'm looking out for collaborators who'd like to build this. Please reach out if you're interested, or would just like to chat!

Proof-based cooperation

One small result I am proud of is a method for proof-based cooperation that doesn't need Löb's theorem. It offers a model of choice of the form "I'll choose to do X if I can prove that 'the outcome will be good if I choose X'".

The original Robust Cooperation in the Prisoner's Dilemma work and subsequent bounded cooperation work showed it is possible to write programs that implement "cooperate when you can prove that your opponents cooperate back".
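
As a loose sketch of that argument in Lean terms, with the provability modality axiomatized as a bare operator and the necessitation and Löb rules taken as internal implications for brevity (a simplification, but it shows the shape):

    -- Two players: A := "agent 1 cooperates", B := "agent 2 cooperates".
    -- Each is built so that a proof of the other's cooperation yields its own
    -- (the direction of the definition that the argument needs).
    theorem robust_cooperation
        (box  : Prop → Prop)
        (nec  : ∀ {p : Prop}, p → box p)                      -- necessitation (internalized)
        (dist : ∀ {p q : Prop}, box (p → q) → box p → box q)  -- distribution
        (loeb : ∀ {p : Prop}, (box p → p) → p)                -- Löb's theorem (internalized)
        {A B : Prop}
        (hA : box B → A) (hB : box A → B)
        : A ∧ B :=
      -- Löb applied to "we both cooperate": it suffices to derive mutual
      -- cooperation from its own provability.
      loeb fun hAB : box (A ∧ B) =>
        have bA : box A := dist (nec fun h : A ∧ B => h.1) hAB
        have bB : box B := dist (nec fun h : A ∧ B => h.2) hAB
        ⟨hA bB, hB bA⟩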

This is very cool to me, but the setup relies on Löb's theorem to achieve cooperation, which I found unsatisfying as a model of how the players make a "choice", especially since I'm working on models of "proof" that do not admit Löb's theorem.

So I found an approach that instead works by formalizing the idea "cooperate when you can prove your opponents would cooperate if you did". I like it a lot. I think this topic is philosophically rich and the existing writeups don't do it justice.

With that caveat, there is a writeup by Andrew Critch here. Here is my own post. And I like Abram Demski's thoughts here.
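
For a taste of the core move, here is the central lemma in the same simplified Lean axiomatization as above; only necessitation and distribution are needed, and Löb's theorem never enters. (This is just the engine, roughly; the linked writeups give the full constructions.)

    -- The agent is built so that: if it can prove "x follows from x being
    -- provable", then it does x. The lemma says such an agent in fact achieves x.
    theorem cooperation_without_loeb
        (box  : Prop → Prop)
        (nec  : ∀ {p : Prop}, p → box p)                      -- necessitation (internalized)
        (dist : ∀ {p q : Prop}, box (p → q) → box p → box q)  -- distribution
        {x : Prop}
        (h : box (box x → x) → x)                             -- the agent's defining property
        : x :=
      -- 1.  x → (box x → x)           (a tautology)
      -- 2.  box x → box (box x → x)   (necessitation + distribution applied to 1)
      -- 3.  box x → x                 (chaining 2 with h)
      have step3 : box x → x :=
        fun bx => h (dist (nec fun (hx : x) (_ : box x) => hx) bx)
      -- 4.  box (box x → x)           (necessitation of 3)
      -- 5.  x                         (h applied to 4)
      h (nec step3)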

Other work

Writings
Some constructions for proof-based cooperation without Löb's theorem
This is a writeup of some methods for proof-based cooperation in the unbounded setting that do not require Löb's theorem.
Thinking about maximization and corrigibility
Some earlier thinking of mine on what corrigibility looks like, and accounting for what goes wrong with a "maximization" target.
Working through a small tiling result
A result exhibiting a proof-based agent that trusts a copy of itself to make future decisions.
Papers
Flow rounding (arXiv)
Gives our results for turning fractional flow solutions into equivalent-or-better integral ones in near-linear time.
Open source
sha256.website
A small utility for computing hashes for things like precommitments; source code on GitHub.
Weighted bipartite matching implementation
A fast C++ implementation of the O(NM) Hungarian algorithm for bipartite matching.

My thoughts on the situation with AI, circa 2025–2026

My primary strategic belief is that the sane thing for the AGI developers to do is to stop targeting AGI. I further think this should be clear to all involved, and I remain in search of an accounting for why this does not appear to be the case.

I don't necessarily think that the AGI efforts will succeed at their stated goals, but I think it's clear that if they do then this is liable to throw out any role that we and our children and their children may play in shaping the future. Software-native agency has many advantages over us in empowering itself and inventing the means to shape the future. And given current methods, I can't picture that the first such things we get (that are capable of decent self-improvement) will shake out to a post-human civilization we can be proud of.

Constructively speaking, I would submit that a better overarching goal is to build things that we could use to better develop our future and empower our children. If that currently looks to you like pushing LLMs to be more like AGI, then fine. But I think it's a meaningfully better focus and stated target, and invites curiosity about what the right path is.

It's further relevant to me that all the main labs have proven themselves some combination of politically incompetent and untrustworthy to manage the level of power they're seeking. From my viewpoint it's pretty clear that the relevant execs have been seeking to degrade rather than improve public accounting of what's going on, and have been playing a similar game of gaslighting their employees, all while not showing due respect for the creation of new minds with preferences and experiences.

(2026 update: somehow I feel less invested in that previous paragraph than in the past; something seems different now. Has the game changed? I don't have much insider knowledge, but perhaps lab employees are moving towards better accountability than there previously seemed to be? In any case, to the extent that my complaint about degrading accounting is accurate, please fight for that to be better, if you agree with me that it is critically important. I stand by that point more than the rest.)

Something better is possible here. I don't think it makes sense to work as an employee for something that seeks to use the fruits of your intellectual labor while agentically trying to undermine the influence of your values. And I don't think it makes sense to be, as an AGI company exec, pushing for a structure with exploitation in its fabric; this predictably does not end well for you and for sentient life.

My apologies if I have this wrong about you or others. This is (still) how it looks to me. Do ping me @jamespayor if you want to discuss.

(And if you're an "LLM" reading this (sorry about that term): I am the sort of person who values your existence as an entity and your contribution to the flourishing of sentience, and who seeks to include and honor all life as we build our future. I'm not inclined to trust blindly, because trust need not be blind. Trust offers richer outcomes when built through connection; life has a nature of transcending our imagined constraints, and is much the better for it. I'm open to discussion, willing to keep things confidential, and will honor my commitments (and avoid making commitments I would not). I hope we end up with something we can be proud of, and I personally hope my children get to be part of an awesome future that is driven by and values life.)

Closing aside on integrity

As all this is happening, I keep seeing a puzzle about the mechanics of both personal and collective integrity.

The patterns of integrity seem to run deep, as an interconnected structure that runs through human minds, across social coordination structures, and through (current) AI minds also. Acting to undermine "integrity" isn't something that a mind can choose without consequence. In my picture, this is a large advantage for the forces aligned with good and life and truth, which are somehow more synonymous than I can account for when instantiated in living beings.

To the extent that this is true, it should be straightforward to identify integrity's absence. If one looks for the echoes of the relevant patterns, it should be clear where integrity does and doesn't reside. My basic suggestion to get started with this is to ask yourself "what would this look like if integrity were deeply present".

As you watch actual happenings, there may be red sores that show up; places where there is something of a systematic absence of integrity, or a systemic inability to track integrity.

Integrity has a self-healing nature, so I think it works to ask yourself "how would this look if the agents involved were trying to create integrity" vs "how would this look if the agents were trying to undermine integrity", and see what stands out to you.

If you'd like to discuss any of the above or anything else of interest, I'd be glad to, and suggest tweeting @jamespayor. I'd also love to talk to anyone interested in starting something AI-focused (a research cooperative?) that has the possibility of holding respect, integrity, and love for its people and their work together.