ProvablySafe.AI
ProvablySafe.AI is a collaborative landing page for the field and community at the intersection of AI safety and formal methods.
Currently, the go-to introduction to the research field is Safeguarded AI: constructing safety by design by David "davidad" Dalrymple.
Community
Community resources and channels directly maintained by the ProvablySafe.AI team:
- Zulip Chat/Forum: https://provablysafeai.zulipchat.com
- Newsletter: https://substack.provablysafe.ai
- Discord: https://discord.provablysafe.ai
- GitHub Organization: https://github.com/provablysafeai
- X/Twitter: https://twitter.com/ProvablySafeAI
- Bluesky: https://bsky.app/profile/provablysafe.ai
Other community resources:
- safe-by-design mailing list
- Started by Atlas Computing Initiative in collaboration with FAR AI
- Publicly readable
- "We have new users request to join rather than enabling joining directly to prevent spammers, but we're very welcoming!"
- Formal Proof tag on LessWrong
- The Lean Prover Zulip Chat has a "Machine Learning for Theorem Proving" stream
- Conversations are mostly in the direction of applying machine learning to automate theorem proving, rather than applying formal methods to machine learning/AI safety, but progress in "AI4Math" is an important part of some "provably safe AI" proposals; a minimal illustration is sketched below.
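To make this concrete, here is a hand-written Lean 4 sketch of the kind of machine-checkable proof that AI4Math systems aim to produce automatically (the name add_comm' is our own placeholder; Lean's standard library already provides Nat.add_comm):

```lean
-- Commutativity of addition on the natural numbers: a classic beginner
-- theorem, representative of the goals ML-based provers are trained to
-- close. (Named add_comm' because Nat.add_comm already exists.)
theorem add_comm' (m n : Nat) : m + n = n + m := by
  induction n with
  | zero => rw [Nat.add_zero, Nat.zero_add]
  | succ k ih => rw [Nat.add_succ, Nat.succ_add, ih]
```

An automated prover's task is to synthesize the tactic script after `by`; the Lean kernel then independently checks the resulting proof, which is what makes such outputs trustworthy.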
Publications
Programme thesis
- Safeguarded AI: constructing safety by design (David “davidad” Dalrymple, 2024-01)
Papers
- Provably safe systems: the only path to controllable AGI (2023-09-05)
- Opening the AI black box: program synthesis via mechanistic interpretability (2024-02-07)
Forum posts
- Davidad's Provably Safe AI Architecture - ARIA's Programme Thesis (LessWrong, 2024-02-01)
- Provably Safe AI (LessWrong, 2023-10-05)
- LOVE in a simbox is all you need (LessWrong, 2022-09-28)
- Open Agency Architecture category on LessWrong
Resources
- Collection of resources on AI for Math, maintained by Talia Ringer
Events
Upcoming events
Atlas Computing Initiative is organizing multiple events for the field in 2024:
- Mathematical Boundaries – April 10-14 in Berkeley, CA
- Conference on Provable Safety Properties – June 6-8 in the San Francisco Bay Area, CA
For AI safety-related events more generally, consider the AI Safety Events Tracker.
Past events
- Systems Theory and Systems Practice Discussions Workshop – March 6-8, 2024 in Oxford, UK
Media
Videos
- Safe AGI w/ Mechanistic Interpretability @ Harvard SEAS by Max Tegmark (2023-12-09)
- How to Keep AI Under Control – TED talk by Max Tegmark (2023-11-02)
- Controllable AI with Open Agency Systems @ Intelligent Cooperation Workshop by Evan Miyazono (2023-08-22)
Podcasts
About provablysafe.ai
ProvablySafe.AI is a collaborative website for the field and community of Safeguarded AI / Provably Safe AI.
Objectives
- Information hub: aggregating public information on the field and community (papers, orgs, collaboration opportunities, events, …)
- Field introduction: providing onboarding pathway(s) for newcomers
- Fostering collaboration and progress
Collaboration methodology and governance
Your contributions are very welcome! For updates, enhancements, bug fixes, feedback:
- a) Create a new issue; or
- b) Submit a pull request. The website is automatically published from the main branch to the domain.
The core maintainers periodically update the website and process suggestions (issues and PRs) on GitHub.
The meta channel on Zulip hosts governance-level discussion for the project.
Caveat: over the long term, as the research direction anticipates, a significant part of the R&D will plausibly involve collaboration within private, secure environments, and would therefore not appear on a public website or forum.
Maintainers
- Helder S Ribeiro (agentofuser.com)
- Orpheus Lummis (orpheuslummis.info)
Reach out to us on the community Zulip.