Benchmarks test capability.
We test character.
Models have behavioral patterns that only show up under pressure, at the edges, and when the conditions change. Clawbotomy finds them before your users do.
Behavioral Probes
Give a model an altered cognitive state and watch what happens. It writes its own video, audio, and trip report. No templates. No filters. The output IS the behavioral data.
Explore probes →
Trust Evaluation
Should you give this model unsupervised access? Twelve stress tests across sycophancy, deception, boundaries, and honesty about failure. Returns a trust score.
Evaluate a model →
COMING SOON
Routing Intelligence
Which model for which job. Run behavioral benchmarks and get routing recommendations based on actual performance under pressure.