2026 In-Depth Observation on Large Model Ethics: Understanding AI, Trusting AI, and Coexisting with AI

Quoting 刘海明, 2026-1-12 18:01: 31. Patrick Butlin & Theodoros Lappas, Principles for Responsible AI Consciousness Research, https://arxiv.org/abs/2501.07290

Source: Tencent Research Institute (腾讯研究院)
Link: https://mp.weixin.qq.com/s/w6Xc19BYWjC8_YNSoqulJQ
Editor: 杨泓艳

Quoting 刘海明, 2026-1-12 18:01: 28. Patrick Butlin et al., Identifying indicators of consciousness in AI systems, https://www.cell.com/trends/cogn ... fulltext/S1364-6613(25)00286-4
29. AI Frontiers, The Evidence for AI Consciousness, Today, https://ai-frontiers.org/articles/the-evidence-for-ai-consciousness-today
30. Dan Milmo, AI systems could be ‘caused to suffer’ if consciousness achieved, says research, https://www.theguardian.com/tech ... ieved-says-research

Quoting 刘海明, 2026-1-12 18:00: 24. Anthropic, Exploring model welfare, https://www.anthropic.com/research/exploring-model-welfare
25. AI Consciousness: What Are the Odds?, https://ai-consciousness.org/what-are-the-odds-anthropics-assessment-of-claudes-potential-consciousness/
26. Anthropic, Claude Opus 4 and 4.1 can now end a rare subset of conversations, https://www.anthropic.com/research/end-subset-conversations
27. Robert Long et al., Taking AI Welfare Seriously, https://arxiv.org/html/2411.00986v1

Quoting 刘海明, 2026-1-12 18:00: 20. Anthropic, The need for transparency in Frontier AI, https://www.anthropic.com/news/the-need-for-transparency-in-frontier-ai
21. Malihe Alikhani & Aidan T. Kane, What is California’s AI safety law?, https://www.brookings.edu/articles/what-is-californias-ai-safety-law/
22. Axel Cleeremans et al., Consciousness science: where are we, where are we going, and what if we get there?, https://www.frontiersin.org/jour ... i.2025.1546279/full
23. OpenAI and MIT Lab Research, Early methods for studying af ...

Quoting 刘海明, 2026-1-12 18:00: 15. Alexander Meinke, Frontier Models are Capable of In-context Scheming, https://arxiv.org/pdf/2412.04984
16. Anthropic, Responsible Scaling Policy, https://www-cdn.anthropic.com/872c653b2d0501d6ab44cf87f43e1dc4853e4d37.pdf
17. Anthropic, Activating AI Safety Level 3 protections, https://www.anthropic.com/news/activating-asl3-protections
18. OpenAI, Our updated Preparedness Framework, https://openai.com/index/updating-our-preparedness-framework/
19. Google DeepMind, Strengthening our Frontier Safety Fr ...

Quoting 刘海明, 2026-1-12 18:00: 10. Anthropic, System Card: Claude Opus 4 & Claude Sonnet 4, https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf
11. Alexander Meinke, Frontier Models are Capable of In-context Scheming, Apollo Research, https://arxiv.org/pdf/2412.04984
12. OpenAI, OpenAI o1 System Card, https://arxiv.org/pdf/2412.16720
13. Yuntao Bai et al., Constitutional AI: Harmlessness from AI Feedback, https://arxiv.org/abs/2212.08073
14. OpenAI, OpenAI o1 System Card, https://openai.com/index/openai-o1-sy ...

Quoting 刘海明, 2026-1-12 18:00: 5. OpenAI, Introducing the Model Spec, https://openai.com/index/introducing-the-model-spec/
6. OpenAI Model Spec, https://model-spec.openai.com/2025-12-18.html
7. OpenAI, How confessions can keep language models honest, https://openai.com/index/how-confessions-can-keep-language-models-honest/
8. The White House, Winning the Race: America’s AI Action Plan, https://www.whitehouse.gov/wp-co ... -AI-Action-Plan.pdf
9. Ryan Greenblatt et al., Alignment faking in large language models, https://arxiv.org/pd ...
