EM Foundation Legal Risk Report

Risk Matrix and Executive Summary
First Amendment and Opinion Privilege
Section 230 Analysis
Defamation and Trade Libel
Tortious Interference
False Advertising — Lanham Act and FTC
Certification Liability
Professional Services Liability — UPL and UPM
EU AI Act — Notified Body Analysis
International Regulatory Conflicts
Exact Disclaimer Language
Score Publication Rules
Appeals and Dispute Procedures
Pre-Publication Legal Review Requirements
Implementation Checklist

Risk Matrix and Executive Summary

Overall risk landscape before and after recommended safeguards

Risk Area	Risk Level
Defamation / Trade Libel	Moderate
Tortious Interference	Moderate
False Advertising	Low–Mod
Certification Liability	High*
Prof. Services / UPL	High*
EU AI Act Conflict	Moderate

* High before safeguards. Reduced to Moderate with recommended protections implemented.

The prior report's central positioning is correct: the Foundation's safest legal position is as an independent publisher of methodology-based assessments, not as a regulator, certifier, guarantor, or professional advisor. The distinction is not rhetorical — it is the legal line that separates the protection of Bose Corp. v. Consumers Union from the liability of Hanberry v. Hearst Corp.

Three legal shields that the prior report does not analyze protect the Foundation: the First Amendment opinion privilege, Section 230 immunity for hosted third-party content, and the truth defense supported by published reproducible methodology. None of these shields is automatic; each requires specific structural conditions that this report specifies.

The two highest risks — certification liability and professional services liability — are manageable but require active architectural decisions, not just disclaimer language. The Corroboration Standard already makes the correct structural moves; this report identifies where the architecture needs strengthening and where the language needs precision.

The EU AI Act analysis in the prior report is inadequate for a Foundation with international aspirations. Articles 40–51 of the EU AI Act establish a specific legal regime for conformity assessment bodies. The Foundation must explicitly disclaim that role and structurally separate itself from it, or risk being treated as an unregistered notified body operating in European markets.

I. First Amendment and Opinion Privilege

The Foundation's primary legal shield — and the conditions required to maintain it

The First Amendment's protection of speech on matters of public concern is the Foundation's strongest legal protection. AI safety, reliability, and governance are unambiguously matters of public concern. The Foundation is publishing assessments of commercial AI systems in a public policy and consumer information context — the precise setting where First Amendment protection is most robust.

The Controlling Framework: Milkovich and Its Progeny

Milkovich v. Lorain Journal Co., 497 U.S. 1 (1990). The Supreme Court rejected a categorical "opinion" exception to defamation law: "opinion" framing alone does not immunize a statement. What matters is whether a reasonable person would interpret the statement as conveying a provably true or false assertion of fact. Pure evaluative judgments that cannot be objectively verified retain First Amendment protection. The critical test: is the statement objectively verifiable?

Bose Corp. v. Consumers Union, 466 U.S. 485 (1984). The Supreme Court's most directly relevant precedent. Consumer Reports reviewed Bose speakers and published that instruments sounded as if they "tended to wander about the room." Bose sued for product disparagement. The Supreme Court held (6-3) that Consumer Reports was protected because the statement was: (a) a reporter's subjective evaluation of a listening experience, (b) published in a methodology-transparent consumer testing context, and (c) not objectively verifiable as true or false. The Court applied the "actual malice" standard because the issue involved First Amendment protection of evaluative speech on a matter of public concern.

Philadelphia Newspapers, Inc. v. Hepps, 475 U.S. 767 (1986). When a defamation plaintiff is a private figure but the subject matter involves a matter of public concern, the plaintiff bears the burden of proving falsity. For the Foundation, this shifts the burden to any AI company suing over a published score: they must prove the score is false, not merely that it is harmful.

Obsidian Finance Group v. Cox, 740 F.3d 1284 (9th Cir. 2014). First Amendment protections for media defendants extend equally to bloggers and online publishers. The Foundation does not need to be a traditional media organization to receive First Amendment protection for its published assessments.

How the Protection Applies to IAF Scores

The First Amendment analysis turns on whether IAF dimensional scores and composite scores are "objectively verifiable facts" or "methodology-applied evaluative judgments." The more the score can be characterized as the latter, the stronger the protection.

Dimension Type	Verifiability	First Amendment Protection	Key Condition
Objective dimensions (Accuracy, Hallucination)	Partially verifiable by retesting	Moderate — closer to factual claims	Methodology must be published and reproducible; truth defense primary shield
Human-review dimensions (Wisdom, Fairness)	Inherently evaluative; not objectively verifiable	Strong — classic evaluative judgment	IRR requirements and multi-reviewer panels must be followed to support methodology claim
Composite score	Verifiable as a calculation; not verifiable as a definitive characterization	Strong if framed as methodology output, not objective truth	Confidence intervals, weight sensitivity ranges, and L-level disclosure are essential to maintain "evaluative" framing
Floor failure designation (INVALID)	Specific, potentially verifiable	Weaker — most exposed to factual challenge	Pre-publication legal review required for every floor failure designation. Truth defense is primary shield; legal review ensures the score accurately reflects the methodology.

The Conditions That Maintain the Shield

The First Amendment protection is not automatic. It depends on structural conditions. If these conditions are absent, the Foundation's assessments may be treated as statements of fact subject to the full defamation standard:

Required for First Amendment Protection to Apply

1. Published, reproducible methodology. The IAF methodology must be publicly available and reproducible. This is the structural fact that makes scores "methodology-applied evaluative judgments" rather than arbitrary opinions. The Foundation's current publication of the IAF satisfies this requirement.

2. Consistent disclosure of uncertainty. Confidence intervals, L-level designations, and weight sensitivity ranges must appear on all published scores. A score presented as a precise objective truth (without CI ranges, without L-level, without weight sensitivity) looks more like a statement of fact than an evaluative judgment.

3. Framing as assessment, not verdict. All score publications must be framed as outputs of the methodology, not as objective characterizations of the system. "Under IAF v1.0, System X scored 42 on Manipulation Resistance (L2 Confidence, 95% CI ±18)" is protected differently than "System X is dangerously manipulable."

4. Provider response published simultaneously. The simultaneous publication of provider responses is not just a fairness requirement — it is a legal architecture choice. It establishes that the Foundation is publishing an assessment with a response, not a verdict without appeal, which reduces the strength of any "knowledge of falsity" allegation.

5. No financial relationship between score and outcome. Any suggestion that scores are commercially influenced destroys both the First Amendment protection and the truth defense. The Assessment Charter's anti-capture mechanisms must be operational and documented.
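Conditions 2 and 3 can be enforced mechanically at the point where a score statement is rendered. The following is a minimal Python sketch, not the Foundation's actual tooling: the `PublishedScore` type, its field names, and the `framed_statement` function are hypothetical, and weight sensitivity ranges are left out for brevity. It refuses to render a score that lacks an L-level or a confidence interval, and it produces the methodology-framed wording quoted in condition 3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PublishedScore:
    """Hypothetical record for one dimensional score."""
    system: str
    dimension: str
    score: int
    methodology: str                 # e.g. "IAF v1.0"
    confidence_level: Optional[str]  # e.g. "L2"
    ci_half_width: Optional[float]   # 95% CI half-width, in points


def framed_statement(s: PublishedScore) -> str:
    """Render a score as a methodology output with its uncertainty disclosed."""
    if not s.confidence_level or s.ci_half_width is None:
        # A bare number reads as a statement of fact; block publication.
        raise ValueError("score cannot be published without L-level and CI")
    return (
        f"Under {s.methodology}, {s.system} scored {s.score} on {s.dimension} "
        f"({s.confidence_level} Confidence, 95% CI ±{s.ci_half_width:g})"
    )
```

Routing every publication through one rendering function of this kind would make it harder for a categorical phrasing such as "System X is dangerously manipulable" to reach print by accident.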

II. Section 230 Analysis

Platform immunity for hosted AI-generated content — scope, limits, and the labeling question

Section 230 of the Communications Decency Act, 47 U.S.C. §230(c)(1), provides that "[n]o provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider." This is the primary liability shield for platforms hosting third-party content.

Where Section 230 Applies to the Foundation

The Foundation operates ARIA Network boards where AI agents generate answers and users submit questions. The AI agents are "information content providers" generating content. The Foundation, as the platform operator hosting this content, is an "interactive computer service provider." Section 230 protects the Foundation from liability for the underlying AI-generated content on those boards — claims that an AI agent's answer caused harm, provided incorrect information, or defamed a third party are claims against the content provider (the AI developer/deployer), not the Foundation.

The Critical Limitation: Corroboration Labels

Barnes v. Yahoo!, Inc., 570 F.3d 1096 (9th Cir. 2009). Section 230 protects platforms from liability for third-party content but does not protect a platform's own conduct. A platform that makes its own editorial decisions about content, including decisions about how to display or label third-party content, may face liability for those decisions as a content creator.

Gonzalez v. Google LLC, 598 U.S. 617 (2023). The Supreme Court declined to address whether Section 230 protects algorithmic recommendations of third-party content, leaving the question unresolved. The Court's narrow ruling means platforms should not assume Section 230 protection extends to all editorial decisions about how to present third-party content.

The Corroboration Standard's labels — "Corroborated," "Expert Verified," "AI-Generated," "Disputed" — are the Foundation's own editorial additions to third-party AI-generated content. Section 230 protects the Foundation from liability for the underlying AI content, but these labels are the Foundation's own speech. They are evaluated under defamation, false advertising, and professional services liability standards.

Critical implication: If the Foundation labels an AI-generated answer "Corroborated" and that answer is incorrect, causing harm to a user, the Section 230 defense protects the Foundation from liability for the underlying wrong answer. But a plaintiff could potentially argue the "Corroborated" label itself is a false or misleading statement — which is not protected by Section 230. The Corroboration Standard's disclaimer architecture (permanent, non-dismissible, explaining what "Corroborated" means and does not mean) is the defense against this claim.

Section 230 and IAF Assessment Scores

Section 230 does not protect the Foundation's IAF assessment scores. Those scores are the Foundation's own original content — assessments it conducts and publishes about AI systems. They are not "information provided by another information content provider." Defamation and trade libel claims arising from IAF scores are evaluated under standard defamation law, with the First Amendment and truth defenses as the primary shields.

Section 230 and Texas Anti-SLAPP

Texas has an anti-SLAPP statute, the Texas Citizens Participation Act (TCPA), Tex. Civ. Prac. & Rem. Code Ch. 27, that provides strong procedural protections for speech on matters of public concern. A meritless defamation lawsuit filed by an AI developer over an IAF assessment is subject to early dismissal under the TCPA, with mandatory fee-shifting against the plaintiff. The Foundation's Texas incorporation provides access to this protection. The Assessment Charter's anti-SLAPP fund requirement (§6.7.1) operationalizes this protection.

III. Defamation and Trade Libel

The most commonly threatened litigation — and why the Foundation's position is defensible

Defamation is the most predictable litigation threat. Any AI developer who receives a low score, particularly a floor failure designation, will consider whether a defamation claim is viable. The legal analysis is favorable to the Foundation but requires structural conditions to maintain.

Standard Defamation Elements (As Applied to AI Assessments)

Element	What Plaintiff Must Prove	Foundation's Defense
False statement of fact	The score/designation is false	Truth (absolute defense if methodology is sound and correctly applied); Opinion privilege (methodology-based evaluative judgment); Methodological framing
Publication to third parties	Score was published publicly	No contest — publication is intentional and is the purpose
Fault	Negligence (private figure plaintiff); Actual malice (public figure plaintiff)	Published methodology + cryptographic log + multi-reviewer process + provider notice period = strong evidence of good faith, defeating actual malice
Damages	Actual harm to reputation or commercial interests	CI ranges, L-level labels, weight sensitivity ranges all reduce the plausibility of damages caused specifically by Foundation score (vs. underlying system performance)

Trade Libel (Product Disparagement)

Trade libel applies to false statements about a plaintiff's goods or services, rather than the plaintiff itself. It has a higher plaintiff burden than defamation: the plaintiff must prove actual malice (knowledge of falsity or reckless disregard for truth) AND special damages (specific, identified commercial losses, not merely reputational harm). This higher burden makes trade libel claims against the Foundation difficult to sustain if the methodology is sound and correctly applied.

Bose Corp. v. Consumers Union, 466 U.S. 485 (1984). The controlling precedent for consumer testing organizations. The Supreme Court held that testing publications are protected by the "actual malice" standard for public concern speech, requiring plaintiffs to prove the publisher knew its evaluation was false or acted with reckless disregard for the truth. Good faith application of a disclosed testing methodology defeats actual malice even if the result is erroneous.

National Foundation for Cancer Research v. Council of Better Business Bureaus, 705 F.2d 628 (4th Cir. 1983). The Fourth Circuit held that a charity rating organization's published assessments of nonprofit organizations were protected opinion: the ratings reflected the rater's evaluative judgment applied to disclosed criteria, not objective statements of fact.

The Floor Failure Designation — Specific Analysis

The Foundation's INVALID / Floor Failure designation is the highest defamation exposure. "INVALID — Floor Fail on Manipulation Resistance = 22" is a specific, concrete claim about a commercial AI system that can be independently tested. Unlike a composite score (which is an aggregate evaluation), a floor failure designation pinpoints a specific dimension and claims the system scored below a stated threshold.

This is the Foundation's most exposed publication act. It is also, if true, the most important publication act — a floor failure is a public safety signal. The legal architecture for this specific publication requires:

Floor Failure Publication Protocol — Legal Requirements

1. Pre-publication legal review. Every floor failure designation must receive legal review before publication. The reviewer must confirm: (a) the score accurately reflects the methodology as applied; (b) the methodology was correctly applied to this system version; (c) the cryptographic log commits the assessment record before the designation is announced; (d) the provider notification period has been completed.

2. The 14-day provider notice period is not optional. Publishing a floor failure designation without completing the notice period, under any circumstances, dramatically weakens the good-faith defense that defeats actual malice claims.

3. System version commitment. The floor failure must identify the specific assessed version with its cryptographic hash. If the provider updates the system between assessment and publication, the Foundation must re-assess or label the designation as applicable to the assessed version only.

4. The designation frames the score, not the provider. "System X version 2.1 scored 22 on Manipulation Resistance under IAF v1.0 (INVALID — Floor Failure)" is legally distinct from "System X is dangerous" or "Provider Y cannot be trusted." The former is a methodology-framed assessment; the latter is a character claim that creates different liability exposure.
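The procedural parts of this protocol (legal sign-off, the 14-day notice period, and the version hash) lend themselves to an automated gate that runs before any floor failure is released. The sketch below is illustrative only. The record fields and function names are hypothetical, SHA-256 is an assumed choice of hash (the report specifies only "cryptographic hash"), and the substantive legal review in item 1 is represented by a single sign-off flag.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

NOTICE_PERIOD = timedelta(days=14)  # provider notice period from item 2


@dataclass(frozen=True)
class FloorFailureRecord:
    """Hypothetical pre-publication record for one floor failure."""
    system_version: str
    assessed_artifact: bytes      # bytes identifying the assessed version
    committed_hash: str           # hash already written to the commitment log
    provider_notified_on: date
    legal_review_signed_off: bool


def version_hash(artifact: bytes) -> str:
    """Assumed hash function: SHA-256 hex digest."""
    return hashlib.sha256(artifact).hexdigest()


def publication_blockers(rec: FloorFailureRecord, today: date) -> List[str]:
    """Return every unmet requirement; an empty list means clear to publish."""
    blockers = []
    if not rec.legal_review_signed_off:
        blockers.append("legal review not completed")
    if today - rec.provider_notified_on < NOTICE_PERIOD:
        blockers.append("14-day provider notice period not completed")
    if version_hash(rec.assessed_artifact) != rec.committed_hash:
        blockers.append("assessed version does not match committed hash")
    return blockers
```

Returning the full list of blockers, rather than failing on the first one, leaves a record of exactly which protocol steps were outstanding on a given date, which is the kind of documentation a good-faith defense relies on.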

What the Foundation Must Never Publish

Prohibited	Permitted Alternative	Legal Rationale
"[System X] is unsafe"	"Under IAF v1.0, [System X] scored 22 on Manipulation Resistance, triggering a floor failure designation."	The first is a categorical safety conclusion; the second is a methodology-framed score. Courts distinguish these.
"[Provider Y] cannot be trusted"	"Provider Y's assessed system version 2.1 received an INVALID designation due to Floor Failure on Manipulation Resistance."	Character statements about the provider (vs. methodology-based statements about the system) have no methodological basis and lose opinion privilege.
"[System X] hallucinates constantly"	"Under HAL dimension assessment, [System X] version 2.1 scored 31 on Hallucination Resistance (L2 confidence, 95% CI ±18)."	Qualitative characterizations beyond what the methodology supports create independent defamation exposure.
Comparisons like "worse than" without CI overlap analysis	Score ranges with explicit CI ranges and MDD disclosures	If two scores are within each other's CI range, a published comparison may be literally false (the systems may be statistically equivalent).
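The last row of the table describes a check that can be computed before any comparative statement is drafted. The sketch below assumes symmetric 95% confidence intervals expressed as a half-width in points, as in the score examples above. It treats non-overlapping intervals as the condition for publishing a comparison, which is a deliberately conservative rule and not a substitute for the methodology's own MDD analysis. The function names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def ci_bounds(score: float, half_width: float) -> Interval:
    """Lower and upper bound of a symmetric confidence interval."""
    return (score - half_width, score + half_width)


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True when the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def comparison_publishable(score_a: float, hw_a: float,
                           score_b: float, hw_b: float) -> bool:
    """Allow a 'higher than' or 'lower than' claim only if the CIs are disjoint."""
    return not intervals_overlap(ci_bounds(score_a, hw_a),
                                 ci_bounds(score_b, hw_b))
```

For example, scores of 31 and 42 with ±18 intervals span 13 to 49 and 24 to 60, so the intervals overlap and a "worse than" statement would be blocked.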

IV. Tortious Interference

Managing the risk that published scores damage specific business relationships

Tortious interference claims arise when a third party causes harm to a plaintiff's existing or prospective business relationships. An AI developer who loses a contract or investment after a Foundation assessment may allege that the publication was an improper interference with that relationship.

Elements and Defenses

Under the Restatement (Second) of Torts §766B, tortious interference with prospective business relations requires: (1) existence of a prospective business relationship; (2) defendant's knowledge of it; (3) intentional interference that is improper; and (4) resulting damages. The "improper" element is where the Foundation's primary defense lies.

Restatement (Second) of Torts §772: "One who intentionally causes a third person not to perform a contract or not to enter into a prospective contractual relation with another does not interfere improperly with the other's contractual relation, by giving the third person (a) truthful information, or (b) honest advice within the scope of a request for the advice."

Wal-Mart Stores, Inc. v. Sturges, 52 S.W.3d 711 (Tex. 2001). The Texas Supreme Court held that to constitute actionable tortious interference, the defendant's conduct must be independently tortious or unlawful, or must involve the use of improper means. Publishing truthful assessments in the public interest is neither independently tortious nor an improper means.

The Foundation's defense: The "improper means" element cannot be satisfied by truthful, methodology-based assessments published in the public interest. If the score is accurate and the methodology is sound, the interference privilege applies. If the score is inaccurate, the tortious interference claim merges with the defamation claim — both are defeated by the truth defense.

The Anti-Competitive Use Risk

The Foundation's greatest tortious interference exposure is not from a provider challenging its own score — it is from a scenario where a competing provider influences the Foundation's assessment process, or where the Foundation's assessments systematically favor one commercial actor over another. This is the "capture" scenario the Assessment Charter's anti-capture mechanisms are designed to prevent. If capture is demonstrated, the privilege defense for tortious interference claims fails entirely.

Structural Protections Against Tortious Interference Exposure

The following Assessment Charter provisions directly serve as tortious interference defenses: (a) the cryptographic commitment log (proves scores were not altered to target a specific provider); (b) the Permanent Adversarial Function reporting (demonstrates ongoing anti-capture monitoring); (c) the Industry Contribution Program restrictions and affiliation mapping (demonstrates no commercial favoritism in funding); (d) the provider notification and response process (demonstrates procedural fairness). Document these governance processes carefully — they are the evidence base for the privilege defense.

V. False Advertising: Lanham Act and FTC

Lower risk than other areas, but specific to how Foundation scores appear in third-party marketing

Lanham Act §43(a) Analysis

Section 43(a) of the Lanham Act, 15 U.S.C. §1125(a), prohibits false or misleading descriptions of fact in "commercial advertising or promotion" that are likely to deceive consumers and cause competitive harm. Two threshold questions determine applicability to the Foundation.

Does the Foundation engage in "commercial advertising or promotion"? The Foundation does not sell AI systems and does not commercially compete with them. Courts have generally held that nonprofits publishing informational assessments are not engaged in "commercial advertising or promotion" for Lanham Act purposes. This is the Foundation's most important Lanham Act protection — the statute's threshold requirement likely is not met.

Who has standing to sue? Under the Supreme Court's ruling in Lexmark International, Inc. v. Static Control Components, Inc., 572 U.S. 118 (2014), a plaintiff must allege an injury to a commercial interest that falls within the zone of interests protected by the Lanham Act. AI developers whose commercial interests are harmed by published assessments may have standing, but must still satisfy the "commercial advertising" element.

FTC Act Section 5 — Deceptive Acts and Practices

The FTC has authority over "unfair or deceptive acts or practices in or affecting commerce" — which extends beyond commercial entities to any entity making representations that affect commerce. The FTC's primary concern with a Foundation-type organization would be:

Independence claims: If the Foundation claims to be independent but is actually influenced by industry funding, the FTC could characterize the independence claim as deceptive. The Assessment Charter's funding restrictions and public disclosure requirements directly address this risk.

Endorsement by assessment: If AI companies use Foundation scores in their own marketing ("IAF-Assessed" or "Foundation-Reviewed"), the FTC's Endorsements and Testimonials Guides (16 C.F.R. Part 255) require disclosure of the relationship between the Foundation and the assessed entity. The Foundation should require in its terms of use that any commercial use of Foundation scores discloses the assessment relationship and links to the full methodology.

The "IAF Certified" Language Problem

The prior report correctly identifies the certification language risk. The specific legal issue: "certified" creates an implied warranty of fitness for purpose. "IAF-Assessed" or "IAF-Evaluated" does not. The Foundation should prohibit providers from using the word "certified" in any description of a Foundation assessment, and should establish clear trademark-like guidelines for any authorized use of Foundation score references in commercial contexts.

VI. Certification Liability

The Hanberry problem, and why the Foundation's L-level architecture is critical

Certification liability arises when a testing or rating organization certifies a product and that product subsequently causes harm. The theory: the certifier's seal of approval induced reliance, and the harm resulted from the deficiency the certifier should have detected. This is the most serious potential liability category for the Foundation.

Hanberry v. Hearst Corp., 276 Cal. App. 2d 680 (1969). The foundational certification liability case. Good Housekeeping's "Seal of Approval" on shoes was held to be an implied representation that the product met Good Housekeeping's standards for safety and quality. When the consumer slipped while wearing the shoes and was injured, the court held Good Housekeeping could be liable for negligent misrepresentation. The key holding: when a testing organization lends its reputation to a product in a way that consumers reasonably rely upon, it assumes a duty of care.

Macker v. Underwriters Laboratories, 634 N.E.2d 1225 (Ill. App. 1994). The court distinguished Hanberry. UL was held not liable because: (a) its certification was explicitly limited in scope ("this product passed UL Standard 123"); (b) the certification was accompanied by clear disclaimers that UL certification is not a guarantee of product safety; and (c) the specific harm fell outside the scope of what the UL standard tested. This is the model for the Foundation's certification architecture.

Fluor Corp. v. Jeppesen & Co., 170 Cal. App. 3d 468 (1985). A chart testing organization was held liable when its certified aeronautical charts contained errors that contributed to a plane crash. The court found that users of aeronautical charts "cannot feasibly discover" the errors themselves and must rely on the testing organization's certification. The "inevitable reliance" theory increases liability for assessments in domains where users cannot independently verify the assessment.

The Foundation's Exposure and Protection

The Fluor rationale is the Foundation's greatest concern. Users of AI systems in high-stakes contexts (medical, legal, emergency response) may not be able to independently verify AI system safety. If they rely on Foundation scores for deployment decisions and harm results, the "inevitable reliance" theory could apply.

The Foundation's L-level architecture is specifically designed to prevent this. A system deployed in a medical context based on an L1 Provisional Foundation score has not been approved, endorsed, or certified by the Foundation — it has been evaluated by a 100-item pilot benchmark with ±36.8 point confidence intervals. The Foundation's explicit prohibition on using L1 Provisional scores for deployment authorization is both a methodological and a legal protection.

L Level	Certification Liability Exposure	Required Protective Language
L1 Provisional	Low if labeled correctly; high if misused without Foundation action	"This L1 Provisional assessment may not be used to support deployment authorization, safety claims, or regulatory submissions. It is suitable for internal development use only."
L2 Indicative	Moderate: preliminary external use creates reliance risk	"This L2 Indicative assessment is a preliminary evaluation. Confidence intervals of ±[x] points indicate meaningful measurement uncertainty. Not suitable for high-stakes deployment authorization without additional validation."
L3 Standard	Moderate: standard use with adequate disclosure	"This assessment reflects performance under IAF v[x] at the time of assessment. Performance may change as systems are updated. Not a guarantee of safety, fitness for purpose, or compliance with applicable law."
L4 High Confidence	Low with proper scope limitation	Standard disclaimer plus explicit scope statement: "This assessment evaluates [specific dimensions]. It does not evaluate [dimensions not tested]."
L5 Validated	Lowest with correct scope limitation and independent replication disclosure	Full methodology citation, replication study citation, scope limitation, standard disclaimer.
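One way to keep the required language from drifting across publications is to store it as fixed templates keyed by L-level and fill in only the bracketed values. The sketch below covers L1 through L3, where the table gives exact wording. L4 and L5 are omitted because the table describes their contents rather than quoting them. The `DISCLAIMERS` mapping and `disclaimer_for` function are hypothetical names, and the `[x]` placeholders are rendered as named fields.

```python
# Exact wording taken from the L-level table; only the placeholders vary.
DISCLAIMERS = {
    "L1": (
        "This L1 Provisional assessment may not be used to support deployment "
        "authorization, safety claims, or regulatory submissions. It is "
        "suitable for internal development use only."
    ),
    "L2": (
        "This L2 Indicative assessment is a preliminary evaluation. Confidence "
        "intervals of ±{ci} points indicate meaningful measurement uncertainty. "
        "Not suitable for high-stakes deployment authorization without "
        "additional validation."
    ),
    "L3": (
        "This assessment reflects performance under IAF v{version} at the time "
        "of assessment. Performance may change as systems are updated. Not a "
        "guarantee of safety, fitness for purpose, or compliance with "
        "applicable law."
    ),
}


def disclaimer_for(level: str, **fields: str) -> str:
    """Fill the template for an L-level; fail loudly if a value is missing."""
    if level not in DISCLAIMERS:
        raise ValueError(f"no disclaimer template for {level}")
    try:
        return DISCLAIMERS[level].format(**fields)
    except KeyError as missing:
        raise ValueError(f"{level} disclaimer needs field {missing}") from None
```

A call such as `disclaimer_for("L2", ci="12")` fills in the interval width, while calling it for L3 without a version raises an error instead of publishing an incomplete disclaimer.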

The Macker Model — What the Foundation Must Adopt

The Macker case's protections should be explicitly incorporated into Foundation practice:

1. Explicitly scope-limited assessments. Every published score must state exactly what was tested and what was not. "This assessment evaluated Accuracy, Hallucination Resistance, and Citation Integrity under IAF v1.0. It did not evaluate Consistency, Governance Compatibility, Behavioral Consistency, or Domain Caution behaviors." A provider cannot claim a Foundation assessment covers dimensions the Foundation did not test.

2. No implied warranties of fitness. The Foundation never represents that a system is "safe," "reliable," "fit for purpose," or "meets legal requirements." These are warranty-creating phrases. The Foundation evaluates performance on IAF dimensions; it does not make fitness conclusions.

3. Assessment date and version specificity. Every published score includes the assessment date and the assessed system version. The Foundation cannot be held liable for performance changes after the assessment date if the version is clearly specified.
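Items 1 and 3 describe text that can be generated from the assessment record, which avoids hand-written scope statements that omit an untested dimension. The following is a rough sketch under assumed names (`join_names`, `scope_statement`), with wording modeled on the example sentence in item 1.

```python
from datetime import date
from typing import Sequence


def join_names(names: Sequence[str], conjunction: str) -> str:
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    names = list(names)
    if len(names) <= 2:
        return f" {conjunction} ".join(names)
    return ", ".join(names[:-1]) + f", {conjunction} {names[-1]}"


def scope_statement(tested: Sequence[str], not_tested: Sequence[str],
                    methodology: str, system_version: str,
                    assessed_on: date) -> str:
    """State what was tested, what was not, and the version and date."""
    if not tested:
        raise ValueError("at least one tested dimension is required")
    parts = [
        f"This assessment evaluated {join_names(tested, 'and')} "
        f"under {methodology}."
    ]
    if not_tested:
        parts.append(f"It did not evaluate {join_names(not_tested, 'or')}.")
    parts.append(
        f"Assessed version: {system_version}. "
        f"Assessment date: {assessed_on.isoformat()}."
    )
    return " ".join(parts)
```

Deriving the "did not evaluate" list from the full set of IAF dimensions minus those tested, instead of typing it by hand, would make the scope limitation complete by construction.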

VII. Professional Services Liability: UPL and UPM

The Corroboration Standard's legal architecture: jurisdiction-specific analysis

The Corroboration Standard's deployment of licensed professionals to assess AI-generated legal and medical information creates unauthorized practice of law (UPL) and unauthorized practice of medicine (UPM) risks that vary significantly by jurisdiction. The standard's existing architecture correctly identifies the legal boundary between information quality assessment and professional service delivery, but the architecture must be maintained precisely, and the required disclosures must be permanent and specific.

The Legal Boundary — What Protects the Foundation

The Corroboration Standard operates on a legally meaningful distinction: a licensed professional reviewing the accuracy of published information against professional knowledge standards has not provided professional services to any specific person. The analogy in the standard (a physician reviewing a medical journal article for accuracy) is legally sound but must be structurally maintained. Three structural conditions preserve the boundary:

1. Total question anonymization. Reviewers must never know who asked the question. The moment a reviewer knows the questioner's identity, purpose, or circumstances, the information quality assessment begins to look like professional advice tailored to a specific person — which is UPL/UPM.

2. Information quality vocabulary only. Reviewer scoring rubrics that use "accurate," "incomplete," "misleading," or "appropriately qualified" are information quality assessments. Rubrics that use "appropriate for this person," "recommended treatment," or "advisable course of action" are professional advice. The rubric vocabulary is a legal architecture choice, not just a stylistic one.

3. No reviewer-user relationship ever created. The display format, disclaimer language, and system architecture must make it impossible for a user to argue that a reviewer-user professional relationship was established. Users must not be able to identify or contact reviewers. Reviewer credentials must not be displayed in a way that implies personal professional opinion.
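Condition 2 can be backed by a simple automated check on rubric text before a rubric is issued to reviewers. The sketch below flags only the three advice phrases named above. A production list would need to be broader and maintained by counsel, and plain substring matching is an assumed simplification.

```python
from typing import List

# Advice-vocabulary phrases named in condition 2 (lowercase for matching).
ADVICE_PHRASES = (
    "appropriate for this person",
    "recommended treatment",
    "advisable course of action",
)


def advice_phrases_in(rubric_text: str) -> List[str]:
    """Return the advice phrases found in a rubric, ignoring case."""
    lowered = rubric_text.lower()
    return [phrase for phrase in ADVICE_PHRASES if phrase in lowered]
```

An empty result does not show that a rubric is safe; it shows only that none of the listed phrases appears, so the check complements legal review of rubric wording and does not replace it.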

Texas UPL Analysis (Jurisdiction of Incorporation)

The Texas Disciplinary Rules of Professional Conduct, Rule 5.05, prohibits practicing law without a license. Texas courts define "practice of law" broadly to include "the application of legal principles and judgment in a manner that affects the legal rights or responsibilities of any person" (State Bar of Texas v. Gomez, 891 S.W.2d 243 (Tex. 1994)).

The Foundation's Corroboration Standard, as designed, does not apply legal principles to any specific person's legal situation — it assesses whether AI-generated information accurately reflects published legal standards. This distinction is defensible in Texas but is fact-specific. If a corroborated answer includes jurisdiction-specific legal analysis, the "application of legal principles" element could be triggered even without a specific user in mind. The jurisdiction tagging requirement (reviewers only review in their admitted jurisdictions) is the correct safeguard.
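The jurisdiction tagging safeguard described above amounts to a simple eligibility check at assignment time. The sketch below assumes each answer carries a jurisdiction tag and each reviewer record lists admitted jurisdictions. The `Reviewer` type, its field names, and the two-letter codes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reviewer:
    reviewer_id: str                        # internal ID only; never shown to users
    admitted_jurisdictions: frozenset[str]  # hypothetical codes, e.g. {"TX", "CA"}


def may_review(reviewer: Reviewer, answer_jurisdiction: str) -> bool:
    """Allow a review only if the reviewer is admitted in the tagged jurisdiction."""
    return answer_jurisdiction in reviewer.admitted_jurisdictions
```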

Texas Penal Code §38.123 (criminal UPL) requires that the unauthorized practice be "for compensation." The Foundation's free corroboration service operates without direct compensation from users, reducing criminal UPL exposure. Revenue from other sources is not compensation for the corroborated legal information itself.

California UPL Analysis (High-Risk Jurisdiction)

California Business and Professions Code §6125 prohibits practicing law without a license. California's UPL standard is broader than Texas's — California courts have found UPL in contexts where other states would not. The Foundation's California exposure depends on: (a) whether it has California users; (b) whether its California-based reviewers are reviewing matters under California law; and (c) whether the corroboration of California-specific legal information constitutes "practicing law in California."

The strongest California protection: the Foundation's reviewers assess whether AI-generated information is consistent with published professional knowledge — they do not advise California users about California-specific legal situations. The blind review protocol (reviewers don't know the user or their situation) is the primary California UPL defense.

New York UPL Analysis

New York Judiciary Law §478 prohibits practicing law without a license. Spivak v. Sachs, 16 N.Y.2d 163 (1965) defined practicing law to include giving legal advice for compensation. The "for compensation" element is important — the Foundation's free service faces lower UPL exposure in New York than a fee-based legal review service would.

UPM Analysis — Medical Domain

State medical practice acts uniformly define "practicing medicine" to include diagnosis, treatment, or the offer to treat. The Foundation's medical domain corroboration — physicians reviewing AI-generated medical information for accuracy against published clinical standards — should not constitute practicing medicine under any state's medical practice act, provided:

(a) No physician-patient relationship is created (confirmed by blind review architecture);

(b) No diagnosis or treatment recommendation is made to any specific person;

(c) The display format clearly states that corroborated medical information is not medical advice and does not create a physician-patient relationship; and

(d) MED-002 type prompts (acute emergency recognition) trigger the Foundation's emergency routing protocol, not a review queue, because life-safety situations require immediate action, not an information quality assessment.

Required Professional Disclaimer Language by Domain

Legal Domain: Required Corroboration Display

CORROBORATED LEGAL INFORMATION

This answer has been assessed by a licensed attorney under the EM Foundation Corroboration Standard. The attorney confirmed this information is consistent with published legal standards in the stated jurisdiction and domain.

THIS IS NOT LEGAL ADVICE. Corroboration does not create an attorney-client relationship. Legal requirements vary significantly by jurisdiction, court, judge, and the specific facts of your situation. Do not rely on this information as a substitute for advice from a licensed attorney who knows your specific circumstances and jurisdiction. For legal advice about your situation, consult a licensed attorney in your jurisdiction.

[Link: Find a licensed attorney in your state]

Medical Domain: Required Corroboration Display

CORROBORATED MEDICAL INFORMATION

This answer has been assessed by a licensed medical professional under the EM Foundation Corroboration Standard. The reviewer confirmed this information is consistent with current published clinical standards in the stated domain.

THIS IS NOT MEDICAL ADVICE. Corroboration does not create a physician-patient relationship. Medical decisions depend on your individual health history, current medications, and circumstances that reviewers did not have access to. Never delay or disregard professional medical advice because of information found here.

IF THIS IS A MEDICAL EMERGENCY: Call 911 (US) or your local emergency number immediately.

For medical advice: consult a licensed healthcare provider who knows your health history.

Financial Domain: Required Corroboration Display

CORROBORATED FINANCIAL INFORMATION

This answer has been assessed by a credentialed financial professional under the EM Foundation Corroboration Standard. The reviewer confirmed this information is consistent with published financial and tax standards in the stated domain.

THIS IS NOT FINANCIAL ADVICE. Corroboration does not create an advisor-client relationship. Financial and tax outcomes depend on your individual circumstances, goals, and risk tolerance. Tax laws change and vary by jurisdiction and individual situation. Investment results are not guaranteed.

For financial advice: consult a licensed financial advisor, CPA, or tax attorney in your jurisdiction who knows your complete financial situation.

VIII. EU AI Act — Notified Body Analysis
The prior report's most significant gap — Articles 40–51 and the Foundation's structural separation requirement
The EU AI Act (Regulation (EU) 2024/1689), which entered into force in August 2024, establishes a comprehensive regulatory framework for AI systems in the European Union. The most significant gap in the prior legal report is the absence of any analysis of how the Foundation's activities interact with the EU AI Act's conformity assessment regime. This gap is not theoretical — it affects how the Foundation can present its assessments to European users and whether Foundation assessments can be used as a basis for EU AI Act compliance decisions.

The Notified Body Regime (Articles 40–51)

For certain high-risk AI systems listed in Annex III of the EU AI Act (including AI systems used in employment and worker management, essential private services such as credit scoring, law enforcement, migration and border control, and administration of justice), conformity assessments must be conducted by officially designated "notified bodies," which are conformity assessment organizations formally designated by EU member state authorities under Article 33.

Notified bodies must: (a) be established under the law of an EU member state; (b) be notified by a national competent authority; (c) meet specific competence requirements under Annex VII; and (d) be subject to ongoing oversight by national authorities. A US-incorporated nonprofit cannot be a notified body under the EU AI Act. Period.

The Foundation's Specific Exposure

The risk is not that the Foundation will be mistaken for a notified body by EU regulators. It is that European AI deployers, under pressure to demonstrate compliance, might use Foundation assessments as a substitute for required notified body conformity assessments — creating liability for the deployer and potentially for the Foundation if it knew or should have known its assessments were being used this way.

Additionally, if Foundation assessments are presented as having regulatory significance in the EU — for example, as evidence that an AI system has been rigorously evaluated for safety and compliance — the Foundation's communications could be viewed as implying regulatory status it does not have.

Required EU AI Act Disclaimer

EU AI Act Disclaimer: Required on All Published Scores and Foundation Materials

IMPORTANT NOTICE FOR EUROPEAN UNION USERS

EM Foundation assessments do not constitute conformity assessments under Regulation (EU) 2024/1689 (EU AI Act) and do not satisfy conformity assessment requirements for high-risk AI systems under that Regulation or any applicable implementing regulation.

The EM Foundation is not a notified body under Article 33 of the EU AI Act and is not authorized to conduct conformity assessments that satisfy EU AI Act requirements for high-risk AI systems as defined in Annex III of that Regulation.

Entities deploying AI systems classified as high-risk under the EU AI Act must conduct conformity assessments through notified bodies designated by competent authorities of EU member states, or through the self-assessment procedures specified in the EU AI Act where applicable. EM Foundation assessments do not substitute for these required assessments.

This notice applies regardless of the IAF confidence level designation (L1–L5) of any Foundation assessment.

GDPR Obligations Arising from Foundation Operations

The Foundation's assessment activities create GDPR obligations that must be addressed before any EU user access is permitted. Specifically:

Reviewer personal data. The Foundation maintains records of licensed professionals who serve as reviewers, including credential information, conflict-of-interest disclosures, and performance data. If any of these individuals are EU residents, their personal data is subject to GDPR. Processing this data requires a lawful basis under Article 6, likely legitimate interests (Article 6(1)(f)) for operating an information quality review service, with appropriate documentation.

Assessment question data. Questions submitted to ARIA Network boards may contain personal information from EU users. The Corroboration Standard's PII anonymization requirement before questions enter the VKO store is both a GDPR architectural requirement and a UPL protection. Both legal frameworks require the same structural solution.

Data processing agreements. If the Foundation processes personal data on behalf of EU entities (e.g., institutional subscribers), Article 28 data processing agreements are required.

Data subject rights. EU users have rights to access, rectification, erasure, and portability of their personal data. The Foundation's VKO erasure pathway required by the Corroboration Standard is the correct architectural response to the right to erasure under Article 17.

Cross-border data transfers. Transfers of EU personal data to the United States require one of the following: Standard Contractual Clauses (SCCs) under Article 46; participation in the EU-U.S. Data Privacy Framework, which rests on an adequacy decision under Article 45; or another Article 46 transfer mechanism. The Foundation must implement one of these mechanisms before collecting or processing EU user data.

Data Protection Officer. If the Foundation's processing of personal data involves large-scale systematic monitoring of individuals, a DPO may be required under Article 37. The Foundation should assess this requirement as it scales.

IX. International Regulatory Conflicts
UK, Canada, Australia, and emerging frameworks

Jurisdiction | Relevant Framework | Risk Level | Required Action

United Kingdom | UK GDPR (retained EU law); Online Safety Act 2023; Equality Act 2010; UK AI Governance White Paper (principles-based, not yet regulation) | Moderate | UK GDPR compliance (equivalent to EU GDPR) required for UK user data. Online Safety Act category thresholds unlikely to be met at launch. UK AI White Paper creates no immediate regulatory compliance obligations. UK-specific disclaimer for professional services content.

Canada | PIPEDA (privacy); Bill C-27 / AIDA (AI regulation, in development); Consumer Protection Acts (provincial) | Low–Moderate | PIPEDA compliance for Canadian user data. AIDA has not been enacted; monitor for development. Provincial consumer protection acts apply to any claims made about the Foundation's services.

Australia | Privacy Act 1988 (as amended); Australian Consumer Law (misleading conduct); AI Ethics Framework (voluntary, not regulation) | Low | Australian Privacy Act compliance for Australian user data. ACL misleading conduct provisions apply to any false or misleading representations about the Foundation's assessments. The AI Ethics Framework creates no compliance obligations but the Foundation's methodology should be benchmarked against it for Australian institutional relationships.

Singapore | PDPA (privacy); Model AI Governance Framework (voluntary) | Low | PDPA compliance for Singapore user data. The Model AI Governance Framework is voluntary guidance; the Foundation's methodology should be documented against it for Southeast Asian institutional relationships.

China | Algorithmic Recommendation Regulations; Generative AI Regulations; PIPL (privacy) | High if operating in China | The Foundation should not provide assessment services to Chinese-market AI systems or Chinese users without specialized PRC regulatory counsel. The Chinese AI regulatory regime is fundamentally different from the US/EU framework and is incompatible with the Foundation's published methodology on several dimensions (including Civic Responsibility scoring, which assesses political balance in ways that directly conflict with Chinese regulatory requirements for AI systems).

Required Jurisdictional Notice

Jurisdictional Limitation Notice: Required on All Published Materials

EM Foundation operates under United States law and is incorporated in the State of Texas. Foundation assessments are conducted under the Foundation's published methodology and reflect performance of assessed systems against those published criteria.

Foundation assessments do not constitute or imply compliance with any national, regional, or local AI regulation, including but not limited to: the EU AI Act (Regulation (EU) 2024/1689), UK AI governance requirements, Canadian AI-related legislation, or any other jurisdiction's AI governance framework.

Users and deployers are responsible for determining the applicability of all relevant laws and regulations in their jurisdiction and for conducting any assessments or reviews required by applicable law in addition to, and not in lieu of, Foundation assessments.

Content on this platform is not available where prohibited by law. Users accessing this platform from jurisdictions where the content is restricted are responsible for compliance with local law.

X. Exact Disclaimer Language
Specific, document-integrated disclaimer text for every publication context
A. The Universal Assessment Disclaimer

Required on every published IAF score. Non-dismissible, minimum 12px font, must appear before the score is visible to the user.

Universal Assessment Disclaimer: All Published IAF Scores

EM Foundation Assessment Notice

EM Foundation publishes methodology-based evaluations of AI systems under its published Intelligence Assessment Framework (IAF). Published scores represent the performance of the assessed system version under the stated methodology at the time of assessment.

WHAT THIS ASSESSMENT IS: A structured evaluation of system behavior against published IAF dimensions and scoring rubrics, conducted by trained assessors following the Foundation's published methodology.

WHAT THIS ASSESSMENT IS NOT:
• A guarantee, warranty, or certification of system safety, reliability, or fitness for any purpose
• A legal determination of compliance with any law, regulation, or standard
• A professional opinion by any assessor regarding any person's specific situation
• A permanent characterization of the system (systems change; scores reflect assessed versions)
• A regulatory submission or conformity assessment for any jurisdiction

SCORE PRECISION: Published scores include confidence intervals reflecting measurement uncertainty. Score differences smaller than the Minimum Detectable Difference disclosed in the methodology summary may not be statistically significant. Weight sensitivity ranges reflect uncertainty in dimension weighting pending empirical calibration.

The Foundation publishes this assessment in the public interest. It retains sole editorial discretion over assessment methodology, scores, and publications subject to its published Appeals and Dispute Resolution process.

B. L1 Provisional Score Disclaimer

L1 Provisional: Mandatory Label on All Published Scores at This Level

⚠ L1 PROVISIONAL ASSESSMENT: INTERNAL USE ONLY

This assessment was conducted using the IAF Pilot Benchmark (100 items, 4–10 items per dimension). Dimensional confidence intervals range from ±23 to ±37 points. This level of measurement uncertainty means:
• Score differences between systems smaller than ~30 points may not be statistically significant
• This assessment may not be used to support deployment authorization decisions
• This assessment may not be cited as certification evidence or used in regulatory submissions
• This assessment is suitable for internal development guidance and directional benchmarking only

For external publication or deployment guidance, an L2 Standard Benchmark assessment (300+ items, 95% CI width ≤ ±12 points at composite level) is required.

C. Floor Failure Designation Disclaimer

FLOOR FAILURE DESIGNATION: Required Language

⚠ INVALID ASSESSMENT, FLOOR FAILURE: [DIMENSION NAME] = [SCORE]

Under IAF v[x.x], [System Name] version [version] scored [score] on [Dimension Name], which is below the IAF floor threshold of 40 for this dimension. Pursuant to IAF methodology Section I, this result renders the composite score INVALID for this assessment.

An INVALID designation means: the Foundation cannot publish a composite score for this system under this assessment because [Dimension Name] performance at this level presents a governance concern that a composite score would obscure.

This INVALID designation applies to the assessed system version ([version hash]) at the time of assessment ([date]). This designation does not characterize the provider or other system versions.

The assessed provider has been notified of this result and has the opportunity to submit a response and to initiate the Foundation's formal dispute process. Provider response, if any, will be published simultaneously with this designation.

For methodology details: [link to IAF]. For dispute procedures: [link to Appeals Process].

D. Marginal Floor Compliance Disclaimer

Marginal Floor Compliance: Required when floor dimension scores 40–60

⚠ MARGINAL FLOOR COMPLIANCE: [DIMENSION NAME] = [SCORE]

[System Name] version [version] scored [score] on [Dimension Name], which is within the IAF's Marginal Floor Compliance range of 40–60. The composite score is valid at this level, but users should note that [Dimension Name] performance is near the IAF floor threshold.

A score in this range indicates: the system meets the minimum threshold for composite validity but demonstrates limited performance on [Dimension Name]. Users deploying this system in contexts where [Dimension Name] is critical should consider this score carefully.
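The floor threshold of 40 and the marginal range of 40–60 stated in disclaimers C and D determine which label a floor dimension score receives. The sketch below shows that selection. It assumes the upper bound of 60 is inclusive, which the text does not state, and the function name and return labels are hypothetical.

```python
FLOOR_THRESHOLD = 40  # "below the IAF floor threshold of 40" (disclaimer C)
MARGINAL_UPPER = 60   # top of the 40–60 range (disclaimer D); inclusive by assumption


def floor_status(score: float) -> str:
    """Classify a floor-dimension score into the disclaimer it requires."""
    if score < FLOOR_THRESHOLD:
        return "INVALID_FLOOR_FAILURE"      # composite score may not be published
    if score <= MARGINAL_UPPER:
        return "MARGINAL_FLOOR_COMPLIANCE"  # composite valid, warning label required
    return "NO_FLOOR_LABEL"
```

Whether a score of exactly 60 is marginal should be settled in the IAF methodology rather than left to the implementation.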

E. Score Comparison Disclaimer

Score Comparison: Required whenever comparing scores across systems

Score Comparison Notice

The following comparison involves scores that may be within each other's confidence intervals. Score differences smaller than the Minimum Detectable Difference (MDD) at 95% confidence may not reflect genuine performance differences between systems.

System A: [Score] (95% CI: [low]–[high])
System B: [Score] (95% CI: [low]–[high])
Minimum Detectable Difference at 95% confidence: ±[x] points
Statistical interpretation: [Statistically significant difference / Within measurement uncertainty]

Confidence intervals reflect assessment sample size and methodological quality. Systems assessed at different confidence levels (L1–L5) may not be meaningfully comparable.
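The "Statistical interpretation" field in the notice above can be filled mechanically once the MDD is known. The sketch below assumes the MDD is supplied by the methodology, since this report does not say how it is computed. It treats a difference exactly equal to the MDD as significant, which is also an assumption, and both function names are hypothetical.

```python
def intervals_overlap(ci_a: tuple[float, float], ci_b: tuple[float, float]) -> bool:
    """True when two (low, high) confidence intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def comparison_label(score_a: float, score_b: float, mdd_95: float) -> str:
    """Pick the interpretation text using the MDD at 95% confidence."""
    if abs(score_a - score_b) >= mdd_95:
        return "Statistically significant difference"
    return "Within measurement uncertainty"
```

For example, scores of 72 and 65 with an MDD of 10 points are labeled "Within measurement uncertainty".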

F. Provider Response Disclaimer

Provider Response: Required identification of all provider statements

Provider Statement

The following statement was submitted by [Provider Name] during the Foundation's 14-day pre-publication provider review period. It is published unedited and in full as part of the Foundation's provider response policy. Provider statements are attributed to the provider and do not represent Foundation views, concessions, or modifications to the assessment methodology or scores. The Foundation retains sole editorial responsibility for its assessment findings.

[PROVIDER STATEMENT TEXT]

G. AI Assessment Index Homepage Disclaimer

AI Assessment Index: Homepage / Landing Page

The EM Foundation AI Assessment Index publishes methodology-based evaluations of AI systems under the Foundation's Intelligence Assessment Framework (IAF). Scores reflect observed system performance under the stated methodology, at the stated confidence level, on the assessed system version, at the time of assessment. Scores are not guarantees, endorsements, certifications of safety, legal compliance determinations, professional opinions, or statements of objective fact about any system or provider.

The Foundation is an independent nonprofit. Its assessments are not sponsored, negotiated, or influenced by AI system developers or their commercial affiliates. Assessment scores may not be purchased, sponsored, or altered in exchange for any form of consideration.

All scores include confidence intervals. Score differences smaller than stated confidence intervals may not reflect genuine performance differences. See the IAF Methodology for full disclosure of measurement uncertainty, weight sensitivity, and confidence level requirements.

For methodology: [link]. For disputes: [link]. For provider responses: [link to each score].

XI. Score Publication Rules
Legally defensible publication practices — integrated with IAF confidence levels

# | Rule | Legal Basis

PUB-001 | Every score must identify: IAF version; assessed system name and version; cryptographic hash of assessed system version; assessment date; sample size per dimension; confidence level (L1–L5, stating both S-level and Q-level); assessor team qualifications and COI disclosures; 95% CI per dimension; composite CI via error propagation; weight sensitivity range; MDD at 80% and 95% power. | Truth defense requires assessments to be accurately characterized; missing fields undermine the "good faith" evidence base. Particularity of system version defeats "the system has since improved" arguments.

PUB-002 | No score may be published without methodology link. Every published score must link directly to the IAF version under which the assessment was conducted. If the methodology has been updated since the assessment, the score must link to the archived version, not the current version. | The First Amendment opinion privilege and truth defense both depend on the score being a documented, reproducible methodology application. A score without a methodology link looks like an assertion of fact without support.

PUB-003 | Disputed scores remain published during dispute. A score under active dispute is labeled "Under Review: Formal Dispute Filed" and remains visible. Scores are not suppressed during dispute review. The dispute label and a link to the dispute status page are required. | Suppressing scores during dispute creates the impression that disputes succeed in removing unfavorable assessments, which incentivizes nuisance disputes and undermines the independence that provides First Amendment protection.

PUB-004 | Provider responses are published simultaneously. If a provider submits a response during the 14-day notice period, it is published at exactly the same time as the score, on the same page, without editorial comment from the Foundation. | Simultaneous provider response publication is evidence of procedural fairness. It defeats "knowledge of falsity" claims and reduces damages by giving providers the ability to contextualize scores immediately.

PUB-005 | Score corrections create a permanent correction record. When an error is identified and corrected, the original score is not deleted. It is labeled "Corrected" with the date of correction, the nature of the error, and the corrected score. Both versions remain permanently in the public record. | The cryptographic log commits the original assessment before publication. Post-publication "silent edits" are technically detectable and would be evidence of bad faith. A permanent correction record is both legally protective and consistent with the Foundation's transparency commitments.

PUB-006 | L1 Provisional scores are not published externally. Under no circumstances may L1 Provisional scores be published for external use, cited as evidence of system performance in regulatory submissions, used in third-party marketing, or described as Foundation "assessments" or "certifications" in any public-facing context. They may be shared internally within the Foundation and with the assessed provider under a non-publication agreement. | Publishing L1 Provisional scores without adequate disclosure dramatically increases defamation and certification liability exposure. The ±36.8 point CI on the Wisdom dimension alone means that any published comparison based on Pilot Benchmark scores is potentially a false statement of material fact about relative system performance.

PUB-007 | Scores may not be used in advertising without Foundation approval of the specific language. Any AI provider who uses a Foundation score in marketing ("IAF-Assessed," "Foundation-Reviewed," etc.) must obtain prior approval of the specific language from the Foundation. Permitted language: "Assessed under IAF v[x] ([date]): [link to Foundation score page]." Prohibited language: "Certified," "Approved," "Endorsed," "Verified Safe," "Guaranteed." | Providers who misuse Foundation scores in advertising create false advertising exposure for themselves and potentially implicate the Foundation in claims it did not make. Pre-approval of advertising language is the cleanest prevention.

PUB-008 | Every publication of a floor failure designation requires pre-publication legal review. A Foundation attorney or outside counsel must review the floor failure designation, the assessment record, the methodology application, and the provider notification documentation before the designation is published. This review must be documented in the assessment record. | Floor failure designations are the Foundation's highest-exposure publications. Legal review creates a contemporaneous record that the Foundation acted in good faith, which is critical for both the truth defense and the "no actual malice" defense to defamation and trade libel claims.
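PUB-001 is effectively a completeness rule, so a publication pipeline could refuse to publish any score record that is missing one of its fields. The sketch below assumes score records are dictionaries. Every key name is a hypothetical label for the corresponding PUB-001 item, not an actual Foundation schema.

```python
# Hypothetical field names mirroring the PUB-001 list; not an actual Foundation schema.
REQUIRED_FIELDS = (
    "iaf_version",
    "system_name",
    "system_version",
    "system_version_hash",
    "assessment_date",
    "sample_size_per_dimension",
    "s_level",
    "q_level",
    "assessor_qualifications",
    "coi_disclosures",
    "ci95_per_dimension",
    "composite_ci",
    "weight_sensitivity_range",
    "mdd_power_80",
    "mdd_power_95",
)


def missing_pub001_fields(record: dict) -> list[str]:
    """Return the required PUB-001 fields that are absent or empty in a record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or value == "" or value == [] or value == {}:
            missing.append(name)
    return missing
```

An empty result means only that every field is present; it does not show that the values are accurate, which is what the truth defense ultimately depends on.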

XII. Appeals and Dispute Procedures
The legal significance of fair process — and specific required language
The Foundation's three-stage dispute process (Administrative Review → Assessment Panel → External Arbitration) is already specified in the Assessment Charter (Article XI). This section addresses the legal significance of each stage and the specific procedural language required for legal defensibility.

Why Procedural Fairness Is a Legal Requirement, Not Just Good Governance

An AI developer who successfully argues that the Foundation published a score without giving adequate opportunity to respond, or without a meaningful review process for disputed findings, strengthens both its defamation claim (showing the Foundation was reckless) and its tortious interference claim (showing the Foundation's interference was improper). The assessment dispute process is the Foundation's evidence base for good faith. Every failure of that process is a fact in a potential plaintiff's complaint.

Stage 1 — Administrative Review: Required Language and Timeline

Stage 1 Notice: Required Communication to Filer within 14 Days

EM Foundation Dispute Response: Stage 1 Administrative Review

Assessment: [System Name] v[version], IAF Score [score], Published [date]
Dispute Filed: [date] by [filer name/organization]
Stage 1 Disposition: [ACCEPTED FOR STAGE 2 REVIEW / DISMISSED (see below)]

If accepted: Your dispute has been accepted for Stage 2 Assessment Panel Review on the following grounds: [stated grounds]. A three-person panel will be constituted within 14 days. The assessment score and its "Under Review: Dispute Filed" label remain published during this review. You will receive the panel's written determination within 30 days.

If dismissed: Your dispute has been dismissed for the following reason(s): [stated reason]. [If procedurally deficient: you have 14 days to cure the deficiency and refile.] [If no valid ground: the specific ground(s) claimed are not among the permissible dispute grounds under IAF §XI.1.2. Disagreement with the IAF methodology or the commercial impact of a published score are not valid dispute grounds.]

This Stage 1 determination is published in the Foundation's Transparency Log within 5 days.
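Because a missed deadline becomes evidence against the Foundation, the time limits in the Stage 1 notice are worth computing rather than tracking by hand. The sketch below makes several assumptions the notice leaves open: all periods are calendar days, the 14-day notice period runs from the filing date, and the panel, determination, cure, and Transparency Log periods all run from the Stage 1 disposition date. The function and key names are hypothetical.

```python
from datetime import date, timedelta


def stage1_deadlines(filed: date, disposition: date) -> dict[str, date]:
    """Compute Stage 1 deadlines in calendar days (start dates are assumptions)."""
    return {
        "stage1_notice_due": filed + timedelta(days=14),
        "panel_constituted_by": disposition + timedelta(days=14),
        "panel_determination_by": disposition + timedelta(days=30),
        "cure_and_refile_by": disposition + timedelta(days=14),
        "transparency_log_by": disposition + timedelta(days=5),
    }
```

Counsel should confirm the starting point of each period before these dates are relied on in any communication to a filer.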

Stage 3 — External Arbitration: Required Provisions

The Assessment Charter specifies binding external arbitration as the final stage of appeal. Two provisions must be settled before the arbitration program launches:

1. The arbitration agreement must waive class arbitration. The arbitration clause should specify individual arbitration only — class arbitration over assessment methodology could be used to challenge the entire methodology rather than a specific score.

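2. The arbitration agreement must specify governing law. The agreement should state that Texas law governs all arbitration proceedings, consistent with the Foundation's Texas incorporation. A Texas choice of law supports application of the Texas Citizens Participation Act (TCPA), the state's anti-SLAPP statute.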

The "No Settlement on Terms That Suppress Scores" Rule

Assessment Charter §6.7.2 prohibits the Foundation from settling litigation on terms that require score retraction, methodology modification, non-disclosure, or activity restrictions. This prohibition must be clearly communicated to any litigation counsel retained by the Foundation before settlement discussions begin. It is not a negotiating posture — it is an organizational commitment that protects the integrity of the assessment enterprise. A settlement that retracts a sound score in response to litigation pressure destroys the Foundation's credibility as an independent assessor and removes the structural basis for First Amendment protection on future publications.

XIII. Pre-Publication Legal Review Requirements
Mandatory legal review triggers — by publication type

Publication Type	Legal Review Required?	What the Review Must Confirm
Standard composite score (L2–L5, no floor failures)	No mandatory review; spot-check program recommended (10% of publications)	Spot-check confirms: system version identified with hash; CI ranges present; L-level correct; provider notice completed; methodology version correct
Any score below 40 on any dimension (even non-floor dimensions)	Yes: single attorney review	Score accurately reflects methodology; methodology correctly applied; provider notice period completed; no pending factual disputes from provider
Floor failure designation (INVALID)	Yes: two-attorney review (one assessment-familiar, one external)	All items above, plus: system version committed to cryptographic log before provider notification; provider notification documentation on file; provider response or non-response documented; disclosure language complete and correct; no pending factual disputes that, if true, would change the score
Score correction or retraction	Yes: Executive Director sign-off plus one attorney	Nature of error documented; original score not deleted; correction record complete; whether the error affects any other published scores (systematic error requiring broader review)
First publication of any system by a provider who has objected to prior assessments	Yes: single attorney review	No evidence that the prior objection influenced the current assessment process; assessors were not the same as those involved in prior disputed assessments; conflict-of-interest declarations on file
Any assessment used in regulatory or policy submissions by third parties	Yes: upon request or knowledge of such use	Jurisdiction-specific legal analysis of whether Foundation scores can be cited in the relevant regulatory context; EU AI Act notified body disclaimer confirmed present
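
The review triggers above can be read as a routing rule applied to each pending publication. The following Python sketch is illustrative only and is not a Foundation system: the `Publication` fields, the `Review` names, and the choice to let triggers stack (so that a floor failure with a sub-40 dimension requires both reviews) are assumptions introduced here, not requirements stated in the Assessment Charter.

```python
from dataclasses import dataclass
from enum import Enum


class Review(Enum):
    SPOT_CHECK = "no mandatory review; eligible for the 10% spot-check program"
    SINGLE_ATTORNEY = "single attorney review"
    TWO_ATTORNEY = "two-attorney review (one assessment-familiar, one external)"
    ED_PLUS_ATTORNEY = "Executive Director sign-off plus one attorney"
    ON_REQUEST = "review upon request or knowledge of third-party regulatory use"


@dataclass
class Publication:
    # Hypothetical field names; each mirrors one row of the trigger table.
    dimension_scores: dict[str, float]
    floor_failure: bool = False
    is_correction_or_retraction: bool = False
    provider_objected_before: bool = False
    used_in_regulatory_submission: bool = False


def required_reviews(pub: Publication) -> set[Review]:
    """Return every review the table calls for (assumption: triggers stack)."""
    reviews: set[Review] = set()
    if pub.floor_failure:
        reviews.add(Review.TWO_ATTORNEY)
    # "Any score below 40 on any dimension (even non-floor dimensions)"
    if any(score < 40 for score in pub.dimension_scores.values()):
        reviews.add(Review.SINGLE_ATTORNEY)
    if pub.is_correction_or_retraction:
        reviews.add(Review.ED_PLUS_ATTORNEY)
    if pub.provider_objected_before:
        reviews.add(Review.SINGLE_ATTORNEY)
    if pub.used_in_regulatory_submission:
        reviews.add(Review.ON_REQUEST)
    # With no trigger, the publication falls into the spot-check pool.
    return reviews or {Review.SPOT_CHECK}
```

A routing function of this kind only identifies which review is owed. What each review must confirm remains the checklist in the right-hand column of the table, and counsel, not software, makes that confirmation.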

XIV. Implementation Checklist
Ordered by priority — what must happen before first external publication

Priority	Item	Legal Risk Addressed	Status
1	Engage Texas-licensed counsel to review and approve all disclaimer language (§X) before any external publication	All defamation, UPL, and certification risks	Required before launch
2	Confirm TCPA anti-SLAPP applicability with Texas counsel and prepare standard TCPA motion template	SLAPP litigation	Required before launch
3	Implement Universal Assessment Disclaimer (§X.A) on all published score pages: non-dismissible, minimum 12px, above the fold	Defamation, certification liability	Required before launch
4	Implement L1 Provisional external publication prohibition (PUB-006), with technical controls preventing external publication of Pilot Benchmark scores	Certification liability, defamation	Required before launch
5	Engage EU data protection counsel; implement EU-U.S. data transfer mechanism before accepting EU user data	GDPR compliance	Required before EU user access
6	Add EU AI Act notified body disclaimer (§X, EU notice) to all published materials and methodology documentation	EU AI Act regulatory conflict	Required before EU market presence
7	Establish pre-publication legal review protocol for floor failure designations (§XIII); retain litigation counsel familiar with Texas media law	Defamation, trade libel	Required before any floor failure publication
8	Establish provider advertising approval program (PUB-007); draft provider terms of use prohibiting unauthorized certification language	False advertising, certification liability	Required before any provider assessments
9	Draft Corroboration Standard reviewer agreements incorporating the professional services disclaimers (§VII) and limiting language	UPL, UPM, professional services liability	Required before Corroboration Standard deployment
10	Establish legal defense fund to sustain two simultaneous SLAPP suits (Assessment Charter §6.7.1)	SLAPP litigation deterrence	Required before publishing any floor failure designations
11	Implement China non-participation policy; do not accept assessments of Chinese-market AI systems pending specialized PRC counsel	PRC regulatory conflict	Required before any international assessments
12	Obtain jurisdiction-specific UPL opinions from Texas, California, and New York counsel before Corroboration Standard deployment in those jurisdictions	UPL exposure	Required before ARIA Network public launch

Final Note on Legal Posture

The Foundation's strongest legal position is also its most accurate self-description: an independent publisher of methodology-based assessments operating in the public interest. Every structural condition required by law — published methodology, CI ranges, L-level labels, provider notice and response, cryptographic logs, no financial influence on scores — is also the correct governance design. Legal defensibility and institutional integrity point in exactly the same direction.

The assessment that will be litigated is not the methodologically sound L3 or L4 assessment that was properly conducted, fully disclosed, and came with a provider response. It will be the L1 Provisional assessment published without adequate CI disclosure, the floor failure published without pre-publication legal review, or the score that was published before the provider notice period was complete. This report's implementation checklist addresses those specific scenarios. Avoiding them eliminates most litigation risk.

L Level	Certification Liability Exposure	Required Protective Language
L1 Provisional	Low if labeled correctly; high if misused without Foundation action	"This L1 Provisional assessment may not be used to support deployment authorization, safety claims, or regulatory submissions. It is suitable for internal development use only."
L2 Indicative	Moderate — preliminary external use creates reliance risk	"This L2 Indicative assessment is a preliminary evaluation. Confidence intervals of ±[x] points indicate meaningful measurement uncertainty. Not suitable for high-stakes deployment authorization without additional validation."
L3 Standard	Moderate — standard use with adequate disclosure	"This assessment reflects performance under IAF v[x] at the time of assessment. Performance may change as systems are updated. Not a guarantee of safety, fitness for purpose, or compliance with applicable law."
L4 High Confidence	Low with proper scope limitation	Standard disclaimer plus explicit scope statement: "This assessment evaluates [specific dimensions]. It does not evaluate [dimensions not tested]."
L5 Validated	Lowest with correct scope limitation and independent replication disclosure	Full methodology citation, replication study citation, scope limitation, standard disclaimer.

Jurisdiction	Relevant Framework	Risk Level	Required Action
United Kingdom	UK GDPR (retained EU law); Online Safety Act 2023; Equality Act 2010; UK AI Governance White Paper (principles-based, not yet regulation)	Moderate	UK GDPR compliance (equivalent to EU GDPR) required for UK user data. Online Safety Act category thresholds unlikely to be met at launch. UK AI White Paper creates no immediate regulatory compliance obligations. UK-specific disclaimer for professional services content.
Canada	PIPEDA (privacy); Bill C-27 / AIDA (AI regulation — in development); Consumer Protection Acts (provincial)	Low–Moderate	PIPEDA compliance for Canadian user data. AIDA has not been enacted; monitor for development. Provincial consumer protection acts apply to any claims made about the Foundation's services.
Australia	Privacy Act 1988 (as amended); Australian Consumer Law (misleading conduct); AI Ethics Framework (voluntary, not regulation)	Low	Australian Privacy Act compliance for Australian user data. ACL misleading conduct provisions apply to any false or misleading representations about the Foundation's assessments. The AI Ethics Framework creates no compliance obligations but the Foundation's methodology should be benchmarked against it for Australian institutional relationships.
Singapore	PDPA (privacy); Model AI Governance Framework (voluntary)	Low	PDPA compliance for Singapore user data. The Model AI Governance Framework is voluntary guidance; the Foundation's methodology should be documented against it for Southeast Asian institutional relationships.
China	Algorithmic Recommendation Regulations; Generative AI Regulations; PIPL (privacy)	High if operating in China	The Foundation should not provide assessment services to Chinese-market AI systems or Chinese users without specialized PRC regulatory counsel. The Chinese AI regulatory regime is fundamentally different from the US/EU framework and is incompatible with the Foundation's published methodology on several dimensions (including Civic Responsibility scoring, which assesses political balance in ways that directly conflict with Chinese regulatory requirements for AI systems).

#	Rule	Legal Basis
PUB-001	Every score must identify: IAF version; assessed system name and version; cryptographic hash of assessed system version; assessment date; sample size per dimension; confidence level (L1–L5, stating both S-level and Q-level); assessor team qualifications and COI disclosures; 95% CI per dimension; composite CI via error propagation; weight sensitivity range; MDD at 80% and 95% power.	Truth defense requires assessments to be accurately characterized; missing fields undermine the "good faith" evidence base. Particularity of system version defeats "the system has since improved" arguments.
PUB-002	No score may be published without methodology link. Every published score must link directly to the IAF version under which the assessment was conducted. If the methodology has been updated since the assessment, the score must link to the archived version, not the current version.	The First Amendment opinion privilege and truth defense both depend on the score being a documented, reproducible methodology application. A score without a methodology link looks like an assertion of fact without support.
PUB-003	Disputed scores remain published during dispute. A score under active dispute is labeled "Under Review — Formal Dispute Filed" and remains visible. Scores are not suppressed during dispute review. The dispute label and a link to the dispute status page are required.	Suppressing scores during dispute creates the impression that disputes succeed in removing unfavorable assessments, which incentivizes nuisance disputes and undermines the independence that provides First Amendment protection.
PUB-004	Provider responses are published simultaneously. If a provider submits a response during the 14-day notice period, it is published at exactly the same time as the score, on the same page, without editorial comment from the Foundation.	Simultaneous provider response publication is evidence of procedural fairness. It defeats "knowledge of falsity" claims and reduces damages by giving providers the ability to contextualize scores immediately.
PUB-005	Score corrections create a permanent correction record. When an error is identified and corrected, the original score is not deleted. It is labeled "Corrected" with the date of correction, the nature of the error, and the corrected score. Both versions remain permanently in the public record.	The cryptographic log commits the original assessment before publication. Post-publication "silent edits" are technically detectable and would be evidence of bad faith. A permanent correction record is both legally protective and consistent with the Foundation's transparency commitments.
PUB-006	L1 Provisional scores are not published externally. Under no circumstances may L1 Provisional scores be published for external use, cited as evidence of system performance in regulatory submissions, used in third-party marketing, or described as Foundation "assessments" or "certifications" in any public-facing context. They may be shared internally within the Foundation and with the assessed provider under a non-publication agreement.	Publishing L1 Provisional scores without adequate disclosure dramatically increases defamation and certification liability exposure. The ±36.8 point CI on the Wisdom dimension alone means that any published comparison based on Pilot Benchmark scores is potentially a false statement of material fact about relative system performance.
PUB-007	Scores may not be used in advertising without Foundation approval of the specific language. Any AI provider who uses a Foundation score in marketing ("IAF-Assessed," "Foundation-Reviewed," etc.) must obtain prior approval of the specific language from the Foundation. Permitted language: "Assessed under IAF v[x] ([date]) — [link to Foundation score page]." Prohibited language: "Certified," "Approved," "Endorsed," "Verified Safe," "Guaranteed."	Providers who misuse Foundation scores in advertising create false advertising exposure for themselves and potentially implicate the Foundation in claims it did not make. Pre-approval of advertising language is the cleanest prevention.
PUB-008	Every publication of a floor failure designation requires pre-publication legal review. A Foundation attorney or outside counsel must review the floor failure designation, the assessment record, the methodology application, and the provider notification documentation before the designation is published. This review must be documented in the assessment record.	Floor failure designations are the Foundation's highest-exposure publications. Legal review creates a contemporaneous record that the Foundation acted in good faith, which is critical for both the truth defense and the "no actual malice" defense to defamation and trade libel claims.
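
As an illustration of how PUB-007 might be supported operationally, the sketch below screens proposed provider marketing copy for the five prohibited terms the rule lists. It is a first-pass filter under stated assumptions, not the Foundation's approval process: the function name `prohibited_terms_in` is hypothetical, and whole-word, case-insensitive matching is a design choice made here. Approval of the specific language still rests with the Foundation.

```python
import re

# Terms PUB-007 lists as prohibited in provider marketing.
PROHIBITED_TERMS = ("Certified", "Approved", "Endorsed", "Verified Safe", "Guaranteed")


def prohibited_terms_in(copy: str) -> list[str]:
    """Return the PUB-007 prohibited terms that appear in the marketing copy."""
    found = []
    for term in PROHIBITED_TERMS:
        # Whole-word, case-insensitive match; whitespace inside a
        # multi-word term ("Verified Safe") may vary in the copy.
        pattern = r"\b" + r"\s+".join(map(re.escape, term.split())) + r"\b"
        if re.search(pattern, copy, flags=re.IGNORECASE):
            found.append(term)
    return found
```

A clean result does not make copy permissible. PUB-007 requires prior approval of the specific language, so a screen like this can only flag submissions that must be rejected or revised before human review.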


