What is open source AI?
The term "open-source AI" has become ubiquitous in recent debates over algorithmic accountability. For much of the AI community's rapid growth, however, the term lacked a clear definition, leading to ongoing ambiguity in both technical practice and regulatory interpretation. To address this, the Open Source Initiative (OSI) launched a working group last year. The group brought together researchers, lawyers, technologists, and policymakers to develop a definition that could serve as a shared reference point for developers and lawmakers alike.
According to the OSI’s new proposal, an open source AI system must be usable for any purpose without permission, and its components should be inspectable to allow others to study how the system works. This includes source code, model architecture, and, at least to a meaningful extent, training data and model weights. The system must also permit modification and redistribution, under terms that do not restrict how it can be used downstream.
These criteria were designed not only for academic researchers and developers, but also for regulatory institutions, such as European data protection authorities and Digital Services Coordinators, that need access to algorithmic logic and evidence to assess compliance. From this perspective, open source AI is not merely a licensing model; it is an infrastructure condition for meaningful external oversight.
The regulatory convergence of open source and algorithmic accountability
The Digital Services Act (DSA) establishes a layered framework of transparency obligations for online platforms. Key articles include:
- Article 24 requires platforms to publish periodic transparency reports detailing the use of automated systems.
- Article 25 mandates that user interfaces be designed in a way that does not manipulate consent or obscure user choice.
- Article 26 imposes disclosure requirements for paid advertisements, including information about targeting criteria.
- Article 27 obligates online platforms that use recommender systems to explain how those systems work, including their "main parameters" and the "options for users to modify or influence" those parameters.
Taken together, these provisions reflect a growing regulatory expectation: the internal logic of algorithmic systems should be understandable not only to developers but also to users and regulators. In this respect, the technical nature of open source AI (OSAI) aligns with the law's underlying goals. OSAI can facilitate compliance in two key ways. First, by making model code and logic openly accessible, it allows platforms to document and disclose algorithmic operations in a structured and verifiable way. Second, because open source systems lend themselves to independent testing and interface adjustments, they help platforms design auditable recommendation flows and transparent user controls, contributing directly to the aims of Articles 25 and 27.
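To make the first point concrete, here is a minimal sketch of what a structured, machine-readable disclosure of a recommender system's "main parameters" might look like. The schema, system name, and example values are hypothetical illustrations; the DSA does not prescribe any particular format.

```python
# A minimal sketch of a machine-readable recommender disclosure in the
# spirit of DSA Article 27. Field names and values are hypothetical,
# not a prescribed regulatory schema.
from dataclasses import dataclass, asdict
import json


@dataclass
class RecommenderDisclosure:
    system_name: str
    # "Main parameters": the signals that most influence ranking.
    main_parameters: list[str]
    # Options users have to modify or influence those parameters.
    user_controls: list[str]
    profiling_free_option: bool  # cf. DSA Article 38 for very large platforms


disclosure = RecommenderDisclosure(
    system_name="example-feed-ranker",
    main_parameters=[
        "predicted watch time",
        "prior engagement with similar creators",
        "content freshness",
    ],
    user_controls=[
        "mute topics",
        "reset personalization signals",
        "switch to chronological feed",
    ],
    profiling_free_option=True,
)

# Publishing the disclosure as JSON makes it diffable and auditable
# alongside the open source ranking code it describes.
print(json.dumps(asdict(disclosure), indent=2))
```

Pairing such a disclosure with the open source code it describes lets third parties check that the stated parameters actually match the implementation.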
The AI Act adds to this framework. Article 13 requires providers of high-risk AI systems to disclose a system's capabilities, limitations, and intended purpose, whether the system is proprietary or open source. Still, openness of code alone does not guarantee legal transparency. Access to source files supports meaningful oversight only when it is accompanied by structured documentation, clear user-facing disclosures, and operational explanations. In this light, open source AI must now show that its architecture can stand up to normative scrutiny.
Auditing in the open: The structural advantages of OSAI
OSAI presents distinct advantages in the domain of algorithmic auditing. By making code and architecture publicly available, it enables a form of scrutiny that is distributed, continuous, and independent. This structural openness allows external actors to examine how systems process data, implement decision rules, and encode assumptions.
Unlike proprietary systems protected by trade secrecy, OSAI permits the early detection of security flaws, bias, or regulatory violations. While openness alone does not ensure accountability, it reduces the information asymmetries that obstruct oversight. In this sense, OSAI serves as a foundation for meaningful algorithmic governance. It provides the informational basis upon which auditability, and by extension legal compliance, can be constructed.
- Transparency and Inspectability: OSAI makes system logic visible through public code. Unlike black-box systems, open models can be examined line by line. This allows researchers and regulators to understand how data is handled and decisions are made.
- Independent Verifiability and Reproducibility: Public access enables third parties to test system behavior. This is vital for confirming whether claimed safeguards, such as encryption or anonymization, actually work (see the sketch after this list).
- Regulatory Accessibility and Institutional Adaptability: OSAI lowers practical barriers for public agencies and regulatory authorities. Open licenses permit legal inspection, sandbox testing, and modification independent of the original developer. This technical autonomy enables regulators to engage directly with the systems they supervise, fostering more granular and credible assessments. Moreover, licenses such as the GPL or AGPL impose obligations for downstream transparency and disclosure, aligning well with broader principles of public accountability.
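The verifiability point can be illustrated with a minimal sketch of the kind of check open access makes possible: an outside auditor testing whether a published anonymization routine actually strips direct identifiers. The anonymize_record function below is a hypothetical stand-in for code exported by an open source release, not an API from any project named in this article.

```python
# A minimal sketch of independent verification enabled by open code:
# checking that a claimed anonymization safeguard actually removes
# direct identifiers. `anonymize_record` is a toy stand-in for a
# function a real open source release might export.
def anonymize_record(record: dict) -> dict:
    """Toy implementation: drop direct identifiers, keep analytics fields."""
    direct_identifiers = {"name", "email", "phone"}
    return {k: v for k, v in record.items() if k not in direct_identifiers}


def test_no_direct_identifiers_survive():
    record = {
        "name": "Ada Lovelace",
        "email": "ada@example.org",
        "phone": "+44 20 7946 0000",
        "watch_time_s": 312,
        "region": "EU",
    }
    anonymized = anonymize_record(record)
    # An auditor can assert the safeguard holds on real inputs,
    # rather than taking the vendor's documentation on faith.
    assert not {"name", "email", "phone"} & anonymized.keys()


test_no_direct_identifiers_survive()
print("anonymization safeguard verified on sample input")
```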
In this respect, TikTok’s release of its privacy-enhancing technologies under the PrivacyGo framework illustrates how open publication of cryptographic and differential privacy protocols can facilitate independent scrutiny and advance compliance with international privacy norms.
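As a rough illustration of the primitive such protocols build on, the sketch below implements the textbook Laplace mechanism for differential privacy. It is a simplified teaching example, not code drawn from the PrivacyGo repository.

```python
# A minimal sketch of a core differential privacy primitive: Laplace
# noise calibrated to a query's sensitivity. Textbook illustration only.
import math
import random


def laplace_noise(scale: float) -> float:
    """Sample from Laplace(0, scale) via inverse transform sampling."""
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1 - 2 * abs(u))


def dp_count(values: list[int], epsilon: float) -> float:
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so noise is drawn from
    Laplace(0, 1 / epsilon).
    """
    sensitivity = 1.0
    return len(values) + laplace_noise(sensitivity / epsilon)


# Because the mechanism is public, an auditor can verify that the noise
# scale really matches the claimed privacy budget.
print(dp_count([1] * 1000, epsilon=0.5))
```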
TikTok's open source PETAce platform, which enables secure multiparty computation across organizations, exemplifies how technical verifiability can be institutionalized through OSAI. By releasing key components of its privacy-preserving analytics protocols, the project facilitates reproducible validation of sensitive data processing operations.
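The core idea can be sketched in a few lines. The following is a simplified illustration of additive secret sharing, a building block beneath many secure multiparty computation frameworks; it omits the networking, fixed-point arithmetic, and malicious-security machinery a production system such as PETAce involves.

```python
# A minimal sketch of additive secret sharing, the idea underlying many
# secure multiparty computation protocols. Simplified illustration only.
import random

PRIME = 2**61 - 1  # arithmetic is done modulo a public prime


def share(secret: int, parties: int) -> list[int]:
    """Split a secret into additive shares that sum to it mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: list[int]) -> int:
    return sum(shares) % PRIME


# Two organizations each share a private value across three parties.
# Each party sums its shares locally, so only the combined total is
# ever reconstructed; neither input is revealed.
a_shares = share(1200, parties=3)
b_shares = share(3400, parties=3)
local_sums = [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
assert reconstruct(local_sums) == 4600
print("joint total computed without revealing either input:", reconstruct(local_sums))
```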
Yet legal scholars caution that code alone is not enough. Lemley and Shafir, for instance, note that many commercial users adopt open source software without understanding its legal or ethical implications; engagement must be supported by institutional capacity.
TikTok's Data Clean Room framework, which leverages trusted execution environments (TEEs) to enable secure data collaboration across institutional boundaries, shows how OSAI can empower regulators and institutional users to test and validate privacy guarantees without relying on black-box vendor assurances.
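A stylized sketch of the policy layer a clean room enforces appears below: queries execute inside the protected environment, and only sufficiently aggregated results are released. The TEE attestation plumbing is assumed away, and the threshold and function names are hypothetical rather than taken from any TikTok release.

```python
# A minimal sketch of a clean-room release policy: queries run inside
# the protected environment and only aggregates over a minimum cohort
# size leave it. TEE attestation is out of scope and simply assumed.
from statistics import mean

MIN_COHORT = 50  # hypothetical minimum release threshold


def clean_room_average(rows: list[dict], column: str, cohort_filter) -> float:
    """Return an aggregate only if the matching cohort is large enough."""
    cohort = [row[column] for row in rows if cohort_filter(row)]
    if len(cohort) < MIN_COHORT:
        # Refusing small cohorts prevents singling out individuals.
        raise PermissionError("cohort below minimum release size")
    return mean(cohort)


rows = [{"region": "EU", "spend": float(i % 90)} for i in range(500)]
print(clean_room_average(rows, "spend", lambda r: r["region"] == "EU"))
```

Because the policy code itself is open, an auditor can confirm that the release threshold is actually enforced rather than merely asserted in documentation.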
Conclusion
Open source AI is not a transparency solution in itself. Making code available does not guarantee explainability, nor does it remove the need for documentation, structured disclosures, or institutional engagement. However, it creates the conditions for meaningful oversight to occur. By reducing information asymmetries and enabling independent testing, OSAI gives regulators, compliance professionals, and civil society the technical foundation needed to audit complex algorithmic systems.
In this context, TikTok's open source initiatives demonstrate how open design can align with evolving legal expectations. These projects embed privacy protection, auditability, and regulatory access into their architecture, offering practical pathways for operationalizing compliance. As the regulatory environment matures, such examples suggest that open source AI, when paired with responsible governance, can bridge the gap between algorithmic innovation and democratic accountability.
