What is Privacy Innovation?
Privacy Innovation, launched in 2023, is TikTok's open-source initiative for advancing data privacy through new technology. It researches new ways to safeguard the privacy and security of TikTok users and to protect the sensitive information of partner organizations.
At TikTok, we proactively seek state-of-the-art solutions to address privacy challenges for billions of users on a global scale. Privacy Innovation underscores this commitment by actively researching and developing Privacy Enhancing Technologies (PETs), providing more robust solutions for the industry.
Recap of 2023 highlights
As we step into the new year, let's reflect on some of the achievements from Privacy Innovation in 2023.
PrivacyGo Released
PrivacyGo is an open-source project that combines PETs such as differential privacy, multi-party computation, and homomorphic encryption with artificial intelligence methods to enhance user privacy protection. Its approaches are designed to harness the strengths of these PETs while mitigating their individual limitations. For more details, visit PrivacyGo on GitHub.
PETAce Released
PETAce (Privacy-Enhancing Technologies via Applied Cryptography Engineering) is a privacy-enhancing protocol framework based on state-of-the-art research results. It provides data processing methods such as secret sharing, homomorphic encryption, and oblivious transfer, and can perform collaborative computation and analysis of two-party data while preserving data privacy. For more details, visit PETAce on GitHub.
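To give a feel for the secret sharing mentioned above, here is a minimal sketch of two-party additive secret sharing in plain Python. It is an illustration only and does not use the PETAce API; the modulus and the function names `share`, `add_shares`, and `reconstruct` are assumptions chosen for this example.

```python
import secrets
from typing import Tuple

MODULUS = 2**64  # assumed ring size for this illustration


def share(value: int) -> Tuple[int, int]:
    """Split a value into two additive shares modulo MODULUS."""
    share_0 = secrets.randbelow(MODULUS)   # uniformly random mask
    share_1 = (value - share_0) % MODULUS  # the remainder completes the sum
    return share_0, share_1


def add_shares(left: int, right: int) -> int:
    """Add two shares locally; no communication between parties is needed."""
    return (left + right) % MODULUS


def reconstruct(share_0: int, share_1: int) -> int:
    """Recombine both shares to recover the underlying value."""
    return (share_0 + share_1) % MODULUS


# Party A contributes 120 and party B contributes 35.
a_0, a_1 = share(120)
b_0, b_1 = share(35)

# Each party adds the shares it holds without seeing the other's input.
sum_0 = add_shares(a_0, b_0)  # held by party 0
sum_1 = add_shares(a_1, b_1)  # held by party 1

total = reconstruct(sum_0, sum_1)
```

A single share is a uniformly random number, so neither party learns the other's input from the share it holds. Only the recombined result reveals the sum.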
32nd USENIX Security Symposium Workshop
The USENIX Security Symposium is an esteemed event that brings together professionals, researchers, and others interested in advances in the security and privacy of computer systems and networks. As a Gold Sponsor for the Symposium, TikTok introduced Privacy Innovation during a workshop session with over 50 attendees.
PERP Conference '23 Talk
The USENIX PERP (Privacy Engineering Practice and Respect) Conference focuses on designing and building products and systems with privacy and respect for their users and the societies in which they operate. Our PrivacyGo team gave a talk at PERP '23 titled: Protecting User Privacy in Private Set Intersection. Watch the recorded presentation.
Looking ahead to 2024
As we embark on our journey through 2024, we're excited to share our roadmap, packed with innovative projects and advancements aimed at elevating privacy standards and safeguarding user data.
The open-source community is the cornerstone of Privacy Innovation, shaping our software development process. Your feedback serves as our compass, guiding us toward the features you want and the improvements you need. We're grateful for your ongoing support and collaboration as we continue to innovate together.
PrivacyGo Evolution
In 2024, the engineering team is focusing on four key areas.
- Output Privacy Guarantee in MPC — Privacy is paramount in multi-party computation (MPC), yet traditional protocols can unintentionally leak sensitive information because the MPC output is revealed in plaintext. In response, PrivacyGo is introducing MPC protocols with output privacy guarantees, providing robust privacy protection while minimizing complexity. By also addressing representation errors, we're enhancing secure computation and data privacy.
- Privacy-Preserving LLMs — Our latest project focuses on Large Language Models (LLMs) and the privacy concerns surrounding the use of sensitive data, such as copyrighted material and personal user data, in AI training. Using techniques such as differential privacy, data obfuscation, and cryptography, we aim to protect user privacy during both LLM training and prediction, and to safeguard copyrighted input data. Our goal is to uphold user privacy and data integrity while using LLM technology in commercial applications.
- Hybrid Trust Computing — Balancing trust levels in Trusted Execution Environments (TEEs) poses challenges, especially in multi-party settings. To address this, PrivacyGo proposes a hybrid framework combining MPC and TEEs, facilitating secure data processing across diverse trust scenarios. This approach fosters practical applications like advertising sandbox environments, ensuring data security across varying trust levels.
- Privacy Validation — Our product team is dedicated to fortifying product defensibility and explainability through rigorous privacy assessments. By evaluating privacy risks and leveraging advanced validation methods, we're enhancing the security and privacy of internal products. Our goal is to provide actionable feedback for continuous improvement and prepare for successful product launches.
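As an illustration of the output privacy concern in the first item above, one widely used way to limit what a plaintext result reveals is to add calibrated noise before release, as in differential privacy. The sketch below applies the Laplace mechanism to a count. It is not PrivacyGo's protocol: the function names, the `epsilon` value, and the use of Python's non-cryptographic `random` module with floating-point noise are all assumptions made to keep the example short. A real deployment would generate the noise inside the secure computation with a vetted sampler.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def release_count(true_count: int, epsilon: float, rng: random.Random,
                  sensitivity: float = 1.0) -> float:
    """Return the count plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: an intersection size computed jointly by two parties.
rng = random.Random(2024)  # seeded only so the illustration is repeatable
noisy_count = release_count(1000, epsilon=1.0, rng=rng)
```

A smaller `epsilon` means more noise and stronger protection for any single record, at the cost of a less accurate released value.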
PETAce Advancements
In addition to PrivacyGo initiatives, we're excited to introduce advancements in our PETAce platform, furthering privacy and efficiency in data computation.
Below is our timeline of PETAce advancements in 2024:
- MPC-based SQL Query — Enhancing support for SQL operators and in-memory computation, enabling joint query tasks with substantial datasets.
- MPC-based SQL Query for Big Data — Scaling up MPC computation capabilities by leveraging storage engines for increased data volume handling.
- High-Precision Computation — Extending precision in floating-point number representation for enhanced accuracy.
- Online and Offline Design — Streamlining MPC phases for improved performance by separating offline and online computations.
- Fully Homomorphic Encryption (FHE) — Supporting FHE algorithms and enhancing efficiency in MPC protocols.
- Silent Oblivious Transfer — Optimizing OT message generation and multiplication triples for enhanced privacy.
- Private Information Retrieval (PIR) — Designing and implementing PIR protocols for secure data queries.
- Multi-ID Matching — Enabling multi-ID and multi-feature matching based on circuit-PSI for comprehensive data analysis.
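The offline/online split and the multiplication triples mentioned in the list above can be illustrated with the classic Beaver triple technique. This is a minimal sketch and not PETAce code. It assumes a trusted dealer for the offline phase, whereas real protocols typically produce triples with oblivious transfer or homomorphic encryption. The modulus and all function names are hypothetical.

```python
import secrets
from typing import Tuple

P = 2**61 - 1  # assumed modulus for this illustration
Shares = Tuple[int, int]  # (share held by party 0, share held by party 1)


def share(value: int) -> Shares:
    """Split a value into two additive shares modulo P."""
    s0 = secrets.randbelow(P)
    return s0, (value - s0) % P


def reconstruct(shares: Shares) -> int:
    """Recombine two additive shares."""
    return (shares[0] + shares[1]) % P


def offline_triple() -> Tuple[Shares, Shares, Shares]:
    """Offline phase: produce shares of random a, b and c = a * b.

    A trusted dealer is assumed here purely to keep the sketch short.
    """
    a = secrets.randbelow(P)
    b = secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def online_multiply(x: Shares, y: Shares,
                    triple: Tuple[Shares, Shares, Shares]) -> Shares:
    """Online phase: multiply shared x and y using one precomputed triple."""
    a, b, c = triple
    # The parties open d = x - a and e = y - b; the random masks a and b
    # hide x and y, so the opened values reveal nothing about the inputs.
    d = reconstruct(((x[0] - a[0]) % P, (x[1] - a[1]) % P))
    e = reconstruct(((y[0] - b[0]) % P, (y[1] - b[1]) % P))
    # x * y = c + d*b + e*a + d*e; only party 0 adds the public d*e term.
    z0 = (c[0] + d * b[0] + e * a[0] + d * e) % P
    z1 = (c[1] + d * b[1] + e * a[1]) % P
    return z0, z1


triple = offline_triple()  # can be prepared before any inputs are known
product = reconstruct(online_multiply(share(6), share(7), triple))
```

The expensive work of creating the triple happens before the inputs arrive. The online phase then needs only cheap local arithmetic and one round of opening masked values, which is why separating the two phases can improve performance.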
Shadowgraphy Introduction
Shadowgraphy is an open-source data pseudonymization SDK. Pseudonymization is the process of replacing identifiable data with pseudonyms for privacy compliance. Shadowgraphy provides secure APIs for seamless data pseudonymization, providing enterprises a consistent data protection service with industrial standards and best practices. By prioritizing ease of use and robustness, Shadowgraphy enables developers, even those without cryptography expertise, to integrate cryptographic pseudonymization techniques effectively, fostering a culture of privacy-aware development.
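To make the idea of pseudonymization concrete, here is a minimal sketch using a keyed hash (HMAC-SHA256) from the Python standard library. It does not use the Shadowgraphy API and is not a description of its design; the `pseudonymize` function, the normalization step, and the key handling are assumptions made for this illustration.

```python
import hashlib
import hmac
import secrets


def pseudonymize(identifier: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    # Normalize first so trivially different spellings map to one pseudonym.
    normalized = identifier.strip().lower()
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# The key must be kept secret; anyone holding it can re-derive pseudonyms.
key = secrets.token_bytes(32)

pseudonym_a = pseudonymize("Alice@example.com", key)
pseudonym_b = pseudonymize(" alice@example.com ", key)  # same person
```

With the same key, one identifier always produces the same pseudonym, so records can still be joined. Without the key, a pseudonym cannot feasibly be linked back to the identifier, and rotating the key breaks linkability across datasets.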
Below is our timeline of Shadowgraphy in 2024:
Trusted Execution Environments Introduction
Trusted Computing, also known as Confidential Computing, is integral to enhancing security and trust in TikTok's platform. By leveraging hardware-based Trusted Execution Environments (TEEs), TikTok strengthens its resilience against cyber threats and protects sensitive data and critical infrastructure. Currently, we are focused on the first product adoption of TEEs at TikTok: Confidential Data Clean Room, for secure data analytics and collaboration between multiple parties.
Confidential Data Clean Room provides a secure, TEE-protected environment where multiple parties can collaborate on sensitive data without compromising its confidentiality or integrity. Participants can also use remote attestation to verify the security and trustworthiness of the data clean room. This enables organizations to share data and insights with partners with confidence.
In addition to Confidential Data Clean Room, below is a timeline of Trusted Execution Environments in 2024:
Privacy Platform Introduction
The Privacy Platform is designed to facilitate secure cross-domain data computing without compromising data privacy. It leverages technologies such as secure multi-party computation, homomorphic encryption, and federated learning. It offers two primary capabilities: cross-domain joint modeling and cross-domain joint analysis. The former involves collaborative model training across multiple data providers and covers model types such as linear, tree, neural network, and clustering models. The latter enables joint statistical analysis and supports common SQL analysis functionality. To support these capabilities, the platform includes sub-modules for scheduling, algorithms, the computing engine, networking, and storage.
Below is a timeline of Privacy Platform in 2024:
Conclusion
As we navigate through 2024, Privacy Innovation remains steadfast in its commitment to advancing privacy and security in the digital landscape. With our community's support and feedback, we're poised to reach new heights in privacy-preserving technologies, giving individuals and businesses confidence in their data protection measures. Stay tuned for more updates as we embark on this exciting journey together!