At TikTok, we strive to advance privacy and security through innovative research and meaningful collaboration. Our team actively engages with the academic community to tackle critical challenges in privacy-preserving technologies, differential privacy, secure computing, and artificial intelligence. In this blog post, we highlight four impactful research publications from 2024 that exemplify our dedication to driving progress and shaping the future of these fields.
AnonPSI: An Anonymity Assessment Framework for Private Set Intersection (PSI)
At the NDSS 2024 Symposium, we introduced AnonPSI, a groundbreaking framework for evaluating anonymity in Private Set Intersection (PSI) protocols. PSI is a crucial cryptographic technique that allows two parties to securely compute the intersection of their data sets without revealing any additional information. However, while PSI ensures data confidentiality, its anonymity guarantees have historically been less explored.
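To make the setting concrete, here is a minimal, non-production sketch of a Diffie-Hellman-style PSI exchange. The parameters, hashing, and variable names are illustrative assumptions on our part rather than details from the paper; the point is simply that each party blinds hashed items with a private exponent, so only doubly-blinded values are ever compared.

    import hashlib
    import secrets

    P = 2**127 - 1  # a Mersenne prime; adequate for a demo, not for production use

    def hash_to_group(item):
        digest = hashlib.sha256(item.encode()).digest()
        return int.from_bytes(digest, "big") % (P - 1) + 1

    def blind(items, secret):
        return {pow(hash_to_group(x), secret, P) for x in items}

    # Each party holds a private exponent.
    a = secrets.randbelow(P - 3) + 2
    b = secrets.randbelow(P - 3) + 2

    alice_items = {"alice@example.com", "bob@example.com", "carol@example.com"}
    bob_items = {"bob@example.com", "dave@example.com"}

    # Alice sends her blinded items; Bob blinds them a second time.
    alice_double = {pow(v, b, P) for v in blind(alice_items, a)}
    # Bob sends his blinded items; Alice blinds them a second time.
    bob_double = {pow(v, a, P) for v in blind(bob_items, b)}

    # Doubly-blinded values match exactly on common items: h(x)^(ab) == h(x)^(ba).
    # This toy version only computes the overlap size; a full protocol also
    # tracks (or deliberately hides) which of Alice's items produced the matches.
    print(len(alice_double & bob_double))  # -> 1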
This research addresses this gap by offering a comprehensive framework to measure and evaluate the anonymity provided by PSI protocols. It introduces innovative metrics designed to assess the risk of sensitive information exposure, helping researchers and practitioners better understand and mitigate potential vulnerabilities. These insights not only enhance the theoretical understanding of PSI but also offer practical guidance for designing privacy-preserving systems that balance efficiency and security.
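As a rough illustration of the kind of risk such metrics capture, consider a PSI variant that reveals only the intersection cardinality. The toy probe below (our own strawman, not the paper's methodology) shows how an adversary who can issue adaptive queries can still infer whether a specific identifier is in the other party's set:

    SERVER_SET = {"u2", "u5", "u9", "u11"}   # hidden from the querying party

    def psi_cardinality(client_set):
        # Stand-in for a PSI-cardinality protocol: only the overlap size is revealed.
        return len(client_set & SERVER_SET)

    def is_member(target, padding):
        # Two adaptive queries that differ only in the target element:
        # a difference of 1 in the revealed cardinality exposes membership.
        return psi_cardinality(padding | {target}) - psi_cardinality(padding) == 1

    padding = {"u1", "u3", "u4"}      # any padding set works; the difference isolates the target
    print(is_member("u5", padding))   # True  -> "u5" is in the server set
    print(is_member("u7", padding))   # False -> "u7" is not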
The AnonPSI findings highlight the challenges of secure data sharing. Our subsequent work with PrivacyGo takes this to the next level by offering more advanced privacy-preserving solutions that tackle these challenges directly. The code can be found in the PrivacyGo GitHub repository, with details discussed in our October blog post.
Budget Recycling in Differential Privacy
Building on the exploration of privacy challenges, this second publication dives into one of the core obstacles in differential privacy: efficiently managing the finite privacy budget. Differential privacy ensures sensitive data is protected by introducing controlled noise to computations, but its effectiveness is often constrained by the limited "privacy budget," which dictates how much a dataset can be used without risking privacy breaches.
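For readers less familiar with the mechanics, the sketch below shows the standard setup this work builds on: a toy Laplace mechanism whose accountant deducts epsilon from a fixed budget on every query using basic sequential composition. It is our own simplification and does not implement the paper's recycling mechanism.

    import random

    class BudgetedLaplace:
        """Toy epsilon accountant for the Laplace mechanism (illustrative only)."""

        def __init__(self, total_epsilon):
            self.remaining = total_epsilon

        def query(self, true_value, sensitivity, epsilon):
            if epsilon > self.remaining:
                raise RuntimeError("privacy budget exhausted")
            self.remaining -= epsilon          # basic sequential composition
            scale = sensitivity / epsilon
            # The difference of two exponentials is a Laplace(0, scale) sample.
            noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
            return true_value + noise

    accountant = BudgetedLaplace(total_epsilon=1.0)
    print(accountant.query(true_value=42.0, sensitivity=1.0, epsilon=0.5))
    print(accountant.remaining)  # 0.5 left for any further analyses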
In this research, the authors propose an innovative technique for "recycling" unused portions of the privacy budget, allowing it to be repurposed for further analyses without compromising privacy guarantees. This breakthrough not only maximizes the utility of the budget but also enables more accurate analysis while staying within strict privacy constraints. Experimental results demonstrate the framework's effectiveness across a variety of real-world applications, making it a practical solution for resource-constrained environments.
This paper's contributions further advance the field of privacy-preserving techniques, offering a new perspective on how to balance utility and privacy in sensitive data analysis. For a deeper dive into this work, check out the full paper here.
Enabling Threshold Functionality for Private Set Intersection in Cloud Computing
Continuing the exploration of advancements in privacy-preserving technologies, the third publication, featured in IEEE Transactions on Information Forensics and Security, presents a novel protocol for Private Set Intersection (PSI) that introduces threshold functionality tailored for cloud computing environments. This protocol, known as threshold PSI, enables parties to compute set intersections only when the overlap surpasses a pre-defined threshold, offering greater flexibility and control in data-sharing scenarios.
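The functionality itself is easy to state. The snippet below captures threshold PSI as an idealized trusted party that releases the intersection only when the overlap meets the agreed threshold; the protocol's contribution is achieving this behavior cryptographically, without any such trusted party (the code and names here are purely illustrative).

    def ideal_threshold_psi(set_a, set_b, threshold):
        # Idealized functionality: no cryptography, just the agreed-upon rule.
        overlap = set_a & set_b
        if len(overlap) >= threshold:
            return overlap           # enough common items: reveal them
        return None                  # below threshold: neither party learns more

    print(ideal_threshold_psi({"a", "b", "c"}, {"b", "c", "d"}, threshold=2))  # reveals {'b', 'c'}
    print(ideal_threshold_psi({"a", "b", "c"}, {"c", "d", "e"}, threshold=2))  # None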
The research addresses the challenges posed by large-scale datasets, optimizing the protocol's performance for the heavy workloads common in cloud environments. Alongside these performance improvements, the publication includes a thorough security analysis, demonstrating the protocol's resilience against malicious actors and ensuring strong privacy guarantees. For a detailed look at this research, you can read the full paper here.
Characterizing Context Influence and Hallucination in Summarization
Rounding out the list, this final paper tackles a critical challenge in natural language processing (NLP): the phenomenon of "hallucination" in text summarization systems. Hallucination occurs when models generate content that is factually inaccurate or contextually irrelevant, posing significant obstacles to the reliability of AI-driven tools.
The research delves into how contextual factors influence the accuracy of summarization systems, shedding light on the root causes of hallucination in generated content. To address this, the authors propose a comprehensive framework for quantifying and mitigating hallucination risks. They also outline practical strategies to enhance the reliability of summarization tools, making them more dependable for real-world applications.
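As a very rough illustration of what grounding a summary in its context can mean operationally, the snippet below scores the share of summary tokens that never appear in the source. This is our own crude proxy, not the influence or hallucination measures proposed in the paper, but it conveys the basic idea of checking generated content against its context:

    def unsupported_token_rate(context, summary):
        # Crude hallucination proxy: fraction of summary tokens absent from the context.
        context_tokens = set(context.lower().split())
        summary_tokens = summary.lower().split()
        if not summary_tokens:
            return 0.0
        unsupported = [t for t in summary_tokens if t not in context_tokens]
        return len(unsupported) / len(summary_tokens)

    context = "the model was trained on public data and released in march"
    summary = "the model was trained on private data"
    print(round(unsupported_token_rate(context, summary), 2))  # 0.14 -> "private" is unsupported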
Though centered on NLP systems, this work has far-reaching implications for building trustworthy and transparent AI applications across a range of domains. If you'd like to explore the full details, the paper is available here.
Advancing Privacy and Security
These publications reflect our mission to lead in privacy and security innovation. By addressing real-world challenges in PSI, differential privacy, cloud computing, and AI reliability, we aim to empower developers, researchers, and organizations with solutions that are both secure and practical.
We believe that fostering a culture of collaboration between industry and academia is essential for driving impactful progress. Stay tuned for more research updates as we continue our journey toward building a safer, more secure digital world.
For those interested in exploring these publications further, feel free to access the papers linked above or reach out to our team with questions. We are always excited to engage with the broader community on privacy and security topics.