TikTok for Developers
Bulk Suppressions: A new ESLint feature for large codebases
by Kevin Yang, Frontend Engineer, TikTok
Open source

Kevin Yang is a frontend engineer from TikTok's web infrastructure team. In this post, he describes his journey implementing a new feature for ESLint that simplifies suppressing legacy violations when introducing new style rules to a large, preexisting codebase. The initial version of this feature is being released as part of Rush Stack's feature patch for ESLint.


As a codebase grows in size, engineering teams face increasing pressure to standardize coding practices and catch more bugs at compile time. For TikTok's TypeScript projects we use ESLint, a popular static analysis tool. ESLint offers a large ecosystem of premade lint rules, plus the ability to define your own custom rules via plugins. However, each time we improve our ruleset, a familiar problem arises: how to handle violations in existing source files. In a small codebase, engineers can simply do the work to fix them. But what if there are hundreds or thousands of these "legacy violations"? New work should certainly follow the new rules, whereas tampering with old files can be costly and often isn't a business priority. It could even cause regressions.

An obvious solution is to suppress legacy violations using comment directives such as // eslint-disable-next-line, but we found that this didn't scale very well. Engineers normally add such directives only in special situations where a rule is not relevant. From a policy perspective, we want those situations to be rare. When a comment directive is spotted during the code review process, reviewers should take notice and ask for justification, or maybe start a conversation about improving the rule itself. It's hard work to establish an engineering culture where people think this way, so we worried about these norms being undermined if an automated tool added thousands of suppression comments into source files, which would also clutter the code.

A new way to handle legacy violations

This led to an insight that machine-generated "bulk suppressions" can be stored and managed separately from human-authored directives. It's not an entirely novel idea — for example, the .NET Framework's analyzers support a global suppression file along these lines. Since there didn't seem to be an established way to do this with ESLint, I'm excited to announce that we've created one! Today, I'll share a bit about our implementation, how it came together, plus some thoughts about evolving an internal prototype into a public contribution to an established open source project.

Basic idea

To use ESLint bulk suppressions, start by installing the NPM packages:

cd your-project

# Add the patch to your "devDependencies"
npm install --save-dev @rushstack/eslint-patch

# Install the command-line tool
npm install --global @rushstack/eslint-bulk

The initial release of our feature is part of Rush Stack's @rushstack/eslint-patch, which provides a clever mechanism for injecting experimental features into your existing ESLint version. It even works with the ESLint VS Code extension. The patch is loaded by adding this line to the top of your config file:


<your-project>/.eslintrc.js

require("@rushstack/eslint-patch/eslint-bulk-suppressions"); // 👈👈👈

module.exports = {
  rules: {
    rule1: 'error',
    rule2: 'warn'
  },
  parserOptions: { tsconfigRootDir: __dirname }
};

The command-line (CLI) syntax is as follows:

# Suppress legacy violations of one or more rules:
eslint-bulk suppress --rule NAME1 [--rule NAME2...] PATH1 [PATH2...]

# Suppress all legacy violations:
eslint-bulk suppress --all PATH1 [PATH2...]

# Clean up old suppressions:
eslint-bulk prune

Bulk suppressions are stored in a JSON file in your project folder alongside package.json. The file format looks like this:


.eslint-bulk-suppressions.json

{
  "suppressions": [
    {
      "rule": "no-var",
      "file": "./src/your-file.ts",
      "scopeId": ".ExampleClass.example-method"
    }
  ]
}
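To make the file format concrete, here is a minimal sketch of how entries like the one above could be looked up while linting. The helper names (suppressionKey, isSuppressed) and the "|" key separator are assumptions made for illustration; the actual patch may match entries differently.

```javascript
// Hypothetical in-memory copy of .eslint-bulk-suppressions.json.
const suppressionsFile = {
  suppressions: [
    {
      rule: "no-var",
      file: "./src/your-file.ts",
      scopeId: ".ExampleClass.example-method"
    }
  ]
};

// Build one string key per entry so each lookup is a single Set check.
// The "|" separator is an assumption, not part of the file format.
function suppressionKey({ file, scopeId, rule }) {
  return [file, scopeId, rule].join("|");
}

const suppressedKeys = new Set(suppressionsFile.suppressions.map(suppressionKey));

// A violation is hidden only if its file, scope, and rule all match an entry.
function isSuppressed(violation) {
  return suppressedKeys.has(suppressionKey(violation));
}

const legacyViolation = {
  rule: "no-var",
  file: "./src/your-file.ts",
  scopeId: ".ExampleClass.example-method"
};
const newViolation = {
  rule: "no-var",
  file: "./src/your-file.ts",
  scopeId: ".ExampleClass.other-method"
};

console.log(isSuppressed(legacyViolation)); // true
console.log(isSuppressed(newViolation)); // false
```

Because the rule name is part of the key, a different rule violated inside the same scope would still be reported in this sketch.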

Here's a typical usage workflow:

  1. Check out your main branch, which is in a clean state where ESLint reports no violations.
  2. Update your configuration to enable the latest lint rules; ESLint now reports thousands of legacy violations.
  3. Run eslint-bulk suppress --all ./src to update .eslint-bulk-suppressions.json.
  4. ESLint no longer reports violations, so commit the results to Git and merge your pull request.
  5. Over time, engineers may improve some of the suppressed code, in which case the associated suppressions are no longer needed. Run eslint-bulk prune periodically to find and remove unnecessary suppressions from .eslint-bulk-suppressions.json, ensuring that new violations will now get caught.
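The pruning step in the workflow above can be pictured with a small sketch. This is not the eslint-bulk implementation; pruneSuppressions and the sample entries are hypothetical, and the sketch simply assumes that a suppression is still needed when a current violation has the same file, scopeId, and rule.

```javascript
// Hypothetical sketch: keep only the suppressions that still hide a violation.
function pruneSuppressions(suppressions, currentViolations) {
  const keyOf = ({ file, scopeId, rule }) => JSON.stringify([file, scopeId, rule]);
  const stillNeeded = new Set(currentViolations.map(keyOf));
  return suppressions.filter((entry) => stillNeeded.has(keyOf(entry)));
}

const beforePrune = [
  { rule: "no-var", file: "./src/car.ts", scopeId: ".Car.drive" },
  { rule: "no-var", file: "./src/boat.ts", scopeId: ".Boat.sail" }
];

// Suppose an engineer has since fixed boat.ts, so only car.ts still violates no-var.
const violationsFoundNow = [
  { rule: "no-var", file: "./src/car.ts", scopeId: ".Car.drive" }
];

const afterPrune = pruneSuppressions(beforePrune, violationsFoundNow);
console.log(afterPrune.length); // 1
console.log(afterPrune[0].file); // ./src/car.ts
```

Once the stale entry is gone, a new no-var violation in the hypothetical .Boat.sail scope would be reported again.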

The shifting sands of code scopes

We considered many possibilities before finally arriving at the scopeId field seen above. Source code constantly changes, so specifiers must be flexible but not too broad. For example, ESLint can easily suppress a given rule for an entire source file (identified by file path), but such a broad scope makes it too easy for non-legacy violations to creep in.

At the other extreme, the // eslint-disable-next-line directive perfectly identifies a single line, but our whole premise was to avoid injecting these comment directives. We also considered storing line numbers in .eslint-bulk-suppressions.json, but that would have been too brittle. Even if an engineer manually adjusted the line numbers in the JSON file, they could get invalidated again when merging Git branches. We tried ignoring file location and matching specific error strings (for example, "'reportIconUrl' is defined but never used." from the no-unused-vars rule). This works well for some rules, but too many ESLint rules emit the same string for every violation.
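As a small illustration of why stored line numbers are brittle, the sketch below uses a made-up source file and shows how an unrelated edit shifts the line of a legacy violation. The file contents and the lineOfVar helper are hypothetical.

```javascript
// Hypothetical source file, one array element per line.
const originalLines = [
  "class Car {",
  "  drive() {",
  "    var gas = 10;",
  "    return gas > 0;",
  "  }",
  "}"
];

// An unrelated everyday edit: someone adds a comment at the top of the file.
const editedLines = ["// Vehicle helpers", ...originalLines];

// 1-based line number of the legacy `var` declaration.
const lineOfVar = (lines) => lines.findIndex((line) => line.includes("var gas")) + 1;

console.log(lineOfVar(originalLines)); // 3
console.log(lineOfVar(editedLines)); // 4
```

A suppression recorded against line 3 would no longer match after this edit, even though the legacy code itself was not touched.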

Taking inspiration from .NET's SuppressMessageAttribute, we ultimately decided to invent a scopeId syntax that identifies JavaScript code blocks along the parent hierarchy from the @typescript-eslint/parser abstract syntax tree (AST). If you're unfamiliar with parse trees, check out AST Explorer for an excellent visualization.

For a concrete example, consider a class Car with method drive() that violates some ESLint rules. Let's say the gas variable uses the deprecated var instead of const, which violates the no-var rule. Also suppose the drive() function's return value relies on type inference (which is appealing for casual projects but problematic in a large shared codebase). This violates the explicit-function-return-type rule. The scopeId notation identifies this region using the string ".Car.drive":

For a class Car with method drive(), our scopeId syntax represents that block as ".Car.drive"
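The Car example can be sketched in code as follows. The class body is illustrative, and the tiny hand-built tree is only a stand-in for the parser's AST: real trees contain intermediate nodes (class body, function expression, block statement) that are omitted here. The scopeIdFor and nameOf helpers are hypothetical and are not the algorithm used by the patch.

```javascript
class Car {
  drive() {
    var gas = 10; // the no-var rule would flag this declaration
    // In a TypeScript file, the missing return type annotation on drive()
    // would also trigger explicit-function-return-type.
    return gas > 0;
  }
}

// Hand-built, heavily simplified stand-in for the AST of the class above.
const classNode = { type: "ClassDeclaration", id: { name: "Car" }, parent: null };
const methodNode = { type: "MethodDefinition", key: { name: "drive" }, parent: classNode };
const varNode = { type: "VariableDeclaration", kind: "var", parent: methodNode };

// Only named constructs contribute a segment; other nodes are skipped.
function nameOf(node) {
  if (node.type === "ClassDeclaration") return node.id.name;
  if (node.type === "MethodDefinition") return node.key.name;
  return undefined;
}

// Walk up the parent chain, collecting names from the outermost scope inward.
function scopeIdFor(node) {
  const names = [];
  for (let current = node; current; current = current.parent) {
    const name = nameOf(current);
    if (name !== undefined) names.unshift(name);
  }
  return "." + names.join(".");
}

console.log(scopeIdFor(varNode)); // .Car.drive
```

Both violations in drive() sit under the same parent chain, so in this sketch they would share the ".Car.drive" scope and differ only by rule name.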

Although seemingly straightforward, this approach brings some thorny design questions. What if there are two different top-level things called Car in different parts of the file, due to TypeScript declaration merging? Did you know about class expressions that allow a statement like return class Car { drive() {} };? Compared to languages like C#, TypeScript has a lot of unusual nesting constructs, and also many possibilities for the same scopeId to match multiple disjoint sections of a file.

I worried a lot about this at first, but in practice these concerns turned out to be negligible because the problem of bulk suppressions is inherently statistical. If a few extra code blocks get suppressed unnecessarily, that's no worse than if a new violation gets introduced in a block containing a legacy violation.

In hindsight, the important design requirements are:

  • Intuitive reading: Users should be able to see a string like ".Car.drive" and guess which function it's referring to without having to consult a technical spec. Intuitive writing isn't so important, since the scopeId is generally machine-generated.
  • Fast matching: The AST visitor algorithm must be able to match scopeId efficiently.
  • Stability across edits: Patterns should be unaffected by everyday edits and Git merges.

Thus, although we do expect further evolution of the scopeId design based on user feedback, the key requirements are sufficiently satisfied.

A monkey patch? What's that?

Because ESLint is so prominent, official contributions involve an RFC process and extended collaboration, which seemed daunting for an untested prototype. We considered forking ESLint, but this could have hindered adoption: users may hesitate to switch to an unofficial NPM package. A fork also brings ongoing maintenance costs to stay current with the upstream project. Instead, we chose initially to contribute our feature to Rush Stack's @rushstack/eslint-patch, which we were already using at TikTok. Rather than patching source files, it works by modifying the ESLint software in memory at runtime, a technique called "monkey patching." For JavaScript, this means overwriting object prototypes, which in this case happens when ESLint evaluates the require("@rushstack/eslint-patch/eslint-bulk-suppressions"); statement while reading the .eslintrc.js file.
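For readers new to the technique, here is a minimal sketch of monkey patching a prototype method. TinyLinter is a made-up stand-in, not ESLint's real Linter class, and the suppressedRules set is hypothetical; the point is only to show a method being replaced in memory while the original is still called.

```javascript
// Stand-in for a third-party class that we cannot edit on disk.
class TinyLinter {
  verify(code) {
    return code.includes("var ")
      ? [{ ruleId: "no-var", message: "Unexpected var." }]
      : [];
  }
}

// Hypothetical set of rules whose violations should be hidden.
const suppressedRules = new Set(["no-var"]);

// Monkey patch: overwrite the prototype method at runtime,
// keeping a reference to the original so it can still do the real work.
const originalVerify = TinyLinter.prototype.verify;
TinyLinter.prototype.verify = function patchedVerify(...args) {
  const problems = originalVerify.apply(this, args);
  return problems.filter((problem) => !suppressedRules.has(problem.ruleId));
};

// Every instance, including ones created by other code, now sees the patched method.
const tinyLinter = new TinyLinter();
console.log(tinyLinter.verify("var gas = 10;")); // []
console.log(originalVerify.call(tinyLinter, "var gas = 10;").length); // 1
```

This works because verify is reachable from outside the module through the prototype.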

Monkey patching is easy when the code to patch belongs to an exported function, but unfortunately the relevant ESLint algorithm operates inside a private JavaScript closure. So although the changes themselves are relatively minor, I basically had to substitute the entire linter.js module. This file evolves somewhat across ESLint releases, which meant our NPM package would need to include multiple patched copies of linter.js: one for each supported ESLint version. The Rush Stack maintainers raised concerns about increased download times and also pointed out a licensing hitch, since linter.js has a different copyright. We solved that by reading the file from ESLint's folder and replacing it at runtime, thus avoiding the need to redistribute it. To avoid taking a dependency on a POSIX-style patch library, we decided to hand-code a simple regex-based patcher, which turned out to allow more intelligent rewriting and is probably a superior approach. You can find the details in my PR #4366.
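The general idea of a regex-based patcher can be sketched as below. The source text, the pattern, and the injected isSuppressed call are all hypothetical; this is not the contents of linter.js or the patcher from the PR. The sketch assumes that a patch should fail loudly when its target pattern is missing, since the file being patched can change between releases.

```javascript
// Hypothetical stand-in for a module's source text, read as a string at runtime.
const originalSource = [
  "function report(problem) {",
  "  problems.push(problem);",
  "}"
].join("\n");

// Apply one regex-based edit, refusing to continue if the target is not found.
function patchSource(source, pattern, replacement) {
  if (!pattern.test(source)) {
    throw new Error(`Patch target not found: ${pattern}`);
  }
  return source.replace(pattern, replacement);
}

// Wrap the original statement in a check against a hypothetical isSuppressed() helper.
const patchedSource = patchSource(
  originalSource,
  /problems\.push\(problem\);/,
  "if (!isSuppressed(problem)) { problems.push(problem); }"
);

console.log(patchedSource);
```

Unlike a line-based diff, a pattern like this can keep matching when surrounding lines move, which is one way to read the "more intelligent rewriting" mentioned above.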

Open source is open to everyone

I'm writing this today not just to share what we've done at TikTok, but to tell you that you can get involved. Open source software (OSS) welcomes everyone's participation, whether you're an independent developer or a professional at a company. This feature originated from TikTok's engineering needs, but by sharing it with the world, we're hoping it could become so much more than that. Maybe with some contributions from you, you'll open up the door for use cases that we couldn't have even imagined. If you have a large codebase using ESLint, give the bulk suppressions feature a try, and let us know what you think.

