TikTok for Developers
Monitoring Your Assets in the Face of Emerging Cloud-Squatting Attacks
by Abdullah Husam Al-Sultani, Security Engineer, TikTok
Security

In our first post about Security, we talk about an emerging attack against cloud assets called Cloud-Squatting. We explain the root causes and propose a mitigating system design for such scenarios.

Introduction

In the ever-evolving landscape of cybersecurity, attackers are constantly inventing new techniques to compromise digital assets and infiltrate systems. In recent years, the adoption of cloud services has increased significantly, enabling organizations to leverage the benefits of scalability, availability, and cost-efficiency. However, this shift towards cloud-based infrastructure has also brought new security challenges, including the emergence of Cloud-Squatting attacks. Cloud-Squatting happens when a company leases space and IP addresses on a public cloud server, uses them, and then releases them back to the cloud vendor, but forgets to remove resources (e.g. domain names) that still reference these IPs or services. Server space providers such as AWS, GCP, and Azure then reassign the same addresses to other customers. If the new customer is a bad actor, it can take advantage of the information still arriving at the address that was intended for the previous organization.

Attack Scenarios

Subdomain takeover

DNS, which stands for Domain Name System, is a fundamental component of the internet that translates human-readable domain names into numerical IP addresses. All devices connected to the internet, such as computers, servers, and smartphones, use IP addresses to communicate with each other. However, remembering numerical IP addresses for every website or online service would be impractical for users.

Here's how DNS works:

  1. Domain Names: Users type domain names (e.g., www.example.com) into web browsers to access websites or online services.
  2. DNS Resolution: When a user enters a domain name, the device needs to obtain the corresponding IP address. This is where DNS comes into play.
  3. DNS Servers: DNS servers maintain a distributed database of records that map domain names to IP addresses and other data. There are different types of DNS records, such as A records (IPv4 addresses), AAAA records (IPv6 addresses), MX records (mail server addresses), CNAME records (aliases to other names), and others.
  4. DNS Query: When a user's device requests the IP address for a domain, it sends a DNS query to a DNS resolver (usually provided by the Internet Service Provider or a third-party DNS service).
  5. Recursive Query: If the DNS resolver doesn't have the IP address in its cache, it resolves the name on the device's behalf. It first contacts a root DNS server, which directs it to the top-level domain (TLD) servers associated with the domain name.
  6. Iterative Queries: The resolver then follows these referrals with iterative queries, from the TLD servers to the authoritative DNS servers responsible for the specific domain, until it receives the authoritative answer containing the IP address.
  7. Response: The DNS resolver caches the IP address and returns it to the user's device. Subsequent requests for the same domain can be answered directly from the cache, reducing the need for repeated DNS queries.

In this context, let's assume that a company, XYZ, started using a server from a cloud provider for a specific short-term project. After creating the environment for that server instance (the virtual computer), the cloud provider assigned the IP address 203.0.113.42 to that asset. Then, XYZ added this IP address to their DNS records. For example, promo.example.com is a DNS entry with an A record that points to the IP address of the server instance.

After the project is over, a developer decides to remove the server instance. The cloud provider then returns 203.0.113.42 to its pool of free-to-use IPs. The developer forgets to clean up the DNS config, so the promo.example.com domain still points to the IP address that has now been released. A malicious actor can register with the cloud provider and repeatedly request a new IP address for their virtual computer until 203.0.113.42 gets assigned to them. Since the DNS entry still points to this IP address, the attacker can serve their own content on this domain (e.g. HTML pages, JS, etc.).

When an attacker gains control over a subdomain, they can exploit it in different ways. For example:

  1. Phishing Attacks:
    • Attackers can use the compromised subdomain to host phishing websites, imitating legitimate services. This can trick users into entering sensitive information such as login credentials, credit card details, or personal information.
  2. Malware Distribution:
    • The compromised subdomain can be used to host malicious content or distribute malware. Visitors to the subdomain may unknowingly download and execute malicious files, compromising the security of their devices.
  3. Reputational Damage:
    • If the compromised subdomain is associated with a reputable organization, the attack can tarnish the organization's reputation. Users may lose trust in services offered under the affected subdomain, impacting the organization's brand image.
  4. Cross-Site Scripting (XSS) Attacks:
    • Attackers can serve malicious scripts from the compromised subdomain. Any page that loads them is effectively exposed to cross-site scripting, which can be exploited for session hijacking or for performing actions on behalf of the affected users. This can compromise the security and privacy of users interacting with the affected subdomain.
  5. Data Exfiltration:
    • An attacker gaining control of a subdomain may use it as a platform for data exfiltration. This could involve stealing sensitive information from users interacting with the compromised subdomain, or leveraging it as a pivot point to access other parts of an organization's network, for example by leaking OAuth tokens.

Data leak

Another type of attack is data leakage. To illustrate it, the Go code below opens a UDP connection from the local machine to a cloud server instance in order to send log messages to it. As shown, the cloud-provided server IP address 203.0.113.42 is hardcoded in the code (it could also be fetched from a config file).

package main

import (
        "log"
        "net"
)
// Address of the cloud server instance, hardcoded by the developer.
const (
        IP   = "203.0.113.42"
        PORT = "8080"
)

func main() {
        // Resolve the remote IP address and port
        remoteAddr, err := net.ResolveUDPAddr("udp", net.JoinHostPort(IP, PORT))
        if err != nil {
                log.Fatal("Error resolving remote address:", err)
        }

        // Open a UDP connection. A nil local address lets the OS choose the
        // source address and port; binding to localhost would not be able
        // to reach a non-loopback remote address.
        conn, err := net.DialUDP("udp", nil, remoteAddr)
        if err != nil {
                log.Fatal("Error opening UDP connection:", err)
        }
        defer conn.Close()

        // Log data to be sent
        logData := "Log entry: This is a sample log message."

        // Send log data to the remote IP address
        _, err = conn.Write([]byte(logData))
        if err != nil {
                log.Fatal("Error sending log data:", err)
        }

        log.Println("Log data sent successfully.")
}

This code was written for the promo.example.com project from the example above. The developer hardcoded the IP address because it was easier to test the project that way. Later, the developer deletes the server instance with the IP 203.0.113.42. An attacker can repeatedly request a new IP address until 203.0.113.42 gets assigned to their server. After that, they can listen for incoming traffic on every port and read the data that is received (in the example above, the log entries). If the Go code is still running somewhere, its log data will leak to the attacker-controlled cloud server.

Hosting malicious files

One more example, with a different scenario but a similar root cause, is using a cloud service and then deleting it from the organization's account while it is still referred to in other places (e.g. DNS records). In this example, we will use object storage instead of cloud computing to demonstrate the impact of this issue.

Let's assume that XStorage is a widely used object storage service provided by a cloud provider. It allows individuals and organizations to store and retrieve any amount of data at any time, from anywhere on the web. XStorage is designed to be highly scalable, durable, and secure, making it suitable for a variety of use cases, ranging from simple storage to complex big data analytics.

As the contents of XStorage buckets can be accessed via HTTP, they are suitable for storing and delivering static assets like images, videos, stylesheets, user-uploaded content, or entire static websites. In these scenarios, XStorage buckets are assigned an HTTP address, such as http://[bucket-name].xstorage-website-us-west-2.cloudprovider.com. Organizations often set up their own subdomains and direct them to the bucket's address using a DNS CNAME record, as illustrated below:

static.example.com.                                               59  IN  CNAME  static.example.com.xstorage-website-us-west-2.cloudprovider.com.
static.example.com.xstorage-website-us-west-2.cloudprovider.com.  59  IN  CNAME  xstorage-website-us-west-2.cloudprovider.com.
xstorage-website-us-west-2.cloudprovider.com.                     4   IN  A      127.0.0.1

The DNS entry for static.example.com shows that it points to an XStorage bucket. If this bucket were deleted from our cloud account, someone else could register a bucket with the same name and serve different content for the files that our pages reference, such as the script in the page below.

<!DOCTYPE html>
<html>
   <head>
         <script src="https://static.example.com/status.js"></script>
   </head>
   <body>
      <h1>Example Blog</h1>
      ...
   </body>
</html>

If the attacker starts serving malicious JS files on static.example.com, they will run on every page that includes a link to those files.
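One way to spot this situation is to check whether a CNAME target belongs to a known service and whether the response served for it looks like that service's "resource does not exist" page. The sketch below shows only the matching step, under stated assumptions: the `fingerprint` type, the `matchTakeover` function, and the `BucketNotFound` marker are hypothetical, and the response body is hardcoded where a real check would fetch it over HTTP.

```go
package main

import (
	"fmt"
	"strings"
)

// fingerprint describes how an unclaimed resource at a service looks from
// the outside. The values used in main are made up for the XStorage example.
type fingerprint struct {
	Service     string // human-readable service name
	CNAMESuffix string // suffix of CNAME targets that belong to the service
	BodyMarker  string // text returned when the resource does not exist
}

// matchTakeover reports which service, if any, the CNAME target belongs to,
// and whether the response body carries that service's "unclaimed" marker.
func matchTakeover(cnameTarget, body string, fps []fingerprint) (service string, vulnerable bool) {
	// DNS names are case-insensitive and may end with a trailing dot.
	target := strings.TrimSuffix(strings.ToLower(cnameTarget), ".")
	for _, fp := range fps {
		suffix := strings.ToLower(fp.CNAMESuffix)
		if target == suffix || strings.HasSuffix(target, "."+suffix) {
			return fp.Service, strings.Contains(body, fp.BodyMarker)
		}
	}
	return "", false
}

func main() {
	fps := []fingerprint{{
		Service:     "XStorage",
		CNAMESuffix: "xstorage-website-us-west-2.cloudprovider.com",
		BodyMarker:  "BucketNotFound", // assumed error text, not from a real service
	}}

	// Stand-in for the body of an HTTP response from static.example.com.
	body := "<Error><Code>BucketNotFound</Code></Error>"

	service, vulnerable := matchTakeover(
		"static.example.com.xstorage-website-us-west-2.cloudprovider.com.", body, fps)
	fmt.Println(service, vulnerable)
}
```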

Solution

Overview

Preventing this issue at scale requires a proactive approach. Developers have to keep track of their cloud assets that could be reassigned to different parties. We have two different root causes for this issue:

  1. An IP address gets reassigned by a cloud provider such as AWS, GCP, Azure, etc. to a different entity.
  2. A DNS CNAME record points to a resource name at a service provider. If that name can be registered by a different user after it is released, it could lead to subdomain takeover.

We will design a system that covers these two cases and stays flexible enough to accept more plugins.

System design

System components

To build the system, we need the following components:

  1. DNS tracking worker #1:
    1. This worker keeps our domain-related data complete and up to date. Its primary task is fetching domains and their associated information from our DNS records.
  2. Cloud assets tracking worker #2:
    1. This worker fetches our public IPs and their corresponding information from primary cloud sources such as AWS, GCP, Azure, etc. by using their APIs.
    2. It stores this information in our database, which the other workers use for their tasks so that they operate on the latest available data.
  3. Matching worker #3:
    1. Expired IP Address Management (Fetch-Based): It first inspects our records for any IP addresses found to be expired in the last data fetch. It also checks whether a DNS entry has an IP address that belongs to a cloud provider. For each such address, it generates a report and checks whether the expired address is still referenced in any of our DNS records or used within our codebase or configuration center.
    2. Expired IP Address Management (Cloud Management Status-Based): Alongside the fetch-based checks, it also detects expired IP addresses based on the status property reported by our internal cloud system. Any expired addresses detected this way are flagged and then checked against our DNS records, codebase, and configuration center.
    3. Domain Information and Vulnerability Assessment: It also inspects domain information, focusing on CNAME (Canonical Name) records. When a CNAME record matches a well-known vulnerable service, the worker sends a request to the domain and checks the response for specific fingerprint patterns. It relies on the following:
      • Subdomain Takeover Validation with the subjack tool: It uses the subjack tool to validate potential subdomain takeover issues.
      • Reference to the 'Can I Take Over XYZ' Table: Together with the subjack tool, it consults the 'Can I Take Over XYZ' repository, a knowledge base for identifying subdomain takeover risks.
    4. Chat Bot Communication: When an issue or vulnerability is found, the worker sends a message to our chat bot with detailed information about it, so that our team is informed promptly and can take action to fix the problem.
  4. IP ranges and CNAMEs collector worker #4:
    1. This worker collects the IP address ranges published by seven prominent cloud providers. It also keeps a record of the available subdomain takeover fingerprints.
    2. The worker monitors changes to the cloud providers' IP range URLs and to the subdomain takeover fingerprints file. For instance, it keeps track of the responses from sources including but not limited to:
      1. https://ip-ranges.amazonaws.com/ip-ranges.json
      2. https://www.gstatic.com/ipranges/cloud.json
      3. https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519
      4. https://raw.githubusercontent.com/EdOverflow/can-i-take-over-xyz/master/fingerprints.json

Diagram

[Diagram: all of the system components above in a single view]


Result

Running the system should identify all of the following:

  • Domains that still point to a dangling IP address that we previously used.
  • Expired IPs that are still referred to in code or configs.
  • Domains susceptible to subdomain takeover based on the response fingerprint.

Conclusion

To summarize, the adoption of cloud services has grown substantially, giving organizations the advantages of scalability, availability, and cost-effectiveness. However, this shift to cloud-centric infrastructure has also introduced new security challenges, such as Cloud-Squatting attacks. This article covered the root causes of the issue and how it can be addressed in an infrastructure that uses cloud services.
