aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Engineering

Introducing Easier De-Identification Of Cloud Storage Data

  • aster.cloud
  • August 24, 2022
  • 4 minute read

De-identification of Cloud Storage just got easier

Many organizations require effective processes and techniques for removing or obfuscating certain sensitive information in the data they store. An important tool to achieve this goal is de-identification. Defined by NIST as a technique that “removes identifying information from a dataset so that individual data cannot be linked with specific individuals. De-identification can reduce the privacy risk associated with collecting, processing, archiving, distributing or publishing information.”

Always striving to make data security easier, today we are happy to announce the availability of a de-identification action for our Cloud Storage inspection jobs. Now, you can de-identify Cloud Storage objects, folders, and buckets without needing to run your own pipeline or custom code. Additionally, we have enhanced our transforms by adding a new dictionary replacement method that can help you achieve stronger privacy protection – especially with unstructured data you might store like customer support chat logs.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

The “De-identify findings” Action

The “de-identify findings” action for Cloud DLP inspection jobs is a fully managed feature that creates a de-identified copy of the data objects that are inspected. This means that you can inspect a Cloud Storage bucket for sensitive data like Personal Identifiable Information (PII) and then create a redacted copy of these objects all with a few clicks in the Console UI. No need to write custom code or manage complex pipelines and since it’s fully managed, it will auto-scale for you without you needing to manage quota.

 

This new action supports the following data types:

  • Text files
  • Comma- or tab-separated values
  • Images (see regional limitations)
Read More  People Want Data Privacy But Don't Always Know What They're Getting

Once enabled, the DLP job will perform an inspection of the data and produce a de-identified copy of all supported files into the output bucket or folder.

You can also use the new de-identify action on Job Triggers to automatically de-identify new content as it appears on a recurring schedule. This is useful for creating a workflow with a safe drop zone for incoming files that need to be de-identified before being made accessible.

What can automatic De-identification do?

Cloud DLP provides a set of transformation techniques to de-identify sensitive data while attempting to make the data still useful for your business.  These techniques include:

  • Redaction: Deletes all or part of a detected sensitive value.
  • Replacement: Replaces a detected sensitive value with a specified surrogate value.
  • Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*).
  • Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Cloud DLP supports several types of tokenization, including transformations that can be reversed, or “re-identified.”
  • Bucketing: “Generalizes” a sensitive value by replacing it with a range of values. (For example, replacing a specific age with an age range, or temperatures with ranges corresponding to “Hot,” “Medium,” and “Cold.”)
  • Date shifting: Shifts sensitive date values by a random amount of time.
  • Time extraction: Extracts or preserves specified portions of date and time values.

New Dictionary Replace method

When a sensitive data element is found, dictionary replacement replaces it with a randomly selected value from a list of words that you provide. This transformation method is especially useful if you want the redacted output to have more realistic surrogate values.

Read More  Cloud Data Loss Prevention (Cloud DLP) Overview

Consider the following example: You collect customer support chat logs as part of providing service to your customers. These support chat logs contain various types of Personal Identifiable Information (PII) including people’s names and email addresses. Cloud DLP can find and de-identify the sensitive elements with static replacements such as “[REDACTED]” to help prevent someone from seeing this sensitive data.

With the new dictionary replacement method you can instead replace these findings with a randomly selected value from a dictionary. This dictionary replacement provides two key benefits over static replacement:

  1. The resulting output can look more realistic
  2. Because the output looks more realistic, it can help conceal any residual names (a privacy de-identification technique sometimes referred to as “hiding in plain sight”)

An example of this:


Input:

[Agent] Hi, my name is Jason, can I have your name?

[Customer] My name is Valeria

[Agent] In case we need to contact you, what is your email address?

[Customer] My email is [email protected]

[Agent] Thank you.  How can I help you?

De-identified Output:

[Agent] Hi, my name is Gavaia, can I have your name?

[Customer] My name is Bijal

[Agent] In case we need to contact you, what is your email address?

[Customer] My email is [email protected]

[Agent] Thank you.  How can I help you?


As you can see in the output, the names and email addresses have been replaced with a random value that both protects the original sensitive information but also makes the output look more realistic. This can make the data more useful and help “hide” any residual PII.

Read More  Cloud On Spain’s Terms: New Google Cloud Region In Madrid Now Open

Next Steps:

To learn more about De-Identification check out our Technical Docs, try De-identification of Storage in the Cloud Console and Watch a recent Google I/O talk on De-identification of data.

 

 

By: Scott Ellis (Senior Product Manager) and Jordanna Chord (Staff Software Engineer)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Automation
  • Cloud Storage
  • De-Identification
  • Google Cloud
You May Also Like
Points, Lines and a Question
View Post
  • Architecture
  • Design
  • Engineering
  • People

What Is The Point In Making Points?

  • November 26, 2025
View Post
  • Engineering
  • Software Engineering

Development gets better with Age

  • October 9, 2025
View Post
  • Engineering
  • Technology

Apple supercharges its tools and technologies for developers to foster creativity, innovation, and design

  • June 9, 2025
View Post
  • Engineering

Just make it scale: An Aurora DSQL story

  • May 29, 2025
View Post
  • Engineering
  • Technology

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials

  • March 9, 2025
View Post
  • Computing
  • Engineering

Why a decades old architecture decision is impeding the power of AI computing

  • February 19, 2025
View Post
  • Engineering
  • Software Engineering

This Month in Julia World

  • January 17, 2025
View Post
  • Engineering
  • Software Engineering

Google Summer of Code 2025 is here!

  • January 17, 2025

Stay Connected!
LATEST
  • 1
    Expectations vs. Reality: The AI We Thought We’d Have in 10 Years
    • June 19, 2026
  • digital-nomad-freelancer-worker-2151205464 2
    One paperwork problem – Get your Digital Nomad Visa employment documents fast from UK, EU or Singapore
    • June 16, 2026
  • 3
    Samsung Art Store Brings Art Basel to Homes Worldwide With New Curated Collection
    • June 15, 2026
  • 4
    You Do Not Need to Invest in the IPO of SpaceX, Anthropic, and OpenAI
    • June 10, 2026
  • 5
    The consequences of relying on AI for accurate news
    • June 10, 2026
  • 6
    Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers
    • June 10, 2026
  • 7
    WWDC26: Apple unveils next generation of Apple Intelligence, Siri AI, powerful parental controls, and an expansive set of software improvements
    • June 8, 2026
  • 8
    IBM and Google Cloud Announce Strategic Partnership to Scale AI with Human Expertise and AI‑Powered Delivery
    • June 4, 2026
  • Data center 9
    Data Sovereignty in Spain. It’s Not Just About the Law, It’s About Efficiency
    • June 3, 2026
  • 10
    Ink vs Pixels. What you miss versus what you are actually missing.
    • June 1, 2026
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    Banks race to patch new cyber vulnerabilities, and other cybersecurity news
    • May 25, 2026
  • pope-leo-xiv-cq5dam-1500.844 2
    Pope Leo XIV to Publish First Encyclical on Artificial Intelligence and Human Dignity on 25 May
    • May 22, 2026
  • 3
    Portfolio to Clients, and is Strengthened by Ongoing Project Glasswing Work
    • May 20, 2026
  • reMarkable Paper Pure 4
    Everything The reMarkable Paper Pure Actually Does
    • May 14, 2026
  • 5
    Scaling cloud and AI: Microsoft Azure’s commitment to Europe’s digital future
    • May 11, 2026
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.