aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Architecture
  • Programming
  • Public Cloud

From Receipts To Riches: Save Money W/ Google Cloud & Supermarket Bills – Part 1

  • aster.cloud
  • May 8, 2023
  • 6 minute read

In today’s world, every penny counts, and saving money on supermarket spending is no exception. Have you ever wondered: 

  • How many packets of quinoa did you buy this year? 
  • How much have egg prices gone up? 
  • Why are grocery bills high in certain months? 
  • What’s the most expensive item you’ve ever bought at the supermarket?

This is the first part of a two-part blog series that demonstrates how simple it is to combine managed services on Google Cloud to create a complete application that digitizes your grocery receipts and analyzes your spending patterns using Google Cloud services. This architecture is useful for any number of use cases where you want to apply the power of document AI to fully event-driven processing pipelines.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

In this first part of the blog series, we will discuss how to extract important information from your supermarket receipts, such as the date, store name, and items purchased, using Document AI. Storing this information in a Datastore allows you to access it quickly and easily. You can then use BigQuery to analyze the data and gain insights into your spending habits. This can help you identify areas where you can save money, such as by purchasing generic brands instead of name brands or buying in bulk. With this powerful combination of Google Cloud technologies, you can turn your receipts into a valuable data source that can help you save money and make smarter purchasing decisions. 

Don’t let your grocery receipts go to waste – turn them into riches with Google Cloud!

Architecture:

https://storage.googleapis.com/gweb-cloudblog-publish/images/1._Part1-ArchitectureDiagram.max-1200x1200.jpg

The following services will be used in this architecture:

  • Document AI: Document AI is a cloud-based AI tool that extracts structured data from unstructured documents, automates tasks, and saves businesses time and money. 
  • Cloud Functions: Google Cloud Functions is a serverless compute platform for scalable, event-driven applications.
  • BigQuery: BigQuery is a fast, easy-to-use, powerful serverless data warehouse for analyzing large amounts of data.
  • Cloud Datastore: Cloud Datastore is a fully managed, scalable, high-performance, durable, and secure NoSQL database for web, mobile, IoT, and big data applications.
  • Google Cloud Storage: Google Cloud Storage is a scalable, durable, and available object storage service for your data. It’s powerful, flexible, and cost-effective.
  • Cloud Logging: Cloud Logging is a fully-managed log management service. It collects, stores, and analyzes log data from GCP and on-premises.
Read More  Learn Why And How To Migrate Monolithic Workloads To Containers

Here are the steps to build this service: 

1. If you haven’t already, set up a Google Cloud account with Starter Checklist.

2. Set up Document AI Custom Document Extractor: 

a. This guide describes how to use Document AI Workbench to create and train a Custom Document Extractor that processes any document. 

b. Follow the same and create a custom processor that can process the invoice from your nearest supermarket that you visit most often (Document AI custom processors can support a wide range of languages). On average, we have observed ~15 receipts to train and to test each supermarket receipt model.

c. Processor schema should be as shown in the screenshot below. This schema covers all the information required on any supermarket bill, and most supermarket bills have these labels. Please note that if you change this schema, the code provided in step 4 below for Cloud Functions and BigQuery tables schema must be updated accordingly.

https://storage.googleapis.com/gweb-cloudblog-publish/images/2._DocAI_Schema.max-1500x1500.jpg

d. This Custom Document Extractor processor will be called within Cloud Functions to parse and identify any bill uploaded by the users via a web application or a phone application. 

e. Here is a sample invoice used in this application. You can train on any invoice format as long as the schema in step 2 (c) above remains the same. The invoice should have an ID, items sold, date, items with their cost, sales tax, and subtotal.Here is an example of an invoice that follows this schema:

https://storage.googleapis.com/gweb-cloudblog-publish/images/3._InvoiceExample.max-400x400.jpg

3. Set up Google Cloud Storage: Follow this documentation and create storage buckets to store uploaded bills. 

4. Set up Google Cloud Function: 

a. For instructions on how to create a second-generation Cloud Function with Eventarc trigger that will be activated when a new object is added to the cloud storage created in step 3, please see this documentation. Please note the following while creating a cloud function and adding Eventarc.

i. When creating the function, make sure that “Allow internal traffic and traffic from Cloud Load Balancing” is selected in the network settings.

Read More  Announcing Support For On-Premises Windows Workloads With Certificate Authority Service

ii. When adding an Eventarc trigger, choose the “google.cloud.storage.object.v1.finalized” event type.

iii. It is recommended to create a new service account instead of using the “Default Compute Service Account” when creating Eventarc. The new service account should be granted the “Cloud Run Invoker” and “Eventarc Event Receiver” permissions.

iv. When creating a Cloud Function, do not use the “Default Compute Service Account” as the “Runtime service account”. Instead, create a new one and grant it the following roles to allow it to connect to DataStore, BigQuery, Logging, and Document AI: BigQuery Data Owner, Cloud Datastore User, Document AI API User, Logs Writer, Storage Object Creator, and Storage Object Viewer.

v. Once done, service accounts and permissions should appear as follows:

https://storage.googleapis.com/gweb-cloudblog-publish/images/4._IAM_permissions.max-1000x1000.jpg

b. Next, open the Cloud Function you created and select .NET 6.0 as the runtime. This will allow you to write C# functions.

c. To ensure a successful build, replace the current code in the Cloud Function with this code, replacing the variables in the code with the appropriate Google Cloud service names. Replace content in HelloGcs.csproj and Function.cs files.

d. Deploy the Cloud Function.

e. This code creates two types (Kind) of entities, “Invoices” and “Items”, in Datastore. Please refer to this documentation for more information on Datastore.

f. This code also creates a new LogName under global scope. Please refer to this documentation for more information on creating custom log entries. 

5. Set up BigQuery:

a. To create a new BigQuery dataset, please follow these instructions.

b. To create a new table called “Invoices” in the dataset created in step (a) above, follow the instructions in this documentation. In the Schema section and use the following JSON data as the “schema definition” for the “Invoices” table:

[
  {
    "name": "Items_sold_per_invoice",
    "mode": "NULLABLE",
    "type": "INTEGER",
    "description": null,
    "fields": []
  },
  {
    "name": "Purchase_Date",
    "mode": "NULLABLE",
    "type": "DATETIME",
    "description": null,
    "fields": []
  },
  {
    "name": "Sales_Tax",
    "mode": "NULLABLE",
    "type": "FLOAT",
    "description": null,
    "fields": []
  },
  {
    "name": "SubTotal",
    "mode": "NULLABLE",
    "type": "FLOAT",
    "description": null,
    "fields": []
  },
  {
    "name": "InvoiceID",
    "mode": "NULLABLE",
    "type": "BIGNUMERIC",
    "description": null,
    "fields": []
  }
]

When finished, the “Invoices” table and its schema should resemble the following:

https://storage.googleapis.com/gweb-cloudblog-publish/images/5._BQ_InvoicesTable.max-1700x1700.jpg

c. Please repeat the previous instructions, but this time create a new table called “Items.”

Read More  Google Cloud Migration Made Easy

d. To create a new table called “Items” in the dataset created in step (a) above, follow the instructions in this documentation. In the Schema section and use the following JSON data as the “schema definition” for the “Items” table:

[
  {
    "name": "Item_code",
    "mode": "NULLABLE",
    "type": "STRING",
    "description": null,
    "fields": []
  },
  {
    "name": "Item_cost",
    "mode": "NULLABLE",
    "type": "FLOAT",
    "description": null,
    "fields": []
  },
  {
    "name": "InvoiceID",
    "mode": "NULLABLE",
    "type": "STRING",
    "description": null,
    "fields": []
  }
]

When finished, the “Items” table and its schema should resemble the following:

https://storage.googleapis.com/gweb-cloudblog-publish/images/6._BQ_ItemsTable.max-1500x1500.jpg

6. Testing:

a. To test the application, upload the supermarket bill to the same cloud storage bucket that you created in step 3 and configured to “Receive events from” in Eventarc in step 4.

b. When a supermarket bill is uploaded to the blob storage, it triggers a cloud function that reads the file, processes it with a Document AI custom processor, extracts the relevant fields, and then stores the data in three different destinations: BQ tables (created in step 5 above), Datastore (two entities named “Invoices” and “Items” will be created), and a new text file per uploaded document with all extracted entities for archiving purposes.

There are many ways to explore data in BigQuery, and one of them is to explore data in Looker from BigQuery tables. Here are some of the sample reports you can build with data:

https://storage.googleapis.com/gweb-cloudblog-publish/images/7._InvoicesReport.max-1900x1900.jpg
https://storage.googleapis.com/gweb-cloudblog-publish/images/8._ItemsReport.max-1700x1700.jpg

Congratulations on digitizing your supermarket bills! You are one step closer to riches.

In the second part of this blog series, we will discuss how this architecture can be expanded to every supermarket bill in the market and how any supermarket bill can be digitized. So stay tuned!! 

In the meantime, please review Google Cloud’s tutorials to learn more about the platform and how to use it effectively.

By Krishna Chytanya Ayyagari Senior Customer Engineer, Infrastructure
Originally published at Google Cloud

Source: Cyberpogo


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • BigQuery
  • Cloud Datastore
  • Cloud Logging
  • Document AI
  • Google Cloud
You May Also Like
View Post
  • Public Cloud

Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers

  • June 10, 2026
Data center
View Post
  • Data
  • Public Cloud

Data Sovereignty in Spain. It’s Not Just About the Law, It’s About Efficiency

  • June 3, 2026
View Post
  • Data
  • Platforms
  • Public Cloud

PayPal’s historically large data migration is the foundation for its gen AI innovation

  • March 4, 2026
Google Cloud and ElevenLabs
View Post
  • Public Cloud
  • Technology

ElevenLabs Partners with Google Cloud for Cloud Services and the Latest NVIDIA Blackwell GPUs

  • February 26, 2026
View Post
  • Public Cloud

Delivering a secure, open, and sovereign digital world

  • February 12, 2026
View Post
  • Public Cloud

Formula E and Google Cloud Announce Multi-Year ‘Principal Partnership’

  • January 26, 2026
View Post
  • Public Cloud

Sawasdee Thailand! Google Cloud launches new region in Bangkok

  • January 23, 2026
View Post
  • Public Cloud

Retailers Help Mitigate Risk with Oracle’s AI-Driven Supply Chain Collaboration

  • January 11, 2026

Stay Connected!
LATEST
  • digital-nomad-freelancer-worker-2151205464 1
    One paperwork problem – Get your Digital Nomad Visa employment documents fast from UK, EU or Singapore
    • June 16, 2026
  • 2
    Samsung Art Store Brings Art Basel to Homes Worldwide With New Curated Collection
    • June 15, 2026
  • 3
    You Do Not Need to Invest in the IPO of SpaceX, Anthropic, and OpenAI
    • June 10, 2026
  • 4
    The consequences of relying on AI for accurate news
    • June 10, 2026
  • 5
    Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers
    • June 10, 2026
  • 6
    WWDC26: Apple unveils next generation of Apple Intelligence, Siri AI, powerful parental controls, and an expansive set of software improvements
    • June 8, 2026
  • 7
    IBM and Google Cloud Announce Strategic Partnership to Scale AI with Human Expertise and AI‑Powered Delivery
    • June 4, 2026
  • Data center 8
    Data Sovereignty in Spain. It’s Not Just About the Law, It’s About Efficiency
    • June 3, 2026
  • 9
    Ink vs Pixels. What you miss versus what you are actually missing.
    • June 1, 2026
  • 10
    Banks race to patch new cyber vulnerabilities, and other cybersecurity news
    • May 25, 2026
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • pope-leo-xiv-cq5dam-1500.844 1
    Pope Leo XIV to Publish First Encyclical on Artificial Intelligence and Human Dignity on 25 May
    • May 22, 2026
  • 2
    Portfolio to Clients, and is Strengthened by Ongoing Project Glasswing Work
    • May 20, 2026
  • reMarkable Paper Pure 3
    Everything The reMarkable Paper Pure Actually Does
    • May 14, 2026
  • 4
    Scaling cloud and AI: Microsoft Azure’s commitment to Europe’s digital future
    • May 11, 2026
  • Anthropic Institute 5
    Introducing The Anthropic Institute
    • March 11, 2026
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.