PowerPoint Data Sanitization Tool

See how our team cut presentation sanitization time from 3 hours to 20 minutes by building a PowerPoint add-in that not only sped up the process but also improved it.

Presentations are crucial in fields such as consulting, financial services, and other professional services. They deliver information in a convenient format, yet they often contain a large amount of confidential information. Removing it manually is time-consuming and carries the risk of data breaches and compliance failures.

This project was delivered by our team of eight specialists, including a CTO, Project Manager, front-end and back-end developers, UX/UI designer, DevOps engineer, Business Analyst, and QA engineer. Within two months, we developed an enterprise-grade Microsoft PowerPoint add-in that automatically identifies and manages sensitive content, streamlining the sanitization process for the client.

Beginning

To build a practical sanitization add-in, we first had to understand which key features needed to be implemented:

  • First, the tool had to scan presentations automatically for sensitive content, covering not only text but also logos, images, and so on.
  • Once content was identified, the add-in needed a machine learning algorithm to classify it and propose an appropriate treatment.
  • The add-in had to be easy to use, with an intuitive interface that all users could understand.
  • The ability to produce sanitized copies of presentations was also important, so that users could safely reuse them.

Implementing

Our team implemented solutions that enhanced the tool’s functionality and user experience:

  1. We automated the process by integrating an OCR engine that scanned each slide for text, including text embedded in images, together with ML models with enhanced prompts on various logo databases that analyzed logos and charts.
  2. We enabled the tool to classify the scanned content using LLM models and LLM APIs. Based on that classification, it suggested a tailored treatment according to content type and context.
  3. Our front-end developers used OfficeJS to ensure the tool worked seamlessly across different versions of PowerPoint, while our UX/UI designers created intuitive dashboards and icons for each function. For example, a progress bar for scanning and easy selection of treatment methods made the tool user-friendly.
  4. Using technologies such as Docker and Kubernetes, our back-end developer implemented a process that generated sanitized copies of presentations. We also added an optional preview before finalization so users could double-check that the sensitive content had been treated correctly.
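The classify, suggest, and preview steps described above can be sketched roughly as follows. This is a minimal illustration, not the add-in's actual code: the type names, the treatment names, the confidence threshold, and the per-kind mapping are all hypothetical assumptions made for the example, and the scanner and LLM calls are left out entirely.

```typescript
// Hypothetical data model: none of these names come from the project itself.
type ContentKind = "text" | "logo" | "image" | "chart";
type Treatment = "redact" | "replace-with-placeholder" | "blur" | "keep";

interface Finding {
  slideIndex: number; // zero-based slide position
  shapeId: string;    // identifier of the shape holding the content
  kind: ContentKind;  // what the scanner detected
  confidence: number; // classifier score in [0, 1]
}

interface Suggestion extends Finding {
  treatment: Treatment;
}

// Assumed rule: low-confidence findings are kept and left for manual review.
const REVIEW_THRESHOLD = 0.5;

// Assumed default treatment per content kind; a real tool would also use context.
const DEFAULT_TREATMENT: Record<ContentKind, Treatment> = {
  text: "redact",
  logo: "replace-with-placeholder",
  image: "blur",
  chart: "redact",
};

// Maps one finding to a suggested treatment without mutating the input.
function suggestTreatment(finding: Finding): Suggestion {
  if (finding.confidence < REVIEW_THRESHOLD) {
    return { ...finding, treatment: "keep" };
  }
  return { ...finding, treatment: DEFAULT_TREATMENT[finding.kind] };
}

// Groups suggestions per slide so a UI could show a preview before finalizing.
function buildPreview(findings: Finding[]): Map<number, Suggestion[]> {
  const preview = new Map<number, Suggestion[]>();
  for (const finding of findings) {
    const bucket = preview.get(finding.slideIndex) ?? [];
    bucket.push(suggestTreatment(finding));
    preview.set(finding.slideIndex, bucket);
  }
  return preview;
}
```

Calling `buildPreview` with the scanner's findings returns one list of suggestions per slide. Keeping the suggestions separate from the original findings is what would let a user review or override them before a sanitized copy is written.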

Results and Impacts

  1. Reduced Sanitization Time

According to users, automating the sanitization process significantly reduced the time spent on it. Manual redaction took 2-3 hours on average, whereas the same work now takes under 20 minutes, freeing teams to spend that time on more important tasks.

  2. Increased Accuracy

By testing a large number of presentations and reviewing how the add-in treated them, we found that accuracy in identifying and managing sensitive content increased up to 85%. This means that, by using advanced OCR and tuned ML algorithms, our team decreased the risk of data breaches and compliance failures.

  3. Enterprise-Grade Adoption and Secure Deployment

By integrating the add-in with Azure AD and MFA, we ensured that only verified employees could access its advanced features, fully aligning with the firm’s existing security and identity systems. Through the M365 Admin Center, IT teams gained a simple way to centrally manage, deploy, and update the add-in for the whole company.
