Seal Occlusion on Invoice (Electronic Documents)

by Nguyễn Hoàng Thiên Thư and Wu Cheng-Fu

Feb. 10, 2023

Why Did We Coin It "Seal Occlusion"?

I mean, why not "stamp occlusion"?

Because, when you image search for "stamp", you get this:

However, when you image search for "seal", you get this:

Just Kidding

The naming is just personal preference.

There is perhaps also a mild desire to avoid confusion with the thing we stick onto our envelopes.

Occlusion

One of the most important pieces of information, issuer_name, is often occluded or screened by seals.

Influence

LINE CLOVA OCR, the OCR engine currently in use, sometimes detects text occluded by seals and sometimes does not, which results in accuracy that our clients find unacceptable.

What This Sharing Is Not

  • Showcase OCR team's muscle
  • Teach the whole lab computer vision

What This Sharing Is

  • Kick start sharing on big/small issues encountered in different projects
  • Ensemble Method: The whole lab's intelligence

What Types of Electronic Documents Are We Talking About?

So far we have been dealing with

  1. PDF
    • Scanned PDF (majority)
    • "Real" PDF (minority)
  2. PNG

What Kinds of Seals Do We Have?

  1. Red seal
  2. Black seal
  3. No seal

A Piece of Cake!

Or, is it?

Zoom in

Holy!

The (Failed) Approach

  1. Capture a mask of only red pixels
  2. Replace the red pixels with a nearby white (background) color
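The two steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code used in the project: the thresholds `RED_MIN` and `OTHER_MAX`, the helper names, and the pixel values are all assumptions chosen for the example. In practice one would more likely work on NumPy or OpenCV arrays; plain lists of `(R, G, B)` tuples just keep the sketch self-contained.

```python
# Hypothetical thresholds; real scans would need tuning.
RED_MIN = 150    # red channel must be at least this bright
OTHER_MAX = 100  # green and blue must stay at or below this

WHITE = (255, 255, 255)


def is_red(pixel):
    """Return True if an (R, G, B) pixel counts as 'red' under the thresholds."""
    r, g, b = pixel
    return r >= RED_MIN and g <= OTHER_MAX and b <= OTHER_MAX


def red_mask(image):
    """Step 1: a boolean mask that is True only where the pixel is red."""
    return [[is_red(p) for p in row] for row in image]


def remove_red(image, fill=WHITE):
    """Step 2: replace every masked pixel with the fill color."""
    mask = red_mask(image)
    return [
        [fill if masked else pixel for pixel, masked in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# Tiny made-up image: white paper, bright seal ink, black text,
# and one dark-red pixel where seal ink blends with a text stroke.
image = [
    [(255, 255, 255), (220, 30, 30), (0, 0, 0)],
    [(220, 30, 30), (110, 15, 15), (255, 255, 255)],
]
cleaned = remove_red(image)
```

With these assumed thresholds the bright seal pixels turn white, but the blended dark-red pixel `(110, 15, 15)` is left untouched. That kind of leftover may be one reason a plain red mask is not enough once you zoom in.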

Morphological Transformations

Dilation (Japanese: 膨張; Vietnamese: sự giãn nở)

Erosion (Japanese: 浸食; Vietnamese: xói mòn)
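As a reminder of what the two operations do, here is a small sketch of binary dilation and erosion with a 3x3 square structuring element. It is an illustration on plain nested lists, and the function names are made up for the example; OpenCV provides `cv2.dilate` and `cv2.erode` for real images.

```python
def _window(grid, y, x):
    """Values in the 3x3 window around (y, x); cells outside the grid are skipped."""
    height, width = len(grid), len(grid[0])
    return [
        grid[ny][nx]
        for ny in range(y - 1, y + 2)
        for nx in range(x - 1, x + 2)
        if 0 <= ny < height and 0 <= nx < width
    ]


def dilate(grid):
    """A pixel becomes 1 if ANY pixel in its 3x3 window is 1 (shapes grow)."""
    return [
        [int(any(_window(grid, y, x))) for x in range(len(grid[0]))]
        for y in range(len(grid))
    ]


def erode(grid):
    """A pixel stays 1 only if ALL pixels in its 3x3 window are 1 (shapes shrink)."""
    return [
        [int(all(_window(grid, y, x))) for x in range(len(grid[0]))]
        for y in range(len(grid))
    ]


dot = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
grown = dilate(dot)    # the single pixel grows into a 3x3 block
shrunk = erode(grown)  # the block shrinks back to the single pixel
```

In this toy case erosion exactly undoes dilation. On real masks the two are usually combined to fill small holes or drop isolated specks.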

A More Promising Approach

Red channel

Idea:

  • Every color image is stored in RGB format (red, green, and blue channels)
  • We keep only the red channel to remove the red seal
    (the image goes from three channels to one)

Limitation:

  • Colors other than red are changed as well
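The idea and its limitation can be seen on a handful of made-up pixel values. The sketch below keeps only the R value of each pixel; the helper name `red_channel` and the colors are assumptions for illustration.

```python
def red_channel(image):
    """Keep only the R value of each (R, G, B) pixel, giving a one-channel image."""
    return [[r for r, _g, _b in row] for row in image]


# One row of illustrative pixels: white paper, red seal ink, black text, blue ink.
image = [[(255, 255, 255), (220, 30, 30), (0, 0, 0), (30, 30, 220)]]
gray = red_channel(image)
# gray == [[255, 220, 0, 30]]
```

Red seal ink has a high R value, so it ends up almost as bright as the white paper (220 versus 255), while black text stays at 0 and remains readable. The blue pixel drops to 30, nearly black, which shows that colors other than red are altered too.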

Detect and replace only red region

Idea:

  • Define a color range for red and build a mask of the image pixels that fall inside it
  • Replace only the pixels in that range, based on how the red color behaves
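One way to read this idea is sketched below, under two loudly stated assumptions: the "range of color" is expressed as a hue/saturation/value range (the bounds here are invented), and "replacing" a masked pixel means writing its red-channel value into all three channels. Neither choice is confirmed by the slides. The sketch uses `colorsys.rgb_to_hsv` from the Python standard library; with OpenCV, `cv2.inRange` would play the role of the mask step.

```python
import colorsys

# Hypothetical bounds for "red" in HSV space (all values in 0..1).
HUE_TOLERANCE = 0.05   # red sits at hue 0, which wraps around to 1
MIN_SATURATION = 0.5
MIN_VALUE = 0.4


def in_red_range(pixel):
    """True if the (R, G, B) pixel falls inside the assumed red HSV range."""
    r, g, b = (channel / 255 for channel in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    near_red_hue = h <= HUE_TOLERANCE or h >= 1 - HUE_TOLERANCE
    return near_red_hue and s >= MIN_SATURATION and v >= MIN_VALUE


def replace_red_region(image):
    """Rewrite only masked pixels as (R, R, R); leave all other pixels as they are."""
    return [
        [(p[0], p[0], p[0]) if in_red_range(p) else p for p in row]
        for row in image
    ]


# Illustrative pixels: white paper, red seal ink, black text, blue ink.
image = [[(255, 255, 255), (220, 30, 30), (0, 0, 0), (30, 30, 220)]]
result = replace_red_region(image)
```

Unlike taking the red channel of the whole image, this leaves the blue pixel unchanged and only lightens the pixel that falls inside the red range.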

Limitation:

  • In some cases the seal is not the only red element: other crucial parts (issuer name, table cells, …) are also in red

Decide which images should have the seal removed

Idea:

  • A seal occupies only a small area of the whole image
  • Image has too many red pixels $\implies$ don't remove the seal
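A rough sketch of this check: count the share of red pixels and skip seal removal when it is too large. The cut-off `MAX_RED_RATIO`, the red thresholds, and the function names are hypothetical values picked for the example, not figures from the project.

```python
# Hypothetical settings; suitable values depend on the documents at hand.
RED_MIN = 150         # red channel must be at least this bright
OTHER_MAX = 100       # green and blue must stay at or below this
MAX_RED_RATIO = 0.10  # above this share of red pixels, leave the image alone


def is_red(pixel):
    """Return True if an (R, G, B) pixel counts as 'red' under the thresholds."""
    r, g, b = pixel
    return r >= RED_MIN and g <= OTHER_MAX and b <= OTHER_MAX


def red_ratio(image):
    """Fraction of pixels in the image that count as red."""
    total = sum(len(row) for row in image)
    red = sum(is_red(pixel) for row in image for pixel in row)
    return red / total


def should_remove_seal(image, max_ratio=MAX_RED_RATIO):
    """Too many red pixels suggests red is part of the content, so do not remove."""
    return red_ratio(image) <= max_ratio


WHITE, RED = (255, 255, 255), (220, 30, 30)

# 10x10 page with a 2x2 red seal in the corner: 4% red.
small_seal = [[RED if y < 2 and x < 2 else WHITE for x in range(10)] for y in range(10)]
# 10x10 page whose top half is red (for example, a red table): 50% red.
mostly_red = [[RED if y < 5 else WHITE for x in range(10)] for y in range(10)]

decisions = (should_remove_seal(small_seal), should_remove_seal(mostly_red))
```

Here the first page would go through seal removal and the second would be left as it is.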

Limitation:

  • Still cannot detect the seal position

RGB or Red-Green-Blue

How come the seal suddenly disappears when we take only the red channel?

WYSIWYG Visualization

Original RGB Image

Processed Visualization

Quiz

How to obtain the red image above?

  1. (red, red, red)
  2. (red, black, black)
  3. (red, white, white)

Thanks

Caveats

Future Directions

References
