From bcafe3a4b0fb2d59a560f6273104a64fdfe8b842 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Tue, 24 Oct 2023 23:46:16 +0200
Subject: [PATCH 1/7] chore(deps): update all dependencies (#188)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| [actions/checkout](https://togithub.com/actions/checkout) | action | major | `v3` -> `v4` |
| [pytest](https://docs.pytest.org/en/latest/) ([source](https://togithub.com/pytest-dev/pytest), [changelog](https://docs.pytest.org/en/stable/changelog.html)) | | patch | `==7.4.2` -> `==7.4.3` |
---
### Release Notes
actions/checkout (actions/checkout)
### [`v4`](https://togithub.com/actions/checkout/blob/HEAD/CHANGELOG.md#v400)
[Compare Source](https://togithub.com/actions/checkout/compare/v3...v4)
- [Support fetching without the --progress option](https://togithub.com/actions/checkout/pull/1067)
- [Update to node20](https://togithub.com/actions/checkout/pull/1436)
pytest-dev/pytest (pytest)
### [`v7.4.3`](https://togithub.com/pytest-dev/pytest/compare/7.4.2...v7.4.3)
[Compare Source](https://togithub.com/pytest-dev/pytest/compare/7.4.2...v7.4.3)
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://togithub.com/renovatebot/renovate/discussions) if that's undesired.
---
- [ ] If you want to rebase/retry this PR, check this box
---
This PR has been generated by [Mend Renovate](https://www.mend.io/free-developer-tools/renovate/). View repository job log [here](https://developer.mend.io/github/googleapis/python-documentai-toolbox).
---
samples/snippets/requirements-test.txt | 2 +-
samples/snippets/requirements.txt | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/samples/snippets/requirements-test.txt b/samples/snippets/requirements-test.txt
index 331425b6..666a0c93 100644
--- a/samples/snippets/requirements-test.txt
+++ b/samples/snippets/requirements-test.txt
@@ -1,3 +1,3 @@
-pytest==7.4.2
+pytest==7.4.3
mock==5.1.0
google-cloud-bigquery==3.12.0
diff --git a/samples/snippets/requirements.txt b/samples/snippets/requirements.txt
index f02bf7a1..c0277e1b 100644
--- a/samples/snippets/requirements.txt
+++ b/samples/snippets/requirements.txt
@@ -1,4 +1,4 @@
google-cloud-bigquery==3.12.0
google-cloud-documentai==2.20.1
google-cloud-storage==2.12.0
-google-cloud-documentai-toolbox==0.10.2a0
+google-cloud-documentai-toolbox==0.11.1a0
From 8fff0a66119ceff414be9059b2f33504e79f17ca Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Mon, 30 Oct 2023 21:18:14 +0100
Subject: [PATCH 2/7] chore(deps): update all dependencies (#189)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| [actions/checkout](https://togithub.com/actions/checkout) | action | major | `v3` -> `v4` |
| [google-cloud-bigquery](https://togithub.com/googleapis/python-bigquery) | | minor | `==3.12.0` -> `==3.13.0` |
---
### Release Notes
actions/checkout (actions/checkout)
### [`v4`](https://togithub.com/actions/checkout/blob/HEAD/CHANGELOG.md#v400)
[Compare Source](https://togithub.com/actions/checkout/compare/v3...v4)
- [Support fetching without the --progress option](https://togithub.com/actions/checkout/pull/1067)
- [Update to node20](https://togithub.com/actions/checkout/pull/1436)
googleapis/python-bigquery (google-cloud-bigquery)
### [`v3.13.0`](https://togithub.com/googleapis/python-bigquery/blob/HEAD/CHANGELOG.md#3130-2023-10-30)
[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v3.12.0...v3.13.0)
##### Features
- Add `Model.transform_columns` property ([#1661](https://togithub.com/googleapis/python-bigquery/issues/1661)) ([5ceed05](https://togithub.com/googleapis/python-bigquery/commit/5ceed056482f6d1f2fc45e7e6b84382de45c85ed))
- Add support for dataset.default_rounding_mode ([#1688](https://togithub.com/googleapis/python-bigquery/issues/1688)) ([83bc768](https://togithub.com/googleapis/python-bigquery/commit/83bc768b90a852d258a4805603020a296e02d2f9))
##### Bug Fixes
- AccessEntry API representation parsing ([#1682](https://togithub.com/googleapis/python-bigquery/issues/1682)) ([a40d7ae](https://togithub.com/googleapis/python-bigquery/commit/a40d7ae03149708fc34c962b43a6ac198780b6aa))
##### Documentation
- Remove redundant `bigquery_update_table_expiration` code sample ([#1673](https://togithub.com/googleapis/python-bigquery/issues/1673)) ([2dded33](https://togithub.com/googleapis/python-bigquery/commit/2dded33626b3de6c4ab5e1229eb4c85786b2ff53))
- Revised `create_partitioned_table` sample ([#1447](https://togithub.com/googleapis/python-bigquery/issues/1447)) ([40ba859](https://togithub.com/googleapis/python-bigquery/commit/40ba859059c3e463e17ea7781bc5a9aff8244c5d))
- Revised relax column mode sample ([#1467](https://togithub.com/googleapis/python-bigquery/issues/1467)) ([b8c9276](https://togithub.com/googleapis/python-bigquery/commit/b8c9276be011d971b941b583fd3d4417d438067f))
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://togithub.com/renovatebot/renovate/discussions) if that's undesired.
---
- [ ] If you want to rebase/retry this PR, check this box
---
This PR has been generated by [Mend Renovate](https://www.mend.io/free-developer-tools/renovate/). View repository job log [here](https://developer.mend.io/github/googleapis/python-documentai-toolbox).
---
samples/snippets/requirements-test.txt | 2 +-
samples/snippets/requirements.txt | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/samples/snippets/requirements-test.txt b/samples/snippets/requirements-test.txt
index 666a0c93..e763bc58 100644
--- a/samples/snippets/requirements-test.txt
+++ b/samples/snippets/requirements-test.txt
@@ -1,3 +1,3 @@
pytest==7.4.3
mock==5.1.0
-google-cloud-bigquery==3.12.0
+google-cloud-bigquery==3.13.0
diff --git a/samples/snippets/requirements.txt b/samples/snippets/requirements.txt
index c0277e1b..f3e586d7 100644
--- a/samples/snippets/requirements.txt
+++ b/samples/snippets/requirements.txt
@@ -1,4 +1,4 @@
-google-cloud-bigquery==3.12.0
+google-cloud-bigquery==3.13.0
google-cloud-documentai==2.20.1
google-cloud-storage==2.12.0
google-cloud-documentai-toolbox==0.11.1a0
From 7db5d27ae6a086cd09e02ab1f370557617515d51 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Tue, 31 Oct 2023 21:08:14 +0100
Subject: [PATCH 3/7] chore(deps): update all dependencies (#190)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| [actions/checkout](https://togithub.com/actions/checkout) | action | major | `v3` -> `v4` |
| [google-cloud-storage](https://togithub.com/googleapis/python-storage) | | minor | `==2.12.0` -> `==2.13.0` |
---
### Release Notes
actions/checkout (actions/checkout)
### [`v4`](https://togithub.com/actions/checkout/blob/HEAD/CHANGELOG.md#v400)
[Compare Source](https://togithub.com/actions/checkout/compare/v3...v4)
- [Support fetching without the --progress option](https://togithub.com/actions/checkout/pull/1067)
- [Update to node20](https://togithub.com/actions/checkout/pull/1436)
googleapis/python-storage (google-cloud-storage)
### [`v2.13.0`](https://togithub.com/googleapis/python-storage/blob/HEAD/CHANGELOG.md#2130-2023-10-31)
[Compare Source](https://togithub.com/googleapis/python-storage/compare/v2.12.0...v2.13.0)
##### Features
- Add Autoclass v2.1 support ([#1117](https://togithub.com/googleapis/python-storage/issues/1117)) ([d38adb6](https://togithub.com/googleapis/python-storage/commit/d38adb6a3136152ad68ad8a9c4583d06509307b2))
- Add support for custom headers ([#1121](https://togithub.com/googleapis/python-storage/issues/1121)) ([2f92c3a](https://togithub.com/googleapis/python-storage/commit/2f92c3a2a3a1585d0f77be8fe3c2c5324140b71a))
##### Bug Fixes
- Blob.from_string parse storage uri with regex ([#1170](https://togithub.com/googleapis/python-storage/issues/1170)) ([0a243fa](https://togithub.com/googleapis/python-storage/commit/0a243faf5d6ca89b977ea1cf543356e0dd04df95))
- Bucket.delete(force=True) now works with version-enabled buckets ([#1172](https://togithub.com/googleapis/python-storage/issues/1172)) ([0de09d3](https://togithub.com/googleapis/python-storage/commit/0de09d30ea6083d962be1c1f5341ea14a2456dc7))
- Fix typo in Bucket.clear_lifecycle_rules() ([#1169](https://togithub.com/googleapis/python-storage/issues/1169)) ([eae9ebe](https://togithub.com/googleapis/python-storage/commit/eae9ebed12d26832405c2f29fbdb14b4babf080d))
##### Documentation
- Fix exception field in tm reference docs ([#1164](https://togithub.com/googleapis/python-storage/issues/1164)) ([eac91cb](https://togithub.com/googleapis/python-storage/commit/eac91cb6ffb0066248f824fc1f307140dd7c85da))
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://togithub.com/renovatebot/renovate/discussions) if that's undesired.
---
- [ ] If you want to rebase/retry this PR, check this box
---
This PR has been generated by [Mend Renovate](https://www.mend.io/free-developer-tools/renovate/). View repository job log [here](https://developer.mend.io/github/googleapis/python-documentai-toolbox).
---
samples/snippets/requirements.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/snippets/requirements.txt b/samples/snippets/requirements.txt
index f3e586d7..132ddba0 100644
--- a/samples/snippets/requirements.txt
+++ b/samples/snippets/requirements.txt
@@ -1,4 +1,4 @@
google-cloud-bigquery==3.13.0
google-cloud-documentai==2.20.1
-google-cloud-storage==2.12.0
+google-cloud-storage==2.13.0
google-cloud-documentai-toolbox==0.11.1a0
From 574337cb7f4c109654735cc270b8cd72434a6604 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Thu, 2 Nov 2023 16:30:34 +0100
Subject: [PATCH 4/7] chore(deps): update all dependencies (#191)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| [actions/checkout](https://togithub.com/actions/checkout) | action | major | `v3` -> `v4` |
| [google-cloud-documentai](https://togithub.com/googleapis/google-cloud-python) | | patch | `==2.20.1` -> `==2.20.2` |
---
### Release Notes
actions/checkout (actions/checkout)
### [`v4`](https://togithub.com/actions/checkout/blob/HEAD/CHANGELOG.md#v400)
[Compare Source](https://togithub.com/actions/checkout/compare/v3...v4)
- [Support fetching without the --progress option](https://togithub.com/actions/checkout/pull/1067)
- [Update to node20](https://togithub.com/actions/checkout/pull/1436)
googleapis/google-cloud-python (google-cloud-documentai)
### [`v2.20.2`](https://togithub.com/googleapis/google-cloud-python/releases/tag/google-cloud-documentai-v2.20.2): google-cloud-documentai: v2.20.2
[Compare Source](https://togithub.com/googleapis/google-cloud-python/compare/google-cloud-documentai-v2.20.1...google-cloud-documentai-v2.20.2)
##### Documentation
- updated comments ([#11950](https://togithub.com/googleapis/google-cloud-python/issues/11950)) ([a0da408](https://togithub.com/googleapis/google-cloud-python/commit/a0da408d0f322b3f4f11a0dbe39c92fa5770e59b))
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://togithub.com/renovatebot/renovate/discussions) if that's undesired.
---
- [ ] If you want to rebase/retry this PR, check this box
---
This PR has been generated by [Mend Renovate](https://www.mend.io/free-developer-tools/renovate/). View repository job log [here](https://developer.mend.io/github/googleapis/python-documentai-toolbox).
---
samples/snippets/requirements.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/snippets/requirements.txt b/samples/snippets/requirements.txt
index 132ddba0..6d2bd72c 100644
--- a/samples/snippets/requirements.txt
+++ b/samples/snippets/requirements.txt
@@ -1,4 +1,4 @@
google-cloud-bigquery==3.13.0
-google-cloud-documentai==2.20.1
+google-cloud-documentai==2.20.2
google-cloud-storage==2.13.0
google-cloud-documentai-toolbox==0.11.1a0
From e05cf504256db171f175eec036fa40c0e21e50cc Mon Sep 17 00:00:00 2001
From: "gcf-owl-bot[bot]" <78513119+gcf-owl-bot[bot]@users.noreply.github.com>
Date: Thu, 2 Nov 2023 21:21:50 -0400
Subject: [PATCH 5/7] chore: update docfx minimum Python version (#192)
Source-Link: https://github.com/googleapis/synthtool/commit/bc07fd415c39853b382bcf8315f8eeacdf334055
Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:30470597773378105e239b59fce8eb27cc97375580d592699206d17d117143d0
Co-authored-by: Owl Bot
---
.github/.OwlBot.lock.yaml | 4 ++--
.github/workflows/docs.yml | 2 +-
noxfile.py | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/.github/.OwlBot.lock.yaml b/.github/.OwlBot.lock.yaml
index 7f291dbd..ec696b55 100644
--- a/.github/.OwlBot.lock.yaml
+++ b/.github/.OwlBot.lock.yaml
@@ -13,5 +13,5 @@
# limitations under the License.
docker:
image: gcr.io/cloud-devrel-public-resources/owlbot-python:latest
- digest: sha256:4f9b3b106ad0beafc2c8a415e3f62c1a0cc23cabea115dbe841b848f581cfe99
-# created: 2023-10-18T20:26:37.410353675Z
+ digest: sha256:30470597773378105e239b59fce8eb27cc97375580d592699206d17d117143d0
+# created: 2023-11-03T00:57:07.335914631Z
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index e97d89e4..221806ce 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -28,7 +28,7 @@ jobs:
- name: Setup Python
uses: actions/setup-python@v4
with:
- python-version: "3.9"
+ python-version: "3.10"
- name: Install nox
run: |
python -m pip install --upgrade setuptools pip wheel
diff --git a/noxfile.py b/noxfile.py
index fc49ce9e..779d7921 100644
--- a/noxfile.py
+++ b/noxfile.py
@@ -301,7 +301,7 @@ def docs(session):
)
-@nox.session(python="3.9")
+@nox.session(python="3.10")
def docfx(session):
"""Build the docfx yaml files for this library."""
From 3f52e82eaa741cd2c8a08e8398ed6f4b3f65c419 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Tue, 7 Nov 2023 13:56:13 -0600
Subject: [PATCH 6/7] fix: Updates to hOCR Template to follow hOCR Spec (#195)
- Added validation in testing with https://github.com/kba/hocr-spec-python
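
The kind of structural check this validation performs can be sketched with the standard library alone. This is a minimal illustration and not the `hocr-spec` validator itself: the names `HOCR_ORDER`, `hocr_class`, and `find_nesting_errors` are hypothetical, and the only rule checked is that the hOCR typesetting classes appear in outer-to-inner order.

```python
import xml.etree.ElementTree as ET

# Typesetting classes from the hOCR spec, outermost first.
HOCR_ORDER = ["ocr_page", "ocr_carea", "ocr_par", "ocr_line", "ocrx_word"]


def hocr_class(element):
    """Return the first known hOCR class on the element, or None."""
    for name in element.get("class", "").split():
        if name in HOCR_ORDER:
            return name
    return None


def find_nesting_errors(xml_text):
    """List elements whose hOCR class sits inside an equal or inner class."""
    errors = []

    def walk(element, parent_rank):
        name = hocr_class(element)
        rank = parent_rank
        if name is not None:
            rank = HOCR_ORDER.index(name)
            # An element must be strictly "inner" relative to its nearest
            # hOCR ancestor, e.g. ocr_line inside ocr_par, never the reverse.
            if rank <= parent_rank:
                errors.append(f"{name} (id={element.get('id')}) is misplaced")
        for child in element:
            walk(child, rank)

    walk(ET.fromstring(xml_text), -1)
    return errors
```

Calling `find_nesting_errors` on well-nested markup returns an empty list; a paragraph element placed inside a line element is reported by class and `id`. Skipped levels (for example a line directly inside a page) are deliberately allowed in this sketch.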
---
.../templates/hocr_document_template.xml.j2 | 7 +-
.../test_convert_document_to_hocr_sample.py | 6 +-
setup.py | 1 +
testing/constraints-3.10.txt | 1 +
testing/constraints-3.11.txt | 1 +
testing/constraints-3.7.txt | 1 +
testing/constraints-3.8.txt | 1 +
testing/constraints-3.9.txt | 1 +
.../resources/toolbox_invoice_test_0_hocr.xml | 67 ++++++++++---------
tests/unit/test_document.py | 12 +++-
10 files changed, 60 insertions(+), 38 deletions(-)
diff --git a/google/cloud/documentai_toolbox/templates/hocr_document_template.xml.j2 b/google/cloud/documentai_toolbox/templates/hocr_document_template.xml.j2
index 63db0ada..dad071e1 100644
--- a/google/cloud/documentai_toolbox/templates/hocr_document_template.xml.j2
+++ b/google/cloud/documentai_toolbox/templates/hocr_document_template.xml.j2
@@ -6,8 +6,9 @@
+
-
+
{% for page in pages -%}
@@ -16,13 +17,13 @@
{% set bidx = loop.index0 -%}
{% for paragraph in docai_block.paragraphs -%}
{% set paridx = loop.index0 -%}
- {% for line in paragraph.lines -%}
+ {% for line in paragraph.lines -%}
{% set lidx = loop.index0 -%}
{{ line.text }}{% for token in line.tokens -%}
{% set tidx = loop.index0 -%}
{{ token.text }}{% endfor -%}
{% endfor -%}
-
{% endfor -%}
+
{% endfor -%}
{% endfor -%}
{% endfor -%}
diff --git a/samples/snippets/test_convert_document_to_hocr_sample.py b/samples/snippets/test_convert_document_to_hocr_sample.py
index e3ed9f2b..776c0b96 100644
--- a/samples/snippets/test_convert_document_to_hocr_sample.py
+++ b/samples/snippets/test_convert_document_to_hocr_sample.py
@@ -24,7 +24,11 @@ def test_convert_document_to_hocr_sample() -> None:
document_path=document_path, document_title=document_title
)
- with open("../../tests/unit/resources/toolbox_invoice_test_0_hocr.xml", "r") as f:
+ with open(
+ "../../tests/unit/resources/toolbox_invoice_test_0_hocr.xml",
+ "r",
+ encoding="utf-8",
+ ) as f:
expected = f.read()
assert actual == expected
diff --git a/setup.py b/setup.py
index 7a29932e..abece197 100644
--- a/setup.py
+++ b/setup.py
@@ -66,6 +66,7 @@
"immutabledict >= 2.0.0, < 3.0.0dev; python_version<'3.8'",
"Pillow >= 9.5.0, < 11.0.0",
"Jinja2 >= 3.1.0, <= 4.0.0",
+ "hocr-spec >= 0.2.0",
),
python_requires=">=3.7",
classifiers=[
diff --git a/testing/constraints-3.10.txt b/testing/constraints-3.10.txt
index c9f0e4bb..25aa22a8 100644
--- a/testing/constraints-3.10.txt
+++ b/testing/constraints-3.10.txt
@@ -11,3 +11,4 @@ google-cloud-documentai
google-cloud-storage
numpy
pikepdf
+hocr-spec
diff --git a/testing/constraints-3.11.txt b/testing/constraints-3.11.txt
index c9f0e4bb..25aa22a8 100644
--- a/testing/constraints-3.11.txt
+++ b/testing/constraints-3.11.txt
@@ -11,3 +11,4 @@ google-cloud-documentai
google-cloud-storage
numpy
pikepdf
+hocr-spec
diff --git a/testing/constraints-3.7.txt b/testing/constraints-3.7.txt
index 3c64ab2e..0a9af7ff 100644
--- a/testing/constraints-3.7.txt
+++ b/testing/constraints-3.7.txt
@@ -14,3 +14,4 @@ google-cloud-documentai==2.20.0
google-cloud-storage==2.7.0
numpy==1.19.5
pikepdf==6.2.9
+hocr-spec==0.2.0
diff --git a/testing/constraints-3.8.txt b/testing/constraints-3.8.txt
index ed1905e2..a9d4c497 100644
--- a/testing/constraints-3.8.txt
+++ b/testing/constraints-3.8.txt
@@ -11,3 +11,4 @@ google-cloud-documentai
google-cloud-storage
numpy==1.21.6
pikepdf==8.2.3
+hocr-spec
diff --git a/testing/constraints-3.9.txt b/testing/constraints-3.9.txt
index c9f0e4bb..25aa22a8 100644
--- a/testing/constraints-3.9.txt
+++ b/testing/constraints-3.9.txt
@@ -11,3 +11,4 @@ google-cloud-documentai
google-cloud-storage
numpy
pikepdf
+hocr-spec
diff --git a/tests/unit/resources/toolbox_invoice_test_0_hocr.xml b/tests/unit/resources/toolbox_invoice_test_0_hocr.xml
index 0cd8e171..4e265f7d 100644
--- a/tests/unit/resources/toolbox_invoice_test_0_hocr.xml
+++ b/tests/unit/resources/toolbox_invoice_test_0_hocr.xml
@@ -6,84 +6,85 @@
+
-
+
-Invoice
+Invoice
Invoice
-
DATE: 01/01/1970
+DATE: 01/01/1970
DATE: 01/01/1970
INVOICE: NO. 001
INVOICE: NO. 001
-
FROM: Company ABC
+FROM: Company ABC
FROM: Company ABC
user@companyabc.com
user@companyabc.com
-
TO: John Doe
+TO: John Doe
TO: John Doe
johndoe@email.com
johndoe@email.com
-
ADDRESS: 111 Main Street
+ADDRESS: 111 Main Street
ADDRESS: 111 Main Street
Anytown, USA
Anytown, USA
-
ADDRESS: 222 Main Street
+ADDRESS: 222 Main Street
ADDRESS: 222 Main Street
Anytown, USA
Anytown, USA
-
TERMS: 6 month contract
+TERMS: 6 month contract
TERMS: 6 month contract
DUE: 01/01/2025
DUE: 01/01/2025
-
Item Description
+Item Description
Item Description
-
Quantity
+Quantity
Quantity
-
Price
+Price
Price
-
Amount
+Amount
Amount
-
Tool A
+Tool A
Tool A
-
500
+500
500
-
$1.00
+$1.00
$1.00
-
$500.00
+$500.00
$500.00
-
Service B
+Service B
Service B
-
1
+1
1
-
$900.00
+$900.00
$900.00
-
$900.00
+$900.00
$900.00
-
Resource C
+Resource C
Resource C
-
50
+50
50
-
$12.00
+$12.00
$12.00
-
$600.00
+$600.00
$600.00
-
Subtotal
+Subtotal
Subtotal
-
$2000.00
+$2000.00
$2000.00
-
Tax
+Tax
Tax
-
$140.00
+$140.00
$140.00
-
BALANCE DUE
+BALANCE DUE
BALANCE DUE
-
$2140.00
+$2140.00
$2140.00
-
NOTES:
+NOTES:
NOTES:
-
Supplies used for Project Q.
+Supplies used for Project Q.
Supplies used for Project Q.
-
+