Vik Paruchuri committed on
Commit 5d1097f · 1 Parent(s): c58dca9

Add in packages

Files changed (4)
  1. README.md +9 -9
  2. marker/settings.py +0 -2
  3. poetry.lock +484 -22
  4. pyproject.toml +2 -1
README.md CHANGED
@@ -6,7 +6,7 @@ Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, m
6
  - Removes headers/footers/other artifacts
7
  - Converts most equations to latex
8
  - Formats code blocks and tables
9
- - Support for multiple languages (although most testing is done in English). See `settings.py` for a language list.
10
  - Works on GPU, CPU, or MPS
11
 
12
  ## How it works
@@ -15,7 +15,7 @@ Marker is a pipeline of deep learning models:
15
 
16
  - Extract text, OCR if necessary (heuristics, tesseract)
17
  - Detect page layout ([layout segmenter](https://huggingface.co/vikp/layout_segmenter), [column detector](https://huggingface.co/vikp/column_detector))
18
- - Clean and format each block (heuristics, [nougat](https://huggingface.co/facebook/nougat-base))
19
  - Combine blocks and postprocess complete text (heuristics, [pdf_postprocessor](https://huggingface.co/vikp/pdf_postprocessor_t5))
20
 
21
  Relying on autoregressive forward passes to generate text is slow and prone to hallucination/repetition. From the nougat paper: `We observed [repetition] in 1.5% of pages in the test set, but the frequency increases for out-of-domain documents.` In my anecdotal testing, repetitions happen on 5%+ of out-of-domain (non-arXiv) pages.
@@ -48,10 +48,10 @@ See [below](#benchmarks) for detailed speed and accuracy benchmarks, and instruc
48
 
49
  PDF is a tricky format, so marker will not always work perfectly. Here are some known limitations that are on the roadmap to address:
50
 
51
- - Marker will convert fewer equations to latex than nougat. This is because it has to first detect equations, then convert them without hallucation.
52
  - Whitespace and indentations are not always respected.
53
  - Not all lines/spans will be joined properly.
54
- - Languages similar to English (Spanish, French, German, Russian, etc) have the best support. There is provisional support for Chinese, Japanese, Korean, and Hindi, but it may not work as well.
55
  - This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.
56
 
57
  # Installation
@@ -88,17 +88,16 @@ First, clone the repo:
88
  - Install python requirements
89
  - `poetry install`
90
  - `poetry shell` to activate your poetry venv
91
- - On ARM macs (M1+), make sure to set the `TORCH_DEVICE` setting to `mps` (more details below) for a speedup
92
 
93
  # Usage
94
 
95
- First, some configuration:
96
 
97
- - Set your torch device in the `local.env` file. For example, `TORCH_DEVICE=cuda` or `TORCH_DEVICE=mps`. `cpu` is the default.
98
  - If using GPU, set `INFERENCE_RAM` to your GPU VRAM (per GPU). For example, if you have 16 GB of VRAM, set `INFERENCE_RAM=16`.
99
  - Depending on your document types, marker's average memory usage per task can vary slightly. You can configure `VRAM_PER_TASK` to adjust this if you notice tasks failing with GPU out of memory errors.
100
  - Inspect the other settings in `marker/settings.py`. You can override any settings in the `local.env` file, or by setting environment variables.
101
- - By default, the final editor model is off. Turn it on with `ENABLE_EDITOR_MODEL`.
102
  - By default, marker will use ocrmypdf for OCR, which is slower than base tesseract, but higher quality. You can change this with the `OCR_ENGINE` setting.
103
 
104
  ## Convert a single file
@@ -148,6 +147,8 @@ MIN_LENGTH=10000 METADATA_FILE=../pdf_meta.json NUM_DEVICES=4 NUM_WORKERS=15 bas
148
  - `NUM_WORKERS` is the number of parallel processes to run on each GPU. Per-GPU parallelism will not increase beyond `INFERENCE_RAM / VRAM_PER_TASK`.
149
  - `MIN_LENGTH` is the minimum number of characters that need to be extracted from a pdf before it will be considered for processing. If you're processing a lot of pdfs, I recommend setting this to avoid OCRing pdfs that are mostly images. (slows everything down)
150
 
 
 
151
  # Benchmarks
152
 
153
  Benchmarking PDF extraction quality is hard. I've created a test set by finding books and scientific papers that have a pdf version and a latex source. I convert the latex to text, and compare the reference to the output of text extraction methods.
@@ -203,7 +204,6 @@ I'm building a version that can be used commercially, by stripping out the depen
203
  Here are the non-commercial/restrictive dependencies:
204
 
205
  - LayoutLMv3: CC BY-NC-SA 4.0 . [Source](https://huggingface.co/microsoft/layoutlmv3-base)
206
- - Nougat: CC-BY-NC . [Source](https://github.com/facebookresearch/nougat)
207
  - PyMuPDF - GPL . [Source](https://pymupdf.readthedocs.io/en/latest/about.html#license-and-copyright)
208
 
209
  Other dependencies/datasets are openly licensed (doclaynet, byt5), or used in a way that is compatible with commercial usage (ghostscript).
 
6
  - Removes headers/footers/other artifacts
7
  - Converts most equations to latex
8
  - Formats code blocks and tables
9
+ - Support for multiple languages (although most testing is done in English). See `settings.py` for a language list, or to add your own.
10
  - Works on GPU, CPU, or MPS
11
 
12
  ## How it works
 
15
 
16
  - Extract text, OCR if necessary (heuristics, tesseract)
17
  - Detect page layout ([layout segmenter](https://huggingface.co/vikp/layout_segmenter), [column detector](https://huggingface.co/vikp/column_detector))
18
+ - Clean and format each block (heuristics, [texify](https://huggingface.co/vikp/texify))
19
  - Combine blocks and postprocess complete text (heuristics, [pdf_postprocessor](https://huggingface.co/vikp/pdf_postprocessor_t5))
20
 
21
  Relying on autoregressive forward passes to generate text is slow and prone to hallucination/repetition. From the nougat paper: `We observed [repetition] in 1.5% of pages in the test set, but the frequency increases for out-of-domain documents.` In my anecdotal testing, repetitions happen on 5%+ of out-of-domain (non-arXiv) pages.
 
48
 
49
  PDF is a tricky format, so marker will not always work perfectly. Here are some known limitations that are on the roadmap to address:
50
 
51
+ - Marker will not convert 100% of equations to LaTeX. This is because it has to first detect equations, then convert them.
52
  - Whitespace and indentations are not always respected.
53
  - Not all lines/spans will be joined properly.
54
+ - Languages similar to English (Spanish, French, German, Russian, etc) have the best support. There is provisional support for Chinese, Japanese, Korean, and Hindi, but it may not work as well. You can add other languages by adding them to the `TESSERACT_LANGUAGES` and `SPELLCHECK_LANGUAGES` settings in `settings.py`.
55
  - This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.
56
 
57
  # Installation
 
88
  - Install python requirements
89
  - `poetry install`
90
  - `poetry shell` to activate your poetry venv
 
91
 
92
  # Usage
93
 
94
+ First, some configuration. Note that settings can be overridden with env vars, or in a `local.env` file in the root `marker` folder.
95
 
96
+ - Your torch device will be automatically detected, but you can manually set it also. For example, `TORCH_DEVICE=cuda` or `TORCH_DEVICE=mps`. `cpu` is the default.
97
  - If using GPU, set `INFERENCE_RAM` to your GPU VRAM (per GPU). For example, if you have 16 GB of VRAM, set `INFERENCE_RAM=16`.
98
  - Depending on your document types, marker's average memory usage per task can vary slightly. You can configure `VRAM_PER_TASK` to adjust this if you notice tasks failing with GPU out of memory errors.
99
  - Inspect the other settings in `marker/settings.py`. You can override any settings in the `local.env` file, or by setting environment variables.
100
+ - By default, the final editor model is off. Turn it on with `ENABLE_EDITOR_MODEL=true`.
101
  - By default, marker will use ocrmypdf for OCR, which is slower than base tesseract, but higher quality. You can change this with the `OCR_ENGINE` setting.
102
 
103
  ## Convert a single file
 
147
  - `NUM_WORKERS` is the number of parallel processes to run on each GPU. Per-GPU parallelism will not increase beyond `INFERENCE_RAM / VRAM_PER_TASK`.
148
  - `MIN_LENGTH` is the minimum number of characters that need to be extracted from a pdf before it will be considered for processing. If you're processing a lot of pdfs, I recommend setting this to avoid OCRing pdfs that are mostly images. (slows everything down)
149
 
150
+ Note that the env variables above are specific to this script, and cannot be set in `local.env`.
151
+
152
  # Benchmarks
153
 
154
  Benchmarking PDF extraction quality is hard. I've created a test set by finding books and scientific papers that have a pdf version and a latex source. I convert the latex to text, and compare the reference to the output of text extraction methods.
 
204
  Here are the non-commercial/restrictive dependencies:
205
 
206
  - LayoutLMv3: CC BY-NC-SA 4.0 . [Source](https://huggingface.co/microsoft/layoutlmv3-base)
 
207
  - PyMuPDF - GPL . [Source](https://pymupdf.readthedocs.io/en/latest/about.html#license-and-copyright)
208
 
209
  Other dependencies/datasets are openly licensed (doclaynet, byt5), or used in a way that is compatible with commercial usage (ghostscript).
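The configuration notes in the README diff above say that settings can be overridden with environment variables or a `local.env` file, and that the torch device is detected automatically unless it is set. A minimal Python sketch of that lookup order follows. It does not use marker's `Settings` class: `parse_env_file` and `resolve_torch_device` are hypothetical helpers, and both the precedence (environment variable before `local.env`) and the detection order (`cuda`, then `mps`, then `cpu`) are assumptions rather than behavior confirmed by this commit.

```python
from typing import Dict, Mapping


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_torch_device(
    environ: Mapping[str, str],
    env_file: Mapping[str, str],
    cuda_available: bool = False,
    mps_available: bool = False,
) -> str:
    """Return a device name: explicit setting first, then detection, then cpu."""
    # Assumption: a real environment variable beats the local.env entry.
    explicit = environ.get("TORCH_DEVICE") or env_file.get("TORCH_DEVICE")
    if explicit:
        return explicit
    # Assumption: prefer cuda over mps when both are reported as available.
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Example: nothing set in the environment, local.env pins the device.
file_values = parse_env_file("# local.env\nTORCH_DEVICE=mps\nINFERENCE_RAM=16\n")
print(resolve_torch_device({}, file_values))
```

Passing the availability flags in as arguments keeps the sketch runnable without torch installed; in real code they would presumably come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.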
marker/settings.py CHANGED
@@ -120,8 +120,6 @@ class Settings(BaseSettings):
120
  def MODEL_DTYPE(self) -> torch.dtype:
121
  if self.TORCH_DEVICE_MODEL == "cuda":
122
  return torch.bfloat16
123
- elif self.TORCH_DEVICE_MODEL == "mps":
124
- return torch.float16
125
  else:
126
  return torch.float32
127
 
 
120
  def MODEL_DTYPE(self) -> torch.dtype:
121
  if self.TORCH_DEVICE_MODEL == "cuda":
122
  return torch.bfloat16
 
 
123
  else:
124
  return torch.float32
125
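After this change, `MODEL_DTYPE` no longer has an `mps` branch, so any device other than `cuda` falls through to `float32`. The sketch below mirrors that branching in plain Python so it runs without torch. `model_dtype_name` is a hypothetical stand-in that returns dtype names as strings instead of `torch.dtype` objects.

```python
def model_dtype_name(torch_device_model: str) -> str:
    """Mirror the post-commit branching of Settings.MODEL_DTYPE, using names."""
    if torch_device_model == "cuda":
        return "bfloat16"
    # "mps" used to map to float16; it now takes the default branch.
    return "float32"


for device in ("cuda", "mps", "cpu"):
    print(device, model_dtype_name(device))
```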
 
poetry.lock CHANGED
@@ -14,6 +14,30 @@ files = [
14
  [package.dependencies]
15
  frozenlist = ">=1.1.0"
16
 
17
  [[package]]
18
  name = "annotated-types"
19
  version = "0.6.0"
@@ -247,6 +271,28 @@ webencodings = "*"
247
  [package.extras]
248
  css = ["tinycss2 (>=1.1.0,<1.3)"]
249
 
250
  [[package]]
251
  name = "certifi"
252
  version = "2023.11.17"
@@ -776,6 +822,37 @@ files = [
776
  [package.dependencies]
777
  wcwidth = ">=0.2.12,<0.3.0"
778
 
779
  [[package]]
780
  name = "grpcio"
781
  version = "1.60.0"
@@ -2125,6 +2202,74 @@ files = [
2125
  {file = "packaging-23.2.tar.gz", hash = "sha256:048fb0e9405036518eaaf48a55953c750c11e1a1b68e0dd1a9d62ed0c092cfc5"},
2126
  ]
2127
 
2128
  [[package]]
2129
  name = "pandocfilters"
2130
  version = "1.5.0"
@@ -2436,6 +2581,54 @@ files = [
2436
  [package.extras]
2437
  tests = ["pytest"]
2438
 
2439
  [[package]]
2440
  name = "pycparser"
2441
  version = "2.21"
@@ -2598,6 +2791,25 @@ files = [
2598
  pydantic = ">=2.3.0"
2599
  python-dotenv = ">=0.21.0"
2600
 
2601
  [[package]]
2602
  name = "pygments"
2603
  version = "2.17.2"
@@ -2682,6 +2894,28 @@ files = [
2682
  {file = "PyMuPDFb-1.23.7-py3-none-win_amd64.whl", hash = "sha256:7552793efa6976574b8b7840fd0091773c410e6048bc7cbf4b2eb3ed92d0b7a5"},
2683
  ]
2684
 
2685
  [[package]]
2686
  name = "pyspellchecker"
2687
  version = "0.7.2"
@@ -2758,6 +2992,17 @@ files = [
2758
  {file = "python_magic-0.4.27-py2.py3-none-any.whl", hash = "sha256:c212960ad306f700aa0d01e5d7a325d20548ff97eb9920dcd29513174f0294d3"},
2759
  ]
2760
 
2761
  [[package]]
2762
  name = "pywin32"
2763
  version = "306"
@@ -3704,6 +3949,17 @@ files = [
3704
  {file = "six-1.16.0.tar.gz", hash = "sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926"},
3705
  ]
3706
 
3707
  [[package]]
3708
  name = "sniffio"
3709
  version = "1.3.0"
@@ -3745,6 +4001,61 @@ pure-eval = "*"
3745
  [package.extras]
3746
  tests = ["cython", "littleutils", "pygments", "pytest", "typeguard"]
3747
 
3748
  [[package]]
3749
  name = "sympy"
3750
  version = "1.12"
@@ -3773,6 +4084,20 @@ files = [
3773
  [package.extras]
3774
  widechars = ["wcwidth"]
3775
 
3776
  [[package]]
3777
  name = "terminado"
3778
  version = "0.18.0"
@@ -3794,6 +4119,32 @@ docs = ["myst-parser", "pydata-sphinx-theme", "sphinx"]
3794
  test = ["pre-commit", "pytest (>=7.0)", "pytest-timeout"]
3795
  typing = ["mypy (>=1.6,<2.0)", "traitlets (>=5.11.1)"]
3796
 
3797
  [[package]]
3798
  name = "thefuzz"
3799
  version = "0.20.0"
@@ -3952,6 +4303,17 @@ dev = ["tokenizers[testing]"]
3952
  docs = ["setuptools_rust", "sphinx", "sphinx_rtd_theme"]
3953
  testing = ["black (==22.3)", "datasets", "numpy", "pytest", "requests"]
3954
 
3955
  [[package]]
3956
  name = "tomli"
3957
  version = "2.0.1"
@@ -3963,33 +4325,44 @@ files = [
3963
  {file = "tomli-2.0.1.tar.gz", hash = "sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"},
3964
  ]
3965
 
3966
  [[package]]
3967
  name = "torch"
3968
- version = "2.1.1"
3969
  description = "Tensors and Dynamic neural networks in Python with strong GPU acceleration"
3970
  optional = false
3971
  python-versions = ">=3.8.0"
3972
  files = [
3973
- {file = "torch-2.1.1-cp310-cp310-manylinux1_x86_64.whl", hash = "sha256:5ebc43f5355a9b7be813392b3fb0133991f0380f6f0fcc8218d5468dc45d1071"},
3974
- {file = "torch-2.1.1-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:84fefd63356416c0cd20578637ccdbb82164993400ed17b57c951dd6376dcee8"},
3975
- {file = "torch-2.1.1-cp310-cp310-win_amd64.whl", hash = "sha256:0a7a9da0c324409bcb5a7bdad1b4e94e936d21c2590aaa7ac2f63968da8c62f7"},
3976
- {file = "torch-2.1.1-cp310-none-macosx_10_9_x86_64.whl", hash = "sha256:1e1e5faddd43a8f2c0e0e22beacd1e235a2e447794d807483c94a9e31b54a758"},
3977
- {file = "torch-2.1.1-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:e76bf3c5c354874f1da465c852a2fb60ee6cbce306e935337885760f080f9baa"},
3978
- {file = "torch-2.1.1-cp311-cp311-manylinux1_x86_64.whl", hash = "sha256:98fea993639b0bb432dfceb7b538f07c0f1c33386d63f635219f49254968c80f"},
3979
- {file = "torch-2.1.1-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:61b51b33c61737c287058b0c3061e6a9d3c363863e4a094f804bc486888a188a"},
3980
- {file = "torch-2.1.1-cp311-cp311-win_amd64.whl", hash = "sha256:1d70920da827e2276bf07f7ec46958621cad18d228c97da8f9c19638474dbd52"},
3981
- {file = "torch-2.1.1-cp311-none-macosx_10_9_x86_64.whl", hash = "sha256:a70593806f1d7e6b53657d96810518da0f88ef2608c98a402955765b8c79d52c"},
3982
- {file = "torch-2.1.1-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:e312f7e82e49565f7667b0bbf9559ab0c597063d93044740781c02acd5a87978"},
3983
- {file = "torch-2.1.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:1e3cbecfa5a7314d828f4a37b0c286714dc9aa2e69beb7a22f7aca76567ed9f4"},
3984
- {file = "torch-2.1.1-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:9ca0fcbf3d5ba644d6a8572c83a9abbdf5f7ff575bc38529ef6c185a3a71bde9"},
3985
- {file = "torch-2.1.1-cp38-cp38-win_amd64.whl", hash = "sha256:2dc9f312fc1fa0d61a565a0292ad73119d4b74c9f8b5031b55f8b4722abca079"},
3986
- {file = "torch-2.1.1-cp38-none-macosx_10_9_x86_64.whl", hash = "sha256:d56b032176458e2af4709627bbd2c20fe2917eff8cd087a7fe313acccf5ce2f1"},
3987
- {file = "torch-2.1.1-cp38-none-macosx_11_0_arm64.whl", hash = "sha256:29e3b90a8c281f6660804a939d1f4218604c80162e521e1e6d8c8557325902a0"},
3988
- {file = "torch-2.1.1-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:bd95cee8511584b67ddc0ba465c3f1edeb5708d833ee02af1206b4486f1d9096"},
3989
- {file = "torch-2.1.1-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:b31230bd058424e56dba7f899280dbc6ac8b9948e43902e0c84a44666b1ec151"},
3990
- {file = "torch-2.1.1-cp39-cp39-win_amd64.whl", hash = "sha256:403f1095e665e4f35971b43797a920725b8b205723aa68254a4050c6beca29b6"},
3991
- {file = "torch-2.1.1-cp39-none-macosx_10_9_x86_64.whl", hash = "sha256:715b50d8c1de5da5524a68287eb000f73e026e74d5f6b12bc450ef6995fcf5f9"},
3992
- {file = "torch-2.1.1-cp39-none-macosx_11_0_arm64.whl", hash = "sha256:db67e8725c76f4c7f4f02e7551bb16e81ba1a1912867bc35d7bb96d2be8c78b4"},
3993
  ]
3994
 
3995
  [package.dependencies]
@@ -4186,6 +4559,34 @@ files = [
4186
  {file = "typing_extensions-4.8.0.tar.gz", hash = "sha256:df8e4339e9cb77357558cbdbceca33c303714cf861d1eef15e1070055ae8b7ef"},
4187
  ]
4188
 
4189
  [[package]]
4190
  name = "uri-template"
4191
  version = "1.3.0"
@@ -4216,6 +4617,67 @@ brotli = ["brotli (>=1.0.9)", "brotlicffi (>=0.8.0)"]
4216
  socks = ["pysocks (>=1.5.6,!=1.5.7,<2.0)"]
4217
  zstd = ["zstandard (>=0.18.0)"]
4218
 
4219
  [[package]]
4220
  name = "wcwidth"
4221
  version = "0.2.12"
@@ -4377,4 +4839,4 @@ testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "p
4377
  [metadata]
4378
  lock-version = "2.0"
4379
  python-versions = ">=3.9,<3.13,!=3.9.7"
4380
- content-hash = "18b11c67696bc73165c460a9253d12f5e7ba48644ab4eddb0651fe26b99cb077"
 
14
  [package.dependencies]
15
  frozenlist = ">=1.1.0"
16
 
17
+ [[package]]
18
+ name = "altair"
19
+ version = "5.2.0"
20
+ description = "Vega-Altair: A declarative statistical visualization library for Python."
21
+ optional = false
22
+ python-versions = ">=3.8"
23
+ files = [
24
+ {file = "altair-5.2.0-py3-none-any.whl", hash = "sha256:8c4888ad11db7c39f3f17aa7f4ea985775da389d79ac30a6c22856ab238df399"},
25
+ {file = "altair-5.2.0.tar.gz", hash = "sha256:2ad7f0c8010ebbc46319cc30febfb8e59ccf84969a201541c207bc3a4fa6cf81"},
26
+ ]
27
+
28
+ [package.dependencies]
29
+ jinja2 = "*"
30
+ jsonschema = ">=3.0"
31
+ numpy = "*"
32
+ packaging = "*"
33
+ pandas = ">=0.25"
34
+ toolz = "*"
35
+ typing-extensions = {version = ">=4.0.1", markers = "python_version < \"3.11\""}
36
+
37
+ [package.extras]
38
+ dev = ["anywidget", "geopandas", "hatch", "ipython", "m2r", "mypy", "pandas-stubs", "pyarrow (>=11)", "pytest", "pytest-cov", "ruff (>=0.1.3)", "types-jsonschema", "types-setuptools", "vega-datasets", "vegafusion[embed] (>=1.4.0)", "vl-convert-python (>=1.1.0)"]
39
+ doc = ["docutils", "jinja2", "myst-parser", "numpydoc", "pillow (>=9,<10)", "pydata-sphinx-theme (>=0.14.1)", "scipy", "sphinx", "sphinx-copybutton", "sphinx-design", "sphinxext-altair"]
40
+
41
  [[package]]
42
  name = "annotated-types"
43
  version = "0.6.0"
 
271
  [package.extras]
272
  css = ["tinycss2 (>=1.1.0,<1.3)"]
273
 
274
+ [[package]]
275
+ name = "blinker"
276
+ version = "1.7.0"
277
+ description = "Fast, simple object-to-object and broadcast signaling"
278
+ optional = false
279
+ python-versions = ">=3.8"
280
+ files = [
281
+ {file = "blinker-1.7.0-py3-none-any.whl", hash = "sha256:c3f865d4d54db7abc53758a01601cf343fe55b84c1de4e3fa910e420b438d5b9"},
282
+ {file = "blinker-1.7.0.tar.gz", hash = "sha256:e6820ff6fa4e4d1d8e2747c2283749c3f547e4fee112b98555cdcdae32996182"},
283
+ ]
284
+
285
+ [[package]]
286
+ name = "cachetools"
287
+ version = "5.3.2"
288
+ description = "Extensible memoizing collections and decorators"
289
+ optional = false
290
+ python-versions = ">=3.7"
291
+ files = [
292
+ {file = "cachetools-5.3.2-py3-none-any.whl", hash = "sha256:861f35a13a451f94e301ce2bec7cac63e881232ccce7ed67fab9b5df4d3beaa1"},
293
+ {file = "cachetools-5.3.2.tar.gz", hash = "sha256:086ee420196f7b2ab9ca2db2520aca326318b68fe5ba8bc4d49cca91add450f2"},
294
+ ]
295
+
296
  [[package]]
297
  name = "certifi"
298
  version = "2023.11.17"
 
822
  [package.dependencies]
823
  wcwidth = ">=0.2.12,<0.3.0"
824
 
825
+ [[package]]
826
+ name = "gitdb"
827
+ version = "4.0.11"
828
+ description = "Git Object Database"
829
+ optional = false
830
+ python-versions = ">=3.7"
831
+ files = [
832
+ {file = "gitdb-4.0.11-py3-none-any.whl", hash = "sha256:81a3407ddd2ee8df444cbacea00e2d038e40150acfa3001696fe0dcf1d3adfa4"},
833
+ {file = "gitdb-4.0.11.tar.gz", hash = "sha256:bf5421126136d6d0af55bc1e7c1af1c397a34f5b7bd79e776cd3e89785c2b04b"},
834
+ ]
835
+
836
+ [package.dependencies]
837
+ smmap = ">=3.0.1,<6"
838
+
839
+ [[package]]
840
+ name = "gitpython"
841
+ version = "3.1.40"
842
+ description = "GitPython is a Python library used to interact with Git repositories"
843
+ optional = false
844
+ python-versions = ">=3.7"
845
+ files = [
846
+ {file = "GitPython-3.1.40-py3-none-any.whl", hash = "sha256:cf14627d5a8049ffbf49915732e5eddbe8134c3bdb9d476e6182b676fc573f8a"},
847
+ {file = "GitPython-3.1.40.tar.gz", hash = "sha256:22b126e9ffb671fdd0c129796343a02bf67bf2994b35449ffc9321aa755e18a4"},
848
+ ]
849
+
850
+ [package.dependencies]
851
+ gitdb = ">=4.0.1,<5"
852
+
853
+ [package.extras]
854
+ test = ["black", "coverage[toml]", "ddt (>=1.1.1,!=1.4.3)", "mock", "mypy", "pre-commit", "pytest", "pytest-cov", "pytest-instafail", "pytest-subtests", "pytest-sugar"]
855
+
856
  [[package]]
857
  name = "grpcio"
858
  version = "1.60.0"
 
2202
  {file = "packaging-23.2.tar.gz", hash = "sha256:048fb0e9405036518eaaf48a55953c750c11e1a1b68e0dd1a9d62ed0c092cfc5"},
2203
  ]
2204
 
2205
+ [[package]]
2206
+ name = "pandas"
2207
+ version = "2.1.4"
2208
+ description = "Powerful data structures for data analysis, time series, and statistics"
2209
+ optional = false
2210
+ python-versions = ">=3.9"
2211
+ files = [
2212
+ {file = "pandas-2.1.4-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:bdec823dc6ec53f7a6339a0e34c68b144a7a1fd28d80c260534c39c62c5bf8c9"},
2213
+ {file = "pandas-2.1.4-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:294d96cfaf28d688f30c918a765ea2ae2e0e71d3536754f4b6de0ea4a496d034"},
2214
+ {file = "pandas-2.1.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6b728fb8deba8905b319f96447a27033969f3ea1fea09d07d296c9030ab2ed1d"},
2215
+ {file = "pandas-2.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:00028e6737c594feac3c2df15636d73ace46b8314d236100b57ed7e4b9ebe8d9"},
2216
+ {file = "pandas-2.1.4-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:426dc0f1b187523c4db06f96fb5c8d1a845e259c99bda74f7de97bd8a3bb3139"},
2217
+ {file = "pandas-2.1.4-cp310-cp310-win_amd64.whl", hash = "sha256:f237e6ca6421265643608813ce9793610ad09b40154a3344a088159590469e46"},
2218
+ {file = "pandas-2.1.4-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b7d852d16c270e4331f6f59b3e9aa23f935f5c4b0ed2d0bc77637a8890a5d092"},
2219
+ {file = "pandas-2.1.4-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:bd7d5f2f54f78164b3d7a40f33bf79a74cdee72c31affec86bfcabe7e0789821"},
2220
+ {file = "pandas-2.1.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0aa6e92e639da0d6e2017d9ccff563222f4eb31e4b2c3cf32a2a392fc3103c0d"},
2221
+ {file = "pandas-2.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d797591b6846b9db79e65dc2d0d48e61f7db8d10b2a9480b4e3faaddc421a171"},
2222
+ {file = "pandas-2.1.4-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:d2d3e7b00f703aea3945995ee63375c61b2e6aa5aa7871c5d622870e5e137623"},
2223
+ {file = "pandas-2.1.4-cp311-cp311-win_amd64.whl", hash = "sha256:dc9bf7ade01143cddc0074aa6995edd05323974e6e40d9dbde081021ded8510e"},
2224
+ {file = "pandas-2.1.4-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:482d5076e1791777e1571f2e2d789e940dedd927325cc3cb6d0800c6304082f6"},
2225
+ {file = "pandas-2.1.4-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:8a706cfe7955c4ca59af8c7a0517370eafbd98593155b48f10f9811da440248b"},
2226
+ {file = "pandas-2.1.4-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b0513a132a15977b4a5b89aabd304647919bc2169eac4c8536afb29c07c23540"},
2227
+ {file = "pandas-2.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e9f17f2b6fc076b2a0078862547595d66244db0f41bf79fc5f64a5c4d635bead"},
2228
+ {file = "pandas-2.1.4-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:45d63d2a9b1b37fa6c84a68ba2422dc9ed018bdaa668c7f47566a01188ceeec1"},
2229
+ {file = "pandas-2.1.4-cp312-cp312-win_amd64.whl", hash = "sha256:f69b0c9bb174a2342818d3e2778584e18c740d56857fc5cdb944ec8bbe4082cf"},
2230
+ {file = "pandas-2.1.4-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:3f06bda01a143020bad20f7a85dd5f4a1600112145f126bc9e3e42077c24ef34"},
2231
+ {file = "pandas-2.1.4-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:ab5796839eb1fd62a39eec2916d3e979ec3130509930fea17fe6f81e18108f6a"},
2232
+ {file = "pandas-2.1.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:edbaf9e8d3a63a9276d707b4d25930a262341bca9874fcb22eff5e3da5394732"},
2233
+ {file = "pandas-2.1.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1ebfd771110b50055712b3b711b51bee5d50135429364d0498e1213a7adc2be8"},
2234
+ {file = "pandas-2.1.4-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:8ea107e0be2aba1da619cc6ba3f999b2bfc9669a83554b1904ce3dd9507f0860"},
2235
+ {file = "pandas-2.1.4-cp39-cp39-win_amd64.whl", hash = "sha256:d65148b14788b3758daf57bf42725caa536575da2b64df9964c563b015230984"},
2236
+ {file = "pandas-2.1.4.tar.gz", hash = "sha256:fcb68203c833cc735321512e13861358079a96c174a61f5116a1de89c58c0ef7"},
2237
+ ]
2238
+
2239
+ [package.dependencies]
2240
+ numpy = [
2241
+ {version = ">=1.22.4,<2", markers = "python_version < \"3.11\""},
2242
+ {version = ">=1.23.2,<2", markers = "python_version == \"3.11\""},
2243
+ {version = ">=1.26.0,<2", markers = "python_version >= \"3.12\""},
2244
+ ]
2245
+ python-dateutil = ">=2.8.2"
2246
+ pytz = ">=2020.1"
2247
+ tzdata = ">=2022.1"
2248
+
2249
+ [package.extras]
2250
+ all = ["PyQt5 (>=5.15.6)", "SQLAlchemy (>=1.4.36)", "beautifulsoup4 (>=4.11.1)", "bottleneck (>=1.3.4)", "dataframe-api-compat (>=0.1.7)", "fastparquet (>=0.8.1)", "fsspec (>=2022.05.0)", "gcsfs (>=2022.05.0)", "html5lib (>=1.1)", "hypothesis (>=6.46.1)", "jinja2 (>=3.1.2)", "lxml (>=4.8.0)", "matplotlib (>=3.6.1)", "numba (>=0.55.2)", "numexpr (>=2.8.0)", "odfpy (>=1.4.1)", "openpyxl (>=3.0.10)", "pandas-gbq (>=0.17.5)", "psycopg2 (>=2.9.3)", "pyarrow (>=7.0.0)", "pymysql (>=1.0.2)", "pyreadstat (>=1.1.5)", "pytest (>=7.3.2)", "pytest-xdist (>=2.2.0)", "pyxlsb (>=1.0.9)", "qtpy (>=2.2.0)", "s3fs (>=2022.05.0)", "scipy (>=1.8.1)", "tables (>=3.7.0)", "tabulate (>=0.8.10)", "xarray (>=2022.03.0)", "xlrd (>=2.0.1)", "xlsxwriter (>=3.0.3)", "zstandard (>=0.17.0)"]
2251
+ aws = ["s3fs (>=2022.05.0)"]
2252
+ clipboard = ["PyQt5 (>=5.15.6)", "qtpy (>=2.2.0)"]
2253
+ compression = ["zstandard (>=0.17.0)"]
2254
+ computation = ["scipy (>=1.8.1)", "xarray (>=2022.03.0)"]
2255
+ consortium-standard = ["dataframe-api-compat (>=0.1.7)"]
2256
+ excel = ["odfpy (>=1.4.1)", "openpyxl (>=3.0.10)", "pyxlsb (>=1.0.9)", "xlrd (>=2.0.1)", "xlsxwriter (>=3.0.3)"]
2257
+ feather = ["pyarrow (>=7.0.0)"]
2258
+ fss = ["fsspec (>=2022.05.0)"]
2259
+ gcp = ["gcsfs (>=2022.05.0)", "pandas-gbq (>=0.17.5)"]
2260
+ hdf5 = ["tables (>=3.7.0)"]
2261
+ html = ["beautifulsoup4 (>=4.11.1)", "html5lib (>=1.1)", "lxml (>=4.8.0)"]
2262
+ mysql = ["SQLAlchemy (>=1.4.36)", "pymysql (>=1.0.2)"]
2263
+ output-formatting = ["jinja2 (>=3.1.2)", "tabulate (>=0.8.10)"]
2264
+ parquet = ["pyarrow (>=7.0.0)"]
2265
+ performance = ["bottleneck (>=1.3.4)", "numba (>=0.55.2)", "numexpr (>=2.8.0)"]
2266
+ plot = ["matplotlib (>=3.6.1)"]
2267
+ postgresql = ["SQLAlchemy (>=1.4.36)", "psycopg2 (>=2.9.3)"]
2268
+ spss = ["pyreadstat (>=1.1.5)"]
2269
+ sql-other = ["SQLAlchemy (>=1.4.36)"]
2270
+ test = ["hypothesis (>=6.46.1)", "pytest (>=7.3.2)", "pytest-xdist (>=2.2.0)"]
2271
+ xml = ["lxml (>=4.8.0)"]
2272
+
2273
  [[package]]
2274
  name = "pandocfilters"
2275
  version = "1.5.0"
 
2581
  [package.extras]
2582
  tests = ["pytest"]
2583
 
2584
+ [[package]]
2585
+ name = "pyarrow"
2586
+ version = "14.0.2"
2587
+ description = "Python library for Apache Arrow"
2588
+ optional = false
2589
+ python-versions = ">=3.8"
2590
+ files = [
2591
+ {file = "pyarrow-14.0.2-cp310-cp310-macosx_10_14_x86_64.whl", hash = "sha256:ba9fe808596c5dbd08b3aeffe901e5f81095baaa28e7d5118e01354c64f22807"},
2592
+ {file = "pyarrow-14.0.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:22a768987a16bb46220cef490c56c671993fbee8fd0475febac0b3e16b00a10e"},
2593
+ {file = "pyarrow-14.0.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2dbba05e98f247f17e64303eb876f4a80fcd32f73c7e9ad975a83834d81f3fda"},
2594
+ {file = "pyarrow-14.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a898d134d00b1eca04998e9d286e19653f9d0fcb99587310cd10270907452a6b"},
2595
+ {file = "pyarrow-14.0.2-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:87e879323f256cb04267bb365add7208f302df942eb943c93a9dfeb8f44840b1"},
2596
+ {file = "pyarrow-14.0.2-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:76fc257559404ea5f1306ea9a3ff0541bf996ff3f7b9209fc517b5e83811fa8e"},
2597
+ {file = "pyarrow-14.0.2-cp310-cp310-win_amd64.whl", hash = "sha256:b0c4a18e00f3a32398a7f31da47fefcd7a927545b396e1f15d0c85c2f2c778cd"},
2598
+ {file = "pyarrow-14.0.2-cp311-cp311-macosx_10_14_x86_64.whl", hash = "sha256:87482af32e5a0c0cce2d12eb3c039dd1d853bd905b04f3f953f147c7a196915b"},
2599
+ {file = "pyarrow-14.0.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:059bd8f12a70519e46cd64e1ba40e97eae55e0cbe1695edd95384653d7626b23"},
2600
+ {file = "pyarrow-14.0.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3f16111f9ab27e60b391c5f6d197510e3ad6654e73857b4e394861fc79c37200"},
2601
+ {file = "pyarrow-14.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:06ff1264fe4448e8d02073f5ce45a9f934c0f3db0a04460d0b01ff28befc3696"},
2602
+ {file = "pyarrow-14.0.2-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:6dd4f4b472ccf4042f1eab77e6c8bce574543f54d2135c7e396f413046397d5a"},
2603
+ {file = "pyarrow-14.0.2-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:32356bfb58b36059773f49e4e214996888eeea3a08893e7dbde44753799b2a02"},
2604
+ {file = "pyarrow-14.0.2-cp311-cp311-win_amd64.whl", hash = "sha256:52809ee69d4dbf2241c0e4366d949ba035cbcf48409bf404f071f624ed313a2b"},
2605
+ {file = "pyarrow-14.0.2-cp312-cp312-macosx_10_14_x86_64.whl", hash = "sha256:c87824a5ac52be210d32906c715f4ed7053d0180c1060ae3ff9b7e560f53f944"},
2606
+ {file = "pyarrow-14.0.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:a25eb2421a58e861f6ca91f43339d215476f4fe159eca603c55950c14f378cc5"},
2607
+ {file = "pyarrow-14.0.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5c1da70d668af5620b8ba0a23f229030a4cd6c5f24a616a146f30d2386fec422"},
2608
+ {file = "pyarrow-14.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2cc61593c8e66194c7cdfae594503e91b926a228fba40b5cf25cc593563bcd07"},
2609
+ {file = "pyarrow-14.0.2-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:78ea56f62fb7c0ae8ecb9afdd7893e3a7dbeb0b04106f5c08dbb23f9c0157591"},
2610
+ {file = "pyarrow-14.0.2-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:37c233ddbce0c67a76c0985612fef27c0c92aef9413cf5aa56952f359fcb7379"},
2611
+ {file = "pyarrow-14.0.2-cp312-cp312-win_amd64.whl", hash = "sha256:e4b123ad0f6add92de898214d404e488167b87b5dd86e9a434126bc2b7a5578d"},
2612
+ {file = "pyarrow-14.0.2-cp38-cp38-macosx_10_14_x86_64.whl", hash = "sha256:e354fba8490de258be7687f341bc04aba181fc8aa1f71e4584f9890d9cb2dec2"},
+ {file = "pyarrow-14.0.2-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:20e003a23a13da963f43e2b432483fdd8c38dc8882cd145f09f21792e1cf22a1"},
+ {file = "pyarrow-14.0.2-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fc0de7575e841f1595ac07e5bc631084fd06ca8b03c0f2ecece733d23cd5102a"},
+ {file = "pyarrow-14.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:66e986dc859712acb0bd45601229021f3ffcdfc49044b64c6d071aaf4fa49e98"},
+ {file = "pyarrow-14.0.2-cp38-cp38-manylinux_2_28_aarch64.whl", hash = "sha256:f7d029f20ef56673a9730766023459ece397a05001f4e4d13805111d7c2108c0"},
+ {file = "pyarrow-14.0.2-cp38-cp38-manylinux_2_28_x86_64.whl", hash = "sha256:209bac546942b0d8edc8debda248364f7f668e4aad4741bae58e67d40e5fcf75"},
+ {file = "pyarrow-14.0.2-cp38-cp38-win_amd64.whl", hash = "sha256:1e6987c5274fb87d66bb36816afb6f65707546b3c45c44c28e3c4133c010a881"},
+ {file = "pyarrow-14.0.2-cp39-cp39-macosx_10_14_x86_64.whl", hash = "sha256:a01d0052d2a294a5f56cc1862933014e696aa08cc7b620e8c0cce5a5d362e976"},
+ {file = "pyarrow-14.0.2-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:a51fee3a7db4d37f8cda3ea96f32530620d43b0489d169b285d774da48ca9785"},
+ {file = "pyarrow-14.0.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:64df2bf1ef2ef14cee531e2dfe03dd924017650ffaa6f9513d7a1bb291e59c15"},
+ {file = "pyarrow-14.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3c0fa3bfdb0305ffe09810f9d3e2e50a2787e3a07063001dcd7adae0cee3601a"},
+ {file = "pyarrow-14.0.2-cp39-cp39-manylinux_2_28_aarch64.whl", hash = "sha256:c65bf4fd06584f058420238bc47a316e80dda01ec0dfb3044594128a6c2db794"},
+ {file = "pyarrow-14.0.2-cp39-cp39-manylinux_2_28_x86_64.whl", hash = "sha256:63ac901baec9369d6aae1cbe6cca11178fb018a8d45068aaf5bb54f94804a866"},
+ {file = "pyarrow-14.0.2-cp39-cp39-win_amd64.whl", hash = "sha256:75ee0efe7a87a687ae303d63037d08a48ef9ea0127064df18267252cfe2e9541"},
+ {file = "pyarrow-14.0.2.tar.gz", hash = "sha256:36cef6ba12b499d864d1def3e990f97949e0b79400d08b7cf74504ffbd3eb025"},
+ ]
+
+ [package.dependencies]
+ numpy = ">=1.16.6"
+
  [[package]]
  name = "pycparser"
  version = "2.21"
 
  pydantic = ">=2.3.0"
  python-dotenv = ">=0.21.0"
  
+ [[package]]
+ name = "pydeck"
+ version = "0.8.0"
+ description = "Widget for deck.gl maps"
+ optional = false
+ python-versions = ">=3.7"
+ files = [
+ {file = "pydeck-0.8.0-py2.py3-none-any.whl", hash = "sha256:a8fa7757c6f24bba033af39db3147cb020eef44012ba7e60d954de187f9ed4d5"},
+ {file = "pydeck-0.8.0.tar.gz", hash = "sha256:07edde833f7cfcef6749124351195aa7dcd24663d4909fd7898dbd0b6fbc01ec"},
+ ]
+
+ [package.dependencies]
+ jinja2 = ">=2.10.1"
+ numpy = ">=1.16.4"
+
+ [package.extras]
+ carto = ["pydeck-carto"]
+ jupyter = ["ipykernel (>=5.1.2)", "ipython (>=5.8.0)", "ipywidgets (>=7,<8)", "traitlets (>=4.3.2)"]
+
  [[package]]
  name = "pygments"
  version = "2.17.2"
 
  {file = "PyMuPDFb-1.23.7-py3-none-win_amd64.whl", hash = "sha256:7552793efa6976574b8b7840fd0091773c410e6048bc7cbf4b2eb3ed92d0b7a5"},
  ]
  
+ [[package]]
+ name = "pypdfium2"
+ version = "4.25.0"
+ description = "Python bindings to PDFium"
+ optional = false
+ python-versions = ">= 3.6"
+ files = [
+ {file = "pypdfium2-4.25.0-py3-none-macosx_10_13_x86_64.whl", hash = "sha256:25075d85834bf70a2244ce564063ee9aa2c738a019c09aeffa61920163892110"},
+ {file = "pypdfium2-4.25.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:ab46ac5e257b0610ca2bed7b5baf588b1417abe5bc36339ffdc651620dfe02f8"},
+ {file = "pypdfium2-4.25.0-py3-none-manylinux_2_17_aarch64.whl", hash = "sha256:7b5131374574d4f602346d1ef489362cc7fbd7a1920214e14efd5114c1915bf5"},
+ {file = "pypdfium2-4.25.0-py3-none-manylinux_2_17_armv7l.whl", hash = "sha256:20cb1d9fbd78595f0d0750a232f8204caa2f2aec34a1dde80ec5184f1efcf90d"},
+ {file = "pypdfium2-4.25.0-py3-none-manylinux_2_17_i686.whl", hash = "sha256:48ab21bed55bcb2cbce5c61363fc0f5481b26f5eb34b484fbea126c0d86f3697"},
+ {file = "pypdfium2-4.25.0-py3-none-manylinux_2_17_x86_64.whl", hash = "sha256:766730b0422347770189de5d3c7ff0c85620be12678d813fcce1122900b31c40"},
+ {file = "pypdfium2-4.25.0-py3-none-musllinux_1_1_aarch64.whl", hash = "sha256:485b404bd059a80a1bb0646647e2f0493f9240ca2c2672059d876b91c2e7e9f9"},
+ {file = "pypdfium2-4.25.0-py3-none-musllinux_1_1_i686.whl", hash = "sha256:f49234b882c5c3fd1936fa20db665d6667d45bcbefc8120565c600aa7287bda0"},
+ {file = "pypdfium2-4.25.0-py3-none-musllinux_1_1_x86_64.whl", hash = "sha256:b43a0206c2ac5dc812b6bdff2e74023a080f06ca847dd7905063f33ca385c655"},
+ {file = "pypdfium2-4.25.0-py3-none-win32.whl", hash = "sha256:23e693cceb0154a609838bf827a139a475eb532297d2c2b6cd4033b9bac73dad"},
+ {file = "pypdfium2-4.25.0-py3-none-win_amd64.whl", hash = "sha256:43f7612bc802518d67dacec11d1d0fb447c7371ce70cb2328f70ef71e9fe0582"},
+ {file = "pypdfium2-4.25.0-py3-none-win_arm64.whl", hash = "sha256:d05155f6a50b8025c0fe0a11a527d4f1716c09f06fa4e4d46f3798fd040e8b95"},
+ {file = "pypdfium2-4.25.0.tar.gz", hash = "sha256:1691bfa3e65bc84fd8b1d26a6d68af90f9662771a647dd6c4e031d81c4a72037"},
+ ]
+
  [[package]]
  name = "pyspellchecker"
  version = "0.7.2"
 
  {file = "python_magic-0.4.27-py2.py3-none-any.whl", hash = "sha256:c212960ad306f700aa0d01e5d7a325d20548ff97eb9920dcd29513174f0294d3"},
  ]
  
+ [[package]]
+ name = "pytz"
+ version = "2023.3.post1"
+ description = "World timezone definitions, modern and historical"
+ optional = false
+ python-versions = "*"
+ files = [
+ {file = "pytz-2023.3.post1-py2.py3-none-any.whl", hash = "sha256:ce42d816b81b68506614c11e8937d3aa9e41007ceb50bfdcb0749b921bf646c7"},
+ {file = "pytz-2023.3.post1.tar.gz", hash = "sha256:7b4fddbeb94a1eba4b557da24f19fdf9db575192544270a9101d8509f9f43d7b"},
+ ]
+
  [[package]]
  name = "pywin32"
  version = "306"
 
  {file = "six-1.16.0.tar.gz", hash = "sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926"},
  ]
  
+ [[package]]
+ name = "smmap"
+ version = "5.0.1"
+ description = "A pure Python implementation of a sliding window memory map manager"
+ optional = false
+ python-versions = ">=3.7"
+ files = [
+ {file = "smmap-5.0.1-py3-none-any.whl", hash = "sha256:e6d8668fa5f93e706934a62d7b4db19c8d9eb8cf2adbb75ef1b675aa332b69da"},
+ {file = "smmap-5.0.1.tar.gz", hash = "sha256:dceeb6c0028fdb6734471eb07c0cd2aae706ccaecab45965ee83f11c8d3b1f62"},
+ ]
+
  [[package]]
  name = "sniffio"
  version = "1.3.0"
 
  [package.extras]
  tests = ["cython", "littleutils", "pygments", "pytest", "typeguard"]
  
+ [[package]]
+ name = "streamlit"
+ version = "1.29.0"
+ description = "A faster way to build and share data apps"
+ optional = false
+ python-versions = ">=3.8, !=3.9.7"
+ files = [
+ {file = "streamlit-1.29.0-py2.py3-none-any.whl", hash = "sha256:753510edb5bb831af0e3bdacd353c879ad5b4f0211e7efa0ec378809464868b4"},
+ {file = "streamlit-1.29.0.tar.gz", hash = "sha256:b6dfff9c5e132e5518c92150efcd452980db492a45fafeac3d4688d2334efa07"},
+ ]
+
+ [package.dependencies]
+ altair = ">=4.0,<6"
+ blinker = ">=1.0.0,<2"
+ cachetools = ">=4.0,<6"
+ click = ">=7.0,<9"
+ gitpython = ">=3.0.7,<3.1.19 || >3.1.19,<4"
+ importlib-metadata = ">=1.4,<7"
+ numpy = ">=1.19.3,<2"
+ packaging = ">=16.8,<24"
+ pandas = ">=1.3.0,<3"
+ pillow = ">=7.1.0,<11"
+ protobuf = ">=3.20,<5"
+ pyarrow = ">=6.0"
+ pydeck = ">=0.8.0b4,<1"
+ python-dateutil = ">=2.7.3,<3"
+ requests = ">=2.27,<3"
+ rich = ">=10.14.0,<14"
+ tenacity = ">=8.1.0,<9"
+ toml = ">=0.10.1,<2"
+ tornado = ">=6.0.3,<7"
+ typing-extensions = ">=4.3.0,<5"
+ tzlocal = ">=1.1,<6"
+ validators = ">=0.2,<1"
+ watchdog = {version = ">=2.1.5", markers = "platform_system != \"Darwin\""}
+
+ [package.extras]
+ snowflake = ["snowflake-connector-python (>=2.8.0)", "snowflake-snowpark-python (>=0.9.0)"]
+
+ [[package]]
+ name = "streamlit-drawable-canvas-jsretry"
+ version = "0.9.3"
+ description = "A Streamlit custom component for a free drawing canvas using Fabric.js. A fork to enable retrying for bg images."
+ optional = false
+ python-versions = ">=3.6"
+ files = [
+ {file = "streamlit-drawable-canvas-jsretry-0.9.3.tar.gz", hash = "sha256:d9da8a863faeeae01c8521e8e282ed83cc15b845962519149a61fc8eead7afe6"},
+ {file = "streamlit_drawable_canvas_jsretry-0.9.3-py3-none-any.whl", hash = "sha256:e8035daa0297b504cc184e58ddf15cfd59680241ce1c2d0d554de507a263ca20"},
+ ]
+
+ [package.dependencies]
+ numpy = "*"
+ Pillow = "*"
+ streamlit = ">=0.63"
+
  [[package]]
  name = "sympy"
  version = "1.12"
 
  [package.extras]
  widechars = ["wcwidth"]
  
+ [[package]]
+ name = "tenacity"
+ version = "8.2.3"
+ description = "Retry code until it succeeds"
+ optional = false
+ python-versions = ">=3.7"
+ files = [
+ {file = "tenacity-8.2.3-py3-none-any.whl", hash = "sha256:ce510e327a630c9e1beaf17d42e6ffacc88185044ad85cf74c0a8887c6a0f88c"},
+ {file = "tenacity-8.2.3.tar.gz", hash = "sha256:5398ef0d78e63f40007c1fb4c0bff96e1911394d2fa8d194f77619c05ff6cc8a"},
+ ]
+
+ [package.extras]
+ doc = ["reno", "sphinx", "tornado (>=4.5)"]
+
  [[package]]
  name = "terminado"
  version = "0.18.0"
 
  test = ["pre-commit", "pytest (>=7.0)", "pytest-timeout"]
  typing = ["mypy (>=1.6,<2.0)", "traitlets (>=5.11.1)"]
  
+ [[package]]
+ name = "texify"
+ version = "0.1.8"
+ description = "OCR for latex images"
+ optional = false
+ python-versions = ">=3.9, !=2.7.*, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.*, !=3.8.*"
+ files = [
+ {file = "texify-0.1.8-py3-none-any.whl", hash = "sha256:6ffdf467abb21cc9eca58d8f786a9f74016f73413d66db2e5ad7dc6a7cdb0f1d"},
+ {file = "texify-0.1.8.tar.gz", hash = "sha256:1ff7615b58b55381aed723cb24202b107a5ec12a1a4564fe1edd8d1315b3f0d1"},
+ ]
+
+ [package.dependencies]
+ ftfy = ">=6.1.3,<7.0.0"
+ huggingface-hub = "0.19.4"
+ numpy = ">=1.26.2,<2.0.0"
+ Pillow = ">=10.1.0,<11.0.0"
+ pydantic = ">=2.5.2,<3.0.0"
+ pydantic-settings = ">=2.1.0,<3.0.0"
+ pypdfium2 = ">=4.25.0,<5.0.0"
+ python-dotenv = ">=1.0.0,<2.0.0"
+ streamlit = ">=1.29.0,<2.0.0"
+ streamlit-drawable-canvas-jsretry = ">=0.9.3,<0.10.0"
+ tabulate = ">=0.9.0,<0.10.0"
+ transformers = ">=4.36.2,<5.0.0"
+ watchdog = ">=3.0.0,<4.0.0"
+
  [[package]]
  name = "thefuzz"
  version = "0.20.0"
 
  docs = ["setuptools_rust", "sphinx", "sphinx_rtd_theme"]
  testing = ["black (==22.3)", "datasets", "numpy", "pytest", "requests"]
  
+ [[package]]
+ name = "toml"
+ version = "0.10.2"
+ description = "Python Library for Tom's Obvious, Minimal Language"
+ optional = false
+ python-versions = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*"
+ files = [
+ {file = "toml-0.10.2-py2.py3-none-any.whl", hash = "sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b"},
+ {file = "toml-0.10.2.tar.gz", hash = "sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f"},
+ ]
+
  [[package]]
  name = "tomli"
  version = "2.0.1"
 
  {file = "tomli-2.0.1.tar.gz", hash = "sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"},
  ]
  
+ [[package]]
+ name = "toolz"
+ version = "0.12.0"
+ description = "List processing tools and functional utilities"
+ optional = false
+ python-versions = ">=3.5"
+ files = [
+ {file = "toolz-0.12.0-py3-none-any.whl", hash = "sha256:2059bd4148deb1884bb0eb770a3cde70e7f954cfbbdc2285f1f2de01fd21eb6f"},
+ {file = "toolz-0.12.0.tar.gz", hash = "sha256:88c570861c440ee3f2f6037c4654613228ff40c93a6c25e0eba70d17282c6194"},
+ ]
+
  [[package]]
  name = "torch"
+ version = "2.1.2"
  description = "Tensors and Dynamic neural networks in Python with strong GPU acceleration"
  optional = false
  python-versions = ">=3.8.0"
  files = [
+ {file = "torch-2.1.2-cp310-cp310-manylinux1_x86_64.whl", hash = "sha256:3a871edd6c02dae77ad810335c0833391c1a4ce49af21ea8cf0f6a5d2096eea8"},
+ {file = "torch-2.1.2-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:bef6996c27d8f6e92ea4e13a772d89611da0e103b48790de78131e308cf73076"},
+ {file = "torch-2.1.2-cp310-cp310-win_amd64.whl", hash = "sha256:0e13034fd5fb323cbbc29e56d0637a3791e50dd589616f40c79adfa36a5a35a1"},
+ {file = "torch-2.1.2-cp310-none-macosx_10_9_x86_64.whl", hash = "sha256:d9b535cad0df3d13997dbe8bd68ac33e0e3ae5377639c9881948e40794a61403"},
+ {file = "torch-2.1.2-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:f9a55d55af02826ebfbadf4e9b682f0f27766bc33df8236b48d28d705587868f"},
+ {file = "torch-2.1.2-cp311-cp311-manylinux1_x86_64.whl", hash = "sha256:a6ebbe517097ef289cc7952783588c72de071d4b15ce0f8b285093f0916b1162"},
+ {file = "torch-2.1.2-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:8f32ce591616a30304f37a7d5ea80b69ca9e1b94bba7f308184bf616fdaea155"},
+ {file = "torch-2.1.2-cp311-cp311-win_amd64.whl", hash = "sha256:e0ee6cf90c8970e05760f898d58f9ac65821c37ffe8b04269ec787aa70962b69"},
+ {file = "torch-2.1.2-cp311-none-macosx_10_9_x86_64.whl", hash = "sha256:76d37967c31c99548ad2c4d3f2cf191db48476f2e69b35a0937137116da356a1"},
+ {file = "torch-2.1.2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:e2d83f07b4aac983453ea5bf8f9aa9dacf2278a8d31247f5d9037f37befc60e4"},
+ {file = "torch-2.1.2-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:f41fe0c7ecbf903a568c73486139a75cfab287a0f6c17ed0698fdea7a1e8641d"},
+ {file = "torch-2.1.2-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:e3225f47d50bb66f756fe9196a768055d1c26b02154eb1f770ce47a2578d3aa7"},
+ {file = "torch-2.1.2-cp38-cp38-win_amd64.whl", hash = "sha256:33d59cd03cb60106857f6c26b36457793637512998666ee3ce17311f217afe2b"},
+ {file = "torch-2.1.2-cp38-none-macosx_10_9_x86_64.whl", hash = "sha256:8e221deccd0def6c2badff6be403e0c53491805ed9915e2c029adbcdb87ab6b5"},
+ {file = "torch-2.1.2-cp38-none-macosx_11_0_arm64.whl", hash = "sha256:05b18594f60a911a0c4f023f38a8bda77131fba5fd741bda626e97dcf5a3dd0a"},
+ {file = "torch-2.1.2-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:9ca96253b761e9aaf8e06fb30a66ee301aecbf15bb5a303097de1969077620b6"},
+ {file = "torch-2.1.2-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:d93ba70f67b08c2ae5598ee711cbc546a1bc8102cef938904b8c85c2089a51a0"},
+ {file = "torch-2.1.2-cp39-cp39-win_amd64.whl", hash = "sha256:255b50bc0608db177e6a3cc118961d77de7e5105f07816585fa6f191f33a9ff3"},
+ {file = "torch-2.1.2-cp39-none-macosx_10_9_x86_64.whl", hash = "sha256:6984cd5057c0c977b3c9757254e989d3f1124f4ce9d07caa6cb637783c71d42a"},
+ {file = "torch-2.1.2-cp39-none-macosx_11_0_arm64.whl", hash = "sha256:bc195d7927feabc0eb7c110e457c955ed2ab616f3c7c28439dd4188cf589699f"},
  ]
  
  [package.dependencies]
 
  {file = "typing_extensions-4.8.0.tar.gz", hash = "sha256:df8e4339e9cb77357558cbdbceca33c303714cf861d1eef15e1070055ae8b7ef"},
  ]
  
+ [[package]]
+ name = "tzdata"
+ version = "2023.4"
+ description = "Provider of IANA time zone data"
+ optional = false
+ python-versions = ">=2"
+ files = [
+ {file = "tzdata-2023.4-py2.py3-none-any.whl", hash = "sha256:aa3ace4329eeacda5b7beb7ea08ece826c28d761cda36e747cfbf97996d39bf3"},
+ {file = "tzdata-2023.4.tar.gz", hash = "sha256:dd54c94f294765522c77399649b4fefd95522479a664a0cec87f41bebc6148c9"},
+ ]
+
+ [[package]]
+ name = "tzlocal"
+ version = "5.2"
+ description = "tzinfo object for the local timezone"
+ optional = false
+ python-versions = ">=3.8"
+ files = [
+ {file = "tzlocal-5.2-py3-none-any.whl", hash = "sha256:49816ef2fe65ea8ac19d19aa7a1ae0551c834303d5014c6d5a62e4cbda8047b8"},
+ {file = "tzlocal-5.2.tar.gz", hash = "sha256:8d399205578f1a9342816409cc1e46a93ebd5755e39ea2d85334bea911bf0e6e"},
+ ]
+
+ [package.dependencies]
+ tzdata = {version = "*", markers = "platform_system == \"Windows\""}
+
+ [package.extras]
+ devenv = ["check-manifest", "pytest (>=4.3)", "pytest-cov", "pytest-mock (>=3.3)", "zest.releaser"]
+
  [[package]]
  name = "uri-template"
  version = "1.3.0"
 
  socks = ["pysocks (>=1.5.6,!=1.5.7,<2.0)"]
  zstd = ["zstandard (>=0.18.0)"]
  
+ [[package]]
+ name = "validators"
+ version = "0.22.0"
+ description = "Python Data Validation for Humans™"
+ optional = false
+ python-versions = ">=3.8"
+ files = [
+ {file = "validators-0.22.0-py3-none-any.whl", hash = "sha256:61cf7d4a62bbae559f2e54aed3b000cea9ff3e2fdbe463f51179b92c58c9585a"},
+ {file = "validators-0.22.0.tar.gz", hash = "sha256:77b2689b172eeeb600d9605ab86194641670cdb73b60afd577142a9397873370"},
+ ]
+
+ [package.extras]
+ docs-offline = ["myst-parser (>=2.0.0)", "pypandoc-binary (>=1.11)", "sphinx (>=7.1.1)"]
+ docs-online = ["mkdocs (>=1.5.2)", "mkdocs-git-revision-date-localized-plugin (>=1.2.0)", "mkdocs-material (>=9.2.6)", "mkdocstrings[python] (>=0.22.0)", "pyaml (>=23.7.0)"]
+ hooks = ["pre-commit (>=3.3.3)"]
+ package = ["build (>=1.0.0)", "twine (>=4.0.2)"]
+ runner = ["tox (>=4.11.1)"]
+ sast = ["bandit[toml] (>=1.7.5)"]
+ testing = ["pytest (>=7.4.0)"]
+ tooling = ["black (>=23.7.0)", "pyright (>=1.1.325)", "ruff (>=0.0.287)"]
+ tooling-extras = ["pyaml (>=23.7.0)", "pypandoc-binary (>=1.11)", "pytest (>=7.4.0)"]
+
+ [[package]]
+ name = "watchdog"
+ version = "3.0.0"
+ description = "Filesystem events monitoring"
+ optional = false
+ python-versions = ">=3.7"
+ files = [
+ {file = "watchdog-3.0.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:336adfc6f5cc4e037d52db31194f7581ff744b67382eb6021c868322e32eef41"},
+ {file = "watchdog-3.0.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:a70a8dcde91be523c35b2bf96196edc5730edb347e374c7de7cd20c43ed95397"},
+ {file = "watchdog-3.0.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:adfdeab2da79ea2f76f87eb42a3ab1966a5313e5a69a0213a3cc06ef692b0e96"},
+ {file = "watchdog-3.0.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:2b57a1e730af3156d13b7fdddfc23dea6487fceca29fc75c5a868beed29177ae"},
+ {file = "watchdog-3.0.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:7ade88d0d778b1b222adebcc0927428f883db07017618a5e684fd03b83342bd9"},
+ {file = "watchdog-3.0.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7e447d172af52ad204d19982739aa2346245cc5ba6f579d16dac4bfec226d2e7"},
+ {file = "watchdog-3.0.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:9fac43a7466eb73e64a9940ac9ed6369baa39b3bf221ae23493a9ec4d0022674"},
+ {file = "watchdog-3.0.0-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:8ae9cda41fa114e28faf86cb137d751a17ffd0316d1c34ccf2235e8a84365c7f"},
+ {file = "watchdog-3.0.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:25f70b4aa53bd743729c7475d7ec41093a580528b100e9a8c5b5efe8899592fc"},
+ {file = "watchdog-3.0.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:4f94069eb16657d2c6faada4624c39464f65c05606af50bb7902e036e3219be3"},
+ {file = "watchdog-3.0.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:7c5f84b5194c24dd573fa6472685b2a27cc5a17fe5f7b6fd40345378ca6812e3"},
+ {file = "watchdog-3.0.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:3aa7f6a12e831ddfe78cdd4f8996af9cf334fd6346531b16cec61c3b3c0d8da0"},
+ {file = "watchdog-3.0.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:233b5817932685d39a7896b1090353fc8efc1ef99c9c054e46c8002561252fb8"},
+ {file = "watchdog-3.0.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl", hash = "sha256:13bbbb462ee42ec3c5723e1205be8ced776f05b100e4737518c67c8325cf6100"},
+ {file = "watchdog-3.0.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:8f3ceecd20d71067c7fd4c9e832d4e22584318983cabc013dbf3f70ea95de346"},
+ {file = "watchdog-3.0.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:c9d8c8ec7efb887333cf71e328e39cffbf771d8f8f95d308ea4125bf5f90ba64"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_aarch64.whl", hash = "sha256:0e06ab8858a76e1219e68c7573dfeba9dd1c0219476c5a44d5333b01d7e1743a"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_armv7l.whl", hash = "sha256:d00e6be486affb5781468457b21a6cbe848c33ef43f9ea4a73b4882e5f188a44"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_i686.whl", hash = "sha256:c07253088265c363d1ddf4b3cdb808d59a0468ecd017770ed716991620b8f77a"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_ppc64.whl", hash = "sha256:5113334cf8cf0ac8cd45e1f8309a603291b614191c9add34d33075727a967709"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_ppc64le.whl", hash = "sha256:51f90f73b4697bac9c9a78394c3acbbd331ccd3655c11be1a15ae6fe289a8c83"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_s390x.whl", hash = "sha256:ba07e92756c97e3aca0912b5cbc4e5ad802f4557212788e72a72a47ff376950d"},
+ {file = "watchdog-3.0.0-py3-none-manylinux2014_x86_64.whl", hash = "sha256:d429c2430c93b7903914e4db9a966c7f2b068dd2ebdd2fa9b9ce094c7d459f33"},
+ {file = "watchdog-3.0.0-py3-none-win32.whl", hash = "sha256:3ed7c71a9dccfe838c2f0b6314ed0d9b22e77d268c67e015450a29036a81f60f"},
+ {file = "watchdog-3.0.0-py3-none-win_amd64.whl", hash = "sha256:4c9956d27be0bb08fc5f30d9d0179a855436e655f046d288e2bcc11adfae893c"},
+ {file = "watchdog-3.0.0-py3-none-win_ia64.whl", hash = "sha256:5d9f3a10e02d7371cd929b5d8f11e87d4bad890212ed3901f9b4d68767bee759"},
+ {file = "watchdog-3.0.0.tar.gz", hash = "sha256:4d98a320595da7a7c5a18fc48cb633c2e73cda78f93cac2ef42d42bf609a33f9"},
+ ]
+
+ [package.extras]
+ watchmedo = ["PyYAML (>=3.10)"]
+
  [[package]]
  name = "wcwidth"
  version = "0.2.12"
 
  [metadata]
  lock-version = "2.0"
  python-versions = ">=3.9,<3.13,!=3.9.7"
+ content-hash = "f91c7dd6b1e0ce34ca55fd50d54791e95d323b17e0331eef988eb7cf61bed5e1"
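Each `files` entry in the lock diff above pairs an artifact name with a `sha256:` digest that an installer can compare against the downloaded file. A minimal sketch of that comparison, assuming the artifact is already on disk; the helper name `matches_lock_hash` is hypothetical and is not part of Poetry or pip:

```python
import hashlib
from pathlib import Path


def matches_lock_hash(path: Path, lock_hash: str) -> bool:
    """Check a file against a lock-style "algorithm:hexdigest" string.

    Hypothetical helper for illustration only.
    """
    # Split "sha256:abcd..." into the algorithm name and the expected digest.
    algorithm, _, expected = lock_hash.partition(":")
    digest = hashlib.new(algorithm)
    # Read in 1 MiB chunks so large wheels are not loaded into memory at once.
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()
```

Calling it with the path of a downloaded `texify-0.1.8.tar.gz` and the `hash` string from its lock entry should return `True` when the download is intact.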
pyproject.toml CHANGED
@@ -1,6 +1,6 @@
  [tool.poetry]
  name = "marker-pdf"
- version = "0.1.2"
+ version = "0.1.3"
  description = "Convert PDF to markdown with high speed and accuracy."
  authors = ["Vik Paruchuri <github@vikas.sh>"]
  readme = "README.md"
@@ -42,6 +42,7 @@ nltk = "^3.8.1"
  ocrmypdf = "^15.4.0"
  bitsandbytes = "^0.41.2.post2"
  grpcio = "^1.60.0"
+ texify = "^0.1.8"
  
  [tool.poetry.group.dev.dependencies]
  jupyter = "^1.0.0"
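
The new `texify = "^0.1.8"` line uses Poetry's caret syntax, which allows updates that leave the left-most non-zero version component unchanged. A rough sketch of how such a constraint expands into explicit bounds, assuming plain numeric versions only (it would reject something like `0.41.2.post2`); `caret_bounds` is a hypothetical helper, not a Poetry API:

```python
def caret_bounds(version: str) -> tuple[str, str]:
    """Expand a caret requirement such as "0.1.8" into (lower, upper) bounds.

    Hypothetical helper for illustration; handles numeric components only.
    """
    parts = [int(piece) for piece in version.split(".")]
    # Bump the first non-zero component; if all are zero, bump the last one given.
    bump = next((i for i, part in enumerate(parts) if part != 0), len(parts) - 1)
    upper = parts[:bump] + [parts[bump] + 1]
    # Pad both bounds to major.minor.patch for readability.
    lower = parts + [0] * (3 - len(parts))
    upper = upper + [0] * (3 - len(upper))

    def fmt(components: list[int]) -> str:
        return ".".join(str(c) for c in components)

    return f">={fmt(lower)}", f"<{fmt(upper)}"


print(caret_bounds("0.1.8"))   # texify = "^0.1.8"
print(caret_bounds("15.4.0"))  # ocrmypdf = "^15.4.0"
```

Under these rules `^0.1.8` accepts `0.1.9` but not `0.2.0`, so picking up a texify 0.2 release would require editing `pyproject.toml`.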