Fix table bug in docx (#510)

### What problem does this PR solve?

#509

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
KevinHuSh
2024-04-23 19:10:33 +08:00
committed by GitHub
parent 6405041b4d
commit 369400c483
2 changed files with 5 additions and 4 deletions


@@ -76,6 +76,7 @@ def chunk(filename, binary=None, from_page=0, to_page=100000,
            binary if binary else filename, from_page=from_page, to_page=to_page)
        remove_contents_table(sections, eng=is_english(
            random_choices([t for t, _ in sections], k=200)))
        tbls = [((None, lns), None) for lns in tbls]
        callback(0.8, "Finish parsing.")
    elif re.search(r"\.pdf$", filename, re.IGNORECASE):
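
The `tbls = [((None, lns), None) for lns in tbls]` line in the hunk above wraps each table in a nested tuple. Below is a minimal sketch of what that expression does, assuming each entry of `tbls` is a list of row strings and that downstream code unpacks the result as `(image, rows), positions`. The sample data, the `iter_tables` helper, and the meaning given to the two `None` slots are assumptions for illustration, not taken from the repository.

```python
# Hypothetical sample: each docx table as a list of row strings.
tbls = [
    ["Name; Age", "Ann; 31"],
    ["City; Country", "Oslo; Norway"],
]

# Same expression as in the diff: wrap every table as ((None, rows), None).
# Assumption: the first None stands in for an image, the second for positions.
wrapped = [((None, lns), None) for lns in tbls]


def iter_tables(tables):
    """Hypothetical consumer that unpacks the nested tuple shape."""
    for (img, rows), poss in tables:
        yield img, rows, poss


for img, rows, poss in iter_tables(wrapped):
    print(img, rows, poss)
```

If a consumer unpacks entries this way, plain lists of rows would fail at the unpacking step, which may be the failure this commit addresses.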