bug/Bold characters get repeated while extracting #3864

**Describe the bug**
I'm trying to read a PDF file that contains both bold and normal text. The normal text is extracted correctly, but every character of the bold text is repeated.

For example, **BOLD TEXT** is read as BBOOLLDD  TTEEXXTT.
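Until the root cause is found, the doubled output can at least be detected and collapsed on the client side. The following is a minimal sketch, not part of the library: `collapse_doubled_chars` is a hypothetical helper name, and it assumes the duplication is strictly pairwise (every character emitted exactly twice), as in the example above.

```python
def collapse_doubled_chars(text: str) -> str:
    """Collapse text in which every character appears twice in a row.

    Hypothetical workaround helper: returns the input unchanged unless
    the whole string consists of identical adjacent pairs.
    """
    # A pairwise-doubled string always has an even, non-zero length.
    if len(text) < 2 or len(text) % 2 != 0:
        return text
    # Check each (even, odd) index pair for identical characters.
    all_pairs_match = all(text[i] == text[i + 1] for i in range(0, len(text), 2))
    # Keep every second character only when the whole string is doubled.
    return text[::2] if all_pairs_match else text


print(collapse_doubled_chars("BBOOLLDD  TTEEXXTT"))  # BOLD TEXT
print(collapse_doubled_chars("normal text"))         # normal text
```

This check can misfire on short strings that legitimately consist of doubled letters (for example `"oo"`), so it is only suitable as a stopgap on elements already known to be affected.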

**To Reproduce**
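```python
import json

from unstructured_client import UnstructuredClient
from unstructured_client.models import shared
from unstructured_client.models.errors import SDKError

# Client setup was not shown in the original report; the key is a placeholder.
s = UnstructuredClient(api_key_auth="YOUR_API_KEY")

# This file cannot be shared because it contains confidential information.
filename = "example_files/creatinine.pdf"
with open(filename, "rb") as f:
    files = shared.Files(
        content=f.read(),
        file_name=filename,
    )

req = shared.PartitionParameters(
    files=files,
    strategy="hi_res",
    pdf_infer_table_structure=True,
    languages=["eng"],
)
try:
    resp = s.general.partition(req)
    # Element 19 is the one whose text is bold in the PDF.
    print(json.dumps(resp.elements[19], indent=2))
except SDKError as e:
    print(e)
```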

**Expected behavior**
The output of the above code should be as follows:

```json
{
  "type": "NarrativeText",
  "element_id": "681ea37fceaad7479d246b8ccc52ec2d",
  "text": ">60",
  "metadata": {
    "filetype": "application/pdf",
    "languages": [
      "eng"
    ],
    "page_number": 2,
    "parent_id": "e72be637f803a9bf4509b64448ff1133",
    "filename": "creatinine.pdf"
  }
}
```

But because the text **>60** is bold in the PDF, the actual output looks like this:

```json
{
  "type": "NarrativeText",
  "element_id": "681ea37fceaad7479d246b8ccc52ec2d",
  "text": ">60>60",
  "metadata": {
    "filetype": "application/pdf",
    "languages": [
      "eng"
    ],
    "page_number": 2,
    "parent_id": "e72be637f803a9bf4509b64448ff1133",
    "filename": "creatinine.pdf"
  }
}
```
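Note that in this output the whole string is repeated (`>60>60`) rather than each character individually. A minimal sketch for flagging that pattern in returned elements, assuming the text is duplicated exactly once back to back; `collapse_repeated_halves` is a hypothetical helper, not part of the library:

```python
def collapse_repeated_halves(text: str) -> str:
    """Return the first half of the text if both halves are identical.

    Hypothetical workaround helper: any other input is returned unchanged.
    """
    half, remainder = divmod(len(text), 2)
    # Odd-length or empty strings cannot be an exact back-to-back repeat.
    if half == 0 or remainder:
        return text
    return text[:half] if text[:half] == text[half:] else text


print(collapse_repeated_halves(">60>60"))  # >60
print(collapse_repeated_halves(">60"))     # >60
```

Legitimate values such as `"1010"` would also be collapsed, so this is best applied only to elements already suspected of being affected.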

**Screenshots**
Here's a screenshot from the PDF showing **>60** in bold:
<img width="389" alt="image" src="https://github.com/user-attachments/assets/e99d7c5d-223d-4707-a3f1-868123b89459" />

Here's a screenshot of the code and the output:
<img width="384" alt="image" src="https://github.com/user-attachments/assets/a0e16417-0f84-47da-9615-c60c867b2a8e" />

