I have this PDF file: https://docs.ton.org/ton.pdf
I used the following recipe to create a ToC:
[[heading]]
# TON Blockchain
level = 1
greedy = true
font.name = "F102"
font.size = 17.21540069580078
# font.size_tolerance = 1e-5
# font.color = 0x000000
# font.superscript = false
# font.italic = false
# font.serif = false
# font.monospace = false
# font.bold = false
# bbox.left = 138.70851135253906
# bbox.top = 127.66803741455078
# bbox.right = 274.1837158203125
# bbox.bottom = 144.88343811035156
# bbox.tolerance = 1e-5
[[heading]]
# TON Blockchain as a Collection of 2-Blockchains
level = 2
greedy = true
font.name = "F108"
font.size = 14.346199989318848
# font.size_tolerance = 1e-5
# font.color = 0x000000
# font.superscript = false
# font.italic = false
# font.serif = false
# font.monospace = false
# font.bold = false
# bbox.left = 146.76255798339844
# bbox.top = 291.47509765625
# bbox.right = 486.075927734375
# bbox.bottom = 305.8212890625
# bbox.tolerance = 1e-5
[[heading]]
# 2.1.1. List of blockchain types.
level = 3
greedy = false
font.name = "F104"
font.size = 11.9552001953125
# font.size_tolerance = 1e-5
# font.color = 0x000000
# font.superscript = false
# font.italic = false
# font.serif = false
# font.monospace = false
# font.bold = false
# bbox.left = 110.85400390625
# bbox.top = 395.5226745605469
# bbox.right = 289.56573486328125
# bbox.bottom = 407.52569580078125
# bbox.tolerance = 1e-5
The problem is that level 3 contains many wrong entries, for example:
"1 Brief Description of TON Components" 3
"2 2.1.17 2.4.20" 3
"3" 3
"4.1.7" 3
"4.1.10 3.1.6" 3
"3.2 3.2.10 3.2.14 3.2.12" 3
"4 4.3.14 4.3.17 3.2.12 4.1.6" 4
"4.3.1" 4
"5" 4
"4.3.23" 4
"2.9.13 4.1" 4
"2 TON Blockchain" 5
"2.1 TON Blockchain as a Collection of 2-Blockchains" 5
"2.1.17" 5
"2.1.1. List of blockchain types." 5
"2.8.8 2.9.7 2.9.8" 5
"2.8.12 2.8.8" 6
"2.1.17" 6
"2.1.2. Innite Sharding Paradigm." 6
"2.1.3. Messages. Instant Hypercube Routing. 2.4.2 2.4.20" 7
"2.1.4. Quantity of masterchains, workchains and shardchains." 7
The correct ones all share the same pattern: "\d+\.\d+\.\d+\. (an opening quote followed by three numbers, each terminated by a dot). Currently I delete the wrong level 3 lines in vim by selecting them and running this command:
:'<,'>g!/"\d\+\.\d\+\.\d\+\./d
But it would be better to have a regex pattern-matching filter in the recipe itself. The filter should be able to:
- exclude an output that doesn't match a regex
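Until such a filter exists, the same cleanup could be done as a post-processing step in Python. This is only a rough sketch that mirrors the vim command above. `keep_matching` and `TITLE_PATTERN` are hypothetical names, not part of any existing tool. Like the vim command, it drops every line that doesn't match, so it should only be fed the level 3 lines.

```python
import re

# Hypothetical pattern, copied from the vim command: an opening quote
# followed by three dot-terminated numbers, e.g. "2.1.1.
TITLE_PATTERN = re.compile(r'"\d+\.\d+\.\d+\.')


def keep_matching(lines, pattern=TITLE_PATTERN):
    """Return only the lines that contain the pattern (like vim's :g!/.../d)."""
    return [line for line in lines if pattern.search(line)]


# A few lines taken from the output shown above.
sample = [
    '"2.1.17" 5',
    '"2.1.1. List of blockchain types." 5',
    '"2.8.8 2.9.7 2.9.8" 5',
    '"2.1.2. Innite Sharding Paradigm." 6',
]

filtered = keep_matching(sample)
for line in filtered:
    print(line)
```

A built-in filter would ideally apply the regex per [[heading]] entry, so that level 1 and level 2 headings are not affected.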