vignettes/intro.Rmd (6 additions & 18 deletions)
@@ -121,30 +121,18 @@ text <- input %>%
 cat(text)
 ```
 
-## Read from scanned PDF files and multipage TIFF files
+## Read from PDF files
 
-If your images are stored in PDF files they first need to be converted to a proper image format. Use a high DPI to keep quality of the image.
+If your images are stored in PDF files they first need to be converted to a proper image format. We can do this in R using the `pdf_convert` function from the pdftools package. Use a high DPI to keep quality of the image.
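
For readers of this change, the PDF workflow described in the added paragraph might look roughly like the sketch below. The helper name `ocr_pdf`, its arguments, and the default of 600 DPI are assumptions made for illustration and are not part of either package; only `pdftools::pdf_convert()` and `tesseract::ocr()` are real calls.

```r
# Hypothetical helper (not part of tesseract or pdftools):
# render each PDF page to a PNG image, then run OCR on the images.
ocr_pdf <- function(pdf_path, dpi = 600) {
  # pdf_convert() writes one image per page and returns the file names.
  # A higher dpi gives larger images, which usually helps OCR quality.
  png_files <- pdftools::pdf_convert(pdf_path, format = "png", dpi = dpi)

  # ocr() accepts the image paths and returns one string per page.
  tesseract::ocr(png_files)
}

# Example usage (assumes a file named "scan.pdf" exists):
# text <- ocr_pdf("scan.pdf")
# cat(text)
```

The package calls are namespace-qualified so that defining the helper does not require attaching either package first.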
Tesseract supports hundreds of "control parameters" which alter the OCR engine. Use `tesseract_params()` to list all parameters with their default value and a brief description. It also has a handy `filter` argument to quickly find parameters that match a particular string.
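
A minimal sketch of how the `filter` argument and a control parameter could be used together. The wrapper names `find_params` and `digits_engine` are hypothetical, and the whitelist parameter is chosen only as an illustration of one control parameter.

```r
# Hypothetical wrapper: list control parameters whose name or
# description matches a string, using the `filter` argument.
find_params <- function(pattern) {
  tesseract::tesseract_params(filter = pattern)
}

# Hypothetical wrapper: build an engine that overrides one control
# parameter, here restricting recognized characters to digits.
digits_engine <- function() {
  tesseract::tesseract(
    options = list(tessedit_char_whitelist = "0123456789")
  )
}

# Example usage (assumes the tesseract package and English data are installed):
# find_params("whitelist")
# tesseract::ocr("receipt.png", engine = digits_engine())
```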