[Feature Request] Support Extracting / Reading Text From Image #73

@ChristopherGabba

Hey Marc!

I was just looking at a feature request for my app and thought of a neat thing this library could support. I have an image list in my app, kind of like iOS Photos, and it would be awesome if I could dynamically search for text across all images.

I think there is an iOS API that lets you read text from images (and maybe an equivalent on Android).

The TypeScript API could be something like this:

const text = NitroImage.extractText().toLowerCase()
console.log(text) // "marc rousavy is a coding machine" or maybe it returns ["marc", "rousavy", "is", "a", "coding", "machine"]
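
To make the search use case a bit more concrete, here is a rough sketch of how the app side could look if such a method existed. Everything here is an assumption for illustration: `TextExtractable`, `extractTextAsync`, `buildTextIndex`, and `searchImages` are hypothetical names, not part of NitroImage today. The idea is to run text recognition once per image, cache the lower-cased result, and then filter that cache on every keystroke.

```typescript
// Hypothetical shape of an image that supports the proposed API.
// NitroImage does not ship `extractTextAsync` today; this is an assumption.
interface TextExtractable {
  extractTextAsync(): Promise<string>
}

// One cached entry per image: the image itself plus its recognized text.
interface IndexedImage<T> {
  image: T
  text: string // lower-cased so lookups are case-insensitive
}

// Run text recognition once per image and cache the lower-cased result.
async function buildTextIndex<T extends TextExtractable>(
  images: T[]
): Promise<IndexedImage<T>[]> {
  return Promise.all(
    images.map(async (image) => ({
      image,
      text: (await image.extractTextAsync()).toLowerCase(),
    }))
  )
}

// Return the images whose recognized text contains the query.
// An empty or whitespace-only query matches nothing.
function searchImages<T>(index: IndexedImage<T>[], query: string): T[] {
  const needle = query.trim().toLowerCase()
  if (needle.length === 0) return []
  return index
    .filter((entry) => entry.text.includes(needle))
    .map((entry) => entry.image)
}
```

Returning a single string keeps a substring search like this simple. If the API returned an array of words instead, phrase queries such as "coding machine" would need the words joined back together first.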

Looking through the NitroImage package and asking AI, it seems like this is possible on iOS (I'm not a native engineer by trade):

import NitroModules // assumed source of Promise and RuntimeError
import Vision

public extension NativeImage {
  /// Extracts all recognized text from the image using Apple's Vision framework.
  func extractText() throws -> String {
    guard let cgImage = uiImage.cgImage else {
      throw RuntimeError.error(withMessage: "This image does not have an underlying .cgImage!")
    }

    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])

    // `results` is already typed as [VNRecognizedTextObservation]? on current SDKs, so no cast is needed.
    guard let results = request.results else {
      return ""
    }

    let recognizedStrings = results.compactMap { $0.topCandidates(1).first?.string }
    return recognizedStrings.joined(separator: "\n")
  }

  /// Async wrapper for text extraction.
  func extractTextAsync() throws -> Promise<String> {
    return Promise.async {
      return try self.extractText()
    }
  }
}

Supposedly there is an equivalent on Android using ML Kit:

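import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
// The Latin-script options class lives in the `latin` subpackage.
import com.google.mlkit.vision.text.latin.TextRecognizerOptions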

fun extractTextFromBitmap(bitmap: Bitmap, callback: (String) -> Unit, errorCallback: (Exception) -> Unit) {
    val image = InputImage.fromBitmap(bitmap, 0)
    val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
    recognizer.process(image)
        .addOnSuccessListener { visionText ->
            // Get full text
            val resultText = visionText.text  // This gives all recognized text joined with newline by default
            callback(resultText)
        }
        .addOnFailureListener { e ->
            errorCallback(e)
        }
}
