Calculating WeightWatchers points from a picture of the Nutritional Information

19. October 2013 00:00 by Jay Grossman in   //  Tags: , , ,   //   Comments (0)

When I have tried to track the food I am eating as part of a diet, entering the key nutrition elements was not always convenient. Since I haven't seen a WeightWatchers product for reading nutritional values from a picture, I thought it would be a cool thing to prototype.

One of the guys in my office had a box of oatmeal packets on his desk, so he took a picture with his iPhone 5 -  (oatmeal_photo.JPG (2.99 mb)). Below is a lower resolution version:

 

I had never previously done anything at all with OCR, so I knew I'd learn at the very least. I googled looking for open source OCR libraries and web sites that parsed text from images. I tried quite a few and got the best results from the tesseract project:
https://code.google.com/p/tesseract-ocr/ 

I banged out a quick powershell script to:

  1. Execute tesseract to output a text file. 
  2. Parsed the output file for values for Servings, Protein, Carbohydrates, Fat, and Fiber
  3. Calculated the WeightWatchers Points valuations based on these values.

Although I am unable to share my code with everyone, here's my PowerGUI screen showing the output from the prototype: 

 

Conclusions:

  1. I am really psyched I was able to get this functionality working in under 3 hours. It's really important to do this kind of exercise from time to time.
  2. None of the OCR options I found were anywhere close to perfect at parsing all the text from this image. There's a bunch of assumptions and fuzzy logic transformation rules that a production quality version would require. 
  3. I tried to run some of the libraries with other pictures. I realized quickly that picture quality (size, resolution, and clarity) and lighting glares make a huge difference on how accurately the text in the image gets recognized. This is not a trivial challenge in production!
  4. I was pleasantly surprised that tesseract was able to understand most of the text on the page, including the text written sideways on the right side of the picture.
Comments 

About the author

Jay Grossman

techie / entrepreneur that enjoys:
 1) my kids + awesome wife
 2) building software projects/products
 3) digging for gold in data
 4) rooting for my Boston sports teams:New England PatriotsBoston PatriotsBoston Red SoxBoston CelticsBoston Bruins

Month List