Seeing AI can now see a little better. What’s new in Seeing AI 5.2

Seeing AI 5.2 Feature Update in front of a person using the app to take a photo of someone sitting at a table

by David Redmond

I think it’s fair to say that Seeing AI has taken a bit of a kicking lately. Speaking with my technology trainer hat on for a minute, where once people were impressed by the Microsoft tool, in recent times it’s been Be My AI that’s been getting all the praise. People using Seeing AI complain that they find it hard to listen to, and they want Be My AI on their phone instead.

Determined to stay in the game, Microsoft released Seeing AI 5.2 this week, and it’s got several features to try to compete in this new world where blind and visually impaired people have higher standards. So what’s new?

Seeing AI 5.2 release notes

  • Rich Image Descriptions: Seeing AI can now provide far more detail in image descriptions. Take a photo on the Scene channel, select an image in Browse Photos, or share a photo from another app – and then tap “More Info”.
  • Ask Questions About Documents: On the Document channel, after scanning a page, you can now ask Seeing AI questions about its contents. This can be much more efficient than listening to the document from beginning to end. For example, you could scan a menu, and ask for just the vegetarian options; scan a receipt, and find out how much a particular item costs; or scan a calendar, and ask when a particular event is.
  • Multi-page document scanning: On the Document channel, you can now scan multiple pages into a single document for reading, or sharing.
  • For easier access to image descriptions, the Scene channel has been moved to earlier in the channel switcher. You can customize the order of channels in Settings.
  • Plus, various bug fixes under the hood.
  • Please remember to use your judgement when reading AI-generated content. We appreciate your feedback as the technology continues to evolve.

Is it a good update?

Let’s start with the image descriptions. The experience is quite different to Be My AI. When you take a photo you don’t get the detailed description straight away; you need to find the “More Info” button to generate a detailed response. Some feel that Seeing AI is a bit less descriptive, but this is hard to measure.

I can’t help but feel that Seeing AI is still in catch-up mode here and that Be My AI provides a more seamless experience. With that said, having both apps in the space is certainly good and will hopefully result in both upping their game.

As for the document improvements, they are also a solid update. The icons are small but are all well-labeled with VoiceOver.

The ability to add multiple pages feels like it should have been there ages ago, but it works well. When you scan multiple pages you get a blue bar at the bottom with buttons for switching between them.

Ask Seeing AI has a button on the bottom right next to Share, and it moves you into a chat window similar to Be My AI. Responses are great and provide a lot of context, and I didn’t come across many issues.

When you scan a document, the AI only sees the OCR output and not the image, so while you can ask for information in the text, you cannot ask for any visual info. When I asked it to describe the visual layout of a cover page, it responded “Sorry, I could not find that information in the document”. There’s nothing wrong with this method necessarily, but it’s worth noting the difference.

Unfortunately, Seeing AI doesn’t help much with digital documents as far as follow-up questions are concerned. If you try to share a PDF into the app, Seeing AI doesn’t show up in the share sheet, which is a real shame. Be My AI does show up but only ever scans the first page. Both tools have a long way to go as far as documents are concerned.

These features all feel like they are part of the foundation, and I’m now waiting to see the developers build the house.

Conclusion

Comparing Seeing AI and Be My AI is not that dissimilar from comparing JAWS and NVDA. Both are making good efforts, but I want both to do more. Be My AI needs to add multi-page document scanning from the share sheet, and Seeing AI needs that too, alongside a simpler interface.

Seeing AI has loads of incredible features such as currency scanning and product analysis, but in some ways Be My AI can do the exact same things just with a different approach.

I can’t help but feel that while AI is a huge feature of both tools, they are taking very different approaches. Ultimately, which one you pick is a personal choice.

If it’s simplicity you want, go with Be My AI. If you want more of an everything accessibility app, then go with Seeing AI. Both are great options that will only get better.

Check out our Seeing AI 5.2 demo:
