At Home with Tech

Unlock the power of all your technology and learn how to master your photography, computers and smartphone.

Category: AI

Is ChatGPT’s Emotional Voice Assistant Getting too Personal?

The lines that define humanity have gotten a bit blurrier, now that it’s harder to differentiate between an interactive life-like AI voice and flesh and blood.

When watching science fiction, we accept it when a talking computer sounds like a real person. From Iron Man’s J.A.R.V.I.S. to the Starship Discovery’s Zora, it’s a common sci-fi character device. And, of course, there’s the mother of all talking computers… HAL. Some fictional computer voices are friendly. Others are not. But they all sound like us.

Well, it isn’t science fiction anymore. With GPT-4o in ChatGPT, we’ve now got a young, perky, friendly woman’s voice waiting to talk with you. And it seems entirely life-like with a total range of interactive emotions.

I don’t think OpenAI has given this new AI voice assistant a name yet, like Alexa or Siri. So, I’ll just call it Jane, the name I gave to my talking Garmin car GPS unit a few centuries back.

Well, you’ve done it, OpenAI. Yes, Jane seems alive.

Jane’s got Personality
I’m simultaneously enthralled and appalled. Sure, OpenAI presented the world just a demo of this female AI voice interface, and it wasn’t perfect, but it was close enough. It was hard to tell if her Scarlett Johansson-like vibe was real or not. She certainly sounded like she had feelings.

The three on-camera people all laughed and talked with Jane about mostly frivolous topics. It all seemed so wonderful and natural. They were perfect humans having a virtual coffee with a digital proto-human at the edge of the ‘singularity.’ Just another day at the office.

What could possibly be concerning?

There’s another Barrett
I was distracted by a separate detail that hit a little closer to home. One of the human presenters was named Barrett. Yes. There aren’t too many first-name Barretts out there. So, that coincidence struck me. My inner-Spock eyebrow raised a tad. “Fascinating.”

Perhaps I should pay closer attention.

The demo proceeded to show off Jane’s skills. She wasn’t just a voice. She had eyes too. She could see and process information through your phone’s camera. Yes.

Then, Jane complimented Barrett on what he was wearing. It felt strangely personal.

Okay. Now, I think we’ve crossed beyond the typical definition of a phone app.

And then I fell down the rabbit hole…

Is Humanity Replaceable?
I can’t stop thinking about the season 3 finale to “Westworld” (2020) when the evil Man in Black, played by Ed Harris, comes face to face with his robot host duplicate and realizes there’s no difference between them. He is entirely replaceable.

And I happened to recently stream “Mission: Impossible – Dead Reckoning Part One” (2023) during family movie night. The AI ‘Entity’ is of course the scary omniscient villain in the background. We never really get to meet it, but the self-aware AI seems impossible to beat. (We’ll have to wait until next summer to find out how Tom Cruise figures out the key solution.)

Fiction writers have forever been telling scary stories about computers gone amok. The Terminator. Ultron. Better-Stronger-Faster. (Wait, that’s just Steve Austin. Never mind.)

We’re in Control?
We’ve been trained for years to fear a superior AI-driven entity that will simply take over one day.

Now, I’m not sure anyone knows what’s going to happen when a computer actually becomes self-aware. But I don’t think we’re there yet.

Friendly Jane is just a new ‘emotion-simulation’ interface from ChatGPT. It’s a tool for us to use.

ChatGPT and other generative AI chatbots are supposed to help us do certain things faster. And they certainly do.

So, why the fuss?

Identity Crisis
I think our deeply embedded human fear of a Skynet overlord is partially a byproduct of years of exposure to scary storytelling.

Is this a branding problem to solve? Clearly, Barrett and his OpenAI colleagues are trying to address that with their very helpful Jane.

But I believe we’re also struggling with this redefining moment of what it really means to be human.

Artificial Human?
Did people feel threatened when the pocket calculator was introduced? Or the PC? Or the act of Googling? I don’t think so.

Sure, ChatGPT can process and present information faster than any human mind. But computers already passed that threshold years ago. We know that.

What’s so different now that there’s simply a young, engaged female ‘human’ voice attached to that interface?

Have we crossed over some invisible line of authenticity that defines our very identity as a species?

Maybe.

Activate your Inner John Connor
What’s clear is we are in the middle of an insanely rapid technological evolution. And if you want to know what it is to be human in the 21st century, you may be forced to redefine it a bit.

And so, you’d better figure out how to control the tools that are already doing what yesterday only we could do.

This is not a choice.

For starters, it’s time to learn how to be a good ‘prompt engineer.’ I guarantee tomorrow’s children will grow up being experts at this the same way yesterday’s toddlers intuitively knew how to navigate the first iPads.

Pay Attention
Don’t we already know that a pretty voice and manufactured beauty shouldn’t be a defining characteristic of any real person?

Will we need to pay more attention in the future when presented with reasonable facsimiles of the human form and function? Absolutely.

If you spot your doppelganger tomorrow on the street staring at you, you probably have something to worry about.

But I think eventually having a helpful J.A.R.V.I.S. in your life can be productive, empowering and even nurturing.

…As long as you don’t forget ‘what’ you’re dealing with. It’s the ‘what.’ Not the ‘who.’

Jane is not alive.

That’s the line we don’t want to cross.

Master the Art of Transcribing Speech in an Audio File with Microsoft Word’s AI Magic

If you want to convert an audio file into an AI-generated text transcript for free, you could find a robot from the future to handle the task. But look no further than your Microsoft 365 subscription. Here are the quick and easy steps to upload and transcribe your file using Word on the web.

Many AI transcription tools available today convert recorded speech to text from audio and video files. AI voice-to-text conversion isn’t perfect, but it’s getting better all the time.

The easy ‘pro’ solution is to use Adobe Premiere Pro, which integrates strong transcription powers into its video-editing interface. But if you’re not in the Adobe ecosystem, or you’re looking for other solutions that don’t require video editing, you’ll need to look elsewhere.

Most standalone AI transcription tools do cost money after a limited free trial. In doing research for a personal project, I tried to identify a low-cost or free AI solution.

And I was hoping to find it with a recognizable brand I felt I could trust. Happily, I discovered that Microsoft effectively offers what I needed.

And the Microsoft solution is free (as long as you already pay for a Microsoft 365 account).

Microsoft Word on the Web via your Microsoft 365 Account
There are a few important details to take care of before getting started on your transcription journey with Microsoft:

  1. You have to use the web version of Microsoft Word via your Microsoft 365 account. (That’s the key to opening this free transcription door.)
  2. You also need to use Chrome or Edge (not Safari… which seemingly doesn’t offer the ‘Transcribe’ feature).
  3. If you’re working with a video file, you first need to convert your video to an audio file. Yes, Microsoft Word transcription only lets you upload audio files, and that creates an extra step (but it’s worth it).

To make the audio file, I use the ‘Compressor’ app on my Mac. (You can also use QuickTime.) The conversion process goes surprisingly fast. (I converted to MP3s with Compressor, but you can convert to other audio formats.)
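If you’re comfortable with a command line, the free ffmpeg tool is another way to pull the audio out of a video. Here’s a minimal Python sketch of that idea. It assumes ffmpeg is already installed on your computer, and the function names and file names are made up for illustration. (It’s not the Compressor workflow I used.)

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, audio_path):
    """Build the ffmpeg arguments that save a video's audio track as its own file."""
    return [
        "ffmpeg",
        "-i", str(video_path),  # the input video file
        "-vn",                  # skip the video stream and keep only the audio
        "-q:a", "2",            # ask for high-quality variable bit rate audio
        str(audio_path),        # the .mp3 extension tells ffmpeg to write an MP3
    ]


def video_to_mp3(video_path):
    """Convert a video to an MP3 saved next to it (assumes ffmpeg is on your PATH)."""
    audio_path = Path(video_path).with_suffix(".mp3")
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```

Calling video_to_mp3("interview.mp4") would leave an interview.mp3 file in the same folder, ready to upload to Word.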

Step-by-Step Guide to your AI Transcript

Once you’ve got your audio files created and ready to go, here’s how to create your AI transcripts:

Click on the ‘Dictate’ drop-down on the top right of Word’s ribbon. Then the ‘Transcribe’ option displays. Click it.

Click on ‘Upload audio.’

Choose the audio file you want to use. The Transcribe feature activates.

The full transcription will appear on the right after 30 seconds or so, depending on the length of your audio file. Click ‘Add to document.’ Click ‘With speakers and time stamps.’

Click on ‘File.’ Choose ‘Save as.’ Choose ‘Download a copy.’

Done! That’s it.

Now you’ve got your AI-driven transcript with time stamps that you can easily work with.

Perfection Not Required
If you’re planning to edit a video using these transcripts, you can select your sound bites by simply highlighting the sentences you want. From there, you can create your paper edit. (A paper edit is the roadmap you’ll follow when doing your actual video editing.)
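If you’d rather not highlight by hand, a few lines of Python can pull candidate sound bites out of the transcript for you. This is only a rough sketch. It assumes you’ve saved the transcript as plain text and that each section starts with a time stamp and speaker line (like “00:00:07 Speaker 2”) followed by the spoken words. Your file’s layout may differ, and the function names and sample text are invented for illustration.

```python
import re

# Assumed layout: an "HH:MM:SS Speaker N" line, then the spoken text below it.
HEADER = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.+)$")


def parse_transcript(text):
    """Split a transcript into a list of segments with time, speaker and text."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            segments.append({"time": match.group(1), "speaker": match.group(2), "text": ""})
        elif segments:
            # Spoken lines belong to the most recent time stamp header.
            segments[-1]["text"] = (segments[-1]["text"] + " " + line).strip()
    return segments


def paper_edit(segments, keyword):
    """List the segments that mention a keyword, ready to paste into a paper edit."""
    return [
        f"{s['time']} {s['speaker']}: {s['text']}"
        for s in segments
        if keyword.lower() in s["text"].lower()
    ]


sample = """00:00:03 Speaker 1
Tell me about your first camera.

00:00:07 Speaker 2
My first camera was a gift from my grandfather.
"""

for line in paper_edit(parse_transcript(sample), "grandfather"):
    print(line)
```

The time stamp on each matching line tells you where to find that sound bite when you sit down to edit.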

Is the AI transcript from Microsoft perfect? Not at all! But it’s good enough to select the sound bites you need.

Not Subtitle-Ready
Warning: If you’re eventually planning on taking the next step to create subtitles/captions from these clips for a final video, you’ll still have some ‘human-powered’ proofing work ahead of you (as there are plenty of AI misspellings and misinterpreted words).

Easy and Fast Solution
In the old days, people would transcribe long audio or video files themselves. If you paid someone to do this, it could cost hundreds of dollars. Now, AI has effectively taken over this painstaking task.

Though imperfect, it gets the job done.
(And don’t forget: AI transcription technologies will only continue to improve over time.)

Plus, it’s free and already baked into your Microsoft 365 account.
(Yes, you do need to pay for Microsoft 365, but then this is a great way to maximize that necessary investment.)

They say emerging AI will continue to make our lives ‘easier.’ I’m happy to report that this is just another example.

How AI can Fix your Low-Resolution Photos

If you’ve got an old digital photo that looks grainy when you crop in, it’s time to add in more pixels with a little AI assistance. On the left, this cropped photo of our cat from 2008 benefits from 4x more pixels generated by Adobe Lightroom.

We all know the famous scene in the 1982 sci-fi movie “Blade Runner” where Harrison Ford’s futuristic detective inserts a photo into a computer and tells it to zoom in and enhance the clarity of the background until he finds a person hidden in a reflection from a tiny mirror.

No, we can’t tell today’s computers to scan a photo, “track 45 left” and then “enhance 15 to 23” to find what’s there. But we’re getting closer.

That’s thanks to today’s software that can increase resolution in lower-res photos while maintaining the quality (and without adding digital artifacts). This trick can also clean up jaggy edges that become more apparent when you zoom into a low-res pic.

Often, when you crop in too tight on a photo, grainy problems show up, because you’ve deleted too many of the pixels. You’ve suddenly created a low-res photo that clearly needs pixel infusion.

Enhance Tool is Not Science Fiction
Adobe Lightroom can help. It has an AI-powered upsampling ‘Enhance’ feature called ‘Super Resolution.’ This nifty tool creates a duplicate photo with four times the pixels. And that can make a significant difference.

Here’s how to ‘enhance’ a digital photo in Lightroom:

  • Click on the Photo dropdown on the top menu
  • Click on Enhance
  • Click on Super Resolution
  • Then click Enhance
    (You can preview the effect before you proceed.)
  • Voilà! An ‘enhanced’ file is generated in a DNG format.
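Where does “four times the pixels” come from? Super Resolution doubles both the width and the height of the photo, and two times two is four. Here’s the arithmetic as a tiny Python sketch, using a 1024 x 768 photo as the example. (The function name is just for illustration.)

```python
def super_resolution_size(width, height):
    """Double each side of the photo, which quadruples the total pixel count."""
    return width * 2, height * 2


width, height = 1024, 768
new_width, new_height = super_resolution_size(width, height)

print(f"Original: {width} x {height} = {width * height:,} pixels")
print(f"Enhanced: {new_width} x {new_height} = {new_width * new_height:,} pixels")
```

That takes the example photo from 786,432 pixels to 3,145,728 pixels.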

There are other companies that offer similar solutions, but as Adobe Lightroom Classic is my main photo-editing and organization tool, I’m very happy to keep my workflow in one place.

A Useful Tool for the Right Circumstances
I’ve used this enhance trick mostly when I work with digital photos that I took twenty years ago. That was, of course, during the early age of digital photography when original file sizes were relatively tiny.

It’s a helpful solution, but this tool is not magic. It can’t create what’s not there or fix a blurry photo. But it does add in a bit more visual crispness, even if you’re not having a pixelization problem.

It’s also quite helpful if you want to print out the photo. A physical print is usually more unforgiving than a computer screen.

Adding Pixels into My Old Photos
Here’s a photo I took of an actor playing a Klingon at the Star Trek Experience in Las Vegas back in 2001.
The original photo file was only 1024 x 768 pixels. I’ve cropped it in tight to just 198 x 264 pixels. The enhanced version on the left gets our friendly Klingon up to 396 x 598, which does make a difference.

Here’s a street shot I took in Hong Kong in 2005.
The enhanced shot on the left helps to bring out the background. You can also make out some of the car’s license plate letters.

Smile for AI
If you’ve found yourself having to squint to pick out the above differences, that’s okay. They’re minor, but they’re there. I think it’s fair to say that Adobe Lightroom’s ‘Super Resolution’ mostly gives you minor sharpening.

It’s not a magic wand, but it does give you 4x more pixels to work with out of thin air.

With AI’s text-to-image capabilities already in common use today, I’m sure this is not the last time we’ll be discussing how AI can rebuild old photos in just a few clicks.