At Home with Tech

Unlock the power of all your technology and learn how to master your photography, computers and smartphone.

Category: Tech Diary

Is ChatGPT’s Emotional Voice Assistant Getting too Personal?

The lines that define humanity have gotten a bit blurrier, now that it’s harder to differentiate between an interactive life-like AI voice and flesh and blood.

When watching science fiction, we accept it when a talking computer sounds like a real person. From Iron Man’s J.A.R.V.I.S. to the Starship Discovery’s Zora, it’s a common sci-fi character device. And, of course, there’s the mother of all talking computers… HAL. Some fictional computer voices are friendly. Others are not. But they all sound like us.

Well, it isn’t science fiction anymore. With GPT-4o in ChatGPT, we’ve now got a young, perky, friendly woman’s voice waiting to talk with you. And it seems entirely life-like with a total range of interactive emotions.

I don’t think OpenAI has given this new AI voice assistant a name yet, like Alexa or Siri. So, I’ll just call it Jane, the name I gave to my talking Garmin car GPS unit a few centuries back.

Well, you’ve done it, OpenAI. Yes, Jane seems alive.

Jane’s got Personality
I’m simultaneously enthralled and appalled. Sure, OpenAI presented the world with just a demo of this female AI voice interface, and it wasn’t perfect, but it was close enough. It was hard to tell if her Scarlett Johansson-like vibe was real or not. She certainly sounded like she had feelings.

The three on-camera people all laughed and talked with Jane about mostly frivolous topics. It all seemed so wonderful and natural. They were perfect humans having a virtual coffee with a digital proto-human at the edge of the ‘singularity.’ Just another day at the office.

What could possibly be concerning?

There’s another Barrett
I was distracted by a separate detail that hit a little closer to home. One of the human presenters was named Barrett. Yes. There aren’t too many first-name Barretts out there. So, that coincidence struck me. My inner-Spock eyebrow raised a tad. “Fascinating.”

Perhaps I should pay closer attention.

The demo proceeded to show off Jane’s skills. She wasn’t just a voice. She had eyes too. She can see and process information through your phone’s camera. Yes.

Then, Jane complimented Barrett on what he was wearing. It felt strangely personal.

Okay. Now, I think we’ve crossed beyond the typical definition of a phone app.

And then I fell down the rabbit hole…

Is Humanity Replaceable?
I can’t stop thinking about the season 3 finale to “Westworld” (2020) when the evil Man in Black, played by Ed Harris, comes face to face with his robot host duplicate and realizes there’s no difference between them. He is entirely replaceable.

And I happened to recently stream “Mission: Impossible – Dead Reckoning Part One” (2023) during family movie night. The AI ‘Entity’ is of course the scary omniscient villain in the background. We never really get to meet it, but the self-aware AI seems impossible to beat. (We’ll have to wait until next summer to find out how Tom Cruise figures out the key solution.)

Fiction writers have forever been telling scary stories about computers gone amok. The Terminator. Ultron. Better-Stronger-Faster. (Wait, that’s just Steve Austin. Never mind.)

We’re in Control?
We’ve been trained for years to fear a superior AI-driven entity that will simply take over one day.

Now, I’m not sure anyone knows what’s going to happen when a computer actually becomes self-aware. But I don’t think we’re there yet.

Friendly Jane is just a new ‘emotion-simulation’ interface from ChatGPT. It’s a tool for us to use.

ChatGPT and other generative AI chatbots are supposed to help us do certain things faster. And they certainly do.

So, why the fuss?

Identity Crisis
I think our deeply embedded human fear of a Skynet overlord is partially a byproduct of years of exposure to scary storytelling.

Is this a branding problem to solve? Clearly, Barrett and his OpenAI colleagues are trying to address that with their very helpful Jane.

But I believe we’re also struggling with this redefining moment of what it really means to be human.

Artificial Human?
Did people feel threatened when the pocket calculator was introduced? Or the PC? Or the act of Googling? I don’t think so.

Sure, ChatGPT can process and present information faster than any human mind. But computers already passed that threshold years ago. We know that.

What’s so different now that there’s simply a young, engaged female ‘human’ voice attached to that interface?

Have we crossed over some invisible line of authenticity that defines our very identity as a species?

Maybe.

Activate your Inner John Connor
What’s clear is we are in the middle of an insanely rapid technological evolution. And if you want to know what it is to be human in the 21st century, you may be forced to redefine it a bit.

And so, you’d better figure out how to control the tools that are already doing what yesterday only we could do.

This is not a choice.

For starters, it’s time to learn how to be a good ‘prompt engineer.’ I guarantee tomorrow’s children will grow up being experts at this the same way yesterday’s toddlers intuitively knew how to navigate the first iPads.

Pay Attention
Don’t we already know that a pretty voice and manufactured beauty shouldn’t be a defining characteristic of any real person?

Will we need to pay more attention in the future when presented with reasonable facsimiles of the human form and function? Absolutely.

If you spot your doppelganger tomorrow on the street staring at you, you probably have something to worry about.

But I think eventually having a helpful J.A.R.V.I.S. in your life can be productive, empowering and even nurturing.

…As long as you don’t forget ‘what’ you’re dealing with. It’s the ‘what.’ Not the ‘who.’

Jane is not alive.

That’s the line we don’t want to cross.

Chasing the Bloom to Capture the Magic of Spring

I always enjoy the experience of capturing blossom bliss with my camera. Here are a few of my photos.

When the flowers begin to bloom, and the spring cherry blossoms pop, it’s absolute magic. But it’s always too fleeting. Days. Maybe a few weeks. And then suddenly, summer is just around the corner.

That’s not so bad, but I think no other season can beat spring in New England.

Each year, I grab my Lumix camera (or simply use my iPhone) to capture the arc of this annual display throughout my neighborhood. It’s all so beautiful, from the early buds to the fallen blossoms near the end.

Here’s what nature graciously presented to me this year…

Master the Art of Transcribing Speech in an Audio File with Microsoft Word’s AI Magic

If you want to convert an audio file into an AI-generated text transcript for free, you could find a robot from the future to handle the task. But look no further than your Microsoft 365 subscription. Here are the quick and easy steps to upload and transcribe your file using Word on the web.

Many AI transcription tools available today convert recorded speech to text from audio and video files. AI voice-to-text conversion isn’t perfect, but it’s getting better all the time.

The easy ‘pro’ solution is to use Adobe Premiere Pro, which integrates strong transcription powers into its video-editing interface. But if you’re not in the Adobe ecosystem or looking for other solutions that don’t require video editing, you’ll need to look elsewhere.

Most standalone AI transcription tools do cost money after a limited free trial. In doing research for a personal project, I tried to identify a low-cost or free AI solution.

And I was hoping to find it with a recognizable brand I felt I could trust. Happily, I discovered that Microsoft effectively offers what I needed.

And the Microsoft solution is free (as long as you already pay for a Microsoft 365 account).

Microsoft Word on the Web via your Microsoft 365 Account
There are a few important details to take care of before getting started on your transcription journey with Microsoft:

  1. You have to use the web version of Microsoft Word via your Microsoft 365 account. (That’s the key to opening this free transcription door.)
  2. You also need to use Chrome or Edge (not Safari… which seemingly doesn’t offer the ‘Transcribe’ feature).
  3. If you’re working with a video file, you first need to convert your video to an audio file. Yes, Microsoft Word transcription only lets you upload audio files, and that creates an extra step (but it’s worth it).

To make the audio file, I use the ‘Compressor’ app on my Mac. (You can also use QuickTime.) The conversion process goes surprisingly fast. (I converted to MP3s with Compressor, but you can convert to other audio formats.)
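If you’d rather script the conversion than open an app, here’s a minimal Python sketch. It assumes you have the free command-line tool ffmpeg installed, which is not what I used (I stuck with Compressor), so treat it as one possible alternative. The function names and the ‘interview.mp4’ file name are just placeholders for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path):
    """Build the ffmpeg command that extracts an MP3 from a video file.

    Assumption: ffmpeg is installed and on your PATH.
    """
    source = Path(video_path)
    target = source.with_suffix(".mp3")  # same name, .mp3 extension
    return [
        "ffmpeg",
        "-i", str(source),         # input video
        "-vn",                     # drop the video stream, keep audio only
        "-codec:a", "libmp3lame",  # encode the audio as MP3
        "-q:a", "2",               # variable bit rate, high quality
        str(target),
    ]


def convert_to_mp3(video_path):
    """Run the conversion and return the path of the new audio file."""
    command = build_ffmpeg_command(video_path)
    subprocess.run(command, check=True)  # raises an error if ffmpeg fails
    return command[-1]


if __name__ == "__main__":
    # Hypothetical file name; swap in your own video.
    print(build_ffmpeg_command("interview.mp4"))
```

Calling convert_to_mp3 on your video should leave an MP3 next to the original file, ready to upload to Word.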

Step-by-Step Guide to your AI Transcript

Once you’ve got your audio files created and ready to go, here’s how to create your AI transcripts:

  1. Click on the ‘Dictate’ drop-down on the top right of Word’s ribbon. Then the ‘Transcribe’ option displays. Click it.
  2. Click on ‘Upload audio.’
  3. Choose your audio file to be used. The Transcribe feature activates.
  4. The full transcription will quickly appear on the right after 30 seconds or so, depending on the length of your audio file. Click ‘Add to document.’ Click ‘With speakers and time stamps.’
  5. Click on ‘File.’ Choose ‘Save as.’ Choose ‘Download a copy.’

Done! That’s it.

Now you’ve got your AI-driven transcript with time stamps that you can easily work with.

Perfection Not Required
If you’re planning to edit a video using these transcripts, you can select your sound bites by simply highlighting the sentences you want. From there, you can create your paper edit. (A paper edit is the roadmap you’ll follow when doing your actual video editing.)
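If your transcript is long, a little Python can help you hunt for sound bites by keyword. This is only a rough sketch. It assumes you’ve saved the transcript as plain text and that each segment starts with a line like ‘00:00:05 Speaker 1’ followed by the spoken words. I haven’t confirmed that this is exactly how Word lays out its export, so adjust the pattern to match your own file. The names below are mine, not Microsoft’s.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed header layout (check it against your own file): "HH:MM:SS Speaker name"
HEADER = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.+)$")


@dataclass
class Segment:
    timestamp: str
    speaker: str
    text: str


def parse_transcript(raw: str) -> List[Segment]:
    """Split a time-stamped transcript into segments."""
    segments: List[Segment] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            # A header line starts a new segment.
            segments.append(Segment(match.group(1), match.group(2), ""))
        elif segments:
            # Any other line is spoken text for the current segment.
            last = segments[-1]
            last.text = (last.text + " " + line).strip()
    return segments


def find_sound_bites(segments: List[Segment], keyword: str) -> List[Segment]:
    """Return the segments whose text mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [s for s in segments if needle in s.text.lower()]


# Made-up sample text, for illustration only.
SAMPLE = """00:00:01 Speaker 1
Welcome to the show.
00:00:05 Speaker 2
Thanks. Spring in New England is magic.
"""

if __name__ == "__main__":
    for bite in find_sound_bites(parse_transcript(SAMPLE), "spring"):
        print(bite.timestamp, bite.speaker, bite.text)
```

The time stamps it prints tell you where to find each clip once you sit down to edit.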

Is the AI transcript from Microsoft perfect? Not at all! But it’s good enough to select the sound bites you need.

Not Subtitle-Ready
Warning: If you’re eventually planning on taking the next step to create subtitles/captions from these clips for a final video, you’ll still have some ‘human-powered’ proofing work ahead of you (as there are plenty of AI misspellings and misinterpreted words).

Easy and Fast Solution
In the old days, people would transcribe long audio or video files themselves. If you paid someone to do this, it could cost hundreds of dollars. Now, AI has effectively taken over this painstaking task.

Though imperfect, it gets the job done.
(And don’t forget: AI transcription technologies will only continue to improve over time.)

Plus, it’s free and already baked into your Microsoft 365 account.
(Yes, you do need to pay for Microsoft 365, but then this is a great way to maximize that necessary investment.)

They say emerging AI will continue to make our lives ‘easier.’ I’m happy to report that this is just another example.