Missing Modes

/images/OIG2.zPWYqhAF.jpeg

As my Norwegian text skills get better, I’m increasingly annoyed that my acoustic conversational skills are not, but it’s because I still haven’t found a practice method or tool that suits me. So today I’m putting some more concrete thought into what I actually want.

◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇

My grammar is decent, my vocabulary is good, and reading is without question my strongest skill in norsk, but I don’t have access to a regular conversation partner or native speaking coach, so I’m just not advancing very fast here in the “language noises” department.

I know I can find partners online, but I want to be able to practice frequently - whenever I have a few free minutes. I’m all about the spontaneity, but that gets strangled in the crib when you have to schedule things in advance with strangers. And there’s not much point in getting a coach if I don’t have a way to practice effectively between coaching sessions.

For a while, I tried working with AI tools for this, but I haven’t found a combination of tools and workflows that work for me there either. So, bottom line, it’s time to get more specific with how I’m going to get Frankie involved.

There are three dimensions I want to work on, which I’ll call: brain training, ear training, and tongue training.

Brain Training: the ability to compose sentences quickly to express what I’m trying to say. In other words, deciding what to say, and doing so quickly enough for an interactive conversation.

Ear Training: the ability to hear spoken norsk, at conversational speed, and understand what is being said.

Tongue Training: the ability to speak smooth and natural sounding norsk.

To get there, I’ll be adding some new practice modes to FrankenTongues, and to get maximum effectiveness from them, I want them to function without relying on text. The drill prompts should be audio only, so even the interface will be reinforcing the skills I’m trying to focus on.

Tongue Training

To build my speaking skills, I’m simply going to create an audio database of norsk sentences. The drill will consist of a norsk sentence being played, which I then have to repeat. This separates the tongue training from the brain training, so I won’t be distracted by trying to figure out what to say. I can focus entirely on the intonations of each part of the sentence. The objective is to match both the individual word pronunciations and the rhythm and melody of the sentence as a whole.

Feature-wise I need to be able to replay the prompt as often as necessary, but each repetition should carry a slight penalty in terms of scoring. Judging success will be strictly subjective, but whatever score I assign will be reduced by the accumulated repetition penalties.
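As a concrete sketch, the replay penalty could work something like this. The 0–10 scale and the half-point-per-replay cost are placeholder numbers I made up for illustration, not anything I’ve settled on:

```python
# Hypothetical replay-penalized scoring for a drill.
# PENALTY_PER_REPLAY and the 0-10 scale are illustrative assumptions.
PENALTY_PER_REPLAY = 0.5

def drill_score(self_score: float, replays: int) -> float:
    """Reduce a subjective 0-10 score by the accumulated replay penalty.

    `replays` counts repetitions beyond the first playback; the result
    is clamped at zero so a hard sentence can't go negative.
    """
    return max(0.0, self_score - replays * PENALTY_PER_REPLAY)
```

So a sentence I rate an 8 after needing three replays would log as a 6.5, and that lower number is what flags it for more practice.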

And remember, scoring isn’t about achieving a game score - it’s about determining how well I perform on the given sentence so I can focus my practice sessions on the ones I have trouble with.

For now, normal speed audio is all I’m going to allow myself, but it is possible that I’ll want to add slow speech recordings as well.

Req: Norsk audio sentences

Level: Increasing complexity and length.

Ear Training

To build my ear for norsk, I have to start listening to more of it. But it hasn’t proven sufficient for me to just listen passively to norsk audio content. That’s not progressing my skills quickly at all, so I need more focused, intentional drilling.

For this I want a database of norsk audio sentences on random topics. Imagine taking 10 novels and cutting them up into individual sentences, then throwing all of those sentences into a bag and pulling them out randomly, one by one. That’s the drill.

For each prompt, I’ll play a random sentence from the database and have to provide a translation. Once I’ve locked in my translation, I’ll then play an English translation audio of the sentence and judge my success. My experience is that, when reading novels, many of my translations have been helped enormously by already knowing what’s going on in that scene. So by ditching any ability to rely on context to offer clues about meaning, I’ll be completely dependent on my ear to do the work.
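In code, the bag-of-sentences draw might look something like this. The record shape and file paths are placeholders, not an actual FrankenTongues schema:

```python
# Hypothetical "sentences in a bag" drill: each record pairs a norsk
# audio clip with its English translation audio. Paths are placeholders.
import random

sentence_bag = [
    {"norsk_audio": "clips/no/0001.wav", "english_audio": "clips/en/0001.wav"},
    {"norsk_audio": "clips/no/0002.wav", "english_audio": "clips/en/0002.wav"},
    {"norsk_audio": "clips/no/0003.wav", "english_audio": "clips/en/0003.wav"},
]

def shuffled_drills(bag, rng=random):
    """Yield every sentence once, in random order -- drawing without
    replacement, like emptying the bag. No scene, no context, ear only."""
    order = list(bag)
    rng.shuffle(order)
    yield from order
```

Drawing without replacement matters: every sentence comes up eventually, but never with its neighbors from the novel to lean on.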

Scoring of success will again be subjective, and penalized by the number of repetitions required before locking in.

Req: Norsk audio sentences

English audio translations

Level: Variety of complexities and lengths.

Brain Training

I can think of three different levels for this: being given a sentence in English and having to translate it into norsk; being given a scenario and having to formulate a response; being asked an open-ended question and having to compose a thoughtful reply. I’ll call these modes: mirroring, volleying, and interviewing.

Mirroring is easy enough, and can be implemented as the reverse of Ear Training. It even has the same data requirements, as the English translations become the prompt and the norsk sentences become the target replies.
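A sketch of what that reuse could look like, with the field names as placeholders rather than a real schema:

```python
# Hypothetical reuse of the ear-training records for mirroring:
# the same record, with the prompt and target roles swapped.
def as_ear_drill(record):
    """Ear training: norsk audio is the prompt, English is the reveal."""
    return {"prompt": record["norsk_audio"], "target": record["english_audio"]}

def as_mirror_drill(record):
    """Mirroring: English audio prompts, norsk is the target reply."""
    return {"prompt": record["english_audio"], "target": record["norsk_audio"]}
```

One sentence database, two drills for free.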

Volleying will require some thought. I’ll need to brainstorm a bunch of scenarios and see what works, in terms of the most effective kinds of prompts to work with and how to judge the responses.

Interviewing is going to be the hardest. Especially when it comes to judging the replies. This may require either a good AI tool or a live human coach. But for now, I can implement mirroring easily enough.

But regardless of which content I work from, and what form the drills take, I know I’ll need a way to generate English-language audio translations of the norsk sentences. So after investigating the current state of text-to-speech tools I can run locally in my lab, I’m pleased with the results I get from Coqui TTS. Here’s what the first few paragraphs of this article sound like when read by the Damien Black voice.

That’s good enough for me.

