Looting The Battlefield

/images/_91cd6bf6-9ea8-4a54-b099-89afdaaa47fc.jpeg

I’ve been experimenting for months with different ways to convert podcast content into a form I can use in Frankie. I’ve found several solutions, but along the way, I’ve left the resulting file tree in absolute chaos. I’ve got files in a dozen directories and an equal number of formats. Fragments of scripts, clips, and tools are scattered throughout the hierarchy like bodies left to rot on a medieval battlefield. There’s no consistency, little documentation, and worse, still no definitive process.

Well, it’s time to loot those bodies, gather the useful bits into a proper plan, and bury whatever is left.

◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇

Today, my process for ingesting a Norwegian podcast looks something like this:

  1. Download the audio

  2. Download the transcript

  3. Split the transcript into sentences

  4. Determine timecodes in the audio to match each sentence

  5. Split the audio into sentence clips

  6. Generate slow machine audio for the Norwegian sentences

  7. Translate the transcript sentences into English

  8. Generate machine audio of the English sentences

  9. Pack everything into one of the file archive formats I’ve taught Frankie to understand

My recent road trip experiments have left me wondering if I should rethink Frankie’s default document format, but regardless of how I end up packaging the results, I need to get a handle on Steps 1 through 8 and then clean up the file battleground.
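As a rough illustration of Step 5, here is a minimal sketch of how sentence clips could be cut once the timecodes from Step 4 exist. It is not my actual script: the function name `clip_commands`, the output file pattern, and the choice to shell out to ffmpeg are all assumptions made for the example. It only builds the command lines, so it can be inspected without touching any audio.

```python
from typing import List, Tuple


def clip_commands(
    audio_path: str,
    timecodes: List[Tuple[float, float]],
    out_pattern: str = "clip_{:04d}.mp3",  # hypothetical naming scheme
) -> List[List[str]]:
    """Build one ffmpeg command per sentence from (start, end) times in seconds."""
    commands = []
    for index, (start, end) in enumerate(timecodes, start=1):
        if end <= start:
            raise ValueError(f"sentence {index}: end {end} is not after start {start}")
        commands.append([
            "ffmpeg",
            "-i", audio_path,
            "-ss", f"{start:.3f}",  # clip start, in seconds
            "-to", f"{end:.3f}",    # clip end, in seconds
            out_pattern.format(index),
        ])
    return commands


# Example: two sentences from a hypothetical episode file.
example = clip_commands("episode.mp3", [(0.0, 2.5), (2.5, 6.125)])
```

Each command list could then be handed to `subprocess.run` one at a time. Building the list first makes it easy to check the cut points before any files are written.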

I have scripts to help streamline every step except Step 4, which is still painfully slow and has to be done manually: I use a subtitle editor to listen carefully to the audio and mark the start and end of every sentence. This takes roughly three times as long as the audio itself runs, making it the biggest hurdle to expanding my library of practice documents.

Finding a faster way to do this step has been the main cause of the carnage of files scattered around here, so solving it is the key to any real progress in cleaning them up.

Sentence Timing

Since I already have the transcripts split into sentences, I should be able to use what’s called “forced alignment”, an AI process that matches a transcript with the corresponding sections of audio.

I’ve done it before, with English audio, but I’m not sure if the AI models for Norwegian are available in a form I can work with. I also remember being disappointed by a lack of accuracy with the English version. The timecodes were in roughly the right spot, but often clipped off the beginning or end of the passages. So this is going to be another exploration rather than a straightforward coding exercise.
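One possible workaround for the clipping problem is to widen each aligned segment by a small margin, without letting it spill into its neighbours. The sketch below is only an idea to try, not something I’ve tested on real alignments. The function name `pad_timecodes`, the 0.15 second default, and the rule of stopping at the midpoint of the gap between sentences are all assumptions.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def pad_timecodes(
    segments: List[Segment],
    pad: float = 0.15,                       # assumed margin, tune by ear
    total_duration: Optional[float] = None,  # audio length, if known
) -> List[Segment]:
    """Widen each segment by `pad`, stopping halfway into the gap to its neighbours."""
    padded = []
    for i, (start, end) in enumerate(segments):
        new_start = max(0.0, start - pad)
        new_end = end + pad
        if total_duration is not None:
            new_end = min(new_end, total_duration)
        if i > 0:
            # Do not reach back past the midpoint of the gap before this sentence.
            midpoint = (segments[i - 1][1] + start) / 2
            new_start = max(new_start, min(start, midpoint))
        if i + 1 < len(segments):
            # Do not reach forward past the midpoint of the gap after this sentence.
            midpoint = (end + segments[i + 1][0]) / 2
            new_end = min(new_end, max(end, midpoint))
        padded.append((new_start, new_end))
    return padded


# Example: a tight gap between the first two sentences, a wide one before the third.
example = pad_timecodes([(1.0, 2.0), (2.1, 3.0), (5.0, 6.0)], total_duration=6.05)
```

In the example, the first two sentences meet in the middle of their 0.1 second gap, while the third gets the full margin at its start and is capped at the end of the audio.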

I don’t have a clear plan yet, but I’ll set off in roughly that direction (points to the horizon) and see what I find. Stay tuned.

