Little me was never this cute

Remember MidJourney, the artificial intelligence that turns text prompts into images? Turns out it can also turn images into… more images! So I gave MidJourney a picture of myself from my journal and let it use its imagination. That was… interesting.

Cute little redhead

Pretty sure I never was quite this cute! Although I am sure my mother would not have minded, God rest her soul. She told me a couple of times in my early youth that she had hoped for a girl this time (after three boys) and someone had even congratulated her on finally getting a girl, but that turned out to not be the case. Instead she got me. I didn’t mind hearing that, for by then I already knew that she would have gone barefoot through Hell and back for me if necessary. Not because I was cute, but because she was my mother.

I never had any kids myself. Not only because you still need to have icky, unhygienic sex to make babies (we have the technology to skip that, but most women still insist on doing it that way) but then there would be the daily struggle for two decades to not murder the little monsters, if they were anything like me. Maybe if I had cute kids like this, I would have managed. But let’s face it, there’s no way my little kids could be this cute. And neither could I.

(Machine) learning is not theft

Hermione Granger by Edvard Munch

Hermione Granger (from the Harry Potter series) painted by Edvard Munch. If you think MidJourney here is plagiarizing Munch’s original, I have a very expensive bridge to sell you.

I have recently mentioned using artificial intelligence to create visual art. Text-to-image applications like DALL-E 2, MidJourney, and Stable Diffusion all use machine learning based on enormous numbers of pictures scraped from the Internet. Now some contemporary artists have discovered that some of their work is part of the data used for training the AI, and are upset that they have not been asked and have not been compensated.

This reaction is caused by their ignorance, of course. I can’t blame them: Modern society is very complex, and human brains are limited. Yes, even mine. I could not fix a car engine if my life depended on it, for instance. I have only vague ideas of what it would take to limit toxic algae bloom. And to be honest, I could not make my own AI even if I had the money. I just happen to have a very loose idea of how they work because it interests me, because I don’t have a family to worry about, and because I don’t have a job that requires me to spend my free time thinking about it.

Anyway, I shall take it upon myself to explain why you should politely ignore the cries of the artists who feel deprived of money and acknowledgment by AI text-to-image technology.

The fundamental understanding is that learning is not theft. I hope we can agree on this. Obviously, there are exceptions to this, such as industry secrets like the recipes for Coca-Cola or the source code for Microsoft Windows. If someone learns those and uses them to create a competing product, it is considered theft of intellectual property. But if an art student studies your painting along with thousands of other paintings, and then goes on to paint their own paintings, that is not theft. If they make a painting that is a copy of yours, then yes, that is plagiarism and this infringes on your copyright. But simply learning from it along with many, many others? That is fair use, very much so. If you don’t want people to learn from you, then you need to keep your art to yourself. You can’t decide who gets to look at your art unless you keep it private.

The excitement is probably based on not knowing how the “diffusion” model of AI works. So let me see if I can popularize that. Given our everyday use of computers, it is easy to think that the AI keeps a copy of your painting in its data storage and can recall this at some later time. After all, that is what Microsoft Office does with letters, right? But machine learning is a fundamentally different process. The AI has no copy of your artwork stored in its memory, just a general idea of your style and of particular topics. This stems from how “diffusion” works.

When a program like MidJourney or Stable Diffusion gets a text prompt, it starts from a “diffuse” canvas covered in a single shade of color (or grayscale, if a black & white image is requested). It then goes through many steps of moving these pixels into shapes that fit the description it has been given. (It can do this because it has gone through the opposite process millions of times, gradually blurring the images away. Thus the name “diffusion”.) You can, if you have the patience, watch the images gradually become less and less diffuse, slowly starting to resemble the topic of the prompt. In other words, it starts with a completely diffuse image that becomes clearer and clearer. You can upscale such an image and the AI will add details that seem appropriate for the context. (Especially until recently, this could include adding extra fingers or even eyes, but the latest editions are getting better at this.)
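
For the programmers among you, here is a toy sketch of the “opposite process” mentioned above: blending a picture with random static, a little at a time, until nothing recognizable is left. This is not how MidJourney or Stable Diffusion is actually implemented. The function names, the use of Gaussian noise, the 10% of static per step, and the one-row 256-pixel “picture” are all assumptions made purely for illustration.

```python
import math
import random


def add_noise(pixels, keep, rng):
    """Blend every pixel with random static.

    `keep` is the fraction of the picture's variance that survives this step.
    """
    signal = math.sqrt(keep)
    static = math.sqrt(1.0 - keep)
    return [signal * p + static * rng.gauss(0.0, 1.0) for p in pixels]


def forward_diffusion(pixels, steps, keep=0.9, seed=0):
    """Return the picture after 0, 1, 2, ... `steps` rounds of added static."""
    rng = random.Random(seed)
    history = [list(pixels)]
    for _ in range(steps):
        history.append(add_noise(history[-1], keep, rng))
    return history


def similarity(a, b):
    """Pearson correlation: 1.0 means the same pattern, near 0 means unrelated."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# A hypothetical one-row "picture": a smooth gradient from dark (-1) to light (+1).
picture = [i / 127.5 - 1.0 for i in range(256)]
history = forward_diffusion(picture, steps=50)

print(similarity(picture, history[1]))   # still close to the original
print(similarity(picture, history[50]))  # mostly static by now
```

A model that has watched this happen millions of times can learn to run it backwards, one small step at a time, and that backwards run is the part that actually produces a picture.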

It is worth noticing that there is also an excessively long random seed included in the process, meaning that you could give the AI the same prompt thousands and thousands of times and get different versions of the image every time. Sometimes the images will be similar, sometimes strikingly different, depending on how detailed your request is. Once an image catches your eye, you can make variants of it, and these too have a virtually unlimited number of variations.
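
To make the role of the seed concrete, here is a minimal sketch, assuming the seed simply initializes an ordinary pseudo-random number generator like the one in Python’s standard library. The function name starting_canvas and the 16 gray levels are hypothetical, and the sketch makes no claim about how any particular image generator handles its seeds.

```python
import random


def starting_canvas(seed, size=16):
    """Return `size` pseudo-random gray levels (0-255) determined entirely by `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(size)]


same_a = starting_canvas(seed=2022)
same_b = starting_canvas(seed=2022)
other = starting_canvas(seed=2023)

print(same_a == same_b)  # True: the same seed always reproduces the same canvas
print(same_a == other)   # a different seed gives a different canvas
```

Since the prompt stays the same and only the seed changes, every new seed is a new starting point, and that is where the endless variety comes from.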

At no point in this process does the AI bring up the original image, because there are no original images stored in its memory, just a general, diffuse idea of what the topic should look like. And in the same way, it only has a general, diffuse idea of what a particular artist’s style is. My “Munch” paintings certainly look more like Munch than Monet, but it is still unlikely that Edvard Munch would actually have painted the exact same picture. In this case, of course, it is literally impossible, and that is exactly the scenario where we want to use engines like these. “What if Picasso had painted the Sistine Chapel? What if Michelangelo and van Gogh had cooperated on painting a portrait of Bill Gates?” The AI is simply not optimized for rote plagiarism, but for approximation. It is like a human who spent 30 years in art school practicing a little of this and a little of that, becoming a pretty good jack of all trades but a master of none. They can’t exactly recall any of the tens of thousands of pictures they have been practicing on, but they’ve got the gist of it.

***

As for today’s picture, it was made by MidJourney using the simple prompt “Hermione Granger, painted by Edvard Munch --ar 2:3” where ar stands for aspect ratio, the width compared to the height. This generated four widely different pictures, and I chose one of them and asked for variations of that. This retains the essential elements of the picture but allows for minor variations as you see above. So it is not because the AI had an original picture to plagiarize – I asked it to make variations on its own picture. With some AI engines, you can in fact upload an existing picture and modify it, but this is entirely your choice, just like if you modify a picture in GIMP or Photoshop. The usual legal limitations apply: you cannot hide behind “an AI did it!”. So far, AIs are not considered persons. Maybe one day?
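
If the aspect ratio arithmetic is unclear, here is a small sketch of what a 2:3 ratio means for the shape of the picture. The function name and the example widths are my own assumptions for illustration; the sketch says nothing about the pixel sizes MidJourney actually produces.

```python
def size_for_aspect_ratio(ratio, width):
    """Given a 'W:H' ratio string and a width in pixels, return (width, height)."""
    w_part, h_part = ratio.split(":")
    w, h = int(w_part), int(h_part)
    if w <= 0 or h <= 0:
        raise ValueError("both sides of the ratio must be positive")
    return width, round(width * h / w)


print(size_for_aspect_ratio("2:3", 512))    # (512, 768)
print(size_for_aspect_ratio("16:9", 1920))  # (1920, 1080)
```

So a 2:3 picture is a portrait format, one and a half times as tall as it is wide.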

 

Suddenly Sudowrite

Children playing ball, impressionist image

Children playing Calvinball, as imagined by the AI art program MidJourney. Clearly today’s rule is “Bring your own ball”. Luckily today’s main character has that and a spare.

Returning readers will probably not be surprised to learn that I have written millions of words in my lifetime. That doesn’t really take much. I usually write a couple of thousand words at least on an average day when I am not sick, and that’s not counting anything I might write for my job. As you may guess, “writer’s block” is not really my thing, because my writing is like the old house by the river where I lived in 2010, which had three outer doors plus a shed roof you could climb out on from the upper floor. If one of the exits were to be blocked by the copious snowdrifts we had in winter, I could simply use one of the others. And so it is also with my writing.

I am very nearly the worst imaginable candidate, I guess, for the Artificial Intelligence-driven creative writing tool called Sudowrite. It is specifically designed to combat this mysterious phenomenon, “Writer’s Block”, that many writers claim to have experienced. Naturally, I had to try it. (Not writer’s block, but Sudowrite.)

I had read a few reviews (and watched a couple more on YouTube) and they mentioned that you have to apply for access, then after a couple of days you will get an invitation and then you can join. So I signed up, planning to use those couple of days to read more practical reviews and how-tos so I would be prepared when the invitation came. Instead, after I had signed in with Google (Facebook is also accepted) I suddenly found myself on a website that was, in fact, Sudowrite. It gave a very quick tour of the most central couple of features, then left me to my own devices. Luckily there was a link to a (still very brief) Sudowrite guide. But otherwise, I felt much like Galadriel in Amazon’s hilarious new Lord of the Rings parody, where she has rashly jumped into the ocean en route between Middle-Earth and the Undying Lands. Now what?

***

The obvious choice, I thought, would be to copy the prelude, not quite 1000 words long, to my latest fiction story. In this scene, the narrator picks up a very unnatural-looking crystal that he found embedded in a stone, and immediately falls into the Nexus of Worlds, which is (very obviously, I thought) the user interface to an alien virtual reality simulator that uses Artificial Intelligence indistinguishable from magic to produce a world based on the user’s memories. In this case, he is sent back to 1999, but a 1999 with magic.

Now I tasked Sudowrite with writing a continuation. It proposed two very different passages. I took the first one, deleted the crazy plot twist, and edited the rest. Then I wrote my own short continuation, introducing the Ultimate Book of Magic which is the central item in my actual Work in Progress. I asked my new friend Sudowrite to describe the look of the book, and I actually kept most of that. Sudowrite is really good at feminizing novels by proposing all kinds of sensory information, going into detail on how things look, feel, sound, smell, and taste. In case you wonder how the Ultimate Book of Magic smells, I can tell you now: “The worn leather smelled like a library. Like the smell of wood and old paper. The book smelled of mold and dust. The metal clasp like rusting iron and blood.” And should you be so lucky as to get your hands on it, you would notice that “The cover was worn and smooth. If I had to guess I’d say it was oiled, but there was no sheen to it. It looked like it had been oiled a thousand times.”

(I am told women love that books contain all kinds of sensory detail. I noticed it first in Clan of the Cave Bear, which contained more information about Ice Age vegetation than my encyclopedia at the time. If I were to add stuff like that, my books would be thousands of pages long. Y’all know how verbose I can be even without that kind of peacock tailfeathers. In all fairness, it is not like all male writers excel at self-limitation. There is, one might say, no such thing as a tad Williams.)

Anyway, Sudowrite and I continued taking turns writing a couple of paragraphs each. I would delete wild plot twists, edit the rest, then try to steer things back on track in my own paragraphs. It feels quite like trying to write a collab with Calvin from Calvin & Hobbes. There is always a lurking sense of Calvinball, defined from the horse’s mouth:

“Other kids’ games are such a bore!
They gotta have rules and they gotta keep score!
Calvinball is better by far!
It’s never the same! It’s always bizarre!”

You may as well memorize this little verse before you start writing fiction with Sudowrite. Or nonfiction, for that matter, because Sudowrite will play Calvinball there too, unlike the various AI writers that are tailor-made for writing ads and paid blog posts. (Rest assured this post is NOT paid by Sudowrite or even their competitors.) Quote TechCrunch: “Asking Sudowrite to describe what a startup is had me laughing so hard I was gasping for air.” Yeah, I can imagine. That is, after all, the purpose of Calvinball. And Sudowrite is nothing if not Calvinball.

That said, it is a true Artificial Intelligence. The more you work with it, the more it gets to know you (and the other way around). Take the following sequence, where you can hardly see where one of us leaves off and the other takes over:

“Apart from the proper Sigil, and the correct postures and incantation, the Affinity of the Binding is limited by the quality of the Exemplar – the object symbolizing the Source – and the mage’s natural Resonance with the Source.”
“If the mage’s Resonance is weak, the mage will need to use an extremely potent and pure Exemplar. If the mage has a strong Resonance, they will have more leeway in choosing an Exemplar, and if the mage already has a strong Affinity with the Source, simply having a properly prepared Exemplar may be enough.”
Hooray for hyperlexia!

Here the first paragraph is by me, while the explanation is by Sudowrite, except for a single word added. And yes, it was Sudowrite that wrote “Hooray for hyperlexia!”
Shut up and take my money, Sudowrite.

***

I may not actually use this in my writing (except perhaps during NaNoWriMo) but, in the winged words of Sims 3: “Magnus is having so much fun it is almost criminal”. Sudowrite is not going to write your next novel for you, but it can help you create new ideas, new characters, new plot twists, and descriptions varying from the mundane to the ridiculously elaborate, depending on what tone and style you prefer. Personally, I am keeping my Sudowrite experiment separate from my current writing project, but ideas are good climbers.

For those who have been working on their Great American Novel for twenty years and take it very, very seriously: This is not for you. Madness is not the only danger in writing: There is also the danger that something may be understood that you didn’t want to know. Like, that writing can be fun. But as for me, I already knew that. I am never lonely when I have my invisible friends in my head. Sudowrite is just another disembodied companion joining my brainstorming sessions. (But possibly the most hilarious one.)

Will I become an Artificial Intelligence?

This painting of a west Norwegian farming village is actually made by an artificial intelligence named MidJourney. And indeed, it is a sight many of us west Norwegians would not bet against having seen in our travels. It is not copied off the Internet though: I watched the AI go through several versions of this painting before I settled on this one.

Today, September 10th, 2022, was the first time I heard a YouTuber greet his audience as “human and AI”. I wonder if that will be the new “ladies and gentlemen”? Maybe the time is finally drawing close when, as a much better man than me once said, “God can wake up children for Abraham from these stones.” I have already long had in mind when writing my journal, that someday after my passing, an AI may read through Google’s archives and find my excessively detailed thoughts there. But maybe it won’t even be after my passing, if I take my vitamins and cut down on the Pepsi. You see, the progress in the field of AI has been nothing short of remarkable recently.

I returned to this topic of interest a couple of weeks ago after a casual mention on a website of a new program that generates images from text prompts. Turns out a flurry of such image generators has been released this year, and they are growing steadily more advanced. My current favorite is MidJourney, and it is probably the most popular these days. After the latest upgrades, it gets faces right much of the time, at least when it gets to focus on them. This has proven to be one of the most difficult parts for AI, with the notable exception of the AI that specializes in them. Yes, “This person does not exist”, which provided my picture in October last year, is also an AI.

So this is not my first run-in with Artificial Intelligence, far from it. Grammarly, a spelling, grammar, and style checker that I use, with some reservations, now that it has been greatly improved, is also an AI. It still makes some embarrassing mistakes, but then again so do I. Together we do better.

In my archives, you will find a number of versions of Dragon NaturallySpeaking, the speech-to-text program that eventually became better than nothing. I used it for years when my wrists hurt worse than my throat, but these days it is the other way around so I rarely fire it up. It is pretty amazing now though, and probably even better if you have English as your first language. Actually, that might depend on your dialect, I guess. According to Dragon, I speak Great Lakes English. I am pretty sure they don’t sound like the bandits from Skyrim though, while I do. Anyway, that’s another Artificial Intelligence application. In fact, it was based on the work of Ray Kurzweil, the great prophet of AI and modern futurism in general. And he is indeed planning to become an AI of sorts, by uploading his mind to a computer when the technology is ready for that. Good luck with that.

***

Anyway, today I was planning to demonstrate for y’all an AI tool that can take a brief outline or just a bunch of idea keywords, and expand them into a full-sized blog post. Doesn’t that sound amazing? But given that all the paragraphs above were my introduction before I started on the main topic… maybe I don’t need it. On the other hand, maybe we all (if we live long enough) will begin to use AI in so many facets of our life that we gradually become them, without ever noticing.

 

Dragon NaturallySpeaking 15

Screenshot anime Overlord, season 3, episode 8, last scene

Why is there a Dragon here? For speaking, naturally! Dragon NaturallySpeaking is the world’s premier speech recognition software, now with Deep Learning Artificial Intelligence that adjusts to your accent and the common cold. Fire breathing not included.

Today I upgraded (in a manner of speaking) from Dragon NaturallySpeaking version 13 professional individual to Dragon NaturallySpeaking version 15 home. I virtually never used the more advanced features of the earlier version.

The most important part for me is accuracy of recognition, and I have to say that version 15 is almost indistinguishable from magic in that regard. And I mean right out of the box: There is no longer even an option to train the program by reading a text for it. Version 13 was pretty good after training and a few days of practice. Version 15 is that good right out of the box. (At least I believe it doesn’t have access to my previous training, as it required me to uninstall the previous version and reboot the computer before I was allowed to install the newest version.)

I have used and reviewed many different versions of Dragon NaturallySpeaking over the years, both before and after it was acquired by Nuance. There has definitely been progress! I believe the first version I reviewed was either six or seven, and I generously compared it to a homesick Asian high school exchange student. I could probably have added seasick as well, as its performance was unimpressive, to say the least. If you had functioning hands, you were better off using those, even if you typed with one finger.

Those days are definitely gone! Dragon NaturallySpeaking 15 takes dictation like a highly trained secretary, only faster. Actually, Dragon has outpaced secretaries for at least a couple of versions now, but this required you to speak clearly and train the program first. And the results were less impressive for me, who has a strong Scandinavian accent. Actually, “accent” might be too weak a word. If you are familiar with the computer game “Skyrim”, the pronunciation by the Nord bandits in that game is pretty close to how I speak in real life. I am not sure how a highly trained secretary will handle that, but Dragon NaturallySpeaking 15 has well over 99% accuracy, right out of the box, with that kind of foreign accent.

***

There are still some challenges. In my experience, they are not too bad, but I see a lot of one-star reviews on Amazon. Most notably, Dragon is squeamish about working with applications it doesn’t know. Supposedly this includes earlier versions of Microsoft Office. When I started writing in LibreOffice, Dragon NaturallySpeaking automatically popped up the “Dictation box” where you can dictate and edit your text before transferring it to the target application. It’s an okay solution in my opinion, but it can be distracting, and you cannot interact directly with the target program using your voice (for instance, “click file save”) the way you can in supported programs. Removing the checkmark for automatically opening the Dictation box lets me dictate directly in LibreOffice, but it still struggles with commands, and you cannot edit the text with Dragon after you dictate it.

I have the same problem with my favorite browser, Vivaldi. Admittedly that is not very common browser, So I installed the Dragon Web extension For chrome.As you can see from the previous sentence, that didn’t work too well, and it doesn’t work too well in Google Chrome either. Luckily I have fingers, and so Dictation Box it is. But Google Chrome is by far the most popular browser for Windows, and not having native support for that makes the program seem rushed, at best. Especially when you consider that Dragon NaturallySpeaking is a very expensive program. It is not so bad by Norwegian standards, since both salaries and living expenses here are already very high. Even so, I only buy Dragon NaturallySpeaking when it is discounted, as it was in this case. In the USA, a single person could eat for a month for this much money, and in the actual developing world even more. So in that perspective, you would expect a more polished product than this.

But what it does well, is take dictation. And at that, it is the best in the world. No software and no human can match it for the combination of speed, accuracy, and fast learning.

Writing Grammarly

Screenshot anime Amanchu

If you struggle to express yourself and put your thoughts into words, Grammarly might be a prized companion. For me, who has at times struggled to stop putting my thoughts into words, it is just a curiosity.

I love living in the future, and I particularly enjoy all the new tools and toys and combinations thereof. In the latter category is Grammarly, an app/service that promises to watch over all your online writing and then some. (There is also a Windows app that can be used to write or proofread texts that are not meant to be shared online.)

One potential problem comes to mind immediately: What if your writing falls into the wrong hands? We are not just talking about your love letters getting the wrong audience or the manuscript for your new book suddenly appearing written by a competitor. Any app that reads your writing could, in theory, also harvest passwords, credit card numbers and such. It was, therefore, an easy decision for me to not be among the early adopters of this software. But years have gone by and there has only been one scandal, which turned out to be overblown, and it had nothing to do with passwords and such. So as of today, I have Grammarly on my writing machine.

Grammarly promises to discover both spelling and grammar errors. The built-in text editor in Vivaldi (and Chrome) also catches spelling mistakes, but not grammar mistakes. (In the previous sentence, Grammarly wants to change “catches” to “catch”, presumably because the browsers Vivaldi and Chrome are two. Unlike me and you, it cannot see past the “and” to realize that the subject of the sentence is the text editor. Artificial intelligence is still no match for natural stupidity, as the saying goes.) Luckily you can tell Grammarly to ignore such a find, much like in Microsoft Office. Actually, in my experience, Microsoft Office is even worse at parsing grammar. But if you do all your writing in Office, you may not feel motivated to convince two grammar checkers that they are wrong and you are right.

Back in the good old days when I lovingly crafted my journal by hand in Notepad or some other pure text editor, it was common for me to find spelling errors when I read through my entry one year later. (Back then I linked to the year-ago entry because I wrote virtually every day.) When I read through them two or even three years later, it was not uncommon for me to find more errors. This is a human tendency: We read what we meant to write, not what we typed.

At this point in my entry, Grammarly has found one spelling error (I misplaced an “i” in Artificial) and two grammar errors that were not. It also disagreed on my comma usage in three cases, which I gracefully conceded, albeit with some doubt. So I am probably not in the target group for paying customers. If you want to try for yourself, you can go to grammarly.com or just wait for one of their innumerable ads with which they flood the Internet.

Dragon Professional Individual 15

Dragon from video game Skyrim

No need to shout, the Dragon understands my Nordic dialect right away!

Over the years, I have made a habit of reviewing the various versions of Dragon NaturallySpeaking. Lately, Nuance has stopped using the phrase NaturallySpeaking in most contexts, but it is still the same product, and it is now up to version 15.

As the software has become more expensive again, and as it is already good enough for my limited use, I have started skipping some versions. Dragon version 13 was already good enough that I did not really expect it to get any better. Impressively, Dragon version 15 is actually noticeably better right out of the box.

Dragon version 15 uses a new “deep learning” technology similar to what is used in the most successful artificial intelligence projects. Dragon has always (or at least for as long as I have used it) had the ability to improve based on feedback from the user, as well as adapt its vocabulary and writing style by reading through documents. While these options still exist, there is less focus on them now as Dragon quietly adjusts in the background during everyday use.

Dragon has also clearly had some opportunity to acquaint itself with human speech in general before shipping to the customer: The product is amazingly accurate right out of the box. Longtime readers (if any) may remember that I compared some of the early versions to homesick exchange students from other continents. That time is long gone. Dragon version 15 understands even my “Skyrim” pronunciation of English (I grew up in Norway in the 1960s, where even the English teachers had rarely if ever been to England, let alone America or Australia).

There is one problem that has dogged this software from the start, and it still remains, even if just barely. When we speak, we don’t actually pronounce periods at the end of the sentence; rather, we slightly change the tone of our pronunciation toward the end, typically speaking less forcefully. Conversely, we don’t actually pronounce a capital character at the beginning of a sentence; instead, we pronounce the first sound slightly differently from the rest. Ideally, speech recognition software might be able to use this to take dictation without requiring us to specify punctuation. Dragon NaturallySpeaking used to have this functionality, but I gave up on it pretty quickly. What actually happens is that even when I dictate punctuation, there is a slight increase in mistakes at the very beginning and end of the sentences. This is especially true if I don’t pronounce some form of punctuation at the end of my string of words, for instance because I run out of breath during a long sentence. I have to say, however, that this problem has been almost eradicated in the latest version of Dragon.

To me, recognition accuracy is by far the most important part of any speech recognition engine. But Dragon 15 also has some other features in addition to the improved accuracy. It has better support for various modern software, and it allows voice-activated macros. (I believe this feature was also in version 13, but I did not use it then and I don’t use it now. In any case, functions like “insert signature” should be part of your email software, rather than your speech recognition software.) Also, the big unnecessarily helpful sidebar with examples no longer starts up by default. It used to, and it also used to permanently displace any windows that happened to be in its way.

As usual, I am including a paragraph where I don’t in any way correct this transcription. This is that paragraph. (It may not be obvious to the reader, but that should be “the transcription” in the first line above.) Dragon used to be available in a few languages besides English; I am pretty sure I saw touch at some point, and Japanese? I can’t find any trace of that now, but I will admit that I have not looked very carefully.

Not too bad, huh? That should of course not be “touch” in the previous paragraph, but rather Dutch, the language in the Netherlands. (It actually got it right this time without correction. Go figure.)

MS Windows troubles

Screenshot anime Kanojo ga Flag o Oraretara

This morning was absolutely crawling with chaos. It started as I turned on my home office computer, which had installed updates at 3AM and restarted itself, as it frequently does. It seems like a good idea, to install updates while you sleep. After all, you would not want to miss the latest security patches and improved functionality.

Unfortunately, the new functionality was that I could not log in. Whether I picked my usual account or the betatester account I use for testing games, there was just a brief pause and then Windows returned me to the login screen. No error message. I restarted the computer and tried again. I did various things and tried again and again. No change. I restarted in Safe Mode. Same problem. I restored Windows to the last known good configuration. Still the same.

I installed Ubuntu Linux, which is a pretty good alternative to Windows for most people, and free, like most Linux versions. After a little while I switched to Xubuntu (it is really just a different setup, the core is the same as Ubuntu, but Xubuntu is more similar to old Windows versions). I usually install Ubuntu on old laptops when they become too slow under Windows. This is less of a problem these days, but it was a big deal back in the days of Windows Vista.

Xubuntu is nice enough, but there were a couple of problems. I had used this machine to provide Internet access to my cabled home network, which includes a Windows 10 machine for playing games, a NAS (home server) for backup and sharing files, and a small old notebook computer for uploading and downloading to and from the NAS without taking up resources on the main machines. But now I could not get Linux to share the Internet. It should be easy, really, there is a choice for it. “Shared with other computers” it says, but that actually only lasted for a minute or so, then I got a message “Disconnected from Ethernet”. (Ethernet is the cabled network, to put it simply.) I did various things and restarted numerous times to no avail.

Eventually I found a USB wireless receiver and connected this to the Windows 10 machine, then told it to share its Internet. This worked well enough, except the NAS (Network-Attached Storage) server did not show up. After changing the workgroup name by editing a configuration file, I got it to show up. But as soon as I tried to copy a file to it, it hung and showed up empty until I logged off and logged on again. This repeated itself for as long as I bothered trying.

I was kind of in a hurry to continue working on my National Novel Writing Month story. Luckily that was saved on a disk I could access from Xubuntu. I copied it to a USB drive, in case I wanted to continue writing on it on the other Windows computer (the gaming computer). I installed WINE, a program that lets you run Windows programs in Linux. I had already read a few years ago that you could run yWriter in Linux this way. (yWriter is the program I use for writing novels. It is written by a programmer and novelist and fits my working style exactly.) It did work when started with WINE, and it found my novel in progress, but the spell check did not work and it did not recognize the names and locations. I downloaded the dictionary and manually copied it to the place it should be. Now it worked except it did not recognize words when Capitalized, such as at the start of every sentence.

Somewhere around this time I decided to reinstall Windows on one of the disks. (I am keeping Xubuntu on the other.) This took the rest of the evening and will continue into the next day or two or more.

Needless to say, there was no progress on the novel this day. But then again, contrary to the slogan of National Novel Writing Month, the world does not really need my novel. Probably.

The Dragon Upgraded

Screenshot anime YowaPeda

I feel like I can go anywhere… With Dragon NaturallySpeaking Premium, you can dictate anywhere using a USB microphone, wireless microphone, smart phone or dictation device. (Even so, I don’t recommend dictating while biking!)

In the past, Dragon NaturallySpeaking has been available in several different versions, and I have always used the cheapest one, Dragon NaturallySpeaking Home. It usually cost around $100, with the occasional big sale where you might buy it at half price. As an existing customer, I could also upgrade it to the next version at half price from the start. Last time, two years ago, I also did that; I even preordered it.

This time, there was no question of pre-ordering. Either they didn’t ask me, or I missed it somehow. My first hint that there was a new version available came from a mail that offered to let me upgrade to Dragon NaturallySpeaking 13 Premium for €99. A bit more, but then the Premium version has some unnecessary but nifty features. So instead of being my usual cheapskate, I went for the premium version this time. It was already available for download; there was a link in the mail to the website where I could buy it. I checked the requirements and looked for any traps, but there didn’t seem to be anything suspicious. So I bought it with credit card, and could immediately start downloading.

The installation was easy and trouble-free, although it took some time. I first downloaded a small installer program, which then downloaded the big installer program, which then unpacked to a separate folder, which then installed the program in the default location. It may sound a bit complicated, but it was mostly just pressing the “next” button, although I had to choose a directory for the temporary files. I saved them to the network drive in case I have to reinstall on this machine or another. I would also recommend using an external disk for the temporary folder if you have limited disk space, since at some point there will be three big files and folders taking up space simultaneously: The big installer, the folder with the unpacked files, and the actual installation in your Program Files folder. If you have a reasonably new computer, this would probably not be a problem.

Speaking of new and old computers, the two latest versions have each reduced the computing requirements, so that you can actually run version 13 faster on a weaker computer than version 11. Good work!

After installation, Dragon offered to upgrade existing user profiles. This took surprisingly long, even for the profile that was almost empty. Several times I wondered if it had crashed, but I didn’t need to use it immediately so I just checked in on it from time to time, and eventually it completed. If you don’t have an earlier version of Dragon, it will offer to create a new user profile instead. I believe that in this case, you will also be offered to train the program to recognize your voice and improve its accuracy. At least this happened in the earlier versions. It may be that it is so good right out of the box that they don’t bother with that now?

As far as I remember, version 12 looked very much like version 11. Version 13 has a whole new visual profile, so it is obvious at a glance that you are running the new version. The DragonBar, usually placed at the top of the screen, is now just a small button when not in use. If you move your mouse to it, it expands to become larger than it was in the previous version, and the microphone on/off button also becomes much larger. The “Learning Center” (formerly Dragon Sidebar) still takes up the margin of the screen, but it now has a black and white color scheme and also seems to have larger letters. As always, you can minimize or remove this Learning Center if you don’t want (context dependent) hints about what you can do next. Even the DragonBar itself can be minimized to the system tray, and you can access the most common functions by shortcut keys or by voice commands. But that has always been the case, I just wanted to mention it.

As I mentioned in the previous entry, the first thing I noticed when trying Dragon NaturallySpeaking 13 was the leap in accuracy. I realize that I have praised its leaps in accuracy since at least version 10, but this time the difference seemed to me bigger than the official count of 15% improvement. 15% improvement does not seem a lot when the accuracy is already claimed to be about 99%. To me, it seems more like it has increased from 99% to 99.5%, which would actually be a doubling of the accuracy in the sense that there would be half as many errors. But I admit that in my case this could be because of an improvement in the handling of USB headsets.
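To make the arithmetic above concrete, here is a small Python sketch. It assumes that the “15% improvement” means 15% fewer errors, which is only my reading of the claim and not something the vendor spells out here. The function names are made up for the illustration, and the 99% starting point is the claimed figure, not a measurement.

```python
def errors_per_thousand(accuracy_percent):
    """Expected number of wrong words per 1000 dictated words."""
    return round((100.0 - accuracy_percent) * 10, 6)


def accuracy_after_error_cut(accuracy_percent, error_reduction_percent):
    """New accuracy if the error rate shrinks by the given percentage.

    Assumption: the improvement applies to the errors, not to the
    accuracy figure itself.
    """
    error = 100.0 - accuracy_percent
    new_error = error * (1 - error_reduction_percent / 100.0)
    return round(100.0 - new_error, 6)


claimed = 99.0  # claimed accuracy before the upgrade, in percent

# A 15% cut in errors moves 99% to 99.15%.
print(accuracy_after_error_cut(claimed, 15))  # 99.15

# Halving the errors moves 99% to 99.5%.
print(accuracy_after_error_cut(claimed, 50))  # 99.5

# In words per thousand: 10 errors before, 5 after halving.
print(errors_per_thousand(claimed), errors_per_thousand(99.5))  # 10.0 5.0
```

Under this reading, a 15% improvement would only bring 99% up to 99.15%, while the jump to 99.5% that it felt like corresponds to cutting the errors by half.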

(It is unfortunate that I cannot maintain this level of accuracy for longer texts, because my voice becomes hoarse after a few minutes. But this is an affliction that I share with very few humans. One hypothesis is that it comes from my years of almost complete silence, where I only asked a few questions at work and did not speak at all in my free time. If I take breaks and drink a little water between the paragraphs, I can continue for longer.)

The premium version contains some features not found in the home version. For instance, you can now make the program read back your own voice, not just a synthetic text to speech rendition of the text. You can also use a smartphone as a microphone, or even use recording devices and have the program transcribe them later. There is supposedly also an option to create your own voice commands, basically macros, but I haven’t tested that yet.

In conclusion, Dragon NaturallySpeaking 13 is awesome. You can actually speak naturally to it, and with very little training it will put your words on the screen and let you control Windows. Upgrading from version 12 seems to make a big difference for me, but your mileage may vary. Upgrading from Home to Premium is probably not a priority unless you have a USB microphone or some other unorthodox input device, but it adds some fun new features.

(As usual when writing about dictation software, I have dictated this entry in its entirety, except for a few minor corrections.)

Dragon NaturallySpeaking 13

Squeeing girls from anime Gekkan Shoujo Nozaki-kun

This is how I think my readers should react when I write about Dragon NaturallySpeaking speech recognition from Nuance. Somehow that never seems to happen. Let me try again, it’s two years since last time.

I love living in the future. And one of the more futuristic things that I have is the speech recognition software for Windows, Dragon NaturallySpeaking. (Windows also has its own built-in speech recognition, but for those who can afford it, Dragon is definitely the one hardest to distinguish from magic.)

Today I got a mail from Nuance, offering to upgrade my Dragon NaturallySpeaking 12 Home to Dragon NaturallySpeaking 13 Premium for €99. I immediately grabbed the chance, just as I have done every time there was an upgrade for the last five years at least. Was it worth it? Well, to paraphrase a friend of mine, €99 is a lot of money if you don’t have it. This is obviously not a product for the working classes of the developing world, but for a Norwegian office worker the amount is trivial, barely noticeable against the high salaries and the high prices up here. And for me at least the effect of the upgrade was dramatic.

According to their website, version 13 is 15% more accurate than version 12 right out of the box. Evidently this has either crossed some kind of threshold in my case, or there was some bug in the version 12 Home in relation to my Plantronics USB headset. The USB headset worked very poorly with the previous version on my laptop (although it had worked reasonably well on the desktop with version 11). So when I wanted to dictate, I had to take off my USB headset and put on an analog headset for the duration, and even then the accuracy was at most marginally better than in version 11. Today after the upgrade, I can use my USB headset again, and what’s more: The accuracy is more than 99%. It still makes mistakes, but fewer than my fingers do. (And I have been typing for almost 50 years now.)

Back when I wrote about an early version of Dragon NaturallySpeaking here in the Chaos Node, it had only entertainment value for me, although I realized it could be useful for people who could no longer use their arms at all. Some years later a newer version helped save me from disability when my job caused a serious case of repetitive strain injury. At that time it still made quite a few mistakes, but at least I could correct them with my voice. Since then it has improved even more, and I have given it pretty good reviews each time. But let me tell you something: For me, version 13 is a giant leap.

It still makes mistakes, but so few mistakes that I risk overlooking them in the middle of all the perfect text. We are talking about perhaps one error for each paragraph on the first day. The software gets used to the sound of your voice and your writing style and also learns from all the errors you correct, so it gets better the more you use it. So to pull off this level of accuracy with almost no training is impressive indeed.

For those of you who are still here instead of being busy buying it, my next entry will get into some more detail about the installation and differences from the previous version.

(As usual when writing about dictation software, I have used the program to dictate this entry, except for a couple of minor corrections.)